Consistency Trade-offs for SDN Controllers
Colin Dixon, IBM
February 5, 2014




The promises of SDN
- Separation of control plane from data plane
- Logical centralization of control plane
- Common abstractions for network devices

Why control can't actually be centralized
In practice, we have a distributed cluster of controllers rather than just one, so that:
- we can tolerate faults
- we can scale out our performance
- and maybe so that if the network partitions, there are controllers on both sides

Distributed Systems Primer

Terminology
- Traditionally, people talk about replicated state machines; I'm not going to.
- People also talk about clients and servers; instead, I'm going to talk about clients and controllers to keep us thinking SDN.
- When I say client, think switch or REST client.

Things are easy when there's one controller
(Diagram: clients, i.e., switches, REST clients, etc., talking to a single controller.)
- If two clients interact with a controller, it's the same controller.
- They can't get inconsistent results.

How do we get N controllers to act as one?
(Diagram sequence: clients, i.e., switches, REST clients, etc., talking to replicas, i.e., controller instances.)
- Consistent if clients talk to the same controller.
- Problems if clients talk to different controllers.
- Fixed if each client talks to a majority of controllers; the client can resolve that blue is right based on logical timestamps.
- Or if each controller confers with a majority.

Does it need to be a majority?
- The property we used was that two majorities must overlap on at least one controller.
- Generalization: quorum reads and writes ("sloppy quorum"):
  - Read from R controllers.
  - Write to W controllers.
  - Ensure R + W > N, i.e., any read set overlaps with any write set on at least one controller.
- Can help scale-out and availability of reads or writes by sacrificing the other.
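To make the overlap argument concrete, here is a minimal Java sketch of quorum reads and writes. Everything in it (the Controller and Versioned classes, the in-memory maps) is hypothetical scaffolding rather than any real controller's API; it only illustrates why R + W > N guarantees that a read sees the latest write.

    import java.util.*;

    // Hypothetical sketch of quorum reads/writes over N controller replicas.
    // Values carry a logical timestamp; a reader keeps the newest version it
    // sees, which is how a client "resolves that blue is right".
    public class QuorumSketch {
        static class Versioned {
            final String value;
            final long timestamp; // logical timestamp
            Versioned(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
        }

        // Each "controller" is just an in-memory map in this sketch.
        static class Controller {
            final Map<String, Versioned> store = new HashMap<>();
        }

        final List<Controller> controllers = new ArrayList<>();
        final int r, w;

        QuorumSketch(int n, int r, int w) {
            if (r + w <= n) throw new IllegalArgumentException("need R + W > N");
            for (int i = 0; i < n; i++) controllers.add(new Controller());
            this.r = r;
            this.w = w;
        }

        // Write to some W controllers (the first W, for simplicity).
        void write(String key, String value, long timestamp) {
            for (int i = 0; i < w; i++)
                controllers.get(i).store.put(key, new Versioned(value, timestamp));
        }

        // Read from some R controllers (the last R, deliberately the "other end")
        // and keep the newest version; R + W > N forces an overlap with any write.
        String read(String key) {
            Versioned newest = null;
            for (int i = controllers.size() - r; i < controllers.size(); i++) {
                Versioned v = controllers.get(i).store.get(key);
                if (v != null && (newest == null || v.timestamp > newest.timestamp))
                    newest = v;
            }
            return newest == null ? null : newest.value;
        }

        public static void main(String[] args) {
            QuorumSketch q = new QuorumSketch(5, 3, 3); // majorities: R = W = 3
            q.write("color", "blue", 7);
            System.out.println(q.read("color")); // prints "blue": read set overlaps write set
        }
    }

With majorities (R = W = 3 of N = 5), any read set shares at least one controller with any write set; the generalization simply lets you shrink one side and grow the other.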

Where this goes wrong and the CAP theorem

Network partitions make everything complex
(Diagram: the controller cluster splits into two sides.)
- Before the partition, everything goes normally.
- With a partition, clients can read old state.

CAP Theorem
- Eric Brewer conjectured that you cannot be consistent, available, and partition tolerant all at once.
- Conjecture in 2000, proof in 2002.
- There are details, but the general idea is that you have to choose 2 of the 3:
  - CP: favor consistency
  - AP: favor availability
  - CA: hope partitions don't happen

Network partitions make everything complex
(Diagram: a majority side and a minority side.)
- On the majority side, things are still happy.
- On the minority side, writes break consistency and reads can be stale.
- If all clients are on the majority side, we're happy.

How modern systems deal with the CAP theorem

Favor availability (AP)
- When there's a partition, just read from/write to as many controllers as you can and hope.
- You lose consistency after one such write, and getting it back is hard.
- In practice, these systems don't try to get strong consistency in the first place.
- Hard to program against or reason about.
- Effectively abandons the idea of logically centralized.
- Examples: Cassandra, MongoDB, Dynamo

Favor consistency (CP)
- When there's a partition, the minority side(s) stop being able to write.
- Thus, the minority side(s) are not available.
- This may be fine as long as the clients you care about are on the majority side.
- Great if you can make the majority side the side with the Internet :-)
- Examples: Chubby, ZooKeeper, some SQL DBs, MegaStore, Spanner

What about CA?
- This assumes partitions will not happen; generally, people think this is a bad assumption.
- Good design plus some expense can make partitions rare, though: http://aphyr.com/posts/288-the-network-is-reliable
- In general, systems that assume partitions won't occur wind up with neither consistency nor availability when a partition does occur.

Other fanciness
- The CAP theorem is weaker than many assert:
  - It assumes you want linearizability for consistency; loosely, all reads get the latest write.
  - It implies you must make the trade-off once.
- Some modern systems dance around this:
  - Detect partitions and only sacrifice C when they happen; recovering consistency on healing is tricky.
  - Provide less than linearizability, e.g., guarantee only that all your reads reflect your previous writes (see the sketch below).
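As an illustration of the weaker guarantee, here is a toy Java sketch of a read-your-writes session. The names (Replica, SessionClient) are invented for this example, and a real system would track versions per key and retry automatically.

    // Hypothetical sketch of a read-your-writes session guarantee: the client
    // remembers the version of its last write and refuses to accept a read
    // from a replica that hasn't seen that version yet.
    public class SessionClient {
        interface Replica {
            long latestVersion(String key);
            String get(String key);
            long put(String key, String value); // returns the new version
        }

        private long lastWrittenVersion = 0;

        public void write(Replica anyReplica, String key, String value) {
            lastWrittenVersion = anyReplica.put(key, value);
        }

        // Only accept the read if the replica has caught up to our last write;
        // otherwise the caller should retry against another replica.
        public String readYourWrites(Replica replica, String key) {
            if (replica.latestVersion(key) < lastWrittenVersion)
                throw new IllegalStateException("replica is stale, retry elsewhere");
            return replica.get(key);
        }
    }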

What about SDN?

Some thoughts on SDN and consistency
- "We've found the hard problems to be with distributed state, which are app specific. Not sure how the controller can help much."

Should we pick AP, CP, or something else?
- We need switches to work even if they're partitioned off! We need availability.
- So, we should pick AP over CP.
- AP generally means eventual consistency.
- This means the controller and apps are not logically centralized from the programmer's view; hard for programmers to work with.
- Ugh... is this what Martin was talking about?

SDN controller examples
- Floodlight
  - Uses Cassandra (AP) and deals with eventual consistency.
  - Discussion with Floodlight devs on the mailing list: https://lists.opendaylight.org/pipermail/controller-dev/2013-June/000592.html
- ONOS
  - Uses ZooKeeper (CP) for mapping switches to controller instances; instances own the state for their switches (see the sketch below).
  - See the September 30, 2013 Tech Work Stream call.
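The ONOS code itself isn't shown here, but the general technique of mapping switches to controllers with ZooKeeper looks roughly like the following sketch. It assumes a reachable ZooKeeper ensemble and an existing /switches parent znode; the class and path names are made up.

    import org.apache.zookeeper.*;

    // Sketch of ZooKeeper-based switch-to-controller mastership: each controller
    // races to create an ephemeral znode per switch; the winner owns the switch,
    // and the znode disappears automatically if the owner's session dies.
    // Assumes the /switches parent znode already exists.
    public class SwitchMastership {
        private final ZooKeeper zk;

        public SwitchMastership(String ensemble) throws Exception {
            zk = new ZooKeeper(ensemble, 3000, event -> {}); // no-op watcher for brevity
        }

        // Returns true if this controller now owns the switch.
        public boolean claim(String switchId, String controllerId) throws Exception {
            try {
                zk.create("/switches/" + switchId, controllerId.getBytes(),
                          ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
                return true;  // we won mastership
            } catch (KeeperException.NodeExistsException e) {
                return false; // another controller instance already owns it
            }
        }
    }

Because ZooKeeper is CP, mastership claims stop working on the minority side of a partition, which is exactly the trade-off described above.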

What can we do?
- We want to provide programmers with something easy (or at least easier) to use.
- We want to provide network connectivity and services on both sides of a partition.
- CAP says we can't have it all, all of the time.
- We need to look at some of the fanciness.

Forking timeline consistency
- When there's a partition, the minority side forks after some sentinel timeout, e.g., 1 second.
- It declares a new group by fiat.
- This creates two internally consistent timelines.

Resolving forks
- When the partition heals, something must merge the two timelines.
- Often easier than you'd expect.
- JGroups provides view changes and view merges (see the sketch below).
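A minimal sketch of what detecting a healed partition looks like with JGroups, assuming JGroups 4.x (where ReceiverAdapter provides no-op defaults): JGroups delivers a MergeView when previously separate subgroups rejoin, and that is the application's cue to reconcile the forked timelines.

    import org.jgroups.*;

    // Sketch of detecting partition healing with JGroups: a MergeView is delivered
    // when previously separate subgroups (the forked timelines) rejoin, and the
    // application reconciles the two histories in response.
    public class ForkHealingListener extends ReceiverAdapter {
        @Override
        public void viewAccepted(View view) {
            if (view instanceof MergeView) {
                MergeView merge = (MergeView) view;
                // One subgroup per healed timeline; merge application state here.
                System.out.println("partition healed; merging: " + merge.getSubgroups());
            } else {
                System.out.println("membership changed: " + view.getMembers());
            }
        }

        public static void main(String[] args) throws Exception {
            JChannel channel = new JChannel(); // default protocol stack
            channel.setReceiver(new ForkHealingListener());
            channel.connect("controller-cluster");
        }
    }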

Exploiting locality
(Diagram: controllers and network devices grouped into regions.)
- Divide networking into regions; have one controller, or a small number of controllers, responsible for each region.

Exploiting locality
- This is (loosely) what ONOS does.
- The MD-SAL could provide metadata to automatically distribute state per region (a sketch of region-based placement follows).
- Locating controllers near their devices can:
  - minimize latency and improve performance
  - lower the risk of separating devices from controllers
- The controller for each region can itself be a cluster, also with a lower risk of intra-cluster partitions.
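A toy sketch of region-based placement, with invented names; the point is only that a device's state and control stay within its region's controller cluster.

    import java.util.*;

    // Hypothetical sketch of region-based state placement: each network device
    // is assigned to a region, and each region is served by a small controller
    // cluster located near its devices.
    public class RegionPlacement {
        private final Map<String, List<String>> regionToControllers = new HashMap<>();
        private final Map<String, String> deviceToRegion = new HashMap<>();

        public void addRegion(String region, List<String> controllers) {
            regionToControllers.put(region, controllers);
        }

        public void placeDevice(String deviceId, String region) {
            deviceToRegion.put(deviceId, region);
        }

        // All state and control for a device stays within its region's cluster,
        // so an inter-region partition only affects cross-region traffic.
        public List<String> controllersFor(String deviceId) {
            return regionToControllers.get(deviceToRegion.get(deviceId));
        }

        public static void main(String[] args) {
            RegionPlacement p = new RegionPlacement();
            p.addRegion("west", Arrays.asList("ctl-w1", "ctl-w2", "ctl-w3"));
            p.placeDevice("switch-17", "west");
            System.out.println(p.controllersFor("switch-17")); // [ctl-w1, ctl-w2, ctl-w3]
        }
    }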

Scale-out

Using majorities prevents scale-out
- If we have N controllers and use majorities, then floor(N/2)+1 must be involved in each action.
- That means our cluster can handle less than twice the load of a single controller (the arithmetic is sketched below).
- That's not exactly scale-out!
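The arithmetic, as a runnable snippet: with majority quorums, aggregate capacity is N divided by floor(N/2)+1 times that of one controller, which approaches but never reaches 2.

    // The arithmetic behind "less than twice the load": with N controllers and
    // majority quorums, each action touches floor(N/2)+1 replicas, so aggregate
    // capacity is N / (floor(N/2)+1) times one controller, always below 2x.
    public class MajorityBound {
        public static void main(String[] args) {
            for (int n = 1; n <= 9; n += 2) {
                int majority = n / 2 + 1;
                System.out.printf("N=%d majority=%d max speedup=%.2fx%n",
                                  n, majority, (double) n / majority);
            }
        }
    }

For example, N=5 gives a majority of 3 and a maximum speedup of about 1.67x, no matter how many more controllers you add under the same scheme.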

Scaling out while staying consistent
- If the workload is read heavy:
  - Read from K controllers for a small K, e.g., 1 or 2.
  - Write to (N-K)+1 controllers; this guarantees overlap.
  - Reads are fast and scale out; writes are slow.
- If you can partition your state:
  - Map each partition to a subset of the controllers.
  - Use majorities of the subsets.
  - Get scale-out of N/(size of subsets).
(Both schemes are sketched below.)
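A small sizing sketch of both schemes (the helper names are made up): the read-heavy scheme keeps R + W = N + 1 so read and write sets always overlap, while partitioning runs an independent majority quorum per subset.

    // Hypothetical sizing helper for the two consistent scale-out schemes.
    public class ConsistentScaleOut {
        // Read-heavy tuning: read from K, write to (N-K)+1, so R + W = N + 1 > N.
        static void readHeavy(int n, int k) {
            System.out.printf("N=%d: read from %d, write to %d (reads scale ~%dx)%n",
                              n, k, n - k + 1, n / k);
        }

        // Partitioned state: majorities within each subset of size S,
        // giving roughly N/S independent partitions' worth of scale-out.
        static void partitioned(int n, int subsetSize) {
            System.out.printf("N=%d, subsets of %d: majority=%d, scale-out ~%dx%n",
                              n, subsetSize, subsetSize / 2 + 1, n / subsetSize);
        }

        public static void main(String[] args) {
            readHeavy(9, 1);   // N=9: read from 1, write to 9 (reads scale ~9x)
            partitioned(9, 3); // N=9, subsets of 3: majority=2, scale-out ~3x
        }
    }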

What about OpenDaylight?

What we do today
- Clustering Services
  - Based on Infinispan replicated caches, which are in turn based on JGroups.
  - Covered in the June 24, 2013 Tech Work Stream call.
  - It seems as though it's AP, but that's not clear; JGroups seems to have a more formal model.
- MD-SAL
  - Uses ZeroMQ for replication.
  - Doesn't seem to have a formal model.
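For flavor, here is roughly what a synchronously replicated Infinispan cache looks like. This is a generic Infinispan sketch rather than OpenDaylight's actual clustering code, and the cache name and keys are invented.

    import org.infinispan.Cache;
    import org.infinispan.configuration.cache.CacheMode;
    import org.infinispan.configuration.cache.ConfigurationBuilder;
    import org.infinispan.configuration.global.GlobalConfigurationBuilder;
    import org.infinispan.manager.DefaultCacheManager;

    // Sketch of the kind of replicated cache the Clustering Services sit on:
    // a synchronously replicated Infinispan cache over a JGroups cluster.
    public class ReplicatedState {
        public static void main(String[] args) {
            DefaultCacheManager manager = new DefaultCacheManager(
                GlobalConfigurationBuilder.defaultClusteredBuilder().build());

            ConfigurationBuilder cfg = new ConfigurationBuilder();
            cfg.clustering().cacheMode(CacheMode.REPL_SYNC); // every node holds every entry

            manager.defineConfiguration("controller-state", cfg.build());
            Cache<String, String> cache = manager.getCache("controller-state");
            cache.put("switch-17/owner", "controller-2"); // replicated to all members
        }
    }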

What we can do going forward
- ZooKeeper: strong consistency; stops working on the minority side of partitions if they happen.
- JGroups: seems to provide flexible ways to handle partitions and healing; help better expose this through Infinispan.
- Akka: seems to be best of breed; AP with well-defined consistency both with and without partitions.

Backup Slides

Consistency in the controller vs. the network
- Academic work on network updates: Frenetic, Pyretic, Mahajan & Wattenhofer's consistent updates, etc.
- Focuses on maintaining invariants in forwarding behavior, e.g., no loops, even during updates to forwarding tables.
- Very useful, but different from this talk.

What about 2PC, active-backup, etc.?
- Two-Phase Commit (2PC): a special case of sloppy quorum with R=1, W=N.
- Active-backup: a special case where N=2, R=1, W=2; the active replicates writes to the backup, so W=2.
(Both cases are checked in the snippet below.)
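A tiny check, with made-up names, that both special cases satisfy the R + W > N overlap rule:

    // Checking that both special cases satisfy the R + W > N overlap rule.
    public class SpecialCases {
        static void check(String name, int n, int r, int w) {
            System.out.printf("%s: R=%d W=%d N=%d overlap? %b%n", name, r, w, n, r + w > n);
        }

        public static void main(String[] args) {
            check("2PC",           5, 1, 5); // write everywhere, read anywhere
            check("active-backup", 2, 1, 2); // active replicates writes to backup
        }
    }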

References and pointers, if you're interested

References
Academic papers:
- Pyretic [NSDI 13]
- Consistent Updates [HotNets 13]
- CAP for Networks [HotSDN 13]
- Exploiting Locality [HotSDN 13]
- Frenetic (Network Update) [SIGCOMM 12]
- Logically Centralized [HotSDN 12]
- Kandoo [HotSDN 12]
- F1 [SIGMOD 12]
- Spanner [OSDI 12]
- Onix [OSDI 10]
- MegaStore [SIGMOD 08]
- Dynamo [SOSP 07]

References
Open source systems:
- CP: ZooKeeper
- AP: Cassandra, MongoDB, Riak, Akka, Infinispan?

References
- "The network is reliable": a long blog post looking at a lot of data on whether network partitions are rare or common, and whether they can be prevented: http://aphyr.com/posts/288-the-network-is-reliable
  - Conclusion: they can be made very rare, with significant expertise and expense.
- "CAP Twelve Years Later": Eric Brewer's recent thoughts on the CAP theorem: http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed

References
- "The Road to Akka Cluster and Beyond": a good overview of distributed systems (much better than this deck) and a bit about Akka: http://www.slideshare.net/jboner/the-road-to-akka-cluster-and-beyond