State and Specification Abstraction: Implemented by SDN Controllers
Bernhard Ager
Parts are based on slide sets by Scott Shenker, Nick Feamster, Rob Sherwood, Jeremy Stribling, or as attributed
Monday, 06 October 2014 1
OpenFlow programming is not easy
- Parallel tasks are hard to implement (e.g., routing, monitoring)
- OpenFlow is a very low-level abstraction
- Controller only sees packets when the switch does not have a matching rule
- Race conditions due to rule installation delay
Why SDN?
- Why is SDN the right choice for the future? Obviously efficiency, scalability, security, functionality, versatility, ... Well, not directly
- What are the fundamental aspects of SDN? Obviously OpenFlow. Actually, not at all
Northbound API helps
- Abstracts away low-level details
- Enables quick customization using popular languages
- Vendor independent
https://www.sdncentral.com/technology/the-sdngold-rush-to-the-northbound-api/2012/11/
Abstractions should separate 3 problems
- Constrained forwarding model
- Distributed state
- Detailed configuration
(Actually, this is the minimum set)
Forwarding abstraction
- Flexible forwarding model: behaviour specified by the control program, not by the forwarding abstraction
- Abstracts away forwarding hardware, for evolving beyond vendor-specific solutions
- Flexibility and vendor-neutrality are both valuable: one architecturally, the other economically
- Example: OpenFlow (-> last lecture)
State distribution abstraction
- Control programs should not have to deal with problems caused by distributed state: complicated, a source of errors
- Abstraction should hide state distribution details
- Natural abstraction: global network view
- Control program works on the global view
  - Input: global view, e.g., network graph
  - Output: configuration of each device
Specification abstraction
- Control program should express desired behaviour
- Control program should not be responsible for implementing that behaviour
- Natural abstraction: work on a simplified model of the network, with only enough detail to specify goals
Network Operating Systems (NOS)
- Distributed system that creates the network view (= state abstraction)
  - Runs on servers in the network
  - Communicates with forwarding elements: gets state from them, sends control directives to them
  - Utilizes the forwarding abstraction
- Control program works on the view of the network
  - Doesn't have to be a distributed system
  - Computes the configuration
NOX: Towards an OS for Networks
- Natasha Gude, Teemu Koponen et al., SIGCOMM CCR, 2008
- Talk by Martin Casado, "A Network Operating System for OpenFlow", SDN Workshop 2009
- The first take on a NOS
- Targeted to implement the abstractions described before
- Implemented in C++, Python runtime on top
NOX components
NOX programming interface
- Network view and namespace: writes to the network view should be limited; monitoring DNS allows mapping host names to flows
- Events: event handlers are invoked according to priority
- Control: OpenFlow
- Higher-level services: system libraries for routing, packet classification, standard services (DHCP and DNS), network filtering
NOX's network view
- Network view contains:
  - Switch-level topology
  - Locations of users
  - Locations of hosts, middleboxes, other network elements
  - Bindings between names and addresses (DNS snooping)
- Not in the network view: the current state of network traffic! Design decision for improved scalability
NOX design overview
- Granularity: scalability vs. flexibility
  - Aware of switch-level topology, user locations, middleboxes, services, ...
  - Forwarding management on flow level
- Switch abstraction and operation: adopts the OpenFlow model
- Scalability: leverage parallelism for flow arrivals; maintain a centralized network view (indexed hash tables, distributed access with local caching)
NOX usage example: Policy lookup
Enables:
- User-based VLAN tagging
- User-based traffic isolation
- Ethane: network-wide access control
Similar event-based controllers
- POX: NOX in Python; diverse (reusable) examples included in the source code
- Ryu: Python; support for OpenStack; web-based GUI; replication support; supports OF, SNMP, Netconf, OpenVSwitch, ...
- Trema: Ruby/C; comes with a built-in simulator for testing
- OpenFaucet: Python/Twisted
POX programming example

    def _handle_PacketIn(self, event):
        packet = event.parsed
        msg = of.ofp_packet_out()
        msg.buffer_id = event.ofp.buffer_id
        msg.in_port = event.port
        msg.match = of.ofp_match.from_packet(packet)
        action = of.ofp_action_output(port=of.OFPP_FLOOD)
        msg.actions.append(action)
        self.connection.send(msg)

Observation: This is not as nice and clean as it should be
These controllers don't solve all the problems
- Managing state: strong consistency? Persistence? Scalability?
- Reliability and robustness?
- Generality: what if the switch doesn't speak OpenFlow?
ONIX
- Network operating system (closed-source)
- Onix: A Distributed Control Platform for Large-scale Production Networks. T. Koponen et al., OSDI 2010
- Design goals: generality, scalability, reliability, simplicity, performance
- Used in the Google backbone
- Speculation: asset for the Nicira deal ($1.26 billion)
ONIX components
Similar to NOX, but distributed by design
ONIX contributions
- Centralized logic on a distributed system: scalability and robustness
- Useful tools for application developers:
  - Simple, general API
  - Flexible state distribution mechanisms
  - Flexible scaling and reliability mechanisms
- Production-quality system
Onix API: Data-centric
- Program on a network graph: nodes are physical network entities
- Attributes on edges and nodes represented by the Network Information Base (NIB)
- External changes imported to the NIB; NIB notifies the application
- Local changes exported from the NIB to network elements
- Uses OpenFlow
Onix state abstraction(s)
- Different requirements depending on the application: strong consistency vs. eventual consistency
- Different storage options:
  - Replicated transactional (SQL) storage: strong consistency for critical, stable state
  - Distributed hash table (DHT): eventual consistency for dynamic, inconsistency-tolerant state
- Storage requirements are specified at startup, but during run-time applications only interact with the NIB
Queries to the NIB
Category: Purpose
- Query: Find entities
- Create, destroy: Create and remove entities
- Access attributes: Inspect and modify entities
- Notifications: Receive updates about changes
- Synchronize: Wait for updates being exported to network elements and controllers
- Configuration: Configure how state is imported to and exported from the NIB
- Pull: Ask for entities to be imported on demand
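The query categories above can be illustrated with a minimal, hypothetical NIB-like store. This is my own toy sketch, not the actual Onix API: entities are attribute dictionaries, applications query by attribute, and subscribed listeners receive change notifications.

```python
# Toy sketch of a NIB-like store (hypothetical; not the actual Onix API).
# Covers the Query / Create, destroy / Access attributes / Notifications
# categories from the table above.

class NIB:
    def __init__(self):
        self.entities = {}   # entity id -> attribute dict
        self.listeners = []  # callbacks for change notifications

    def create(self, entity_id, **attrs):
        self.entities[entity_id] = dict(attrs)
        self._notify('create', entity_id)

    def destroy(self, entity_id):
        del self.entities[entity_id]
        self._notify('destroy', entity_id)

    def set_attr(self, entity_id, key, value):
        self.entities[entity_id][key] = value
        self._notify('update', entity_id)

    def query(self, **attrs):
        # Find entities whose attributes match all given key/value pairs
        return [eid for eid, e in self.entities.items()
                if all(e.get(k) == v for k, v in attrs.items())]

    def subscribe(self, callback):
        self.listeners.append(callback)

    def _notify(self, event, entity_id):
        for cb in self.listeners:
            cb(event, entity_id)

nib = NIB()
events = []
nib.subscribe(lambda ev, eid: events.append((ev, eid)))
nib.create('sw1', kind='switch', dpid=1)
nib.create('h1', kind='host', mac='00:00:00:00:00:01')
nib.set_attr('sw1', 'dpid', 42)
print(nib.query(kind='switch'))  # ['sw1']
print(events)
```

In the real system, the imported/exported state would come from and go to network elements; here both sides are simulated in-process.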
Distributed Coordination
- Distributed control programs need to synchronize NIB access themselves
- Zookeeper integration; apps can use it for:
  - Leader election
  - Distributed locking
  - Barriers
  - ...
Onix scaling
- Support for distributed controllers
- Multiple dimensions for partitioning:
  - By task
  - By subsetting the NIB: only a subset of the NIB is locally available; switches connect to a subset of ONIX controllers
  - By aggregation: partition the network, e.g., by prefix, VLAN, MPLS, network area, ...; only distribute averaged NIB information to other partitions
Example: ARP server
- Switch topology: candidate for hard state
- IP-MAC mappings: candidate for soft state
- Free choice: number of controllers for every switch
- Free choice: local ARP tables or a single global ARP table
Achieving reliability
- Network element & link failures: the application's responsibility
- Connectivity infrastructure failures: assume reliable management connectivity; implement a multi-pathing protocol
- Onix failures: distributed coordination facilities for app failover
Use cases
- Ethane: enterprise-level security control
  - Per-flow decision of embedding
  - Resulting high load handled with the DHT
- Distributed Virtual Switch
  - Centralized management of hypervisors' virtual switches
  - Security and QoS enforced for mobile VMs
  - Single instance with cold standby sufficient
Use cases (2)
- Multi-tenant data center
  - Virtual L2 networks within the data center
  - Isolation in addressing and resource usage
  - Translates to managing tons of tunnels
  - Requires partitioning of the NIB due to size
- Scale-out IP router
  - Scales with the number of physical switches
  - Manage routing with, e.g., Quagga
ONIX does not solve all problems
- Not clear how to write applications that work on the same header space, or that need to perform non-local decisions
- Not clear if the NIB abstraction is abstract enough
- Specification abstraction? Not really
Floodlight
- Open-source Java-based OpenFlow controller, Apache licensed
- Supported by the community and Big Switch
- Offers a module loading system that makes it simple to extend and enhance
- Supports a broad range of virtual and physical switches (OpenFlow and non-OpenFlow)
- Support for the OpenStack cloud orchestration platform
http://www.projectfloodlight.org/
Source: Floodlight
Floodlight
- Fork from the Beacon OpenFlow controller
- Production-level performance, but a steep learning curve
- Good documentation
- Powerful REST API that allows applications to interact with it and with switches
Floodlight Architecture = Controller + Applications
Source: BigSwitch
Floodlight Controller Modules
Controller modules implement functions that are of common use to a majority of applications, such as:
- discovering and exposing network states and events (topology, devices, flows)
- enabling controller communication with network switches (i.e., the OpenFlow protocol)
- managing Floodlight modules and shared resources such as storage, threads, tests
- a web UI and debug server (Jython)
Floodlight Programming Model
Source: BigSwitch
Floodlight REST API
Source: BigSwitch
OpenDaylight
Not just a controller: "The OpenDaylight Project is a collaborative open source project that aims to accelerate adoption of Software-Defined Networking (SDN) and create a solid foundation for Network Functions Virtualization (NFV) for a more transparent approach that fosters new innovation and reduces risk"
- Industry-supported platform
- Robust and extensible open source codebase
- Common abstractions for northbound capabilities
- Integration with OpenStack cloud applications
- Production-level performance
- Heavy industry support
OpenDaylight: Design Choices
- Java chosen as an enterprise-grade, cross-platform compatible language
- Java interfaces are used for event listening, specifications, and forming patterns
- Maven: build automation system for Java
- OSGi: allows dynamically loading bundles; allows registering dependencies and exported services; used for exchanging information across bundles
Source: OpenDaylight
OpenDaylight: Life of a Packet
Source: OpenDaylight Wiki
OpenDaylight state abstraction: MD-SAL
- MD-SAL: Model-Driven Service Abstraction Layer; performs event routing
- YANG: XML-based language for service abstraction
- State: hierarchical database in the MD-SAL
  - Contains all kinds of information, including flow forwarding rules, topology information, etc.
  - Plugins can register for change notification events, e.g., the Flow Routing Manager (FRM) registers for flow forwarding rules
Specification abstraction
Network programming languages
Many proposals (and implementations):
- Frenetic
- Pyretic
- Merlin
- Maple: algorithmic policies
- NetCore: predecessor to NetKAT
- NetKAT: Kleene algebra with tests (KAT)
There are more ...
SDN programming
Reading state from the network
- Switch counts pkts/bytes for every switch rule
- Controller can poll statistics, e.g., for traffic engineering
- Rules can be complex/complicated
  - "All traffic to 192.168.1.1 not targeted at port 80" needs two flow entries:
    1. dstip = 192.168.1.1, dstport = 80
    2. dstip = 192.168.1.1  # for monitoring
  - Predicates can be automatically translated: (dstip = 192.168.1.1) && (dstport != 80)
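The two-entry translation above can be sketched in a few lines. This is a toy model of my own (not a real compiler): the negated predicate becomes a higher-priority rule that "shadows" the excluded port-80 traffic, so the lower-priority rule only counts the remaining traffic to 192.168.1.1.

```python
# Toy sketch: compiling (dstip == "192.168.1.1") and (dstport != 80) into
# two prioritized exact-match rules, as on the slide. Rule names and the
# rule/packet encoding are hypothetical.

def compile_negation(dstip, excluded_port):
    # Higher priority number wins; rule 1 catches the excluded traffic,
    # rule 2 then matches (and counts) all other traffic to dstip.
    return [
        {"priority": 2, "match": {"dstip": dstip, "dstport": excluded_port},
         "action": "forward"},
        {"priority": 1, "match": {"dstip": dstip},
         "action": "count"},  # the monitoring rule
    ]

def lookup(rules, pkt):
    # Return the action of the highest-priority rule matching the packet
    for rule in sorted(rules, key=lambda r: -r["priority"]):
        if all(pkt.get(k) == v for k, v in rule["match"].items()):
            return rule["action"]
    return "drop"

rules = compile_negation("192.168.1.1", 80)
print(lookup(rules, {"dstip": "192.168.1.1", "dstport": 80}))   # forward
print(lookup(rules, {"dstip": "192.168.1.1", "dstport": 443}))  # count
print(lookup(rules, {"dstip": "10.0.0.1", "dstport": 80}))      # drop
```

The key design point is that OpenFlow tables cannot express negation directly, so priority ordering is used to carve the excluded case out of a broader match.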
Further useful primitives
- Grouping by feature: to construct histograms
  - Limited space in switch tables
  - Run-time can dynamically generate monitoring rules as traffic arrives
  - Example: GroupBy(srcip): one new rule per source IP
- Limit the number of events
  - E.g., more packets arriving while the OF rule is installed
  - Programmer specifies: Limit(1)
Frenetic: SQL-like query language
Frenetic: A Network Programming Language. Foster, N. et al., ACM SIGPLAN Notices 46.9 (2011)
Example:
  Select(bytes) *
  Where(in:2 & srcport:80) *
  GroupBy([dstmac]) *
  Every(60)
Every 60s, provide statistics on port-80 traffic arriving via switch port 2
- Familiar abstraction
- Result is a stream that can be further processed, e.g., to route heavy hitters differently
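To make the query's semantics concrete, here is a toy model of my own (not Frenetic's implementation) that evaluates the Select/Where/GroupBy part over a recorded packet trace, i.e., one 60-second window of the stream.

```python
# Toy model of: Select(bytes) * Where(in:2 & srcport:80) * GroupBy([dstmac])
# evaluated over one window of packets. The trace and field names are
# illustrative, not from the Frenetic runtime.

from collections import defaultdict

def run_query(packets):
    stats = defaultdict(int)
    for pkt in packets:
        if pkt["inport"] == 2 and pkt["srcport"] == 80:  # Where(in:2 & srcport:80)
            stats[pkt["dstmac"]] += pkt["bytes"]         # Select(bytes), GroupBy([dstmac])
    return dict(stats)

trace = [
    {"inport": 2, "srcport": 80,  "dstmac": "aa:aa", "bytes": 1500},
    {"inport": 2, "srcport": 80,  "dstmac": "bb:bb", "bytes": 500},
    {"inport": 1, "srcport": 80,  "dstmac": "aa:aa", "bytes": 900},  # wrong port
    {"inport": 2, "srcport": 443, "dstmac": "aa:aa", "bytes": 700},  # wrong srcport
    {"inport": 2, "srcport": 80,  "dstmac": "aa:aa", "bytes": 100},
]
print(run_query(trace))  # {'aa:aa': 1600, 'bb:bb': 500}
```

In the real system the runtime compiles the Where clause into switch rules and reads the byte counters, instead of inspecting packets in Python; Every(60) would emit one such dictionary per 60-second window.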
Applications interact
Modular SDN Programming with Pyretic. J. Reich, C. Monsanto, N. Foster, J. Rexford, and D. Walker. USENIX NSDI 2013
Applications specify only part of packet handling
Explanation of interaction problems
- Monolithic control programs need to interleave tasks; even a simple policy requires programmer's finesse:
  - Put traffic to 10.0.0.0/8 on output port 2
  - Poll statistics for traffic coming from the Internet every 5 seconds
  - Traffic over VLAN 12 should be load balanced every 2 minutes
- Hard to debug, test, reuse
- Problem cannot be solved with slicing of the network!
- Solution: modularization. But how can modules be combined to handle the same traffic?
Approach: Composition
Solve the combinatorial problem programmatically:
- Parallel composition: perform tasks in parallel
  monitor() + forward()
- Sequential composition: perform one task after the other
  firewall() >> switch()
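The two operators can be sketched in plain Python. This is my own toy encoding, not Pyretic's implementation: a policy maps one packet to a set of output packets; parallel composition (+) unions the outputs of both policies, and sequential composition (>>) feeds each output of the first policy into the second.

```python
# Toy model of Pyretic-style composition. Packets are encoded as sorted
# (field, value) tuples so they are hashable and fit in sets.

def pkt(**fields):
    return tuple(sorted(fields.items()))

def modify(p, **fields):
    d = dict(p)
    d.update(fields)
    return pkt(**d)

def match(**fields):
    # Filter policy: pass the packet through iff all fields match
    def f(p):
        d = dict(p)
        return {p} if all(d.get(k) == v for k, v in fields.items()) else set()
    return f

def fwd(port):
    return lambda p: {modify(p, outport=port)}

def parallel(p1, p2):   # p1 + p2: both policies see the packet
    return lambda p: p1(p) | p2(p)

def sequential(p1, p2):  # p1 >> p2: p2 processes every output of p1
    return lambda p: {o2 for o1 in p1(p) for o2 in p2(o1)}

# match(inport=2) >> (fwd(3) + fwd(4)): copy matching packets to ports 3 and 4
policy = sequential(match(inport=2), parallel(fwd(3), fwd(4)))
outs = policy(pkt(switch='s2', inport=2))
print(sorted(dict(o)['outport'] for o in outs))  # [3, 4]
```

Returning a set (possibly empty, possibly with several copies) is what lets drop, forwarding, and flooding share one interface, and is what makes the two composition operators compose cleanly.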
Virtualization with Pyretic
- Pyretic works on packets (actually, these are flow tuples)
- Virtual header fields: label pushing allows virtualizing the network
  - E.g., use NAT for tunneling traffic = push IP address: ['192.168.0.1'] -> ['10.0.0.1', '192.168.0.1']
- Useful for certain applications:
  - Anonymise a packet while the packet is inside the network
  - Transparently re-use address space
Topology abstraction
- Network objects virtualize the network topology
  - Map virtual network ports to underlying network ports
  - Perform forwarding between underlying ports
- Usage examples:
  - Big switch comprising the whole network
  - Split one switch into many virtual switches
  - Work on a subgraph of the actual topology
- Implementation can make use of packet virtualization
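The port-mapping idea behind the "big switch" can be sketched directly. This is a hypothetical illustration (the mapping table and function names are my own): all edge ports of the physical network appear as ports of one virtual switch V, and a translation table maps between the two views.

```python
# Toy sketch of a "big switch" topology abstraction: every edge port of the
# physical network becomes a port of one virtual switch V.

V_TO_PHYS = {1: ('A', 3), 2: ('B', 1), 3: ('C', 2)}  # virtual port -> (switch, port)
PHYS_TO_V = {v: k for k, v in V_TO_PHYS.items()}

def ingress(switch, port):
    # Physical packet arrival -> the virtual view on switch V
    return {'vswitch': 'V', 'vinport': PHYS_TO_V[(switch, port)]}

def egress(vpkt, voutport):
    # Virtual forwarding decision -> the physical switch/port to emit on
    return V_TO_PHYS[voutport]

vpkt = ingress('A', 3)
print(vpkt)             # {'vswitch': 'V', 'vinport': 1}
print(egress(vpkt, 2))  # ('B', 1)
```

What this sketch leaves out is the fabric: getting the packet from switch A to switch B across the internal links is the job of a separate routing module, which is exactly the split the next slide's figure shows.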
Example: Big Switch
[Figure: a MAC-learning switch module runs on virtual switch V; edge links in the underlying network correspond to ports of the virtual switch; packet ingress/egress maps underlying headers { switch: A, inport: 3 } to the virtual view { vswitch: V, ... }; a switching fabric module handles routing in the virtual switch/fabric across the underlying switches A, B, C.]
Constructing policies
Policies control packet delivery.
Actions:      A ::= drop | identity | fwd(port) | flood | push(h=v) | pop(h) | modify(h=v)
Filters:      P ::= all_packets | no_packets | match(h=v) | ingress | egress | P & P | (P | P) | ~P
Queries:      Q ::= packets(limit, [h]) | count_packets(every, [h]) | count_bytes(every, [h])
Composition:  C ::= A | Q | P[C] | (C + C) | C >> C | if_(P, C, C)
Primitive Actions
Actions receive a packet with location information as input and return a set of located packets as output.
Example: input is { switch: A, inport: 3 }, output could be {{ switch: A, outport: 4 }}.
- drop produces the empty set (no packet is output)
- passthrough produces the input packet
- fwd(port) changes the outport
- flood floods the packet using a minimum spanning tree
- push, pop, move change packet value stacks
Predicates
- Needed to define conditions on packets
- If C is a policy and P is a predicate, then P[C] means to apply C to all packets for which P is true
- all_packets, no_packets return true or false, respectively
- ingress, egress return true if the packet enters (leaves) the network
- match(h=v) returns true if header h has value v
- P & P, P | P, ~P: composition of predicates:
  - Conjunction: P1 & P2 is true if both P1 and P2 are true
  - Disjunction: P1 | P2 is true if at least one of P1 and P2 is true
  - Negation: ~P is true if P is false
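The predicate combinators above map naturally onto Python's operator overloading. A toy encoding of my own (not Pyretic's classes): a predicate is a boolean function on a packet dictionary, and `__and__`, `__or__`, `__invert__` implement &, |, ~.

```python
# Toy encoding of predicates as boolean functions on packets, with the
# &, |, ~ combinators from the slide.

class Pred:
    def __init__(self, fn):
        self.fn = fn
    def __call__(self, pkt):
        return self.fn(pkt)
    def __and__(self, other):   # P1 & P2: conjunction
        return Pred(lambda p: self(p) and other(p))
    def __or__(self, other):    # P1 | P2: disjunction
        return Pred(lambda p: self(p) or other(p))
    def __invert__(self):       # ~P: negation
        return Pred(lambda p: not self(p))

def match(**fields):
    return Pred(lambda p: all(p.get(k) == v for k, v in fields.items()))

all_packets = Pred(lambda p: True)
no_packets = Pred(lambda p: False)

# "Traffic to 192.168.1.1 not targeted at port 80", as one predicate:
pred = match(dstip='192.168.1.1') & ~match(dstport=80)
print(pred({'dstip': '192.168.1.1', 'dstport': 443}))  # True
print(pred({'dstip': '192.168.1.1', 'dstport': 80}))   # False
```

A runtime can then compile such predicate trees down to prioritized flow-table rules, as in the earlier monitoring example.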
Query Policies
- Direct information from the physical network to the controller
- Packets aren't moved to a physical port on a physical switch; rather, they are put into buckets on the controller
  - counts: packet goes to a counts bucket
  - packets: packet goes to a packets bucket
- Applications are informed about the arrival of packets (packets: entire packets; counts: packet counts)
- packets(1, ['srcip']) passes on each packet with a previously unseen source address
- counts(every, ['srcip']) calls listeners every `every` seconds with the number of times each source IP has been seen
Policy Composition
- Policies can be simple actions (flood etc.), or query policies (not discussed here), or conditional policies P[C]: applies policy C if the packet satisfies predicate P
- Composed so that they either act in parallel: C1 + C2 (C1 and C2 together), or in sequence: C1 >> C2 (first C1, then C2)
- Or conditionally: if_(P, C1, C2) (if P, then C1, else C2)
Policies by Example (1)
- Broadcast every packet entering the network:
  flood
- Broadcast all packets entering switch s2 on port 3:
  match(switch=s2, inport=3)[flood]
- Drop all HTTP packets on switch s3:
  match(switch=s3, dstport=80)[drop]
- Forward packets on s2 from port 2 to port 3:
  match(switch=s2, inport=2)[fwd(3)]
- Drop all packets matching some predicate P:
  if_(P, drop, passthrough)
Policies by Example (2)
- Forward packets from port 2 to port 3 if they fulfill some predicate P:
  (match(switch=s2, inport=2) & P)[fwd(3)]
  if_(P, passthrough, drop) >> match(switch=s2, inport=2)[fwd(3)]
- The combination of if_, drop, and passthrough is very convenient for composing policies!
Examples (1)

    from pyretic.lib import *

    def main():
        return flood
Examples (2)

    from pyretic.lib import *

    def printer(pkt):
        print pkt

    def dpi():
        q = packets(None, [])
        q.when(printer)
        return match(srcip='1.2.3.4')[q]

    def main():
        return dpi() + flood
Examples (2), Detail
- q = packets(None, []): process ALL the packets!
- q.when(printer): when I get a packet, print it!
- match(srcip='1.2.3.4')[q]: print all packets matching source IP 1.2.3.4
Examples (3)

    from pyretic.lib import *

    def learn(self):
        def update(pkt):
            self.p = if_(match(dstmac=pkt['srcmac'], switch=pkt['switch']),
                         fwd(pkt['inport']),
                         self.p)
        q = packets(1, ['srcmac', 'switch'])
        q.when(update)
        self.p = flood + q

    def main():
        return dynamic(learn)()

The first time you see a new source MAC on a switch, update the policy
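The learning logic can be illustrated without Pyretic. This standalone sketch of my own keeps the same behaviour in a plain table: the first packet from a new (switch, srcmac) pair records the ingress port, later packets destined to a learned MAC are forwarded to that port, and everything else is flooded.

```python
# Standalone sketch of MAC learning (plain Python, no Pyretic).

table = {}  # (switch, mac) -> port where that MAC was last learned

def on_packet(pkt):
    # Learn: remember where this source MAC was first seen on this switch
    table.setdefault((pkt['switch'], pkt['srcmac']), pkt['inport'])
    # Forward: known destination -> its learned port, else flood
    return table.get((pkt['switch'], pkt['dstmac']), 'flood')

print(on_packet({'switch': 'A', 'srcmac': 'm1', 'dstmac': 'm2', 'inport': 1}))  # flood
print(on_packet({'switch': 'A', 'srcmac': 'm2', 'dstmac': 'm1', 'inport': 2}))  # 1
print(on_packet({'switch': 'A', 'srcmac': 'm1', 'dstmac': 'm2', 'inport': 1}))  # 2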
Example (3): Decision tree
With every new MAC address, one new decision is inserted at the root of the tree: each update wraps the previous policy self.p inside a new if_.
Pyretic limitations
- Dynamicity is limited: re-routing flows is not easily possible. What does this imply for link failures?
- Only available for POX. Multiple controllers?
- But still under development
Network OS and controller conclusion
- Abstractions are essential to simplify management of complex systems
- SDN = forwarding, specification, distribution abstractions
- Network operating systems
  - Simplify managing a network by providing interfaces
  - Don't control the network themselves; apps do that!
- Line between NOS and controller is not sharp
  - Floodlight, OpenDaylight, and others call themselves controllers, but provide more services than, e.g., NOX
  - A NOS is just a special type of controller