Software Defined Networking Advanced Computer Network Technologies Petr Grygarek, 2014
Some buzzwords related to the SDN concept
- Server/network virtualization: virtual FW, LB, VPN concentrator, ...; network APIs for orchestration
- Controller-driven network: Cisco OER/PfR ("SDN 0.9"), OpenFlow, Cisco Application Policy Infrastructure Controller (APIC)
- Automatic server/network deployment: OpenStack, VMware, KVM, ...
- Virtual network appliances: VMware vShield/vPath, VMware NSX, Juniper SDN distributed routing
- Orchestration: integration with other service provider systems
- Application-level overlay networks: virtual control plane + virtual data plane, e.g. VXLANs
  - Extreme approach: network provides just L2 or L3 connectivity, everything else is overlaid
- IaaS cloud ;-)
What is SDN?
- Network created and managed using an abstract model and APIs
- Customized control plane
  - Control and state logically centralized; automatic topology discovery
  - Custom forwarding logic, online traffic engineering, security enforcement, service chaining, Path Computation Element (PCE)
  - Flow-level routing control granularity
- Programming of data plane forwarding engines
  - Controller-agent model
  - Latency, scalability and availability issues of control traffic to be taken into consideration
  - Hybrid solution may be interesting: classical distributed control partially optimized by external control logic (hybrid model)
- Network-as-a-Service
  - Virtualization of network devices and network topology
  - Common abstraction model and model API
  - Part of a comprehensive orchestration platform (compute + storage + network services)
- Network integration with applications
- SDN controller integrations
  - Northbound: API provided to applications
  - Southbound: utilizes network devices' programming APIs (OpenFlow, NETCONF, ...)
- Normalisation of network control through open APIs of individual network devices as well as of the network as a whole
SDN Advantages
- Network may adapt to application needs
- Easier automation & orchestration
- Reduced deployment time
- Flexibility
- Cost reduction
Traditional vs. SDN Architecture
Examples of Application-Network Interactions
- Influence network control plane
- Utilize network state info, e.g. proximity API
- Automated provisioning of computational elements in a software-defined network environment
- Customer self-service cloud portals, orchestrators
Example: Adapting an application for SDN
- Automatically scalable applications: pools of equal-functionality components (of various types)
- Application may automatically ask to deploy additional components to scale based on required performance
- Failure of one component does not hurt
- Spine-and-leaf network architecture: L3 preferred, ECMP routing
OpenFlow Note: Figures in this section were taken from official OpenFlow specification, version 1.4.0
OpenFlow
- Original aim: allow researchers to run experimental protocols on real HW without exposing vendor-specific OS and HW internals
- OF controller + dumb HW devices + OF protocol
- In the IT industry, it allows 3rd parties to provide control plane logic independent of forwarding engines
- Customizable forwarding logic (OF controller): programmed into the OF switch based on received data packets, or proactively
- A single HW engine can act as a switch, router, firewall, load balancer or another network component, purely according to the controller implementation
- Easy implementation of new protocols and features
OpenFlow Components
Nothing new under the Sun
- Can you remember Cisco Multilayer Switching?
- BUT: the forwarding engine is only loosely coupled with the controller (TCP/IP connection), and the protocol between forwarding engine and controller is completely standardized and open to anybody
- BUT: we have to take the higher latency between controller and OF switch into account this time
OpenFlow Switch Internals
- One or more flow tables (pipeline): packet matching, manipulation and forwarding
  - Numbered sequentially, starting from 0
  - Packet processing stops when no next table # is associated by flow entry instructions
  - GoTo instruction may only go forward (to a table with a higher seq #)
  - Metadata may be passed together with the packet between flow tables
- Group table: kind of macros; entries contain action buckets
- OpenFlow protocol processor
  - Processes control messages from the controller: add/update/delete flow table entries, ...
  - Sends asynchronous informational control messages to the controller
  - Sends data packets to/from the controller
- Meter table: flow rate-limiting
OF Switch Tables
OpenFlow-hybrid switches
- Network may be sliced so that only part of the ports are controlled by OF
- Classification criteria may be defined to forward ingress packets either to the OF pipeline or to process them normally: physical port number, VLAN tag, ...
- NORMAL and FLOOD logical ports are available to send packets from the OF pipeline to standard switch processing
Flow Table Entries
- Priority: defines search order; only the first matching entry applies
  - Table-miss entry applied if no match is found: wildcard on all fields, priority of 0
  - Packet dropped if the table-miss entry is not present
- Match fields: always matched against the packet's current state (i.e. previous modifications taken into account); bitwise 3-state comparison masks
- Packet manipulation/flow/forwarding instructions
- Meter (rate limiting / packet marking)
- Idle timeout, hard timeout
  - Flow removal handled by the OF switch itself based on timeouts
  - When a flow is removed, a message is sent to the controller (with flow statistics)
- Counters (matched packets, bytes)
- Cookie: used by the controller to identify a subset of entries when manipulating groups of entries; maskable bit-oriented field; not used for packet processing
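The lookup described above (highest priority wins, priority-0 wildcard entry as the table-miss) can be sketched in a few lines of Python. This is an illustrative model, not switch code; the field names and entry layout are assumptions:

```python
def lookup(flow_table, packet):
    """Return the highest-priority entry whose match fields all agree
    with the packet; None means drop (no table-miss entry present)."""
    best = None
    for entry in flow_table:
        # A wildcarded field (value None) matches anything; a field
        # omitted from the match dict is wildcarded as well.
        if all(v is None or packet.get(k) == v
               for k, v in entry["match"].items()):
            if best is None or entry["priority"] > best["priority"]:
                best = entry
    return best

table = [
    {"priority": 100, "match": {"eth_dst": "aa:bb"}, "actions": ["output:2"]},
    {"priority": 0,   "match": {},                   "actions": ["controller"]},  # table-miss
]
```

A known destination hits the priority-100 entry; anything else falls through to the table-miss entry, and an empty table drops the packet.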
OF Entry Matching Criteria
- Ingress ports: logical & physical
- Header fields (L2-L4) and some others: MPLS, ARP
- Both IPv4/IPv6 supported, including IPv6 extension headers
- Metadata
- Specified in OpenFlow Extensible Match format (OXM), TLV-based
Metadata
- Treated as a 64-bit word of bit flags
- new metadata = (old metadata & ~mask) | (value & mask)
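The Write-Metadata update rule above is easy to verify in Python; only the bits selected by the mask are replaced, the rest of the 64-bit word is preserved:

```python
MASK64 = (1 << 64) - 1  # metadata is a 64-bit field

def write_metadata(old, value, mask):
    """new = (old & ~mask) | (value & mask), kept within 64 bits."""
    return ((old & ~mask) & MASK64) | (value & mask)
```

For example, with old = 0xFF, value = 0x0A and mask = 0x0F, only the low nibble is rewritten, giving 0xFA.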
Flow Table Entry Instructions
- Pipeline processing
  - Modify metadata associated with the packet (Write-Metadata)
  - Send packet to a subsequent table (Goto-Table)
- Apply actions
  - Actions to be taken immediately (Apply-Actions): performed in the order specified by the list
  - Add action to the action set to be processed when the packet leaves the pipeline (Write-Actions/Clear-Actions)
    - An existing action of the same type is always overwritten
    - Use Apply-Actions if multiple operations on the same field are needed (e.g. push multiple labels)
    - Actions are processed in the order given by the specification (not in the order of adding into the action set)
- Apply meter
- Packet is dropped if the instruction set is empty
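The difference between the two action semantics can be sketched as follows. This is a simplified model: the "spec order" list is abbreviated and the dict-based action representation is an assumption, not the wire format:

```python
# Abbreviated stand-in for the execution order the specification defines
# for the action set (the real spec lists more action types).
SPEC_ORDER = ["copy_ttl_in", "pop", "push", "dec_ttl", "set_field",
              "qos", "group", "output"]

def write_actions(action_set, new_actions):
    """Write-Actions semantics: at most one action per type;
    a new action of the same type overwrites the existing one."""
    for act in new_actions:
        action_set[act["type"]] = act
    return action_set

def finalize(action_set):
    """When the packet leaves the pipeline, the action set executes in
    the order given by the specification, not in insertion order."""
    return [action_set[t] for t in SPEC_ORDER if t in action_set]
```

Note how two successive writes of an `output` action leave only the last one, while Apply-Actions (a plain list executed in order) would have kept both.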
Actions
- Set-Field: set header field
- Push/Pop tag: 802.1Q VLAN header, MPLS header, PBB service instance I-TAG
- Change TTL: IP/MPLS TTL; increment, decrement, set, copy inwards, copy outwards
- Send packet to controller
- Forward packet to a physical or logical port
- Output: send (new) packet to an output port
- Set-Queue: set packet's queue ID on the output port
- Group: process packet using the specified action group
- Drop packet
Action Groups
- Groups of actions may be referenced from multiple entries and may be changed independently of the entries pointing to them
- A group may contain multiple action buckets
Group Table
- Group ID
- Group type: defines which action buckets are executed
  - All: all buckets (broadcast/multicast packet)
  - Indirect: only a single bucket is defined
  - Select: switch selects one of the buckets (internal hash function or weighted round-robin)
  - Fast Failover: execute the first live bucket
- Hit counters: per group and per group's bucket
- Action buckets: a bucket may be associated with a port/group whose status determines the bucket's liveness
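The four group types above boil down to different bucket-selection rules. A minimal sketch, with CRC32 standing in for the switch-internal hash and a `liveness` map standing in for port status (both assumptions for illustration):

```python
import zlib

def pick_buckets(group, packet_key=b"", liveness=None):
    """Return the list of action buckets to execute for one packet."""
    gtype, buckets = group["type"], group["buckets"]
    if gtype == "all":
        return buckets                      # replicate to every bucket
    if gtype == "indirect":
        return [buckets[0]]                 # the single defined bucket
    if gtype == "select":
        # one bucket chosen by a switch-internal hash over the packet
        return [buckets[zlib.crc32(packet_key) % len(buckets)]]
    if gtype == "fast_failover":
        for b in buckets:                   # first live bucket wins
            if liveness.get(b["watch_port"], False):
                return [b]
        return []                           # no live bucket: packet dropped
```

With a fast-failover group, the same packet is redirected the moment the watched port of the first bucket goes down, without any controller round-trip.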
OF Switch Ports
- Physical ports: HW interfaces of the switch
  - OF specification provides a unified, detailed view of L1 properties and statistics for the controller, including peer capabilities received via autonegotiation
- Logical ports: port-channels (LAG), tunnels, loopbacks, ...
  - Packets coming from / destined to a logical port carry Tunnel-ID metadata
  - When passed to the controller, both the logical and the physical ingress port are identified
- Reserved ports: define forwarding actions
  - ALL (out): flood to all ports (except ingress)
  - CONTROLLER: packet sent to / received from the OF controller
  - TABLE (out): send packet to the first table in the OF pipeline
  - IN_PORT (out): send packet back via the ingress port
  - LOCAL: local networking stack
  - Hybrid switches only: NORMAL (out): forward using normal switch logic; FLOOD (out): flood from all non-OF ports (except ingress)
Meter Table (rate limiting)
- Contains per-flow meter entries; useful for QoS implementations
- May be referenced from flow table entries' instruction sets
  - If referenced from multiple flow entries, the meter measures aggregate metrics
- Entries contain Meter ID, meter bands, hit counters (per meter and per meter band)
- Multiple bands (rates) may be defined in a single meter; for each band an action is specified: DSCP remark / drop
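The band-selection logic can be sketched as: the band that applies is the highest-rate band the measured rate currently exceeds. A simplified model (the kbps units and dict layout are illustrative assumptions; a real switch measures the rate itself):

```python
def apply_meter(meter, measured_rate_kbps):
    """Return the action of the highest-rate band the measured rate
    exceeds ('drop' or 'dscp_remark'); None means the packet passes."""
    hit = None
    for band in sorted(meter["bands"], key=lambda b: b["rate"]):
        if measured_rate_kbps > band["rate"]:
            hit = band
    return hit["action"] if hit else None

meter = {"id": 1, "bands": [
    {"rate": 1000, "action": "dscp_remark"},   # soft limit: remark
    {"rate": 2000, "action": "drop"},          # hard limit: drop
]}
```

Traffic under 1 Mbps passes untouched, traffic between the two rates gets remarked, and traffic over 2 Mbps is dropped.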
Counters
- Per flow table: # entries, # lookups, # matches
- Per flow entry: received packets/bytes, duration
- Per port: packets/bytes RX+TX, # of errors of various types
- Per queue: TX packets/bytes, duration
- Per action group: reference count, packet/byte count, duration
- Per action group bucket: packet/byte count
- Per meter: packet/byte count, duration
- Per meter band: packet/byte count
Flow Table Maintenance
- Flow table synchronization: automatic update of a flow table to reflect changes done in another table
- Flow entry eviction: mechanism of discarding the oldest flow entries in case the table is full
  - May be turned on/off for each flow entry
  - Flow importance field may influence the eviction process
Special pipeline processing
- OF switch may be instructed to reassemble an IP packet from fragments before sending it to the pipeline
- An action may request the packet to be buffered
  - Only a (configurable) part of the packet is sent to the controller, with a buffer ID attached
  - The controller may then reference the packet by buffer ID instead of sending it back and forth over the OF channel (mostly for the Packet-Out operation)
  - Buffers automatically expire
OpenFlow Channel
OpenFlow Protocol
- OF messages carry: OF switch configuration requests; events from the OF switch to the controller; data packets passed from the OF switch to the controller and packets injected by the controller into the OF switch
- Various transports allowed, mostly TCP or TCP/TLS (port 6633)
  - Separate (out-of-band) TCP/IP network for communication between controller(s) and OF switches
  - Reliable transport is expected
- Both synchronous and asynchronous messages; requests/replies paired using XID; not processed by the OF pipeline
- Connection initiated by the OF switch
- Optional auxiliary connections: same pair of OF switch and controller; better utilization of a parallel OF switch implementation; may use a different transport
- Switch may optimize the order of processing of received messages
  - A Barrier message may be used to request the switch to process all pending messages before proceeding
  - A barrier should be placed between messages that depend on one another
- Message bundling: atomic modifications (either all changes are applied together or none of them is applied); a transaction may even span multiple OF switches
- In case of an OF channel break, the OF switch may start to behave as a standard switch (i.e. send all packets to the NORMAL port)
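Every OpenFlow message, whatever its transport, starts with the same fixed 8-byte header: version (1 byte), message type (1 byte), total length (2 bytes) and the XID (4 bytes) used to pair requests with replies. A minimal sketch of building a bare Hello for OpenFlow 1.4 (wire version 0x05):

```python
import struct

OFP_VERSION_14 = 0x05   # wire version number for OpenFlow 1.4
OFPT_HELLO = 0          # message type 0 = Hello

def ofp_message(msg_type, payload=b"", xid=1):
    """Prefix a payload with the common 8-byte OpenFlow header:
    version, type, total length (header + payload), transaction id."""
    return struct.pack("!BBHI", OFP_VERSION_14, msg_type,
                       8 + len(payload), xid) + payload

hello = ofp_message(OFPT_HELLO)   # a Hello with no body is just the header
```

The `!` in the format string gives network (big-endian) byte order, as the specification requires.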
OpenFlow Protocol Messages: Controller to Switch
- Features: request for OF switch identity and list of supported features
- Configuration: query/set OF switch configuration parameters
- Modify-State: add/delete/modify flow entries, group entries and meters; set switch ports; includes group modify based on match criteria
- Read-State
- Packet-Out: packet data + input port + actions, OR buffer ID
- Barrier: used to set up partial message ordering
- Role-Request: used to manage controllers' HA
  - Controller may ask the OF switch to gain a specific role: equal, or master/slave controller roles
  - A slave cannot modify switch state nor receive async messages
  - Role handover between redundant controllers is out of scope of the OF specification
- Asynchronous-Configuration: set filter for messages asynchronously sent by the OF switch
OpenFlow Protocol Messages: Switch to Controller (asynchronous)
- Packet-In: data packet from OF switch -> controller
- Flow-Removed: informs about removal based on timer expiration or an explicit DELETE from the controller, for flow entries with the OFPFF_SEND_FLOW_REM flag
- Port-Status: port status or port configuration changed
General symmetric messages
- Hello: control channel keepalives
- Echo request/reply: manual switch/controller liveness check
- Error: request processing failure
- Experimenter: standard way to pass arbitrary info; development of OF protocol extensions
Example OF Controller Implementations
- Self-learning switch
- IP router with RIP
- Stateless firewall
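The first example, the self-learning switch, is the "hello world" of OF controllers. A framework-agnostic sketch of the core logic (real controllers such as Ryu or POX wrap this in Packet-In handlers and install Flow-Mod entries so later packets never reach the controller; the class and method names here are illustrative):

```python
FLOOD = "flood"   # stands in for the reserved FLOOD/ALL output port

class LearningSwitch:
    """Core logic of a self-learning switch implemented on a controller:
    learn source MAC -> ingress port from each Packet-In, forward to the
    learned port for known destinations, flood otherwise."""

    def __init__(self):
        self.mac_table = {}            # MAC address -> switch port

    def packet_in(self, src_mac, dst_mac, in_port):
        """Handle one Packet-In; return the output-port decision."""
        self.mac_table[src_mac] = in_port          # learn/refresh source
        return self.mac_table.get(dst_mac, FLOOD)  # known dst or flood
```

After host B replies to host A, both MACs are learned and subsequent traffic in either direction is forwarded to a single port instead of being flooded.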
OpenStack
Motivation - Goal
- Datacenter environment for rapid service deployment
- True cloud with native SDN support
  - Including SW-based network services to limit the need for physical devices: virtual routers, FWaaS, LBaaS, VPNaaS, ...
  - Smooth integration of HW-based devices still needed
- Automated deployment of the whole virtualized network (including security rules), customized OS and preconfigured applications
  - Automation inherent in the solution, not just 3rd-party tools to automate deployment on traditional network architectures
  - Including zero-config physical capacity server implementation
- Scalability of the compute, network and storage platform
- Elastic cloud with complete tenant isolation
- Support for new horizontally scalable applications: network needs to support prevailing horizontal traffic
- End-user self-service computing, network, storage
- Limited requirements on network device capabilities, vendor neutrality
What is OpenStack?
- Open-source public/private cloud platform with full-scope integrated coverage of computing, network and storage capacity
- Automation is the core objective, inherently built into all features on all layers of the OpenStack infrastructure
- Well-defined service APIs
- Built-in tools and methods to communicate with cloud-managing applications, providing location information, load data etc.
OpenStack Scope
Why OpenStack as an SDN solution?
- Vendor neutral
- Open source: managed by the non-profit OpenStack Foundation, developed by the community with strong partner support
- Widely accepted: cloud infra (hosting) providers, traditional network device vendors and niche players
OpenStack Components (subprojects)
- OpenStack Compute (code name Nova)
- OpenStack Networking (code name Quantum)
- OpenStack Block Storage (code name Cinder): corresponds to SAN services
- OpenStack Object Storage (code name Swift)
- OpenStack Image Service (code name Glance): OS images
- OpenStack Identity (code name Keystone)
- OpenStack Dashboard (code name Horizon)
OpenStack Server Node Types
- Compute (Nova): VM scheduling and placement; bare metal or various hypervisors
- Network: pluggable, API-driven networking; L2 over L3 (GRE/VXLAN) replaces VLANs everywhere; DHCPaaS
- Controller
Solution Benefits (1): Network
- Traditional L3 network transport only is recommended
  - No complicated L2 extension technologies to implement a multisite setup
  - Easier network management
- Leaf & spine topology beneficial because of inherent network scalability (ECMP) and high-availability support
  - Linear scalability (ports / costs), no upfront overinvestment
- No manual network configuration changes needed when implementing a new customer or new datacenter segments for a customer
  - No ineffective interactions in customer setup deployment implemented by multiple platform-oriented teams, including network-side / server-side switching configuration issues
- With SW-based routers, customer-specific addresses may be automatically propagated to the outside world
- Handles dynamic assignment of public IP addresses (floating IPs), including NAT; DHCP integrated
- Network device vendor plugins facilitate integration with the external network
Solution Benefits (2): Applications/Services
- New emerging type of application with horizontal auto-scaling may be hosted effectively
  - Elastic cloud applications, or the OpenStack platform itself, control dynamic spawning of workload VMs in a standard way
- The whole customer computing environment (including virtualized network infra) may be developed and transferred between a development environment, a private cloud and a public cloud using the open OpenStack API
OpenStack Requirements on the Underlying Network
- In general, only a single L3 network segment is required for all tenants' data traffic, plus a couple of preconfigured shared system VLANs for control/management
- Good throughput & scalability
  - Leaf & spine architecture is a common practice but not an absolute must ("pets & cattle" approach)
- Standard Equal-Cost Multipath (ECMP) L3 core with a traditional routing protocol like OSPF, IS-IS or EIGRP fits best
- No problems with the number of supported VLANs or STP stability; no expensive multi-site VLAN extension technologies like VPLS or TRILL/FabricPath
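The reason an ECMP L3 core scales so well for east-west traffic is flow-based load sharing: the router hashes the 5-tuple so every packet of one flow takes the same path (no reordering) while different flows spread across the equal-cost links. An illustrative sketch, with CRC32 standing in for whatever hash the router's forwarding hardware uses:

```python
import zlib

def ecmp_next_hop(src_ip, dst_ip, src_port, dst_port, proto, paths):
    """Pick one of the equal-cost paths by hashing the flow's 5-tuple;
    deterministic per flow, spread across flows."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return paths[zlib.crc32(key) % len(paths)]
```

Two packets of the same TCP connection always get the same spine, while a different connection between the same hosts may be hashed onto another one.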
Integrations with traditional networking
- VXLAN gateway
- Automated configuration of external connections
- Floating IP propagation by a dynamic routing protocol
- Configuration of MPLS/VPN VRF instances, including VPNaaS
3rd-party enhancements
- Distributed software-based routing (Open vSwitch replacement)
- Pluggable network services (VMs or physical devices)
- Commercial OpenStack plugins: LBaaS, FWaaS, ...
References
OpenFlow
- https://www.opennetworking.org/images/stories/downloads/sdn-resources/onf-specifications/openflow/openflow-spec-v1.4.0.pdf
- http://mininet.org/
OpenStack
- http://docs.openstack.org/training-guides/content/index.html
- http://www.slideshare.net/openstack/intro-grizzlyarchv1-19109550