On the Effect of Forwarding Table Size on SDN Network Utilization
Rami Cohen, IBM Haifa Research Lab
Liane Lewin-Eytan, Yahoo Research, Haifa
Seffi Naor, CS, Technion, Israel
Danny Raz, CS, Technion, Israel
Overview: SDN
Traditional network architecture:
  The data plane and the control plane are collocated.
  Control packets are sent in-band, and based on these packets the switches configure/update their FIB/RIB.
  (Figure: every switch runs its own control stack: VLAN, TRILL, SNMP, ACL, MPLS, OSPF, analytics, RIP, BGP, RSVP, ...)
SDN architecture:
  The data plane and the control plane are decoupled.
  A centralized controller is used to configure the FIB.
  A common configuration protocol: OpenFlow.
  (Figure: an SDN controller connected to an OpenFlow agent on each switch.)
Overview: SDN (cont.)
In SDN, the controller has a global view of the network topology.
  This enables fine-granularity (e.g., per-session) flow configuration that considers global and local constraints.
  A set of demands can be satisfied by solving a network flow optimization problem.
But what about the flow table?
(Figure: example network of hosts and switches managed by an SDN controller, with per-pair demands.)
Overview: TCAM
Compares a data word against a predefined set of rules in a single operation.
  Returns the action (or address) associated with the first match.
  Each rule consists of ternary bits (0, 1, or "don't care").
Common usage: hardware-based packet classification and flow tables.
  Specific header fields (e.g., the destination address) are compared against rules that reflect the flow table.
(Figure: example TCAM table with Match and Action columns, where each rule maps to an output port.)
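The first-match semantics described above can be sketched in software. This is only an illustration: a real TCAM compares the key against all rules in parallel in hardware, whereas the loop below scans them in priority order. The 4-bit table and the port names are hypothetical and are not taken from the slide's figure.

```python
def matches(rule: str, key: str) -> bool:
    """True if every rule bit equals the key bit or is '*' (don't care)."""
    return len(rule) == len(key) and all(r in ("*", k) for r, k in zip(rule, key))


def tcam_lookup(table, key):
    """Return the action of the first matching rule, or None if nothing matches."""
    for rule, action in table:
        if matches(rule, key):
            return action
    return None


# Hypothetical table: rule order encodes priority, the last rule is a catch-all.
TABLE = [
    ("1001", "port 1"),
    ("10**", "port 2"),
    ("****", "port 3"),
]

print(tcam_lookup(TABLE, "1001"))  # port 1 (exact rule wins over the wildcards)
print(tcam_lookup(TABLE, "1010"))  # port 2
print(tcam_lookup(TABLE, "0110"))  # port 3
```

Each tuple in the table stands for one TCAM entry, which is the resource the rest of the talk treats as limited.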
Overview: TCAM (cont.)
There's no such thing as a free lunch: TCAM is expensive.
  Silicon space (== power == $)
  Switch TCAM size is a limited resource.
Each flow crossing a switch requires at least one TCAM entry.
The number of flows crossing a switch should be taken into account.
(Figure: the example network, showing the flows that cross each switch.)
Model and Problem Definition
The Bounded Path Degree Max Flow Problem
Given:
  A graph $G = (V, E)$
  Link capacity $c(e)$ for each $e \in E$
  A set of pairs $(s_i, t_i)$ and demands $d_i$ (one associated with each pair)
  Node flow table size $b(v)$ for each $v \in V$
Objective: find the maximum feasible flow between all pairs.
Feasible means:
  1. Do not exceed link capacity
  2. Do not exceed demand
  3. Do not exceed node flow table size
Linear Problem Formulation
Given a path $p$, denote by $c(p)$ the capacity of the path, namely the capacity of the bottleneck edge in the path:
  $c(p) = \min_{e \in p} c(e)$
$x(p)$ is the fraction of the path capacity that is used; namely, $x(p) \cdot c(p)$ is the flow sent through $p$.
$b(v)$ denotes the maximum number of paths that can cross a node (i.e., switch) $v$.
  This bound is derived from the TCAM size.
(Figure: an example path whose bottleneck edge determines $c(p)$, used with $x(p) = ½$.)
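As a small illustration of these definitions, the sketch below computes the bottleneck capacity c(p) and the flow x(p)·c(p) for one path. The node names and capacities are made up for the example, and edges are assumed to be directed and keyed by (u, v) tuples.

```python
def path_capacity(path, capacity):
    """c(p): the minimum capacity over the edges of the path."""
    return min(capacity[e] for e in zip(path, path[1:]))


def path_flow(x_p, path, capacity):
    """x(p) * c(p): the flow sent through p when a fraction x(p) is used."""
    return x_p * path_capacity(path, capacity)


# Hypothetical path a -> b -> c -> d with made-up link capacities.
capacity = {("a", "b"): 8, ("b", "c"): 3, ("c", "d"): 5}
path = ["a", "b", "c", "d"]

c_p = path_capacity(path, capacity)    # 3, the bottleneck edge (b, c)
flow = path_flow(0.5, path, capacity)  # 1.5
print(c_p, flow)
```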
Linear Problem Formulation (cont.)
Here $P(s_i, t_i)$ denotes the set of paths between $s_i$ and $t_i$.
Maximize the flow:
  $\max \sum_{i} \sum_{p \in P(s_i, t_i)} x(p)\, c(p)$
Don't exceed the edge capacity, for each edge $e$:
  $\sum_{p : e \in p} x(p)\, c(p) \le c(e)$
Don't exceed the flow demand, for each pair $(s_i, t_i)$:
  $\sum_{p \in P(s_i, t_i)} x(p)\, c(p) \le d_i$
For each path $p$:
  $0 \le x(p) \le 1$
Linear Problem Formulation (cont.)
Here $P(s_i, t_i)$ denotes the set of paths between $s_i$ and $t_i$, and $0 \le x(p) \le 1$ for each path $p$.
Maximize the flow:
  $\max \sum_{i} \sum_{p \in P(s_i, t_i)} x(p)\, c(p)$
Don't exceed the edge capacity, for each edge $e$:
  $\sum_{p : e \in p} x(p)\, c(p) \le c(e)$
Don't exceed the flow demand, for each pair $(s_i, t_i)$:
  $\sum_{p \in P(s_i, t_i)} x(p)\, c(p) \le d_i$
Don't exceed the switch flow table size, for each node $v$:
  $\sum_{p : v \in p} x(p) \le b(v)$
* The model and all the results also apply to the weighted version of the problem, where each flow demand $d_i$ has a weight $w_i$.
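To make the constraints concrete, here is a minimal sketch of a checker that evaluates a candidate assignment x(p) against the LP. It does not solve the LP. The function name, the dictionary layout, and the two-pair example are assumptions made for illustration, and edges are treated as directed (u, v) tuples.

```python
from collections import defaultdict


def path_capacity(path, capacity):
    """c(p): capacity of the bottleneck edge of the path."""
    return min(capacity[e] for e in zip(path, path[1:]))


def evaluate(paths, x, capacity, demand, table_size, eps=1e-9):
    """Return (total_flow, feasible) for a candidate LP assignment.

    paths: pair id -> list of candidate paths (lists of nodes)
    x:     (pair id, path index) -> fraction x(p), missing entries count as 0
    """
    edge_load = defaultdict(float)  # sum of x(p) * c(p) per edge
    node_load = defaultdict(float)  # sum of x(p) per node (LP form of the table bound)
    total, feasible = 0.0, True
    for i, candidates in paths.items():
        pair_flow = 0.0
        for j, p in enumerate(candidates):
            xp = x.get((i, j), 0.0)
            if not -eps <= xp <= 1 + eps:
                feasible = False
            f = xp * path_capacity(p, capacity)
            pair_flow += f
            for e in zip(p, p[1:]):
                edge_load[e] += f
            for v in p:
                node_load[v] += xp
        if pair_flow > demand[i] + eps:
            feasible = False
        total += pair_flow
    if any(load > capacity[e] + eps for e, load in edge_load.items()):
        feasible = False
    if any(load > table_size[v] + eps for v, load in node_load.items()):
        feasible = False
    return total, feasible


# Hypothetical instance: two pairs whose only paths cross at switch "m".
capacity = {("s1", "m"): 1, ("m", "t1"): 1, ("s2", "m"): 1, ("m", "t2"): 1}
paths = {1: [["s1", "m", "t1"]], 2: [["s2", "m", "t2"]]}
demand = {1: 1, 2: 1}
table_size = {v: 1 for v in ("s1", "s2", "t1", "t2", "m")}

half = evaluate(paths, {(1, 0): 0.5, (2, 0): 0.5}, capacity, demand, table_size)
full = evaluate(paths, {(1, 0): 1.0, (2, 0): 1.0}, capacity, demand, table_size)
print(half)  # (1.0, True)
print(full)  # (2.0, False): switch "m" would need two entries
```

Note that the checker uses the relaxed node constraint, the sum of x(p), so two half-used paths count as a single entry at "m".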
Linear Problem Formulation (cont.)
Typical LP relaxation:
  Find a non-integer (fractional) solution, namely $0 \le x(p) \le 1$.
  Derive an integer solution, where $x(p) \in \{0, 1\}$.
In our LP problem:
  The desired integer solution may consist of fractional flows, $0 \le x(p) \le 1$.
  Nevertheless, even a fractional flow requires a full TCAM entry.
  Thus the last constraint (the flow table size) must be satisfied integrally: every path that carries any flow counts as one full entry.
Integrality Gap
Link capacities: $c(e) = 1$
All demands: $d_i = 1$
Node flow table size: $b(v) = 1$
(Figure: sources $s_1, s_2, s_3, \dots, s_k$ and destinations $t_1, t_2, t_3, \dots, t_k$.)
Integrality Gap (cont.)
Example: $k = 4$
Any integral solution can connect only a single pair: $\mathrm{OPT}_{int} = 1$
  (A fractional flow is useless in this case, since it consumes the same flow table resources.)
(Figure: sources $s_1, \dots, s_4$ and destinations $t_1, \dots, t_4$.)
Integrality Gap (cont.)
By routing a fractional flow of ½ for each pair, one can obtain a total flow of $k/2$.
Since $n = O(k^2)$, the gap is $\Omega(\sqrt{n})$.
(Figure: the $k = 4$ example with a flow of ½ routed from each of $s_1, \dots, s_4$.)
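The gap can be written out step by step. The derivation below is a sketch under stated assumptions that are inferred from the example rather than spelled out on the slides: unit link capacities, unit demands and unit table sizes, one path $p_i$ per pair, every two paths sharing a node, no node lying on more than two paths, and $n = O(k^2)$ nodes in total.

```latex
\begin{align*}
\text{Node constraint with } x(p_i) = \tfrac{1}{2}:\quad
  & \sum_{p : v \in p} x(p) \le 2 \cdot \tfrac{1}{2} = 1 = b(v) \\
\text{Fractional value:}\quad
  & \mathrm{OPT}_{frac} \ge \sum_{i=1}^{k} x(p_i)\, c(p_i)
    = k \cdot \tfrac{1}{2} \cdot 1 = \frac{k}{2} \\
\text{Integral value:}\quad
  & \mathrm{OPT}_{int} = 1
    \quad \text{(two routed paths would need two entries at their shared node)} \\
\text{Gap:}\quad
  & \frac{\mathrm{OPT}_{frac}}{\mathrm{OPT}_{int}} \ge \frac{k}{2} = \Omega(\sqrt{n})
    \quad \text{since } n = O(k^2)
\end{align*}
```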
Path Selection
The number of paths may be exponential.
Practical approach: a relatively small number of candidate routing paths can be used for each $(s_i, t_i)$.
  E.g., the k shortest paths, the k disjoint paths, etc.
  Enables the operator to control and manage the network.
  Enables online modification of some pairs.
Nevertheless, our theoretical results support the general case:
  The number of constraints is polynomial.
  The number of paths carrying flow is polynomial.
  The LP problem can be solved in polynomial time.
Main Algorithms and Theoretical Results
An $(O(\log n), O(\log n))$ bicriteria approximation algorithm
  Achieves a total flow of $\Omega(\mathrm{OPT} / \log n)$.
  The switch flow table size may be exceeded by a factor of $O(\log n)$.
Algorithm:
  1. Solve the LP.
  2. Independently, for each path $p$, choose it with probability $x(p)$.
  3. For each chosen path $p$, route a flow value of $c(p) / (6 \log n)$.
Proof sketch:
  By scaling down the flow on each path, the link capacities are violated only with negligible probability.
  This scaling down reduces the total flow by a factor of (at most) $O(\log n)$.
  The path degree constraints are violated by more than a factor of $O(\log n)$ only with negligible probability.
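Steps 2 and 3 of the algorithm can be sketched as follows, assuming the LP of step 1 has already been solved elsewhere and its values are passed in as a dictionary. The function name, the `scale` parameter (defaulting to the constant 6 shown on the slide), and the use of the natural logarithm are assumptions of this sketch.

```python
import math
import random


def round_paths(x, path_cap, n, scale=6.0, rng=None):
    """Randomized rounding of an LP solution.

    x:        path id -> LP value x(p) in [0, 1]
    path_cap: path id -> bottleneck capacity c(p)
    n:        number of nodes (n > 1), used only for the log n scaling
    Returns a dict: chosen path id -> flow routed on it.
    """
    if rng is None:
        rng = random.Random()
    shrink = scale * math.log(n)  # assumption: natural logarithm
    routed = {}
    for p, xp in x.items():
        # Step 2: pick each path independently with probability x(p).
        if rng.random() < xp:
            # Step 3: route the scaled-down flow c(p) / (scale * log n).
            routed[p] = path_cap[p] / shrink
    return routed


# Hypothetical LP output for three paths in a 1000-node network.
x = {"p1": 1.0, "p2": 0.0, "p3": 0.5}
path_cap = {"p1": 8.0, "p2": 3.0, "p3": 4.0}
routed = round_paths(x, path_cap, n=1000, rng=random.Random(0))
print(routed)
```

A path with x(p) = 1 is always chosen and a path with x(p) = 0 never is, so every chosen path costs exactly one table entry on each switch it crosses.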
Main Algorithms and Theoretical Results (cont.)
An $(O(1), O(\log n))$ bicriteria approximation algorithm
  Applies when the flow of each path does not consume more than $1/\log n$ of the path capacity.
  In the LP: $x(p) \le \frac{1}{\log n}$
Simulation and Experimental Results
Graph type:
  Barabasi-Albert (BA) graph
    Simulates a power-law based system.
    000 switches used to connect about 40,000 physical servers.
  BCube graph
    Simulates a complex data center topology.
    ~000 switches used to connect about 40,000 physical servers.
  Mesh graph
    Simulates a general network, or a row cold storage data center.
    ~00 switches
    In a data center topology, these switches may have a small forwarding table, connecting ~000 micro servers.
    In a general network, these switches may connect tens of thousands of endpoints.
In all the graphs the link capacity is 0Gbps.
Simulation and Experimental Results
For each topology:
  Traffic of 00,000 different (random) demands
  Flow table size 00-0000*
* Normalized using $\frac{m \cdot p}{n}$, which considers the number of demands ($m$) and the graph topology (average path size $p$ and number of nodes $n$), in an ideal flow distribution.
Simulation and Experimental Results: Path Selection
A set of k candidate paths has been selected between each pair:
  k shortest paths
  k almost disjoint paths
k almost disjoint paths between $(s_i, t_i)$: repeat the following two steps k times:
  1. Find the shortest path between $(s_i, t_i)$.
  2. Significantly increase the weight of the path's links.
(Figure: example graph between s and t with link weights.)
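The two-step procedure for almost disjoint paths can be sketched with a plain Dijkstra search. The multiplicative `penalty` factor, the function names, and the tiny example graph are assumptions made for illustration; the slides only say that the weights are increased significantly. Edges are directed here, so an undirected link would need both directions listed and penalized.

```python
import heapq


def shortest_path(adj, weight, s, t):
    """Dijkstra from s to t; returns the path as a node list, or None."""
    dist, prev, done = {s: 0.0}, {}, set()
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == t:
            break
        for v in adj.get(u, ()):
            nd = d + weight[(u, v)]
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if t not in dist:
        return None
    path = [t]
    while path[-1] != s:
        path.append(prev[path[-1]])
    return path[::-1]


def almost_disjoint_paths(adj, weight, s, t, k, penalty=1000.0):
    """Repeat k times: find a shortest path, then penalize its links."""
    weight = dict(weight)  # work on a copy so the caller's weights survive
    paths = []
    for _ in range(k):
        p = shortest_path(adj, weight, s, t)
        if p is None:
            break
        paths.append(p)
        for e in zip(p, p[1:]):
            weight[e] *= penalty  # step 2: make reusing these links unattractive
    return paths


# Hypothetical graph with two parallel two-hop routes from s to t.
adj = {"s": ["a", "b"], "a": ["t"], "b": ["t"]}
weight = {("s", "a"): 1.0, ("a", "t"): 1.0, ("s", "b"): 1.0, ("b", "t"): 1.0}
paths = almost_disjoint_paths(adj, weight, "s", "t", k=2)
print(paths)  # [['s', 'a', 't'], ['s', 'b', 't']]
```

Because links are penalized rather than removed, the procedure may return overlapping or even repeated paths when the graph offers no alternative, which is why the paths are only "almost" disjoint.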
Path Selection (cont.)
(Figure: the example graph between s and t, continued.)
Simulation and Experimental Results (cont.)
In practice, in all simulations we obtained an almost optimal solution (compared to the fractional solution).
  This holds with respect to both the flows and the forwarding table violation (less than %).
Path-degree Performance vs. Greedy
Greedy algorithm:
  Find the maximum flow.
  Remove flows violating the flow table size.
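One possible reading of the greedy baseline is sketched below. It assumes a maximum flow has already been computed without table limits and decomposed into per-path flows. The slides do not say in which order violating flows are removed, so keeping paths in decreasing order of flow is an assumption of this sketch, as are the function name and the example data.

```python
from collections import Counter


def greedy_prune(path_flows, table_size):
    """Drop paths that would exceed a node's flow table size.

    path_flows: list of (path, flow) pairs, each path a list of nodes
    table_size: node -> maximum number of paths that may cross it
    Paths are considered from largest to smallest flow (an assumption).
    """
    used = Counter()  # table entries consumed per node so far
    kept = []
    for path, flow in sorted(path_flows, key=lambda pf: pf[1], reverse=True):
        if all(used[v] < table_size[v] for v in path):
            kept.append((path, flow))
            used.update(path)  # one entry on every node the path crosses
    return kept


# Hypothetical max-flow decomposition: three paths through switch "m",
# which has room for only two entries.
path_flows = [
    (["s1", "m", "t1"], 3.0),
    (["s2", "m", "t2"], 1.0),
    (["s3", "m", "t3"], 2.0),
]
table_size = {v: 2 for v in ("s1", "s2", "s3", "t1", "t2", "t3", "m")}

kept = greedy_prune(path_flows, table_size)
print(sum(flow for _, flow in kept))  # 5.0
```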
Performance vs. Flow Table Size
Normalized forwarding table size: $\frac{m \cdot p}{n}$
Forwarding Table Utilization
Maximum flow table size: 000