Duet: Cloud Scale Load Balancing with Hardware and Software Rohan Gandhi Y. Charlie Hu Hongqiang Harry Liu Guohan Lu Jitu Padhye Lihua Yuan Ming Zhang 1
Datacenter Load Balancer is Critical For Online Services Internet VIP1, VIP2 Load balancer (Mux) Terminology VIP Virtual IP DIP Direct IP Mux Multiplexer Store VIP-DIP mapping Split VIP traffic to DIPs Server... Server Server... Server DIP = 10.0.1.1 DIP = 10.0.2.2 DIP = 10.0.3.3 DIP = 10.0.4.4 bing.com OneDrive VIP1 = 1.1.1.1 VIP2 = 1.1.1.2 Load balancer provides high availability and scalability 2
Existing LBs Have Limitations Specialized Hardware LBs Too costly $100+ million for 15 Tbps Poor robustness 1+1 redundancy Software LBs Scale with demand Scale up/down according to VIP traffic High robustness n+1 redundancy Software LBs have cost and performance limitations 3
Software LB Design DC Network DIP2 Client VIP1.. SMux DIP1, DIP2.. DIP1 VIP1 Software LB Benefits: High robustness Scale with demand Ananta: Cloud-scale load balancing [SIGCOMM 13] 4
Latency inflation (msec) CPU Utilization Software LB has Limitations 10 100 80 1 60 40 0.1 100k 300k 450k Traffic (pkts/sec) 20 0 100k 300k 450k Traffic (pkts/sec) High latency inflation: 200 usec Low capacity: 300k pkts/sec 5k SMuxes needed at 15Tbps traffic in 50k server DC 5
How can we build high performance, low cost and robust load balancer? Duet ideas Use commodity switches as hardware Muxes Use software Muxes as a backstop 6
Can Switch Act As a Mux Switches offer: High capacity (500+ million pkts/sec) Low latency inflation (1 usec) Mux functionalities Split VIP traffic across DIPs Forward VIP traffic to DIPs ECMP Switch resources Tunneling 7
Implementing HMux on Switch VIP DIPs Forwarding Table ECMP Table Tunneling Table 1.1.1.1/32 1.1.1.2/32 10.0.0.1 10.0.0.2 20.0.0.1 20.0.0.2 VIP 1.1.1.1/32 1.1.1.2/32 Index 0 1 2 3 DIPs 10.0.0.1 10.0.0.2 20.0.0.1 20.0.0.2 HMux Dest: VIP Src: Client Dest: DIP Src: HMux Dest: VIP Src: Client 8
Key Design Challenges Limited switch memory High failure robustness VIP assignment VIP migration 9
Challenge 1: Switches have Limited Memory Workload: 100k+ VIPs and 1+ millions DIPs Single HMux cannot store all VIPs and DIPs Forwarding Table VIP 1.1.1.1/32 1.1.1.2/32 ECMP Table Index 0 1 2 3 Tunneling Table DIPs 10.0.0.1 10.0.0.2 20.0.0.1 20.0.0.2 Limits #VIPs Limits #DIPs Table Forwarding ECMP Tunneling Max. size 16k 4k 512 Max VIPs Max DIPs 10
Solution: Partitioning VIPs across VIP DIPs HMux-A 1.1.1.1/32 1.1.1.1/32 DIP1, DIP2 HMuxes HMux-B 1.1.1.2/32 VIP DIPs 1.1.1.2/32 DIP3, DIP4 Source DIP2 DIP4 Capacity VIPs DIPs Single HMux 16k 512 All HMuxes 16k 512 * 2k = 1M Fixed Scales with #DIPs 17
Challenge 2: High Robustness HMux-A 1.1.1.1/32 HMux-B 1.1.1.2/32 Source DIP1 DIP2 Availability during failure? Large number of VIPs? 12
Idea: Integrate SMux with HMux HMuxes SMuxes Duet Low latency High capacity High availability Scale to large #VIPs? 13
Solution: Use SMuxes As a Backstop VIP DIP 1.1.1.1/32 DIP1, DIP2 HMux-A HMux-B VIP DIP 1.1.1.2/32 DIP3, DIP4 SMux-A VIP DIP 1.1.1.1/32 DIP1, DIP2 1.1.1.2/32 DIP3, DIP4 SMux-B VIP DIP 1.1.1.1/32 DIP1, DIP2 1.1.1.2/32 DIP3, DIP4 14
Solution: Use SMuxes As a Backstop VIP = 1.1.1.1/32 HMux-A HMux-B VIP = 1.1.1.2/32 Source SMux-A SMux-B VIP = 1.1.1.0/24 VIP = 1.1.1.0/24 High availability during failure Scale to large #VIPs 14
VIP Traffic Distribution is Highly Skewed Top 10% VIPs carry 99% traffic Duet handles 86-99.9% traffic using HMuxes 16
Challenge 3: How to Assign VIPs? VIP-1?? VIP-2...???????? VIP-k SMux SMux Objective: Maximize traffic handled by HMuxes Input: VIP traffic, DIP locations Topology Constraints: Switch memory Link capacity 17
Challenge 4: How to Migrate VIPs? Current S1 Maintain VIP availability New S1 S2 S3 S2 S3 VIP1 VIP2 Withdraw and announce S1 Announce before withdraw S1 VIP2 VIP1 VIP1 VIP2 S2 S3 VIP2 VIP1 VIP2 VIP1 S2 S3 VIP1 Limited Memory VIP2 18
Solution: Migrate VIPs through SMuxes Current New S1 S1 S2 S3 S2 S3 VIP1 VIP2 VIP2 VIP1 SMux SMux S1 SMux SMux Withdraw VIPs from old location S2 S3 Announce VIPs from new location VIP1, VIP2 SMux VIP1, VIP2 SMux Fast and maintains VIP availability 19
Duet Extensions SNAT Support VIPs with 512+ DIPs Port based load balancing Load balancing in virtualized networks 20
Testbed 10 switches, 3 SMuxes 10 VIPs, 34 DIPs Experimental Setup Simulation Topology and traffic trace from Azure DC SMux1 SMux2 SMux3 High capacity High availability Low cost 21
Duet Provides High Capacity Traffic to SMux: 200k pkts/sec Traffic to SMux: 400k pkts/sec Traffic to HMux: 1.2m pkts/sec HMux has larger capacity and lower latency than SMux 22
Duet Provides High Availability HMux1 fails 38msec VIP1 traffic falls back SMux VIP1 on HMux1 VIP2 on HMux2 VIP3 on SMux 23
Number of SMuxes Duet Reduces Cost 5000 4000 3839 3000 2000 1000 0 2144 1462 820 33 64 142 362 1.25 2.5 5 10 VIP Traffic (Tbps) Software LB Duet Duet reduces cost by 10-24x 24
Summary Specialized and software LBs have cost and performance problems Duet key ideas: Use commodity switches as HMuxes Use small number of SMuxes as backstop Benefits: Low latency High capacity High robustness Low cost 25