TCP Pacing in Data Center Networks
Monia Ghobadi, Yashar Ganjali
Department of Computer Science, University of Toronto
{monia, yganjali}@cs.toronto.edu
TCP, Oh TCP!
- TCP congestion control
- Focus on the evolution of cwnd over RTT
- Damages
- TCP pacing (a sketch of the mechanism follows this slide)
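As background for the rest of the deck, a minimal sketch of the mechanism itself, assuming the usual definition of pacing: instead of emitting up to cwnd packets back-to-back as ACKs arrive, a paced sender spreads them evenly over one RTT, i.e. at rate cwnd/RTT. Function names here are illustrative, not from any real TCP stack.

```python
def pacing_gap(cwnd_packets: int, rtt_s: float) -> float:
    """Inter-packet gap for a paced sender: cwnd packets spread
    evenly over one RTT, i.e. a sending rate of cwnd / RTT."""
    return rtt_s / cwnd_packets

def departure_times(cwnd_packets: int, rtt_s: float, paced: bool):
    """Transmission instants (seconds) for one window of packets.
    Non-paced: the whole window leaves back-to-back at t = 0
    (an idealized burst). Paced: packet i leaves at i * gap."""
    if not paced:
        return [0.0] * cwnd_packets
    gap = pacing_gap(cwnd_packets, rtt_s)
    return [i * gap for i in range(cwnd_packets)]

# cwnd = 10 packets, RTT = 1 ms: a paced gap of 100 us per packet.
print(departure_times(10, 1e-3, paced=True))
```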
The Tortoise and the Hare: Why Bother Pacing?
- Renewed interest in pacing for the data center environment:
  - Small-buffer switches
  - Small round-trip times
  - Disparity between the total capacity of the network and the capacity of individual queues
- Focus on tail latency caused by short-term unfairness in TCP
TCP Pacing's Potential
- Better link utilization with small switch buffers
- Better short-term fairness among flows of similar RTTs: improves worst-flow latency
- Allows slow-start to be circumvented, saving many round-trip times
- May allow a much larger initial congestion window to be used safely
Contributions
- Effectiveness of TCP pacing in data centers:
  - Benefits of TCP pacing diminish as we increase the number of concurrent connections beyond a certain threshold (the point of inflection)
  - Inconclusive results in previous works
- Inter-flow bursts
- Test-bed experiments
Inter-flow Bursts
- C: bottleneck link capacity
- B_max: buffer size
- N: number of long-lived flows
- W: window size; each flow sends W packets every RTT, in a paced or non-paced manner
- X: inter-flow burst size (a measurement sketch follows this slide)
[Figure: packet transmission timeline over successive round trips (RTT, 2RTT, 3RTT)]
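A minimal sketch of how X could be measured from a bottleneck packet trace. The delimiting rule is an assumption on my part: an inter-flow burst is taken to be a maximal run of back-to-back arrivals (spacing at most one packet transmission time), counted across flows.

```python
def interflow_bursts(arrivals, service_time):
    """Split a packet arrival trace into inter-flow bursts.

    arrivals: list of (timestamp_s, flow_id), sorted by timestamp.
    service_time: transmission time of one packet on the bottleneck;
      arrivals spaced <= service_time are treated as back-to-back.
    Returns the list of burst sizes X (packets per burst, across flows).
    """
    bursts = []
    size = 0
    prev_t = None
    for t, _flow in arrivals:
        if prev_t is not None and t - prev_t <= service_time:
            size += 1          # continues the current back-to-back run
        else:
            if size:
                bursts.append(size)
            size = 1           # a gap starts a new burst
        prev_t = t
    if size:
        bursts.append(size)
    return bursts

# Toy trace: three flows whose paced streams interleave into one
# back-to-back run at the bottleneck (1500 B at 1 Gbps = 12 us/packet).
trace = [(i * 12e-6, i % 3) for i in range(9)] + [(1e-3, 0)]
print(interflow_bursts(trace, service_time=12e-6))  # -> [9, 1]
```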
Modeling
- Best case of non-paced traffic
- Worst case of paced traffic
[Figure: per-RTT packet schedules for the two cases; a toy sketch of the comparison follows this slide]
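The slides compare the best case of non-paced traffic against the worst case of paced traffic without reproducing the derivation, so the following is only a toy fluid-queue sketch of that comparison under the definitions from the previous slide (C, B_max, N, W). The parameter values are made up: roughly 96% load, 1500-byte packets at 1 Gbps.

```python
def peak_queue(arrival_times, capacity_pps, buffer_pkts):
    """Peak backlog (packets) and drop count at a FIFO bottleneck.
    arrival_times: packet arrival instants in seconds.
    capacity_pps: drain rate in packets per second (C).
    buffer_pkts: queue capacity in packets (B_max)."""
    q = peak = 0.0
    drops = 0
    prev = None
    for t in sorted(arrival_times):
        if prev is not None:
            q = max(0.0, q - (t - prev) * capacity_pps)  # drain since last arrival
        if q + 1 > buffer_pkts:
            drops += 1  # buffer full: packet lost
        else:
            q += 1
            peak = max(peak, q)
        prev = t
    return peak, drops

def window_schedule(n_flows, w, rtt, paced, offset):
    """Arrival times for one RTT: N flows, W packets each.
    offset(f) is flow f's start phase within the RTT."""
    gap = rtt / w if paced else 0.0  # paced: spread W packets over the RTT
    return [offset(f) + i * gap for f in range(n_flows) for i in range(w)]

# ~96% load: N*W = 80 packets/RTT vs. C*RTT ~ 83 packets.
N, W, RTT, C, B_MAX = 10, 8, 1e-3, 83333, 60

# Best case for non-paced flows: their bursts are perfectly staggered.
best_nonpaced = window_schedule(N, W, RTT, paced=False, offset=lambda f: f * RTT / N)
# Worst case for paced flows: all start in phase, so per-flow pacing
# still lines up into inter-flow bursts at the bottleneck.
worst_paced = window_schedule(N, W, RTT, paced=True, offset=lambda f: 0.0)

print(peak_queue(best_nonpaced, C, B_MAX))  # small peak: staggered bursts
print(peak_queue(worst_paced, C, B_MAX))    # comparable peak: aligned pacing
```

Under these made-up parameters the two peaks come out close (8 vs. 10 packets), which is the intuition behind comparing the two extreme cases.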
Experimental Studies
- Flows of sizes 1, 2, 3 MB between servers and clients
- Bottleneck bandwidth: 1, 2, 3 Gbps
- RTT: 1 to 1 ms
- Metrics: bottleneck utilization, drop rate, average and tail FCT (computed as sketched below)
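For reference, a small helper showing how the reported latency metrics can be computed from per-flow completion times. The nearest-rank percentile convention is an assumption; the slides do not state which convention was used.

```python
def fct_stats(fcts):
    """Average and 99th-percentile flow completion time.
    fcts: per-flow completion times in seconds."""
    s = sorted(fcts)
    avg = sum(s) / len(s)
    # Nearest-rank 99th percentile (one common convention).
    p99 = s[min(len(s) - 1, int(round(0.99 * len(s))) - 1)]
    return avg, p99

# A single slow tail flow dominates the 99th percentile.
print(fct_stats([0.010, 0.012, 0.011, 0.250]))
```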
Base-Case Experiment: One Flow vs. Two Flows, 64 KB of Buffering, Utilization / Drop / FCT
[Figure: bottleneck link utilization (Mbps) vs. time (sec); "No congestion" and "Congestion" phases, with a 38% annotation between the curves]
[Figures: CDFs of flow completion time (sec), with gaps between the curves annotated as 1 RTT and 2 RTTs]
Multiple Flows: Link Utilization / Drop / Latency, buffer size 1.7% of BDP, varying number of flows
[Figures: bottleneck link utilization (Mbps), bottleneck link drop rate (%), average FCT (sec), and 99th-percentile FCT (sec) vs. number of flows, each with the point of inflection N* (PoI) marked]
- Once the number of concurrent connections increases beyond a certain point, the benefits of pacing diminish (a sketch for estimating N* follows this slide).
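One possible way to extract the N* marked on these plots from measured data. The criterion, that the point of inflection is where the paced and non-paced curves effectively merge, is my heuristic reading of the plots, not necessarily the authors' method.

```python
def point_of_inflection(n_values, paced_fct, nonpaced_fct, tol=0.05):
    """Smallest flow count at which pacing's FCT advantage falls
    below `tol` (relative), i.e. the two curves effectively merge."""
    for n, p, np_ in zip(n_values, paced_fct, nonpaced_fct):
        if (np_ - p) / np_ < tol:
            return n
    return None  # curves never merge over the measured range

ns = [10, 20, 40, 80, 160]
paced = [0.10, 0.15, 0.30, 0.58, 0.90]       # hypothetical measurements
nonpaced = [0.30, 0.35, 0.45, 0.60, 0.90]
print(point_of_inflection(ns, paced, nonpaced))  # -> 80
```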
Multiple Flows: Link Utilization / Drop / Latency, buffer size 3.4% of BDP, varying number of flows
[Figures: bottleneck link utilization (Mbps), bottleneck link drop rate (%) for paced and non-paced, average FCT (sec), and 99th-percentile FCT (sec) vs. number of flows, with N* (PoI) marked]
- 5 flows, BDP 125.4 packets, buffer size 312 packets: N* = 8 flows.
- 1 flow, BDP 91 packets, buffer size 1 packets: N* = 9 flows.
- Prior work disagreed: Aggarwal et al.: "Don't pace!" Kulik et al.: "Pace!"
N* vs. Buffer
[Figures: bottleneck link utilization (Mbps), bottleneck link drop rate (%) for paced and non-paced, average FCT (sec), and 99th-percentile FCT (sec) vs. buffer size (KB)]
Clustering Effect: The probability of packets from a flow being followed by packets from other flows
- Non-paced: packets of each flow are clustered together.
- Paced: packets of different flows are multiplexed (a trace-based sketch follows this slide).
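A sketch of how this probability could be computed from a packet trace at the bottleneck. Reading the metric as the fraction of adjacent packet pairs that belong to different flows is my interpretation of the slide title.

```python
def interleave_probability(flow_sequence):
    """Probability that a packet is immediately followed by a packet
    of a *different* flow, over a bottleneck packet trace.
    Near 0: each flow's packets are clustered (non-paced pattern).
    Near 1: flows are well multiplexed (paced pattern)."""
    pairs = list(zip(flow_sequence, flow_sequence[1:]))
    switches = sum(1 for a, b in pairs if a != b)
    return switches / len(pairs)

print(interleave_probability([1, 1, 1, 2, 2, 2, 3, 3, 3]))  # clustered: 0.25
print(interleave_probability([1, 2, 3, 1, 2, 3, 1, 2, 3]))  # multiplexed: 1.0
```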
Drop Synchronization: Number of Flows Affected by a Drop Event
- NetFPGA router used to count the number of flows affected by drop events.
[Figures: CDFs of the number of flows affected per drop event, for N = 48, 96, and 384 flows]
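A sketch of the per-event flow count. Grouping drops that occur within a small time gap into one event is an assumption; the slides do not say how the NetFPGA counter delimits drop events.

```python
def flows_per_drop_event(drops, event_gap_s):
    """Group packet drops into events and count distinct flows per event.

    drops: list of (timestamp_s, flow_id) per dropped packet, sorted.
    event_gap_s: drops closer than this belong to the same event
      (assumed delimiting rule).
    Returns the number of affected flows for each event.
    """
    events = []
    current = set()
    prev_t = None
    for t, flow in drops:
        if prev_t is not None and t - prev_t > event_gap_s:
            events.append(len(current))  # close the previous event
            current = set()
        current.add(flow)
        prev_t = t
    if current:
        events.append(len(current))
    return events

drops = [(0.0000, 1), (0.0001, 2), (0.0002, 1), (0.5000, 3)]
print(flows_per_drop_event(drops, event_gap_s=0.01))  # -> [2, 1]
```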
Future Trends for Pacing: per-egress pacing
[Figures: bottleneck link utilization (Mbps), bottleneck link drop rate (%), average RCT (sec), and 99th-percentile RCT (sec) vs. number of flows (6 to 192), comparing per-flow pacing with per-host + per-flow pacing; N* (PoI) marked]
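A minimal sketch of the per-host plus per-flow idea as I read it: each flow keeps its own pacing timer, and a host-level rate limiter additionally spaces departures at the egress line rate, so in-phase flow timers cannot form a burst. This is an illustration, not the authors' implementation.

```python
class TwoLevelPacer:
    """Sketch of per-host + per-flow pacing (illustrative, not a real
    qdisc): per-flow timers propose release times, and a host-level
    rate limiter serializes them at the egress line rate."""

    def __init__(self, host_rate_pps):
        self.gap = 1.0 / host_rate_pps  # min spacing the host enforces
        self.next_free = 0.0            # earliest time egress is free

    def release(self, flow_ready_time):
        """Actual departure: no earlier than the flow's own pacing
        timer, and no closer than one host-level gap to the previous
        departure."""
        t = max(flow_ready_time, self.next_free)
        self.next_free = t + self.gap
        return t

# Three flows whose per-flow timers all fire at t = 0 (in phase):
# per-flow pacing alone would emit a burst; the host level spreads it.
pacer = TwoLevelPacer(host_rate_pps=1e5)  # 10 us between packets
print([pacer.release(0.0) for _ in range(3)])  # -> [0.0, 1e-05, 2e-05]
```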
Conclusions and Future Work
- Re-examined TCP pacing's effectiveness: demonstrated when TCP pacing brings benefits in such environments.
- Inter-flow burstiness.
- Burst-pacing vs. packet-pacing.
- Per-egress pacing.
Renewed Interest
Traffic Burstiness Survey
- "Bursty" is a word with no agreed meaning. How do you define bursty traffic?
- If you are involved with a data center, is your data center traffic bursty?
- If yes, do you think it would be useful to suppress the burstiness in your traffic?
- If no, are you already suppressing the burstiness? How?
- Would you anticipate the traffic becoming burstier in the future?
monia@cs.toronto.edu
Base-Case Experiment: One RPC vs. Two RPCs, 64 KB of Buffering, Latency
Multiple Flows: Link Utilization / Drop / Latency, buffer size 6% of BDP, varying number of flows
Base-Case Experiment: One RPC vs. Two RPCs, 64 KB of Buffering, Latency / Queue Occupancy
Functional Test
RPC vs. Streaming
- Paced by ACK clocking vs. bursty.
[Figure: testbed topology with 1 GE links, RTT = 1 ms]
Zooming in More on the Flow
Multiple Flows: Link Utilization / Drop / Latency, buffer size 6.8% of BDP, varying number of flows
[Figures: bottleneck link utilization (Mbps), bottleneck link drop rate (%) for paced and non-paced, average FCT (sec), and 99th-percentile FCT (sec) vs. number of flows, with N* (PoI) marked]