TCP loss sensitivity analysis ADAM KRAJEWSKI, IT-CS-CE



Original problem: IT-DB was backing up data to the Wigner CC. A high transfer rate was required in order to avoid service degradation, but on the 10G Meyrin-Wigner link the transfer rate was only ~1G initially, and ~4G after some tweaking. The problem lies within TCP internals! ADAM KRAJEWSKI - TCP CONGESTION CONTROL 2 / 37

TCP introduction TCP (Transmission Control Protocol) is a protocol providing reliable, connection-oriented transport over a network. It operates at the Transport Layer (4th) of the OSI reference model. Defined in RFC 793 from 1981 [10 Mb/s at that time!]. Its reliability means that it guarantees no data will be lost during communication. It is connection-oriented because it creates and maintains a logical connection between sender and receiver.

TCP operation basics: connection init Connection initiation: 3-way handshake. Sequence counters setup: the client sends a random SEQ number (ISN, Initial Sequence Number); the server confirms it (ACK = SEQ+1) and sends its own SEQ number; the client confirms and syncs to the server's SEQ number.

TCP is reliable TCP uses SEQ and ACK numbers to monitor which segments have been successfully transmitted. Lost packets are easily detected (by timeout) and retransmitted. This provides reliability. ...but do we really need to wait for every ACK?

TCP window (1) Let us assume that we have a client and a server connected with a 1G Ethernet link with 1 ms delay, so RTT = 2 ms and B = 1 Gbps. If we send packets of 1460 bytes each and wait for the acknowledgment, then the effective transfer rate is 1460 B / 2 ms = 5.84 Mbps. So we end up using about 0.5% of the available bandwidth.
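The arithmetic above can be checked in a few lines (a quick sketch using the slide's figures of a 1460-byte payload and a 2 ms RTT):

```python
# Stop-and-wait throughput: one 1460-byte segment per 2 ms round trip.
MSS = 1460        # segment payload, bytes
RTT = 0.002       # round-trip time, seconds
LINK = 1e9        # link capacity, bits per second

rate_bps = MSS * 8 / RTT
print(f"{rate_bps / 1e6:.2f} Mbps")                 # 5.84 Mbps
print(f"{100 * rate_bps / LINK:.2f}% of the link")  # 0.58% of the link
```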

TCP window (2) In order to use the full bandwidth we need to send at least B × RTT unacknowledged bytes. This way we make sure we will not just sleep and wait for ACKs to come, but keep sending traffic in the meantime. The B × RTT factor is referred to as the Bandwidth-Delay Product.
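As a sanity check, the bandwidth-delay product for the 1G / 2 ms example above (and for the 10G / 25 ms Wigner link discussed later) can be computed directly:

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: bytes that must be in flight to fill the pipe."""
    return bandwidth_bps * rtt_s / 8

print(bdp_bytes(1e9, 0.002))   # 250000.0 -> 250 KB keeps the 1G / 2 ms link busy
print(bdp_bytes(10e9, 0.025))  # 31250000.0 -> ~31.25 MB for 10G / 25 ms
```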

TCP window (3) The client sends a certain amount of segments one after another. As soon as the first ACK is received it assumes the transmission is fine and continues to send segments. Basically, at each time t0 there are W unacknowledged bytes on the wire. W is referred to as the TCP Window Size, and W ≠ const.

Window growth The window starts with a default value W_def. Then it grows by 1 every RTT: wnd ← wnd + 1. But if a packet loss is detected, the window is cut in half: wnd ← wnd / 2. This is an AIMD algorithm (Additive Increase, Multiplicative Decrease).
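The AIMD rule can be illustrated with a toy trace (illustrative only; the window is counted in abstract units and a loss is injected at a fixed time):

```python
def aimd(rtts, loss_at, wnd=10):
    """Yield the AIMD window at each RTT: +1 per RTT, halved on loss."""
    for t in range(rtts):
        if t in loss_at:
            wnd = max(wnd // 2, 1)   # multiplicative decrease
        yield wnd
        wnd += 1                     # additive increase

trace = list(aimd(10, loss_at={5}))
print(trace)   # [10, 11, 12, 13, 14, 7, 8, 9, 10, 11]
```

The sawtooth shape of the window plot on this slide is exactly this trace repeated: linear ramps separated by halvings.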

Congestion control Network congestion appears when there is more traffic than the switching devices can handle. TCP prevents congestion! It starts with a small window, thus a low transfer rate. Then the window grows linearly, thus increasing the transfer rate. TCP assumes that congestion is happening when there are packet losses, so in case of a packet loss it cuts the window size in half, thus reducing the transfer rate.

Long Fat Network problem Long Fat Networks (LFNs) have a high bandwidth-delay product (B × RTT). Example: the CERN link to Wigner. To get full bandwidth on the 10G link with 25 ms RTT we need a window W_25 = 10 Gbps × 25 ms ≈ 31.2 MB. If a drop occurs at peak throughput, W_drop ≈ 15.6 MB, and growing back from W_drop to W_25 takes T ≈ 15.6 × 10⁶ × 25 ms ≈ 108 h.
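The slide's numbers follow from its simplified model in which the window grows by one byte per RTT; a quick reproduction:

```python
B, RTT = 10e9, 0.025      # 10 Gbps link, 25 ms round trip

w_full = B * RTT / 8      # window needed for full throughput, bytes
w_drop = w_full / 2       # window right after a single loss
recover_s = w_drop * RTT  # +1 byte per RTT -> w_drop RTTs to grow back

print(f"W_25   = {w_full / 1e6} MB")         # 31.25 MB
print(f"W_drop = {w_drop / 1e6} MB")         # 15.625 MB
print(f"T      = {recover_s / 3600:.1f} h")  # 108.5 h
```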

TCP algorithms (1) The default TCP algorithm (reno) behaves poorly. Other congestion control algorithms have been developed, such as: BIC, CUBIC, Scalable, Compound TCP. Each of them behaves differently and we want to see how they perform in our case...

Testbed Two servers connected back-to-back. How does the congestion control algorithms' performance change with packet loss and delay? Tested with iperf3. Packet loss and delay emulated using NetEm in Linux: tc qdisc add dev eth2 root netem loss 0.1 delay 12.5ms

[Chart: 0 ms delay, default settings. Bandwidth (Gbps, 0-10) vs. packet loss rate (0.0001%-10%) for cubic, reno, bic and scalable.]

[Chart: 25 ms delay, default settings. Bandwidth vs. packet loss rate (0.0001%-10%) for cubic, reno, bic and scalable; all algorithms stay below ~1.2 Gbps. PROBLEM]
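The collapse in this chart is what the standard back-of-envelope model predicts: for Reno-style AIMD, the well-known Mathis et al. approximation gives throughput ≈ (MSS/RTT) × C/√p with C ≈ 1.22, so bandwidth falls with the square root of the loss rate and inversely with RTT. A rough sketch (not from the slides; an MSS of 1460 bytes is assumed):

```python
import math

def mathis_bps(mss_bytes, rtt_s, loss_prob, c=1.22):
    """Reno throughput estimate: (MSS/RTT) * C / sqrt(p), in bits per second."""
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_prob)

# At 25 ms RTT even tiny loss rates cap Reno far below 10 Gbps.
for p in (1e-6, 1e-4, 1e-3):
    print(f"p={p:g}: {mathis_bps(1460, 0.025, p) / 1e6:8.1f} Mbps")
```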

Growing the window (1) We need W_25 = B × RTT = 10 Gbps × 25 ms ≈ 31.2 MB. Default settings: net.core.rmem_max = 124K, net.core.wmem_max = 124K, net.ipv4.tcp_rmem = 4K 87K 4M, net.ipv4.tcp_wmem = 4K 16K 4M, net.core.netdev_max_backlog = 1K. The window can't grow bigger than the maximum TCP send and receive buffers, so B_max = W_max / RTT = 4 MB / 25 ms ≈ 1.28 Gbps. WARNING: the real OS values are plain byte counts, so less readable.
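The buffer-imposed ceiling is just W_max / RTT; a one-liner confirms the ~1.28 Gbps figure (taking the slide's "4M" as 4 × 10⁶ bytes):

```python
W_MAX = 4e6   # default tcp_rmem/tcp_wmem maximum, bytes (the slide's "4M")
RTT = 0.025   # 25 ms

b_max = W_MAX * 8 / RTT            # throughput cap imposed by the buffer
print(f"{b_max / 1e9:.2f} Gbps")   # 1.28 Gbps
```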

Growing the window (2) We tune the TCP settings on both server and client. Tuned settings: net.core.rmem_max = 67M, net.core.wmem_max = 67M, net.ipv4.tcp_rmem = 4K 87K 67M, net.ipv4.tcp_wmem = 4K 65K 67M, net.core.netdev_max_backlog = 30K. Now it's fine. WARNING: the TCP buffer size doesn't translate directly to window size, because TCP uses some portion of its buffers for operational data structures, so the effective window will be smaller than the maximum buffer size set. Helpful link: https://fasterdata.es.net/host-tuning/linux/

[Chart: 25 ms delay, tuned settings. Bandwidth (Gbps, 0-10) vs. packet loss rate (0.0001%-10%) for cubic, reno, scalable and bic.]

[Chart: 0 ms delay, tuned settings. Bandwidth (Gbps, 0-10) vs. packet loss rate (0.0001%-10%) for cubic (DEFAULT CUBIC), scalable, reno and bic.]

Bandwidth fairness Goal: each TCP flow should get a fair share of the available bandwidth (e.g. two flows on a 10G link converging to 5G each).

Bandwidth competition How do congestion control algorithms compete? Tests were done for: Reno, BIC, CUBIC, Scalable; 0 ms and 25 ms delay; two cases: 2 flows starting at the same time, and 1 flow delayed by 30 s.

cubic vs cubic + offset

cubic vs bic [Chart: congestion control, 0 ms delay. Bandwidth of each flow vs. packet loss rate (0.0001%-10%).]

cubic vs scalable [Chart: congestion control, 0 ms delay. Bandwidth of each flow vs. packet loss rate (0.0001%-10%).]

TCP (reno) friendliness

cubic vs reno [Chart: congestion control, 25 ms delay. Bandwidth of each flow vs. packet loss rate (0.0001%-10%).]

Why default cubic? (1) BIC was the default Linux TCP congestion algorithm until kernel version 2.6.19. It was changed to CUBIC in kernel 2.6.19 with the commit message: "Change default congestion control used from BIC to the newer CUBIC which is the successor to BIC but has better properties over long delay links." Why?

Why default cubic? (2) It is more friendly to other congestion algorithms: to Reno...

Why default cubic? (2) It is more friendly to other congestion algorithms: to CTCP (the Windows default)...

Why default cubic? (3) It has better RTT fairness properties. [Charts: flows at 0 ms RTT vs. 25 ms RTT on the Meyrin-Wigner link.]

Why default cubic? (4) RTT fairness test. [Charts: competing flows at 0 ms RTT and 25 ms RTT.]

The real problem Lots of network paths! (Meyrin ToR and Wigner ToR connected over 100G LAG links.) Packet losses occur on: links (dirty fibers, ...), NICs (hardware limitations, bugs, ...). Difficult to troubleshoot!

Conclusions To achieve a high transfer rate to Wigner: User part: tune TCP settings! Follow https://fasterdata.es.net/host-tuning/. Make sure you use at least CUBIC for Linux and CTCP for Windows. Avoid BIC: better performance on lossy links, but unfair to other congestion algorithms and even to itself (RTT fairness). Submit a ticket if you observe low network performance for your data transfer. Networking team part: minimizing packet loss throughout the network; fiber cleaning. Collaboration is essential!

Setting TCP algorithms (1) OS-wide setting (sysctl): net.ipv4.tcp_congestion_control = reno. Available algorithms: net.ipv4.tcp_available_congestion_control = cubic reno bic (add a new one with e.g. # modprobe tcp_bic). Algorithms selectable by unprivileged processes: net.ipv4.tcp_allowed_congestion_control = cubic reno. Per-socket setting: setsockopt() with the TCP_CONGESTION optname.
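A per-socket sketch of the setsockopt() route, here in Python. TCP_CONGESTION is Linux-only (socket.TCP_CONGESTION is undefined elsewhere, hence the guard), and for unprivileged processes the chosen algorithm must appear in tcp_allowed_congestion_control:

```python
import socket

def get_cc(sock):
    """Return the socket's congestion control algorithm, or None if unsupported."""
    if not hasattr(socket, "TCP_CONGESTION"):
        return None                          # non-Linux platform
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
    return raw.split(b"\x00", 1)[0].decode()

def set_cc(sock, algo):
    """Select a congestion algorithm (e.g. 'cubic', 'reno') for one socket."""
    if hasattr(socket, "TCP_CONGESTION"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, algo.encode())

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(get_cc(s))    # e.g. 'cubic' on a default Linux host, None elsewhere
s.close()
```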

Setting TCP algorithms (2) Windows OS-wide settings: C:\> netsh interface tcp show global shows "Add-On Congestion Control Provider: ctcp". Enabled by default in Windows Server 2008 and newer; needs to be enabled manually on client systems. Windows 7: C:\> netsh interface tcp set global congestionprovider=ctcp. Windows 8/8.1 [PowerShell]: Set-NetTCPSetting -CongestionProvider ctcp

References [1] http://indico.cern.ch/event/36803/material/slides/0.pdf [2] https://www.cnaf.infn.it/~ferrari/papers/myslides/2006-04-03-hepix-ferrari.pdf
