Sliding Window Protocol and TCP Congestion Control



Similar documents
TCP over Multi-hop Wireless Networks * Overview of Transmission Control Protocol / Internet Protocol (TCP/IP) Internet Protocol (IP)

A Survey on Congestion Control Mechanisms for Performance Improvement of TCP

TCP in Wireless Mobile Networks

Improving the Performance of TCP Using Window Adjustment Procedure and Bandwidth Estimation

Lecture Objectives. Lecture 07 Mobile Networks: TCP in Wireless Networks. Agenda. TCP Flow Control. Flow Control Can Limit Throughput (1)

Analytic Models for the Latency and Steady-State Throughput of TCP Tahoe, Reno and SACK

TCP over Wireless Networks

TCP/IP Over Lossy Links - TCP SACK without Congestion Control

A Survey: High Speed TCP Variants in Wireless Networks

Transport Layer Protocols

Lecture 15: Congestion Control. CSE 123: Computer Networks Stefan Savage

Performance evaluation of TCP connections in ideal and non-ideal network environments

TCP, Active Queue Management and QoS

First Midterm for ECE374 03/24/11 Solution!!

STUDY OF TCP VARIANTS OVER WIRELESS NETWORK

Active Queue Management (AQM) based Internet Congestion Control

SJBIT, Bangalore, KARNATAKA

Outline. TCP connection setup/data transfer Computer Networking. TCP Reliability. Congestion sources and collapse. Congestion control basics

Simulation-Based Comparisons of Solutions for TCP Packet Reordering in Wireless Network

TCP/IP Optimization for Wide Area Storage Networks. Dr. Joseph L White Juniper Networks

Congestions and Control Mechanisms n Wired and Wireless Networks

First Midterm for ECE374 03/09/12 Solution!!

Data Networks Summer 2007 Homework #3

TCP based Denial-of-Service Attacks to Edge Network: Analysis and Detection

Parallel TCP Data Transfers: A Practical Model and its Application

TCP Westwood for Wireless

Chapter 6 Congestion Control and Resource Allocation

A Study on TCP Performance over Mobile Ad Hoc Networks

2 TCP-like Design. Answer

17: Queue Management. Queuing. Mark Handley

La couche transport dans l'internet (la suite TCP/IP)

Low-rate TCP-targeted Denial of Service Attack Defense

Computer Networks. Chapter 5 Transport Protocols

CSE331: Introduction to Networks and Security. Lecture 9 Fall 2006

Mobile Communications Chapter 9: Mobile Transport Layer

Linux 2.4 Implementation of Westwood+ TCP with rate-halving: A Performance Evaluation over the Internet

Final for ECE374 05/06/13 Solution!!

TCP and Wireless Networks Classical Approaches Optimizations TCP for 2.5G/3G Systems. Lehrstuhl für Informatik 4 Kommunikation und verteilte Systeme

AN IMPROVED SNOOP FOR TCP RENO AND TCP SACK IN WIRED-CUM- WIRELESS NETWORKS

Performance Analysis of AQM Schemes in Wired and Wireless Networks based on TCP flow

Transport layer issues in ad hoc wireless networks Dmitrij Lagutin,

International Journal of Scientific & Engineering Research, Volume 6, Issue 7, July ISSN

High Speed Internet Access Using Satellite-Based DVB Networks

Prefix AggregaNon. Company X and Company Y connect to the same ISP, and they are assigned the prefixes:

A packet-reordering solution to wireless losses in transmission control protocol

TCP Fast Recovery Strategies: Analysis and Improvements

Application Level Congestion Control Enhancements in High BDP Networks. Anupama Sundaresan

TCP Flow Control. TCP Receiver Window. Sliding Window. Computer Networks. Lecture 30: Flow Control, Reliable Delivery

Optimization of Communication Systems Lecture 6: Internet TCP Congestion Control

An Improved TCP Congestion Control Algorithm for Wireless Networks

APPENDIX 1 USER LEVEL IMPLEMENTATION OF PPATPAN IN LINUX SYSTEM

La couche transport dans l'internet (la suite TCP/IP)

CS268 Exam Solutions. 1) End-to-End (20 pts)

TCP PACKET CONTROL FOR WIRELESS NETWORKS

15-441: Computer Networks Homework 2 Solution

A Passive Method for Estimating End-to-End TCP Packet Loss

TCP for Wireless Networks

Delay-Based Early Congestion Detection and Adaptation in TCP: Impact on web performance

SELECTIVE-TCP FOR WIRED/WIRELESS NETWORKS

This sequence diagram was generated with EventStudio System Designer (

Congestion Control Review Computer Networking. Resource Management Approaches. Traffic and Resource Management. What is congestion control?

Chapter 5. Transport layer protocols

Why Congestion Control. Congestion Control and Active Queue Management. Max-Min Fairness. Fairness

Names & Addresses. Names & Addresses. Hop-by-Hop Packet Forwarding. Longest-Prefix-Match Forwarding. Longest-Prefix-Match Forwarding

ECSE-6600: Internet Protocols Exam 2

The Effect of Packet Reordering in a Backbone Link on Application Throughput Michael Laor and Lior Gendel, Cisco Systems, Inc.

Comparative Analysis of Congestion Control Algorithms Using ns-2

CSE 473 Introduction to Computer Networks. Exam 2 Solutions. Your name: 10/31/2013

CS551 End-to-End Internet Packet Dynamics [Paxson99b]

A Study of Internet Packet Reordering

A Congestion Control Algorithm for Data Center Area Communications

Computer Networks - CS132/EECS148 - Spring

Protagonist International Journal of Management And Technology (PIJMT) Online ISSN Vol 2 No 3 (May-2015) Active Queue Management

1. The subnet must prevent additional packets from entering the congested region until those already present can be processed.

High-Speed TCP Performance Characterization under Various Operating Systems

TCP Over Wireless Network. Jinhua Zhu Jie Xu

Packet Queueing Delay

Higher Layer Protocols: UDP, TCP, ATM, MPLS

TCP in Wireless Networks

Research of TCP ssthresh Dynamical Adjustment Algorithm Based on Available Bandwidth in Mixed Networks

FEW would argue that one of TCP s strengths lies in its

Stop And Wait. ACK received; transmit frame 2 CS 455 3

COMP 3331/9331: Computer Networks and Applications. Lab Exercise 3: TCP and UDP (Solutions)

Student, Haryana Engineering College, Haryana, India 2 H.O.D (CSE), Haryana Engineering College, Haryana, India

Transport layer protocols for ad hoc networks

Midterm Exam CMPSCI 453: Computer Networks Fall 2011 Prof. Jim Kurose

EFFECT OF TRANSFER FILE SIZE ON TCP-ADaLR PERFORMANCE: A SIMULATION STUDY

B-2 Analyzing TCP/IP Networks with Wireshark. Ray Tompkins Founder of Gearbit

Transcription:

Sliding Window Protocol and TCP Congestion Control Simon S. Lam Department of Computer Science The University it of Texas at Austin 2/25/2014 1 1

2 Sliding Window Protocol Consider an infinite array, Source, at the sender, and an infinite array, Sink, at the receiver. P1 Sender Source: send window 0 1 2 a 1 a s 1 s acknowledged unacknowledged P2 Receiver next expected Sink: received 0 1 2 r delivered receive window r + RW 1 RW receive window size SW send window size (s - a SW) 2

Sliding Windows in Action Data unit r has just been received by P2 Receive window slides forward P2 sends cumulative ack with sequence number it expects to receive next (r+3) Source: send window P1 Sender 0 1 2 a 1 a s 1 s acknowledged unacknowledged P2 Receiver Sink: r+3 0 1 2 r next expected r + RW 1 delivered d receive window 3 3

4 Sliding Windows in Action P1 has just received cumulative ack with r+3 as next expected sequence number Send window slides forward P1 Sender Source: 0 1 2 a 1 a s 1 s acknowledged send window P2 Receiver next expected r + RW 1 Sink: 0 1 2 r delivered receive window 4

5 Sliding Window protocol Functions provided error control (reliable delivery) in-order delivery flow and congestion control (by varying send window size) TCP uses only cumulative acks Other kinds of acks selective nack selective ack (TCP SACK) bit-vector representing entire state of receive window (in addition to first sequence number of window) 5

6 Sliding Windows for Lossy FIFO Channels A small number of bits in packet header for sequence number Necessary and sufficient condition for correct operation: SW + RW MaxSeqNum Necessity: P1 Sender Source: 0 1 2 a 1 a acknowledged send window unacknowledged Sink: next expected P2 Receiver 0 1 2 delivered RW receive window size SW send window size receive window 6

7 Sliding Windows for Lossy FIFO Channels Sufficiency can only be demonstrated by using a formal method to prove that the protocol provides reliable in-order delivery. See Shankar and Lam, ACM TOPLAS, Vol. 14, No. 3, July 1992. Interesting special cases SW = RW = 1 alternating-bit protocol SW = 7, RW = 1 out-of-order arrivals not accepted, e.g., HDLC SW = RW 7

8 Sliding Windows for LRD Channels (LRD stands for Lossy Duplicative Reordering) Assumption: Packets have bounded lifetime L Be careful how fast sequence numbers are consumed (i.e., data arrive for sending into network) (send rate)l MaxSeqNum TCP 32-bit sequence numbers counts bytes assumes that datagrams will be discarded by IP if too old 8

9 Window Size Controls Sending Rate RTT Source 1 2 W 1 2 W time data ACKs Destination 1 2 W 1 2 W time ~ W packets per RTT when no loss 9

10 Throughput Limit the number of unacked transmitted packets in the network to window size W W Max. throughput h t packets/sec R TT W MSS RTT = bytes/sec (assuming no loss, MSS denotes maximum segment size) Where did we apply Little s Law? Answer : Consider the TCP send buffer 10

11 Throughput or send rate? Previous formula provides an upper bound Average number in the send buffer is less than W unless packet arrival rate to send buffer is infinite If a packet is lost in the network with probability p, then g ff p RTT p T Since T O > RTT, actual throughput is smaller. the average time in send buffer is (1 ) O The throughput hp t of a host s s TCP send buffer is the host s send rate into the network (including original transmissions and retransmissions) As a result of loss, the end-to-end goodput is (1 p) throughput or (1 p) sendrate 11

12 TCP Window Control Receiver flow control Avoid overloading receiver rwnd: receiver (advertised) window Receiver sends rwnd to sender Network congestion control Sender tries to avoid overloading network It infers network congestion from loss indications cwnd: congestion window Sender sets W = min (cwnd, rwnd) 12

Receiver Flow Control Size of rwnd indicates available space in receive buffer decreased when data is received from IP layer and ack d increased when data is consumed by application process Receiver advertises rwnd in each packet it sends 13 13

Effect of Congestion W too big for many flows -> congestion Packet loss -> transmissions on links a packet has traversed prior to its loss are wasted Congestion collapse due to too many retransmissions and too much wasted transmission capacity October 1986, Internet had its first congestion collapse goodput collapse upper bound desired load (aggregate send rate) 14 14

15 Network Congestion Control Sender calculates cwnd from indications of network congestion Congestion indications timeout (loss) 3 dupacks (loss likely) queueing u delay mark (needs ECN) 15

16 TCP Congestion Control Tahoe (Jacobson 1988) Slow Start Congestion Avoidance (CA) Fast Retransmit Reno (Jacobson 1990) Fast Recovery Variants: NewReno, SACK Vegas (Brakmo & Peterson 1994) New Congestion Avoidance AQM RED (Floyd & Jacobson 1993) REM (Athuraliya & Low 2000) Others 16

17 Slow Start Start with cwnd = 1 On each successful ACK, increment cwnd cwnd cwnd + 1 Exponential growth of cwnd each RTT: cwnd 2 x cwnd Enter Congestion Avoidance when cwnd ssthresh For initial slow start, ssthresh is set to a very large value (e.g., 65 Kbytes) Note: for clarity in these slides, cwnd, rwnd, and ssthresh are counted in packets (segments) rather than in bytes 17

Slow Start 1 RTT sender cwnd 1 2 data packet ACK receiver 3 4 5 67 8 cwnd cwnd + 1 (for each ACK) 18 18

19 Congestion Avoidance (CA) sender cwnd 1 data packet receiver CA starts when cwnd ssthresh ACK 2 1 RTT 3 4 On each successful ACK: cwnd cwnd + 1/cwnd Linear growth of cwnd each RTT: cwnd cwnd + 1 TCP Congestion Control (Simon S. Lam) 19

Packet Loss Assumption: loss indicates congestion Packet loss detected by Retransmission timeout (RTO timer) Duplicate ACKs (at least 3) Packets 1 2 3 4 5 6 Acknowledgements 7 time 2 3 4 4 4 4 20 20

Fast Retransmit A timeout is quite long (> RTT) Upon receiving 3 dupacks, sender immediately retransmits without waiting for timeout Adjusts ssthresh ssthresh max(flightsize/2, 2) where flightsize is number of outstanding packets, which h may be less than W = min(rwnd, cwnd) Enter Slow Start (cwnd = 1) [TCP Tahoe] 21 21

22 TCP Tahoe (Jacobson 1988) cwnd SS CA time in RTTs SS: Slow Start CA: Congestion Avoidance Decrease to 1 for either timeout or 3 dupacks Fast retransmit on 3 dupacks 22

23 Successive Timeouts When there is another timeout, double the timeout value Keep doing so for each additional loss-retransmission Exponential backoff up to max timeout value equal to 64 times initial timeout value Note: red line in figure denotes first timeout 23

24 Summary: Tahoe Probe network for spare capacity during SS and CA and increase send rate Drastically reduce rate on loss indication for every ACK { if (W < ssthresh) then W W + 1 else W W + 1/W } for every loss indication { ssthresh W/2 W 1 } Self-clocking l Error recovery by retransmission fast retransmit upon 3 duplicate acks (SS) (CA) Need to estimate round trip time (to get T O value) 24

25 TCP Reno (Jacobson 1990) cwnd SS CA time SS: Slow Start CA: Congestion Avoidance Fast retransmit + fast recovery on 3 dupacks 25

26 TCP Reno (another scenario) cwnd TO 3 dupacks halved Initial slow start Slow start until cwnd reaches ssthresh h t 3 dupacks during initial slow start TCP Congestion Control (Simon S. Lam) 26

Fast recovery (in more detail) Idea: each dupack represents a packet successfully received. Therefore, no need for very drastic action Enter FR/FR after 3 dupacks Set ssthresh max(flightsize/2, 2) Retransmit lost packet Set cwnd ssthresh + #dupacks (window inflation) Wait till W=min(rwnd, cwnd) is large enough; transmit new packet(s) On non-dup ACK (1 RTT later), set cwnd ssthresh (window deflation) Enter CA 27 27

28 Example: FR/FR entry and exit S 1 2 3 4 5 6 7 8 1 9 10 11 1 1 1 1 1 1 1 9 loss R Exit FR/FR time time cwnd 8 7 9 11 ssthresh 4 4 4 4 4 Above scenario: Packet 1 is lost, packets 2, 3, and 4 are received; 3 dupacks with seq. no. 1 returned Fast retransmit Retransmit packet 1 upon 3 dupacks Fast recovery Inflate window with #dupacks such that new packets 9, 10, and 11 can be sent while repairing loss 28

29 AIMD in steady state additive increase: increase cwnd by 1 MSS every RTT in the absence of any loss event 24 Kbytes congestion window multiplicative decrease: cut cwnd in half after 3 dupacks Long-lived TCP connection 16 Kbytes 8 Kbytes time 29

30 TCP throughput (send rate) Approximate formula from Little s Law (assuming no loss) W RTT max. send rate packets/sec W changes with the arrival of each congestion indication To calculate l (average) send rate, we need the average value of W Q: W is a function of what parameter? 30

31 Ave. First approximation M. Mathis, et al., The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm, ACM Computer Communicatons Review, 27(3), 1997. No slow-start, no timeout, long-lived TCP connection cti n Independent identically distributed periods Three dupacks are received in a round with probability p # of RTTs 31

32 Geometric Distribution Ave. no. of transmissions to get first triple dupacks 1 n ib (1 ) i i i p p i1 i1 p i(1 p) i 1 i1 d i d p (1 p) p (1 p) dp dp i1 i0 d 1 1 p p dp 11 p p 1/ p 2 i Aside: ave. no. of transmissions to get first success is 1/(1-p) 32

33 First approximation on (cont.) Average number of packets delivered in one period (area under one saw-tooth) 2 2 W 1W 3 W 2 2 2 8 Average number of packets sent per period is 1/p Equate the two and solve for W, W we get 8 W 3p 2 send rate (in packets/sec) 3 2 W no. of packets/period 8 time per period W RTT 2 1/ p 1 3 2 RTT 2 p RTT 3p 33

34 TCP ACK generation [RFC 1122, RFC 2581] Event at Receiver Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed Arrival of in-order segment with expected seq #. One other segment has ACK pending Arrival of out-of-order segment higher-than-expect expect seq. #. Gap detected Arrival of segment that partially or completely l fills gap TCP Receiver action Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK Immediately send single cumulative ACK, ACKing both in-order segments Immediately send duplicate ACK, indicating seq. # of next expected byte Immediate send ACK, provided that segment starts t at lower end of gap 34

35 Receiver implements m Delayed ACKs Receiver sends one ACK for every two packets received -> each saw-tooth is WxRTT wide -> area under a saw-tooth is 2 Send rate is 3W 1 4 p 1/ p 1/ p 1 3 RTT W RTT 4/(3 p) RTT 4 p One ACK for every b packets received -> send rate is 1 3 RTT 2bp 35

A more detailed model Reference: J. Padhye, V. Firoiu, D. Towsley, and J. Kurose, Modeling TCP Throughput: h A Simple Model and its Empirical i Validation, Proceedings ACM SIGCOMM, 1998. 36 36

Motivation on Previous formulas not so accurate when loss rates are high TCP traces show that there are more loss indications due to timeouts (TO) than due to triple dupacks (TD) 37 37

38 Objectives More accurate steady-state throughput formula as a function of loss (indication) rate and RTT by also accounting for TO behavior of a TCP connection Formula applicable over a wider range of loss rates Explicit it statements t t of assumptions and approximations used in derivation of throughput formula Formula to include the impact of a small rwnd 38

Many assumptions and approximations A1. TCP sender is saturated, i.e., source application process always has a packet to send when send window has space available bulk transfer application A2. Slow Start not modeled A3. Time to send all packets in a window is smaller than RTT (as shown in slide 9) transmission rate is not too low 39 39

40 AIMD evolution of Window Size over time A4. Each TD period is ended d by a TD loss indication. TDP i period has duration A i RTTs A5. Duration of a round (RTT) is independent of window size poor assumption for a slow line A6. Fast Recovery not modeled 40

41 Loss assumptions A7. Losses in different rounds are independent A8. Losses within the same round are correlated as follows: If a packet is lost, all remaining packets transmitted until the end of that round are also lost all lost packets in the same round are counted as a single loss indication when estimating p A9. Assume that {A ij } and {W ij } are mutually independent i.i.d. sequences of random variables 41

42 AIMD throughput (send rate) A10. Assume {W i } to be a Markov regenerative process with rewards {Y i }, where Y i is the number of packets sent in TDP i 1 p E [ W ] E[ Y] p 1/ p send rate Bp ( ) o(1/ p) EA [ ] EA [ ] 2b RTT 3p 1 3 RTT 2bp o(1 / p) 42

43 AIMD with Timeouts Let Y ij denote number of packets sent in jth period of Z i TD 43

44 AIMD with Timeouts (cont.) Let n i denote the number of TD periods within a cycle which ends in i -th TO period, R i denote no. of retransmissions in i- th TO period A11. {n i } form an i.i.d. sequence, independent of {Yij} and {Aij} M i ni ni TO Yij Ri, Si Aij Zi j 1 j 1 TCP Congestion Control (Simon S. Lam) 44

45 Throughput of AIMD with TO EM [ ] En [ ] EY [ ] ER [ ] TO ES [ ] EnEA [ ] [ ] EZ [ ] EM [ ] EnEY [ ] [ ] ER [ ] send rate B TO ES [ ] EnEA [ ] [ ] EZ [ ] EY [ ] Q ER [ ] B EA TO [ ] Q EZ [ ] 1 where Q En [ ] ER [ ] 1 1 p TO with Q and E[ Z ] to be determined Assumption of Markov regenerative process again. <- Probability that a given loss indication is a TO TCP Congestion Control (Simon S. Lam) 45

46 Throughput of AIMD with TO (cont.) B ( p ) Prob[a loss indication is a TO] 1 p 1 EW [ ] Q( EW [ ]) p 1 p <- Eq. (27) f( p) more accurate EA [ ] QEW ( [ ]) TO version of 1 p throughput 1/ p formula 2b 3bp 3p 8 1 2 RTT min 1, 3 (1 32 p ) T 0 2 bp 3bp 2 RTT min 1, 3 p(1 32 p ) T 3 8 0 Extra term added to account for Timeout <-Eq. (29) most wellknown version of throughput formula 46

47 Impact of receiver s flow control limitation approximate it ti i t model Using the well-known Eq. (29) from before, Wmax 1 B( p) min(, ) RTT 2 bp 3 bp 2 RTT min 1,3 p(1 32 p ) T0 3 8 where W is the maximum window size allowed by receiver max The above is Eq. (32) referred to as the approximate model. The full model is Eq. (31) in the paper. 47

48 Summary data from traces (1 hour) Saturated TCP sender p computed from dividing total no. of loss indications by total number of packets sent RTT and T O values are averaged over entire 1-hour trace 48

Summary data from 100s traces Each row represents results from 100 traces each 100 seconds long for same S-D pair Totals are cumulative over 100 traces RTT and T O are average values over 100 traces for same S-D pair 49 49

50 Experimental comparison (1) rwnd Each point represents number of packets in 100s interval of trace T0 ~ single TO, T1 ~ at least 1 double TO in trace, etc. TD Only is analytic model by Mathis et al. Note: W max is only 6 in Figure 7 50

51 Experimental comparison (2) W max = 33 W max =44 51

52 Experimental comparison (3) W max =8 W max =48 52

53 Accuracy of approximate model Figure 18: manic to spiff, with predictions by both full and approximate models (W max =32) max 53

54 Average errors ave. error observations N predicted N Nobserved no. of observations observed 54

55 Conclusions A more detailed analysis than the one by Mathis et al. Numerous assumptions and approximations used but (almost) all of them are explicitly stated Large amount of experimental measurements on the Internet to validate accuracy of the full model (less for the approximate model) Throughput formula accounts for loss indications due to TO as well as rwnd restriction Using the formula requires accurate measurements of loss rate and RTT values (which could be tricky) For TCP Reno and drop-tail router Accuracy (like beauty) is in the eye of the beholder. What do you think? 55

56 Challenge in the future TCP average throughput (approximate) in terms of loss (indication) rate, p 1.22MSS RTT p Example: 1500-byte segments, 100ms RTT, to get 10 Gbps throughput, loss rate needs to be very low p = 2x10-10 New version of TCP needed for connections with large delay-bandwidth product One proposal: Katabi et al. (Sigcomm 2002) Revised congestion control algorithms for data center networks Very low delay, high bandwidth 56

57 The end 57