Computer Networks
Lecture: Flow Control, Reliable Delivery

TCP Flow Control

The receiver side of a TCP connection maintains a receiver buffer:
- the application process may be slow at reading from the buffer

Flow control ensures that the sender won't overflow the receiver's buffer by transmitting too much, too fast.

Sliding Window

TCP uses sliding window flow control:
- allows a larger amount of data in flight than has been acknowledged
- allows the sender to get ahead of the receiver, but not too far ahead

[Figure: TCP sending process (last byte written, next byte ACKed (snd_una), next byte sent (snd_next)) and TCP receiving process (last byte read, next byte expected, last byte received)] [Rexford]

TCP Receiver Window

Receiver window size (rwnd):
- the amount that can be sent without acknowledgment
- the receiver can buffer this amount of data

The receiver continually advertises the buffer space available to the sender by including the current value of rwnd in the TCP header.

The sender limits unACKed data to rwnd:
- guarantees the receiver buffer won't overflow

[Figure: byte stream divided, in order, into data ACK'd, outstanding un-ACK'd data, data OK to send (within the window size), and data not OK to send yet] [Rexford]
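The window bookkeeping above can be sketched in a few lines of Python. This is a minimal illustration, not TCP's actual implementation: the function names and the `buffer_size` parameter are hypothetical, and sequence-number wraparound is ignored.

```python
def advertised_window(buffer_size: int, last_byte_received: int, last_byte_read: int) -> int:
    """Receiver side: free buffer space, advertised to the sender as rwnd."""
    buffered = last_byte_received - last_byte_read  # received but not yet read by the application
    return buffer_size - buffered


def sendable_bytes(snd_una: int, snd_next: int, rwnd: int) -> int:
    """Sender side: how many more bytes may be sent without exceeding rwnd."""
    outstanding = snd_next - snd_una  # sent but not yet acknowledged
    return max(0, rwnd - outstanding)
```

For example, with 600 bytes outstanding and an advertised window of 1000 bytes, the sender may transmit 400 more bytes before it has to wait for an ACK.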
TCP Header with rwnd

[Figure: TCP header layout showing source port, destination port, sequence number, acknowledgment, flags (SYN, FIN, RST, PSH, URG, ACK), window size, checksum, urgent pointer, options (variable), and data] [Rexford]

TCP Flow Control Problems

Two flow-control problems:
- receiver too slow (silly-window syndrome)
- sender's data comes in small amounts (Nagle's algorithm)

Silly-window syndrome: the receiver window opens only by a small amount, hence the sender can send only a small amount of data at a time.

Why is this not good?
- packet header overhead
- small packets cause more interrupts at a busy receiver

[Figure: sender and destination host timeline; each time the application reads out a byte, the window reopens slightly and the sender transmits another small segment]

Solution to Silly-Window Syndrome

Don't advertise the window until it opens significantly (> ½*MSS or ½*rwnd).

Implementation alternatives:
- ACK with rwnd=0: the sender probes after the persistence timer goes off
- delayed ACK, but delayed not more than 500 ms or ACK every other segment (Why?)

[Figure: sender and destination host timeline showing the persist timer and the window probe]

Characteristics of Interactive Applications

The user sends only a small amount of data, e.g., instant messaging sends one character at a time.

Problem: 40-byte header for every byte sent!

Solution: clumping, i.e., the sender waits for a reasonable amount of time and clumps data together before sending.

How long is reasonable?
Nagle Algorithm

- send the first segment immediately
- accumulate data until the ACK returns, or up to ½ sender window or ½ MSS

Advantages:
- bulk transfer is not held up
- data is sent as fast as the network can deliver (see next slide)

Nagle Algorithm

Nagle sends data as fast as the network can deliver:

[Figure: two sender/receiver timelines over one round-trip time (RTT), without Nagle and with Nagle; without Nagle, many small packets and ACKs are exchanged; with Nagle, data accumulated during the RTT is sent together once the ACK returns]

Can be disabled by setsockopt(TCP_NODELAY)

TCP Error Recovery

Sender:
- maintains only one active timer, for snd_una, restarting the timer after retransmission

TCP Go-back-N with Buffering (for illustration only)

Receiver:
- cumulative ACKs all packets received in-order
- out-of-order packets repeat the last ACK
- buffers out-of-order packets

Error recovery in TCP is actually more complicated because it is tied up with congestion control, but it still relies on retransmission timeout for correctness.
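The receiver behavior listed above (cumulative ACKs, repeating the last ACK, buffering out-of-order packets) can be sketched as follows. This is a simplified illustration under stated assumptions: the `Receiver` class and its method names are hypothetical, segments are assumed never to partially overlap, and sequence-number wraparound is ignored.

```python
class Receiver:
    """Toy receiver: cumulative ACKs plus buffering of out-of-order segments."""

    def __init__(self, initial_seq: int = 0):
        self.next_expected = initial_seq  # next in-order byte we are waiting for
        self.out_of_order = {}            # seq -> payload, held until the gap fills
        self.delivered = bytearray()      # in-order data handed to the application

    def _deliver(self, payload: bytes) -> None:
        self.delivered += payload
        self.next_expected += len(payload)

    def on_segment(self, seq: int, payload: bytes) -> int:
        """Process one segment and return the cumulative ACK number."""
        if seq == self.next_expected:
            self._deliver(payload)
            # The gap may now be filled: drain any buffered segments that follow.
            while self.next_expected in self.out_of_order:
                self._deliver(self.out_of_order.pop(self.next_expected))
        elif seq > self.next_expected:
            self.out_of_order[seq] = payload  # buffer it; the ACK below repeats the last one
        # seq < next_expected is a duplicate and is ignored.
        return self.next_expected
```

If bytes 0-1 arrive, then bytes 4-5, the second ACK repeats 2; once bytes 2-3 arrive, the ACK jumps to 6 because the buffered segment is delivered as well.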
Retransmission Timeout

ARQ depends on retransmission to achieve reliability: the sender sets a timeout waiting for an ACK.

The retransmission timeout (RTO) is computed from the round-trip time (RTT):
- expects the ACK to arrive after an RTT
- but on the Internet, the RTT of a path varies over time, due to:
  - route changes
  - congestion

Varying RTT complicates the computation of:
- retransmission timeout (RTO)
- optimal sender's window size

[Figure: sender/receiver timeline of packets and ACKs, marking the round-trip time (RTT) and the retransmission timeout (RTO)]

Implications of Bad RTO

- RTO too small: unnecessary retransmissions
- RTO too big: lower throughput

[Figure: sender, router, and destination timelines for each case]

Estimating RTT

RTO must adapt to the actual and current RTT.

Estimate the RTT by watching returning ACKs:
- compute a smoothed estimate by keeping a running average of the RTTs (a.k.a. Exponentially Weighted Moving Average (EWMA))

  estimated_rtt = α * estimated_rtt + (1 - α) * sample_rtt

where
- sample_rtt: time between when a segment is transmitted and when its ACK is received
- α is the weight:
  - α → 1: each sample changes the estimate only a little bit
  - α → 0: each sample influences the estimate heavily
- α is typically chosen so that 1 - α is a power of ½, which allows for fast implementation (right shifts)

[Figure: example RTT estimation]
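The EWMA update is a one-line computation; the sketch below shows it in Python. The function name is hypothetical, and the default α of 7/8 is an assumption chosen for illustration (it matches the 1/8 sample weight recommended in RFC 6298).

```python
def update_estimated_rtt(estimated_rtt: float, sample_rtt: float, alpha: float = 0.875) -> float:
    """One EWMA step: keep alpha of the old estimate, blend in (1 - alpha) of the new sample."""
    return alpha * estimated_rtt + (1 - alpha) * sample_rtt


if __name__ == "__main__":
    estimate = 100.0  # assumed starting estimate, in ms
    for sample in [100.0, 200.0, 100.0, 100.0]:
        estimate = update_estimated_rtt(estimate, sample)
        print(f"sample={sample:.1f} ms  estimate={estimate:.2f} ms")
```

With α = 7/8, a single 200 ms sample moves a 100 ms estimate only to 112.5 ms, which shows how a large α damps the effect of outliers.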
How to Compute RTO?

First try: RTO = β * RTT, with β typically set to a small constant.

Two problems:
- an ACK acknowledges receipt of data; it is not an ACK for a particular transmission: which packet should an ACK be associated with in the case of retransmission?
- RTTs spread too wide

[Figure: sender/destination timeline of pkt i, its retransmission, and ack i; pdf of RTT samples around the mean, where packets with samples beyond the RTO must be retransmitted]

ACK Ambiguity

Which transmission of a retransmitted packet should an ACK be associated with?
- original packet: RTO can grow unbounded
- retransmitted packet: RTO shrinks

[Figure: sender/destination timelines for the two cases]

There is a feedback loop between RTO computation and RTT estimate.

ACK Ambiguity: Karn's Algorithm

Karn's algorithm:
- adjust the RTT estimate only from non-retransmitted samples
- however, ignoring retransmissions could lead to insensitivity to long delays
- so, back off RTO upon retransmission: RTO_new = γ * RTO_old, γ typically = 2

RTT Spread Too Wide

The RTT estimate computed using EWMA only considers the mean; it doesn't take variance into account.

[Figure: pdf of RTT samples around the mean; packets with samples beyond the RTO must be retransmitted]

Jacobson's algorithm:
- estimate the deviation of sample_rtt:

  deviation_new = α * deviation_old + (1 - α) * |sample_rtt - estimated_rtt|

- compute the new estimated_rtt as usual
- take the deviation in sample_rtt into account when computing RTO:

  RTO = estimated_rtt + 4 * deviation
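Karn's and Jacobson's rules combine naturally into one small estimator. The sketch below is illustrative only: the `RtoEstimator` class, its method names, and the zero initial deviation are assumptions, and a single α is used for both averages, following the formulas above.

```python
class RtoEstimator:
    """Toy RTO estimator: Jacobson's deviation term plus Karn's sampling and backoff rules."""

    def __init__(self, initial_rtt: float, alpha: float = 0.875, gamma: float = 2.0):
        self.alpha = alpha
        self.gamma = gamma
        self.estimated_rtt = initial_rtt
        self.deviation = 0.0      # assumption: start with no observed deviation
        self.rto = initial_rtt

    def on_ack(self, sample_rtt: float, retransmitted: bool) -> float:
        """Update the estimates from an ACK and return the current RTO."""
        if retransmitted:
            # Karn: the sample is ambiguous, so it must not touch the estimates.
            return self.rto
        # Jacobson: update the deviation against the old estimate, then the estimate itself.
        error = abs(sample_rtt - self.estimated_rtt)
        self.deviation = self.alpha * self.deviation + (1 - self.alpha) * error
        self.estimated_rtt = self.alpha * self.estimated_rtt + (1 - self.alpha) * sample_rtt
        self.rto = self.estimated_rtt + 4 * self.deviation
        return self.rto

    def on_timeout(self) -> float:
        """Karn: back off the RTO when a retransmission is needed."""
        self.rto = self.gamma * self.rto
        return self.rto
```

Note that the backed-off RTO stays in effect until the next unambiguous sample arrives, at which point `on_ack` recomputes it from the estimates.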
Timers Used in TCP

1. TIME_WAIT: 2*MSL
2. persistence timer
3. RTO
4. keep-alive timer:
   - probes the other side if the connection has been idle for too long
   - may be turned on/off
   - idle period may be set using setsockopt()
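As a sketch of how the keep-alive timer is controlled through setsockopt(), the Python snippet below turns it on for a socket. The helper name `enable_keepalive` and the 60-second default are hypothetical; `SO_KEEPALIVE` is the portable on/off switch, while `TCP_KEEPIDLE` (the idle period) is platform-specific and is skipped where the constant is not available.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_seconds: int = 60) -> None:
    """Turn on TCP keep-alive and, where supported, set the idle period before probing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE is not defined on every platform (it is on Linux), so guard the call.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
```

Typical use is to call `enable_keepalive(sock)` right after creating or accepting a connection; passing 0 instead of 1 for `SO_KEEPALIVE` turns the timer off again.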