The Transport Layer in the Internet (the TCP/IP suite)
C. Pham
Université de Pau et des Pays de l'Adour, Département Informatique
http://www.univ-pau.fr/~cpham
Congduc.Pham@univ-pau.fr
Course by C. Pham, Univ. Pau
The transport layer in the Internet
The transport layer (layer 4) consists of two protocols: TCP (RFC 793) and UDP (RFC 768)
TCP
  connection-oriented mode
  reliable
  congestion control
UDP
  datagram mode
  unreliable
The receiving protocol is selected from the protocol field of the IP packet
[Figure: IP packet whose data area carries UDP, TCP or ICMP]
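The selection on the protocol field can be sketched as follows. This is a minimal illustration, not a real IP stack: the helper names `upper_layer_protocol` and `make_ipv4_header` and the addresses are hypothetical, and the header checksum is simply left at 0. The protocol numbers (1 for ICMP, 6 for TCP, 17 for UDP) are the standard assigned values.

```python
import struct

# Standard IPv4 protocol numbers for the three protocols named on the slide.
PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}


def upper_layer_protocol(ip_header: bytes) -> str:
    """Return the name of the protocol carried in the IPv4 data area."""
    protocol = ip_header[9]  # the protocol field is the 10th byte of the header
    return PROTOCOLS.get(protocol, f"unknown ({protocol})")


def make_ipv4_header(protocol: int) -> bytes:
    """Build a minimal 20-byte IPv4 header (hypothetical values, checksum 0)."""
    version_ihl = (4 << 4) | 5  # version 4, header length of five 32-bit words
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, 20,      # version/IHL, type of service, total length
        0, 0,                    # identification, flags/fragment offset
        64, protocol, 0,         # TTL, protocol, header checksum (not computed)
        bytes([192, 0, 2, 1]),   # source address (documentation range)
        bytes([192, 0, 2, 2]),   # destination address (documentation range)
    )


print(upper_layer_protocol(make_ipv4_header(17)))  # prints "UDP"
```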
Standard ports
  SMTP: 25, HTTP: 80, Telnet: 23
  FTP data transfer: 20, FTP control: 21
  DNS: 53
UDP Port Numbers (some examples)
  0    Reserved
  7    Echo
  11   Users (gives list of active users)
  13   Daytime
  17   Quote (gives the quote of the day)
  53   Domain (Domain name server)
  67   BOOTPS (Bootstrap Protocol Server)
  68   BOOTPC (Bootstrap Protocol Client)
  69   TFTP (Trivial File Transfer Protocol)
  123  NTP (Network Time Protocol)
UDP Checksum
Data covered by the checksum: Pseudo-header | UDP header | UDP data area | Pad
  Pseudo-header fields: Source IP Address, Destination IP Address, Zero, Proto, UDP Length
  Pad: zeros added to bring the length to a multiple of 16 bits
Checksum := the 16-bit one's complement of the one's complement sum of all fields, taken 16 bits at a time.
The checksum field itself is taken as 0 for this computation.
Including the pseudo-header goes against strict layering!
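The layout above can be turned into a small sketch that builds a UDP segment with its checksum and verifies it on reception. The function names, addresses and ports are hypothetical; the pseudo-header layout (source address, destination address, zero byte, protocol 17, UDP length) follows the slide.

```python
import socket
import struct

UDP_PROTO = 17  # value of the Proto field in the pseudo-header


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum; odd-length data is padded with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    return struct.pack("!4s4sBBH", socket.inet_aton(src_ip),
                       socket.inet_aton(dst_ip), 0, UDP_PROTO, udp_length)


def build_udp_segment(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> bytes:
    length = 8 + len(payload)  # 8-byte UDP header plus data
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)  # checksum = 0
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, length) + header + payload)
    checksum = ~total & 0xFFFF
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def verify_udp_segment(src_ip, dst_ip, segment: bytes) -> bool:
    """With the checksum included, the sum must come out as all ones."""
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    return total == 0xFFFF


segment = build_udp_segment("192.0.2.1", "192.0.2.2", 5000, 53, b"hello")
print(verify_udp_segment("192.0.2.1", "192.0.2.2", segment))
```

Because the addresses are part of the sum, a segment delivered to the wrong destination address fails verification, which is the practical reason for the layering violation.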
UDP checksum
Goal: detect errors (e.g., flipped bits) in the transmitted segment
Sender:
  treats segment contents as a sequence of 16-bit integers
  checksum: one's complement of the one's complement sum of the segment contents
  sender puts the checksum value into the UDP checksum field
  in reality, some IP header fields are included with the UDP segment for checksumming
Receiver:
  computes the checksum of the received segment
  checks whether the computed checksum equals the checksum field value:
    NO: error detected
    YES: no error detected. But maybe errors nonetheless? More in chap. 5 on stronger error detection methods
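The "maybe errors nonetheless" remark can be illustrated with a short sketch. The word values are arbitrary and `checksum16` is a hypothetical helper name: a single flipped bit changes the checksum, but swapping two 16-bit words does not, because addition does not depend on order.

```python
def checksum16(words):
    """One's complement of the one's complement sum of 16-bit words."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return ~total & 0xFFFF


original = [0x1234, 0xABCD, 0x0F0F]
bit_flipped = [0x1235, 0xABCD, 0x0F0F]  # lowest bit of the first word flipped
reordered = [0xABCD, 0x1234, 0x0F0F]    # first two words swapped in transit

print(checksum16(original) != checksum16(bit_flipped))  # the flip is detected
print(checksum16(original) == checksum16(reordered))    # the swap goes unnoticed
```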
UDP Checksum Example
Consider three 16-bit words:
  0110011001100110
  0101010101010101
  0000111100001111
The (one's complement) sum of the first two 16-bit words is:
  1011101110111011
Adding the third word to the above sum gives:
  1100101011001010
One's complement of this sum => invert 0s and 1s:
  0011010100110101 (this is the checksum field)
If there are no errors, the sum of all four 16-bit words (incl. checksum) will be all 1s, i.e., 1111111111111111
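The arithmetic of this example can be replayed in a few lines. The helper name `ones_complement_add` is only illustrative; the three words are the ones from the slide.

```python
def ones_complement_add(a: int, b: int) -> int:
    """Add two 16-bit words, wrapping any carry back into the low bit."""
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


words = [0b0110011001100110, 0b0101010101010101, 0b0000111100001111]

running_sum = 0
for word in words:
    running_sum = ones_complement_add(running_sum, word)

checksum = ~running_sum & 0xFFFF                       # invert 0s and 1s
receiver_sum = ones_complement_add(running_sum, checksum)

print(f"sum      = {running_sum:016b}")
print(f"checksum = {checksum:016b}")
print(f"check    = {receiver_sum:016b}")
```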
TCP Checksum
Data covered by the checksum: Pseudo-header | TCP header | TCP data area | Pad
  Pseudo-header fields: Source IP Address, Destination IP Address, Zero, Proto, TCP Length
  Pad: zeros added to bring the length to a multiple of 16 bits
Checksum := the 16-bit one's complement of the one's complement sum of all fields, taken 16 bits at a time.
The checksum field itself is taken as 0 for this computation.
Including the pseudo-header goes against strict layering!
TCP Error Correction
Sliding window error correction
  Cumulative acknowledgment
    Position in the stream up to which all bytes have been received
  Acknowledgments piggybacked on reverse traffic
  Retransmission policy is implementation dependent
Adaptive time-out
  Network delays vary widely due to traffic fluctuations
  Round-trip time continuously monitored
  Time-out based on a weighted average of round-trip times
Congestion control
  Receiver congestion prevented by adapting the window size
  Network congestion detected by round-trip delay analysis
  Congestion cured by slowing down transmissions
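TCP: retransmission scenarios
[Figure: two time-sequence diagrams between Host A and Host B]
Scenario 1: lost ACK
  A sends Seq=92, 8 bytes data
  B replies ACK=100, which is lost (X)
  A's timeout expires; A retransmits Seq=92, 8 bytes data
  B replies ACK=100
Scenario 2: premature timeout, cumulative ACKs
  A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  B replies ACK=100, then ACK=120
  The Seq=92 timeout expires before ACK=100 arrives; A retransmits Seq=92, 8 bytes data
  B replies with the cumulative ACK=120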
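Both scenarios follow from how the receiver computes cumulative ACKs. The sketch below is a simplified model, assuming a receiver that only accepts in-order data and ignores everything else (no out-of-order buffering); the class name `CumulativeAckReceiver` is hypothetical.

```python
class CumulativeAckReceiver:
    """Simplified receiver: tracks the next byte expected and ACKs it."""

    def __init__(self, next_expected: int):
        self.next_expected = next_expected

    def receive(self, seq: int, length: int) -> int:
        # In-order data advances the stream position; duplicates (or data
        # beyond the expected byte, in this simplified model) do not.
        if seq == self.next_expected:
            self.next_expected += length
        return self.next_expected  # ACK number = next byte expected


# Scenario 1: the first ACK=100 is lost, so the same segment arrives twice.
b = CumulativeAckReceiver(92)
print(b.receive(92, 8), b.receive(92, 8))                     # 100 100

# Scenario 2: premature timeout, the retransmission is answered with ACK=120.
b = CumulativeAckReceiver(92)
print(b.receive(92, 8), b.receive(100, 20), b.receive(92, 8))  # 100 120 120
```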
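TCP Window Flow Control: sender side
[Figure: byte stream divided into three regions: sent and acknowledged | sent but not acknowledged | not yet sent; the window spans the unacknowledged data plus the data that may still be sent; a pointer marks the next byte to send]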
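The three regions of the sender's stream can be tracked with two pointers and the advertised window. The sketch below is a minimal model that ignores sequence-number wraparound; the class `SenderWindow` is hypothetical, while the names `snd_una` (oldest unacknowledged byte) and `snd_nxt` (next byte to send) mirror the SND.UNA and SND.NXT variables of RFC 793.

```python
from dataclasses import dataclass


@dataclass
class SenderWindow:
    snd_una: int  # first byte sent but not yet acknowledged
    snd_nxt: int  # next byte to send
    window: int   # window advertised by the receiver, in bytes

    def in_flight(self) -> int:
        """Bytes sent but not acknowledged."""
        return self.snd_nxt - self.snd_una

    def usable(self) -> int:
        """Bytes that may still be sent before the window is full."""
        return max(0, self.snd_una + self.window - self.snd_nxt)

    def send(self, nbytes: int) -> int:
        """Send as much as the window allows; return the number of bytes sent."""
        sent = min(nbytes, self.usable())
        self.snd_nxt += sent
        return sent

    def on_ack(self, ack: int, window: int) -> None:
        """A new cumulative ACK slides the left edge and updates the window."""
        if self.snd_una < ack <= self.snd_nxt:
            self.snd_una = ack
        self.window = window


w = SenderWindow(snd_una=100, snd_nxt=100, window=50)
print(w.send(80), w.usable())   # 50 0: the window is full
w.on_ack(120, 50)
print(w.in_flight(), w.usable())  # 30 20: the window slid by 20 bytes
```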
Window Flow Control: receiver side
[Figure: the TCP headers of a packet sent and a packet received, each with the fields Source Port, Dest. Port, Sequence Number, Acknowledgment, HL/Flags, Window, D. Checksum, Urgent Pointer, Options; the application writes into a byte stream divided into: acknowledged | sent | to be sent | outside window]
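The header fields listed in the figure occupy a fixed 20-byte layout before any options, which can be decoded as sketched below. The function name `parse_tcp_header` and the sample port numbers are hypothetical; only the six flags of RFC 793 are decoded.

```python
import struct

FLAG_NAMES = {0x01: "FIN", 0x02: "SYN", 0x04: "RST",
              0x08: "PSH", 0x10: "ACK", 0x20: "URG"}


def parse_tcp_header(header: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are ignored)."""
    (src_port, dst_port, seq, ack, hl_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", header[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_length": (hl_flags >> 12) * 4,  # HL is counted in 32-bit words
        "flags": sorted(name for bit, name in FLAG_NAMES.items() if hl_flags & bit),
        "window": window,                        # bytes the sender of this segment can accept
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


# Hypothetical SYN+ACK segment: header length 5 words, flags SYN (0x02) + ACK (0x10).
sample = struct.pack("!HHIIHHHH", 80, 50000, 1823083521, 1415531522,
                     (5 << 12) | 0x12, 4096, 0, 0)
fields = parse_tcp_header(sample)
print(fields["flags"], fields["window"])
```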
How to estimate RTT?
RTT = propagation delay + queuing delay
  Queuing delay is highly variable
  So different RTT samples will include different random values of queuing delay
Q: how to estimate RTT?
  SampleRTT (M): measured time from segment transmission until ACK receipt
  M will vary wildly
  Use several recent measurements, not just the current SampleRTT, to calculate "AverageRTT" (A)
  $A \leftarrow (1-x)\,A + x\,M$, with $x = 0.1$
  and then set $RTO = \beta A$ (old version)
(slide modified by C. Pham)
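The old estimator is a plain exponentially weighted moving average. In the sketch below, the function names and the RTT samples are hypothetical, and β = 2 is an assumed value (the slide does not give one).

```python
def update_average_rtt(average: float, sample: float, x: float = 0.1) -> float:
    """A <- (1 - x) * A + x * M"""
    return (1 - x) * average + x * sample


def old_rto(average: float, beta: float = 2.0) -> float:
    """RTO = beta * A; beta = 2 is an assumption made for this sketch."""
    return beta * average


average = 100.0                                   # ms, hypothetical starting estimate
for sample in [100.0, 300.0, 100.0, 100.0]:       # one delayed ACK among normal ones
    average = update_average_rtt(average, sample)
    print(f"M={sample:6.1f}  A={average:7.2f}  RTO={old_rto(average):7.2f}")
```

With x = 0.1 a single outlier moves the average by only a tenth of its distance from the current estimate, which is why the estimate is stable but slow to follow real changes.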
Round Trip Time and Timeout (II)
New version (Jacobson, 1988)
  A constant multiple of the mean is not good
  Better to use the mean (A) and the variance (D)
Setting the RTO timeout
  $Err = M - A$
  $A \leftarrow A + g\,Err$, with $g = 1/8$
  $D \leftarrow D + h\,(|Err| - D)$, with $h = 1/4$
  $RTO = A + 4D$
  In 1988, Jacobson specified 2D, but corrected it to 4D in 1990
Example
  Initially A = 0, D = 3 s and RTO = A + 2D, so RTO_1 = 6 s
  If M = 1.5 s for the 1st segment:
    Err = 1.5 - 0 = 1.5
    A = 0 + 0.125 * 1.5 = 0.1875
    D = 3 + 0.25 * (|1.5| - 3) = 2.625
    RTO_2 = 0.1875 + 4 * 2.625 = 10.6875
(slide added by C. Pham)
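The update rules and the worked example can be checked with a small sketch. The class name `JacobsonEstimator` is hypothetical; the gains g = 1/8 and h = 1/4 and the initial values A = 0, D = 3 s are those of the slide.

```python
class JacobsonEstimator:
    """Mean/deviation RTT estimator using the slide's update rules."""

    def __init__(self, a: float = 0.0, d: float = 3.0,
                 g: float = 1 / 8, h: float = 1 / 4):
        self.a = a  # smoothed RTT (A), seconds
        self.d = d  # smoothed deviation (D), seconds
        self.g = g
        self.h = h

    def update(self, m: float) -> float:
        """Feed one measurement M and return the new RTO = A + 4D."""
        err = m - self.a
        self.a += self.g * err
        self.d += self.h * (abs(err) - self.d)
        return self.rto()

    def rto(self) -> float:
        return self.a + 4 * self.d


est = JacobsonEstimator()
print(est.a + 2 * est.d)   # initial RTO with the 2D rule: 6.0
print(est.update(1.5))     # 10.6875, as in the example
print(est.a, est.d)        # 0.1875 2.625
```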
Timer Granularity
Many TCP implementations set RTO in multiples of 200, 500 or 1000 ms
Why?
  Avoid spurious timeouts
    RTTs can vary quickly due to cross traffic
    Delayed-ACK timer can delay valid ACKs by up to 200 ms
  Make timer interrupts efficient
What happens for the first couple of packets?
  Pick a very conservative value (seconds)
  Can lead to a stall if an early packet is lost
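The effect of a coarse timer can be sketched as a rounding step applied to the computed RTO. Rounding up to the next tick is an assumption made here for illustration (actual implementations differ in how they count ticks), and `round_up_to_tick` is a hypothetical helper.

```python
def round_up_to_tick(rto_ms: int, tick_ms: int) -> int:
    """Round an RTO up to the next multiple of the timer tick (integer ms)."""
    return -(-rto_ms // tick_ms) * tick_ms  # ceiling division, then scale back


for tick in (200, 500, 1000):
    print(tick, round_up_to_tick(10688, tick))
```

Working in integer milliseconds avoids floating-point surprises when the RTO is already an exact multiple of the tick.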
Connection establishment
Client requests connection
  Client -> Server: SYN 1415531521:1415531521(0) <mss 1024>
Server ACKs receipt of SYN, sends back SYNACK
  Server -> Client: SYN 1823083521:1823083521(0) ACK 1415531522 <mss 1024>
Client ACKs receipt of SYNACK
  Client -> Server: ACK 1823083522
Trace obtained with tcpdump, showing seq#:implied_seq#(ndata)
(slide added by C. Pham)
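The ACK numbers in the trace follow from the rule that a SYN consumes one sequence number. A minimal sketch, with a hypothetical helper name and the initial sequence numbers taken from the trace:

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


def ack_for_syn(initial_seq: int) -> int:
    """The SYN occupies one sequence number, so the ACK is ISN + 1 (mod 2^32)."""
    return (initial_seq + 1) % SEQ_SPACE


client_isn = 1415531521
server_isn = 1823083521

print(ack_for_syn(client_isn))  # 1415531522, carried by the server's SYNACK
print(ack_for_syn(server_isn))  # 1823083522, carried by the client's final ACK
```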
The congestion phenomenon
[Figure: network with 10 Mbps, 1.5 Mbps and 100 Mbps links]
Too many packets sent to the same interface
Bandwidth differs from one network to another
Main consequence: packet losses in routers
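A back-of-the-envelope calculation shows how quickly a bandwidth mismatch turns into losses. The sketch uses a simple fluid model, and the 64 KB router buffer is a hypothetical value chosen only for illustration; the link rates are those of the figure.

```python
def time_to_overflow(arrival_bps: float, service_bps: float, buffer_bits: float) -> float:
    """Seconds until a buffer fills when traffic arrives faster than it leaves."""
    if arrival_bps <= service_bps:
        return float("inf")  # the queue never builds up
    return buffer_bits / (arrival_bps - service_bps)


buffer_bits = 64 * 1024 * 8  # hypothetical 64 KB router buffer

# Traffic from the 10 Mbps network forwarded onto the 1.5 Mbps link.
print(f"{time_to_overflow(10e6, 1.5e6, buffer_bits) * 1000:.1f} ms")

# Traffic from the 1.5 Mbps link forwarded onto the 100 Mbps network.
print(time_to_overflow(1.5e6, 100e6, buffer_bits))
```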
Congestion in the network (2)
Flow control is for receivers
Congestion control is for the network
[Figure from Computer Networks, A. Tanenbaum]
Congestion collapse was first observed in 1986 by V. Jacobson. Congestion control was added to TCP (TCP Tahoe) in 1988.
TCP congestion control: the big picture
[Figure from Computer Networks, A. Tanenbaum: sequence number versus time, showing packets and ACKs; the congestion window doubles every round-trip time]
cwnd grows exponentially (slow start), then linearly (congestion avoidance) with 1 more segment per RTT
On loss, the threshold is set to half the current window (multiplicative decrease) and transmission restarts with cwnd = 1 packet
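The window evolution described above can be sketched with a per-RTT model. This is a simplified illustration, not an implementation of any particular TCP version: windows are counted in whole segments, the initial threshold of 16 segments and the loss at RTT 7 are hypothetical, and `simulate_cwnd` is an illustrative name.

```python
def simulate_cwnd(rtts: int, ssthresh: int, loss_at: set) -> list:
    """Return the congestion window (in segments) used during each RTT."""
    cwnd = 1
    history = []
    for t in range(rtts):
        history.append(cwnd)
        if t in loss_at:
            ssthresh = max(cwnd // 2, 1)      # multiplicative decrease
            cwnd = 1                          # restart from one packet
        elif cwnd < ssthresh:
            cwnd = min(2 * cwnd, ssthresh)    # slow start: double every RTT
        else:
            cwnd += 1                         # congestion avoidance: +1 per RTT
    return history


print(simulate_cwnd(rtts=12, ssthresh=16, loss_at={7}))
# [1, 2, 4, 8, 16, 17, 18, 19, 1, 2, 4, 8]
```

After the loss at a window of 19 segments, the threshold becomes 9, so the second slow-start phase would stop doubling at 9 instead of 16.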
AIMD Phase plot
[Figure: phase plot with User 1's allocation x_1 on the horizontal axis and User 2's allocation x_2 on the vertical axis; point t_0; fairness line x_1 = x_2; efficiency line x_1 + x_2 = C; convergence point]
Multiplicative decrease preserves fairness because the users' allocation ratio remains the same
  Ex: $\frac{x_2}{x_1} = \frac{b\,x_2}{b\,x_1}$
Assumption: the decrease policy must (at minimum) reverse the load increase over and above the efficiency line
Implication: the decrease factor should be set conservatively to account for any congestion detection lags, etc.
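The convergence shown in the phase plot can be reproduced numerically. The sketch below assumes two synchronized users who both see the same congestion signal; the starting allocations, the capacity C = 10, the additive step a = 1 and the decrease factor b = 0.5 are hypothetical values, and `aimd` is an illustrative name.

```python
def aimd(x1: float, x2: float, capacity: float,
         a: float = 1.0, b: float = 0.5, steps: int = 200):
    """Run additive-increase/multiplicative-decrease for two synchronized users."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = b * x1, b * x2   # above the efficiency line: both back off
        else:
            x1, x2 = x1 + a, x2 + a   # below it: both probe for more bandwidth
    return x1, x2


x1, x2 = aimd(1.0, 8.0, capacity=10.0)
print(f"x1={x1:.4f}  x2={x2:.4f}  gap={abs(x1 - x2):.2e}")
```

Additive increase leaves the gap x_2 - x_1 unchanged while each multiplicative decrease scales it by b, so the allocations drift toward the fairness line while oscillating around the efficiency line.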
Evolution of TCP
  1974  TCP described by Vint Cerf and Bob Kahn, in IEEE Trans. Comm.
  1975  Three-way handshake, Raymond Tomlinson, in SIGCOMM 75
  1981  TCP & IP: RFC 793 & 791
  1983  BSD Unix 4.2 supports TCP/IP
  1984  Nagle's algorithm to reduce the overhead of small packets; predicts congestion collapse
  1986  Congestion collapse observed
  1987  Karn's algorithm to better estimate round-trip time
  1988  Van Jacobson's algorithms: congestion avoidance and congestion control (most implemented in 4.3BSD Tahoe)
  1990  4.3BSD Reno: fast retransmit, delayed ACKs
TCP in the 1990s
  1993  TCP Vegas (Brakmo et al): real congestion avoidance
  1994  T/TCP (Braden): Transaction TCP
  1994  ECN (Floyd): Explicit Congestion Notification
  1996  SACK TCP (Floyd et al): Selective Acknowledgement
  1996  Hoe: improving TCP startup
  1996  FACK TCP (Mathis et al): extension to SACK