Transmission Control Protocol (TCP): A brief summary
TCP Basics
TCP (RFC 793) is a connection-oriented transport protocol.
TCP entities are present only at hosts (end-to-end) and retain the state of each open connection.
TCP needs special PDUs (segments) for opening and closing connections.
All PDU types are based on a single segment format.
TCP operates flow and error control algorithms like a data link protocol, but must also cope with:
  variable network delays
  possible reordering of segments
TCP has been augmented over time with a variety of enhancements and options.
  E.g. the congestion control algorithms slow start and congestion avoidance were introduced with the TCP Tahoe implementation in 1988 and later documented in RFC 2001.
TCP Segment Format
TCP entities exchange TPDUs called segments.
The header is at least five 32-bit words (20 bytes); options can extend it.
Each segment carries a sequence number.
  TCP sees a transmission as a stream of (data) bytes.
  Every byte in the stream has a number.
  The sequence number of a segment is the number of its first data byte.
  The acknowledgement number, if present, is the number of the next expected byte.
Header layout (each row is one 32-bit word):
  Source port number (16 bits) | Destination port number (16 bits)
  Sequence number (32 bits)
  Acknowledgement number (32 bits)
  Header length (4 bits) | Reserved (6 bits) | Flags (6 bits) | Window size (16 bits)
  Checksum (16 bits) | Urgent pointer (16 bits)
  Options (0 or more words)
  Data
Header length (4 bits) gives the number of 32-bit words in the header (including options).
The next 6-bit field is reserved.
There are six 1-bit flags: URG, ACK, PSH, RST, SYN, FIN, used e.g. for setting up connections.
Window size is the number of bytes the receiver will accept (sliding window flow control).
The checksum is computed over the TCP segment plus a pseudoheader.
If URG = 1, the urgent pointer is an offset from the sequence number that marks where the urgent data ends.
Options have many uses, e.g. negotiating the MSS.
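The field layout above can be decoded with a few lines of Python. This is a minimal sketch, assuming the classic RFC 793 layout described on this slide (6 reserved bits, 6 flags); the function name `parse_tcp_header` and the dictionary keys are illustrative choices, not part of any library.

```python
import struct

# Flag names ordered from the lowest bit (FIN = 0x01) to the highest (URG = 0x20).
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(segment: bytes) -> dict:
    """Decode a TCP segment into its header fields, options and data."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes (five 32-bit words)")

    # "!" = network byte order; H = 16-bit field, I = 32-bit field.
    (src_port, dst_port, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])

    header_words = offset_flags >> 12   # top 4 bits: header length in 32-bit words
    flag_bits = offset_flags & 0x3F     # low 6 bits: URG ACK PSH RST SYN FIN
    header_len = header_words * 4       # header length in bytes, options included
    if header_len < 20 or header_len > len(segment):
        raise ValueError("invalid header length field")

    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": [n for i, n in enumerate(FLAG_NAMES) if flag_bits & (1 << i)],
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
        "options": segment[20:header_len],
        "data": segment[header_len:],
    }


# Usage: a hand-built SYN+ACK segment with no options and two data bytes.
example = struct.pack("!HHIIHHHH", 12345, 80, 1000, 0,
                      (5 << 12) | 0x12, 4000, 0, 0) + b"hi"
header = parse_tcp_header(example)
```

The sketch does not verify the checksum, since that also needs the pseudoheader fields from the IP layer.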
Connection opening and closing
To establish a connection TCP uses a three-way handshake.
  The client sends an initial segment with SYN set and an initial sequence number (ISN).
  If the server has a process listening at the target port it will usually accept the connection.
  It sends back a SYN segment with ACK set, an acknowledgement number for the next expected byte (ISN+1) and an initial sequence number for the return stream (ISN_R).
  The originator replies with an ACK segment with sequence number ISN+1, acknowledging the server's ISN_R+1.
The client is said to perform an active open; the server a passive open.
Refusal of a connection uses the RST bit (e.g. when no process is listening at the destination port).
A connection is closed with a FIN segment (which is ACKed). This must be done for each direction.
A connection can be aborted with an RST segment at any time.
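The sequence and acknowledgement numbers exchanged in the handshake can be traced with a small model. This is a sketch only: the `Segment` class and `three_way_handshake` function are hypothetical names, and the ISN values are passed in rather than chosen the way a real stack would choose them.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_MOD = 2 ** 32  # sequence numbers are 32 bits wide and wrap around


@dataclass
class Segment:
    flags: frozenset
    seq: int
    ack: Optional[int] = None  # only meaningful when the ACK flag is set


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments of a successful connection opening."""
    # 1. Active open: client sends SYN carrying its ISN.
    syn = Segment(frozenset({"SYN"}), seq=client_isn)
    # 2. Passive open: server answers SYN+ACK with its own ISN_R and
    #    acknowledges ISN+1 (the SYN consumes one sequence number).
    syn_ack = Segment(frozenset({"SYN", "ACK"}), seq=server_isn,
                      ack=(syn.seq + 1) % SEQ_MOD)
    # 3. Client acknowledges ISN_R+1, using sequence number ISN+1.
    ack = Segment(frozenset({"ACK"}), seq=(client_isn + 1) % SEQ_MOD,
                  ack=(syn_ack.seq + 1) % SEQ_MOD)
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=100, server_isn=5000)
```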
TCP data transfer
The TCP entity decides how application data is split into segments (c.f. UDP, which lets the application decide).
  It uses the current MSS to decide.
During data transmission, each segment sent has to be acknowledged.
  An acknowledgement carries the number of the next byte expected.
  It implicitly ACKs all bytes prior to this (cumulative acknowledgement).
When a segment is sent, a retransmission timer is started.
A missing segment results in repeated (duplicate) ACKs for the last data received in order.
  Data is buffered while the receiver awaits the missing segment.
  When the missing data is received, the receiver ACKs all contiguous data it has buffered.
If the retransmission timeout (RTO) interval expires before a segment is acknowledged, the sender retransmits.
  If the retransmission fails, the sender backs off by doubling the RTO and trying again.
  The RTO is recomputed by measuring the round trip time (RTT) and using the new RTT value to modify the RTO.
There is no negative ACK. Errors cause a time-out.
  Reliance on timeouts can cause problems when more than one segment goes missing in a window.
  The SACK (Selective ACK) option addresses this.
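The receiver behaviour described above (cumulative ACKs, buffering of out-of-order data, duplicate ACKs while a segment is missing) can be sketched as follows. The `Receiver` class is hypothetical, and it assumes segments never partially overlap and that sequence numbers do not wrap, which a real implementation cannot assume.

```python
class Receiver:
    """Toy receiver that returns the cumulative ACK number for each arrival."""

    def __init__(self, initial_seq: int):
        self.next_expected = initial_seq  # number of the next byte expected
        self.buffered = {}                # out-of-order segments: seq -> data
        self.delivered = bytearray()      # in-order data handed to the application

    def receive(self, seq: int, data: bytes) -> int:
        # Keep anything at or beyond the next expected byte; older data is a
        # duplicate of something already delivered and is dropped.
        if seq >= self.next_expected:
            self.buffered[seq] = data
        # Deliver every segment that is now contiguous with the stream.
        while self.next_expected in self.buffered:
            chunk = self.buffered.pop(self.next_expected)
            self.delivered += chunk
            self.next_expected += len(chunk)
        # The ACK always names the next byte expected (cumulative ACK).
        return self.next_expected


rx = Receiver(initial_seq=1000)
ack1 = rx.receive(1000, b"a" * 500)    # in order
ack2 = rx.receive(2500, b"c" * 1000)   # segment 1500 is missing: duplicate ACK
ack3 = rx.receive(1500, b"b" * 1000)   # gap filled: ACK jumps past buffered data
```

Here `ack2` repeats `ack1` because the byte at 1500 is still missing, and `ack3` covers both the late segment and the one that was buffered.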
Send Windows
TCP uses credit-based flow control (variable size sliding window).
The 16-bit window size field indicates how many bytes a host is willing to accept. Note the limit of 64K.
The idea is to keep the data pipeline as full as possible without overwhelming the receiver.
A fast link or a large RTT requires a larger window (bandwidth-delay product).
  The extension proposed in RFC 1323 allows the window size to be optionally scaled by 2^n, with n up to 14.
The sender can only send up to min(swnd, cwnd) without an ACK, where swnd is the send window size and cwnd the congestion window size.
A busy receiver can send frequent small credits, fragmenting the data transfer: silly window syndrome.
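A short calculation shows why the 64K limit matters and how the scaling factor relates to the bandwidth-delay product. The function names and the example link (100 Mbit/s, 100 ms RTT) are assumptions chosen for illustration.

```python
MAX_UNSCALED_WINDOW = 65535  # largest value of the 16-bit window size field
MAX_SHIFT = 14               # largest scale exponent n allowed by RFC 1323


def bandwidth_delay_product(bits_per_second: int, rtt_ms: int) -> int:
    """Bytes that must be in flight to keep the pipe full."""
    return bits_per_second * rtt_ms // 8000  # /8 bits per byte, /1000 ms per s


def min_window_scale(window_bytes: int) -> int:
    """Smallest n such that a 16-bit field scaled by 2^n covers window_bytes."""
    for n in range(MAX_SHIFT + 1):
        if (MAX_UNSCALED_WINDOW << n) >= window_bytes:
            return n
    raise ValueError("window too large even with the maximum scale factor")


def usable_window(swnd: int, cwnd: int, bytes_in_flight: int) -> int:
    """Bytes the sender may still send: min(swnd, cwnd) minus unACKed data."""
    return max(0, min(swnd, cwnd) - bytes_in_flight)


bdp = bandwidth_delay_product(100_000_000, 100)  # 100 Mbit/s link, 100 ms RTT
shift = min_window_scale(bdp)
```

For this example link the pipe holds 1,250,000 bytes, far more than an unscaled window allows, so a scale exponent of 5 is needed.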
TCP transmission example
TCP entity A sends to TCP entity B. The initial credit rating of B is B's available buffer space.
  Initial state: Credit=4000
  A sends Seq=1000, Data len=500: Credit=3500 (credit reduced by 500 as 500 bytes have just been sent)
  A sends Seq=1500, Data len=1000: Credit=2500 (credit reduced by 1000 as 1000 bytes have just been sent)
  A sends Seq=2500, Data len=1000: Credit=1500
  B sends ACK=2500, Win=3500: Credit=2500
  A sends Seq=3500, Data len=1000
  B sends ACK=3500, Win=2500
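The credit values in this example follow from one rule: an ACK with window Win lets the sender transmit up to byte ACK + Win - 1, and the credit is whatever part of that range has not been sent yet. The sketch below replays the example; `CreditSender` is a hypothetical class, and the position of the last ACK relative to the last data segment is an assumption, since the original figure does not fix it.

```python
class CreditSender:
    """Tracks the remaining credit of a sender under sliding window flow control."""

    def __init__(self, first_seq: int, initial_credit: int):
        self.next_seq = first_seq                     # next byte to be sent
        self.right_edge = first_seq + initial_credit  # first byte NOT permitted

    @property
    def credit(self) -> int:
        return self.right_edge - self.next_seq

    def send(self, length: int) -> int:
        """Send `length` bytes; returns the segment's sequence number."""
        if length > self.credit:
            raise ValueError("not enough credit to send this segment")
        seq = self.next_seq
        self.next_seq += length
        return seq

    def on_ack(self, ack: int, win: int) -> None:
        # The receiver accepts bytes ack .. ack + win - 1. Assumption: the
        # right edge never moves backwards (the window is never shrunk).
        self.right_edge = max(self.right_edge, ack + win)


a = CreditSender(first_seq=1000, initial_credit=4000)
trace = [a.credit]                                    # 4000
for length in (500, 1000, 1000):
    a.send(length)
    trace.append(a.credit)                            # 3500, 2500, 1500
a.on_ack(ack=2500, win=3500)
trace.append(a.credit)                                # 2500
last_seq = a.send(1000)                               # Seq=3500
a.on_ack(ack=3500, win=2500)
```

After ACK=2500, Win=3500 the receiver accepts bytes up to 5999; A has already sent up to byte 3499, which leaves the credit of 2500 shown in the example.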
Notes
The receiver need not ACK all segments.
  With delayed acknowledgements (RFC 1122), only every second segment is ACKed unless a preset delay passes without a further arrival (the RFC requires the delay to be less than 500 ms).
When small amounts of data are sent regularly by the sender (e.g. interactive applications like Telnet), it sometimes makes sense to batch the data into a single segment.
  In the Nagle algorithm (RFC 896), small amounts of new data are held back while earlier data is still unacknowledged and sent when the ACK for the last segment arrives.
  Nagle and delayed ACK can interact badly. Why?
The PUSH flag is used to tell the receiver to deliver all outstanding data to the receiving process.
  Many TCP implementations deliver data as soon as possible anyway.
Urgent mode lets the receiver know that there is urgent data in a segment.
  The URG bit is set and the urgent pointer points to the last byte of urgent data.
  This is used to transfer interrupt and break commands.
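The sending rule of the Nagle algorithm can be written as a single decision. This is a simplified sketch: the function name `nagle_can_send` is hypothetical, and it ignores the receiver's window and any timers that a real implementation would also consult.

```python
def nagle_can_send(pending_bytes: int, mss: int, unacked_bytes: int) -> bool:
    """Decide whether buffered application data may be sent now."""
    if pending_bytes == 0:
        return False               # nothing to send
    if pending_bytes >= mss:
        return True                # a full-sized segment is always worth sending
    # A small segment goes out only when no earlier data is awaiting an ACK;
    # otherwise it is held back and batched with later writes.
    return unacked_bytes == 0


first_keystroke = nagle_can_send(pending_bytes=1, mss=1460, unacked_bytes=0)
second_keystroke = nagle_can_send(pending_bytes=1, mss=1460, unacked_bytes=1)
full_segment = nagle_can_send(pending_bytes=1460, mss=1460, unacked_bytes=1)
```

In the Telnet case the first keystroke is sent at once, while later keystrokes accumulate until the ACK for the first one arrives.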