A new technical standard called “RFC 9293” was published on 18 August 2022. It specifies the TCP (Transmission Control Protocol), through which the majority of data flows on the network pass. The previous version of this standard dated from… 1981. What is TCP, why is it so important and what is this new version of the standard for?
Yes, it’s a protocol more than forty years old, and it handles the majority of the data sent over the Internet. To understand its role, we must first place it among all the other protocols, the rules that applications must follow in order to communicate. Simplifying grossly, we can say that communication on the Internet needs four things: a physical medium (for example optical fibre), a mechanism allowing the packets of data to go from one router to another until they reach their destination, a means of correcting the imperfections of the network, and rules specific to the application used. TCP deals with the third point: it corrects the imperfections of the network. The mechanism allowing the packets of data to go from one router to another is called IP, the Internet Protocol. IP packets, the groups of bytes circulating on the Internet, do not necessarily reach their destination: for example, if a router has to send packets over a slow line and its memory, where the packets are waiting, becomes full, the router drops the packets. This is not an anomaly; it is the normal functioning of the Internet. But obviously the user does not want to receive a file with bits missing! So the role of TCP is to make sure the application receives all the data.
It is what is known as a transport protocol, and TCP is the one most used on the Internet. The other two most widely used are UDP (User Datagram Protocol), which does not attempt to ensure the reliability of the transport, since certain applications prefer to do this themselves, and QUIC, which is more recent and has taken some “market share” from TCP but remains a minority choice. TCP is so crucial to the proper functioning of the Internet that the set of protocols allowing the network to function is often referred to collectively as “TCP/IP”, from the names of the two most important protocols.
How TCP saves data
IP is fast because it does not need the two machines that are communicating, or the intermediate routers, to remember what happens. It sends the packet to the next step and forgets about it. But this simplicity and speed come at the expense of reliability. For example, a glitch might appear at the wrong time and destroy a packet. Or a router might be overloaded and drop some packets. TCP, unlike IP, remembers what happens. Each byte sent has a sequence number; the sender keeps a copy of the packets sent; the receiver sends acknowledgements of receipt (“I’ve received bytes up to number 3978”)¹ and if, after a given time, the acknowledgement of receipt has not come through, TCP re-sends (“I have sent up to 4025, the receiver has received only up to byte 3978, I’m going to re-send bytes 3979 to 4025”). TCP works only in the sending and receiving computers, not in the intermediate routers, which know only IP².
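The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class `Sender`, its methods and the fixed one-second timeout are hypothetical names and values chosen for this example, and the acknowledgement carries the number of the last byte received, as in the simplified description above, rather than what real TCP puts on the wire.

```python
def bytes_to_resend(last_sent: int, last_acked: int) -> range:
    """Byte numbers sent but not yet acknowledged."""
    return range(last_acked + 1, last_sent + 1)


class Sender:
    """Toy sender that tracks sequence numbers (not a real TCP stack)."""

    def __init__(self, timeout: float = 1.0):
        self.timeout = timeout   # assumed fixed; real TCP adapts it
        self.last_sent = 0       # highest byte number sent so far
        self.last_acked = 0      # highest byte number acknowledged
        self.timer_start = None  # None means nothing is awaiting an ACK

    def send(self, nbytes: int, now: float) -> None:
        if self.last_sent == self.last_acked:
            self.timer_start = now  # first unacknowledged data: start timer
        self.last_sent += nbytes

    def on_ack(self, acked_up_to: int, now: float) -> None:
        self.last_acked = max(self.last_acked, acked_up_to)
        # Restart the timer if data is still outstanding, else stop it.
        self.timer_start = now if self.last_acked < self.last_sent else None

    def on_tick(self, now: float) -> range:
        """Return the bytes to re-send if the timer has expired."""
        if self.timer_start is not None and now - self.timer_start >= self.timeout:
            self.timer_start = now
            return bytes_to_resend(self.last_sent, self.last_acked)
        return range(0)


sender = Sender(timeout=1.0)
sender.send(4025, now=0.0)        # bytes 1 to 4025 go out
sender.on_ack(3978, now=0.2)      # "I've received bytes up to 3978"
resend = sender.on_tick(now=1.3)  # timer expired: bytes 3979 to 4025
```

With the numbers used in the text, `resend` covers bytes 3979 to 4025, which is exactly what the sender decides to send again.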
But it’s not a matter of blindly re-sending! Nor, for that matter, of sending as many bytes as possible without knowing whether they’ll get through. The Internet is a shared network, and it’s important not to contribute to its congestion. So TCP has another role: constantly making sure it doesn’t send more bytes than the network (and the application on the destination machine) can handle. Algorithms with names like BIC, CUBIC and NewReno perform quite complex calculations based on the available data to limit the number of bytes sent. It’s a difficult task, since the aim is to preserve the network, which is a common good, while at the same time maximising performance.
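To give an idea of the principle, here is a deliberately naive sketch of "additive increase, multiplicative decrease", the general idea behind several congestion control algorithms. It is not BIC, CUBIC or NewReno, which are far more elaborate; the function names, the unit (whole packets) and the constants are assumptions made for this illustration.

```python
def update_cwnd(cwnd: float, loss_detected: bool) -> float:
    """One round trip of a toy AIMD rule.

    cwnd is the congestion window: how many packets the sender allows
    itself to have in flight, as far as the network is concerned.
    """
    if loss_detected:
        return max(1.0, cwnd / 2)  # back off sharply when a loss is seen
    return cwnd + 1.0              # otherwise probe gently for more capacity


def allowed_in_flight(cwnd: float, receiver_window: float) -> float:
    """Never exceed what the network OR the receiver can handle."""
    return min(cwnd, receiver_window)


# Four quiet round trips, one loss, then two more quiet round trips.
cwnd = 1.0
history = []
for loss in [False, False, False, False, True, False, False]:
    cwnd = update_cwnd(cwnd, loss)
    history.append(cwnd)
# history is now [2.0, 3.0, 4.0, 5.0, 2.5, 3.5, 4.5]
```

The window grows slowly while everything goes well and is halved as soon as a loss suggests congestion, which is how a sender can share the network fairly while still trying to use the available capacity.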
TCP doesn’t only repair packet losses; it also fixes other network problems, such as IP packets not always arriving in the right order: the data has to be put back in sequence before being passed to the application.
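Putting data back in order can be sketched as a small buffer that holds early arrivals until the gap before them is filled. The class `Reassembler` is a hypothetical name for this illustration, and the sketch assumes that segments never partially overlap, which a real implementation cannot assume.

```python
class Reassembler:
    """Toy receive buffer that delivers bytes to the application in order."""

    def __init__(self, first_byte: int = 0):
        self.next_expected = first_byte  # number of the next byte to deliver
        self.pending = {}                # start byte number -> segment data

    def receive(self, start: int, data: bytes) -> bytes:
        """Store a segment; return whatever can now be delivered in order."""
        if start >= self.next_expected:
            self.pending.setdefault(start, data)
        # Segments starting before next_expected are treated as duplicates
        # and ignored (assumption: no partial overlaps).
        delivered = bytearray()
        while self.next_expected in self.pending:
            chunk = self.pending.pop(self.next_expected)
            delivered += chunk
            self.next_expected += len(chunk)
        return bytes(delivered)


buffer = Reassembler()
early = buffer.receive(5, b"world")  # arrives first, but bytes 0 to 4 are missing
ready = buffer.receive(0, b"hello")  # fills the gap, so everything is released
```

Here `early` is empty because nothing can be delivered yet, and `ready` contains `b"helloworld"`: the application never sees that the packets arrived in the wrong order.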
Figure 1: View of a full TCP connection by the Wireshark analyzer. The first three packets open the connection (“three-way handshake”) and “ACK” is an acknowledgment of receipt.
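The three opening packets can be modelled as follows. This is a sketch of the numbering only, not of a real network exchange: the function name and the dictionary layout are assumptions made for this illustration. It relies on two properties of TCP: the SYN flag consumes one sequence number, and sequence numbers are 32-bit values that wrap around.

```python
import random

SEQ_SPACE = 2**32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three packets that open a connection, as plain dicts."""
    return [
        # 1. The client proposes its initial sequence number (ISN).
        {"from": "client", "flags": "SYN", "seq": client_isn, "ack": None},
        # 2. The server acknowledges it and proposes its own ISN.
        {"from": "server", "flags": "SYN,ACK", "seq": server_isn,
         "ack": (client_isn + 1) % SEQ_SPACE},
        # 3. The client acknowledges the server's ISN: connection open.
        {"from": "client", "flags": "ACK",
         "seq": (client_isn + 1) % SEQ_SPACE,
         "ack": (server_isn + 1) % SEQ_SPACE},
    ]


# Each side picks its ISN at random, so the output differs on every run.
for packet in three_way_handshake(random.randrange(SEQ_SPACE),
                                  random.randrange(SEQ_SPACE)):
    print(packet)
```

Each acknowledgement number is the other side's sequence number plus one, which is the pattern to look for in the first three lines of a capture such as the one in Figure 1.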
The old RFC
Technical Internet standards are described in documents called RFCs³. The first RFC describing TCP was numbered 761⁴ and was published in 1980, when the Internet was much smaller than it is today, although already very important for its users. RFC 761 was quickly replaced by RFC 793, which has been with us for more than forty years. That makes TCP a very old protocol in Internet terms.
That said, TCP has not remained frozen during these forty years. The algorithms for combating congestion have been constantly perfected⁵, extensions have been added to TCP to improve its performance in certain respects, and of course the implementations of TCP have much improved. However, all these additions made it difficult, for example, to program a TCP implementation by reference to the RFCs. So sooner or later it was going to be necessary to update the old standard.
The new RFC
The body that produces the technical standards for the Internet is the IETF (Internet Engineering Task Force). Work on updating TCP started in 2013, but did not get under way officially at the IETF until 2015. So we see that this updating has been a long and difficult task. The IETF works essentially on the basis of consensus, and there were many details to be settled. And as you can imagine, there was some reluctance to interfere with what is one of the most critical Internet protocols.
Figure 2: The evolution of the future RFC, and its 28 versions
The set of requirements did not of course call for far-reaching changes to TCP. In particular, it was essential for the programs implementing TCP following RFC 793 to be able to interoperate without problems with those following the new RFC. So there is no “TCP version 2”; TCP will continue to operate as before; the new RFC is above all a new wording and the correction of numerous problems that come to light only in extreme cases.
TCP is indisputably a huge success. For example, you probably have in your pocket right now a device with several TCP connections in progress with different machines on the Internet. Despite the relatively rapid take-off of QUIC, TCP remains the main Internet transport protocol. The installed base is so huge, from the smallest connected object to the biggest supercomputers, and tested, optimised TCP implementations are so widely available, that TCP will probably remain a crucial protocol for some time to come. Another forty years?
1 – Needless to say, this article is very simplified. In reality, we don’t send the sequence number of the byte but a value which is the sum of this sequence number and a randomly chosen initial sequence number (ISN). Also, there are many TCP optimisations that modify the very simple algorithm presented here to a greater or lesser extent. And the acknowledgement of receipt in fact contains the number of the next byte expected, not that of the last one received.
2 – Here too, the way the Internet has evolved has made things more complicated. The router nearest to you, your operator’s box, performs more complex operations and unfortunately does not confine itself to just passing on the IP packets as fast as possible.
3 – The initials originally had a meaning, but it has changed, so we just use the acronym. The same applies to QUIC, by the way.
4 – Before that, IP and TCP were merged in a single protocol. The years 1980-1981 saw the separation of these two protocols.
5 – It is a very active area of research with numerous scientific articles published.