Reorganisation of the technical standards for HTTP

06/07/2022

You will certainly be familiar with the HTTP protocol (Hypertext Transfer Protocol), since together with the HTML (Hypertext Markup Language) format and URLs (Uniform Resource Locators, such as https://www.service-public.fr/particuliers/vosdroits/F11601), it is one of the three pillars of the Web. HTTP is all the better known because its name¹ appears at the beginning of the URL and is therefore highly visible. It is probably the only Internet communication protocol whose name is known to the general public. This protocol is the subject of very active technical standardisation, and the documents that officially define HTTP have recently been thoroughly reorganised, largely in order to better reflect the coexistence on the Internet of three main versions of HTTP.

HTTP, what’s it for?

HTTP is a protocol. A protocol is a set of rules that applications must rigorously follow to ensure interoperability, in other words to make sure they can communicate, even if they have been developed by different organisations. The success of the Internet relies on this notion of interoperability, which ensures independence from any particular organisation. These rules, which form the protocol, are generally set out in a special document called a specification. Ideally², this specification will be drawn up and published by an open standardisation organisation, that is to say one that publishes and distributes the texts of standards free of charge but also opens the process of standardisation to all. In the case of HTTP, this standardisation organisation is the IETF (Internet Engineering Task Force) and the standards it publishes are referred to as RFCs³. Each RFC is identified by a number.

These technical standards state, for example, that HTTP is a client-server protocol, where the server waits for requests and the client connects to the server, sends a request and reads a response. The standard specifies the precise format of this request and of the response, so that any client can talk to any server. It is this standard that establishes, for example, that the response will carry, right at its start, a three-digit status code allowing the client application to determine unambiguously whether the request has been properly handled. (You will be familiar with at least one of these codes, 404, which indicates to the client that the server has not been able to execute the request because it referred to content that does not exist.)
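
To make this exchange concrete, here is a minimal sketch in Python of the textual HTTP/1.1 format: it builds a request and decodes the first line of a response, where the three-digit code follows the protocol version. The function names (build_request, parse_status_line, status_class) are purely illustrative and belong to no library. The sketch ignores most of what a real client must handle.

```python
from typing import Tuple


def build_request(method: str, target: str, host: str) -> bytes:
    """Serialise a minimal HTTP/1.1 request: request line, two fields, blank line."""
    lines = [
        f"{method} {target} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # the empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(line: str) -> Tuple[str, int, str]:
    """Split a status line such as 'HTTP/1.1 404 Not Found' into its parts."""
    version, code, *rest = line.strip().split(" ", 2)
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"malformed status code: {code!r}")
    return version, int(code), rest[0] if rest else ""


def status_class(code: int) -> str:
    """The first digit of the code gives the general class of the response."""
    classes = {
        1: "informational",
        2: "successful",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")


if __name__ == "__main__":
    request = build_request(
        "GET", "/particuliers/vosdroits/F11601", "www.service-public.fr"
    )
    print(request.decode("ascii"))
    version, code, reason = parse_status_line("HTTP/1.1 404 Not Found")
    print(version, code, reason, "->", status_class(code))
```

The client only needs the first digit of the code to know the general outcome, which is why an unknown code such as 499 can still be treated as a client error.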

History and versions

HTTP has a long history and has seen many changes. The very first version of HTTP, developed by Tim Berners-Lee, was not formally standardised and was extremely simple compared with today’s HTTP. Dating from around 1989-1990, this version was later given the name version 0, but at the time it was the only version in existence and very few people foresaw how immensely successful HTTP would become.

The first version to be formally specified was version 1, or more precisely 1.0, in 1996⁴. It was described in RFC 1945, which did not have the status of an IETF standard and was regarded as a simple description of existing practice. This new version introduced numerous new features and the protocol bore only a distant resemblance to the original. There were now several methods (types of requests, allowing a remote resource to be retrieved but also modified), a mechanism for attaching metadata to the request and to the response, and so on.

In parallel, work was being done on the future version 1.1, which was standardised in 1997 in RFC 2068. This version is still widely used today. For example, Google’s search engine crawler uses only this version.

Since then, not only has the standard for version 1.1 been updated several times, but two other versions of HTTP have been developed. Version 2 was standardised in 2015 in RFC 7540. The two important new features were the possibility of retrieving resources in parallel (for example, the images of a web page and its style sheet) and the transition from a textual representation of the request and the response to a binary representation, more difficult for humans to read but more practical and efficient for programs. Version 3, meanwhile, has just been standardised in 2022, with RFC 9114. The big change is the abandonment of TCP (Transmission Control Protocol) in favour of the QUIC transport protocol, which improves performance through multiplexing with lower latency.
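
To give an idea of what this binary representation looks like, here is a small sketch in Python that encodes and decodes the fixed 9-octet header that precedes every HTTP/2 frame: a 24-bit length, an 8-bit type, 8 bits of flags, then one reserved bit and a 31-bit stream identifier. It is this stream identifier that lets several exchanges share one connection in parallel. The layout comes from the HTTP/2 specification; the function names are illustrative only, and a real implementation has far more to do (settings, flow control, header compression).

```python
import struct
from typing import Tuple

FRAME_HEADER_LEN = 9   # octets, the same for every HTTP/2 frame
TYPE_HEADERS = 0x1     # frame type carrying the request or response fields
FLAG_END_HEADERS = 0x4


def pack_frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    """Encode the 9-octet frame header in network byte order."""
    if not 0 <= length < 2**24:
        raise ValueError("length must fit in 24 bits")
    if not 0 <= stream_id < 2**31:
        raise ValueError("stream identifier must fit in 31 bits")
    # struct has no 24-bit type: pack 32 bits and drop the leading octet.
    length_octets = struct.pack("!I", length)[1:]
    return length_octets + struct.pack("!BBI", frame_type, flags, stream_id)


def unpack_frame_header(data: bytes) -> Tuple[int, int, int, int]:
    """Decode a frame header into (length, type, flags, stream identifier)."""
    if len(data) < FRAME_HEADER_LEN:
        raise ValueError("need at least 9 octets")
    length = int.from_bytes(data[0:3], "big")
    frame_type, flags, raw_stream = struct.unpack("!BBI", data[3:9])
    stream_id = raw_stream & 0x7FFFFFFF  # ignore the reserved high bit
    return length, frame_type, flags, stream_id


if __name__ == "__main__":
    header = pack_frame_header(16, TYPE_HEADERS, FLAG_END_HEADERS, 1)
    print(header.hex())
    print(unpack_frame_header(header))
```

Nothing here is readable with the naked eye, unlike an HTTP/1 request, but a program can locate every field at a fixed offset without parsing any text.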

Just as version 2 did not replace version 1 (far from it), so version 3 does not aim to completely replace version 2. Not only is it difficult to replace a version that is widely deployed and used⁵, and on which many Internet users rely, but in addition each version has its specific advantages that are useful in certain cases. The disappearance in the short term of HTTP/1 or HTTP/2 is therefore not something that is envisaged.

The new standards

The old technical standards for HTTP had to be adapted to this long-term coexistence of the three versions. It was therefore decided to separate the standardisation of the basic characteristics of HTTP, which are independent of the version, and the standardisation of each version with its specific features. The new standards consist of:

RFC 9110, which describes the semantics and the general concepts of HTTP. The protocol has gained weight since its beginnings and this RFC contains no fewer than 250 pages.
RFC 9112 standardises HTTP/1 (replacing the former RFC 7230).
RFC 9113 standardises HTTP/2 (replacing the former RFC 7540).
RFC 9114 standardises HTTP/3 (first standardisation of this version).

Other RFCs complete the set, such as RFC 9111 on the caching of results (an important means of improving HTTP performance) and RFC 9204 on a mechanism for compressing metadata used by HTTP/3.

This is solely a reorganisation of the documents: HTTP itself remains unchanged, apart from a few small details and error corrections. Thus the applications that “spoke” HTTP/1 or HTTP/2 will generally not need to be modified.

HTTP is frequently described as a very simple protocol. It is often used for teaching, both for this simplicity and because it allows students to work with real examples⁶. The size of some RFCs can be misleading, since they often consist of long lists of elements to be handled, which do not really add to the complexity of the protocol. However, writing a complete HTTP client or server is no simple matter and requires attentive reading of at least RFC 9110 and one of the version-specific RFCs.

Note that the coexistence of the three versions, which will probably be prolonged, poses a problem for the HTTP client. How can it know which version or versions the server accepts? The safest solution is to connect using HTTP/1; the server can then indicate (for example through the Alt-Svc: header field) which other versions it handles. But there are also more efficient methods, such as the future SVCB/HTTPS record in the DNS.
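
As an illustration of this discovery mechanism, here is a simplified sketch in Python that reads the value of an Alt-Svc: field and lists the alternatives announced by the server. The names AltService and parse_alt_svc are hypothetical, not part of any library, and the parser makes a simplifying assumption: it splits on commas and semicolons without handling the case where these characters appear inside a quoted string. The 24-hour default lifetime when the ma parameter is absent follows the Alt-Svc specification.

```python
from typing import List, NamedTuple

DEFAULT_MAX_AGE = 86400  # seconds (24 hours), used when "ma" is absent


class AltService(NamedTuple):
    protocol: str   # protocol identifier, e.g. "h3" for HTTP/3, "h2" for HTTP/2
    authority: str  # "host:port"; an empty host means the same host
    max_age: int    # how long the client may remember this alternative


def parse_alt_svc(value: str) -> List[AltService]:
    """Parse an Alt-Svc field value (simplified: no commas inside quotes)."""
    value = value.strip()
    if not value or value == "clear":
        return []  # "clear" asks the client to forget known alternatives
    services = []
    for entry in value.split(","):
        parts = [part.strip() for part in entry.split(";")]
        if "=" not in parts[0]:
            continue  # skip anything that is not protocol="authority"
        protocol, _, authority = parts[0].partition("=")
        max_age = DEFAULT_MAX_AGE
        for param in parts[1:]:
            name, _, raw = param.partition("=")
            if name.strip().lower() == "ma" and raw.strip().isdigit():
                max_age = int(raw.strip())
        services.append(
            AltService(protocol.strip(), authority.strip().strip('"'), max_age)
        )
    return services


if __name__ == "__main__":
    for service in parse_alt_svc('h3=":443"; ma=3600, h2=":443"'):
        print(service)
```

A client that finds h3 in this list may then try HTTP/3 on the announced port, and fall back to the version it already used if that attempt fails.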

Conclusion

In any case, programmers, students, or anyone else seriously wishing to learn about HTTP will now have a simpler task thanks to this improved organisation of the standards. The Internet is constantly evolving and the standards are not always updated in good time. So congratulations to the many people who have worked to clean up and tidy the pile of documents.

¹ Or sometimes its variant, HTTPS, for HTTP Secure.

² If you are curious about “Internet governance”, note however that going through such an organisation is not obligatory on the Internet, which is a permissionless network.

³ Meaning “Request For Comments”, but the name, retained for historical reasons, is very misleading.

⁴ This is the date of the RFC; version 1 had already been running for a long time on most HTTP clients and servers.

⁵ Remember how long it took for IPv6 to replace IPv4?

⁶ Especially for HTTP/1.