HTTP的发展

HTTPWWW的基础协议。Tim Berners-Lee在1989-1991年创建了它,HTTP已经发生了许多变化,保持了大部分的简单性,并进一步塑造了其灵活性。HTTP已经从一个早期的协议逐步进化成在实验环境下交换文件的协议,再进化到携带图片,高分辨率视频和3D的现代复杂互联网的协议。

万维网的发明

1989年, 当Tim Berners-Lee 在 CERN工作时写了一份关于建立一个互联网上的超文本系统的报告。这个系统起初被称为 Mesh,随后在1990年实施期间被更名为万维网(World Wide Web)。它在现有的TCP和IP协议上建立,由四个部分组成:

  • 一个用来表示超文本文档的文本格式,超文本标记语言 (HTML)。
  • 一个用来交换超文本文档的简单协议,超文本传输协议(HTTP)。
  • 一个显示(偶然地编辑)超文本文档的客户端,即第一个网络浏览器,被称为 WorldWideWeb。
  • 一个服务器用以提供可访问的文档,是httpd的一个早期版本。

这四个部分完成于1990年底,且第一批服务器已经在1991年初在CERN以外的地方运行了。 1991年8月16日,Tim Berners-Lee 在公开的超文本新闻组发表的文章现在被官方视为是万维网作为公共项目的开始。

HTTP协议在应用的早期阶段非常简单,后来被称为HTTP/0.9,有时候被视为单行(one-line)协议.

HTTP/0.9 – 单行协议

最初版本的HTTPi协议并没有版本号; 它的版本号被定位0.9以区分后来的版本。 HTTP/0.9 极其简单: 一个请求由一个单行的指令构成,唯一的指令GET开头,后面跟一个资源的路径(一旦连接到服务器,协议、服务器、端口这些都不是必须的)。

GET /mypage.html

响应也是极其简单的: 只是包含文档本身。

<HTML>
这是一个非常简单的HTML页面
</HTML>

跟后来的版本不一样,响应内容并没有HTTP头,这意味着只有HTML文件可以传送,没法传送其他类型的文件。没有状态或者错误代码:一旦出了问题,一个特定的包含问题描述的HTML文件将被发送,供人们查看。

HTTP/1.0 – 构建可扩展性

HTTP/0.9应用非常局限,浏览器和服务器迅速扩展使其用途更广:

  • 版本信息现在会随着每个请求发送(HTTP1.0 被追加到GET行)
  • 状态代码行也会在响应开始时发送,允许浏览器本身了解请求的成功或失败,并相应地调整其行为(如以特定方式更新或使用本地缓存)
  • 引入了HTTP头的概念,无论是对于请求还是响应,允许传输元数据,并使协议非常灵活和可扩展。
  • 在新的HTTP头的帮助下,增加了传输除纯文本HTML文件外的其他类型文档的能力。(感谢Content-Type头)。

一个典型的请求看来就像这样:

GET /mypage.html HTTP/1.0
User-Agent: NCSA_Mosaic/2.0 (Windows 3.1)
200 OK
Date: Tue, 15 Nov 1994 08:12:31 GMT
Server: CERN/3.0 libwww/2.17
Content-Type: text/html
<HTML>
一个包含图片的页面
  <IMG SRC="/myimage.gif">
</HTML>

接下来就是第二个连接,请求获取图片:

GET /myimage.gif HTTP/1.0
User-Agent: NCSA_Mosaic/2.0 (Windows 3.1)
200 OK
Date: Tue, 15 Nov 1994 08:12:32 GMT
Server: CERN/3.0 libwww/2.17
Content-Type: text/gif
(这里是图片内容)

在1991-1995年,这些新扩展并没有被引入标准中,促进协助工作,而仅仅作为一种尝试:服务器和浏览器添加这些新扩展功能,但出现了大量的互操作问题.直到1996年11月,为了解决这些问题,发表了一份新文档(RFC 1945),用来描述怎么操作实践这些新扩展功能.这份文档 RFC 1945 定义了 HTTP/1.0,但它是侠义的,并不是官方标准.

HTTP/1.1 – 标准化的协议

HTTP/1.0的多种不同的实现运用起来有些混乱,自1995年开始即HTTP/1.0文档发布的下一年,HTTP的第一个标准化版本正在修订中。HTTP1.1 在1997年初发布,就在HTTP/1.0发布后的几个月后。

HTTP/1.1 消除了歧义并引入了许多改进:

  • 连接可以重复使用,节省了多次打开它的时间,以显示嵌入到单个原始文档中的资源。
  • 增加流水线操作,允许在第一个应答被完全发送之前发送第二个请求,以降低通信的延迟。
  • 支持响应分块。
  • 引入额外的缓存控制机制。
  • 引入内容协商,包括语言,编码,或类型,并允许客户端和服务器约定以最适当的内容进行交换。
  • 感谢Host头,能够使不同的域名配置在同一个IP地址的服务器。

一个典型的请求流程, 所有请求都通过一个连接实现,看起来就像这样:

GET /en-US/docs/Glossary/Simple_header HTTP/1.1
Host: developer.mozilla.org
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:50.0) Gecko/20100101 Firefox/50.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://developer.mozilla.org/en-US/docs/Glossary/Simple_header
200 OK
Connection: Keep-Alive
Content-Encoding: gzip
Content-Type: text/html; charset=utf-8
Date: Wed, 20 Jul 2016 10:55:30 GMT
Etag: "547fa7e369ef56031dd3bff2ace9fc0832eb251a"
Keep-Alive: timeout=5, max=1000
Last-Modified: Tue, 19 Jul 2016 00:59:33 GMT
Server: Apache
Transfer-Encoding: chunked
Vary: Cookie, Accept-Encoding
(content)
GET /static/img/header-background.png HTTP/1.1
Host: developer.cdn.mozilla.net
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:50.0) Gecko/20100101 Firefox/50.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://developer.mozilla.org/en-US/docs/Glossary/Simple_header
200 OK
Age: 9578461
Cache-Control: public, max-age=315360000
Connection: keep-alive
Content-Length: 3077
Content-Type: image/png
Date: Thu, 31 Mar 2016 13:34:46 GMT
Last-Modified: Wed, 21 Oct 2015 18:27:50 GMT
Server: Apache
(image content of 3077 bytes)

HTTP/1.1 作为 RFC 2068 发布于1997年1月.

超过15年的扩展

由于其可扩展性 – 创建新的头和方法是很容易的 – 即使HTTP/1.1协议进行了两次修订,RFC 2616 发布于 1999年6月,而另外两个系列 RFC 7230-RFC 7235 发布于2014年6月作为HTTP/2的预览版本, 这个协议已经稳定了超过15年了。

HTTP 用于安全传输

HTTP最大的变化发生在1994年底。HTTP在基本的TCP/IP协议栈上发送信息,Netscape Communication 在此基础上创建了一个额外的加密传输层:SSL。SSL 1.0从未在公司以外发布过,但SSL 2.0及其后继者SSL 3.0和SSL 3.1允许通过加密来保证服务器和客户端之间交换消息的真实性来创建电子商务网站。SSL在标准化道路上最终成为TLS,随着版本1.0, 1.1, 1.2的出现成功地关闭漏洞。TLS 1.3 目前正在形成。

During the same time, the need for an encrypted transport layer raised: the Web left the relative trustiness of a mostly academic network, to a jungle where advertisers, random individuals or criminals compete to get as much private information about people, try to impersonate them or even to replace data transmitted by altered ones. 随着通过HTTP构建的应用程序变得越来越强大,可以访问越来越多的私人信息,如地址簿,电子邮件或用户的地理位置,即使在电子商务使用之外,对TLS的需求也变得普遍。

HTTP 用于复杂应用

Tim Berners-Lee 对于 Web 的最初设想不是一个只读媒体。 他设想一个 Web 是可以远程添加或移动文档,是一种分布式文件系统。 大约 1996 年,HTTP 被扩展到允许创作,并且创建了一个名为 WebDAV 的标准。 它进一步扩展了某些特定的应用程序,如 CardDAV 用来处理地址簿条目,CalDAV 用来处理日历。 但所有这些 *DAV 扩展有一个缺陷:它们必须由要使用的服务器来实现,这是非常复杂的。并且他们在网络领域的使用必须保密。

在 2000 年,一种新的使用 HTTP 的模式被设计出来:representational state transfer (或者说 REST)。 由 API 发起的操作不再通过新的 HTTP 方法传达,而只能通过使用基本的 HTTP / 1.1 方法访问特定的 URI。 这允许任何 Web 应用程序通过提供 API 以允许查看和修改其数据,而无需更新浏览器或服务器:all what is needed was embedded in the files served by the Web sites through standard HTTP/1.1。  REST 模型的缺点在于每个网站都定义了自己的非标准 RESTful API,并对其进行了全面的控制;不同于 *DAV 扩展,客户端和服务器是可互操作的。 RESTful API 在 2010 年变得非常流行。

自 2005 年以来,可用于 Web 页面的 API 大大增加,其中几个 API 为特定目的扩展了 HTTP 协议,大部分是新的特定 HTTP 头:

  • Server-sent events,服务器可以偶尔推送消息到浏览器。
  • WebSocket,一个新协议,可以通过升级现有 HTTP 协议来建立。

Relaxing the security-model of the Web

HTTP is independent of the security model of the Web, the same-origin policy. In fact, the current Web security model has been developed after the creation of HTTP! Over the years, it has proved useful to be able to be more lenient, by allowing under certain constraints to lift some of the restriction of this policy. How much and when such restrictions are lifted is transmitted by the server to the client using a new bunch of HTTP headers. These are defined in specifications like Cross-Origin Resource Sharing (CORS) or the Content Security Policy (CSP).

In addition to these large extensions, numerous other headers have been added, sometimes experimentally only. Notable headers are Do Not Track (DNT) header to control privacy, X-Frame-Options, or Upgrade-Insecure-Request but many more exist.

HTTP/2 – A protocol for greater performance

Over the years, Web pages have become much more complex, even becoming applications in their own right. The amount of visual media displayed, the volume and size of  scripts adding interactivity, has also increased: much more data is transmitted over significantly more HTTP requests. HTTP/1.1 connections need requests sent in the correct order. Theoretically, several parallel connections could be used (typically between 5 and 8), bringing considerable overhead and complexity. For example, HTTP pipelining has emerged as a resource burden in Web development.

In the first half of the 2010s, Google demonstrated an alternative way of exchanging data between client and server, by implementing an experimental protocol SPDY. This amassed interest from developers working on both browsers and servers. Defining an increase in responsiveness, and solving the problem of duplication of data transmitted, SPDY served as the foundations of the HTTP/2 protocol.

The HTTP/2 protocol has several prime differences from the HTTP/1.1 version:

  • It is a binary protocol rather than text. It can no longer be read and created manually despite this hurdle, improved optimization techniques can now be implemented.
  • It is a multiplexed protocol. Parallel requests can be handled over the same connection, removing the order and blocking constraints of the HTTP/1.x protocol.
  • It compresses headers. As these are often similar among a set of requests, this removes duplication and overhead of data transmitted.
  • It allows a server to populate data in a client cache, in advance of it being required, through a mechanism called the server push.

Officially standardized, in May 2015, HTTP/2 has had much success. By July 2016, 8.7% of all Web sites[1] were already using it, representing more than 68% of all requests[2]. High-traffic Web sites showed the most rapid adoption, saving considerably on data transfer overheads and subsequent budgets.

This rapid adoption rate was likely as HTTP/2 does not require adaptation of Web sites and applications: using HTTP/1.1 or HTTP/2 is transparent for them. Having an up-to-date server communicating with a recent browser is enough to enable its use: only a limited set of groups were needed to trigger adoption, and as legacy browser and server versions are renewed, usage has naturally increased, without further Web developer efforts.

Post-HTTP/2 evolution

HTTP didn't stop evolving upon the release of HTTP/2. Like with HTTP/1.x previously, HTTP's extensibility is still beinig used to add new features. Notably, we can cite new extensions of the HTTP protocol appearing in 2016:

  • Support of Alt-Svc allows the dissociation of the identification and the location of a given resource, allowing for a smarter CDN caching mechanism.
  • The introduction of Client-Hints allows the browser, or client, to proactively communicate information about its requirements, or hardware constraints, to the server.
  • The introduction of security-related prefixes in the Cookie header, now helps guarantee a secure cookie has not been altered.

This evolution of HTTP proves its extensibility and simplicity, liberating creation of many applications and compelling the adoption of the protocol. The environment in which HTTP is used today is quite different from that seen in the early 1990s. HTTP's original design proved to be a masterpiece, allowing the Web to evolve over a quarter of a century, without the need of a mutiny. By healing flaws, yet retaining the flexibility and extensibility which made HTTP such a success, the adoption of HTTP/2 hints at a bright future for the protocol.

文档标签和贡献者