HTTP/2 For Web Developers

HTTP/2 changes the way web developers optimize their websites. In HTTP/1.1, it’s become common practice to eek out an extra 5% of page load speed by hacking away at your TCP connections and HTTP requests with techniques like spriting, inlining, domain sharding, and concatenation.

Life’s a little bit easier in HTTP/2. It gives the typical website a 30% performance gain without a complicated build and deploy process. In this article, we’ll discuss the new best practices for website optimization in HTTP/2.

Web Optimization in HTTP/1.1

Most of the website optimization techniques in HTTP/1.1 revolved around minimizing the number of HTTP requests to an origin server. A browser can only open a limited number of simultaneous TCP connections to an origin, and downloading assets over each of those connections is a serial process: the response for one asset has to be returned before the next one can be sent. This is called head-of-line blocking.

As a result, web developers began squeezing as many assets as they could into a single connection and finding other ways to trick browsers into avoiding head-of-line blocking. In HTTP/2, some of these practices can actually hurt page load times.

The New Web Optimization Mindset for HTTP/2

Optimizing for HTTP/2 requires a different mindset. Instead of worrying about reducing HTTP requests, web developers should focus on tuning their website’s caching behavior. The general rule is to ship small, granular resources so they can be cached independently and transferred in parallel.

This shift occurs because of the multiplexing and header compression features of HTTP/2. Multiplexing lets multiple requests share a single TCP connection, allowing several assets to download in parallel without the unnecessary overhead of establishing multiple connections. This eliminates the head-of-line blocking issue of HTTP/1.1. Header compression further reduces the penalty of multiple HTTP requests, since the overhead for each request is smaller than the uncompressed HTTP/1.1 equivalent.

There are two other features of HTTP/2 that also change how you approach web optimization: stream prioritization and server push. The former lets browsers specify what order they want to receive resources, and the latter lets the server send extra resources that the browser doesn’t yet know it needs. To make the most of these new features, web developers need to unlearn some of the HTTP/1.1 best practices that have become second nature to them.

Best Practices for HTTP/2 Web Optimization

Stop Concatenating Files

In HTTP/1.1, web developers often combined all of the CSS for their entire website into a single file. Similarly, JavaScript was also condensed into a single file, and images were combined into a spritesheet. Having a single file for your CSS, JavaScript, and images vastly reduced the amount of HTTP requests, making it a significant performance gain for HTTP/1.1.

However, concatenating files is no longer a best practice in HTTP/2. While concatenation can still improve compression ratios, it forces expensive cache invalidations. Even if only a single line of CSS is changed, browsers are forced to reload all of your CSS declarations.

In addition, not every page in your website uses all of the declarations or functions in your concatenated CSS or JavaScript files. After it’s been cached, this isn’t a big deal, but it means that unnecessary bytes are being transferred, processed, and executed in order to render the first page a user visits. While the overhead of a request in HTTP/1.1 made this a worthwhile tradeoff, it can actually slow down time-to-first-paint in HTTP/2.

Instead of concatenating files, web developers should focus more on optimizing caching policy. By isolating files that change frequently from ones that rarely change, it’s possible to serve as much content as possible from your CDN or the user’s browser cache.

Stop Inlining Assets

Inlining assets is a special case of file concatenation. It’s the practice of embedding CSS stylesheets, external JavaScript files, and images directly into an HTML page. For example, if you have web page that looks like this:

<html>
  <head>
	<link rel="stylesheet" href="/style.css">
  </head>
  <body>
	<img src="logo.png">
	<script src="scripts.min.js"></script>
  </body>
</html>

You could run it through an inlining tool to get something like this:

<html>
  <head>
	<style>

	  body {
		font-size: 18px;

		color: #999;
	  }
	</style>
  </head>
  <body>
	<img src="data:image/png;base64,Rw0KGgoAAAANSUhEUgAAAEAABA...">
	<script>console.log('Hello, World!');</script>
  </body>
</html>

In extreme cases, this can actually reduce the number of HTTP requests for a given web page to one. But, like file concatenation, you shouldn’t inline files you’re trying to optimize for HTTP/2.

Inlining means that browsers can’t cache individual assets. If you have CSS declarations that apply to your entire website embedded into every single HTML file, those declarations get sent back from the server every time. This results in redundant bytes being sent over the wire for every page that a user visits.

Inlining also has the unfortunate side effect of breaking stream prioritization. Because your CSS, JavaScript, and images are now embedded in your HTML, you’re effectively raising their priority to the same level as HTML content. This means that browsers can’t request assets in their preferred order, potentially increasing time-to-first-paint.

Instead of inlining assets, web developers should leverage HTTP/2’s server push functionality. Server push lets your web server say stuff like, "Hold on, you’re going to need this image and this CSS file to render that HTML page you just requested." Conceptually, this is the same as inlining an asset, but it doesn’t break stream prioritization, and it allows you to leverage a CDN and the user’s local browser cache, too.

Stop Sharding Domains

Domain sharding is a common technique for tricking browsers into opening more TCP connections than they’re supposed to. Browsers limit the number of connections to a single origin server, but by splitting your website’s assets across multiple domains, you can get it to open an extra set of TCP connections. This helps avoid head-of-line blocking, but it comes with significant tradeoffs.

Splitting website resources across multiple domains

Domain sharding should be avoided in HTTP/2. Each shard results in an unnecessary DNS query, TCP connection, and TLS handshake (assuming the servers use different TLS certificates). In HTTP/1.1, that overhead was compensated for by allowing several assets to download in parallel. But, this is no longer the case in HTTP/2: multiplexing allows multiple assets to download in parallel over a single connection. And, similar to asset inlining, domain sharding breaks HTTP/2 stream prioritization because browsers can’t prioritize streams across multiple domains.

If you’re currently sharding but want to leverage HTTP/2, you don’t necessarily need to go about refactoring your entire codebase. As discussed in our blog post on mixing domain sharding and SPDY, browsers recognize when multiple origin servers use the same TLS certificate. When they do, the browser will reuse the same SPDY or HTTP/2 connection for both servers. This still results in multiple DNS queries, but it’s a decent compromise solution if you want the best performance under both HTTP/1.1 and HTTP/2.

Some Best Practices Are Still Best Practices

Fortunately, not everything about web optimization changes in HTTP/2. There’s still several HTTP/1.1 best practices that carry over to HTTP/2. The rest of this article discusses techniques that you should be leveraging regardless of whether you’re optimizing for HTTP/1.1 or HTTP/2.

Keep Reducing DNS Lookups

Before a browser can request your website’s resources, it needs to find the IP address of your server using the domain name system (DNS). Until the DNS responds, the user is left staring at a blank screen. HTTP/2 is designed to optimize how web browsers communicate with web servers—it doesn’t affect the performance of the domain name system.

Since DNS queries can be expensive, especially if you have to start your query at the root nameservers, it’s still prudent to minimize the number of DNS lookups that your website uses. DNS prefetching via a <link rel='dns-prefetch' href='...' /> line in your document’s <head> can help, but it’s not a one-size-fits all solution.

Keep Using a Content Delivery Network

It takes around 130ms for light to travel around the circumference of the earth. That’s latency that you can’t get rid of—it’s just physics. The imperfect nature of fiber optic cables and wireless connections, as well as the topology of the global Internet means that it actually takes closer to 300-400ms to transmit a network packet from your computer to a server that’s halfway around the world. User’s can perceive latency as small as 100ms, and the only way to get around the laws of physics is to put your web assets geographically closer to your visitors via a CDN.

Keep Leveraging Browser Caching

You can take content delivery networks one step further by caching assets in the user’s local browser cache. This avoids any kind of data transfer over the network, aside from a 304 Not Modified response.

Keep Minimizing the Size of HTTP Requests

Even though HTTP/2 requests are multiplexed, it takes time to transmit data over the wire. Accordingly, it’s still beneficial to reduce the amount of data you need to transfer. On the request end, this means minimizing the size of cookies, URL and query strings, and referrer URLs as much as possible.

Keep Minimizing the Size of HTTP Responses

Of course, this holds true in the opposite direction. As a web developer, you want to make your server’s response as small as possible by minifying HTML, CSS, and JavaScript Files, optimizing images for the web, and compressing resources with gzip.

Keep Eliminating Unnecessary Redirects

HTTP 301 and 302 redirects are sometimes a fact of life when migrating to a new platform or re-designing your website, but they should be eliminated wherever possible. Redirects result in an extra round trip from the browser to the server, and this adds unnecessary latency. You should be particularly wary of redirect chains where it takes more than a single redirect to get to the destination URL.

Server-side redirects like 301 and 302 responses aren’t ideal, but they aren’t the worst thing in the world, either. They can still be cached locally, so a browser can recognize the URL as a redirect and avoid an unnecessary round trip. Meta refreshes (e.g., <meta http-equiv="refresh"...), on the other hand, are much more costly because they can’t be cached, and they can have performance issues in certain browsers.

Conclusion (and Caveats)

And that’s HTTP/2 web optimization in a nutshell. Avoiding file concatenation, asset inlining, and domain sharding not only speeds up your website, but also makes for a much simpler build and deploy process.

There are, however, a few caveats to this discussion. Most servers, content delivery networks (including CloudFlare), and existing applications don’t support server push. Servers and CDNs will catch up soon enough, but for your application to benefit from server push, you’re going to have to make some changes to your codebase.

Also keep in mind is that HTTP/2 performance gains depend on the type of content you serve. For example, websites that rely on more external assets will see larger performance gains from HTTP/2 multiplexing than ones that have fewer HTTP requests.

The Cloudflare Blog