This is the first in a series of posts looking at what makes the web slow and what to do about it.
Suppose you needed to transfer 1TB of data (perhaps your home movie collection) from San Francisco to London. What would be the fastest route? Put the disk on British Airways flight 286 at SFO, or transfer it across the Internet using a 100 Mbps connection?
Surprisingly, the answer is the former, not the latter. If you had a perfect 100 Mbps Internet connection and could fill it completely with data, the transfer would take 22 hours 13 minutes. British Airways make the flight in under 10 hours.
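If you want to check that arithmetic yourself, here's a minimal Python sketch (assuming 1TB means 10^12 bytes and a link that stays perfectly saturated):

```python
# Back-of-the-envelope check of the numbers above, assuming 1TB means
# 10^12 bytes and a link that stays perfectly saturated.
data_megabits = 1_000_000 * 8        # 1TB expressed in megabits
bandwidth_mbps = 100                 # the 100 Mbps connection

transfer_seconds = data_megabits / bandwidth_mbps
hours, remainder = divmod(transfer_seconds, 3600)
print(f"Internet transfer: {int(hours)}h {int(remainder / 60)}m")   # 22h 13m
```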
But even with a 100 Mbps Internet connection you're unlikely to get 100 Mbps of transfer speed between San Francisco and London. The details of the TCP protocol used on the Internet and the speed of light collude to make the effective transfer speed much lower.
To really understand the speed of an Internet connection, be it transferring 1TB of data or downloading a web page, there are two values that you need to know: the bandwidth and the latency.
The bandwidth is how much data can be sent on the connection in a unit of time. In the example above the Internet connection has a bandwidth of 100 Mbps; the Boeing 747 has a bandwidth of 222 Mbps (the 1TB carried divided by the flight time).
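The same sum gives the aircraft's figure (a sketch, assuming a 10 hour flight time):

```python
# The aircraft's effective bandwidth: total bits carried divided by the
# flight time (a 10 hour flight assumed).
data_megabits = 1_000_000 * 8        # 1TB in megabits
flight_seconds = 10 * 3600           # assumed 10 hour flight

print(f"747 effective bandwidth: {data_megabits / flight_seconds:.0f} Mbps")   # ~222 Mbps
```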
The latency is the 'flight time' of data across the connection. For a connection between London and San Francisco across the Internet the latency will be something like 150ms. That figure is governed and limited by the speed of light. For the 747 the latency is the literal flying time of 10 hours.
One thing British Airways ensures while flying the 1TB of data is reliability. The data is very, very unlikely not to arrive in London. The Internet does not provide the same guarantee. As data is transferred across the Internet it gets delayed, lost, corrupted and misordered. So, the Internet's core protocol TCP provides mechanisms to ensure the reliable delivery of data despite the lossy network the data is passing through. It's these mechanisms that slow the transfer of data down and where the speed of light comes into play.
(If airlines experienced aircraft loss at the rate the Internet sees packet loss there'd be 28 crashes per day in the US alone).
To ensure reliability, TCP breaks the data to be sent into chunks (which are further broken down into packets), sends a chunk, and then waits for an acknowledgement that the chunk was successfully received. It's while waiting for the acknowledgement that the speed of light comes into play.
Imagine that a 65kB chunk of data is sent across a link with a latency of 150ms. The 65kB take 150ms to reach their destination, and the receiving machine sends an acknowledgement that takes a further 150ms to arrive. So 0.3s is taken up making sure that the 65kB have arrived successfully; that time is called the Round Trip Time.
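In code form that's nothing more than doubling the one-way latency (a sketch using the 150ms figure from above):

```python
# One acknowledged chunk costs a full round trip: one one-way latency for
# the data to arrive and another for the acknowledgement to come back.
one_way_latency_s = 0.150            # roughly London <-> San Francisco

round_trip_time_s = 2 * one_way_latency_s
print(f"Round Trip Time: {round_trip_time_s:.1f}s")   # 0.3s per acknowledged chunk
```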
Those acknowledgement delays significantly hamper connections across long distances (and also from mobile phones).
The amount of data that TCP can send in a single chunk is controlled by the Receive Window of the receiving machine. For web surfers, that means the receiving machine controls how much can be sent without acknowledgement. And the combination of Receive Window and Round Trip Time limits the speed at which downloads can occur no matter what the bandwidth is.
The maximum throughput of TCP is Receive Window divided by Round Trip Time.
For example, on my machine the Receive Window is set at 524,288 bytes, meaning that on a link from London to San Francisco with a 0.3s Round Trip Time the fastest download I can get is 524,288 bytes / 0.3s, or 14 Mbps. Much less than the 100 Mbps I was hoping to get.
So, my 1TB download would actually take more than 6 days! The speed of light really is a limiting factor in downloading.
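Putting the formula into a few lines of Python makes the limit concrete (a sketch using my machine's window size and the 0.3s Round Trip Time from above):

```python
# Maximum TCP throughput is the Receive Window divided by the Round Trip
# Time; these are the figures from my machine and the link above.
receive_window_bytes = 524_288
round_trip_time_s = 0.3

max_throughput_bps = receive_window_bytes * 8 / round_trip_time_s
print(f"Maximum throughput: {max_throughput_bps / 1e6:.0f} Mbps")   # ~14 Mbps

# The 1TB download at that rate takes well over 6 days.
data_bits = 1_000_000_000_000 * 8
print(f"1TB download: {data_bits / max_throughput_bps / 86_400:.1f} days")   # ~6.6 days
```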
How do you fight the speed of light? Since you can't control the Receive Window the only thing you can do is move your web site closer to the people surfing it. And, of course, that's not practical for most web sites since you'd have to have copies of the web site all over the world.
CloudFlare fights the speed of light for you by having data centers around the world. If your site is on CloudFlare then surfers will connect to the data center that's nearest to them.
For example, CloudFlare's own web site is in California, but from London it appears to be only 10ms away because of CloudFlare's London data center. The same distribution of a web site across the world works for any CloudFlare customer.
In my next post I'll look at optimizations you can make to web site content for speedy browsing, and show how CloudFlare helps.
Part two of this series is now available: What Makes SPDY Speedy?