I wrote a blog post the other day about CloudFlare's globally distributed DNS infrastructure and how each ninja name server we give you when you signup doesn't represent just one machine, but instead a whole cluster of machines in each of the data centers we operate worldwide. Several people in the comments wondered how that was possible since each name server URL maps to an IP address. The answer is Anycast, but I wanted to take some more time to explain exactly what that means.
Unicast: One Machine, One IP
Most of the Internet works via a routing scheme called Unicast. Under Unicast, every node on the network gets an IP address which is unique to it. You may have connected to a wireless network and gotten a notice that your IP address is already in use, this happens when two computers on the same Unicast network try and use the same IP. In most cases, that isn't allowed.
Routers keep a map of the world's IP addresses and maintain a sense of the shortest path to get from one node to another. One router will hand a packet off to another router that is closer to the final destination until the packet finally arrives at the address it was sent to. You can see these handoffs if you run a traceroute (instructions for Mac and Windows) to any site. Take, for example, the popular arts, culture, and technology blog laughingsquid.com. If I do a traceroute to their origin server then I get something like this.
Each of those lines, known as a "hop," represents a router that the packets carrying my request are passing through. Often there are clues in the names of the routers as to where they are located. In this case, it looks like the origin is running on a Rackspace server somewhere in or around Dallas (line 13 and knowing that DFW is the airport code for the Dallas-Ft. Worth airport gives it away). No matter where in the world you do a traceroute from, you will eventually get to the same server running in Dallas because it is using the Unicast routing scheme.
Numbers on each line. Those are three samples of the time, in milliseconds, it takes for a packet to do a round-trip to the particular router. The last line is typically the most important because it is an estimate of the real-world network latency between you and the server. In this case for me, about 50 milliseconds (which is pretty good).
Anycast: Many Machines, One IP
While Unicast is the easiest way to run a network, it isn't the only way. At CloudFlare, we make extensive use of another routing scheme called Anycast. With Anycast, multiple machines can share the same IP address. When a request is sent to an Anycasted IP address, routers will direct it to the machine on the network that is closest.
Return to our friends at laughingsquid.com. They are CloudFlare users so if you request their website you get back one or more CloudFlare IP addresses. Under normal circumstances, it doesn't matter where you are in the world, you will get the same IP addresses. However, there are machines in 12 data centers all listening on those IPs. As a result, the network itself will direct traffic to the nearest CloudFlare data center to where you are making the request from. Check out the following traceroute.
I'm running this traceroute from Northern California. The nearest CloudFlare data center is in San Jose. And, if you look closely, hops #11 - #13 all show routers in "SJC" which is the airport code for San Jose. If you run this same traceroute to laughingsquid.com from somewhere else in the world, you'll see the requests hitting whatever data center is closest to you.
Speed, Resiliency & Attack Mitigation
Anycast is not trivial to implement, but if you do it has a number of key benefits. The first is speed. Note the latency on the last hop of about 9.5 milliseconds. Since CloudFlare answers typically 75% of requests from its edge without having to hit the origin, we're able to save significant network latency. We also pick core IBX data centers through which your packet would have likely traveled anyway, which means even when we have to go back to the origin there's little if any additional latency introduced.
In addition to speed, using Anycast means the network can be extremely resilient. Because traffic will find the best path, we can take an entire data center offline and traffic will automatically flow to the next closest data center.
The last benefit of Anycast is that it can help with attack mitigation. In most DDoS attacks, many compromised "zombie" computers are used to form what is known as a botnet. These machines can be scattered around the web and generate so much traffic that they can overwhelm a typical Unicast-connected machine. The nature of CloudFlare's Anycasted network is that we inherently increase the surface area to absorb such an attack. A distributed botnet will have a portion of its denial of service traffic absorbed by each of our data centers. As a result, as we continue to grow our network and expanding into more data centers, it will get harder and harder to launch an effective DDoS against any of our users.
It is not easy to setup a true Anycasted network. It requires that you own your own hardware, build direct relationships with your upstream carriers, and tune your networking routes to ensure traffic doesn't "flap" between multiple locations. We've taken the time to do that at CloudFlare because it helps ensure all our users have access to a faster, safer, better Internet.