Cloudflare Traffic Manager: The Details

2016-09-29

Cloudflare's investment in building a large global network protects our customers from DDoS attacks, secures them with our Web Application Firewall and Universal SSL, and improves performance through our CDN and the many network-level optimizations we're constantly iterating on.

Building on these products, we just introduced Cloudflare Traffic. To explain the benefits, we'll dive into the nitty-gritty details of how the monitoring and load-balancing features of Traffic Manager can be configured, and how we use it on our own website to reduce visitor latency and improve redundancy across continents. We'll do a similar post on Traffic Control soon.

We're also kicking off the Early Access program for Traffic Manager, with details at the end of this post.

The Details

One of our primary goals when building Traffic Manager was to make it available to everyone: from those with just two servers in a single region, to those with 400 scattered across the globe (and everything in between). We also wanted to solve some of the key limitations of existing products: slow failover times (measured in minutes), and a lack of granular decision making when failing over.

  • We can fail over within seconds for proxied ("orange clouded") records. Connecting clients don't need to wait for recursive DNS caches to expire, or trust that those caches respect sub-60s TTLs, so we can respond to changes in the availability of your origin servers as quickly as needed.

  • With 100 data centers (and growing!), we can assess the availability of your origins and make fine-grained failover decisions at a per-data-center level. A network path failure to your origin in London should not impact how we route traffic in New York!

  • Traffic Manager is built on Cloudflare's existing Anycast DNS infrastructure, benefiting from our resilient DDoS protection and our experience in making DNS fast. DNS failures can deeply impact the availability of your infrastructure: by leveraging Cloudflare's highly available network, you can relax.
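To make the per-data-center point concrete, here is a minimal sketch in Python. It is not Traffic Manager's implementation: the data center codes, origin names, and the `usable_origins` helper are all hypothetical, chosen only to show health being tracked per (data center, origin) pair rather than globally.

```python
from typing import Dict, List, Tuple

# Hypothetical health table keyed by (data_center, origin).
# In this illustration, London has lost its network path to origin-a,
# while New York can still reach it.
health: Dict[Tuple[str, str], bool] = {
    ("london", "origin-a"): False,
    ("london", "origin-b"): True,
    ("new-york", "origin-a"): True,
    ("new-york", "origin-b"): True,
}


def usable_origins(data_center: str, origins: List[str]) -> List[str]:
    """Return the origins that this particular data center sees as healthy."""
    return [o for o in origins if health.get((data_center, o), False)]


origins = ["origin-a", "origin-b"]
print(usable_origins("london", origins))    # ['origin-b']
print(usable_origins("new-york", origins))  # ['origin-a', 'origin-b']
```

Because each data center consults only its own view, the London path failure changes routing in London and nowhere else.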

Existing solutions often struggle to meet these requirements. On-premises load balancers are vulnerable to local network conditions, hard to scale out as you grow regionally, and at risk of expensive hardware failure. Existing cloud-based solutions have smaller network footprints (increasing latency) and lack effective DDoS mitigation, which we see as critical to running high-availability infrastructure.

Flexible Configuration

Traffic Manager is designed to be flexible, and we want to share some of the many supported use cases.

We had three primary scenarios in mind: load balancing, failover, and geo-steering. Underpinning these choices is our health checking functionality, which allows you to define what a healthy origin means to you. We can probe your origin web servers from every Cloudflare data center (or selected regions) over HTTP(S), check for status codes you define as healthy, parse the response body for specific text, and time out after an interval you choose. We'll also send you email notifications as we detect availability changes, so you're able to pin down exactly when we saw that server fail.
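As a rough illustration of what such a health check evaluates, the sketch below uses only the Python standard library. The `HealthCheck` fields, the `/health` path, and the function names are assumptions made for this example and do not reflect Traffic Manager's actual configuration schema.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass, field
from typing import Set


@dataclass
class HealthCheck:
    """Hypothetical health check definition (field names are illustrative)."""
    path: str = "/health"
    expected_codes: Set[int] = field(default_factory=lambda: {200})
    expected_text: str = ""        # empty string means "do not inspect the body"
    timeout_seconds: float = 5.0


def evaluate(check: HealthCheck, status: int, body: str) -> bool:
    """Healthy only if the status code is accepted and the body contains the text."""
    return status in check.expected_codes and check.expected_text in body


def probe(check: HealthCheck, base_url: str) -> bool:
    """Fetch the health endpoint once and evaluate the response."""
    try:
        with urllib.request.urlopen(
            base_url + check.path, timeout=check.timeout_seconds
        ) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return evaluate(check, resp.status, body)
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code and a body to evaluate.
        return evaluate(check, err.code, err.read().decode("utf-8", errors="replace"))
    except OSError:
        # Connection failures and timeouts count as unhealthy.
        return False
```

Keeping `evaluate` separate from `probe` means the pass/fail rule can be checked without any network access, for example `evaluate(HealthCheck(expected_text="ok"), 200, "status: ok")`.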

Round-robin ("active-active") load balancing configuration, where load is distributed across all healthy servers. When a server is identified as unavailable through health checks, we'll seamlessly avoid routing traffic to it until it's available again.

Figure: Load Balancing
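One way to picture round-robin selection that skips unhealthy servers is the small sketch below. The `RoundRobin` class and the server names are hypothetical, and this is an illustration of the idea rather than how Traffic Manager distributes load internally.

```python
from typing import Dict, List, Optional


class RoundRobin:
    """Rotate through origins, skipping any currently marked unhealthy."""

    def __init__(self, origins: List[str]) -> None:
        self.origins = origins
        self.healthy: Dict[str, bool] = {o: True for o in origins}
        self._next = 0  # rotation cursor

    def pick(self) -> Optional[str]:
        # Examine each origin at most once, starting from the cursor.
        for _ in range(len(self.origins)):
            candidate = self.origins[self._next]
            self._next = (self._next + 1) % len(self.origins)
            if self.healthy[candidate]:
                return candidate
        return None  # no healthy origin is available


lb = RoundRobin(["server-1", "server-2", "server-3"])
lb.healthy["server-2"] = False  # a health check marked it unavailable
print([lb.pick() for _ in range(4)])
# ['server-1', 'server-3', 'server-1', 'server-3']
```

Once `server-2` is marked healthy again, it rejoins the rotation on the next pass without any other change.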

Failover configuration ("primary-secondary"), where we route traffic to a specific server or pool of servers, and fail over completely to a secondary server or pool when the primary is identified as unavailable. Failing over between your own datacenter, AWS regions, or even major cloud providers is supported (and easy to configure).

Figure: Failover
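A primary-secondary arrangement can be sketched as an ordered list of pools in which traffic goes to the first pool that still has a healthy member. The pool contents and the `select_pool` helper are hypothetical and serve only to illustrate the ordering logic.

```python
from typing import Dict, List, Optional


def select_pool(
    pools: List[List[str]], health: Dict[str, bool]
) -> Optional[List[str]]:
    """Return the healthy members of the first pool, in priority order, that has any."""
    for pool in pools:
        members = [origin for origin in pool if health.get(origin, False)]
        if members:
            return members
    return None  # every pool is down


primary = ["own-dc-1", "own-dc-2"]   # e.g. your own datacenter
secondary = ["cloud-1"]              # e.g. a cloud provider region

health = {"own-dc-1": False, "own-dc-2": False, "cloud-1": True}
print(select_pool([primary, secondary], health))  # ['cloud-1']
```

When any primary member recovers, the same function returns the primary pool again, so failing back needs no separate code path.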

Geo-steering configuration, which directs traffic to the nearest origin based on region. As a simple example: European clients to your Berlin datacenter, East Coast US clients to your New York datacenter, and Oceanic clients to your Singaporean datacenter.

Figure: Geo-steering
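At its simplest, geo-steering is a lookup from client region to origin. The sketch below mirrors the example above. The region keys, the origin names, and the default are assumptions made for illustration.

```python
from typing import Dict

# Hypothetical region-to-origin mapping based on the example in the text.
REGION_TO_ORIGIN: Dict[str, str] = {
    "europe": "berlin",
    "us-east": "new-york",
    "oceania": "singapore",
}

# Assumption: regions without an explicit mapping fall back to one default origin.
DEFAULT_ORIGIN = "new-york"


def steer(client_region: str) -> str:
    """Return the origin that should serve a client from the given region."""
    return REGION_TO_ORIGIN.get(client_region, DEFAULT_ORIGIN)


print(steer("europe"))   # berlin
print(steer("oceania"))  # singapore
```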

Of course, all of these approaches can be combined: you can load-balance across multiple locations in a specific region, failing over to the next-nearest datacenter should the first (or second) fail. You can also fine-tune what you consider to be a failure mode: given a pool of five servers, you might be able to sustain two unhealthy ones, but a third failure might be the trigger to steer traffic to your next data center.

Figure: Combining approaches
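The five-server example can be expressed as a minimum-healthy threshold per pool. In this sketch the `Pool` type, the `min_healthy` field, and the pool names are hypothetical. With five servers and a threshold of three, two failures are tolerated and a third triggers the move to the next data center.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Pool:
    name: str
    origins: List[str]
    min_healthy: int  # the pool counts as "up" only with at least this many healthy origins


def pool_is_up(pool: Pool, health: Dict[str, bool]) -> bool:
    healthy_count = sum(1 for o in pool.origins if health.get(o, False))
    return healthy_count >= pool.min_healthy


def choose_pool(pools_nearest_first: List[Pool], health: Dict[str, bool]) -> Optional[Pool]:
    """Return the nearest pool that still meets its own threshold."""
    for pool in pools_nearest_first:
        if pool_is_up(pool, health):
            return pool
    return None


nearest = Pool("dc-1", ["s1", "s2", "s3", "s4", "s5"], min_healthy=3)
next_nearest = Pool("dc-2", ["t1", "t2"], min_healthy=1)

health = {o: True for o in nearest.origins + next_nearest.origins}
health["s1"] = health["s2"] = False           # two failures: still within tolerance
print(choose_pool([nearest, next_nearest], health).name)  # dc-1

health["s3"] = False                          # third failure crosses the threshold
print(choose_pool([nearest, next_nearest], health).name)  # dc-2
```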

How We're Using It

In fact, we're doing exactly this ourselves: the new Cloudflare website combines failover and geo-steering configurations so we can serve one half of the world from Europe, and the other half from the US.

We've associated each origin web server with its respective region, and Traffic Manager then automatically directs neighboring regions to those origins without further configuration. For example, users in Africa are steered to our EU origin, and users in Asia Pacific and South America are automatically geo-steered to our US origin.

Adding more origins globally will be a minor configuration change, with no interruption to traffic.

When our EU origin is taken out of production for maintenance or marked unhealthy, users on that side of the world are automatically steered to the healthy US origin, without us having to intervene manually.
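The behavior described in this section can be approximated with a per-region preference order over the two origins. This is only a sketch: the region names and the `route` helper are hypothetical, and the table spells out by hand the neighboring-region assignments that Traffic Manager derives automatically.

```python
from typing import Dict, List, Optional, Set

# Hypothetical preference order per client region: nearest origin first.
PREFERENCES: Dict[str, List[str]] = {
    "europe": ["eu", "us"],
    "africa": ["eu", "us"],
    "north-america": ["us", "eu"],
    "south-america": ["us", "eu"],
    "asia-pacific": ["us", "eu"],
}


def route(region: str, healthy: Set[str]) -> Optional[str]:
    """Return the first preferred origin for the region that is currently healthy."""
    for origin in PREFERENCES.get(region, []):
        if origin in healthy:
            return origin
    return None


print(route("africa", {"eu", "us"}))  # eu
print(route("africa", {"us"}))        # us (EU origin in maintenance or unhealthy)
```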

Our own website is critical to our business, and we know our customers feel the same way about their own web properties. We trust Traffic Manager to keep our website available despite what the Internet might throw at us!

Early Access

Starting right now, we are accepting participants in our Early Access program, before we bring this service to all Cloudflare customers. At first, Early Access requires some additional technical savvy and a willingness to share your feedback with us. For the duration of the Early Access program, Traffic Manager is free to use. When introduced generally later this year, Traffic Manager will be available to all customers on all plans, with usage-based charges.

Fit the criteria? Fill out the Early Access form and we'll reach out with the details. We're keen to show off what Traffic Manager can do.

Matt Silverlock|@elithrar