Load Balancing as a concept is pretty straightforward. Take an existing infrastructure and route requests to the available origin servers so no single server is overwhelmed. Add in some health monitoring to ensure each server has a heartbeat/pulse so proactive decisions can be made. With two steps, you get more effective utilization of your existing resources… simple enough!
As your application grows, however, load balancing becomes more complicated. An example of this — and the subject of this blog post — is how load balancing interacts with the Host header in an HTTP request.
Host headers and load balancing
Every request to a website contains a unique piece of identifying information called the Host header. The Host header helps route each request to the correct origin server so the end user is sent the information they requested from the start.
For example, say that you enter
example.com into my URL bar in my browser. You are sending a request to ‘example.com’ to send you back the homepage located within that application. To make sure you actually get resources from
example.com, your browser includes a Host header of
example.com. When that request reaches the back-end infrastructure, it finds the origin server that also has the host
example.com and then sends back the required information.
Host headers are not only important for locating resources, but also for security. Imagine how jarring it would be if someone sent your request to
scary-example.com and returned malicious resources instead! If the Host header in the request does not match the Host header at the destination, then the request will fail.
This behavior can cause issues when you start using third-party applications like Google Compute Cloud, Google App Engine, Heroku, and Amazon S3. These applications are great, helping you deploy and scale applications. However, they have an important effect on the requirements for a Host header. Your server is no longer located at
example.com. Instead, it’s probably at something like
example.herokuapp.com. If you can’t change the Host header on your origin, users might be sending their requests to the wrong place, leading to error messages and poor user experience.
The combination of Host headers and third-party applications caused issues with our Load Balancing solution. The Host header of the request would not match the updated Host header for the new application now added into the request pipeline, meaning the request would fail. Larger customers were forced to choose between using third-party applications and load balancing… not a decision we wanted to force our customers to make!
We saw this to be a big problem and wanted to help our customers leverage different solutions in the market to ensure they are successful and can align their business objectives with their infrastructure.
Introducing Origin Server Host Header Overrides
To address this problem, we’ve added support to override the Host header on a per-origin basis within our Load Balancing solution. This means that you don’t have to worry about sending errors back to your end-users and can seamlessly integrate apps — such as Heroku — into your reliability solutions.
We wanted to create a scalable solution, one that allowed you to perform these overrides in an easy and intuitive way. As we reviewed the problem, we saw that there was no one-size fit all solution. Different customers are going to have their apps architected differently and have different goals based on their business, industry, geography, etc. Thus, it was important to provide flexibility to override the Host header on a per-origin basis, since different origins can support different segments of an application or entirely different applications altogether! With a simple click, users can now update the Host header on their origins. When requests hit the Load Balancer, it reads the overwritten value for the origin host instead of the defaulted origin Host header value and requests are routed properly without errors.
“But what about my health monitors!?” you may be thinking. When you add a Host header override on your origin, we will automatically apply it to the origin’s health monitor. No extra configuration is required. On top of that, we felt it was important to provide as much meaningful information as possible so you could make informed decisions around any configuration updates. When editing a health monitor, you will see a new table if any origins have a Host header override.
How else can overriding the Host header help me?
You might find the Host header override useful if you have a shared web server. In these situations, IP addresses are not uniquely assigned to each server, meaning you might need a Host header override to direct your request to the right domain. For example, you might have
example-perfume.com all on a shared webserver hosted on the same IP address. When a request intended for
example-furniture.com is forwarded to the server, the Host header override —
Host: example-furniture.com — specifies which host to connect to.
Setting a Host header would mean that you enforce a strict HTTPS/TLS connection to reach the origin. We validate the Server Name Indicator (SNI) to verify that the request will be forwarded to the appropriate website.
How does it work under the hood?
To understand how it works, first let’s see what solutions we have currently. To configure a Load Balancer with a third-party cloud platform that requires a Host header, you would need to set a Host header override with Page Rules (Enterprise customers only). This solution works great if you have back-end origin1 and origin2 that expect the same Host header. But when the origins expect different Host headers, it wasn’t possible to set it per origin. There was a need for a more granular flexibility, hence the reason for per-origin Host header override.
Now, when you navigate to the Traffic tab and set up a Load Balancer, you can add a Host header override on your back-end origin. When you add the Host header, we do safety checks — which we’ll get to in a bit — and if it passes all the checks then your configuration gets saved. If you set a Host header override on an origin, it will take precedence over a Host header override set on the monitor.
Cloudflare has over 200 edge locations around the world, making it one of the largest Anycast networks. We make sure that your Load Balancer config information is available at all our Cloudflare edge locations. For example, you may have your website hosted on a third-party cloud provider like Heroku. You would set up a Load Balancer with pools and associated origin
example.com. To reach
example.com hosted on Heroku, you would set the heroku url
example.herokuapp.com as the origin Host header “Host:example.herokuapp.com”. When a request hits a Cloudflare edge, we would first check the load balancing config and the health monitor to check the origin health status. Then, we would set the Host header override and Server Name Indication (SNI).For more about, visit our learning center.
There are some restrictions that limit the domain set as the Host header override on an origin:
- The only allowed HTTP header is “Host” and no other.
- You can specify only one “Host” Host header per origin, so no duplicates or line wrap/indent Host header with space is allowed.
- No ports can be added to the Host header.
- We allow fully qualified domain names (FQDN) and IP addresses that can be resolved by public DNS.
- The FQDN in the Host header must be a subdomain of a zone associated with the account, which is applicable for partial zones and secondary zones.
These requirements make sure you are directing requests to domains that belong to you. With future development, we may relax some restrictions.