I'm pleased to announce that CloudFlare now supports WebSockets. The ability to protect and accelerate WebSockets has been one of our most requested features. As of today, CloudFlare is rolling out WebSocket support for any Enterprise customer, and a limited set of CloudFlare Business customers. Over the coming months, we expect to extend support to all Business and Pro customers.
We're rolling out WebSockets slowly because it presents a new set of challenges. The story below chronicles the challenges of supporting WebSockets, and what we’ve done to overcome them.
The Web Before WebSockets
Before diving into WebSockets, it's important to understand HTTP—the traditional protocol of the web. HTTP supports a number of different methods by which a request can be sent to a server. When you click on a traditional link you are sending a GET request to a web server. The web server receives the request, then sends a response.
When you submit a web form (such as when you're giving your username and password when logging into an account) you use another HTTP method called POST, but the interaction is functionally the same. Your browser (called the ‘client’) sends data to the web server which is waiting to receive it. The web server then sends a response back to your browser. Your browser accepts the response because it's waiting for it after having sent the original request.
Regardless of the HTTP method, the communication between your browser and the web server operates in this lockstep request-then-response fashion. Once the client's browser has sent the request, it can't be modified; the browser simply waits for the server's response, and the server can't push new data on its own.
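To make that lockstep concrete, here's a minimal sketch of a single request-response cycle over a raw socket in Python (www.example.com is just an arbitrary host used for illustration):

import socket

# Open a TCP connection, send one GET request, and wait for the response.
# Nothing arrives from the server until the request has been sent.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("www.example.com", 80))
s.sendall(b"GET / HTTP/1.1\r\nHost: www.example.com\r\nConnection: close\r\n\r\n")
print(s.recv(4096))  # the response to our one request; then the connection closes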
In order to get new content, the user had to refresh the full page. This was the state of the web until 1999, when the Outlook Web Access team, unhappy with the poor user experience, introduced a custom extension to Internet Explorer called XMLHttpRequest (the foundation of what became known as AJAX). From then on, web applications could use JavaScript to trigger HTTP requests programmatically in the background without the need for a full page refresh.
However, to make sure the page in the client's browser is up to date, the JavaScript needs to trigger an AJAX request every few seconds. This amounts to constantly asking the web server: is there anything new yet? is there anything new yet?... It works, but it's not particularly efficient.
Ideally, what you'd want is a persistent open connection between the browser and the server allowing them to exchange data in real-time, not just when data is requested.
Prior to WebSockets, there were a few attempts at creating a persistent open connection. These would effectively open an HTTP request and hold it open for an extended period of time. The various solutions were collectively referred to as "Comet". Although they generally worked, they were pretty much a hack with limited functionality, and they often imposed more overhead than necessary. What was needed was a new protocol supported by both browsers and servers.
Enter WebSockets
WebSockets were adopted as a standard web protocol in 2011. Today, they're supported by all modern versions of major browsers. The WebSocket protocol is a distinct TCP-based protocol; however, it's initiated by an HTTP request which is then "upgraded" to create a persistent connection between the browser and the server. A WebSocket connection is bidirectional: the server can send data to the browser without the browser having to explicitly ask for it. This makes things like multiplayer games, chat, and other services that require real-time exchange of information possible over a standard web protocol.
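For illustration, here's a rough sketch of the client side of that upgrade handshake (per RFC 6455), done by hand over a raw socket. The host echo.example.com is hypothetical, and a real client would also need to validate the Sec-WebSocket-Accept header in the response:

import base64
import os
import socket

# Build the random key the server must echo back (hashed) in its response.
key = base64.b64encode(os.urandom(16)).decode()

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("echo.example.com", 80))
s.sendall((
    "GET /chat HTTP/1.1\r\n"
    "Host: echo.example.com\r\n"
    "Upgrade: websocket\r\n"
    "Connection: Upgrade\r\n"
    "Sec-WebSocket-Key: " + key + "\r\n"
    "Sec-WebSocket-Version: 13\r\n"
    "\r\n"
).encode())

# A willing server replies "HTTP/1.1 101 Switching Protocols"; from then on
# the same TCP connection carries WebSocket frames in both directions.
print(s.recv(4096).decode())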
CloudFlare is built on a modified version of the NGINX web server, and NGINX added support for WebSocket proxying in version 1.3.13 (February 2013). As soon as NGINX's proxying support was in place, we investigated how we could support WebSockets for our customers. The challenge was that WebSockets have a very different connection profile, and CloudFlare wasn't originally optimized for that profile.
Connection Counts
CloudFlare sees a large number of traditional HTTP requests, which generate relatively short-lived connections, and we traditionally optimized our network to support those requests. WebSockets present a new challenge because they require much longer-lived connections than traditional web requests, and supporting them required changes to our network stack.
A modern operating system can handle multiple concurrent connections to different network services so long as there's a way to distinguish these connections from each other. One way of making these distinctions is called a "tuple". In theory, there are five distinct elements that form a tuple that can differentiate concurrent connections: protocol (e.g., TCP or UDP), the source IP address, the source port, the destination IP address, and the destination port.
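You can inspect four of those five elements for a live connection directly from Python (the fifth, the protocol, is implied by the socket type; www.example.com is just an arbitrary destination for illustration):

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # protocol: TCP
s.connect(("www.example.com", 80))
print(s.getsockname())  # (source IP, source port), chosen by the OS
print(s.getpeername())  # (destination IP, destination port)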
Since CloudFlare is a proxy, there are two connections that matter: connections from browsers to our network, and connections from our network back to our customers' origin web servers. Connections from browsers to our network have highly diverse source IPs, so they don't impose a concurrent-connection bottleneck. On the other hand, even before we implemented WebSockets, we had seen constraints based on concurrent connections to our customers' origin servers.
Trouble with Tuples
Connections are distinguished using these five tuple elements: two connections can be told apart if any of the five variables differ. In practice, however, the set is more limited. In the case of CloudFlare's connections to our customers' origins, the protocol for a connection, whether a WebSocket or HTTP, is always TCP. The destination port is also fixed: 80 if it's a non-encrypted connection, or 443 if it's encrypted.
When CloudFlare first launched, all traffic to each origin server came from only one IP address per CloudFlare server. We found that caused problems with hosting providers' anti-abuse systems. They would see a very large number of requests from a single IP and block it.
Our solution to this problem was to spread the requests across multiple source IP addresses. We hash the IP address of the client in order to ensure the same browser will always connect via the same source IP, since some older web applications use the connecting IP address as part of their session tracking. Today we have at least 256 IPs for origin traffic in each data center, and each server has a handful of these addresses to use for traffic to and from the origin.
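Here's a minimal sketch of that idea; the hash function, the address list, and the helper below are hypothetical illustrations, not CloudFlare's actual scheme:

import hashlib

# Hypothetical pool of origin-facing source IPs assigned to one server.
ORIGIN_IPS = ["198.51.100.%d" % i for i in range(1, 9)]

def source_ip_for(client_ip):
    # Hash the client's IP so the same client always maps to the same
    # origin-facing source IP.
    index = int(hashlib.md5(client_ip.encode()).hexdigest(), 16) % len(ORIGIN_IPS)
    return ORIGIN_IPS[index]

print(source_ip_for("203.0.113.50"))  # stable for a given client IP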
While it seems like the number of possible connections would be nearly infinite given the five different variables in the tuple, practical realities limit the count quickly. There's only one possibility for the protocol, the destination port, and the destination IP. For the source IP, we are limited by the number of IP addresses dedicated to connecting to the origin. That leaves the source port, which ends up being the biggest driver of the number of connections you can support per server.
Picking Ports
Port numbers are 16-bit values, which allows a theoretical maximum of 65,536 ports. In practice, however, the number of ports available to act as a source port is more limited.
The list of ports that can be used as a source port is known as the Ephemeral Port Range. The standards organization in charge of such things, known as IANA, recommends that the operating system pick a source port between 49152 and 65535. If you follow IANA's recommendations for the Ephemeral Port Range, there are only 16,384 available source ports.
The ports in the range 0 - 1023, known as the "Well Known Ports", are specially reserved and excluded from the Ephemeral Port Range.
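On Linux, the ephemeral range is a kernel setting you can inspect (and, as root, change) via /proc; a quick sketch:

# Read Linux's ephemeral port range from /proc. Writing this file
# (as root) is how the range can be expanded.
with open("/proc/sys/net/ipv4/ip_local_port_range") as f:
    low, high = map(int, f.read().split())
print("%d-%d: %d ports" % (low, high, high - low + 1))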
At CloudFlare, we have a good sense of what will be connecting across the IPs on our network, so we're comfortable expanding our Ephemeral Port Range to span from 9024 through 65535, giving us 56,512 possible source ports. The maximum number of simultaneous connections from any given server on our network to any given customer's origin should therefore be 56,512 multiplied by the number of source IPs assigned to the server. You'd think that would be plenty of connections, but there's a catch.
Bind Before Connect
As I wrote above, in order to prevent too much traffic from coming from a single IP, we spread requests across multiple source IP addresses. We use a version of Linux in the Debian family. In Linux, in order to pin an outbound connection to a particular source IP, you bind a socket to that source IP and a source port (using the bind() function) and then establish the connection (using the connect() function). For example, if you wanted to set the source IP to be 1.1.1.1 and the source port to 1234 and then open a connection to the web server at www.google.com, you'd use the following code:
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("1.1.1.1", 1234))
s.connect(("www.google.com", 80))
If you specify a source port of 0 when you call bind(), then you're instructing the operating system to randomly find an available port for you:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Set source port to 0, instructing the OS to find an available port
s.bind(("1.1.1.1", 0))
s.connect(("www.google.com", 80))
That works great; however, Linux's bind() function is conservative. Because it doesn't know what you're going to use the port for, once a port is reserved, bind() holds it exclusively, regardless of the protocol, destination IP, or destination port of the eventual connection. In other words, if you bind this way, you're only pinning down the source IP and source port: two of the possible five variables in the connection tuple.
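You can see the conservatism directly: without any socket options, a second bind to the same source IP and port fails, even though the two sockets might have gone on to connect to different destinations (addresses below are illustrative):

import socket

s1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s1.bind(("127.0.0.1", 1234))

s2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s2.bind(("127.0.0.1", 1234))  # fails with EADDRINUSE
except socket.error as e:
    print("second bind failed: %s" % e)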
At CloudFlare, this meant the number of concurrent connections a server could make was capped at roughly 64k for each source IP in total, across all destination hosts combined, rather than 64k for each source IP for each destination host.
In practice, with typical HTTP connections, that limit rarely had an impact. HTTP connections are typically so short-lived that, under normal circumstances, no server would ever hit it. We would occasionally see the limits hit on some servers during large Layer 7 DDoS attacks. We knew, however, that if we were going to support WebSockets, such a limited pool of concurrent connections would create a problem.
Safely Reusing Ports
Our solution was to instruct the operating system to be less conservative, and allow ports to be reused. You can do this when you set up a socket by setting the SO_REUSEADDR option. Here's the code:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Specify that it's ok to reuse the same port even if it's been used before
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(("1.1.1.1", 0))
s.connect(("www.google.com", 80))
This works fine in our case, so long as any two connections sharing the same source IP and source port are sending and receiving traffic to and from two different destinations. And, given the large number of destination IPs behind our network, conflicts are rare. If there is a conflict, the connect() function will return an error. The solution is to watch for the error and, when one occurs, retry the port selection until you find an available, unconflicted port. Here's a simplified version of the code we use:
import errno
import socket

RETRIES = 10  # retry budget; the actual value is an implementation detail

for i in range(RETRIES):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("192.168.1.21", 0))
    try:
        s.connect(("www.google.com", 80))
        break
    except socket.error as e:
        # EADDRNOTAVAIL means the port we drew conflicts; draw again
        if e.errno != errno.EADDRNOTAVAIL:
            raise
else:
    raise Exception("Failed to find an unused source port")
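The retry loop keys off EADDRNOTAVAIL specifically, which is what connect() reports when the source address and port it was handed can't be used for this particular destination; any other error is genuinely unexpected and gets re-raised.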
We're now running this in production. It raises the ceiling from 64k connections per source IP in total to 64k connections per source IP per destination host, which practically removes the limit on outgoing connections from CloudFlare. With that improvement, we're comfortable starting to support WebSockets for our customers.
WebSocket Rollout
We're beginning the rollout of WebSockets starting with our Enterprise customers. Enterprise customers who are interested in enabling WebSockets should contact their account manager. We're also rolling this feature out in beta to a limited number of our Business customers. If you're a Business customer and you'd like to sign up for the beta, please open a ticket with our support team.
Over the next few months, as we get a better sense for the demand and load this feature puts on our systems, we plan to expand support for WebSockets to more customers including those at other tiers of CloudFlare's service.
Finally, if you're interested in this topic and want to dive deeper into the technical details, I encourage you to read Marek Majkowski's blog post on the subject. Marek is the engineer on our team who spearheaded CloudFlare's efforts to add support for WebSockets. He argued convincingly that WebSockets were part of the modern web, and it was critical that we find a way to protect and accelerate them. Like our efforts to lead the way for broad SSL support, SPDY, and IPv6, CloudFlare's support of WebSockets furthers our mission of helping build a better Internet.