Eliminating the last reasons to not enable IPv6

by Matthew Prince.

Today is June 6. For the last two years, the date has been celebrated as World IPv6 Day. CloudFlare has offered full IPv6 support as well as our IPv6-to-IPv4 gateway since 2012. In preparation for this year's IPv6 Day, we scanned the world's largest websites in order to figure out how many are available over IPv6. There's good news and bad news.

The good news is that our IPv6 gateway is being widely used in order to enable the IPv6 Web. In fact, of the sites that support IPv6, more than 20% of them do so via CloudFlare. The bad news is that, while CloudFlare, Google, Facebook and other big guys have shown IPv6 can be adopted without a performance penalty, still only 7% of the world's largest websites are available over IPv6. And, disappointingly, in spite of CloudFlare offering it for free, only about half of our customers have turned on our IPv6 gateway.

The silver lining is that, if we can just get all our current customers to enable our IPv6 gateway, we'd nearly reach the milestone of 10% of the world's largest sites being available over IPv6. With that goal in mind, we set out to find and solve the last stumbling blocks for our customers enabling IPv6.

Y U No IPv6?

CloudFlare makes supporting IPv6 on a website ridiculously easy. If your backend supports IPv6 then visitors arriving on an IPv6 connection will be transported via the protocol end-to-end. If your backend only supports IPv4, CloudFlare will accept a visitor over IPv6 and then seamlessly make a request to your server over IPv4. We've been defaulting the gateway on for new customers for several months and any existing customer can enable it with a single click. So, if we've made IPv6 so easy, what's stopping those sites that haven't yet enabled it?

The answer is usually legacy software that assumes an IPv4 world. Often this is software that handles sessions or stops fraud or abuse. For example, the popular, if sometimes controversial, website 4chan is a CloudFlare customer. 4chan has a notoriously mischievous audience. Sometimes, users will try to spam the service. To keep spammers at bay, 4chan uses algorithms which look for suspicious behavior. Unfortunately, these algorithms use a visitor's IP address as one of their parameters and they assume the IP will be in the IPv4 format.

The long term solution, of course, is for platforms like 4chan to rewrite their software to accommodate IPv6. That takes time. We wanted to provide a stopgap solution that would allow CloudFlare customers to enable IPv6 quickly while they worked to upgrade their software to fully support it.

Introducing Pseudo IPv4

To accommodate software that assumes a IPv4 world, today we're enabling a new CloudFlare option: Pseudo IPv4. The option will, whenever a connection is established over IPv6, add a HTTP header to requests with a "pseudo" IPv4 address. We are using Class E IP space (240.0.0.0 - 255.255.255.255) for these addresses. Class E address space is reserved as experimental and no actual traffic should originate from it. If, at some point, some or all of Class E space is designated for a specific use then we'll adjust the schema for the Pseudo IPv4 header.

Class E space gives us 268,435,456 possible unique IPv4 addresses. That's much smaller than the 340 undecillion possible IPv6 addresses, but it's larger than the number of IPs seen by all but the very largest websites. The Pseudo IPv4 service uses a hashing algorithm applied to top 64 bits of the connecting IPv6 address in order to map the visitor to one of the Class E IPs. Because the hashing algorithm will always produce the same output for the same input, the same IPv6 address will always result in the same Pseudo IPv4 address.

There are two options for the Pseudo IPv4 service: 1) you can have us automatically add a header (Cf-Pseudo-IPv4), which can then be parsed by software as needed; or 2) you can have us overwrite the existing Cf-Connecting-IP and X-Forwarded-For headers with a Pseudo IPv4 address. The advantage of the overwrite option is that, in most cases, it won't require any software changes. If you choose the overwrite option, we'll append a new header (Cf-Connecting-IPv6) in order to ensure you can still find the actual connecting IP address for debugging.

The Pseudo IPv4 service, like our IPv6 gateway, is available to all our customers, even those on the free plan. Our hope is that it will eliminate one of the last reasons for IPv6 holdouts. If you're already a CloudFlare customer, login to your account and make sure IPv6 is enabled on the CloudFlare Settings page. You can find the toggle for Pseudo IPv4 there as well. If you're not yet a CloudFlare customer, it only takes five minutes to sign up, so for most of the world there's still time to get your site on our network and join the modern web before World IPv6 Day 2014 comes to an end.

And, by the way, as of today, 4chan is now available over IPv6.

Addendum: Nitty Gritty Details

For the technical in the audience, here are some of the nitty gritty details on how we implemented Pseudo IPv4.

We chose to use the MD5 hashing algorithm. While MD5 has risks when it's used in cryptography, it is faster than many alternatives and has a acceptable uniformity over the hash space. Note that there is some risk of a collision (i.e., where two IPv6 addresses map to the same IPv4 pseudo address), however most abuse/fraud systems take this into account already in order to deal with NATs and other instances where multiple people may share a single IP.

At CloudFlare, we use a modified version of NGINX and wrote most of our request processing code in Lua, relying on the ngx_openresty app server. Here's the Lua code for the proof-of-concept prototype of the hashing algorithm:

function pseudo_ipv4(ipv6_top64)
   -- Grab bottom 32 bits from MD5 hash
   -- ngx.md5 does not suppress leading zeros so regexp will always match.       
   local hash = ngx.md5(ipv6_top64)
   local mod = ngx.re.match(hash,
                  "([a-f0-9]{2})([a-f0-9]{2})([a-f0-9]{2})([a-f0-9]{2})$",
                  "joi")

   -- Normalize first byte to fit in class E space. Done using subtraction as
   -- Lua doesn't have built-in bitwise operators
   local b1 = tonumber(mod[1], 16)
   if b1 >= 0xF0 then
      b1 = b1 - 0xF0
   end

   return string.format("%d.%d.%d.%d",
             b1 + 0xF0, tonumber(mod[2], 16), tonumber(mod[3], 16),
             tonumber(mod[4], 16))
end

Since every request needs to pass through the hashing algorithm, we wanted to make it as fast as possible. We set to work optimizing the Lua prototype for speed. Here's the result:

function pseudo_ipv4(ipv6_top64)
   -- Grab bottom 32 bits from MD5 hash
   local md5 = ngx.md5_bin(ipv6_top64)
   local b1, b2, b3, b4 = md5:byte(13, 16)

   -- Normalize first byte to fit in class E space
   b1 = bit.bor(0xF0, bit.band(b1, 0x0F))

   return string.format('%d.%d.%d.%d', b1, b2, b3, b4)
end

PS - If writing Lua code like the above looks fun, we're always hiring.

comments powered by Disqus