In April 2010, around the time we launched the CloudFlare private beta to a limited set of test users, Google announced that they were going to start taking site speed into account in their rankings. Awesome, we thought, not only was CloudFlare a performance and security service, but now we could promise search engine optimization (SEO) benefits as well!
It was unnerving when, in spite of making most sites at least 30% faster, our early tests showed CloudFlare wasn't helping much with search engine rankings. In fact, for a period of about three months last summer, we heard from numerous sites whose Google Webmaster Tools reports indicated Google's search crawler was accessing their pages less often. Not good.
This wasn't a problem we took lightly. We spent those three months investigating what was happening, looking both into our technical systems to make sure search engines weren't getting restricted and also working directly with the crawl and ranking teams at search engines. The answer, it turned out, was complex.
Search engines don't want to overburden sites when they crawl them. To avoid doing so, they cluster sites by IP address and, if they detect a problem on one site, they reduce the crawl rate for all sites sharing that IP. We discovered that when one site behind CloudFlare had a problem with its origin, the search engines would slow their crawl "velocity" (the rate at which a site gets crawled) for every site on the same IP. None of this became obvious until we had thousands of sites using our service and could work with the crawl teams to get to the bottom of what was going on.
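To make the clustering behavior concrete, here's a minimal sketch of how a crawler might throttle by shared IP. The class, rates, and penalty math below are our own illustrative assumptions, not any search engine's actual implementation:

```python
from collections import defaultdict

class CrawlScheduler:
    """Toy model: crawl rate is shared per IP, not per site."""

    def __init__(self, base_rate=10.0):
        self.base_rate = base_rate            # pages/minute for a healthy IP
        self.ip_penalty = defaultdict(float)  # error score per shared IP

    def record_fetch(self, ip, ok):
        # One failing origin raises the penalty for its whole IP cluster;
        # successful fetches slowly work the penalty back down.
        if ok:
            self.ip_penalty[ip] = max(0.0, self.ip_penalty[ip] - 0.1)
        else:
            self.ip_penalty[ip] += 1.0

    def crawl_rate(self, ip):
        # Every site sharing this IP is slowed by the cluster's penalty.
        return self.base_rate / (1.0 + self.ip_penalty[ip])
```

In this model, a single flaky origin on a shared IP cuts the crawl rate for all of its neighbors, which is exactly the symptom our users saw in Webmaster Tools.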
We did a couple of things. First, we invented a new technology that, when it detects a problem on a site, automatically changes the site's CloudFlare IP addresses to isolate it from other sites. (Think of it like quarantining a sick patient.) Second, we worked directly with the crawl teams at the big search engines to make them aware of how CloudFlare worked. All the search engines already had special rules in place for CDNs like Akamai. CloudFlare worked a bit differently, but fell into the same general category. With the cooperation of these search teams, we were able to get CloudFlare's IP ranges listed in a special category within the search crawlers. Not only does this keep sites on those ranges from being throttled to the performance of their worst-behaved neighbor, or incorrectly geo-tagged based on the DNS resolution IP, it also allows the search engines to crawl at their maximum velocity, since CloudFlare can handle the load without overburdening the origin.
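The quarantine idea can be sketched roughly like this. The IPs, hostnames, and class below are hypothetical, assuming a simple DNS-level mapping from hostname to serving IP:

```python
# Hypothetical sketch of the "quarantine" step: when a site's origin starts
# failing, move it off the shared IP so crawler throttling applies only to it.
# All addresses and names here are illustrative, not real CloudFlare internals.

SHARED_IPS = ["198.51.100.10", "198.51.100.11"]
QUARANTINE_IPS = iter(["198.51.100.200", "198.51.100.201"])

class EdgeDNS:
    def __init__(self):
        self.assignment = {}  # hostname -> serving IP

    def add_site(self, host):
        # By default, many sites share one of a small pool of IPs.
        self.assignment[host] = SHARED_IPS[hash(host) % len(SHARED_IPS)]

    def quarantine(self, host):
        # Origin trouble detected: give this site its own IP so any reduced
        # crawl velocity is isolated from its former neighbors.
        self.assignment[host] = next(QUARANTINE_IPS)
```

The healthy sites keep their shared IP, and only the quarantined site inherits whatever throttling the crawler applies.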
The results have been dramatic. As soon as we resolved the issues, the problems reported in Google Webmaster Tools immediately cleared up. Sites' page speed scores improved. And, while nothing is more important for SEO than having good, relevant content, we get notes from happy users almost every day about how changing nothing other than adding CloudFlare helped their search rankings. That's pretty cool. Incidentally, while we don't have any inside knowledge of how different search engines rank sites, our anecdotal evidence suggests the ability to handle a higher crawl velocity may be even more important to rankings than faster page load times.
In the end, what we thought would be a quick win turned out to be a surprisingly tricky problem. Today, we see a number of services following in CloudFlare's footsteps and claiming they'll help your SEO just because they can make your site a bit faster. It turns out that, while it seems like it should be straightforward, it's really not that easy. We're happy that we were able to solve the problem for our users and, in the end, deliver real SEO benefits. As other cloud service providers inevitably encounter similar issues, we'd be happy to share what we've learned to help them navigate these tricky issues as well.