
Improving RubyDocs with Cloudflare Workers and Workers KV

by Guest Author.

The following is a guest post from Manuel Meurer, a Berlin-based web developer, entrepreneur, and Ruby on Rails enthusiast. In 2010, he founded Kraut Computing as a one-man web dev shop, and in 2015 he launched Uplink, a network for IT experts in Germany.

RubyDocs is an open-source service that generates and hosts “fancy docs for any Ruby project”, most notably for the Ruby language itself and for Rails, the most popular Ruby framework. The nifty thing about it is that the docs can be generated for any version of a project. Say you’re working on an old Rails app that still uses version 3.2.22 (released June 16, 2015): having the docs for that specific version is a real help, since many of the methods, classes, and concepts of the current Rails version (5.2.1 at the time of writing) don’t exist in the old one.

Scratching an itch

I built RubyDocs back in 2013 to scratch my own itch: a few similar services that I had used over the years had disappeared or were no longer updated regularly. After the initial work to get RubyDocs up and running, I kept improving small things over the years, such as updating dependencies and adding new projects submitted by users. But by and large, the site was (and is) running on autopilot, updating the list of versions for each project automatically from GitHub tags and generating new docs as users request them.

One thing I had always wanted to do was to move the hosted docs from a subdomain (docs.rubydocs.org) to a subpath on the main domain (e.g., rubydocs.org/docs). I had put them on the subdomain to be able to use a CDN with long expiration times, since the docs are mostly static HTML and CSS with a bit of JavaScript sprinkled in. But for SEO reasons (AFAIK it’s still better to have everything on the main domain), and for a more coherent experience when using the site, I wanted everything on one domain. I could never figure out how to run the RubyDocs app itself (built with Rails, of course) on rubydocs.org and still get all the advantages of a CDN for a subpath…

Enter Cloudflare

Fast forward to September 2017, when I read about Cloudflare Workers for the first time. I was already a heavy user of Cloudflare for their DNS, CDN, and DDoS mitigation, and was always astonished by the number of high-quality services they were offering for free. And now they had basically added a serverless platform on top of that for $5 per month? You really have to admire their dedication to making their stuff available to as many people as possible for as low a price as possible.

For a few months, I kept thinking about what I could use Workers for, until it hit me: they could be the perfect tool to proxy requests from a subpath to a subdomain! I wouldn’t have to change the RubyDocs server/CDN setup at all, just add a Worker that does the proxying and a Page Rule to redirect all traffic from the subdomain to the new subpath. I got in touch with Cloudflare support to confirm that this was indeed possible (and a proper use of their Workers), and since RubyDocs is open-source, they even offered to sponsor the Workers!
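To make the proxying idea concrete, here is a minimal sketch of the URL mapping such a Worker needs: take a request for the subpath and work out where the same file lives on the host that actually serves it. The helper name `toOriginUrl`, the `/docs/` prefix, and the origin host used in the usage note are illustrative assumptions, not the values RubyDocs ended up using.

```javascript
// Hypothetical helper: map a public subpath URL to the matching URL on the
// host that serves the static docs. `prefix` and `originHost` are illustrative
// parameters, not values taken from the RubyDocs setup.
function toOriginUrl(requestUrl, prefix, originHost) {
  const url = new URL(requestUrl);
  if (!url.pathname.startsWith(prefix)) {
    return null; // Not a docs request; the caller should pass it through.
  }
  // Drop the prefix, keep the rest of the path and the query string.
  const rest = url.pathname.slice(prefix.length);
  url.hostname = originHost;
  url.pathname = '/' + rest;
  return url.toString();
}
```

Under these assumptions, `toOriginUrl('https://rubydocs.org/docs/rails-5.2.1/index.html?q=1', '/docs/', 'docs.rubydocs.org')` yields `https://docs.rubydocs.org/rails-5.2.1/index.html?q=1`, and a URL outside the prefix yields `null`. A Worker would pass the mapped URL to `fetch` and return that response, so the visitor only ever sees the main domain.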

Let’s get to work!

While I was working on the Worker (no pun intended), an issue popped up in the RubyDocs GitHub repo: I had inadvertently broken a few URLs with a faulty regex. That was quickly fixed, since Worker scripts can be edited in the Cloudflare backend and the live site is updated within seconds of saving. But the author of the issue also mentioned that someone had apparently created a DuckDuckGo bang for RubyDocs. Sweet, I didn’t even know they existed!

For this bang to really be useful, there had to be a URL that always points to the latest version of a project’s docs, i.e. something like rubydocs.org/d/ruby-latest/ (which now works), and that updates automatically when a new version is released. Well, I thought to myself, if that isn’t another perfect use case for a Worker! But wait, how does the Worker know which version is the latest? We could include the data in the Worker script and update it periodically, but as the number of projects on RubyDocs grows, the script would grow as well. Probably not to an unmanageable size, but it still didn’t feel like a clean solution. The Worker could also make a quick subrequest to ask the main RubyDocs Rails app for the latest version when a request is processed, but that would mean setting up an API and monitoring the performance of the endpoint, and it would most likely slow down these ‘latest’ requests severely.

Enter Cloudflare, again

And as if someone at Cloudflare had been waiting for me to ponder this problem, they launched Workers KV, a key-value store that can be written to via the Cloudflare API and read from within a Worker. I was dumbfounded by the coincidence. It was very obviously the best way to solve my problem: store the latest version of each project from the RubyDocs Rails app every time a new version is detected, and read it from the Worker script when a ‘latest’ request comes in.
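The write side of this could look roughly like the sketch below, which only builds the HTTP request for storing one value in Workers KV through the Cloudflare REST API. The real RubyDocs app would do the equivalent from Ruby; JavaScript is used here only to match the Worker script. The function name `buildLatestWrite`, the bearer-token authentication, the key scheme (project name as key), and the slug format of the value are all assumptions, and the account ID, namespace ID, and token are placeholders.

```javascript
// Sketch: build the request that stores a project's latest docs slug in
// Workers KV via the Cloudflare REST API. Nothing is sent here; the caller
// passes the result to fetch(). All IDs and the token are placeholders.
function buildLatestWrite({ accountId, namespaceId, apiToken, project, slug }) {
  const url =
    'https://api.cloudflare.com/client/v4/accounts/' + encodeURIComponent(accountId) +
    '/storage/kv/namespaces/' + encodeURIComponent(namespaceId) +
    '/values/' + encodeURIComponent(project);
  return {
    url,
    options: {
      method: 'PUT',
      headers: {
        // Assumes an API token; other Cloudflare auth schemes use different headers.
        Authorization: 'Bearer ' + apiToken,
        'Content-Type': 'text/plain',
      },
      // The value is the string the Worker will substitute for "<project>-latest".
      body: slug,
    },
  };
}
```

A caller would run `const { url, options } = buildLatestWrite({ ... })` followed by `await fetch(url, options)` whenever a new version is detected. Keeping the request construction separate from the network call makes it easy to check the URL and headers without touching the API.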

Long story short: after a bit of fiddling (mostly due to my inexperience with JavaScript), everything is working smoothly. Here is the resulting Worker script (also on GitHub):

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  // Capture everything after /d/ as the doc path, plus an optional query string.
  const match = request.url.match(/\/d\/([^?]+)(\?.+)?/);
  let fetchable;

  if (match) {
    let doc = match[1];
    const query = match[2] || '';

    // Redirect to latest if necessary.
    const latestMatch = doc.match(/^([^/]+)-latest/);
    if (latestMatch) {
      const latest = await LATEST.get(latestMatch[1]);
      const newUrl = request.url.replace(/[^/]+-latest/, latest);
      return Response.redirect(newUrl, 302);
    }

    // Redirect to URL with trailing slash if necessary.
    if (!doc.includes('/')) {
      const newUrl = request.url.replace(doc, doc + '/');
      return Response.redirect(newUrl, 301);
    }

    // Serve index.html for directory URLs.
    if (doc.endsWith('/')) {
      doc += 'index.html';
    }
    fetchable = `http://d3eo0xoa109f6x.cloudfront.net/${doc}${query}`;
  } else {
    // Not a docs URL: pass the request through unchanged.
    fetchable = request;
  }

  const response = await fetch(fetchable);
  return response;
}

NOTE: LATEST is the name of the author's KV namespace and is not a default for Workers KV.
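Since a faulty regex already broke a few URLs once, it can help to exercise the routing decisions outside of Cloudflare. The sketch below restates the script's logic as a plain function that returns a description of what to do instead of doing it. It is not part of the original script: the name `route`, the `Map` standing in for the `LATEST` namespace, the `origin` parameter, and the `notFound` branch for unknown projects are all additions made for illustration.

```javascript
// Sketch: the Worker's routing decisions as a pure, synchronous function.
// `latestByProject` is a Map standing in for the LATEST KV namespace and
// `origin` is a placeholder for the host that serves the static docs.
function route(requestUrl, latestByProject, origin) {
  const match = requestUrl.match(/\/d\/([^?]+)(\?.+)?/);
  if (!match) {
    return { action: 'passthrough' };
  }
  const doc = match[1];
  const query = match[2] || '';

  const latestMatch = doc.match(/^([^/]+)-latest/);
  if (latestMatch) {
    const latest = latestByProject.get(latestMatch[1]);
    if (latest == null) {
      // Assumption of this sketch: unknown projects are reported instead of
      // being redirected to a URL containing "null" or "undefined".
      return { action: 'notFound' };
    }
    const location = requestUrl.replace(/[^/]+-latest/, latest);
    return { action: 'redirect', status: 302, location };
  }

  if (!doc.includes('/')) {
    const location = requestUrl.replace(doc, doc + '/');
    return { action: 'redirect', status: 301, location };
  }

  const path = doc.endsWith('/') ? doc + 'index.html' : doc;
  return { action: 'fetch', url: origin + '/' + path + query };
}
```

With a map such as `new Map([['ruby', 'ruby-2.5.1']])` (the slug is made up), a request for `/d/ruby-latest/String.html` produces a 302 to `/d/ruby-2.5.1/String.html`, and `/d/rails-5.2.1` produces a 301 to the same URL with a trailing slash. A handful of such checks is enough to catch a broken pattern before the script is saved.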

I have submitted a request to DuckDuckGo to use the new ‘latest’ URLs for the !rubydocs and !rb bangs, but so far they still forward to an older version.

Many thanks to Cloudflare for supporting RubyDocs and, more importantly, building a better Internet for all of us!
