Writing an API at the Edge with Workers and Cloud Firestore

We’re super stoked about bringing you Workers.dev, and we’re even more stoked at every opportunity we have to dogfood Workers. Using what we create keeps us tuned in to the developer experience, which takes a good deal of guesswork out of drawing our roadmaps.

Our goal with Workers.dev is to provide a way to deploy JavaScript code to our network of 165 data centers without requiring developers to register a domain with Cloudflare first. While we gear up for general availability, we wanted to provide users an opportunity to reserve their favorite subdomain in a fair and consistent way, so we built a system to allow visitors to reserve a subdomain where their Workers will live once Workers.dev is released. This is the story of how we wrote the system backing that submission process.

Requirements

Of course, we always want to use the best tool for the job, so designing the Workers that would back Workers.dev started with an inventory of constraints and user experience expectations:

Constraints

We want to limit reservations to one per email address. It’s no fun if someone writes a bot to claim every good Workers subdomain in ten seconds; they wouldn’t be able to claim them without creating a Cloudflare account for every single one anyway!
We only want to allow a single reservation per subdomain to avoid awkward “sorry” messages later on; so, we need a reliable uniqueness constraint within the datastore on write.
We want to blocklist a few key subdomains, and be able to detect and blocklist more as we continue on.

User Flow

At a high, procedural level, our little system needed to handle the following roadmap:

1. Visitor submits a form with their desired subdomain and email address.
2. The form is sent off to a worker, whose job is to:
   a. Sanitize the inputs (make sure that subdomain is valid! The email too!)
   b. Check Cloud Firestore for existing reservations matching either the subdomain or email address
   c. Add the user’s email address to Cloud Firestore and shoot off an email with an auto-generated link
   d. Return the results to the landing page.
3. Display feedback to the visitor:
   a. If the reservation cannot be made (subdomain or email address has already been used), display an error and clear the form.
   b. If the reservation can be made, indicate success and direct the visitor to their email.
4. Visitor receives verification email, clicks through to Workers.dev again.
5. The page slurps in data from the URL and shoots off a request to another worker, whose job is to:
   a. Retrieve the email address associated with the link
   b. Check again that the email address is not already associated with a subdomain
   c. Attempt to create a new reservation. If this request comes back with a 409 error, the subdomain is already reserved
   d. Return the results to the landing page.
6. Display feedback to the visitor:
   a. If the reservation cannot be made (subdomain or email address has already been used), display an error and clear the form.
   b. If the reservation was successful, display a message and celebrate!
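
The "sanitize the inputs" step in the flow above could look something like the following. This is only a minimal sketch, not the code that ran in production: the blocklist entries, the regular expressions, and the `sanitizeReservation` name are all assumptions made for illustration, and lowercasing the email address is a simplification that makes the one-reservation-per-email check easier.

```javascript
// Hypothetical blocklist; the real list is not shown in this post.
const BLOCKLIST = new Set(['www', 'api', 'admin'])

// A DNS label: 1-63 characters, lowercase letters, digits and hyphens,
// with no leading or trailing hyphen.
const SUBDOMAIN_PATTERN = /^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/

// Deliberately loose email check: something@something.tld
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/

// Normalize the form fields and collect any validation errors.
function sanitizeReservation(input) {
  const subdomain = String(input.subdomain || '').trim().toLowerCase()
  const email = String(input.email || '').trim().toLowerCase()
  const errors = []

  if (!SUBDOMAIN_PATTERN.test(subdomain)) {
    errors.push('invalid subdomain')
  } else if (BLOCKLIST.has(subdomain)) {
    errors.push('subdomain unavailable')
  }
  if (!EMAIL_PATTERN.test(email)) errors.push('invalid email')

  return { ok: errors.length === 0, subdomain, email, errors }
}
```

A Worker would run a check like this before touching the datastore, so that malformed or blocklisted names never cost a round trip to Cloud Firestore.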

Design

I was tasked with handling the back end operations. A few characteristics make this system an ideal fit to use a Worker. For one, the service is ephemeral; we are only offering reservations for a limited period before official registration. This feature wasn’t going to be permanent, didn’t require access to the existing database, and didn’t depend on another service running on our private network. Also, being able to develop and deploy independently of our larger API was a pretty big bonus, so it seemed like a great job for a Worker.

Workers

Workers, to me, embody the Single Responsibility Principle. I believe they should be as small as possible while still encompassing an atomic operation. This kind of composability makes them well-suited as autonomous components of both larger systems and ad-hoc projects like this one.

The first thing that jumped out at me was a clear split in the logic of our roadmap: verifying emails and reserving subdomains. For those two distinct operations we’d use two distinct Worker scripts on two distinct routes. Another valid implementation could use a single Worker with branching logic based on the request that calls it, but keeping the two steps independent allows more granular management. At the same time, a lot of the logic is shared, specifically the need to hit the datastore, which calls for a shared client module. Serverless Framework made this not only possible, but relatively painless with minimal configuration.
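
For comparison, the single-Worker alternative with branching logic might be sketched like this. The paths and handler names are hypothetical, and the handlers return plain objects rather than real `Response` objects so the sketch stays runnable outside the Workers runtime.

```javascript
// Hypothetical handlers; each would hold the logic of one of the two steps.
async function handleVerifyEmail(request) {
  return { status: 200, step: 'verify-email' }
}

async function handleReserveSubdomain(request) {
  return { status: 200, step: 'reserve-subdomain' }
}

// Hypothetical paths, not the routes used by the real system.
const handlers = new Map([
  ['/verify', handleVerifyEmail],
  ['/reserve', handleReserveSubdomain],
])

// Pick a handler from the request path; unknown paths get a 404.
async function route(request) {
  const { pathname } = new URL(request.url)
  const handler = handlers.get(pathname)
  if (!handler) return { status: 404, step: null }
  return handler(request)
}
```

With two separate scripts, this routing table disappears: each script is bound to its own route, and either one can be redeployed without touching the other.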

Cloud Firestore

For persistence, the project required a datastore that was both performant and consistent to prevent double-reserving the same subdomain. Workers KV, while an excellent store for read-heavy data, is eventually consistent (i.e. updates to the store are not immediately available to every node on the edge). Enter Google Cloud Platform’s Cloud Firestore, which just went GA as we were developing this project.

Cloud Firestore is a flexible, scalable database for mobile, web, and server development from Firebase and Google Cloud Platform. Like Firebase Realtime Database, it keeps your data in sync across client apps through real time listeners and offers offline support for mobile and web so you can build responsive apps that work regardless of network latency or Internet connectivity.

Using Cloud Firestore gave us a relatively simple solution to our data storage problem: it is immediately consistent, which allows us to avoid reservation collisions. It can be accessed via a REST API with a simple JWT authentication for a service account, meaning our worker can interface with it using the Fetch API. Once I had a client written out in JavaScript, it could also be used from the CLI with node-fetch for a few simple data-gathering scripts.

Building the API

While Cloud Firestore is excellent for handling requests from many users, in this case we want to restrict access to just our worker instances, and for that we need to create a Service Account. You can do this from the IAM console for your Google Cloud project. We’ll build one with the role “Cloud Datastore User”, which is the role recommended for our use case.

I used a service account for making API calls from a Worker

Adding a name and description helps track users of the project

'Cloud Datastore User' is the appropriate Role for Service Accounts

I then saved my key as a JSON file for use in building JWTs for authenticating requests

I click Create key to save my Service Account Configuration as a JSON file. This is used in the next step to build out the JWT for authentication requests to the Firestore API.

Firestore Authentication

Because we are planning on running this client in the Workers runtime and we don’t have access to either the DOM or the Node runtime, we’ll have to rely on the REST API rather than either of the JavaScript client libraries. That’s okay though; the REST API is quite robust, and authentication is possible using just a JWT, rather than the full OAuth 2.0 handshake procedure.

Generating JWTs

To keep the configuration out of the source code, I wrote a Node script for assembling the various pieces of the configuration into a JSON blob.

const fs = require('fs')
const path = require('path')
const YAML = require('yaml-js')

// Service Definition for Cloud Firestore can be found here:
// https://github.com/googleapis/googleapis/blob/master/google/firestore/firestore_v1.yaml
// Service Account Config should be the JSON file you saved in the last step
let [serviceDefinitionPath, serviceAccountConfigPath] = process.argv.slice(2)

// Read the YAML as a string rather than a Buffer before parsing
let serviceDefinition = YAML.load(fs.readFileSync(serviceDefinitionPath, 'utf8'))
let serviceAccountConfig = require(path.resolve(serviceAccountConfigPath))

// JWT spec at https://developers.google.com/identity/protocols/OAuth2ServiceAccount#jwt-auth
let payload = {
  aud: `https://${serviceDefinition.name}/${serviceDefinition.apis[0].name}`,
  iss: serviceAccountConfig.client_email,
  sub: serviceAccountConfig.client_email,
}

let privateKey = serviceAccountConfig.private_key
let privateKeyID = serviceAccountConfig.private_key_id
let algorithm = 'RS256'
let url = `https://firestore.googleapis.com/v1beta1/projects/${serviceAccountConfig.project_id}/databases/(default)/documents`

// The object we want to send to KV
let FIREBASE_JWT_CONFIG = {
  payload,
  privateKey,
  privateKeyID,
  algorithm,
  url,
}

// Write out to JSON file to send to KV
const metadataFilename = './config/metadata.json'
fs.writeFileSync(metadataFilename, JSON.stringify(FIREBASE_JWT_CONFIG))

console.log('Worker metadata file created at', metadataFilename)

I leaned on node-jose for JWT generation, and wrapped it into a small token function that adds the required timestamps to the payload, and generates a JWT:

import jose from 'node-jose';

/**
 * Generate a Google Cloud API JWT
 *
 * @param config - the JWT configuration
 */
export default async function generateJWT(config) {
  // Timestamps are expressed in whole seconds since the epoch
  const iat = Math.floor(Date.now() / 1000);
  let payload = {
    ...config.payload,
    iat: iat,
    exp: iat + 3600
  };

  const signingKey = await jose.JWK.asKey(
    config.privateKey.replace(/\\n/g, '\n'),
    'pem'
  );

  const sign = await jose.JWS.createSign(
    { fields: { alg: config.algorithm, kid: config.privateKeyID } },
    signingKey
  )
    .update(JSON.stringify(payload), 'utf8')
    .final();

  const signature = sign.signatures[0];
  return [signature.protected, sign.payload, signature.signature].join('.');
}
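
When debugging locally in Node, it can help to look inside a generated token before sending it to Google. The following is a small inspection sketch, not part of the original client: `decodeSegment` and `inspectJWT` are hypothetical helper names, and the signature is deliberately left unverified.

```javascript
// Decode one base64url segment of a compact JWT into an object.
function decodeSegment(segment) {
  const base64 = segment.replace(/-/g, '+').replace(/_/g, '/')
  return JSON.parse(Buffer.from(base64, 'base64').toString('utf8'))
}

// Split a compact JWT (header.payload.signature) and decode the first
// two parts. The signature is NOT verified; this is for inspection only.
function inspectJWT(token) {
  const [header, payload, signature] = token.split('.')
  return {
    header: decodeSegment(header),
    payload: decodeSegment(payload),
    signature,
  }
}
```

For a token produced by the function above, the decoded header should carry the `alg` and `kid` fields from the configuration, and the payload should show `exp` exactly 3600 seconds after `iat`.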

Adding JWT Configuration to KV

We needed somewhere to keep this configuration that is easily accessible by our Workers but not simply committed in our source code. KV works well in this situation, so I created a namespace and added the JSON-stringified value under the key config.

const fetch = require('node-fetch')

const {
  CLOUDFLARE_AUTH_KEY,
  CLOUDFLARE_AUTH_EMAIL,
  CLOUDFLARE_ACCOUNT_ID,
} = process.env

const JWT_CONFIG_NAMESPACE = 'gcpAuth'

// URL and headers for KV API calls https://api.cloudflare.com/#workers-kv-namespace-properties
const kvURI = `https://api.cloudflare.com/client/v4/accounts/${CLOUDFLARE_ACCOUNT_ID}/storage/kv/namespaces`
const headers = {
  'X-Auth-Email': CLOUDFLARE_AUTH_EMAIL,
  'X-Auth-Key': CLOUDFLARE_AUTH_KEY,
  'Content-Type': 'application/json',
}

async function setUpKV() {
  // Add a KV namespace
  // note: if you are using serverless framework, you can skip this step and set
  // kv namespace bindings in serverless.yaml
  // if not, you'll want to add logic here to get the list of namespaces
  // and update only if the namespace you want is not already set.
  let namespaceId = await fetch(kvURI, {
    method: 'POST',
    headers,
    body: JSON.stringify({ title: JWT_CONFIG_NAMESPACE })
  }).then(response => response.json()).then(data => {
    if (!data.success) throw new Error(JSON.stringify(data.errors))

    return data.result.id
  })

  // set the config variable to the json blob with our jwt settings
  await fetch(`${kvURI}/${namespaceId}/values/config`, {
    method: 'PUT',
    headers,
    body: JSON.stringify(require('../config/metadata.json'))
  }).then(response => response.json()).then(data => {
    if (!data.success) {
      throw new Error(JSON.stringify(data.errors))
    }
  })
}

setUpKV()
  .catch(console.error)
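
The comment in the script above suggests listing existing namespaces and creating one only when it is missing. A minimal sketch of that logic follows, under a few assumptions: `findOrCreateNamespace` is a hypothetical name, `fetchFn` is injected so the function can run against a stub (in the script it would be node-fetch), and only the first page of the namespace list is checked.

```javascript
// Look up a KV namespace by title and create it only if it is missing.
// Assumes the namespace, if present, appears on the first page of results.
async function findOrCreateNamespace(fetchFn, kvURI, headers, title) {
  // GET on the namespaces endpoint lists existing namespaces
  const list = await fetchFn(kvURI, { headers }).then(r => r.json())
  if (!list.success) throw new Error(JSON.stringify(list.errors))

  const existing = list.result.find(ns => ns.title === title)
  if (existing) return existing.id

  // Not found, so create it with the same POST used in the script above
  const created = await fetchFn(kvURI, {
    method: 'POST',
    headers,
    body: JSON.stringify({ title }),
  }).then(r => r.json())
  if (!created.success) throw new Error(JSON.stringify(created.errors))

  return created.result.id
}
```

Swapping this in for the unconditional POST would make the setup script safe to run more than once.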

A note: this method for handling secrets is better than committing them in scripts, but still not the greatest. We’re working on improving support for secrets.

Querying Cloud Firestore

After that, I spent a significant amount of time fiddling with Google’s API Explorer to figure out precisely how to make the requests I needed and parse the responses appropriately. The docs are pretty comprehensive, but the API Explorer is key to navigating requests against your own datastore.

I built a small client that can create and retrieve documents from GCP. It is bare bones; you can add methods that follow this pattern for updating and other operations as you wish. The Worker can then pull the config out of KV and use it to initialize the client.

// generateJWT is the default export of the module shown earlier
import generateJWT from './generateJWT'

export class GCPClient {
  constructor(config) {
    this.url = config.url
    this.config = config
  }

  async authHeaders() {
    let token = await generateJWT(this.config)
    return { Authorization: `Bearer ${token}` }
  }

  async getDocument(collection, documentId) {
    let headers = await this.authHeaders()
    return fetch(`${this.url}/${collection}/${documentId}`, {
      headers,
    })
  }

  async postDocument(collection, doc, documentId = null) {
    let headers = await this.authHeaders()
    let fields = Object.entries(doc).reduce((acc, [k, v]) => ({ ...acc, [k]: { stringValue: v } }), {})
    let qs = ''
    if (documentId) {
      qs = `?documentId=${encodeURIComponent(documentId)}`
    }
    return fetch(`${this.url}/${collection}${qs}`, {
      headers,
      method: 'POST',
      body: JSON.stringify({
        fields,
      })
    })
  }

  async listDocuments(collection, nextPageToken) {
    let headers = await this.authHeaders()
    let qs = new URLSearchParams({
      fields: 'documents(fields,name),nextPageToken',
    })
    if (nextPageToken) qs.append('pageToken', nextPageToken)
    return fetch(`${this.url}/${collection}?${qs.toString()}`, {
      method: 'GET',
      headers,
    })
  }
}

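// Pull the JWT config out of KV (bound as `firebaseConfig`) and build a client.
// The config already carries the Firestore base URL in `config.url`.
export async function buildGCPClient() {
  let config = await firebaseConfig.get('config', 'json')
  return new GCPClient(config)
}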
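
To tie the client back to the uniqueness requirement: using the subdomain as the document ID means a second create for the same subdomain is rejected with a 409. A hedged sketch of how a Worker might use `postDocument` for that is below. The `reservations` collection name, the `reserveSubdomain` function, and the shape of its return value are assumptions for illustration.

```javascript
// Collection name is an assumption for this sketch.
const RESERVATIONS = 'reservations'

// Try to create a reservation keyed by subdomain. `client` is expected to
// expose postDocument(collection, doc, documentId) returning a fetch Response.
async function reserveSubdomain(client, subdomain, email) {
  const response = await client.postDocument(RESERVATIONS, { email }, subdomain)

  if (response.status === 409) {
    // The create was rejected because a document with this ID already exists
    return { reserved: false, reason: 'subdomain already reserved' }
  }
  if (!response.ok) {
    throw new Error(`Firestore request failed with status ${response.status}`)
  }
  return { reserved: true, subdomain }
}
```

Because the datastore enforces the constraint on write, there is no gap between "check" and "create" for two competing requests to slip through.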

I can also use this same client locally to run queries against the store. In that case, rather than grabbing the config from KV, I construct the client using the configuration file I created above locally. I also bind `node-fetch` to `global.fetch` in these scripts.

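import fs from 'fs'
import nodeFetch from 'node-fetch'
import { GCPClient } from './lib/GCPClient'

// Name of the collection holding reservations (adjust to match your datastore)
const RESERVATIONS = 'reservations'

// The client relies on a global fetch, so bind node-fetch to it
global.fetch = nodeFetch

// Use the listDocuments endpoint to count all current reservations
async function getReservations(client) {
  let nextPageToken
  let count = 0
  do {
    let page = await client.listDocuments(RESERVATIONS, nextPageToken).then(response => response.json())
    // An empty page omits the `documents` key entirely
    count += (page.documents || []).length
    nextPageToken = page.nextPageToken
  }
  while (nextPageToken)

  return count
}

// Parse the metadata file written by the configuration script
let config = JSON.parse(fs.readFileSync('./config/metadata.json', 'utf8'))
let client = new GCPClient(config)

getReservations(client)
    .then(console.log)
    .catch(console.error)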
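
Responses from the Firestore REST API wrap every value in a type tag, so local scripts usually need a small step to get back to plain objects. The helper below is a sketch that only handles `stringValue` fields, which matches how `postDocument` in the client writes every value as a string. `flattenDocument` is a hypothetical name, not part of the original client.

```javascript
// Turn a Firestore REST document into a plain { id, data } pair.
// Only stringValue fields are handled, matching postDocument in the client.
function flattenDocument(doc) {
  // `name` is the full resource path; its last segment is the document ID
  const id = doc.name.split('/').pop()
  const data = {}
  for (const [key, value] of Object.entries(doc.fields || {})) {
    data[key] = value.stringValue
  }
  return { id, data }
}
```

Mapping this over the `documents` array of each page would give a list of reservations keyed by subdomain, which is handy for ad-hoc reporting.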

Conclusions

Specifically for this project, Workers fit the use-case really well for a few reasons:

We only intend to use this during the run-up to registration, so being able to re-deploy a function completely independent of the main configuration API is incredibly freeing, especially for smaller tweaks.
The lessons learned during this prototyping experience will prove extremely valuable as we implement the more permanent registration system.
Finally, even though our datastore is effectively centralized, using Workers means that all the requests to various APIs - our email service, logging service, and of course GCP - are made from the Edge. Running at the edge leverages our network and keeps our auth data where we want it, while using Cloud Firestore guarantees immediate consistency and performant querying for our Workers running around the world.

Building out this API using Workers was an eye-opening experience. We love any opportunity to use our own products, keeping us in touch with the experience and guiding our roadmap for future development. We’re also extremely excited to see what all of you do on Workers.dev!

Interested in deploying a Cloudflare Worker without setting up a domain on Cloudflare? We’re making it easier to get started building serverless applications with custom subdomains on workers.dev. If you’re already a Cloudflare customer, you can add Workers to your existing website here.


The Cloudflare Blog