Today, we’re excited to announce HTMLRewriter beta — a streaming HTML parser with an easy to use selector based JavaScript API for DOM manipulation, available in the Cloudflare Workers runtime.

For those of you who are unfamiliar, Cloudflare Workers is a lightweight serverless platform that allows developers to leverage Cloudflare’s network to augment existing applications or create entirely new ones without configuring or maintaining infrastructure.

Static Sites to Dynamic Applications

On Friday we announced Workers Sites: a static site deployment workflow built into the Wrangler CLI tool. Now, paired with the HTML Rewriter API, you can perform DOM transformations on top of your static HTML, right on the Cloudflare edge.

You could previously do this by ingesting the entire body of the response into the Worker, however, that method was prone to introducing a few issues. First, parsing a large file was bound to run into memory or CPU limits. Additionally, it would impact your TTFB as the body could no longer be streamed, and the browser would be prevented from doing any speculative parsing to load subsequent assets.

HTMLRewriter was the missing piece to having your application fully live on the edge – soup to nuts. You can build your API on Cloudflare Workers as a serverless function, have the static elements of your frontend hosted on Workers Sites, and dynamically tie them together using the HTMLRewriter API.

Enter JAMStack

You may be thinking “wait!”, JavaScript, serverless APIs… this is starting to sound a little familiar. It sounded familiar to us too.

Is this JAMStack?

First, let’s answer the question — what is JAMStack? JAMStack is a term coined by Mathias Biilmann, that stands for JavaScript, APIs, and Markup. JAMStack applications are intended to be very easy to scale since they rely on simplified static site deployment. They are also intended to simplify the web development workflow, especially for frontend developers, by bringing data manipulation and rendering that traditionally happened on the backend to the front-end and interacting with the backend only via API calls.

So to that extent, yes, this is JAMStack. However, HTMLRewriter takes this idea one step further.

The Edge: Not Quite Client, Not Quite Server

Most JAMStack applications rely on client-side calls to third-party APIs, where the rendering can be handled client-side using JavaScript, allowing front end developers to work with toolchains and languages they are already familiar with. However, this means with every page load the client has to go to the origin, wait for HTML and JS, and then after being parsed and loaded make multiple calls to APIs. Additionally, all of this happens on client-side devices which are inevitably less powerful machines than servers and have potentially flaky last-mile connections.

With HTMLRewriter in Workers, you can make those API calls from the edge, where failures are significantly less likely than on client device connections, and results can often be cached. Better yet, you can write the APIs themselves in Workers and can incorporate the results directly into the HTML — all on the same powerful edge machine. Using these machines to perform “edge-side rendering” with HTMLRewriter always happens as close as possible to your end users, without happening on the device itself, and it eliminates the latency of traveling all the way to the origin.

What does the HTMLRewriter API look like?

The HTMLRewriter class is a jQuery-like experience directly inside of your Workers application, allowing developers to build deeply functional applications, leaning on a powerful JavaScript API to parse and transform HTML.

Below is an example of how you can use the HTMLRewriter to rewrite links on a webpage from HTTP to HTTPS.

const REWRITER = new HTMLRewriter()
    .on('a.avatar', { element:  e => rewriteUrl(e, 'href') })
    .on('img', { element: e => rewriteUrl(e, 'src') });

async function handleRequest(req) {
  const res = await fetch(req);
  return REWRITER.transform(res);
}

In the example above, we create a new instance of HTMLRewriter, and use the selector to find all instances of a and img elements, and call the rewriteURL function on the href and src properties respectively.

Internationalization and localization tutorial: If you’d like to take things further, we have a full tutorial on how to make your application i18n friendly using HTMLRewriter.

Getting started

If you’re already using Cloudflare Workers, you can simply get started with the HTMLRewriter by consulting our documentation (no sign up or anything else required!). If you’re new to Cloudflare Workers, we recommend starting out by signing up here.

If you’re interested in the nitty, gritty details of how the HTMLRewriter works, and learning more than you’ve ever wanted to know about parsing the DOM, stay tuned. We’re excited to share the details with you in a future post.

One last thing, you are not limited to Workers Sites only. Since Cloudflare Workers can be deployed as a proxy in front of any application you can use the HTMLRewriter as an elegant way to augment your existing site, and easily add dynamic elements, regardless of backend.

We love to hear from you!

We’re always iterating and working to improve our product based on customer feedback! Please help us out by filling out our survey about your experience.


Have you built something interesting with Workers? Let us know @CloudflareDev!