Last year, we launched HTMLRewriter for Cloudflare Workers, which enables developers to make streaming changes to HTML on the edge. Unlike a traditional DOM parser, which loads the entire HTML document into memory, HTMLRewriter is built on a streaming parser written in Rust. Today, we’re announcing support for asynchronous handlers in HTMLRewriter. Now you can perform asynchronous tasks based on the content of the HTML document: from prefetching fonts and image assets to fetching user-specific content from a CMS.
How can I use HTMLRewriter?
We designed HTMLRewriter to have a jQuery-like experience. First, you define a handler, then you assign it to a CSS selector; Workers does the rest for you. You can look at our new and improved documentation to see our list of supported selectors, which now includes nth-child selectors. The example below changes the alternative text of every image that is the second child of its parent element.
async function editHtml(request) {
  return new HTMLRewriter()
    .on("img:nth-child(2)", new ElementHandler())
    .transform(await fetch(request))
}

class ElementHandler {
  element(e) {
    e.setAttribute("alt", "A very interesting image")
  }
}
Since these changes are applied using streams, we maintain a low TTFB (time to first byte) and users never know the HTML was transformed. If you’re interested in how we’re able to accomplish this technically, you can read our blog post about HTML parsing.
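Note that nth-child matches on an element's position among its siblings, not on its position in the document as a whole. If you want to act on every second image across the whole page, one option is to keep a counter inside the handler. The sketch below assumes that the handler is called for matching elements in document order and that a fresh handler instance is created for each request; the names `EverySecondImage` and `rewriteEverySecondImage` are made up for this illustration.

```javascript
// Hypothetical handler: counts <img> elements as they stream past.
class EverySecondImage {
  constructor() {
    this.seen = 0 // number of matching elements encountered so far
  }

  element(e) {
    this.seen += 1
    // Only touch the 2nd, 4th, 6th, ... image
    if (this.seen % 2 === 0) {
      e.setAttribute("alt", "A very interesting image")
    }
  }
}

// Create a new handler per response so the counter starts at zero.
// (HTMLRewriter is only available inside the Workers runtime.)
function rewriteEverySecondImage(response) {
  return new HTMLRewriter()
    .on("img", new EverySecondImage())
    .transform(response)
}
```

Because the counter lives on the handler instance, reusing one instance across requests would carry the count over from one page to the next.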
What’s new with HTMLRewriter?
Now you can define an async handler, which can run any code that uses await. This means you can inject HTML dynamically, based on the contents of the document, without knowing in advance what it contains. This allows you to customize HTML based on a particular user, a feature flag, or even an integration with a CMS.
class UserCustomizer {
  // Remember to add the `async` keyword to the handler method
  async element(e) {
    const user = await fetch(`https://my.api.com/user/${e.getAttribute("user-id")}/online`)
    if (user.ok) {
      // Add the user’s name to the element
      e.setAttribute("user-name", await user.text())
    } else {
      // Remove the element, since this user is not online
      e.remove()
    }
  }
}
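The same pattern can be applied to feature flags. The sketch below is one possible shape, not a prescribed one: the `FlagCustomizer` class, the `data-flag` attribute, and the `lookupFlag` function are all hypothetical. Passing the lookup in through the constructor keeps the handler free of hard-coded URLs and lets you exercise it with a plain mock object outside the Workers runtime.

```javascript
// Hypothetical handler: removes elements whose feature flag is off.
class FlagCustomizer {
  // lookupFlag: async (name) => boolean, supplied by the caller
  constructor(lookupFlag) {
    this.lookupFlag = lookupFlag
  }

  async element(e) {
    const name = e.getAttribute("data-flag")
    if (!name) return // nothing to check, leave the element alone

    let enabled = false
    try {
      enabled = await this.lookupFlag(name)
    } catch (err) {
      // Assumption: a failed lookup is treated as "flag off"
    }
    if (!enabled) e.remove()
  }
}

// Only elements carrying the attribute reach the handler.
// (HTMLRewriter is only available inside the Workers runtime.)
function customizeByFlag(response, lookupFlag) {
  return new HTMLRewriter()
    .on("[data-flag]", new FlagCustomizer(lookupFlag))
    .transform(response)
}
```

Treating a failed lookup as "off" is a design choice made for this sketch; depending on the feature, you may prefer to leave the element in place instead.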
What can I build with HTMLRewriter?
To illustrate the flexibility of HTMLRewriter, I wrote an example that you can deploy on your own website. If you manage a website, you know that old links and images can expire with time. Here’s an excerpt from a years-old post I wrote on the Cloudflare Blog:
As you might see, that missing image is not the prettiest sight. However, we can easily fix this using async handlers in HTMLRewriter. Using a service like the Internet Archive API, we can check if an image no longer exists and rewrite the URL to use the latest archive. That means users don’t see an ugly placeholder and won’t even know the image was replaced.
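async function fetchAndFixImages(request) {
  return new HTMLRewriter()
    .on("img", new ImageFixer())
    .transform(await fetch(request))
}

class ImageFixer {
  async element(e) {
    const url = e.getAttribute("src")
    if (!url) return // no src attribute, nothing to check
    const response = await fetch(url)
    if (!response.ok) {
      // Ask the Internet Archive for the closest snapshot of the image
      const archive = await fetch(`https://archive.org/wayback/available?url=${encodeURIComponent(url)}`)
      const snapshot = archive.ok ? await archive.json() : null
      const closest = snapshot && snapshot.archived_snapshots && snapshot.archived_snapshots.closest
      if (closest && closest.url) {
        e.setAttribute("src", closest.url)
      } else {
        // No archived copy is available, so remove the image
        e.remove()
      }
    }
  }
}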
Using the Workers Playground, you can view a working sample of the above code. A more complex example could even alert a service like Sentry when a missing image is detected. Returning to the missing image from earlier, you can now see that the image is restored and users are none the wiser.
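As a rough sketch of that more complex variant, the handler below calls a reporting function whenever an image fails to load, then falls back to the archive lookup. Everything named here is an assumption made for illustration: `ReportingImageFixer`, `closestSnapshotUrl`, and the injected `fetchFn` and `reportMissing` functions are hypothetical, and `reportMissing` stands in for whatever call your error tracker actually provides.

```javascript
// Pull the archived URL out of a Wayback availability response body.
// Returns null when the response lists no snapshot.
function closestSnapshotUrl(body) {
  const snapshots = body && body.archived_snapshots
  const closest = snapshots && snapshots.closest
  return closest && closest.url ? closest.url : null
}

class ReportingImageFixer {
  // fetchFn: async (url) => response, e.g. (url) => fetch(url) in a Worker
  // reportMissing: async (url) => void, forwards the URL to your tracker
  constructor(fetchFn, reportMissing) {
    this.fetchFn = fetchFn
    this.reportMissing = reportMissing
  }

  async element(e) {
    const src = e.getAttribute("src")
    if (!src) return
    const response = await this.fetchFn(src)
    if (response.ok) return // image still exists, nothing to do

    try {
      await this.reportMissing(src)
    } catch (err) {
      // A failed report should not block the rewrite
    }

    const lookup = `https://archive.org/wayback/available?url=${encodeURIComponent(src)}`
    const archive = await this.fetchFn(lookup)
    const archivedUrl = archive.ok ? closestSnapshotUrl(await archive.json()) : null
    if (archivedUrl) {
      e.setAttribute("src", archivedUrl)
    } else {
      e.remove()
    }
  }
}
```

In a Worker this could be wired up as `.on("img", new ReportingImageFixer((url) => fetch(url), report))`, where `report` is your own function. Note that awaiting the report delays the rest of the stream until it completes, so a slow reporting endpoint will slow the page down.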
If you’re interested in deploying this to your own website, click on the button below:
What else can I build with HTMLRewriter?
We’ve been blown away by developer projects using HTMLRewriter. Here are a few projects that caught our eye and are great examples of the power of Cloudflare Workers and HTMLRewriter:
If you’re interested in using HTMLRewriter, check out our documentation. Also, be sure to share any creations you’ve made with @CloudflareDev. We love looking at the awesome projects you build.