A Workers optimization that reduces your bill


Recently, we made an optimization to the Cloudflare Workers runtime which reduces the amount of time Workers need to spend in memory. We're passing the savings on to you for all your Unbound Workers.


Workers are often used to implement HTTP proxies, where JavaScript rewrites an HTTP request before sending it on to an origin server, and then rewrites the response before sending it back to the client. You can implement any kind of rewrite in a Worker, including rewriting both headers and bodies.

Many Workers, though, do not actually modify the response body, but instead simply allow the bytes to pass through from the origin to the client. In this case, the Worker's application code has finished executing as soon as the response headers are sent, before the body bytes have passed through. Historically, the Worker was nevertheless considered to be "in use" until the response body had fully finished streaming.
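As a minimal sketch of such a pass-through Worker, the handler below rewrites one response header but hands the origin's body stream to the client untouched. The `passThrough` helper, its injectable `upstream` parameter, and the `x-proxied-by` header name are illustrative assumptions, not part of the Workers API.

```typescript
// Hypothetical pass-through proxy: rewrites headers only, never reads the body.
type Fetcher = (request: Request) => Promise<Response>

export async function passThrough(
  request: Request,
  upstream: Fetcher = (req) => fetch(req), // injectable so the helper can be exercised without a network
): Promise<Response> {
  const originResponse = await upstream(request)
  // Copy the headers so they can be modified; the body stream is reused as-is.
  const headers = new Headers(originResponse.headers)
  headers.set('x-proxied-by', 'example-worker') // illustrative header name
  return new Response(originResponse.body, {
    status: originResponse.status,
    statusText: originResponse.statusText,
    headers,
  })
}

export default {
  fetch(request: Request) {
    return passThrough(request)
  },
}
```

Once `passThrough` returns, no application code is left to run for this request; only the body bytes still need to flow from the origin to the client.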

For billing purposes, under the Workers Unbound pricing model, we charge duration-memory (gigabyte-seconds) for the time in which the Worker is in use.

The change

On December 15-16, we made a change to the way we handle requests that are streaming through the response without modifying the content. This change means that we can mark application code as “idle” as soon as the response headers are returned.

Since no further application code will execute on behalf of the request, the system does not need to keep the request state in memory – it only needs to track the low-level native sockets and pump the bytes through. So now, during this time, the Worker will be considered idle, and could even be evicted before the stream completes (though this would be unlikely unless the stream lasts for a very long time).
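By contrast, a Worker that transforms the body keeps JavaScript on the data path for every chunk, so it presumably cannot be treated as idle while the stream is in flight. A rough sketch of that case, assuming a hypothetical `upperCaseBody` helper built on the standard `TransformStream`:

```typescript
// Hypothetical helper: upper-cases a text body chunk by chunk.
// transform() is application code that runs for every chunk, so the Worker
// stays involved until the stream ends.
export function upperCaseBody(response: Response): Response {
  if (!response.body) return response
  const decoder = new TextDecoder()
  const encoder = new TextEncoder()
  const transform = new TransformStream<Uint8Array, Uint8Array>({
    transform(chunk, controller) {
      const text = decoder.decode(chunk, { stream: true })
      controller.enqueue(encoder.encode(text.toUpperCase()))
    },
    flush(controller) {
      // Emit any bytes the decoder was still holding at end of stream.
      const rest = decoder.decode()
      if (rest) controller.enqueue(encoder.encode(rest.toUpperCase()))
    },
  })
  const headers = new Headers(response.headers)
  headers.delete('content-length') // the byte length may change after the transform
  return new Response(response.body.pipeThrough(transform), {
    status: response.status,
    headers,
  })
}
```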

Visualized, it looks something like this:

A sequence diagram with before and after that shows that after the change, Workers are considered “idle” as soon as the response headers have been forwarded from the origin to the client.

As a result of this change, we've seen that the time a Worker is considered "in use" by any particular request has dropped by an average of 70%. Of course, this number varies a lot depending on the details of each Worker. Some may see no benefit, others may see an even larger benefit.
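To make the effect on duration-memory concrete, here is a rough sketch of the arithmetic. The 128 MB memory figure, the 1,024 MB-per-GB conversion, and the 200 ms sample duration are assumptions chosen for illustration; only the 70% average reduction comes from the measurement above.

```typescript
// Gigabyte-seconds = memory (GB) × time in use (s).
export function gigabyteSeconds(memoryMB: number, inUseMs: number): number {
  return (memoryMB / 1024) * (inUseMs / 1000)
}

const memoryMB = 128 // assumed memory allocation per Worker
const beforeMs = 200 // hypothetical in-use time per request before the change
const afterMs = beforeMs * (1 - 0.7) // 70% average reduction, roughly 60 ms

export const before = gigabyteSeconds(memoryMB, beforeMs) // 0.125 GB × 0.2 s ≈ 0.025 GB-s
export const after = gigabyteSeconds(memoryMB, afterMs) // 0.125 GB × 0.06 s ≈ 0.0075 GB-s
```

Under these assumptions, the billed duration-memory per request falls from about 0.025 to about 0.0075 gigabyte-seconds.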

This change is totally invisible to the application. To any external observer, everything behaves as it did before. But, since the system now considers a Worker to be idle during response streaming, the response streaming time will no longer be billed. So, if you saw a drop in your bill, this is why!

But it doesn’t stop there!

The change also applies to a few other frequently used scenarios, namely WebSocket proxying, reading from the cache, and streaming from KV.

WebSockets: once a Worker has arranged to proxy through a WebSocket, as long as it isn't handling individual messages in your Worker code, the Worker does not remain in use during the proxying. The change applies to regular stateless Workers, but not to Durable Objects, which are not usually used for proxying.

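export default {
  async fetch(request: Request) {
    // Do anything before
    const upgradeHeader = request.headers.get('Upgrade')
    // Only proxy the WebSocket when the client actually asks for an upgrade
    if (upgradeHeader === 'websocket') {
      return await fetch(request)
    }
    // Or with other requests (placeholder: forward them unchanged)
    return fetch(request)
  }
}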
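The upgrade check itself can be made a little more tolerant. RFC 6455 treats the `websocket` token in the `Upgrade` header as case-insensitive, so a small helper such as the hypothetical `isWebSocketUpgrade` below (not part of the Workers API) would also accept values like `WebSocket`:

```typescript
// Hypothetical helper: true when the headers ask for a WebSocket upgrade.
export function isWebSocketUpgrade(headers: Headers): boolean {
  const upgrade = headers.get('Upgrade')
  // RFC 6455 matches the "websocket" token case-insensitively.
  return upgrade !== null && upgrade.trim().toLowerCase() === 'websocket'
}
```

A Worker could then call `isWebSocketUpgrade(request.headers)` before passing the request to `fetch`.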

Reading from Cache: If you return the response from a cache.match call, the Worker is considered idle as soon as the response headers are returned.

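export default {
  async fetch(request: Request) {
    // Keyed on the incoming request here; any valid cache key works
    let response = await caches.default.match(request)
    if (response) {
      return response
    }
    // Get/create response and put into cache (omitted; placeholder forwards to the origin)
    return fetch(request)
  }
}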

Streaming from KV: Lastly, the change applies when you stream a value from KV. This one is a bit trickier to get right, because people often retrieve the value from KV as a string or JSON object and then create a response from that value. But if you fetch the value as a stream, as in the example below, you can create a Response directly from the ReadableStream.

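interface Env {
  MY_KV_NAME: KVNamespace
}

export default {
  async fetch(request: Request, env: Env) {
    // Ask KV for a ReadableStream instead of a string or JSON object
    const readableStream = await env.MY_KV_NAME.get('hello_world.pdf', { type: 'stream' })
    if (readableStream) {
      return new Response(readableStream, { headers: { 'content-type': 'application/pdf' } })
    }
    // Placeholder for the case where the key does not exist
    return new Response('Not found', { status: 404 })
  }
}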
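For comparison, the sketch below puts the buffered and streamed variants side by side. `KVLike` is a hypothetical, pared-down stand-in for the `KVNamespace` binding so that the snippet is self-contained, and the `buffered` and `streamed` function names are likewise illustrative. The buffered variant reads the whole value into application code before responding, while the streamed variant hands the `ReadableStream` straight to the `Response`, which is the shape this optimization relies on.

```typescript
// Hypothetical minimal subset of the KVNamespace binding used here.
interface KVLike {
  get(key: string, options: { type: 'text' }): Promise<string | null>
  get(key: string, options: { type: 'stream' }): Promise<ReadableStream | null>
}

const notFound = () => new Response('Not found', { status: 404 })

// Buffered: the whole value is read into a string before the response is built.
export async function buffered(kv: KVLike, key: string): Promise<Response> {
  const text = await kv.get(key, { type: 'text' })
  return text === null ? notFound() : new Response(text)
}

// Streamed: the value is never materialized in application code.
export async function streamed(kv: KVLike, key: string): Promise<Response> {
  const stream = await kv.get(key, { type: 'stream' })
  return stream === null ? notFound() : new Response(stream)
}
```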

Interested in Workers Unbound?

If you are already using Unbound, your bill will have dropped automatically.

Now is a great time to check out Unbound if you haven’t already, especially since we recently removed the egress fees as well. Unbound allows you to build more complex workloads on our platform and only pay for what you use.

We are always looking for opportunities to make Workers better. Often that improvement takes the form of powerful new features, such as the soon-to-be-released Service Bindings, and, of course, performance enhancements. This time, we are delighted to make Cloudflare Workers even cheaper than they already were.

Follow on X

Kenton Varda|@kentonvarda
Erwin van der Koogh|@evanderkoogh
