Improve global upload performance with R2 Local Uploads

Today, we are launching Local Uploads for R2 in open beta. With Local Uploads enabled, object data is automatically written to a storage location close to the client first, then asynchronously copied to where the bucket lives. The data is immediately accessible and stays strongly consistent. Uploads get faster, and data feels global.

For many applications, performance needs to be global. Users uploading media content from different regions, for example, or devices sending logs and telemetry from all around the world. But your data has to live somewhere, and that means uploads from far away have to travel the full distance to reach your bucket.

R2 is object storage built on Cloudflare's global network. Out of the box, it automatically caches object data globally for fast reads anywhere — all while retaining strong consistency and zero egress fees. This happens behind the scenes whether you're using the S3 API, Workers Bindings, or plain HTTP. And now with Local Uploads, both reads and writes can be fast from anywhere in the world.

Try it yourself in this demo to see the benefits of Local Uploads.

Ready to try it? Enable Local Uploads in the Cloudflare Dashboard under your bucket's settings, or with a single Wrangler command on an existing bucket.

npx wrangler r2 bucket local-uploads enable [BUCKET]

75% lower total request duration for global uploads

Local Uploads makes upload requests (i.e. PutObject, UploadPart) faster. In both our private beta tests with customers and our synthetic benchmarks, we saw up to a 75% reduction in Time to Last Byte (TTLB) when upload requests are made from a different region than the bucket. In these results, TTLB is measured from when R2 receives the upload request to when R2 returns a 200 response.

In our synthetic tests, we measured the impact of Local Uploads by using a synthetic workload to simulate a cross-region upload workflow. We deployed a test client in Western North America and configured an R2 bucket with a location hint for Asia-Pacific. The client performed around 20 PutObject requests per second over 30 minutes to upload objects of 5 MB size.

The following graph compares the p50 (median) TTLB for these requests, showing the difference in upload request duration: around 2 s without Local Uploads, and around 500 ms with Local Uploads enabled.
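As a quick check on the headline figure, the reduction can be computed directly from the two median values quoted for this test. The sketch below uses only the approximate numbers given above (2 s and 500 ms, about 20 requests per second, 5 MB objects, 30 minutes); the function and variable names are illustrative.

```python
def percent_reduction(before_ms: float, after_ms: float) -> float:
    """Relative reduction in duration, as a percentage of the baseline."""
    return (before_ms - after_ms) / before_ms * 100


# Approximate p50 TTLB values quoted for the synthetic test.
ttlb_without_ms = 2000
ttlb_with_ms = 500
reduction = percent_reduction(ttlb_without_ms, ttlb_with_ms)

# Approximate workload size: ~20 PutObject requests/s of 5 MB for 30 minutes.
requests_total = 20 * 30 * 60
data_total_gb = requests_total * 5 / 1000  # decimal gigabytes

print(f"{reduction:.0f}% lower p50 TTLB")
print(f"{requests_total} requests, about {data_total_gb:.0f} GB uploaded")
```

Because the request rate is quoted as "around 20" per second, the totals are order-of-magnitude estimates rather than measured values.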

How it works: The distance problem

To understand how Local Uploads can improve upload requests, let’s first take a look at how R2 works. R2's architecture is composed of multiple components including:

R2 Gateway Worker: The entry point for all API requests that handles authentication and routing logic. It is deployed across Cloudflare's global network via Cloudflare Workers.
Durable Object Metadata Service: A distributed layer built on Durable Objects used to store and manage object metadata (e.g. object key, checksum).
Distributed Storage Infrastructure: The underlying infrastructure that persistently stores encrypted object data.

Without Local Uploads, here’s what happens when you upload objects to your bucket: The request is first received by the R2 Gateway, close to the user, where it is authenticated. Then, as the client streams bytes of the object data, the data is encrypted and written into the storage infrastructure in the region where the bucket is placed. When this is completed, the Gateway reaches out to the Metadata Service to publish the object metadata, and it returns a success response back to the client after it is committed.

If the client and the bucket are in separate regions, more variability can be introduced in the process of uploading bytes of the object data, due to the longer distance that the request must travel. This could result in slower or less reliable uploads.

Figure: A client uploading from Eastern North America to a bucket in Eastern Europe without Local Uploads enabled.

Now, when you make an upload request to a bucket with Local Uploads enabled, there are two cases that are handled:

The client and the bucket are in the same region
The client and the bucket are in different regions

In the first case, R2 follows the regular flow, where object data is written to the storage infrastructure for your bucket. In the second case, R2 writes to the storage infrastructure located in the client's region while still publishing the object metadata to the region of the bucket.

Importantly, the object is immediately accessible after the initial write completes. It remains accessible throughout the entire replication process — there's no waiting period for background replication to finish before the object can be read.

Figure: A client uploading from Eastern North America to a bucket in Eastern Europe with Local Uploads enabled.

Note that this applies only to buckets without a jurisdictional restriction. Local Uploads is not available for buckets with a jurisdiction (e.g. EU, FedRAMP) enabled.
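The two cases, plus the jurisdiction rule, amount to a small routing decision. Here is a minimal sketch of that decision, not R2's actual code: the Bucket type, the function names, and the region strings are hypothetical, and it assumes client and bucket locations can be compared as simple region labels.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Bucket:
    region: str                         # where the bucket lives, e.g. "eeur"
    local_uploads: bool = False         # feature toggle on the bucket
    jurisdiction: Optional[str] = None  # e.g. "eu"; None if unrestricted


def choose_write_region(bucket: Bucket, client_region: str) -> str:
    """Pick the storage region for the initial write of an upload."""
    if bucket.jurisdiction is not None:
        # Local Uploads is unavailable for jurisdiction-restricted buckets.
        return bucket.region
    if not bucket.local_uploads:
        return bucket.region
    # Same region: this equals the bucket region (regular flow).
    # Different region: write near the client. Metadata is still
    # published to the bucket's region in both cases.
    return client_region


def needs_replication(bucket: Bucket, client_region: str) -> bool:
    """True when data was written away from the bucket and must be copied."""
    return choose_write_region(bucket, client_region) != bucket.region
```

For example, with `Bucket(region="eeur", local_uploads=True)`, an upload from "enam" is written in "enam" and needs replication, while an upload from "eeur" follows the regular flow.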

When to use Local Uploads

Local Uploads is built for workloads that receive a lot of upload requests originating from geographic regions other than the one where your bucket is located. This feature is ideal when:

Your users are globally distributed
Upload performance and reliability are critical to your application
You want to optimize write performance without changing your bucket's primary location

To understand where your read and write requests originate, go to your R2 bucket's Metrics page in the Cloudflare Dashboard and view the Request Distribution by Region graph.

How we built Local Uploads

With Local Uploads, object data is written close to the client and then copied to the bucket's region in the background. We call this copy job a replication task.

Replication tasks need to be processed asynchronously, which is a great use case for Cloudflare Queues. Queues allow us to control the rate at which we process replication tasks, and they provide built-in failure handling capabilities like retries and dead letter queues. In this case, R2 shards replication tasks across multiple queues per storage region.
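To make the sharding idea concrete, here is one way a task could be mapped to one of several queues for a region. This is only a sketch: the post does not say how tasks are assigned to shards, so hashing the object key is an assumption, and the queue naming scheme is hypothetical.

```python
import hashlib


def queue_name(region: str, object_key: str, shards_per_region: int) -> str:
    """Map a replication task to one of several queues for its region."""
    # Assumption: a stable hash of the object key selects the shard.
    digest = hashlib.sha256(object_key.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:8], "big") % shards_per_region
    # Hypothetical naming scheme, for illustration only.
    return f"replication-{region}-{shard}"
```

A stable hash keeps the mapping deterministic, so the same object always lands on the same shard while the overall load spreads across the queues.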

Publishing metadata and scheduling replication

When publishing the metadata of an object with Local Uploads enabled, we perform three operations atomically:

Store the object metadata
Create a pending replica key that tracks which replications still need to happen
Create a replication task marker keyed by timestamp, which controls when the task should be sent to the queue

The pending replica key contains the full replication plan: the number of replication tasks, which source location to read from, which destination location to write to, the replication mode and priority, and whether the source should be deleted after successful replication.
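The three records and the replication plan can be pictured as a small data model. The sketch below is an in-memory stand-in, not the Durable Object implementation: every type, field name, key prefix, and the "async" mode value are hypothetical. It shows all three keys being written in one all-or-nothing step, with the marker key carrying a zero-padded timestamp so that keys sort in time order.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ReplicationTask:
    source: str          # location to read from
    destination: str     # location to write to
    mode: str            # hypothetical value, e.g. "async"
    priority: int
    delete_source: bool  # remove the source copy after success


@dataclass(frozen=True)
class PendingReplica:
    tasks: List[ReplicationTask]  # the full replication plan


class MetadataStore:
    """In-memory stand-in for a transactional key-value store."""

    def __init__(self) -> None:
        self.data: Dict[str, object] = {}

    def put_all(self, writes: Dict[str, object]) -> None:
        # Stage on a copy, then swap, so either all keys land or none do.
        staged = dict(self.data)
        staged.update(writes)
        self.data = staged


def publish(store: MetadataStore, key: str, metadata: dict,
            plan: PendingReplica, due_at_ms: int) -> None:
    """Write metadata, pending replica key, and task marker together."""
    store.put_all({
        f"meta/{key}": metadata,
        f"pending/{key}": plan,
        # Zero-padded timestamp keeps markers sorted by due time.
        f"marker/{due_at_ms:013d}/{key}": key,
    })
```

Keeping the marker in the same atomic write as the metadata means an object can never be published without a record that its replication is still owed.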

This gives us flexibility in how we move an object's data. For example, moving data across long geographical distances is expensive. We could try to move all the replicas as fast as possible by processing them in parallel, but this would incur greater cost and put pressure on the network infrastructure. Instead, we minimize the number of cross-regional data movements by first creating one replica in the target bucket region, and then using this local copy to create additional replicas within the bucket region.
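The "one long-distance copy, then fan out locally" ordering can be sketched as a tiny planner. This is an illustration under assumptions: the location names and the Copy type are hypothetical, and the real plan also carries mode, priority, and deletion flags.

```python
from typing import List, NamedTuple


class Copy(NamedTuple):
    source: str
    destination: str
    cross_region: bool


def plan_copies(client_location: str, bucket_locations: List[str]) -> List[Copy]:
    """One long-distance copy first, then fan out inside the bucket region."""
    if not bucket_locations:
        return []
    first, rest = bucket_locations[0], bucket_locations[1:]
    plan = [Copy(client_location, first, cross_region=True)]
    # Remaining replicas read from the copy that is already in-region.
    plan += [Copy(first, dest, cross_region=False) for dest in rest]
    return plan
```

For three replicas in the bucket region, this yields exactly one cross-region transfer instead of three, at the cost of the later copies waiting for the first one to finish.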

A background process periodically scans the replication task markers and sends them to one of the queues associated with the destination storage region. The markers guarantee at-least-once delivery to the queue — if enqueueing fails or the process crashes, the marker persists and the task will be retried on the next scan. This also allows us to process replications at different times and enqueue only valid tasks. Once a replication task reaches a queue, it is ready to be processed.
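The at-least-once property comes from the order of operations: enqueue first, delete the marker only once the enqueue is confirmed. A minimal sketch of one scan pass follows. It assumes marker keys of the hypothetical form "marker/<zero-padded ms timestamp>/<object key>" held in a plain dictionary, and an `enqueue` callable that returns True on success.

```python
from typing import Callable, Dict


def scan_markers(markers: Dict[str, str], now_ms: int,
                 enqueue: Callable[[str], bool]) -> int:
    """Enqueue every due marker; keep markers whose enqueue failed."""
    sent = 0
    for marker_key in sorted(markers):
        due_at_ms = int(marker_key.split("/", 2)[1])
        if due_at_ms > now_ms:
            break  # keys sort by time, so the rest are not due yet
        try:
            ok = enqueue(markers[marker_key])
        except Exception:
            ok = False
        if ok:
            # Delete only after a confirmed enqueue: at-least-once delivery.
            del markers[marker_key]
            sent += 1
    return sent
```

If the process stops between a successful enqueue and the delete, the next scan sends the same task again, so a design like this relies on the consumer tolerating duplicates.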

Asynchronous replication: Pull model

For the queue consumer, we chose a pull model where a centralized polling service consumes tasks from the regional queues and dispatches them to the Gateway Worker for execution.

Here's how it works:

Polling service pulls from a regional queue: The consumer service polls the regional queue for replication tasks. It then batches the tasks to create uniform batch sizes based on the amount of data to be moved.
Polling service dispatches to Gateway Worker: The consumer service sends the replication job to the Gateway Worker.
Gateway Worker executes replication: The worker reads object data from the source location, writes it to the destination, and updates metadata in the Durable Object, optionally marking the source location to be garbage collected.
Gateway Worker reports result: On completion, the worker returns the result to the poller, which acknowledges the task to the queue as completed or failed.

By using this pull model approach, we ensure that the replication process remains stable and efficient. The service can dynamically adjust its pace based on real-time system health, guaranteeing that data is safely replicated across regions.
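The batching and acknowledgement steps above can be sketched in a few lines. This is an illustration rather than the polling service itself: the Task type, the byte budget, and the `dispatch` callable (a stand-in for the call to the Gateway Worker, assumed to return a per-task success map) are all hypothetical.

```python
from typing import Callable, Dict, List, NamedTuple


class Task(NamedTuple):
    task_id: str
    size_bytes: int


def batch_by_bytes(tasks: List[Task], max_bytes: int) -> List[List[Task]]:
    """Group tasks so each batch moves roughly the same amount of data."""
    batches: List[List[Task]] = []
    current: List[Task] = []
    current_bytes = 0
    for task in tasks:
        # Start a new batch when adding this task would exceed the budget.
        if current and current_bytes + task.size_bytes > max_bytes:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(task)
        current_bytes += task.size_bytes
    if current:
        batches.append(current)
    return batches


def poll_once(tasks: List[Task], max_bytes: int,
              dispatch: Callable[[List[Task]], Dict[str, bool]]) -> Dict[str, str]:
    """Dispatch each batch and record an ack or retry per task."""
    outcome: Dict[str, str] = {}
    for batch in batch_by_bytes(tasks, max_bytes):
        results = dispatch(batch)  # stand-in for the Gateway Worker call
        for task in batch:
            outcome[task.task_id] = "ack" if results.get(task.task_id) else "retry"
    return outcome
```

Batching by bytes rather than by task count keeps the work per dispatch roughly even when object sizes vary; a task larger than the budget simply gets a batch of its own.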

Try it out

Local Uploads is available now in open beta. There is no additional cost to enable Local Uploads. Upload requests made with this feature enabled incur the standard Class A operation costs, same as upload requests made without Local Uploads.

To get started, visit the Cloudflare Dashboard under your bucket's settings and look for the Local Uploads card to enable, or simply run the following command using Wrangler to enable Local Uploads on a bucket.

npx wrangler r2 bucket local-uploads enable [BUCKET]

Enabling Local Uploads on a bucket is seamless: existing uploads will complete as expected and there’s no interruption to traffic.

For more information, refer to the Local Uploads documentation. If you have questions or want to share feedback, join the discussion on our Developer Discord.
