
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
    <channel>
        <title><![CDATA[ The Cloudflare Blog ]]></title>
        <description><![CDATA[ Get the latest news on how products at Cloudflare are built, technologies used, and join the teams helping to build a better Internet. ]]></description>
        <link>https://blog.cloudflare.com</link>
        <atom:link href="https://blog.cloudflare.com/" rel="self" type="application/rss+xml"/>
        <language>en-us</language>
        <image>
            <url>https://blog.cloudflare.com/favicon.png</url>
            <title>The Cloudflare Blog</title>
            <link>https://blog.cloudflare.com</link>
        </image>
        <lastBuildDate>Thu, 25 Jun 2026 13:23:58 GMT</lastBuildDate>
        <item>
            <title><![CDATA[How we built saga rollbacks for Cloudflare Workflows]]></title>
            <link>https://blog.cloudflare.com/rollbacks-for-workflows/</link>
            <pubDate>Thu, 25 Jun 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ Cloudflare Workflows, our durable execution engine for multi-step applications, now supports saga-style rollbacks, allowing developers to specify a compensating action for each step.do().  ]]></description>
            <content:encoded><![CDATA[ <p>Cloudflare Workflows allows you to build durable, multi-step applications with built-in retries and state persistence across long-running processes. When a <a href="https://developers.cloudflare.com/workflows/"><u>Workflow</u></a> executes, each step can call external systems, retry failures, and persist state across restarts. But if one step fails, it may leave earlier work from completed steps in an inconsistent or partial state.</p><p>Today we’re shipping saga rollbacks for Workflows, allowing you to declare rollback logic within the step itself, in case of failure.</p><p>For example, consider a workflow for transferring funds between accounts at two different banks:</p><ol><li><p>Debit from account at Bank A</p></li><li><p>Credit to account at Bank B</p></li><li><p>Send email confirmation to both account owners</p></li></ol><p>What happens if Step 2, the credit to account at Bank B, fails? Once the debit succeeds at Bank A, the transaction is committed and the money has left its system. As the orchestrator of the transaction, you cannot simply “undo” the operation in Bank A's system. Instead, the money must be credited back to the account at Bank A through a new operation that semantically reverses the first one.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1j8xfDeKb3FCgE2Ktxf4fq/723e2b9e34189747d3c8eb65f906fb41/BLOG-3317_image6.png" />
          </figure><p>
This pairing of an operation and its compensation logic is called the <a href="https://www.youtube.com/watch?v=xDuwrtwYHu8"><u>saga pattern</u></a>.</p><p>Before today, developers had to implement their own compensation logic to track what succeeded, what failed, and what actions should be taken upon failure, outside of the steps’ direct definitions. Now, you can define compensation logic for each <code>step.do()</code> as an argument within the steps themselves, maintaining your workflow’s durability for the rollback as well.</p>
            <pre><code>// track what completed so we know what to undo
let debitA;
let creditB;
try {
  debitA = await step.do("debit-bank-a", () =&gt; bankA.debit(from, amount));
  creditB = await step.do("credit-bank-b", () =&gt; bankB.credit(to, amount));
  await step.do("notify", () =&gt; notifyBoth(from, to, amount));
} catch (error) {
  // unwind in reverse. each undo is its own durable step,
  // must be idempotent, and must keep going if one fails.
  if (creditB) {
    try {
      await step.do("reverse-credit-b", () =&gt; bankB.debit(to, amount, creditB.id));
    } catch (e) {
      await alertOnCall("reverse-credit-b failed", e);
    }
  }
  if (debitA) {
    try {
      await step.do("refund-debit-a", () =&gt; bankA.credit(from, amount, debitA.id));
    } catch (e) {
      await alertOnCall("refund-debit-a failed", e);
    }
  }
  throw error;
}</code></pre>
            <p><i><sup>Without rollbacks</sup></i></p>
            <pre><code>// each step ships with its own undo. add a step,
// add its rollback right here. no growing catch
// block, no manual ordering, no replay logic.
await step.do("debit-bank-a", () =&gt; bankA.debit(from, amount), {
  rollback: async ({ output }) =&gt; bankA.credit(from, amount, output.id),
});
await step.do("credit-bank-b", () =&gt; bankB.credit(to, amount), {
  rollback: async ({ output }) =&gt; bankB.debit(to, amount, output.id),
});
await step.do("notify", () =&gt; notifyBoth(from, to, amount));</code></pre>
            <p><i><sup>With rollbacks</sup></i></p>
    <div>
      <h2>Try it out</h2>
      <a href="#try-it-out">
        
      </a>
    </div>
    <p>To use rollbacks, just pass an options object containing a <code>rollback</code> function as the last argument to <code>step.do()</code>.</p>
            <pre><code>const debit = await step.do(
  "debit-account-a",
  async () =&gt; {
    return await bankA.debit({
      accountId: fromAccountId,
      amount,
      idempotencyKey: `${transferId}:debit-account-a`,
    });
  },
  {
    rollback: async () =&gt; {
      await bankA.credit({
        accountId: fromAccountId,
        amount,
        idempotencyKey: `${transferId}:rollback-debit-account-a`,
      });
    },
  }
);

// The idempotency keys make both the forward operations and rollback operations safe to retry without duplicating the transfer

const credit = await step.do(
  "credit-account-b",
  async () =&gt; {
    return await bankB.credit({
      accountId: toAccountId,
      amount,
      idempotencyKey: `${transferId}:credit-account-b`,
    });
  },
  {
    rollback: async ({ output }) =&gt; {
      if (output === undefined) {
        return;
      }

      await bankB.debit({
        accountId: toAccountId,
        amount,
        idempotencyKey: `${transferId}:rollback-credit-account-b`,
      });
    },
  }
);


// If we fail here, we may want to revert all previous payments. Users should not have to wrap their code in complex try-catch logic just to revert two small payments (see below)

await step.do("send-confirmation", async () =&gt; {
  await sendTransferConfirmation({ ... });
});</code></pre>
            <p>Rollback functions should be idempotent, just like regular Workflow steps. If you refund a charge, use the payment provider's idempotency key. If you release inventory, make the release safe to call more than once.</p><p>If any step fails, the rollback handlers will execute in reverse <code>step-start</code> order. It sounds simple: run the undo steps when something fails. In practice, there are a few details that make the API and execution model important.</p><p>1. <b>The failed step may still need rollback. </b>A failed <code>step.do()</code> can still be rollback-eligible if it registered a rollback handler.</p><p>The rollback will not start if user code catches an error and the Workflow continues, but if a step error is caught and the Workflow later fails for another reason, rollback can still run for previously registered handlers, which execute in reverse <code>step-start</code> order.</p><p>Why? The step may have partially interacted with an external system before failing. For example, a payment provider may capture a charge, but the step may fail before returning the <code>chargeId</code> to Workflows. That is why rollback handlers receive <code>output</code>, but must handle <code>output === undefined</code>.</p><p>2. <b>Rollback only starts when the Workflow fails. </b>Adding a rollback handler does not mean every step error triggers rollback. If user code catches an error and continues, the Workflow continues. Rollback starts when the Workflow itself is about to fail terminally.</p><p>When rollback starts, Workflows finds eligible <code>step.do()</code> calls, runs their rollback handlers, then records the final Workflow failure.</p><p>3. <b>Ordering has to be predictable. </b>For sequential Workflows, rollback order feels obvious:</p><ol><li><p>Reserve inventory.</p></li><li><p>Charge card.</p></li><li><p>Create shipment.</p></li><li><p>If shipment fails, refund the card and release the inventory.</p></li></ol><p>Parallel steps make this more subtle. Completion order can differ from start order, so Workflows uses reverse step-start order instead of reverse completion order.</p><p>The practical rules are:</p><ol><li><p>Any started or completed steps with rollback handlers are eligible.</p></li><li><p>The failing <code>step.do()</code> is also eligible if it registered a rollback handler.</p></li><li><p>Handlers run in reverse step-start order, not completion order.</p></li></ol>
    <div>
      <h2>How we designed the API</h2>
      <a href="#how-we-designed-the-api">
        
      </a>
    </div>
    <p>Once we had the expected behavior in mind, we had to add this new pattern into the Workflows API. Rollbacks went through a few iterations before we landed on <code>rollback options</code>. </p>
    <div>
      <h3>Why not a fluent or builder API?</h3>
      <a href="#why-not-a-fluent-or-builder-api">
        
      </a>
    </div>
    <p>The first approach was a fluent form: <code>step.do(...).rollback(...)</code> It reads well. The forward action and the compensation sit next to each other, and the call site looks like ordinary JavaScript chaining.</p><p>The problem is that <code>step.do()</code> already has an important meaning: it starts a durable step and returns a Promise for the step output. In Workers, promise-like values are especially meaningful because Workers RPC supports <a href="https://blog.cloudflare.com/capnweb-javascript-rpc-library/#chained-calls-promise-pipelining"><u>promise pipelining</u></a>, a pattern inherited from systems like <a href="https://capnproto.org/rpc.html#time-travel-promise-pipelining"><u>Cap'n Proto</u></a>.</p><p>Promise pipelining lets code call a method on a future value before that value has fully returned to the caller. For example:</p>
            <pre><code>const session = api.authenticate(apiKey);
const name = await session.whoami();</code></pre>
            <p>Here, <code>session</code> is not the real session object yet. It is more like a handle to the session that will exist soon. When you call <code>session.whoami()</code>, Workers can send that call to the remote side early and say: “once authentication creates the session, call <code>whoami()</code> on it.”</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/cgBccGGKzjrx2gnnyAUvL/f0470a7a40ef05027e952d42abfa592c/BLOG-3317_image4.png" />
          </figure><p>That saves a round trip. The caller does not need to wait for <code>authenticate()</code> to fully finish before asking for <code>whoami()</code>.</p><p>We considered a fluent API:</p>
            <pre><code>step.do("charge-card", chargeCard).rollback(refundCharge);</code></pre>
            <p>
To a reader, that can look like “call <code>.rollback()</code> on the result of <code>charge-card</code>.”   But rollback is not part of the step’s output. It is part of the <code>step.do()</code> options, registered before the step starts, so Workflows knows how to compensate the step if a later step fails.</p><p>A fluent API also makes step timing harder to reason about. Today, <code>step.do()</code> starts the step when it is called, so developers can start a step, do other work, and await the first step later:</p>
            <pre><code>const first = step.do("first", () =&gt; serviceA.call());

await step.do("second", () =&gt; serviceB.call());

await first;</code></pre>
            <p>With today’s execution model, <code>first</code> starts immediately, before <code>second</code>. A fluent API would complicate that. Workflows would need to wait and see whether <code>.rollback()</code> gets attached before it knows the full step definition. That could delay when the step is sent to the engine.</p><p>In the earlier example, <code>first</code> could start at <code>await first</code> instead of at <code>step.do("first", ...)</code>, after <code>second</code> has already completed.</p><p>That makes concurrent Workflows harder to reason about: step timing would depend on when the returned <code>Promise</code> is consumed, not just where <code>step.do()</code> is called.</p><p>We also considered a builder-style API:</p>
            <pre><code>const charge = await step
	.saga("charge")
	.do(() =&gt; chargeCard())
	.rollback(() =&gt; refundCharge())
	.run();</code></pre>
            <p>A builder API avoids the <code>Promise</code> ambiguity. It also gives us an obvious place for future step-level options, and makes it clear that the forward action and rollback action belong to the same saga step.</p><p>But it adds ceremony. Every step needs a final <code>.run()</code>, forgetting <code>.run()</code> would be easy and hard to spot without tooling, and simple one-step cases start to look like configuration chains. It also introduces a new <code>step.saga()</code> builder, breaking from the existing <code>step.&lt;action&gt;</code> pattern. Most importantly, it makes <code>step.do()</code> feel like an older API rather than the primary Workflows primitive. The goal of rollback was to extend <code>step.do()</code>, not replace it.</p>
    <div>
      <h3>Rollback as step metadata</h3>
      <a href="#rollback-as-step-metadata">
        
      </a>
    </div>
    
            <pre><code>step.do(..., { rollback })</code></pre>
            <p>Ultimately, we chose the explicit form where rollback is metadata on the step.</p><p>This way, each rollback is defined within the forward step itself. Each handler receives the error that caused the rollback to start, the <a href="https://developers.cloudflare.com/workflows/build/step-context/"><u>step context</u></a>, and the output, which is either the persisted value returned by the forward step (which can be undefined) or undefined if the step failed before persisting a value.</p><p>Rollbacks emit lifecycle events, so you can tell whether compensation started, which rollback handler failed, and whether rollback completed successfully.</p><p>Crucially, the original Workflow failure remains separate: rollback is what Workflows does after the failure, not the reason the Workflow failed.</p><p>Just as you can define custom retry and timeout behavior in the<a href="https://developers.cloudflare.com/workflows/build/workers-api/#workflowstepconfig"> <u>step configuration</u></a> via <code>WorkflowStepConfig</code>, you add rollback-specific values in <code>rollbackConfig</code>.</p>
            <pre><code>{
  rollback: async ({ output }) =&gt; {
    await bankA.credit({ accountId: fromAccountId, amount, transferId: `${transferId}-reversal` });
  },
  rollbackConfig: {
    retries: { limit: 10, delay: '30 seconds', backoff: 'exponential' },
    timeout: '2 minutes',
  },
}</code></pre>
            <p>This matches the lifecycle-event mental model we wanted. A <code>step.do()</code> already describes a durable unit of work that Workflows records, retries, and later shows in logs. Rollback is another lifecycle behavior for that same unit of work. It should travel with the step definition, not live in a separate wrapper or builder.</p><ul><li><p>The step still starts when <code>step.do()</code> normally starts.</p></li><li><p>The returned promise still represents the step output.</p></li><li><p>Concurrent Workflow code keeps the same execution model.</p></li><li><p>Retry and timeout options for rollback live next to the rollback handler.</p></li><li><p>Existing <code>step.do()</code> calls keep working exactly as they do today.</p></li></ul><p>This shape is slightly more explicit than the fluent API, but that explicitness is useful. The operation and its compensation are still in one place, and the API does not introduce a new step builder or a new kind of promise. Developers who already understand <code>step.do()</code> only need to learn one additional <code>options</code> object.</p><p>This is less magical, but it is simpler to adopt, and clearer to understand.</p>
    <div>
      <h2>How it works under the hood</h2>
      <a href="#how-it-works-under-the-hood">
        
      </a>
    </div>
    <p>Rollback feels like a small API addition, but it changes what Workflows needs to record about each step.</p><p>A regular <code>step.do()</code> already has a durable record. Workflows records that the step started, whether it completed, what it returned, and whether it should be skipped instead of repeated if the Workflow resumes later.</p><p>Rollbacks add one more thing to that record: whether the step registered compensation logic.</p><p>This means Workflows has two pieces of information to bring together if the Workflow fails.</p><p>The first is <b>durable step history</b>. The Workflow engine stores data to know what ran, what completed, what output was saved, and whether rollback was registered.</p><p>The second is the <b>rollback handler</b> itself, which is the function written to compensate for that step. Workflows does not save the text of that function as data. Instead, it keeps a callable reference to the handler while the Workflow is running.</p><p>In Workers RPC, this kind of callable reference is called a <a href="https://developers.cloudflare.com/workers/runtime-apis/rpc/lifecycle"><b><u>stub</u></b></a>. A stub lets one part of the system call code that is running somewhere else. Stubs also have lifetimes such that they can be disposed when a call or execution context ends. If you need to keep a stub past that point, Workers RPC provides a <a href="https://developers.cloudflare.com/workers/runtime-apis/rpc/lifecycle/#the-dup-method"><code><u>dup()</u></code></a> method, which creates another handle to the same target.</p><p>For rollback, that model is useful. The durable step history records what needs compensation. The rollback stub gives Workflows a way to invoke the compensation code. And because rollback handlers may need to outlive the immediate <code>step.do()</code> call that registered them, Workflows keeps its own callable reference to the handler for the rollback phase.</p><p>In the common case, when a Workflow enters rollback in the same engine lifetime, Workflows already has the rollback stubs it needs. It can use the durable step history to find eligible steps, then invoke the rollback stubs that were registered during forward execution.</p><p>This gets more subtle when Workflows has to <b>recover</b> after a restart.</p><p>If the engine is evicted, crashes, or restarts while rollback is needed, Workflows still has the durable step history, but it may no longer have the in-memory rollback stubs. To recover, Workflows uses <b>replay</b>: a recovery mode where it can re-run the Workflow code without re-executing completed forward step bodies.</p><p>When replay reaches a completed <code>step.do()</code>, Workflows reads the persisted result instead of running the step body again. For rollback recovery, Workflows only needs to rebuild handlers for steps that had rollback attached and are eligible for rollback. As those <code>step.do() </code>calls are encountered, their rollback options can register the callable stubs again</p><p>That lets Workflows recover the rollback handlers it needs without duplicating the original external side effects.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6F0SOtk10x2op5YxnKKnXM/54f7e763f4ada07e353e8bcac5549833/BLOG-3317_image5.png" />
          </figure><p>With those pieces in place, rollback can work whether the handler is still available in memory or has to be rebuilt during recovery.</p><p>When the workflow is about to fail, Workflows does not ask your application to reconstruct what happened. It already has the step history. It can look at the persisted record and answer the important questions:</p><ul><li><p>Which steps started?</p></li><li><p>Which steps finished?</p></li><li><p>Which failed step may still need cleanup?</p></li><li><p>Which steps registered rollback handlers?</p></li><li><p>What output should each rollback handler receive?</p></li><li><p>What order should compensation run in?</p></li></ul><p>Then Workflows invokes each rollback stub with a rollback context: the original error, the step context, and the step output, if one was persisted.</p><p>The ordering detail matters. In normal JavaScript, especially with <code>Promise.all()</code>, completion order is not always the same as start order. If step A starts first and step B starts second, step B might finish first. For rollback, Workflows uses the persisted start order as the stable source of truth, then unwinds it in reverse.</p><p>Rollback handlers also run through Workflows' normal step machinery. That means compensation gets the same operational properties you expect from Workflows: retries, timeouts, lifecycle events, logs, and a final recorded outcome. If a rollback handler keeps failing after its configured retries, Workflows records the rollback outcome as failed, stops running the remaining rollback handlers, and the Workflow instance ultimately ends in the <code>Errored</code> state.</p><p>This is the main difference between saga rollbacks and a <code>catch</code> block. A <code>catch</code> block only knows what is still in memory at its exact point in your JavaScript execution. Workflows rollback uses persisted step history to decide what already happened, invokes the stubs it already has in the common case, and safely rebuilds missing stubs during recovery when it needs to.</p><p>That is also why the API puts rollback on <code>step.do()</code> itself. Rollback is not a separate global error handler — it is metadata attached to the durable unit of work Workflows already understands.</p>
    <div>
      <h2>What’s next</h2>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>Our first iteration of rollbacks includes: </p><ul><li><p>Explicit per-step rollback handlers for <code>step.do()</code></p></li><li><p>Sequential rollback execution</p></li><li><p>Retry and timeout configuration for compensation</p></li></ul><p>Next, we want to explore:</p><ul><li><p>Rollback support for <a href="https://developers.cloudflare.com/workflows/build/events-and-parameters/#wait-for-events"><code><u>waitForEvent</u></code></a></p></li><li><p>Support for parallel rollback execution</p></li><li><p>Rollback support for <a href="https://developers.cloudflare.com/workflows/python/"><u>Python Workflows</u></a></p></li></ul><p>When a multi-step application fails halfway through, the hardest part is often not knowing <i>that</i> it failed. It is knowing <i>what</i> already happened, and what needs to happen next.</p><p>Saga rollbacks let you put that answer directly beside each step. If you are building multi-step applications with Workflows, try saga rollbacks and tell us what compensation patterns you want next. Get started with the <a href="https://developers.cloudflare.com/workflows/"><u>Workflows documentation</u></a> and share feedback in the <a href="https://community.cloudflare.com/"><u>Cloudflare Community</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[Workflows]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Developers]]></category>
            <guid isPermaLink="false">6BmERiKIIt4pIJoFmNy7Jn</guid>
            <dc:creator>Vaishnav Kavitha</dc:creator>
            <dc:creator>Mia Malden</dc:creator>
            <dc:creator>André Venceslau</dc:creator>
        </item>
        <item>
            <title><![CDATA[How we found a bug in the hyper HTTP library]]></title>
            <link>https://blog.cloudflare.com/hyper-bug/</link>
            <pubDate>Mon, 22 Jun 2026 18:00:00 GMT</pubDate>
            <description><![CDATA[ By rearchitecting the Images binding, we accidentally uncovered a bug that existed in the open-source hyper library across multiple major versions. ]]></description>
            <content:encoded><![CDATA[ <p></p><p>The <a href="https://developers.cloudflare.com/images/"><u>Images</u></a> service, built in Rust on <a href="https://developers.cloudflare.com/workers/"><u>Workers</u></a>, runs on every machine in Cloudflare’s edge network. To handle client connections, we use <a href="https://github.com/hyperium/hyper"><u>hyper</u></a>, an open-source HTTP library for Rust.</p><p>Last year, we <a href="https://blog.cloudflare.com/improve-your-media-pipelines-with-the-images-binding-for-cloudflare-workers/"><u>introduced the Images binding</u></a> to enable custom, programmatic workflows for processing remote images in Workers. At the end of 2025, we rearchitected the binding to provide a more direct, local connection between the Workers runtime and the Images service.</p><p>Shortly after rollout, we received reports that transformation requests from the binding were failing — but only intermittently and only for larger images. Even stranger, the responses for these requests returned a <code>200</code> status without any errors logged. The image data was simply cut short: A response that should have been two megabytes might arrive with a few hundred kilobytes instead.</p><p>We spent six weeks chasing a nearly invisible bug — a race condition that occurred only under specific conditions — in the hyper library that impacted how the Images binding returned processed image data back to the client. In the end, it took four lines of code to fix it.</p>
    <div>
      <h3>Hops, handoffs, and hyper</h3>
      <a href="#hops-handoffs-and-hyper">
        
      </a>
    </div>
    <p>When developers build on Cloudflare, they compose full-stack applications from a set of platform services that are accessible to Workers through bindings. <a href="https://developers.cloudflare.com/workers/runtime-apis/bindings/"><u>Bindings</u></a> provide direct APIs to resources on the Developer Platform like <a href="https://www.cloudflare.com/products/#compute"><u>compute</u></a>, <a href="https://www.cloudflare.com/products/#storage"><u>storage</u></a>, <a href="https://www.cloudflare.com/products/#ai"><u>AI inference</u></a>, and <a href="https://www.cloudflare.com/products/#media"><u>media processing</u></a>.</p><p>The Images binding decouples image optimization from delivery; you can transcode, composite, or manipulate images without needing to return the output as an HTTP response. It also lets you apply <a href="https://developers.cloudflare.com/images/optimization/features/"><u>optimization parameters</u></a> in any order, rather than following the fixed sequence imposed by the <a href="https://developers.cloudflare.com/images/optimization/features/#url-interface"><u>URL interface</u></a>. Here, a worker can pass image data directly to the Images API, chain operations together, and get the processed result back as a stream:</p>
            <pre><code>const result = await env.IMAGES
  .input(image)
  .transform({ width: 800, rotate: 90 })
  .output({ format: "image/avif" });
return result.response();</code></pre>
            <p>At a high level, this is how image data moves through our various services:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/25KzBWgTy4UfdPbzJOg1lO/9c179230f783e9012624a8a9d99b7dac/image4.png" />
          </figure><p><sup><i>The pipe represents a socket connection between the intermediary and Images, where data is handed off from one process to the next through the kernel’s buffer.</i></sup></p><p>The binding communicates with Images through a socket connection managed by the Workers runtime. A socket connection is a communication channel between two processes. Each end of the socket has buffers that are managed by the operating system’s kernel; these buffers are temporary holding areas where data sits after one side writes it but before the other side reads it.</p><p>Hyper manages the connection on the Images service’s side, reading incoming requests from the socket and writing responses back to it.</p><p>When a request uses the Images binding, the Images service reads the input, performs the requested optimization operations, and encodes the result. It then passes the entire encoded image to hyper as a single in-memory block. </p><p>Hyper writes this response data into its own internal buffer. At this point, hyper considers the encoding work as complete, since it has all the bytes that it needs to send. The next step is to flush its internal buffer to the socket’s outbound buffer, moving the data from the Images service to the intermediary on the other end.</p><p>If the reader on the other end is fast, then hyper can flush everything in one pass — the outbound buffer will have room because the reader is consuming data as quickly as it arrives. Once all data is sent, hyper issues a <code>shutdown</code> on the socket, signaling that the connection is finished and no more data will be written. But if the reader is slower (even by a few milliseconds), then the outbound buffer fills up, and hyper needs to wait until there’s room to continue writing.</p>
    <div>
      <h3>Taking the local</h3>
      <a href="#taking-the-local">
        
      </a>
    </div>
    <p>All incoming traffic on Cloudflare's network passes through FL, an internal intermediary service that runs security and performance features and routes requests to the appropriate backend. When we first launched the binding, image data flowed from the Workers runtime, through FL, to the Images service.</p><p>This path was a natural fit for our initial release and follows the same architecture as our URL interface. Over time, though, this coupling with FL became a constraint: Every change to the binding had to follow FL’s release cycle.</p><p>In December 2025, the Images team replaced FL with a new intermediary service, an internal worker binding that runs on the same machine. In the original architecture, data moved through FL over network sockets; this path carried the overhead of FL’s full processing pipeline, such as DNS lookups and routing.</p><p>The internal binding replaced these with Unix sockets to directly connect the services on the same machine, bypassing FL and the overhead of the network stack. This made the request path to Images faster and gave the team independent control over binding releases.</p><p>Within days of the rollout, we received our first customer report.</p>
    <div>
      <h3>200 OK (not OK)</h3>
      <a href="#200-ok-not-ok">
        
      </a>
    </div>
    <p>The first sign of trouble came from a customer with a non-standard setup: two layers of image processing, where one pipeline was nested inside another.</p><p>First, their worker used the Images binding to composite multiple large source images from R2 — a JPEG background plus PNG overlay layers — into a single combined JPEG. Second, they further compressed, transcoded, and resized the result through the URL interface.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2K8AiqgOV7Ka7uUUAnk5x0/ca1cc575e08b103ebb677ef66df7e87b/image2.png" />
          </figure><p><sup><i>The bug originated in the inner pipeline’s return path, where the response was truncated before reaching the outer pipeline.</i></sup></p><p>The inner pipeline (transformation binding) handled compositing. The outer pipeline (transformation URL) handled delivery optimizations like scaling and format conversion. This layered approach meant that when the inner pipeline silently returned a truncated response, the only visible error appeared one level up:</p>
            <pre><code>error reading a body from connection: end of file before message length reached</code></pre>
            <p>The outer pipeline received HTTP <code>200</code> from the inner one, with a <code>Content-Length</code> header that promised several megabytes. The actual body was only a fraction of that: In one request, only ~200 KB arrived out of an expected 3.3 MB. The error surfaced in the outer pipeline, but the truncation could have originated in the binding, the intermediary service, the Images service, or somewhere in between.</p><p>When a browser receives a truncated image, the result is visible. Depending on the format, the image either renders partially (e.g., with the bottom half missing or gray) or fails to decode entirely, instead displaying a broken image.</p>
    <div>
      <h3>Debugging in the dark</h3>
      <a href="#debugging-in-the-dark">
        
      </a>
    </div>
    <p>From here, we worked inward through the request path, testing each layer to isolate where the truncation was happening. Some of these efforts hit dead ends; others left breadcrumbs that narrowed the search:</p><ul><li><p><b>Building a reproduction.</b> We built a worker that mimicked the customer’s nested setup, then stripped away layers until we could trigger the bug with the binding alone. A small script let us fire requests in batches. In one early run, 19 out of 25 requests failed. The amount of data that did arrive — roughly 200 KB — was suspiciously close to the size of the socket buffer in production. This confirmed that the problem wasn’t tied to the customer’s configuration and gave us a reliable way to trigger the bug on demand.</p></li><li><p><b>Investigating timeouts.</b> Early on, we suspected the truncation might be related to timeout behavior (i.e., the connection was being closed after a time limit). This theory didn’t hold, as the truncation wasn’t correlated with request duration.</p></li><li><p><b>Updating hyper version.</b> When the bug was first reported, we were running 0.14.x, while the latest hyper version was around 1.8.x. We tested across hyper versions 0.14, 1.7, and 1.8, just in case the most obvious answer was the correct (and easiest) one. But the bug appeared in each version, which meant that there wasn’t an upstream fix.</p></li><li><p><b>Reproducing locally.</b> We ran local integration tests on macOS and a Debian VM. Even under considerable load, our local requests never triggered any failure. Making direct curl requests to the binding socket and replaying captured requests always seemed to work. The bug only appeared on the full production path when there was real concurrency and a real Workers runtime client on the other end of the socket. This led us to suspect the runtime itself.</p></li><li><p><b>Ruling out the Workers runtime.</b> We examined the HTTP client that the Workers runtime uses to communicate with Images through the binding socket. None of the traces from either side of the connection showed any syscalls that indicated an unexpected close or early termination. We observed that the client behaved correctly and multiple other services used the same client without issues.</p></li><li><p><b>Distributed tracing.</b> By inspecting request traces end-to-end, we confirmed that the truncated body was already present before it reached the outer transformation layer in the customer’s setup. That narrowed the problem to the inner pipeline — the binding path through the Images service.</p></li><li><p><b>Instrumenting the intermediary service.</b> We added instrumentation to the intermediary service to measure body sizes before forwarding the response data. The bodies were already truncated by the time they left the Images service, so the intermediary was ruled out.</p></li><li><p><b>Deeper tracing within the Images service.</b> At the service level, the request was processed, the image was properly encoded, and the response was sent with HTTP <code>200</code>.</p></li></ul><p>The only consistent signal was that the bug was timing-dependent: It appeared only on the production path, with real concurrency, and only for larger images.</p>
    <div>
      <h3>A kernel of truth</h3>
      <a href="#a-kernel-of-truth">
        
      </a>
    </div>
    <p>Tools for application-level debugging told only what the system thought it was doing. But according to the system, everything was fine: Tracing said the response was sent; logging reported no errors, and the Images service returned <code>200</code> on every request.</p><p>To see what the system was actually doing, we attached <code>strace</code> to the Images service. <code>strace</code> records the syscalls that a process makes to the kernel, which could show us exactly which bytes were written, when a shutdown was called, and whether the client sent any termination signal.</p><p>Setting up the trace was delicate. <code>strace</code> works by intercepting syscalls as they happen, which adds a small amount of timing overhead to each one. Filtering for a narrow set of syscalls kept that overhead minimal. Broadening the filter, however, slowed the process just enough to shift the timing between the flush and the shutdown check — and make the bug disappear entirely. That alone reinforced our theory that the issue was timing-sensitive.</p><p>Using a reproduction worker, we triggered the bug and compared the syscall output between successful and failing requests.</p><p>In a successful request, the response is written in chunks as the socket buffer allows, with shutdown called only after all the data is sent. For example, this may look like:</p>
            <pre><code>sendto(42, "HTTP/1.1 200 OK\r\nContent-Length: 14991808\r\n...", ...) = 219264
sendto(42, "\xff\xd8\xff\xe0...", 292352) = 292352
// ... keeps writing until buffer drains ...
sendto(42, "...", 292352) = 292352
shutdown(42, SHUT_WR) = 0</code></pre>
            <p>When we reproduced the bug, a failing request looked like:</p>
            <pre><code>sendto(42, "HTTP/1.1 200 OK\r\nContent-Length: 14991808\r\n...", ...) = 219264
shutdown(42, SHUT_WR) = 0</code></pre>
            <p>Here, there is only one write — just enough for the headers and a sliver of the body — before the shutdown is immediately called. Out of a 14.9 MB response, only about 219 KB was sent. The remaining ~14.8 MB of image data never left hyper’s internal buffer, nor was there any termination signal from the client between the write and the shutdown. Instead, the Images service prematurely shut down the connection on its own, genuinely believing it was finished.</p><p>The failing requests confirmed that the bug was a race condition that triggered intermittently. Whether a request succeeded or failed depended on whether the flush and shutdown operations overlapped, which changed from request to request. When the buffer was still full at the exact moment that hyper decided the connection was finished, data was lost.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7IaYVui9xakhcroOBifwmO/c22c004bcb85c7efe38f318b9db92a3f/image3.png" />
          </figure><p><sup><i>When the reader consumes slower than hyper writes, the outbound buffer fills up. If hyper shuts down the connection before the buffer drains, then only a fraction of the response makes it to the intermediary; this incomplete data gets forwarded back to the Workers runtime and the client.</i></sup></p><p>The December rearchitecture didn't introduce this bug, which had been present in hyper for years across multiple major versions. But the new intermediary changed who was reading on the response side of the socket. Our working theory is that FL, the previous intermediary, consumed data fast enough that the socket buffer rarely filled during a response. The new reader read at a pace that occasionally let the buffer fill during larger responses.</p><p>These few milliseconds of backpressure, introduced by an improvement that made everything else faster, were all it took to surface a flaw that had been hiding in plain sight.</p>
    <div>
      <h3>Inside the dispatch loop</h3>
      <a href="#inside-the-dispatch-loop">
        
      </a>
    </div>
    <p>Hyper's HTTP/1 connection lifecycle is driven by a state machine in a file called <code>dispatch.rs</code>. It runs a loop that reads requests, writes responses, flushes the write buffer to the socket, and decides when to shut down. In simplified form:</p>
            <pre><code>fn poll_loop(&amp;mut self, cx: &amp;mut Context&lt;'_&gt;) -&gt; Poll&lt;Result&lt;(), Error&gt;&gt; {
    loop {
        let _ = self.poll_read(cx)?;
        let _ = self.poll_write(cx)?;
        let _ = self.poll_flush(cx)?;

        if !self.conn.wants_read_again() {
            return Poll::Ready(Ok(()));
        }
    }
}</code></pre>
            <p>More precisely, the <code>let _</code> before <code>poll_flush</code> is where the bug lives.</p><p>In Rust, <code>let _ = expr</code> discards the expression's result, including <code>Poll::Pending</code>, the signal that the flush isn’t done yet. The flush might still have megabytes sitting in its buffer, but the loop never finds out.</p><p>When a request fails, this is the exact sequence of events:</p><ol><li><p>The Images service finishes encoding the image and hands the entire response to hyper as a single in-memory block.</p></li><li><p>Hyper writes the block into its internal buffer and marks its write state as <code>Writing::Closed</code>. From an encoding standpoint, the work is done — there is nothing left to encode.</p></li><li><p>Hyper calls <code>poll_flush</code> to move the buffered data to the socket. In our previous example, the socket accepted about 219 KB. The remaining ~14.8 MB stays in hyper's buffer. The socket is full, so the kernel returns <code>Poll::Pending</code>.</p></li><li><p><code>poll_loop</code> discards the <code>Poll::Pending</code> with <code>let _</code>.</p></li><li><p>It checks <code>wants_read_again()</code>. The full request was already received, so this returns <code>false</code>.</p></li><li><p><code>poll_loop</code> returns <code>Poll::Ready(Ok(()))</code>, signaling that the loop is finished, even though the flush is not.</p></li><li><p><code>poll_shutdown()</code> fires. The <code>SHUT_WR</code> syscall is issued.</p></li><li><p>The client receives 219 KB and an EOF (end-of-file) indicating that the connection is closed, even though it expects 14.9 MB.</p></li></ol><p>In the second step, hyper marks the write operation as complete as soon as the response body is buffered (i.e., when encoding is finished), rather than when it has actually been flushed. Most of the time, the flush completes in a single pass and this distinction is invisible. On the rare occasions when the socket buffer is full, the flush has to wait — even though hyper doesn't. The bytes are still sitting in hyper’s buffer, waiting to be flushed to the socket. Hyper proceeds to shut down the connection with this data still in the buffer.</p><p>This also explains why curl never triggered the bug. Curl reads data as fast as it arrives: The socket buffer never fills, the flush always completes immediately, and the discarded return value is harmless. The production path, with a reader that occasionally paused for a few milliseconds, was the only configuration where the buffer filled at exactly the wrong moment.</p>
    <div>
      <h3>Don’t forget to flush</h3>
      <a href="#dont-forget-to-flush">
        
      </a>
    </div>
    <p>After weeks of investigation, the fix itself was conceptually simple. Hyper needed to check whether the flush was actually done before moving on.</p><p>Our reproduction worker confirmed that the bug existed, but it couldn't tell us why a given request failed. Before writing the fix, we needed a test that could trigger the exact socket conditions inside hyper.</p><p>We knew the conditions that triggered the bug: a socket that accepts one chunk of data and then blocks. To test with a controlled scenario, we built a custom wrapper around a TCP stream that simulated a full socket buffer. The wrapper accepted 8 KB on the first write, then returned <code>Poll::Pending</code> on every subsequent write, mimicking a reader that stopped draining the buffer.</p><p>The test sent a 500 KB response through this constrained socket and checked whether hyper called shutdown while 492 KB was still buffered. Without a fix, it did. With the fix, it waited.</p><p>Initially, we applied the fix in hyper’s dispatch loop. Instead of discarding the result of <code>poll_flush</code>, we checked to see whether the flush was actually done:</p>
            <pre><code>let flush_result = self.poll_flush(cx)?;

if flush_result.is_pending() {
    return Poll::Pending;
}

if !self.conn.wants_read_again() {
    return Poll::Ready(Ok(()));
}</code></pre>
            <p>If the flush hasn't completed, then the loop returns <code>Poll::Pending</code> to the asynchronous runtime. The runtime waits for the socket to become writable, then wakes the task back up to continue the flush. The connection shuts down only after all data has been sent.</p><p>When we deployed this fix, we observed that every byte was written and the shutdown was called only after the buffer was actually empty. The customer who made the first report also confirmed that the issue disappeared.</p><p>While our initial solution worked, the dispatch loop wasn’t the right place for the fix. Returning <code>Poll::Pending</code> early could slow down other operations on the same connection by reducing how frequently reads are polled, causing unintended backpressure. It also doesn't correctly handle keepalive connections, where a single connection handles multiple requests in sequence — these should remain reusable even while the previous response is still being flushed. Neither issue affected our particular service (where keepalive is disabled), but both could affect other hyper users if the fix were contributed upstream.</p><p>We traced through hyper's connection lifecycle and found a more targeted approach. Rather than changing how the dispatch loop behaves, we applied the fix at the point where shutdown is actually called. Before shutting down the socket, hyper should first flush any remaining data in its buffer:</p>
            <pre><code>pub(crate) fn poll_shutdown(
    &amp;mut self,
    cx: &amp;mut Context&lt;'_&gt;,
) -&gt; Poll&lt;io::Result&lt;()&gt;&gt; {
    ready!(self.poll_flush(cx)?);
    Pin::new(&amp;mut self.io).poll_shutdown(cx)
}</code></pre>
            <p>This leaves the dispatch loop unchanged. It adds a flush only at the exact point where data loss would otherwise occur — the moment before shutdown.</p>
    <div>
      <h3>What stayed with us</h3>
      <a href="#what-stayed-with-us">
        
      </a>
    </div>
    <p>None of the tools at the application level surfaced any errors, crashes, or log entries that provided useful clues. Application-level observability can have a blind spot for bugs that live below its awareness.</p><p>The failure occurred intermittently, scaled with response size, couldn’t be reproduced with simple tools like curl, and disappeared when we observed the system more closely. These signals pointed to a timing-dependent bug in the connection layer, not in the application logic.</p><p>Our breakthrough came from using kernel-level tooling with <code>strace</code>, the one layer that records what actually happened on the socket. The underlying bug lived in the few milliseconds between a partial flush and a premature shutdown — a window that opened only after we made the system faster.</p><p>We merged our fix and the deterministic test into <code>hyperium/hyper</code> via <a href="https://github.com/hyperium/hyper/pull/4018"><u>PR #4018</u></a>. It will be available in a future hyper release, ensuring that any service using hyper’s HTTP/1 implementation won’t lose response data to the same race condition.</p><p>In the meantime, we’re running an internal fork with the patch applied. This fix stabilized the binding’s architecture, creating a reliable foundation to expand its functionality.</p><p>The Images binding initially covered only transformations of remote images. Earlier this month, we announced that <a href="https://developers.cloudflare.com/changelog/post/2026-06-10-hosted-images-binding/"><u>the Images binding now supports operations for hosted images</u></a>, giving developers a unified way to build media-rich applications on Cloudflare.</p><p>Read more about how the binding works in <a href="https://developers.cloudflare.com/images/storage/binding/"><u>our documentation</u></a>.</p><p>
</p> ]]></content:encoded>
            <category><![CDATA[Image Optimization]]></category>
            <category><![CDATA[Cloudflare Images]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Open Source]]></category>
            <guid isPermaLink="false">5X4Vz9KDcjIBXgHIFSuJFZ</guid>
            <dc:creator>Deanna Lam</dc:creator>
            <dc:creator>Diretnan Domnan</dc:creator>
            <dc:creator>Matt Lewis</dc:creator>
        </item>
        <item>
            <title><![CDATA[Temporary Cloudflare Accounts for AI agents]]></title>
            <link>https://blog.cloudflare.com/temporary-accounts/</link>
            <pubDate>Fri, 19 Jun 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ The moment an agent needs to deploy something, it slams face-first into a wall built for humans. Today we're rolling out Temporary Accounts on Cloudflare Workers. Any agent can now run wrangler deploy — temporary and get a live Worker in seconds. ]]></description>
            <content:encoded><![CDATA[ <p>Everyone's writing code with AI agents today. But the moment an agent needs to deploy something — and needs to sign up and create an account — it slams face-first into a wall built for humans: a browser-based OAuth flow, a dashboard to click through, an API token to copy-paste, a multi-factor authentication prompt to satisfy. For an interactive copilot sitting next to a developer, that's annoying. For a background agent, it's a hard stop.</p><p>Today we're rolling out Temporary Cloudflare Accounts for Agents.</p><p>Agents can now deploy <a href="https://www.cloudflare.com/solutions/frontends/"><u>websites</u></a>, <a href="https://www.cloudflare.com/products/workers/"><u>APIs</u></a>, and <a href="https://www.cloudflare.com/products/agents/"><u>agents</u></a> right away, without first needing to sign up for an account.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6joSXmAJAARgyGphPhXGla/9c3e8cef6f269df6a9c458495c5fe40b/image2.png" />
          </figure><p>Any agent can now run <code>wrangler deploy --temporary </code>and deploy a Worker to Cloudflare. This temporary deployment stays live for 60 minutes, during which time you can claim the temporary account, making it permanently your own. If you don't, it expires on its own.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7IKqILTXCvDH4QjNFtx2pp/798c00b190c9b9909a4501f29b8c18bb/image4.png" />
          </figure><p>Our goal? Let your agent code and ship.</p>
    <div>
      <h2>Why frictionless deployments matter for AI agents</h2>
      <a href="#why-frictionless-deployments-matter-for-ai-agents">
        
      </a>
    </div>
    <p>Frictionless temporary accounts matter more than it might first seem:</p><ul><li><p><b>Background AI sessions have no human in the loop, and are becoming the norm</b>. Any auth step that needs a browser, a copy-paste, or "click here in 60 seconds" means an agent gets stuck and may choose to deploy elsewhere.</p></li><li><p><b>Trial-and-error is the agent's superpower</b>. Agents need a tight write → deploy → verify loop. They need cheap, throwaway deployment targets, so they can curl their own output and decide whether they got it right.</p></li><li><p><b>Agent platforms are building their own ways for deploying code to "just work" without extra steps or credentials</b>. People are starting to expect that this process just works, without the need to sign up for other services that they've not used before or heard of.</p></li></ul>
    <div>
      <h2>How it works</h2>
      <a href="#how-it-works">
        
      </a>
    </div>
    <p>Temporary accounts are built around <a href="https://developers.cloudflare.com/workers/wrangler/"><u>Wrangler</u></a>, our Developer Platform command-line interface (CLI) tool that lets developers bootstrap new projects, manage their configurations and resources, and deploy and update them.</p><p>Wrangler usage is widely <a href="https://developers.cloudflare.com/workers/examples/"><u>documented</u></a> online and agents know how to use it very well. But if you hadn’t yet signed in and granted Wrangler permission to your Cloudflare account, when the agent tried to deploy, it would get stuck at the sign-up and authentication step. And you might rightly ask: How do agents and LLMs know that this new <code>--temporary</code> flag in Wrangler exists, so that they actually use it without a human explicitly telling them to do so?</p><p>To solve this, we updated Wrangler to prompt the agent with a message that tells it about the <code>--temporary</code> flag:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4pF05TQ9S8HXvOmwD1zoG0/d99ae0ec7d82f26f85dbd5957759697e/image3.png" />
          </figure><p>When the agent discovers this, and then runs <code><b>wrangler deploy</b></code> again with the <code>--temporary</code> flag, Cloudflare provisions a temporary account for the agent to use, gives Wrangler an API token to work with, and provides a claim URL that the agent can give back to the human.</p>
    <div>
      <h2>Let’s go over every step of the flow</h2>
      <a href="#lets-go-over-every-step-of-the-flow">
        
      </a>
    </div>
    
    <div>
      <h3>Deploying and iterating on a new project</h3>
      <a href="#deploying-and-iterating-on-a-new-project">
        
      </a>
    </div>
    <p>Make sure you’re using the latest <a href="https://github.com/cloudflare/workers-sdk/releases"><u>Wrangler release</u></a>, fire up your favorite coding agent, and write a prompt to deploy a "hello world" app in build mode:</p><blockquote><p><code>Make a very simple hello world Cloudflare Worker in TypeScript and deploy it using wrangler, don't ask me questions, do the best you can</code></p></blockquote><p>The agent will run wrangler, pick up the <code>--temporary</code> flag from the output messages, build your script, and deploy it instantly, no human in the loop required:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1Okmqkf7nBH7Hu2wsKWmnc/879cb6025e630bbd8a99b546de935828/image7.png" />
          </figure><p>As you can see, the agent wrote the script, deployed it using the <code>--temporary</code> flag, curled the preview link it got from the output, and verified that the result matches the code.</p><p>This is great, but agentic coding is often not about one single deployment. A session can go through a cycle of multiple code changes. This is not a problem: the agent can iterate on the Worker script and redeploy the changes as many times as it wants (within the 60-minute claim window). Type this prompt:</p><blockquote><p><code>Now change hello world to "hello cloudflare" and redeploy</code></p></blockquote><p>Look at the agent changing the source code, reusing the previously created temporary account, redeploying a new version and rechecking the result:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3W86UmPyW5oQxpGkmEEJzL/6168e21df00111aedab481fee5b0a313/image5.png" />
          </figure>
    <div>
      <h3>Claiming the account</h3>
      <a href="#claiming-the-account">
        
      </a>
    </div>
    <p>At any point, you can claim the temporary account and make it yours permanently. When you click the claim link you will be taken to a page where you can either sign up for or sign in to Cloudflare, and then claim the temporary account that your Worker was deployed to. This includes claiming not just Workers, but resources like databases and other bindings, too.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3rrCw7HiJJxy6k3eyipWLR/5aca8cdcb2fc391ea27bab5f335449b0/image6.png" />
          </figure><p>If you do not claim these temporary accounts within 60 minutes, they will be automatically deleted.</p>
    <div>
      <h2>The road to frictionless agentic deployments </h2>
      <a href="#the-road-to-frictionless-agentic-deployments">
        
      </a>
    </div>
    <p>This is just one way we’re eliminating the signup barrier for agents. We recently <a href="https://blog.cloudflare.com/agents-stripe-projects/"><u>announced a partnership with Stripe</u></a> and a new protocol we co-designed that lets agents provision Cloudflare on behalf of their users — creating an account, starting a subscription, registering a domain, and getting an API token to deploy code, with no copy-pasting tokens or entering credit card details. Last month, we collaborated with WorkOS on the launch of <a href="https://workos.com/auth-md"><u>auth.md</u></a>, which anyone can adopt, to let agents provision new accounts using well-established, existing OAuth standards. </p><p>There’s a ton going on in this space, and we’re excited to keep making it easier for agents to use Cloudflare, and for developers to make their own apps <a href="https://isitagentready.com/"><u>agent-ready</u></a>. Temporary accounts are one more step toward frictionless agentic deployments — stay tuned for more. </p><p>Temporary accounts have some limitations, and their capabilities may change over time; check the <a href="https://developers.cloudflare.com/workers/platform/claim-deployments/"><u>developer documentation</u></a> for more information and then go build something. Point your agent at Cloudflare, see how far it gets, and tell us what we can improve or what delights you — share what you’ve built on <a href="https://x.com/CloudflareDev"><u>X</u></a> or hop into the <a href="https://community.cloudflare.com/"><u>Cloudflare Community</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[Agents]]></category>
            <category><![CDATA[Wrangler]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <guid isPermaLink="false">2EKaG8rvH4FLnpEQVgvJ4I</guid>
            <dc:creator>Sid Chatterjee</dc:creator>
            <dc:creator>Celso Martinho</dc:creator>
            <dc:creator>Brendan Irvine-Broque</dc:creator>
        </item>
        <item>
            <title><![CDATA[Bringing more agent harnesses and frameworks to Cloudflare, starting with Flue]]></title>
            <link>https://blog.cloudflare.com/agents-platform-flue-sdk/</link>
            <pubDate>Wed, 17 Jun 2026 19:35:00 GMT</pubDate>
            <description><![CDATA[ The Agents SDK is now a runtime any agent framework can build on. Today we're opening up the Agents SDK primitives, with Flue as a first framework targeting Agents SDK, and rolling out agents in the dashboard. ]]></description>
            <content:encoded><![CDATA[ <p></p><p>2026 is the year agent harnesses go to production. The software that controls the model’s access to the outside world — harnesses like Codex, Claude Code, OpenCode, Pi, and Project Think — has matured to the point where teams are deploying agents as real, load-bearing infrastructure, not just prototypes. </p><p>But building agents that survive production is hard.</p><p>We learned this firsthand building <a href="https://blog.cloudflare.com/project-think/"><u>Project Think</u></a> as our first-party agent harness. In working with our customers to run agents in production, we found a common set of distributed systems problems that every agent faces when running in the cloud. When an agent is interrupted, how can it automatically and gracefully resume from where it left off, without losing context or wasting tokens? How can agents run untrusted code securely? How can agents use the tools they were trained for?</p><p>A harness can’t solve these problems on its own. They’re tied to state, storage and compute — which means they’re dependent on the platform the agent runs on. That’s why we’re taking our learnings from hardening <a href="https://blog.cloudflare.com/project-think/"><u>Project Think</u></a> for production and bringing them to the <a href="https://developers.cloudflare.com/agents/"><u>Cloudflare Agents SDK</u></a> as a base layer. Durable execution, dynamic code execution, a durable filesystem and dynamic workflows, now available to any harness building on Agents SDK.</p><p>At the same time, a new layer has emerged above the harness. Frameworks like <a href="https://flueframework.com/">Flue</a> wrap a harness with the project structures, conventions, integrations and developer experience that make agents productive to build. </p><p>To solve these scaling challenges, there’s a new, three-layer stack that is emerging for building production-grade AI. Here is how the pieces fit together, moving from the user-facing developer experience down to the underlying platform primitives: </p><ul><li><p><b>The framework (Flue)</b> — the project structure, the conventions, the integrations, the CLI and the developer experience for building agents.</p></li><li><p><b>The harness</b> <b>(Pi, Project Think)</b> —  the agentic loop that calls tools, reads results, manages context and keeps going until the task is done.</p></li><li><p><b>The runtime/platform</b> <b>(the Cloudflare Agents SDK)</b> — the compute, state, and storage primitives everything above depends on</p></li></ul><p>The Agents SDK is that bottom layer: it makes primitives like durable execution available to any harness and any framework. Flue, our new open-source framework from the team behind <a href="https://astro.build/"><u>Astro</u></a>, is the first to build on it. Here’s how. </p>
    <div>
      <h2>Flue</h2>
      <a href="#flue">
        
      </a>
    </div>
    <p><a href="https://flueframework.com/"><u>Flue</u></a> shipped 1.0 Beta this week, built on the <a href="https://pi.dev/"><u>Pi</u></a> harness, the same harness that <a href="https://openclaw.ai/"><u>OpenClaw</u></a> is built on. What makes it different as an agent framework is the approach: you don’t script what your agent does, you describe what it knows. Define the context an agent needs — its model, skills, sandbox, and instructions — and it solves whatever task you give it, autonomously. There’s no orchestration loop to write.</p><p>This declarative model is what makes writing agents easy: here’s a triage agent that intercepts a bug report, reproduces it in a sandbox, and diagnoses the issue in under 25 lines.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5U50mjNZpg7RD3fb0pOLLs/5e20ad63c1e6ab92cf5d334b79996c1a/image5.png" />
          </figure>
    <div>
      <h3>The Flue developer experience</h3>
      <a href="#the-flue-developer-experience">
        
      </a>
    </div>
    <p>Flue’s power comes from the fact that agents don’t live in isolation. They are built to exist where your users already work, and integrate with your preferred tooling:</p><ul><li><p><b>Anywhere agents</b>: Drop your agents into Slack, GitHub, Linear, or Discord with pre-configured Channels that handle event verification and dispatch boilerplate automatically.</p></li><li><p><b>Headless, but UI-ready</b>: Agents shouldn’t live in a black box. Flue agents can run completely headlessly for background tasks, but <a href="https://flueframework.com/blog/flue-1-0-beta/#introducing-fluereact"><u>@flue/react</u></a> provides native frontend hooks that stream an agent’s state, tool execution, and live messages straight into your frontend application, without you having to build custom real-time plumbing from scratch.</p></li><li><p><b>Ecosystem-ready</b>: Flue makes it easy to add and upgrade integrations with commands like <code>flue add channel slack</code>, generating a Markdown blueprint that your own coding agent can read, modify, and cleanly integrate straight into your codebase.</p></li></ul>
    <div>
      <h3>Designed for production, not just prototypes</h3>
      <a href="#designed-for-production-not-just-prototypes">
        
      </a>
    </div>
    <p>Moving an agent out of a local terminal and into a production ecosystem introduces traditional distributed systems failures. Host crashes, API timeouts from LLM providers, and unexpected restarts threaten to erase the short-term memory of a running agent turn. </p><p>Flue solves this via Durable Streams. Each event in the execution history is added to an append-only log. By processing every prompt, tool response and model choice as an unchangeable ledger, an agent’s state is never volatile. If a process dies, another simply picks up the log and continues from the exact step it left off. </p>
    <div>
      <h3>Deploy anywhere, including Cloudflare</h3>
      <a href="#deploy-anywhere-including-cloudflare">
        
      </a>
    </div>
    <p>Flue is a multi-cloud framework. On <a href="http://node.js"><u>Node.js</u></a>, each agent runs as a long-lived process. You can deploy it to any VM or container, run it in GitHub Actions, or embed it on an existing server. But when you target Cloudflare, each agent becomes a Durable Object.</p><p>By running each Flue agent inside its own Durable Object, Cloudflare can automatically scale to as many agents as you need, each with their own isolated storage and compute. You don’t have to provision servers, manage sticky sessions, or worry about noisy neighbors. And when Flue agents are deployed to Cloudflare, they get durable execution using Agents SDK’s <code>runFiber()</code>, <code>stash()</code>, and <code>onFiberRecovered()</code> methods. Flue also uses <code>@cloudflare/codemode</code> and <code>@cloudflare/shell</code> for sandboxed code execution against a durable workspace. </p>
    <div>
      <h2>What harnesses need out of an agentic platform</h2>
      <a href="#what-harnesses-need-out-of-an-agentic-platform">
        
      </a>
    </div>
    <p>Flue’s Cloudflare target works so effectively because it maps cleanly to the core primitives we built into the Agents SDK. You can even <a href="https://github.com/withastro/flue/blob/0fd59475d303d8e8d5bc184a9d3fc4ed58c0de93/packages/runtime/src/session.ts#L13"><u>dig into the Flue source code</u></a> to understand how Pi, the underlying harness, is adapted to work on Cloudflare Agents SDK.</p><p>Here’s how Flue leverages the Agents SDK under the hood, and what it takes to run any modern agent harness reliably at scale. </p>
    <div>
      <h3>Every agent harness needs durable execution</h3>
      <a href="#every-agent-harness-needs-durable-execution">
        
      </a>
    </div>
    <p>An agent turn is not a single request. The model streams tokens, calls tools, waits for results, maybe asks a human for approval, or delegates work to a subagent. That sequence can take seconds or minutes, and at any point the process can be interrupted or crash. When that happens, all of the agent state that was in memory is gone: the streaming connection, the pending tool calls, where the agent was in its turn. Sure, the conversation history is persisted on disk, but the user sees a spinner that never resolves. That’s a broken user experience.</p><p><a href="https://developers.cloudflare.com/agents/runtime/execution/durable-execution/"><u>Fibers</u></a> solve this problem by providing a native checkpointing mechanism directly inside the Agent’s underlying <a href="https://developers.cloudflare.com/durable-objects/"><u>Durable Object</u></a>. <code>runFiber()</code> records the progress to the Durable Object’s SQLite storage before the work in the Agent turn starts and checkpoints with <code>stash()</code> as the turn advances. When a fresh agent instance boots after an interruption, <code>onFiberRecovered()</code> delivers the last checkpoint, so your agent knows a turn was interrupted, where it got to, and can decide how to continue. </p>
            <pre><code>import { Agent } from "agents";
import type { FiberRecoveryContext } from "agents";

class MyAgent extends Agent {
  async doWork() {
    await this.runFiber("my-task", async (ctx) =&gt; {
      const step1 = await expensiveOperation();
      ctx.stash({ step1 });

      const step2 = await anotherExpensiveOperation(step1);
      this.setState({ ...this.state, result: step2 });
    });
  }

  async onFiberRecovered(ctx: FiberRecoveryContext) {
    if (ctx.name !== "my-task") return;

    const { step1 } = (ctx.snapshot ?? {}) as { step1?: unknown };
    if (step1) {
      const step2 = await anotherExpensiveOperation(step1);
      this.setState({ ...this.state, result: step2 });
    }
  }
}</code></pre>
            <p>Flue uses <code>runFiber()</code> <a href="https://github.com/withastro/flue/blob/cf775a84e7cf1d7037a464b947d8b2f7e9efcfd1/packages/runtime/src/cloudflare/agent-coordinator.ts#L475"><u>on its Cloudflare target</u></a> for exactly this. With the <code>onFiberRecovered()</code> hook, your harness can decide how to resume the execution of the turn, whether it attempts a full reconstruction model like Project Think that repairs turn state or whether it replays certain parts of the turn. </p>
    <div>
      <h3>Executing code is better than overloading agents with tools</h3>
      <a href="#executing-code-is-better-than-overloading-agents-with-tools">
        
      </a>
    </div>
    <p>Agent harnesses give models access to the outside world through tools. But tool surfaces grow fast, and models get worse at selecting the right tool as the list gets longer and the context window fills up with tool definitions. A better pattern: give the model one tool that executes code. The model writes a TypeScript function that calls the APIs it needs, and the harness runs it. We wrote about this when we introduced <a href="https://blog.cloudflare.com/code-mode"><u>Code Mode</u></a>.</p><p>The question is where that code runs. To run LLM-generated code securely, you need a sandbox. But typical sandboxes would be slow, cost-prohibitive and inefficient to run each tool call. That’s why the Agents SDK provides <a href="https://www.npmjs.com/package/@cloudflare/codemode"><code><u>@cloudflare/codemode</u></code></a>, which wraps <a href="https://developers.cloudflare.com/dynamic-workers/"><u>Dynamic Workers</u></a>, to execute LLM-generated code in its own Worker isolate with only the bindings you provide. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1NCq9AP9xL1b70WyB0iqE0/7569fc7ee43b6071f0b98945d872e9c7/BLOG-3336_3.png" />
          </figure><p>Code Mode creates a fresh Dynamic Worker for each snippet, runs it, and discards it. Isolates start in under 10ms and $0.002 per load, resulting in drastically faster and cheaper cost of execution than booting a container every time your agent needs to execute a short piece of code. Flue uses <a href="https://www.npmjs.com/package/@cloudflare/codemode"><code><u>@cloudflare/codemode</u></code></a> on its Cloudflare target to power its code tool. The agent writes JavaScript against the workspace and runs it with Code Mode.</p>
    <div>
      <h3>You don’t need a full container for most workspace tasks</h3>
      <a href="#you-dont-need-a-full-container-for-most-workspace-tasks">
        
      </a>
    </div>
    <p>Agent harnesses often need a filesystem, whether it’s to read files, write outputs, search through code and understand diffs. Coding agents in particular live in the filesystem. But if the harness is running in a serverless environment, how can it get a durable filesystem that persists across executions? </p><p>The usual answer is a container. That works, but it’s expensive for what agents mostly do. The majority of filesystem operations in an agent turn are text. Consider a review agent that reads files, greps through source code, or perhaps writes a patch. You don’t need a full Linux boot for that.</p><p><a href="https://github.com/cloudflare/agents/tree/main/packages/shell"><code><u>@cloudflare/shell</u></code></a> gives your agent a durable virtual filesystem inside its Durable Object, backed by SQLite. It provides typed file operations — read, write, edit, search, grep, diff — that agent harnesses can use as tools.</p><p>Instead of calling individual tools, a Flue agent running on the Cloudflare target writes JavaScript against the workspace virtual file state API. By running more operations within the Durable Object, the agent benefits from the isolate model’s more efficient execution process, entirely avoiding container overhead:</p>
            <pre><code>async () =&gt; {
  const files = await state.glob("src/**/*.ts");
  const results = [];
  for (const file of files) {
    const content = await state.readFile(file);
    const todos = content.match(/\/\/ TODO:.*/g);
    if (todos) results.push({ file, todos });
  }
  return results;
}</code></pre>
            <p>This translates into a faster and more cost-efficient sandbox environment for agents that need to run shell and filesystem operations to get their work done. And for agents that need a full OS, to run npm install, git, or compilers, Cloudflare Containers provides that. We’re also building <a href="https://github.com/cloudflare/workspace"><code><u>@cloudflare/workspace</u></code></a>, to keep the virtual file system of a given Durable Object in sync with a container’s, allowing for seamless transition from lightweight Workers to a Linux environment only when it needs one. </p>
    <div>
      <h3>Dynamic Workflows: let agents write their own workflows to repeat tasks consistently</h3>
      <a href="#dynamic-workflows-let-agents-write-their-own-workflows-to-repeat-tasks-consistently">
        
      </a>
    </div>
    <p>But what happens when an agent needs to do more than read files or execute single code snippets? What happens when it needs to orchestrate a massive, multi-step pipeline that must repeat consistently over time, like a code review that successfully resolves bugs or a research workflow that produces good results? A harness can’t provide durable multi-step execution on its own. It needs the platform to persist each step, retry failures, and resume after interruptions. </p><p>This pattern is gaining traction. Claude Code recently shipped <a href="https://code.claude.com/docs/en/workflows"><u>dynamic workflows</u></a>, where Claude writes a JavaScript script at runtime to hand off work to dozens of subagents, and the runtime executes it durably. <a href="https://developers.cloudflare.com/dynamic-workers/usage/dynamic-workflows/"><u>@cloudflare/dynamic-workflows</u></a> provides this for any harness running on the Agents SDK. Your agent generates a workflow at runtime, and the Workflows engine persists each step, retries failures, and can sleep for hours or wait for external events like human approval. </p><p>From the Agent class, <code>runWorkflow()</code> connects your agent to the <a href="https://developers.cloudflare.com/workflows/"><u>Workflows</u></a> engine. The agent kicks off the workflow and can go to sleep. The workflow calls back into the agent via RPC to report progress, update state, or request approval. When the workflow finishes, the agent wakes up with the result. </p>
    <div>
      <h3>Direct access to the Cloudflare ecosystem</h3>
      <a href="#direct-access-to-the-cloudflare-ecosystem">
        
      </a>
    </div>
    <p>Beyond compute and storage, agent harnesses need access to external capabilities: web browsing, email, memory, search, inference. A harness shouldn't have to integrate each of these separately, manage API keys for each, or worry about credentials leaking through agent-generated code.</p><p>The Agent class gives your harness access to the rest of Cloudflare through <a href="https://developers.cloudflare.com/workers/runtime-apis/bindings/"><u>bindings</u></a>: <a href="https://developers.cloudflare.com/ai-gateway/"><u>AI Gateway</u></a> for per-agent spend tracking and limits, <a href="https://developers.cloudflare.com/browser-rendering/"><u>Browser Run</u></a> for web automation, <a href="https://developers.cloudflare.com/email-service/"><u>Email Service</u></a> for inbox workflows, <a href="https://developers.cloudflare.com/agents/api-reference/agent-memory/"><u>Agent Memory</u></a> for persistent recall, <a href="https://developers.cloudflare.com/ai-search/"><u>AI Search</u></a> for retrieval, <a href="https://developers.cloudflare.com/containers/"><u>Containers</u></a> for workloads that need a full OS, and inference across <a href="https://blog.cloudflare.com/ai-platform/"><u>14+ model providers</u></a>. Bindings grant capabilities without exposing credentials: your agent uses them, but the keys never enter agent-generated code.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7jLx08K0FuCd8mJ4xIjwop/3833967f110d119a260979608cc57fce/BLOG-3336_4.png" />
          </figure>
    <div>
      <h2>Bring your agents to the agentic cloud</h2>
      <a href="#bring-your-agents-to-the-agentic-cloud">
        
      </a>
    </div>
    <p>We know this approach works because it is the exact architectural foundation we used to build <a href="https://blog.cloudflare.com/project-think/"><u>Project Think</u></a>, our first-party agent harness. While Project Think remains our highly optimized, out-of-the-box solution for native Cloudflare agent experiences, the Agents SDK ensures that the broader open-source ecosystem can leverage those exact same battle-tested primitives, including Flue.</p><p>If you're building agents today with Flue, you can deploy in just a few clicks to Cloudflare. And if you're building your own agent harness or you’re building an agent framework, target the Agents SDK and get the platform integration for free.</p><ul><li><p>Agents SDK: <a href="https://developers.cloudflare.com/agents/"><u>developers.cloudflare.com/agents</u></a></p></li><li><p>Flue: <a href="https://flueframework.com/"><u>flueframework.com</u></a>, <a href="https://www.npmjs.com/package/@flue/runtime"><code><u>npm install @flue/runtime</u></code></a></p></li><li><p>Think: <a href="https://github.com/cloudflare/agents/blob/main/docs/think/index.md"><u>docs</u></a></p></li><li><p>Cloudflare Community: <a href="https://community.cloudflare.com/"><u>community.cloudflare.com</u></a></p></li></ul><p></p> ]]></content:encoded>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Durable Objects]]></category>
            <category><![CDATA[SDK]]></category>
            <category><![CDATA[MCP]]></category>
            <guid isPermaLink="false">48yPgdpb7RGTBdh4DPPCUF</guid>
            <dc:creator>Thomas Gauvin</dc:creator>
        </item>
        <item>
            <title><![CDATA[Announcing Claude Managed Agents on Cloudflare]]></title>
            <link>https://blog.cloudflare.com/claude-managed-agents/</link>
            <pubDate>Tue, 19 May 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ Cloudflare has integrated with Anthropic's Claude Managed Agents to provide a fast, isolated execution environment for autonomous code delivery. This means builders can scale agent workflows globally while strictly controlling access to private backends and easily customizing their agent’s tools and runtimes. ]]></description>
            <content:encoded><![CDATA[ <p></p><p>Cloudflare and Anthropic have collaborated to integrate Claude Managed Agents with Cloudflare Sandboxes. Our <a href="https://developers.cloudflare.com/sandbox/tutorials/claude-managed-agents/"><u>new integration</u></a> gives you more control over your agent sandboxes, secures connections to private services, and improves observability.</p><p>In the past year, Cloudflare’s Developer Platform has expanded to give more developers the tools they need to run agents at scale. This includes:</p><ul><li><p><a href="https://developers.cloudflare.com/sandbox/"><u>Sandboxes</u></a> for full stateful Linux microVMs at scale</p></li><li><p><a href="https://developers.cloudflare.com/agents/"><u>Agents SDK</u></a>, providing simple and customizable agent framework</p></li><li><p><a href="https://developers.cloudflare.com/browser-run/"><u>Browser Run</u></a>, which gives agents fully programmable and observable browsers</p></li><li><p><a href="https://developers.cloudflare.com/dynamic-workers/"><u>Dynamic Workers</u></a>, allowing for dynamic sandboxed code execution at massive scale</p></li></ul><p>Our goal is to make Cloudflare the simplest, most secure, and most programmable cloud for agents.</p><p>Integrating with Claude Managed Agents is another step in this direction. You can run your agent loop on the Claude Platform, while using Cloudflare to execute code, secure connections, and run custom tool calls.</p><p>To get going in just minutes, we’ve created a <a href="https://github.com/cloudflare/claude-managed-agents"><u>default deployment template </u></a>that gives you the following:</p><ul><li><p><b>Enhanced security</b> - Run all agent traffic through customizable proxies. This allows you to securely inject credentials, prevent data exfiltration, and better observe how your agents interact with the outside world.</p></li><li><p><b>Sandbox control and observability</b> - Get detailed sandbox metrics and logs. SSH into running machines. Customize sandbox images.</p></li><li><p><b>Lightweight sandboxes</b> - Writing and executing untrusted code can be done in a traditional microVM or a <a href="https://blog.cloudflare.com/dynamic-workers/"><u>lightweight isolate</u></a>. This lets you hit massive scale, boot sandboxes in milliseconds, and minimize infrastructure spend.</p></li><li><p><b>Private service connectivity</b> - Connect agents to private internal services without ever exposing them to the Internet.</p></li><li><p><b>Browser Control and Observability</b> - Get an audit trail of every agent’s browser sessions, including session recording and human-in-the-loop flows.</p></li><li><p><b>Email </b>- Give each of your agents its own email address and ability to send emails.</p></li><li><p><b>Custom tools</b> - Extend your agents with tools without needing additional infrastructure. Just write functions and deploy.</p></li></ul><p>You get all of this out of the box when deploying the integration, and you can easily customize if you need more.</p><p>Let’s take a brief look at Claude Managed Agents, see how to integrate a Cloudflare-based environment, then explore how to get the most out of Claude on Cloudflare.</p>
    <div>
      <h2>An overview of Claude Managed Agents</h2>
      <a href="#an-overview-of-claude-managed-agents">
        
      </a>
    </div>
    <p><a href="https://www.anthropic.com/engineering/managed-agents"><u>Claude Managed Agents</u></a> allow developers to easily define and run agents on the Anthropic platform. In these managed environments, Claude can read files, run commands, browse the web, and execute code. The harness supports built-in prompt caching, compaction, and various agent-first performance optimizations.</p><p>Until now, using Claude Managed Agents has meant running the entire stack on Anthropic-provided infrastructure. While this is great for some developers, others may need more control over their infrastructure choice, whether this is for security, compliance, or performance reasons. Self-managed environments for Claude Agents provide just that.</p><p>Anthropic describes this as “decoupling the brain from the hands.” The core agent loop runs in Anthropic (the “brain”), but the infrastructure for running and executing code (the “hands”) can be run anywhere, including Cloudflare.</p>
    <div>
      <h2>The Cloudflare environment</h2>
      <a href="#the-cloudflare-environment">
        
      </a>
    </div>
    <p>Our new integration gives your agents a Cloudflare-based environment for running and executing code within minutes.</p><p>Follow <a href="https://github.com/cloudflare/claude-managed-agents#onboarding-guide"><u>the onboarding guide</u></a> to get started. Then fork the repo and customize your integration as you see fit.</p><p>After setup, when a Claude Agent starts a session, it sends a message to your new Cloudflare-based control plane. The <a href="https://developers.cloudflare.com/workers/"><u>Workers</u></a>-based control plane gives each agent session a sandboxed environment for executing code, developing applications, running CLI tools, and more. State is automatically persisted across session sleeps.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6nYyreZYWyaKo6Xf2XjmtM/5cc21576a83e223a2cf67e8ed3fde928/image5.png" />
          </figure><p><sup><i>Sandboxes write files and execute code in response to the Claude-based Agent loop</i></sup></p><p>You can optionally configure sandbox instance sizes or customize the container image that runs within VM-based sandboxes. Each sandbox can be observed in the Cloudflare dashboard, sandbox logs can be queried or shipped to external providers like Datadog or Splunk, and the control plane ships with a built-in UI, making it easy to track the state of sandboxes or SSH into specific machines.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/46pEZh7MeAt27wLi7c9e6S/889b269b94db32c3188e3dff5b6cbd75/image3.png" />
          </figure><p><sup><i>Get interactive shell sessions into your agent’s sandbox</i></sup></p>
    <div>
      <h2>Enabling agents at Internet scale</h2>
      <a href="#enabling-agents-at-internet-scale">
        
      </a>
    </div>
    <p>What if your agent backend booted in a few milliseconds, and you didn’t have to pay for the resources of a full VM when running the agent?</p><p>The industry needs <a href="https://blog.cloudflare.com/dynamic-workers/#dynamic-worker-loader-a-lean-sandbox"><u>a lightweight primitive for sandboxing</u></a> as we adopt agents at scale, and we’re building just that.</p><p>But as models get better, we expect more and more workflows to be managed by agents. Each of your customers should be able to run many agents simultaneously; each of your employees should have tens of agents running at once. If we’re constantly running a full microVM per agent, we’ll be unnecessarily burning a ton of resources and money to enable this scale.</p><p>That’s why we’re providing a faster and cheaper sandbox for your Claude Agents. This sandbox is based on the <a href="https://developers.cloudflare.com/agents/"><u>AgentsSDK</u></a>. You can execute arbitrary code in <a href="https://developers.cloudflare.com/dynamic-workers/"><u>Dynamic Workers</u></a> using <a href="https://developers.cloudflare.com/agents/api-reference/codemode/"><u>Codemode</u></a>, and you still <a href="https://developers.cloudflare.com/sandbox/bridge/http-api/#workspace-persistence"><u>get a file system</u></a>, but your agent is doing all of this within a <a href="https://developers.cloudflare.com/workers/reference/how-workers-works/#isolates"><u>V8 isolate</u></a> instead of a microVM.</p><p>If you need agents to act as a developer, building full applications and running Linux-based tools, you can still reach for a microVM-based sandbox. For this, we provide <a href="https://developers.cloudflare.com/containers/"><u>Cloudflare Containers</u></a>, which Claude Managed Agents can also use.</p><p>But if you want a faster, cheaper, and more scalable alternative you can use isolates instead of microVMs easily. Just select “isolate” for backend type when setting up an Agent.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/MZ8cQ2UTYcbe2hFv7IUYw/f7b55a65b6a670651bf30a0e24a6f509/image1.png" />
          </figure><p><sup><i>Setting up an “isolate” backend gives you a lightweight V8 isolate sandbox instead of a microVM</i></sup></p><p>If you want to handle bursts of tens of thousands of concurrent agents or more, running with isolates will allow you to scale in a way that no VM-based solution allows.</p>
    <div>
      <h2>Securing your agentic workloads</h2>
      <a href="#securing-your-agentic-workloads">
        
      </a>
    </div>
    <p>Agents are far more powerful when they connect to your organization’s context. This usually means accessing private services and data.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1LXewj3QO0QZMiEYNEMVxk/08cf093a16e67c8fb48514b1ce744319/image9.png" />
          </figure><p>As <a href="https://blog.cloudflare.com/sandbox-auth/"><u>we’ve written before</u></a>, sandboxed workloads on Cloudflare can use an outbound proxy for fully dynamic, customizable, and zero-trust authentication between sandboxes and external services. This lets you inject secrets into requests outside the sandbox, so the agent never has access to them. This protects against exfiltration attacks.</p><p>And sometimes internal services shouldn’t ever be exposed to the open Internet. We recently launched <a href="https://developers.cloudflare.com/cloudflare-one/networks/connectors/cloudflare-mesh/"><u>Cloudflare Mesh</u></a> and <a href="https://developers.cloudflare.com/workers-vpc/"><u>Cloudflare Workers VPC</u></a> to better connect to these private services, whether they’re running on a cloud provider like AWS or on-premises. This allows you to connect to internal services using post-quantum encrypted networking without a VPN or bastion host.</p><p>Claude Managed Agents can easily connect to private services with header injection or private VPC/Mesh tunnels. This is done via customizable outbound proxies. You can define egress policies that expose only the services you choose to the agent sandboxes that you choose. You can allowlist specific endpoints, perform zero-trust injection of encrypted credentials, access private services via Cloudflare Mesh, and even write custom proxy middleware.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2728IFBEK7MUx4YDuIldFj/2e28b1fd510abf49dd3a5ad6604b7709/image4.png" />
          </figure><p><sup><i>The integration uses outbound Workers to handle egress however you see fit</i></sup></p><p>You’re able to apply policies per tenant, per agent, or based on whatever metadata is useful. This gives you full control over how your agents connect to external services.</p>
    <div>
      <h2>Doing more with the Cloudflare Developer Platform</h2>
      <a href="#doing-more-with-the-cloudflare-developer-platform">
        
      </a>
    </div>
    <p>Agents need more than just a code execution environment. Cloudflare’s Developer Platform provides the tools you need by default to let your agents do more.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7GkOWWKBsbnBYeaIcrbHrm/e385f4fa3d8ca2fff13a92a92854f94d/image7.png" />
          </figure><p><sup><i>Sandboxes can make tool calls on Cloudflare and safely access external services.</i></sup></p><p>Here are a few of the tools you’ll find most useful as you deploy agents on Cloudflare:</p>
    <div>
      <h3>Browser Run via Claude</h3>
      <a href="#browser-run-via-claude">
        
      </a>
    </div>
    <p>One of the most common tools agents need is a browser. While <code>curl</code> can get you pretty far, when you want an agent to act like a human, this often means interacting with the web like one: rendering JS-heavy applications, taking screenshots for QA validation, filling out forms, etc. <a href="https://blog.cloudflare.com/browser-run-for-ai-agents/"><u>Browser Run</u></a> is Cloudflare’s tool to give agents browsers.</p><div>
  
</div>
<p></p><p><sup><i>A Browser Run session recording lets you watch how your agents used a browser. One of many built-in tools.</i></sup></p><p>The Claude Managed Agents integration ships with multiple browser-related tools that can be enabled immediately. These include <code>browser_search</code>, <code>browser_execute</code>, <code>screenshot</code>, <code>browse</code>, <code>fetch_to_markdown</code>, and a Cloudflare-specific implementation of <code>web_fetch</code> allows your agent to control a browser that runs on Cloudflare infrastructure. This not only lets your agent do more, but it also makes it easy to audit every action your agent’s browser is taking on the web, apply allowlists and denylist to browser sessions, and save recordings of browser sessions for future debugging.</p>
    <div>
      <h3>Agent inboxes</h3>
      <a href="#agent-inboxes">
        
      </a>
    </div>
    <p>The integration also comes with built-in support for email with the <code>send_email</code>, <code>email_read</code>, and <code>email_list</code> tools.</p><p>You can also kick off new sessions via email, or configure the agent to send emails using any domain and address configured with the <a href="https://blog.cloudflare.com/email-for-agents/"><u>Cloudflare Email Service</u></a>. This allows the agent to act on your behalf when it needs to, reply to context in forwarded emails, and autonomously interact with others via email.</p>
    <div>
      <h3>Custom tools and more</h3>
      <a href="#custom-tools-and-more">
        
      </a>
    </div>
    <p>Other built-in tools include <code>call_service</code>, which uses Cloudflare Mesh or Workers VPC to connect to private services, and <code>image_generate</code>, which uses <a href="https://www.cloudflare.com/developer-platform/products/workers-ai/"><u>Workers AI</u></a> to generate images on Cloudflare. This pairs well with Claude providing text-based inference.</p><p>Additionally, we encourage forking the repo to easily add customized tools. For example, you could add a custom tool to host a public file on <a href="https://www.cloudflare.com/developer-platform/r2/"><u>Cloudflare’s R2 object storage</u></a>. Just add the relevant binding in wrangler config, write a <a href="https://zod.dev/"><code><u>zod</u></code></a> definition, and short function in <code>custom-tools.js</code>:</p>
            <pre><code>defineTool({
  name: "r2_host_file",
  description: "Upload from sandbox to R2 and get a public URL.",
  inputSchema: z.object({
    key: z.string().describe("Object key"),
    content: z.string().describe("UTF-8 file body"),
    contentType: z.string().describe("MIME type"),
  }),
  run: async ({ key, content, contentType }, { env }) =&gt; {
    await env.PUBLIC_BUCKET.put(
      key, content, { httpMetadata: { contentType }}
    );
    return `${env.PUB_R2_URL.replace(/\/$/, "")}/${encodeURI(key)}`;
  }
}),</code></pre>
            <p>The Cloudflare Developer Platform provides all sorts of possibilities for extending your agents: give each agent session a git-backed repo with <a href="https://blog.cloudflare.com/artifacts-git-for-agents-beta/"><u>Artifacts</u></a>, run edge inference with <a href="https://workers.cloudflare.com/products/workers-ai/"><u>Workers AI</u></a>, host applications written on the fly with <a href="https://developers.cloudflare.com/dynamic-workers/"><u>Dynamic Workers</u></a>, and more.</p><p>You don’t have to worry about infrastructure or scaling – just write a few lines of code and hit deploy.</p>
    <div>
      <h2>Claude + Cloudflare</h2>
      <a href="#claude-cloudflare">
        
      </a>
    </div>
    <p>We’re excited to be working together with Anthropic to bring Cloudflare’s flexibility, scale, and security to more users. Whether you want to run tens of millions of agents using isolates, securely connect to private services with Workers VPC, or write custom tools that take advantage of all of Cloudflare, our new integration makes it easy.</p><p>See the <a href="https://developers.cloudflare.com/sandbox/claude-managed-agents/"><u>Getting Started with Managed Agents guide</u></a> to get Claude Managed Agents set up with Cloudflare in just minutes.</p> ]]></content:encoded>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Agents]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Durable Objects]]></category>
            <category><![CDATA[Developers Storage]]></category>
            <guid isPermaLink="false">2atb0Whie0RvOXQHBoBrKH</guid>
            <dc:creator>Mike Nomitch</dc:creator>
        </item>
        <item>
            <title><![CDATA[Browser Run: now running on Cloudflare Containers, it’s faster and more scalable]]></title>
            <link>https://blog.cloudflare.com/browser-run-containers/</link>
            <pubDate>Wed, 13 May 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ We’ve enabled higher usage limits, faster performance, better reliability, and increased shipping velocity for our Browser Run product by rebuilding on top of Cloudflare’s Containers. Here’s how. ]]></description>
            <content:encoded><![CDATA[ <p>We’ve enabled higher usage limits, faster performance, and better reliability for Browser Run by rebuilding on top of Cloudflare’s <a href="https://developers.cloudflare.com/containers/"><u>Containers</u></a>.</p><p>You can now spin up 60 browsers per minute via the Workers binding and run up to 120 concurrently — 4x the previous limit. Also, <a href="https://developers.cloudflare.com/browser-run/quick-actions/"><u>Quick Action</u></a> response times dropped more than 50%. You don't need to change anything: these improvements are live today. On top of that, we’re shipping fixes and new features faster than before. Read on to learn how we did it and see the data.</p>
    <div>
      <h3>Remind me: what is Browser Run?</h3>
      <a href="#remind-me-what-is-browser-run">
        
      </a>
    </div>
    <p><a href="https://developers.cloudflare.com/browser-run/"><u>Browser Run</u></a> enables developers to programmatically control and interact with headless browser instances running on Cloudflare’s global network. That’s useful for end-to-end testing of web applications, securely investigating suspicious URLs, and leveraging how browsers can easily render PDF documents, amongst other quick actions like capturing screenshots and extracting content. More recently, it’s become a critical enabler of AI agents to interact with the web. We’re building Browser Run to be the go-to platform to responsibly utilize automated browsers securely at massive scale.</p>
    <div>
      <h3>Outgrowing our bunk bed</h3>
      <a href="#outgrowing-our-bunk-bed">
        
      </a>
    </div>
    <p>Before adopting Cloudflare Containers, we shared infrastructure with <a href="https://www.cloudflare.com/learning/access-management/what-is-browser-isolation/"><u>Browser Isolation</u></a> (BISO). While technically similar, BISO’s larger container images slowed startup and development. Crucially, BISO browsers lacked optimal global distribution, compromising resiliency and latency. Additionally, typical BISO users’ long, steady sessions clashed with Browser Run’s short, spiky usage, creating scaling bottlenecks and availability delays.</p><p>Thankfully, after much internal development, Cloudflare released <a href="https://blog.cloudflare.com/cloudflare-containers-coming-2025/"><u>Durable Object (DO)-enabled Containers  open beta last year</u></a>, meaning we were ready for a tentative adoption that ultimately benefited both product platforms. Like most successful product platforms, we’re committed to building on our own platform wherever feasible so that we can feel and fix any pain points ahead of any external customers.</p>
    <div>
      <h3>The migration: Containers</h3>
      <a href="#the-migration-containers">
        
      </a>
    </div>
    <p>We started a gradual migration by inserting a Worker in our incoming request paths to provide some Container-powered browsers to a handful of users alongside those from BISO. This dual support during development was key: it allowed us to compare performance, isolate implementation bugs and ultimately gain confidence in the benefits of the Container-driven approach.</p><p>Ramping up adoption, we first used the Container browsers for all of our Quick Actions endpoints, then for connections via the Workers browser binding on free accounts, followed by pay-as-you-go accounts in order to validate stability before we rolled it out to all remaining contract customers, ensuring a transition that required no action or existing worker redeployments from our customers.</p>
    <div>
      <h3>Challenges: performance and scale bottlenecks</h3>
      <a href="#challenges-performance-and-scale-bottlenecks">
        
      </a>
    </div>
    <p>On our end, though, we faced a fresh set of challenges getting familiar with a novel, unstable early-stage Containers platform interface that was light on documentation, light on observability, and light on colleagues in an overlapping timezone. However, our feedback to our own teams as <a href="https://www.cloudflare.com/en-gb/the-net/top-of-mind-security/customer-zero/"><u>Customer Zero</u></a> meant that we could provide a tight feedback loop leading to substantial upgrades that benefit our external customers too. Nevertheless, there was a lot of friction to overcome initially, most of which were to be expected for a closed beta in active development. Other hurdles to overcome were intrinsic to the new technical environment.</p><p>For example, once our browsers could run globally, our architecture had to adapt. DO-enabled Containers create a Durable Object as close to the incoming request as possible, but the connected Container may spin up on the other side of the world. This works fine for one-shot messages like "start my app," but when you're establishing a WebSocket between them and exchanging dozens of messages for a screenshot request, those extra milliseconds crossing the globe start adding up.</p><p>Our solution? Create regional pools of pre-warmed DO-backed browser containers to constrain the max distance (and hence max latency) between DOs and containers. When a request comes in, we pick a DO-container pair closest to the user within that region. This keeps latency low on both hops: user to DO, and DO to container. It adds a few more moving parts to our overall architecture, but we figured that was worthwhile so long as we had observability into the global state of each browser so that we could allocate and re-allocate capacity according to changing demand. A perfect use case for <a href="https://developers.cloudflare.com/kv/"><u>Workers KV</u></a>…to a point.</p><p>Demand for our headless browsers has been ramping up since the beginning of last year. In short, AI agent builders discovered Browser Run and quickly brought request volumes outpacing our existing capacity. We quickly hit the limits of how quickly we could adjust our pool capacity to serve this new demand with a scalable approach. KV’s <i>eventual</i> consistency of around 30 seconds was becoming a bottleneck on our critical request path. You might check KV, see a container as "available," but by the time you route to it (30 seconds later), it's already claimed. That lag creates race conditions and overallocation of browsers, severely limiting how fast we could scale to meet demand spikes.</p>
    <div>
      <h3>Migrating from KV to D1 + Queues</h3>
      <a href="#migrating-from-kv-to-d1-queues">
        
      </a>
    </div>
    <p>We previously stored each container state in KV. This meant that we could keep getting a minute old state due to cache TTL (recently <a href="https://developers.cloudflare.com/changelog/post/2026-01-30-kv-reduced-minimum-cachettl/"><u>KV changed the minimum cache TTL to 30 seconds</u></a>, but even so that value is still too high).</p><p>We decided to migrate the container state into <a href="https://developers.cloudflare.com/d1/"><u>D1</u></a> instances instead. D1's transactional nature is a good fit here. Once we assign a browser to a user, it's exclusively theirs. Browsers are not shared resources. SQLite transactions ensure atomic assignment and prevent race conditions where two requests might claim the same browser simultaneously.</p><p>Here’s a simplified version of our browser acquisition query:</p>
            <pre><code>WITH candidate_pool AS (
    -- candidate pool logic to pick based on latency and other rules
)
UPDATE containers
SET status = 'picked'
WHERE sessionId IN (
    SELECT sessionId
    FROM candidate_pool
    ORDER BY RANDOM()
    LIMIT ?5
)
RETURNING data</code></pre>
            <p>We keep D1 shards <a href="https://developers.cloudflare.com/d1/configuration/data-location/#available-location-hints"><u>per location</u></a> and given that we may have several thousand containers running, and that each container needs to update its state every 5 seconds, we kept running into a problem: we would <a href="https://developers.cloudflare.com/d1/platform/limits/#concurrency-and-throughput"><u>overload the database</u></a>. For instance, if each write takes 1ms we can only write at most 1,000 times, which at one row per write would mean that we could only have 5,000 containers before overloading the database.</p><p>However, if we batch those writes, we can get much higher values, because batch writes are not significantly longer than individual ones, so we can increase the throughput in orders of magnitude. In our case, we use 100 row batches, which means we can now update a maximum of 500,000 containers per location. This headroom means capacity planning is no longer a bottleneck.</p><p>Currently, our P95 for batch write is 0.1ms!</p><p>To batch writes, we use <a href="https://www.cloudflare.com/developer-platform/products/cloudflare-queues/"><u>Queues</u></a>: every 5 seconds, each container computes its own state and adds it to its location queue. We then <a href="https://developers.cloudflare.com/queues/configuration/batching-retries/#batching"><u>configure a worker consumer</u></a> with 100 batch size and 1 second batch timeout:</p>
            <pre><code>{
    ...
    "queues": {
        "consumers": [
            {
                "queue": "production-core-containers-queue-weur",
                "max_batch_size": 100,
                "max_batch_timeout": 1,
                "max_retries": 1,
            },
            ...
        ]
        ...
    }
}</code></pre>
            <p>With this configuration, we achieve acceptable lag times well below 2 seconds. That said, queue backlogs can still cause stale state. When this happens, each region falls back to a designated backup region until the primary queue catches up.</p>
    <div>
      <h3>Additional perks for quick actions</h3>
      <a href="#additional-perks-for-quick-actions">
        
      </a>
    </div>
    <p>With dedicated infrastructure, we could now make upgrades to the browser container image without unwanted side effects or bloat for other products like BISO. This opened the door to optimize quick actions like screenshots and content extraction. Previously, our workers established a WebSocket to the remote browser and sent instructions one at a time: open a page, navigate to the URL, wait for it to load and take the screenshot. Each step had to be completed before the next could begin. </p><p>However, now we send all parameters in a single HTTP request directly to the container, and the entire flow executes internally without any back-and-forth between the worker and browser.</p>
    <div>
      <h3>Results: massive performance boost and increased limits</h3>
      <a href="#results-massive-performance-boost-and-increased-limits">
        
      </a>
    </div>
    <p>We’ve seen a sharp decrease in average quick-action response time, as users are able to get what they need from a browser session in less time: less time waiting for browsers to be ready and faster processing of their <a href="https://chromedevtools.github.io/devtools-protocol/"><u>DevTools Protocol</u></a> messages.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3eIJK055QLIl3ay12TIhfQ/9d988169efadb16cb5b18b539b62aa55/image3.png" />
          </figure><p>Overcoming our real-time state management at this new scale meant we could spend more time in the playground, discovering and cooking up new features such as our recently launched <a href="https://developers.cloudflare.com/browser-run/quick-actions/crawl-endpoint/"><u>/crawl endpoint.</u></a> </p>
    <div>
      <h3>Better browser flexibility</h3>
      <a href="#better-browser-flexibility">
        
      </a>
    </div>
    <p>We also benefitted from another important perk by leaving behind shared Browser Isolation containers: faster upgrades.</p><p>When our browsers ran on shared product infrastructure, upgrading Chrome meant coordinating across multiple teams and products, each with their own roadmap and priorities. However, now that we run our own container image, we can upgrade at a faster tempo. For example, WebGL, a much-requested feature, is now available for browser-based rendering along with WebMCP (Model Context Protocol for the web) which enables new agentic interaction patterns. Both are made possible because we can control the browser version and flags without unwanted side effects in other Cloudflare products.</p><p>In a nutshell, we’re just getting started with unleashing the power of browsers at scale, especially for agentic development. We hope you’re diving in too — check out our <a href="https://developers.cloudflare.com/browser-run/"><u>docs</u></a>.</p>
    <div>
      <h3>Get started</h3>
      <a href="#get-started">
        
      </a>
    </div>
    <p>Browser Run is available on all Workers plans. Start with the <a href="https://developers.cloudflare.com/browser-rendering/get-started/"><u>quick start guide</u></a>, explore the <a href="https://developers.cloudflare.com/browser-rendering/rest-api/"><u>Quick Actions</u></a>, or try the <a href="https://developers.cloudflare.com/browser-rendering/rest-api/crawl-endpoint/"><u>/crawl endpoint</u></a> to deeply extract data from any webpage, following links across the site.</p><p>Building AI agents? Check out our <a href="https://agents.cloudflare.com/"><u>Agents SDK</u></a> with built-in Browser Run support.</p> ]]></content:encoded>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Browser Run]]></category>
            <category><![CDATA[Containers]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Chrome]]></category>
            <category><![CDATA[Agents]]></category>
            <category><![CDATA[Browser Rendering]]></category>
            <category><![CDATA[Browser Run]]></category>
            <guid isPermaLink="false">6hwTzVHKEOlPBwzj6rWnRM</guid>
            <dc:creator>Ruskin Constant</dc:creator>
            <dc:creator>Rui Figueira</dc:creator>
            <dc:creator>Sofia Cardita</dc:creator>
        </item>
        <item>
            <title><![CDATA[Introducing Dynamic Workflows: durable execution that follows the tenant]]></title>
            <link>https://blog.cloudflare.com/dynamic-workflows/</link>
            <pubDate>Fri, 01 May 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ Dynamic Workflows is a library that lets you route durable execution to tenant-provided code on the fly. Built on Dynamic Workers, it enables platforms to serve millions of unique workflows at near-zero idle cost. ]]></description>
            <content:encoded><![CDATA[ <p>When we first launched Workers eight years ago, it was a direct-to-developers platform. Over the years, we have expanded and scaled the ecosystem so that platforms could not only build on Workers directly, but they could also enable <i>their</i> customers to ship code to <i>us </i>through many multi-tenant applications. We now see on Workers: Applications where users describe what they want, and the AI writes the implementation. Multi-tenant SaaS where every customer's business logic is, at runtime, some TypeScript the platform has never seen before. Agents that write and run their own tools. CI/CD products where every repo defines its own pipeline.</p><p>Last month, when we shipped the <a href="https://blog.cloudflare.com/dynamic-workers/"><u>Dynamic Workers open beta</u></a>, we gave those platforms a clean primitive for the <i>compute</i> side: hand the Workers runtime some code at runtime, get back an isolated, sandboxed Worker, on the same machine, in single-digit milliseconds. <a href="https://blog.cloudflare.com/durable-object-facets-dynamic-workers/"><u>Durable Object Facets</u></a> extended the same idea to <i>storage</i> — each dynamically-loaded app can have its own SQLite database, spun up on demand, with the platform sitting in front, as a supervisor. <a href="https://blog.cloudflare.com/artifacts-git-for-agents-beta/"><u>Artifacts</u></a> did the same for <i>source control</i>: a Git-native, versioned filesystem you can create by the tens of millions, one per agent, one per session, one per tenant. So, we have dynamic deployment for storage and source control. What’s next?</p><p>Today, we are bridging durable execution and dynamic deployment with <a href="https://github.com/cloudflare/dynamic-workflows"><b><u>Dynamic Workflows</u></b></a>.</p>
    <div>
      <h2>The gap between durable and dynamic execution</h2>
      <a href="#the-gap-between-durable-and-dynamic-execution">
        
      </a>
    </div>
    <p><a href="https://developers.cloudflare.com/workflows/"><u>Cloudflare Workflows</u></a> is our durable execution engine. It turns a <code>run(event, step)</code> function into a program where every step survives failures, can sleep for hours or days, can wait for external events, and resumes exactly where it left off when the isolate is recycled. It's the right primitive for anything that has to "keep going" past a single request: onboarding flows, video transcoding pipelines, multi-stage billing, long-running agent loops, and — as of <a href="https://blog.cloudflare.com/workflows-v2/"><u>Workflows V2</u></a> — up to 50,000 concurrent instances and 300 new instances per second per account, redesigned for the agentic era.</p><p>But Workflows has always had one assumption baked in: the workflow code is part of your deployment. Your <code>wrangler.jsonc</code> has a block that says <i>"when the engine calls into </i><code><i>WORKFLOWS</i></code><i>, run the class called </i><code><i>MyWorkflow</i></code><i>."</i> One binding, one class. Per deploy.</p><p>That works fine if you own all the code. It's fine if you're running a traditional application.</p><p>It stops working the moment you want to let your customer ship <i>their</i> workflow.</p><p>Say you're building an app platform where the AI writes TypeScript for every tenant. Say you're running a CI/CD product where each repository has its own pipeline. Say you're using an agents SDK where each agent writes its own durable plan. In every one of these cases, the workflow is different for every tenant, every agent, every request. There is no single class to bind.</p><p>This is the same shape of problem that Dynamic Workers solved for compute and that Durable Object Facets solved for storage. We just hadn't solved it for durable execution yet.</p>
    <div>
      <h2>Dynamic Workflows</h2>
      <a href="#dynamic-workflows">
        
      </a>
    </div>
    <p><code>@cloudflare/dynamic-workflows</code> is a small library. Roughly 300 lines of TypeScript. It lets a single Worker — the <b>Worker Loader</b> — route every <code>create()</code> call to a different tenant's code, and, critically, have the Workflows engine dispatch <code>run(event, step)</code> back to that same code when the workflow actually executes, seconds or hours or days later.</p><p>Here's the whole pattern. A Worker Loader:</p>
            <pre><code>import {
  createDynamicWorkflowEntrypoint,
  DynamicWorkflowBinding,
  wrapWorkflowBinding,
} from '@cloudflare/dynamic-workflows';

// The library looks this class up on cloudflare:workers exports.
export { DynamicWorkflowBinding };

function loadTenant(env, tenantId) {
  return env.LOADER.get(tenantId, async () =&gt; ({
    compatibilityDate: '2026-01-01',
    mainModule: 'index.js',
    modules: { 'index.js': await fetchTenantCode(tenantId) },
    // The tenant sees this as a normal Workflow binding.
    env: { WORKFLOWS: wrapWorkflowBinding({ tenantId }) },
  }));
}

// Register this as class_name in wrangler.jsonc.
export const DynamicWorkflow = createDynamicWorkflowEntrypoint&lt;Env&gt;(
  async ({ env, metadata }) =&gt; {
    const stub = loadTenant(env, metadata.tenantId);
    return stub.getEntrypoint('TenantWorkflow');
  }
);

export default {
  fetch(request, env) {
    const tenantId = request.headers.get('x-tenant-id');
    return loadTenant(env, tenantId).getEntrypoint().fetch(request);
  },
};</code></pre>
            <p>Add to your <code>wrangler.jsonc</code>:</p>
            <pre><code>"workflows": [
		{
			"name": "dynamic-workflow",
			"binding": "WORKFLOW",
			"class_name": "DynamicWorkflow"
		}
	]</code></pre>
            <p>The tenant writes plain, idiomatic Workflows code. They have no idea they're being dispatched:</p>
            <pre><code>import { WorkflowEntrypoint } from 'cloudflare:workers';

export class TenantWorkflow extends WorkflowEntrypoint {
  async run(event, step) {
    return step.do('greet', async () =&gt; `Hello, ${event.payload.name}!`);
  }
}

export default {
  async fetch(request, env) {
    const instance = await env.WORKFLOWS.create({ params: await request.json() });
    return Response.json({ id: await instance.id });
  },
};</code></pre>
            <p>That's it. The tenant calls <code>env.WORKFLOWS.create(...)</code> against what looks like a perfectly normal Workflow binding. Workflow IDs, <code>.status()</code>, <code>.pause()</code>, retries, hibernation, durable steps, <code>step.sleep('24 hours')</code>, <code>step.waitForEvent()</code> — everything works the way it always has.</p><p>The library handles one thing: making sure that when the Workflows engine eventually wakes up and calls <code>run(event, step)</code>, it ends up inside the <i>right tenant's</i> code.</p>
    <div>
      <h2>How it works</h2>
      <a href="#how-it-works">
        
      </a>
    </div>
    <p>Three layers: the Workflows engine (platform) on top, your Worker Loader in the middle, your tenant's code (a Dynamic Worker) on the bottom. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3D8fGfZalW4N4h7QngR8tN/59ef281c194cfcba12ea6cfb6ec240d7/image2.png" />
          </figure><p>When a request reaches the Worker Loader, it routes the execution to the correct dynamic code on the fly. The rest of the execution is a handoff between these three layers, left-to-right in time: the request enters, bounces up to the engine, is persisted, and later bounces back down again.</p><p>Walking the flow:</p><p><b>① → ② Entering the tenant's code.</b> The Worker Loader receives an HTTP request, figures out which tenant it's for, loads that tenant's code via the Worker Loader, and forwards the request to its <code>default.fetch</code>. The <code>env</code> it hands the tenant contains <code>WORKFLOWS: wrapWorkflowBinding({ tenantId })</code>. As far as the tenant is concerned, that looks and acts like a real Workflow binding.</p><p><b>③ Up to the Worker Loader.</b> When the tenant calls <code>env.WORKFLOWS.create({ params })</code>, it's actually making a Remote Procedure Call (RPC) into the Worker Loader — the wrapped binding is a <code>WorkerEntrypoint</code> subclass (<code>DynamicWorkflowBinding</code>) that the runtime specialized with the tenant's metadata at load time. That's why you have to <code>export { DynamicWorkflowBinding }</code> from your Worker Loader: the runtime builds per-tenant stubs by looking the class up in <code>cloudflare:workers</code> exports. Bindings that cross the Dynamic Worker boundary <i>have</i> to be RPC stubs — a plain <code>{ create, get }</code> object can't be structured-cloned, and the raw <code>Workflow</code> binding isn't serializable either.</p><p>Inside the Worker Loader, the wrapped binding transparently rewrites the payload:</p>
            <pre><code>tenant calls:  create({ params: { name: 'Alice' } })
                            │
                            ▼
engine sees:   create({ params: {
                  __workerLoaderMetadata: { tenantId: 't-42' },
                  params: { name: 'Alice' }
               }})
</code></pre>
            <p><b>④ Up to the engine.</b> The Worker Loader then calls <code>.create()</code> on the <i>real</i> <code>WORKFLOWS</code> binding with the envelope as the params. From here the Workflows engine takes over. It persists <code>event.payload</code> — which now includes the envelope — and schedules the run. Every time the engine later wakes up the workflow (whether that’s after a 24-hour sleep, a crash, or a deploy), the metadata rides along with the payload, waiting to route the run.</p><p>One implication: treat the metadata as a routing hint, not as authorization. The tenant can read it back via <code>instance.status()</code>. Don't put secrets in there.</p><p><b>⑤ → ⑥ The engine comes back down.</b> When the engine is ready to run a step, it calls <code>.run(event, step)</code> on the class you registered in <code>wrangler.jsonc</code> — the one <code>createDynamicWorkflowEntrypoint</code> gave you. That class unwraps the envelope, hands the metadata to the <code>loadRunner</code> callback <i>you</i> wrote, and forwards the unwrapped event through to whatever runner the callback returns.</p><p>The callback is where everything interesting happens, and it's entirely yours. Fetch the tenant's latest source from R2. Check their plan tier and pick a region. Attach a tail Worker for per-tenant logging. Bundle TypeScript on the fly with <a href="https://www.npmjs.com/package/@cloudflare/worker-bundler"><code><u>@cloudflare/worker-bundler</u></code></a>. In the common case, you just hand off to the Worker Loader:</p>
            <pre><code>const stub = env.LOADER.get(tenantId, () =&gt; loadTenantCode(tenantId));
return stub.getEntrypoint('TenantWorkflow');</code></pre>
            <p>
The Worker Loader caches by ID, so a workflow that runs many steps over many hours reuses the same dynamic Worker across them. When the isolate eventually gets evicted, the next <code>step.do()</code> pulls the code again and keeps going — the tenant's workflow has no idea anything happened. A Dynamic Worker boots in single-digit milliseconds using a few megabytes of memory, so the dispatch overhead is essentially free. You can have a million tenants, each with their own distinct workflow code, each spun up lazily on the step boundary where it's needed, and none of them cost anything while idle.</p>
    <div>
      <h3>The escape hatch</h3>
      <a href="#the-escape-hatch">
        
      </a>
    </div>
    <p>If you want to subclass <code>WorkflowEntrypoint</code> yourself — to add logging around <code>run()</code>, wire up per-tenant observability, or thread custom state through — the library exposes the lower-level <code>dispatchWorkflow</code> primitive that <code>createDynamicWorkflowEntrypoint</code> is built on:</p>
            <pre><code>import { dispatchWorkflow } from '@cloudflare/dynamic-workflows';

export class MyDynamicWorkflow extends WorkflowEntrypoint {
  async run(event, step) {
    return dispatchWorkflow(
      { env: this.env, ctx: this.ctx },
      event,
      step,
      ({ metadata, env }) =&gt; loadRunnerForTenant(env, metadata),
    );
  }
}
</code></pre>
            <p>Everything else — IDs, pause/resume, <code>sendEvent</code>, retries — falls through to the real Workflows engine untouched.</p>
    <div>
      <h2>Dynamic Workers are the primitive</h2>
      <a href="#dynamic-workers-are-the-primitive">
        
      </a>
    </div>
    <p>Step back from the specifics for a second. Every interesting line of this library is either a wrapper around <code>.create()</code> on the outbound side or a wrapper around <code>WorkflowEntrypoint</code> on the inbound side. The actual work — spinning up the tenant's code, sandboxing it, routing RPC across the boundary, caching the isolate, hibernating between steps — is all done by Dynamic Workers underneath.</p><p>That's the real story, and it's a lot bigger than Workflows</p><p>Dynamic Workers is the primitive that swallows everything. <a href="https://blog.cloudflare.com/durable-object-facets-dynamic-workers/"><u>Durable Object Facets</u></a> is the same pattern applied to Durable Objects. Dynamic Workflows is that same pattern applied to <code>WorkflowEntrypoint</code>. Each one is the same small amount of envelope-and-unwrap glue between the static binding you've always had and the dynamic version you can now hand to your customers.</p><p>And we're not stopping at Workflows. Every binding that Workers currently exposes is heading for a dynamic counterpart — queues where each producer ships its own handler, caches, databases, object stores, AI bindings, and MCP servers where every tenant brings their own tools. Whatever you bind to a Worker today, you will soon be able to bind dynamically: dispatched per tenant, per agent, per request, at zero idle cost.</p><p>The unit economics of running a platform like this are, frankly, absurd. Shipping a multi-tenant product used to mean giving every customer their own container, their own database, their own disk, their own scheduler, and stitching it together with orchestration glue, service meshes, and hair-pulling billing math. Many of these applications have to support thousands of customers at the very least; millions, at the most. On Dynamic Workers and everything composing on top of them, idle tenants cost approximately nothing and active tenants share the same hardware through isolate-level multi-tenancy. The floor drops several orders of magnitude. A platform that used to cap out at thousands of paying customers can now reasonably serve tens of millions.</p>
    <div>
      <h2>What this unlocks</h2>
      <a href="#what-this-unlocks">
        
      </a>
    </div>
    
    <div>
      <h3>Agent platforms that plan like engineers</h3>
      <a href="#agent-platforms-that-plan-like-engineers">
        
      </a>
    </div>
    <p>Coding agents — <a href="https://opencode.ai"><u>OpenCode</u></a>, <a href="https://code.claude.com/docs/en/overview"><u>Claude Code</u></a>, Codex, Pi — have been proving for the past year that LLMs are far better at <i>writing code</i> than at making sequential tool calls. The <a href="https://developers.cloudflare.com/agents/"><u>Cloudflare Agents SDK</u></a> and <a href="https://blog.cloudflare.com/project-think"><u>Project Think</u></a> extend that insight into durable execution: with primitives like fibers and sub-agents, an agent's long-running plan can survive crashes, hibernation, and redeploys without the user noticing.</p><p>Dynamic Workflows is the piece that lets that plan be a <i>first-class Cloudflare Workflow</i> — something the agent literally writes and the platform literally runs, with the full durability machinery behind it. A <code>run(event, step)</code> function the model wrote a minute ago, where every <code>step.do(...)</code> is independently retryable, every <code>step.sleep('24 hours')</code> hibernates for free, and every <code>step.waitForEvent(...)</code> waits indefinitely for the human to approve the next action. The agent writes the workflow; the platform runs it; neither has to know ahead of time what the plan looks like.</p>
    <div>
      <h3>SDKs and frameworks where the user brings the logic</h3>
      <a href="#sdks-and-frameworks-where-the-user-brings-the-logic">
        
      </a>
    </div>
    <p>If you're shipping a framework where your customer writes the <code>run(event, step)</code> function — a workflow builder UI, a visual automation tool, a per-tenant extension system, a low-code tool for non-developers — Dynamic Workflows is now the primitive that makes it work without compromise. You call <code>wrapWorkflowBinding({ tenantId })</code> once, hand the result to their code as <code>WORKFLOWS</code>, and every workflow instance they create is automatically tagged, routed back, and executed in their sandbox. The framework owns the Worker Loader; the user owns the workflow; neither has to care about the other.</p>
    <div>
      <h3>CI/CD at primitive speed</h3>
      <a href="#ci-cd-at-primitive-speed">
        
      </a>
    </div>
    <p>Here's the use case that's been getting us most excited.</p><p>Every CI/CD platform in existence is, underneath, a dispatcher of per-repo configuration files: <i>"run these steps, in this order, with these secrets, cache these directories, upload these artifacts."</i> Each repo has its own pipeline. Each branch might have its own variant. Each pull request spawns an instance of that pipeline that has to run to completion, survive a machine crash, retry a flaky step, stream logs, pause for approvals, and persist results.</p><p>That's <i>exactly</i> the shape of a durable workflow. The reason CI hasn't been built that way until now is that nobody had a cloud primitive where <b>the workflow itself is different for every repo, dispatched at runtime, at zero provisioning cost.</b> Now you do.</p><p>Here's what a CI pipeline looks like when it's just code your customer ships with their repo — say, in <code>.cloudflare/ci.ts</code>. The workflow itself is real; the <code>runInSandbox() / summarise()</code> / GitHub binding helpers below are platform-provided glue, the kind of thing you'd ship once in your dispatcher:</p>
            <pre><code>import { WorkflowEntrypoint } from 'cloudflare:workers';

export class CIPipeline extends WorkflowEntrypoint {
  async run(event, step) {
    const { repo, sha, branch, pr } = event.payload;

    // Fork an isolated copy of the repo at this commit. Seconds, not minutes.
    const workspace = await step.do('checkout', () =&gt;
      this.env.ARTIFACTS.fork(repo, { sha })
    );

    await step.do('install', () =&gt; runInSandbox(workspace, ['pnpm', 'install']));

    // Each parallel step is independently retryable.
    const [lint, test, build] = await Promise.all([
      step.do('lint',  () =&gt; runInSandbox(workspace, ['pnpm', 'lint'])),
      step.do('test',  () =&gt; runInSandbox(workspace, ['pnpm', 'test'])),
      step.do('build', () =&gt; runInSandbox(workspace, ['pnpm', 'build'])),
    ]);

    if (pr) {
      await step.do('comment', () =&gt;
        this.env.GITHUB.commentOnPR(repo, pr, summarise({ lint, test, build }))
      );
    }

    // Workflow hibernates until approval arrives. No VM held open.
    if (branch === 'main') {
      await step.waitForEvent('approval', { type: 'deploy-approval', timeout: '24 hours' });
      await step.do('deploy', () =&gt; runInSandbox(workspace, ['pnpm', 'deploy']));
    }
  }
}</code></pre>
            <p>The platform owns the dispatcher. It ingests a webhook, figures out which repo it came from, loads <i>that repo's</i> <code>CIPipeline</code> class as a Dynamic Worker, and hands the run-off to Dynamic Workflows. The platform doesn't know what's in the pipeline. It doesn't need to. It's running a durable function that happens to live in the customer's repo.</p><p>Now line up what each step actually does:</p><ul><li><p><a href="https://blog.cloudflare.com/artifacts-git-for-agents-beta/"><b><u>Artifacts</u></b></a> gives every repo a Git-native, versioned filesystem that lives on Cloudflare's globally distributed network. <a href="https://github.com/cloudflare/artifact-fs"><u>ArtifactFS</u></a> hydrates the tree lazily, so even a multi-GB repo is ready to work within single-digit seconds — and <code>fork()</code> gives each CI run its own isolated copy, with no <code>git clone</code> tax.</p></li><li><p><a href="https://blog.cloudflare.com/dynamic-workers/"><b><u>Dynamic Workers</u></b></a> run each lightweight step (lint, format, typecheck, bundle) in a sandboxed isolate that boots in milliseconds, on the same machine as the repo's data. No VM provisioning, no image pull, no cold start.</p></li><li><p><a href="https://developers.cloudflare.com/dynamic-workers/usage/dynamic-workflows/"><b><u>Dynamic Workflows</u></b></a> holds the whole run together. Steps are retryable and durable. The run hibernates for free while waiting on approvals. State and progress survive deploys, evictions, and crashes.</p></li><li><p><a href="https://blog.cloudflare.com/sandbox-ga/"><b><u>Sandboxes</u></b></a> handle the heavy corners — the step that needs <code>docker build</code>, the integration suite that needs Postgres running, the Rust compile that needs 8 cores. Snapshots to R2 mean even those warm-start in a couple of seconds.</p></li></ul><p>A traditional CI run for a mid-sized JS repo looks something like: <i>allocate VM (15-30s) → pull base image (10s) → </i><code><i>git clone</i></code><i> (10s) → </i><code><i>npm ci</i></code><i> (30-60s) → run tests (actual work) → tear down</i>. Several minutes of ceremony before the first test runs, and you pay for the whole VM the whole time.</p><p>The same pipeline on this stack looks like: <i>edge fork of the repo (seconds) → each step boots a fresh isolate or snapshot-restored sandbox in milliseconds → runs the actual work → hibernates.</i> Nothing has to cold-start. Nothing has to be provisioned ahead of time. Nothing has to be kept warm. The repo doesn't move — the compute comes to it.</p><p>CI has never been this fast, and the reason it hasn't is that none of these primitives have existed together in one place. Now they do.</p>
    <div>
      <h2>Try it</h2>
      <a href="#try-it">
        
      </a>
    </div>
    <p><code>@cloudflare/dynamic-workflows</code> is MIT-licensed and on npm today:</p>
            <pre><code>npm install @cloudflare/dynamic-workflows</code></pre>
            <p>It runs on top of Dynamic Workers, which is in open beta on the Workers Paid plan. The <a href="https://github.com/cloudflare/dynamic-workflows"><u>repo</u></a> includes a working example — an interactive browser playground where you write a <code>TenantWorkflow</code> class, hit <b>Run</b>, and watch the steps execute with live-streaming logs and a per-step checklist that lights up as each <code>step.do()</code> commits. Clone it, deploy it, show it to a coworker.</p><p>If you're a platform, an SDK, a framework, or a CI/CD product, and you want to give your customers their own workflows without running their code in your own process: this is the primitive we built for you. If you're building agents that write durable plans, this is the primitive that makes those plans <i>real</i> Workflows. If you're just watching all of this, and it looks fun to build on top of: we'd love to see what you make.</p><p>Find us in the <a href="https://discord.cloudflare.com/"><u>Cloudflare Developers Discord</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[Workflows]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Durable Execution]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Developers]]></category>
            <guid isPermaLink="false">4JZIrwtIvEd4qwAd4JAThB</guid>
            <dc:creator>Dan Lapid</dc:creator>
            <dc:creator>Luís Duarte</dc:creator>
        </item>
        <item>
            <title><![CDATA[Agents can now create Cloudflare accounts, buy domains, and deploy]]></title>
            <link>https://blog.cloudflare.com/agents-stripe-projects/</link>
            <pubDate>Thu, 30 Apr 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ Starting today, agents can now be Cloudflare customers. They can create a Cloudflare account, start a paid subscription, register a domain, and get back an API token to deploy code right away. Humans can be in the loop to grant permission, but there’s no need to go to the dashboard, copy and paste API tokens, or enter credit card details.  ]]></description>
            <content:encoded><![CDATA[ <p>Coding agents are great at building software. But to deploy to production they need three things from the cloud they want to host their app — an account, a way to pay, and an API token. Until now these have been tasks that humans handle directly. Increasingly, agents handle them on the user’s behalf. The agent needs to perform all the tasks a human customer can. They’re given higher-order problems to solve and choose to use Cloudflare and call Cloudflare APIs.</p><p>Starting today, agents can provision Cloudflare on behalf of their users. They can create a Cloudflare account, start a paid subscription, register a domain, and get back an API token to deploy code right away. Humans can be in the loop to grant permission and must accept Cloudflare's terms of service, but no human steps are otherwise required from start to finish. There’s no need to go to the dashboard, copy and paste API tokens, or enter credit card details. Without any extra setup, agents have everything they need to deploy a new production application in one shot. And with Cloudflare’s <a href="https://blog.cloudflare.com/code-mode/"><u>Code Mode MCP server</u></a> and <a href="https://github.com/cloudflare/skills"><u>Agent Skills</u></a>, they’re even better at it.</p><p>This all works via a new protocol that we’ve co-designed with Stripe as part of the launch of <a href="https://projects.dev/"><u>Stripe Projects</u></a>.</p><p>We’re excited to launch this new partnership with Stripe, and also to offer <a href="https://support.stripe.com/questions/stripe-atlas-perks-partners"><u>$100,000 in Cloudflare credits</u></a> to all new startups who incorporate using <a href="https://stripe.com/atlas"><u>Stripe Atlas</u></a>. But this new protocol also makes it possible for any platform with signed-in users to integrate with Cloudflare in the same way Stripe does, with zero friction for the end user.</p>
    <div>
      <h2>How it works: zero to production without any setup or manual steps</h2>
      <a href="#how-it-works-zero-to-production-without-any-setup-or-manual-steps">
        
      </a>
    </div>
    <p>Install the <a href="https://docs.stripe.com/stripe-cli/install"><u>Stripe CLI</u></a> with the <a href="https://docs.stripe.com/projects"><u>Stripe Projects plugin</u></a>, login to Stripe, and then start a new project:</p>
            <pre><code>stripe projects init</code></pre>
            <p>Then prompt your agent to build something new and deploy it to a new domain. You can watch a condensed two-minute video of this entire flow below:</p><div>
  
</div>
<p></p><p>If the email you’re logged into Stripe with already has a Cloudflare account, you’ll be prompted with a typical OAuth flow to grant the agent access. If there is no existing Cloudflare account for the email you’re logged in with, Cloudflare will provision an account automatically for you and your agent:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5Y4grvxbS3wPfLYOGPV0PE/230d2f439ff9ea8cc75fe6cde9469792/image8.png" />
          </figure><p>You will see the agent build and deploy a site to a new Cloudflare account, and then use the Stripe Projects CLI to register the domain:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/EKRFYCJLYdsFBVVAkrmnX/82a9781b1e5f8818093f1028decd310b/image6.png" />
          </figure><p>The agent will prompt for input and approval when necessary. For example, if your Stripe account doesn’t yet have a linked payment method, the agent will prompt you to add one:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2oTud5jPuMjldOEhLBbMrd/fb91e1065a84668f7bd606963006641b/image1.png" />
          </figure><p>At the end, the agent has deployed to production, and the app runs on the newly registered domain:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/57boiRl1hScbG8n1ktcUzt/d97e1fc46610329e21accaa34ad92f13/image4.png" />
          </figure><p>The agent has gone from literal zero, no Cloudflare account at all, without any preconfigured <a href="https://github.com/cloudflare/skills"><u>Agent Skills</u></a> or <a href="https://blog.cloudflare.com/code-mode-mcp/"><u>MCP server</u></a>, to having:</p><ul><li><p>Provisioned a new Cloudflare account</p></li><li><p>Obtained an API token</p></li><li><p>Purchased a domain</p></li><li><p>Deployed an app to production</p></li></ul><p>But wait — how did the agent discover that it could do all of this? How did it know what services it could provision, and how to purchase a domain? How did it gain the context it needed to understand how to deploy to Cloudflare? Let’s dig in.</p>
    <div>
      <h2>How the protocol and integration works</h2>
      <a href="#how-the-protocol-and-integration-works">
        
      </a>
    </div>
    <p>There are three components to the interaction between the agent, Stripe, and Cloudflare shown above:</p><ul><li><p><b>Discovery</b> — the agent can call a command to query the catalog of available services.</p></li><li><p><b>Authorization</b> — the platform attests to the identity of the user, allowing providers to provision accounts or link existing ones, and securely issue credentials back to the agent.</p></li><li><p><b>Payment</b> — the platform provides a payment token that providers can use to bill the customer, allowing the agent to start subscriptions, make purchases and be billed on a usage basis.</p></li></ul><p>These build on prior art and existing standards like OAuth, OIDC and payment tokenization — but are used together to remove many steps that might otherwise require a human in the loop.</p>
    <div>
      <h2>Discovery: how agents find services they can provision themselves</h2>
      <a href="#discovery-how-agents-find-services-they-can-provision-themselves">
        
      </a>
    </div>
    <p>In the agent session above, before the agent ran the CLI command <code>stripe projects add cloudflare/registrar:domain</code>, it first had to discover the <a href="https://domains.cloudflare.com/"><u>Cloudflare Registrar</u></a> service. It did this by calling the <code>stripe projects catalog</code> command, which returns available services:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1fGiH332y2mAIfxiPT4Uof/cad52b61f6f7c89088ea3d54cb678a46/image2.png" />
          </figure><p>The full set of <a href="https://developers.cloudflare.com/directory/"><u>Cloudflare products</u></a> and services from other providers is <a href="https://docs.stripe.com/projects#available-providers"><u>long and growing</u></a> — arguably overwhelming to humans. But for agents, this catalog of services is exactly the context they need. The agent chooses services to use from this catalog based on what the user has asked them to do and the user’s preferences — but the user needs no prior knowledge of what services are offered by which providers, and does not need to provide any input. Providers like Cloudflare make this catalog available via a simple REST API that returns JSON, and that gives agents everything they need.</p>
    <div>
      <h2>Authorization: instant account creation for new users</h2>
      <a href="#authorization-instant-account-creation-for-new-users">
        
      </a>
    </div>
    <p>When the agent chooses a service and provisions it (ex: <code>stripe projects add cloudflare/registrar:domain</code>), it provisions the resource within a Cloudflare account. But how is it able to create one on demand, without sending a human to a signup page?</p><p>Remember how at the start, the user signed in to their Stripe account? Stripe acts as the identity provider, attesting to the user’s identity. Cloudflare automatically provisions a new account for the user if no account already exists, and returns credentials back to the Stripe Projects CLI, which are securely stored, but available to the agent to use to make authenticated requests to Cloudflare. This means if someone is brand new to Cloudflare or other services, they can start building right away with their agent, without extra steps.</p><p>If the user already has a Cloudflare account, they’re sent through a standard OAuth flow to grant access to the Stripe Projects CLI, allowing them to provision resources on their existing Cloudflare account.</p>
    <div>
      <h2>Payment: give your agent a budget it can spend, without giving it your credit card info</h2>
      <a href="#payment-give-your-agent-a-budget-it-can-spend-without-giving-it-your-credit-card-info">
        
      </a>
    </div>
    <p>You might rightly worry, “What if my agent goes a bit overboard and starts buying dozens of domains? Will I end up on the hook for a massive bill? Can I really trust my agent with my credit card?”</p><p>The protocol accounts for this in two ways. When an agent provisions a paid service, Stripe includes a payment token in the request to the Provider (Cloudflare). Raw payment details like credit card numbers aren’t ever shared with the agent. Stripe then sets a default limit of $100.00 USD/month as the maximum the agent can spend on any one provider. When you’re ready to raise this limit, you can then set <a href="https://developers.cloudflare.com/changelog/post/2026-04-13-billable-usage-dashboard-and-budget-alerts/"><u>Budget Alerts</u></a> on your Cloudflare account.</p>
    <div>
      <h2>Any platform with signed-in users can integrate with Cloudflare in the same way Stripe does</h2>
      <a href="#any-platform-with-signed-in-users-can-integrate-with-cloudflare-in-the-same-way-stripe-does">
        
      </a>
    </div>
    <p>Any platform with signed-in users can act as the “Orchestrator”, playing the same role Stripe does with Stripe Projects, and integrate with Cloudflare.</p><p>Let’s say your product is a coding agent. You’d love for people to be able to take what they’ve built and get it deployed to production, using Cloudflare and other services. But the last thing you want is to send people down a maze of authorization flows and decision trees of where and how to deploy it. You just want to let people ship.</p><p>Your platform acts as the Orchestrator, with the already signed-in user. When your user needs a <a href="https://domains.cloudflare.com/"><u>domain</u></a>, a <a href="https://developers.cloudflare.com/r2/"><u>storage bucket</u></a>, a <a href="https://blog.cloudflare.com/dynamic-workers/"><u>sandbox</u></a> to give their agent, or <a href="https://workers.cloudflare.com/"><u>anything else</u></a>, you make one API call to Cloudflare to provision a new Cloudflare account to them, and get back a token to make authenticated requests on their behalf.</p><p>Or let’s say you want Cloudflare customers to be able to easily provision your service, similar to how Cloudflare is partnering with Planetscale to make it possible to <a href="https://blog.cloudflare.com/deploy-planetscale-postgres-with-workers/"><u>create Planetscale Postgres databases directly from Cloudflare</u></a>. We started working with Planetscale on this well before this new protocol got off the ground, but the flow here is quite similar. Cloudflare acts as the Orchestrator, letting you connect to your PlanetScale account, create databases, and use the user’s existing payment method for billing.</p><p>This new protocol starts to standardize the types of cross-product integrations that many platforms have been doing for years, often in ways that were one off or bespoke to a particular platform. Without a standard, each integration required engineering work that often couldn’t be leveraged for future integrations. Similar to how the <a href="https://oauth.net/2/"><u>OAuth standard</u></a> made it possible to delegate access to your account to other platforms, the protocol uses OAuth and extends further into payments and account creation, doing so in a way that treats agents as a first-class concern.</p><p>We’re excited to continue evolving the standard, and to work with Stripe on sharing a more official specification soon. We’re also excited to integrate with more platforms —  email us at <a href="#"><u>agenticpartnerships@cloudflare.com</u></a>, and tell us how you want your platform to integrate with Cloudflare.</p>
    <div>
      <h2>Give your agent the power to provision and pay</h2>
      <a href="#give-your-agent-the-power-to-provision-and-pay">
        
      </a>
    </div>
    <p>Stripe Projects is in open beta, and you can get started even if you don’t yet have a Cloudflare account. Just install the <a href="https://docs.stripe.com/stripe-cli/install"><u>Stripe CLI</u></a>, log in to Stripe, and then start a new project:</p>
            <pre><code>stripe projects init</code></pre>
            <p>Prompt your agent to build something new on Cloudflare, and show us what you’ve built!</p> ]]></content:encoded>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Agents]]></category>
            <category><![CDATA[Registrar]]></category>
            <category><![CDATA[AI]]></category>
            <guid isPermaLink="false">AFvlhoSnkPLdXbRwjUsCU</guid>
            <dc:creator>Sid Chatterjee</dc:creator>
            <dc:creator>Brendan Irvine-Broque</dc:creator>
        </item>
        <item>
            <title><![CDATA[Making Rust Workers reliable: panic and abort recovery in wasm‑bindgen]]></title>
            <link>https://blog.cloudflare.com/making-rust-workers-reliable/</link>
            <pubDate>Wed, 22 Apr 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ Panics in Rust Workers were historically fatal, poisoning the entire instance. By collaborating upstream on the wasm‑bindgen project, Rust Workers now support resilient critical error recovery, including panic unwinding using WebAssembly Exception Handling. ]]></description>
            <content:encoded><![CDATA[ <p><a href="https://developers.cloudflare.com/workers/languages/rust/"><u>Rust Workers</u></a> run on the Cloudflare Workers platform by compiling Rust to WebAssembly, but as we’ve found, WebAssembly has some sharp edges. When things go wrong with a panic or an unexpected abort, the runtime can be left in an undefined state. For users of Rust Workers, panics were historically fatal, poisoning the instance and possibly even bricking the Worker for a period of time.</p><p>While we were able to detect and mitigate these issues, there remained a small chance that a Rust Worker would unexpectedly fail and cause other requests to fail along with it. An unhandled Rust abort in a Worker affecting one request might escalate into a broader failure affecting sibling requests or even continue to affect new incoming requests. The root cause of this was in wasm-bindgen, the core project that generates the Rust-to-JavaScript bindings Rust Workers depend on, and its lack of built-in recovery semantics.</p><p>In this post, we’ll share how the latest version of Rust Workers handles comprehensive Wasm error recovery that solves this abort-induced sandbox poisoning. This work has been contributed back into <a href="https://github.com/wasm-bindgen/wasm-bindgen"><u>wasm-bindgen</u></a> as part of our collaboration within the wasm-bindgen organization <a href="https://blog.rust-lang.org/inside-rust/2025/07/21/sunsetting-the-rustwasm-github-org/"><u>formed last year</u></a>. First with <code>panic=unwind</code> support, which ensures that a single failed request never poisons other requests, and then with abort recovery mechanisms that guarantee Rust code on Wasm can never re-execute after an abort. </p>
    <div>
      <h2>Initial recovery mitigations</h2>
      <a href="#initial-recovery-mitigations">
        
      </a>
    </div>
    <p>Our initial attempts to address reliability in this area focused on understanding and containing failures caused by Rust panics and aborts in production Rust Workers. We introduced a custom Rust panic handler that tracked failure state within a Worker and triggered full application reinitialization before handling subsequent requests. On the JavaScript side, this required wrapping the Rust-JavaScript call boundary using Proxy‑based indirection to ensure that all entrypoints were consistently encapsulated. We also made targeted modifications to the generated bindings to correctly reinitialize the WebAssembly module after a failure.</p><p>While this approach relied on custom JavaScript logic, it demonstrated that reliable recovery was achievable and eliminated the persistent failure modes we were seeing in practice. This solution was shipped by default to all workers‑rs users starting in version 0.6, and it laid the groundwork for the more general, upstreamed abort recovery mechanisms described in the sections that follow.</p>
    <div>
      <h2>Implementing <code>panic=unwind</code> with WebAssembly Exception Handling</h2>
      <a href="#implementing-panic-unwind-with-webassembly-exception-handling">
        
      </a>
    </div>
    <p>The abort recovery mechanisms described above ensure that a Worker can survive a failure, but they do so by reinitializing the entire application. For stateless request handlers, this is fine. But for workloads that hold meaningful state in memory, such as Durable Objects, reinitialization means losing that state entirely. A single panic in one request could wipe the in-memory state being used by other concurrent requests.</p><p>In most native Rust environments, panics can be unwound, allowing destructors to run and the program to recover without losing state. In WebAssembly, things historically looked very different. Rust compiled to Wasm via <code>wasm32-unknown-unknown</code> defaults to <code>panic=abort</code>, so a panic inside a Rust Worker would abruptly trap with an <code>unreachable</code> instruction and exit Wasm back to JS with a <code>WebAssembly.RuntimeError</code>.</p><p>To recover from panics without discarding instance state, we needed <code>panic=unwind</code> support for <code>wasm32-unknown-unknown</code> in wasm-bindgen, made possible by the WebAssembly Exception Handling proposal, which gained wide engine support in 2023.</p><p>We start by compiling with <code>RUSTFLAGS='-Cpanic=unwind' cargo build -Zbuild-std</code>, which rebuilds the standard library with unwind support and generates code with proper panic unwinding. For example:</p>
            <pre><code>struct HasDropA;
struct HasDropB;
extern "C" {
    fn imported_func();
}

fn some_func() {
    let a = HasDropA;
    let b = HasDropB;
    imported_func();
}</code></pre>
            <p>compiles to WebAssembly as:</p>
            <pre><code>try
  call &lt;imported_func&gt;
catch_all
  call &lt;drop_b&gt;
  call &lt;drop_a&gt;
  rethrow
end
call &lt;drop_b&gt;
call &lt;drop_a&gt;</code></pre>
            <p>This ensures that even if <code>imported_func()</code> panics, destructors still run. Similarly, <code>std::panic::catch_unwind(|| some_func())</code> compiles into:</p>
            <pre><code>try
  call &lt;some_func&gt;
  ;; set result to Ok(return value)
catch
  try
    call &lt;std::panicking::catch_unwind::cleanup&gt;
    ;; set result to Err(panic payload)
  catch_all
    call &lt;core::panicking::cannot_unwind&gt;
    unreachable
  end
end</code></pre>
            <p>Getting this to work end-to-end required several changes to the wasm-bindgen toolchain. The WebAssembly parser Walrus did not know how to handle try/catch instructions, so we added support for them. The descriptor interpreter also needed to be taught how to evaluate code containing exception handling blocks. At that point, the full application could be built with <code>panic=unwind</code>.</p><p>The final step was modifying the exports generated by wasm-bindgen to catch panics at the Rust-JavaScript boundary and surface them as JavaScript <code>PanicError</code> exceptions. One subtlety: Rust will catch foreign exceptions and abort when unwinding through <code>extern "C"</code> functions, so exports needed to be marked <code>extern "C-unwind"</code> to explicitly allow unwinding across the boundary. For futures, a panic rejects the JavaScript <code>Promise</code> with a <code>PanicError</code>.</p><p>Closures required special attention to ensure unwind safety was properly checked, via a new <code>MaybeUnwindSafe</code> trait that checks <code>UnwindSafe</code> only when built with <code>panic=unwind</code>. This quickly exposed a problem, though: many closures capture references that remain after an unwind, making them inherently unwind-unsafe. To avoid a situation where users are encouraged to incorrectly wrap closures in <code>AssertUnwindSafe</code> just to satisfy the compiler, we added <code>Closure::new_aborting</code> variants, which terminate on panic instead of unwinding in cases where unwind safety can't be guaranteed.</p><p>With panic unwinding enabled:</p><ul><li><p>Panics in exported Rust functions are caught by wasm-bindgen</p></li><li><p>Panics surface to JavaScript as PanicError exceptions</p></li><li><p>Async exports reject their returned promises with a PanicError</p></li><li><p>Rust destructors run correctly</p></li><li><p>The WebAssembly instance remains valid and reusable</p></li></ul><p>The full details of the approach and how to use it in wasm-bindgen are covered in the latest guide page for <a href="https://wasm-bindgen.github.io/wasm-bindgen/reference/catch-unwind.html"><u>Wasm Bindgen: Catching Panics</u></a>.</p>
    <div>
      <h2>Abort recovery</h2>
      <a href="#abort-recovery">
        
      </a>
    </div>
    <p>Even with <code>panic=unwind</code> support, aborts still happen - out-of-memory errors being one common cause. Because aborts can’t unwind, there is no possibility of state recovery at all, but we can at least detect and recover from aborts for future operations to avoid invalid state erroring subsequent requests.</p><p>Panic unwind support introduced a new problem for abort recovery. When we receive an error from Wasm we don’t know if it came from an <code>extern “C-unwind”</code> foreign error, or if it was a genuine abort. Aborts can take many shapes in WebAssembly.</p><p>We had two options to solve this technically: either mark all errors which are definitely aborts, or mark all errors which are definitely unwinds. Either could have worked but we chose the latter. Since our foreign exception handling was directly using raw WAT-level (WebAssembly text format) Exception Handling instructions already, we found it easier to implement exception tags for foreign exceptions to distinguish them from aborting non-unwind-safe exceptions.</p><p>With the ability to clearly distinguish between recoverable and non-recoverable errors thanks to this <code>Exception.Tag</code> feature in WebAssembly Exception Handling, we were able to then integrate both a new abort handler as well as abort reentrancy guards.

A new abort hook, <code>set_on_abort</code>, can be used at initialization time to attach a handler that recovers accordingly for the platform embedding’s needs.</p><p>Hardening panic and abort handling is critical to avoiding invalid execution state. WebAssembly allows deeply interleaved call stacks, where Wasm can call into JavaScript and JavaScript can re-enter Wasm at arbitrary depths, while alongside this, multiple tasks can be functioning in the same instance. Previously, an abort occurring in one task or nested stack was not guaranteed to invalidate higher stacks through JS, leading to undefined behavior. Care was required to ensure we can guarantee the execution model, and contribution in this space remains ongoing.</p><p>While aborts are never ideal, and reinitialization on failure is an absolute worst-case scenario, implementing critical error recovery as the last line of defense ensures execution correctness and that future operations will be able to succeed. The invalid state does not persist, ensuring a single failure does not cascade into multiple failures.</p>
    <div>
      <h2>Extension: abort reinitialization for wasm-bindgen libraries</h2>
      <a href="#extension-abort-reinitialization-for-wasm-bindgen-libraries">
        
      </a>
    </div>
    <p>While we were working on this, we realized that this is a common problem for libraries used by JS that are built with wasm-bindgen, and that they would also benefit from attaching an abort handler to be able to perform recovery.</p><p>But when building Wasm as an ES module and importing it directly (e.g. via <code>import { func } from ‘wasm-dep’</code>), it’s not clear what the recovery mechanism would be for a Wasm abort while calling <code>func()</code> for an already-linked and initialized library that is in a user JS application.</p><p>While not strictly a Rust Workers use case, our team also supports JS-based Workers users who run Rust-backed Wasm library dependencies. If we could fix this problem at the same time, that could indirectly also benefit Wasm usage on the Cloudflare Workers platform.</p><p>To support automatic abort recovery for Wasm library use cases, we added support for an experimental reinitialization mechanism into wasm‑bindgen, <code>--reset-state-function</code>. This exposes a function that allows the Rust application to effectively request that it reset its internal Wasm instance back to its initial state for the next call, without requiring consumers of the generated bindings to reimport or recreate them. Class instances from the old instance will throw as their handles become orphaned, but new classes can then be constructed. The JS application using a Wasm library is errored but not bricked.</p><p>The full technical details of this feature and how to use it in wasm-bindgen are covered in the new wasm-bindgen guide section <a href="https://wasm-bindgen.github.io/wasm-bindgen/reference/handling-aborts.html"><u>Wasm Bindgen: Handling Aborts</u></a>.</p>
    <div>
      <h2>Maturing the Rust Wasm Exception Handling ecosystem</h2>
      <a href="#maturing-the-rust-wasm-exception-handling-ecosystem">
        
      </a>
    </div>
    <p>Upstream contributions for this work did not stop at the wasm-bindgen project. Building for Wasm with <code>panic=unwind</code> still requires an experimental nightly Rust target, so we’ve also been working to advance Rust’s Wasm support for WebAssembly Exception Handling to help bring this to stable Rust.</p><p>During the development of WebAssembly Exception Handling, a late‑stage specification change resulted in two variants: <a href="https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/legacy/Exceptions.md"><u>legacy exception handling</u></a> and the final modern <a href="https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md"><u>exception handling "with exnref"</u></a>. Today, Rust’s WebAssembly targets still default to emitting code for the legacy variant. While legacy exception handling is widely supported, it is now deprecated.</p><p>Modern WebAssembly Exception Handling is supported as of the following JS platform releases:</p><table><tr><td><p><b>Runtime</b></p></td><td><p><b>Version</b></p></td><td><p><b>Release Date</b></p></td></tr><tr><td><p>v8</p></td><td><p>13.8.1</p></td><td><p>April 28, 2025</p></td></tr><tr><td><p>workerd</p></td><td><p>v1.20250620.0</p></td><td><p>June 19, 2025</p></td></tr><tr><td><p>Chrome</p></td><td><p>138</p></td><td><p>June 28, 2025</p></td></tr><tr><td><p>Firefox</p></td><td><p>131</p></td><td><p>October 1, 2024</p></td></tr><tr><td><p>Safari</p></td><td><p>18.4</p></td><td><p>March 31, 2025</p></td></tr><tr><td><p>Node.js</p></td><td><p>25.0.0</p></td><td><p>October 15, 2025</p></td></tr></table><p>As we were investigating the support matrix, the largest concern ended up being the Node.js 24 LTS release schedule, which would have left the entire ecosystem stuck on legacy WebAssembly Exception Handling until April 2028.</p><p>Having discovered this discrepancy, we were able to backport modern exception handling to the Node.js 24 release, and even backport the fixes needed to make it work on the Node.js 22 release line to ensure support for this target. This should allow the modern Exception Handling proposal to become the default target next year.</p><p>Over the coming months, we’ll be working to make the transition to stable <code>panic=unwind</code> and modern Exception Handling as invisible as possible to end users.</p><p>While these long‑term investments in the ecosystem take time, they help build a stronger foundation for the Rust WebAssembly community as a whole, and we’re glad to be able to contribute to these improvements.</p>
    <div>
      <h2>Using panic unwind in Rust Workers</h2>
      <a href="#using-panic-unwind-in-rust-workers">
        
      </a>
    </div>
    <p>As of version 0.8.0 of Rust Workers, we have a new <code>--panic-unwind</code> flag, which can be added to the build command, following the <a href="https://github.com/cloudflare/workers-rs?tab=readme-ov-file#panic-recovery-with---panic-unwind"><u>instructions here</u></a>.</p><p>With this flag, panics can be fully recovered, and abort recovery will use the new abort classification and recovery hook mechanism. We highly recommend upgrading and trying it out for a more stable Rust Workers experience, and plan to make <code>panic=unwind</code> the default in a subsequent release. Users remaining on <code>panic=abort</code> will still continue to take advantage of the previous custom recovery wrapper handling from 0.6.0.</p>
    <div>
      <h2>Committing to Rust Workers stability</h2>
      <a href="#committing-to-rust-workers-stability">
        
      </a>
    </div>
    <p>This work is part of our ongoing effort towards a stable release for Rust Workers. By solving these sharp edges of the Wasm platform foundations at their root, and contributing back to the ecosystem where it makes sense, we build stronger foundations not just for our platform, but the entire Rust, JS, and Wasm ecosystem.</p><p>We have a number of future improvements planned for Rust Workers, and we’ll soon be sharing updates on this additional work, including wasm-bindgen generics and automated bindgen, which Guy Bedford from our team previewed in a talk on <a href="https://www.youtube.com/watch?v=zlSJY8Qv5XI"><u>Rust &amp; JS Interoperability at Wasm.io</u></a> last month.</p><p>Find us in <b>#rust‑on‑workers</b> on the <a href="https://discord.com/invite/cloudflaredev"><u>Cloudflare Discord</u></a>. We also welcome feedback and discussion and especially all new contributors to the <a href="https://github.com/cloudflare/workers-rs"><u>workers-rs</u></a> and <a href="https://github.com/wasm-bindgen/wasm-bindgen"><u>wasm-bindgen</u></a> GitHub projects.</p> ]]></content:encoded>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Rust]]></category>
            <category><![CDATA[Rust Workers]]></category>
            <category><![CDATA[WebAssembly]]></category>
            <category><![CDATA[WASM]]></category>
            <category><![CDATA[Reliability]]></category>
            <category><![CDATA[Engineering]]></category>
            <category><![CDATA[Open Source]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Developers]]></category>
            <guid isPermaLink="false">2StUNK716oSircL9mJUjrg</guid>
            <dc:creator>Guy Bedford</dc:creator>
            <dc:creator>Hood Chatham</dc:creator>
            <dc:creator>Logan Gatlin</dc:creator>
        </item>
        <item>
            <title><![CDATA[Building the agentic cloud: everything we launched during Agents Week 2026]]></title>
            <link>https://blog.cloudflare.com/agents-week-in-review/</link>
            <pubDate>Mon, 20 Apr 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ Agents Week 2026 is a wrap. Let’s take a look at everything we announced, from compute and security to the agent toolbox, platform tools, and the emerging agentic web. Everything we shipped for the agentic cloud.
 ]]></description>
            <content:encoded><![CDATA[ <p></p><p>Today marks the end of our first Agents Week, an innovation week dedicated entirely to the age of agents. It couldn’t have been more timely: over the past year, agents have swiftly changed how people work. Coding agents are helping developers ship faster than ever. Support agents resolve tickets end-to-end. Research agents validate hypotheses across hundreds of sources in minutes. And people aren't just running one agent: they're running several in parallel and around the clock.</p><p>As Cloudflare's CTO Dane Knecht and VP of Product Rita Kozlov noted in our <a href="https://blog.cloudflare.com/welcome-to-agents-week/"><u>welcome to Agents Week post</u></a>, the potential scale of agents is staggering: If even a fraction of the world's knowledge workers each run a few agents in parallel, you need compute capacity for tens of millions of simultaneous sessions. The one-app-serves-many-users model the cloud was built on doesn't work for that. But that's exactly what developers and businesses want to do: build agents, deploy them to users, and run them at scale.</p><p>Getting there means solving problems across the entire stack. Agents need <b>compute</b> that scales from full operating systems to lightweight isolates. They need <b>security</b> and identity built into how they run.  They need an <b>agent toolbox</b>: the right models, tools, and context to do real work. All the code that agents generate needs a clear path from afternoon <b>prototype to production</b> app. And finally, as agents drive a growing share of Internet traffic, the web itself needs to adapt for the emerging <b>agentic web</b>. Turns out, the containerless, serverless compute platform we launched eight years ago with Workers was ready-made for this moment. Since then, we've grown it into a full platform, and this week we shipped the next wave of primitives purpose-built for agents, organized around exactly those problems.</p><p>We are here to create Cloud 2.0 — the agentic cloud. Infrastructure designed for a world where agents are a primary workload. </p><p>Here's a list of everything we announced this week — we wouldn’t want you to miss a thing.</p>
    <div>
      <h2>Compute</h2>
      <a href="#compute">
        
      </a>
    </div>
    <p>It starts with compute. Agents need somewhere to run, and somewhere to store and run the code they write. Not all agents need the same thing: some need a full operating system to install packages and run terminal commands, most need something lightweight that starts in milliseconds and scales to millions. This week we shipped the environments to run them, as well as a new Git-compatible workspace for agents:</p><table><tr><th><p><b>Announcement</b></p></th><th><p><b>Summary</b></p></th></tr><tr><td><p><a href="https://blog.cloudflare.com/artifacts-git-for-agents-beta"><u>Artifacts: Versioned storage that speaks Git</u></a></p></td><td><p>Give your agents, developers, and automations a home for code and data. We’ve just launched Artifacts: Git-compatible versioned storage built for agents. Create tens of millions of repos, fork from any remote, and hand off a URL to any Git client.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/sandbox-ga/"><u>Agents have their own computers with Sandboxes GA</u></a></p></td><td><p>Cloudflare Sandboxes give AI agents a persistent, isolated environment: a real computer with a shell, a filesystem, and background processes that starts on demand and picks up exactly where it left off.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/sandbox-auth/"><u>Dynamic, identity-aware, and secure: egress controls for Sandboxes</u></a></p></td><td><p>Outbound Workers for Sandboxes provide a programmable, zero-trust egress proxy for AI agents. This allows developers to inject credentials and enforce dynamic security policies without exposing sensitive tokens to untrusted code.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/durable-object-facets-dynamic-workers/"><u>Durable Objects in Dynamic Workers: Give each AI-generated app its own database</u></a></p></td><td><p>Durable Object Facets allows Dynamic Workers to instantiate Durable Objects with their own isolated SQLite databases. This enables developers to build platforms that run persistent, stateful code generated on-the-fly.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/workflows-v2/"><u>Rearchitecting the Workflows control plane for the agentic era</u></a></p></td><td><p>Cloudflare Workflows, a durable execution engine for multi-step applications, now supports 50,000 concurrency and 300 creation rate limits through a rearchitectured control plane, helping scale to meet the use cases for durable background agents.</p></td></tr></table>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6GiG5cEdgwoLU94AuUuLfS/f9022abc8ce823ef2468860474598b6f/BLOG-3239_2.png" />
          </figure>
    <div>
      <h2>Security</h2>
      <a href="#security">
        
      </a>
    </div>
    <p>Running agents and their code is only half the challenge. Agents connect to private networks, access internal services, and take autonomous actions on behalf of users. When anyone in an organization can spin up their own agents, security can't be an afterthought. It has to be the default. This week, we launched the tools to make that easy.</p><table><tr><th><p><b>Announcement</b></p></th><th><p><b>Summary</b></p></th></tr><tr><td><p><a href="https://blog.cloudflare.com/mesh/"><u>Secure private networking for everyone: users, nodes, agents, Workers — introducing Cloudflare Mesh</u></a></p></td><td><p>Cloudflare Mesh provides secure, private network access for users, nodes, and autonomous AI agents. By integrating with Workers VPC, developers can now grant agents scoped access to private databases and APIs without manual tunnels.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/managed-oauth-for-access/"><u>Managed OAuth for Access: make internal apps agent-ready in one click</u></a></p></td><td><p>Managed OAuth for Cloudflare Access helps AI agents securely navigate internal applications. By adopting RFC 9728, agents can authenticate on behalf of users without using insecure service accounts.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/improved-developer-security/"><u>Securing non-human identities: automated revocation, OAuth, and scoped permissions</u></a></p></td><td><p>Cloudflare is introducing scannable API tokens, enhanced OAuth visibility, and GA for resource-scoped permissions. These tools help developers implement a true least-privilege architecture while protecting against credential leakage.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/enterprise-mcp/"><u>Scaling MCP adoption: our reference architecture for enterprise MCP deployments</u></a></p></td><td><p>We share Cloudflare's internal strategy for governing MCP using Access, AI Gateway, and MCP server portals. We also launch Code Mode to slash token costs and recommend new rules for detecting Shadow MCP in Cloudflare Gateway.</p></td></tr></table>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6dT2Kw2vS0AJvk6Nw1oAg2/c1b9ca10314f85664ce6a939339f1247/BLOG-3239_3.png" />
          </figure>
    <div>
      <h2>Agent Toolbox</h2>
      <a href="#agent-toolbox">
        
      </a>
    </div>
    <p>A capable agent needs to be able to think and remember, communicate, and see. This means being powered with the right models, with access to the right tools and the right context for their task at hand. This week we shipped the primitives — inference, search, memory, voice, email, and a browser — that turn an agent into something that actually gets work done.</p><table><tr><th><p><b>Announcement</b></p></th><th><p><b>Summary</b></p></th></tr><tr><td><p><a href="https://blog.cloudflare.com/project-think/"><u>Project Think: building the next generation of AI agents on Cloudflare</u></a></p></td><td><p>Announcing a preview of the next edition of the Agents SDK — from lightweight primitives to a batteries-included platform for AI agents that think, act, and persist.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/voice-agents/"><u>Add voice to your agent</u></a></p></td><td><p>An experimental voice pipeline for the Agents SDK enables real-time voice interactions over WebSockets. Developers can now build agents with continuous STT and TTS in just ~30 lines of server-side code.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/email-for-agents/"><u>Cloudflare Email Service: now in public beta. Ready for your agents</u></a></p></td><td><p>Agents are becoming multi-channel. That means making them available wherever your users already are — including the inbox. Cloudflare Email Service enters public beta with the infrastructure layer to make that easy: send, receive, and process email natively from your agents.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/ai-platform/"><u>Cloudflare's AI platform: an inference layer designed for agents </u></a></p></td><td><p>We're building Cloudflare into a unified inference layer for agents, letting developers call models from 14+ providers. New features include Workers binding for running third-party models and an expanded catalog with multimodal models.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/high-performance-llms/"><u>Building the foundation for running extra-large language models</u></a></p></td><td><p>We built a custom technology stack to run fast large language models on Cloudflare’s infrastructure. This post explores the engineering trade-offs and technical optimizations required to make high-performance AI inference accessible.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/unweight-tensor-compression/"><u>Unweight: how we compressed an LLM 22% without sacrificing quality</u></a></p></td><td><p>Running large LLMs across Cloudflare’s network requires us to be smarter and more efficient about GPU memory bandwidth. That’s why we developed Unweight, a lossless inference-time compression system that achieves up to a 22% model footprint reduction, so that we can deliver faster and cheaper inference than ever before. </p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/introducing-agent-memory/"><u>Agents that remember: introducing Agent Memory</u></a></p></td><td><p>Cloudflare Agent Memory is a managed service that gives AI agents persistent memory, allowing them to recall what matters, forget what doesn't, and get smarter over time.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/ai-search-agent-primitive/"><u>AI Search: the search primitive for your agents</u></a></p></td><td><p>AI Search is the search primitive for your agents. Create instances dynamically, upload files, and search across instances with hybrid retrieval and relevance boosting. Just create a search instance, upload, and search.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/browser-run-for-ai-agents/"><u>Browser Run: give your agents a browser</u></a></p></td><td><p>Browser Rendering is now Browser Run, with Live View, Human in the Loop, CDP access, session recordings, and 4x higher concurrency limits for AI agents.</p></td></tr></table>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3sTmO0Pt0PabTR2g8pZ963/a3f523a17db3a13301072232664b11c2/BLOG-3239_4.png" />
          </figure>
    <div>
      <h2>Prototype to production</h2>
      <a href="#prototype-to-production">
        
      </a>
    </div>
    <p>The best infrastructure is also one that’s easy to use. We want to meet developers and their agents where they’re already working: in the terminal, in the editor, in a prompt, and make the full Cloudflare platform accessible without context-switching.</p><table><tr><th><p><b>Announcement</b></p></th><th><p><b>Summary</b></p></th></tr><tr><td><p><a href="https://blog.cloudflare.com/cf-cli-local-explorer/"><u>Building a CLI for all of Cloudflare</u></a></p></td><td><p>We’re introducing cf, a new unified CLI designed for consistency across the Cloudflare platform, alongside Local Explorer for debugging local data. These tools simplify how developers and AI agents interact with our nearly 3,000 API operations.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/introducing-agent-lee/"><u>Introducing Agent Lee - a new interface to the Cloudflare stack</u></a></p></td><td><p>Agent Lee is an in-dashboard agent that shifts Cloudflare’s interface from manual tab-switching to a single prompt. Using sandboxed TypeScript, it helps you troubleshoot and manage your stack as a grounded technical collaborator.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/flagship/"><u>Introducing Flagship: feature flags built for the age of AI</u></a></p></td><td><p>Introducing Flagship, a native feature flag service built on Cloudflare’s global network to eliminate the latency of third-party providers. By using KV and Durable Objects, Flagship allows for sub-millisecond flag evaluation.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/deploy-planetscale-postgres-with-workers/"><u>Deploy Postgres and MySQL databases with PlanetScale + Workers</u></a></p></td><td><p>Learn how to deploy PlanetScale Postgres and MySQL databases via Cloudflare and connect Cloudflare Workers.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/registrar-api-beta/"><u>Register domains wherever you build: Cloudflare Registrar API now in beta</u></a></p></td><td><p>The Cloudflare Registrar API is now in beta. Developers and AI agents can search, check availability, and register domains at cost directly from their editor, their terminal, or their agent — without leaving their workflow.</p></td></tr></table>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1BuOIU5Q3gfMIekZJEiBra/732c42b78aa397e7bb19b3816c7d1516/BLOG-3239_5.png" />
          </figure>
    <div>
      <h2>Agentic Web</h2>
      <a href="#agentic-web">
        
      </a>
    </div>
    <p>As more agents come online, they're still browsing an Internet that was built for people. Existing websites need new tools to control what bots can access their content, package and present it for agents, and measure how ready they are for this shift.</p><table><tr><th><p><b>Announcement</b></p></th><th><p><b>Summary</b></p></th></tr><tr><td><p><a href="https://blog.cloudflare.com/agent-readiness/"><u>Introducing the Agent Readiness score. Is your site agent-ready?</u></a></p></td><td><p>The Agent Readiness score can help site owners understand how well their websites support AI agents. Here we explore new standards, share Radar data, and detail how we made Cloudflare’s docs the most agent-friendly on the web.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/ai-redirects/"><u>Redirects for AI Training enforces canonical content</u></a></p></td><td><p>Soft directives don’t stop crawlers from ingesting deprecated content. Redirects for AI Training allows anybody on Cloudflare to redirect verified crawlers to canonical pages with one toggle and no origin changes.</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/network-performance-agents-week/"><u>Agents Week: Network performance update</u></a></p></td><td><p>By migrating our request handling layer to a Rust-based architecture called FL2, Cloudflare has increased its performance lead to 60% of the world’s top networks. We use real-user measurements and TCP connection trimeans to ensure our data reflects the actual experience of people on the Internet</p></td></tr><tr><td><p><a href="https://blog.cloudflare.com/shared-dictionaries/"><u>Shared dictionary compression that keeps up with the agentic web</u></a></p></td><td><p>We give you a sneak peek of our support for shared compression dictionaries, show you how it improves page load times, and reveal when you’ll be able to try the beta yourself.</p></td></tr></table>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5jKSnMla0hclK999LtHfuu/886f66ffda1c699a5a15bf6748a6cf9b/BLOG-3239_6.png" />
          </figure>
    <div>
      <h2>That’s a wrap</h2>
      <a href="#thats-a-wrap">
        
      </a>
    </div>
    <p>Agents Week 2026 is ending, but the agentic cloud is just getting started. Everything we shipped this week — from compute and security to the agent toolbox and the agentic web — is the foundation. We're going to keep building on it to give you everything you need to build what's next.</p><p>We also have more blog posts coming out today and tomorrow to continue the story, so keep an eye out for the latest <a href="https://blog.cloudflare.com/"><u>at our blog</u></a>.</p><p>If you're building on any of what we announced this week, we want to hear about it. Come find us on <a href="https://x.com/cloudflaredev"><u>X</u></a> or <a href="https://discord.com/invite/cloudflaredev"><u>Discord</u></a>, or head to the <a href="https://developers.cloudflare.com/products/?product-group=Developer+platform"><u>developer documentation</u></a>. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/YLNvB0KU0Eh3KpAj6YgD6/77c190843d1fa76b212571f1d607d8d2/BLOG-3239_7.png" />
          </figure><p></p> ]]></content:encoded>
            <category><![CDATA[Agents Week]]></category>
            <category><![CDATA[Agents]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Durable Objects]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[SDK]]></category>
            <category><![CDATA[Browser Run]]></category>
            <category><![CDATA[Cloudflare Access]]></category>
            <category><![CDATA[Browser Rendering]]></category>
            <category><![CDATA[MCP]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Sandbox]]></category>
            <category><![CDATA[LLM]]></category>
            <category><![CDATA[Cloudflare Gateway]]></category>
            <category><![CDATA[Workers AI]]></category>
            <category><![CDATA[Product News]]></category>
            <category><![CDATA[API]]></category>
            <guid isPermaLink="false">25CSwW9eXDM4FJOdVPLf8f</guid>
            <dc:creator>Ming Lu</dc:creator>
            <dc:creator>Anni Wang</dc:creator>
        </item>
        <item>
            <title><![CDATA[The AI engineering stack we built internally — on the platform we ship]]></title>
            <link>https://blog.cloudflare.com/internal-ai-engineering-stack/</link>
            <pubDate>Mon, 20 Apr 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ We built our internal AI engineering stack on the same products we ship. That means 20 million requests routed through AI Gateway, 241 billion tokens processed, and inference running on Workers AI, serving more than 3,683 internal users. Here's how we did it.
 ]]></description>
            <content:encoded><![CDATA[ <p></p><p>In the last 30 days, 93% of Cloudflare’s R&amp;D organization used AI coding tools powered by infrastructure we built on our own platform.</p><p>Eleven months ago, we undertook a major project: to truly integrate AI into our engineering stack. We needed to build the internal MCP servers, access layer, and AI tooling necessary for agents to be useful at Cloudflare. We pulled together engineers from across the company to form a tiger team called iMARS (Internal MCP Agent/Server Rollout Squad). The sustained work landed with the Dev Productivity team, who also own much of our internal tooling including CI/CD, build systems, and automation.</p><p>Here are some numbers that capture our own agentic AI use over the last 30 days:</p><ul><li><p><b>3,683 internal users</b> actively using AI coding tools (60% company-wide, 93% across R&amp;D), out of approximately 6,100 total employees</p></li><li><p><b>47.95 million</b> AI requests </p></li><li><p><b>295 teams</b> are currently utilizing agentic AI tools and coding assistants.</p></li><li><p><b>20.18 million</b> AI Gateway requests per month</p></li><li><p><b>241.37 billion</b> tokens routed through AI Gateway</p></li><li><p><b>51.83 billion</b> tokens processed on Workers AI</p></li></ul><p>The impact on developer velocity internally is clear: we’ve never seen a quarter-to-quarter increase in merge requests to this degree.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3lMKwBTT3m7DDmS4BnoNhB/9002104b9c6c09cf72f4052d71e22363/BLOG-3270_2.png" />
          </figure><p>As AI tooling adoption has grown the 4-week rolling average has climbed from ~5,600/week to over 8,700. The week of March 23 hit 10,952, nearly double the Q4 baseline.</p><p>MCP servers were the starting point, but the team quickly realized we needed to go further: rethink how standards are codified, how code gets reviewed, how engineers onboard, and how changes propagate across thousands of repos.</p><p>This post dives deep into what that looked like over the past eleven months and where we ended up. We're publishing now, to close out Agents Week, because the AI engineering stack we built internally runs on the same products we’re shipping and enhancing this week.</p>
    <div>
      <h2>The architecture at a glance</h2>
      <a href="#the-architecture-at-a-glance">
        
      </a>
    </div>
    <p>The engineer-facing tools layer (<a href="https://opencode.ai/"><u>OpenCode</u></a>, Windsurf, and other MCP-compatible clients) include both open-source and third-party coding assistant tools.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2oZJhdVNbW05lJaSnN3hds/21c5427f275ba908388333884530f455/image8.png" />
          </figure><p>Each layer maps to a Cloudflare product or tool we use:</p><table><tr><th><p>What we built</p></th><th><p>Built with</p></th></tr><tr><td><p>Zero Trust authentication</p></td><td><p><a href="https://developers.cloudflare.com/cloudflare-one/"><b><u>Cloudflare Access</u></b></a></p></td></tr><tr><td><p>Centralized LLM routing, cost tracking, BYOK, and Zero Data Retention controls</p></td><td><p><a href="https://developers.cloudflare.com/ai-gateway/"><b><u>AI Gateway</u></b></a></p></td></tr><tr><td><p>On-platform inference with open-weight models</p></td><td><p><a href="https://developers.cloudflare.com/workers-ai/"><b><u>Workers AI</u></b></a></p></td></tr><tr><td><p>MCP Server Portal with single OAuth</p></td><td><p><a href="https://developers.cloudflare.com/workers/"><b><u>Workers</u></b></a> + <a href="https://developers.cloudflare.com/cloudflare-one/"><b><u>Access</u></b></a></p></td></tr><tr><td><p>AI Code Reviewer CI integration</p></td><td><p><a href="https://developers.cloudflare.com/workers/"><b><u>Workers</u></b></a> + <a href="https://developers.cloudflare.com/ai-gateway/"><b><u>AI Gateway</u></b></a></p></td></tr><tr><td><p>Sandboxed execution for agent-generated code (Code Mode)</p></td><td><p><a href="https://blog.cloudflare.com/dynamic-workers/"><b><u>Dynamic Workers</u></b></a></p></td></tr><tr><td><p>Stateful, long-running agent sessions</p></td><td><p><a href="https://developers.cloudflare.com/agents/"><b><u>Agents SDK</u></b></a> (McpAgent, Durable Objects)</p></td></tr><tr><td><p>Isolated environments for cloning, building, and testing</p></td><td><p><a href="https://developers.cloudflare.com/sandbox/"><b><u>Sandbox SDK</u></b></a> — GA as of Agents Week</p></td></tr><tr><td><p>Durable multi-step workflows</p></td><td><p><a href="https://developers.cloudflare.com/workflows/"><b><u>Workflows</u></b></a> — scaled 10x during Agents Week</p></td></tr><tr><td><p>16K+ entity knowledge graph</p></td><td><p><a href="https://backstage.io/"><b><u>Backstage</u></b></a> (OSS)</p></td></tr></table><p>None of this is internal-only infrastructure. Everything (besides Backstage) listed above is a shipping product, and many of them got substantial updates during Agents Week.</p><p>We’ll walk through this in three acts:</p><ol><li><p><b>The platform layer</b> — how authentication, routing, and inference work (AI Gateway, Workers AI, MCP Portal, Code Mode)</p></li><li><p><b>The knowledge layer</b> — how agents understand our systems (Backstage, AGENTS.md)</p></li><li><p><b>The enforcement layer</b> — how we keep quality high at scale (AI Code Reviewer, Engineering Codex)</p></li></ol>
    <div>
      <h2>Act 1: The platform layer</h2>
      <a href="#act-1-the-platform-layer">
        
      </a>
    </div>
    
    <div>
      <h3>How AI Gateway helped us stay secure and improve the developer experience</h3>
      <a href="#how-ai-gateway-helped-us-stay-secure-and-improve-the-developer-experience">
        
      </a>
    </div>
    <p>When you have over 3,600+ internal users using AI coding tools daily, you need to solve for access and visibility across many clients, use cases, and roles.</p><p>Everything starts with <a href="https://developers.cloudflare.com/cloudflare-one/"><b><u>Cloudflare Access</u></b></a>, which handles all authentication and zero-trust policy enforcement. Once authenticated, every LLM request routes through <a href="https://developers.cloudflare.com/ai-gateway/"><b><u>AI Gateway</u></b></a>. This gives us a single place to manage provider keys, cost tracking, and data retention policies.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5zorZ21OXIbNg7VACGzwZX/e1b35951622bab6ae7ca9fa9b8a5c844/BLOG-3270_4.png" />
          </figure><p><sup><i>The OpenCode AI Gateway overview: 688.46k requests per day, 10.57B tokens per day, routing to four providers through one endpoint.</i></sup></p><p>AI Gateway analytics show how monthly usage is distributed across model providers. Over the last month, internal request volume broke down as follows.</p><table><tr><th><p>Provider</p></th><th><p>Requests/month</p></th><th><p>Share</p></th></tr><tr><td><p>Frontier Labs (OpenAI, Anthropic, Google)</p></td><td><p>13.38M</p></td><td><p>91.16%</p></td></tr><tr><td><p>Workers AI</p></td><td><p>1.3M</p></td><td><p>8.84%</p></td></tr></table><p>Frontier models handle the bulk of complex agentic coding work for now, but Workers AI is already a significant part of the mix and handles an increasing share of our agentic engineering workloads.</p>
    <div>
      <h4>How we increasingly leverage Workers AI</h4>
      <a href="#how-we-increasingly-leverage-workers-ai">
        
      </a>
    </div>
    <p><a href="https://developers.cloudflare.com/workers-ai/"><b><u>Workers AI</u></b></a> is Cloudflare's serverless AI inference platform which runs open-source models on GPUs across our global network. Beyond huge cost improvements compared to frontier models, a key advantage is that inference stays on the same network as your Workers, Durable Objects, and storage. No cross-cloud hops to deal with, which cause more latency, network flakiness, and additional networking configuration to manage.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2WFqiZSL1RGbD2O7gOPDfR/2edc8c69fdea89fef76e600cb3ee5384/BLOG-3270_5.png" />
          </figure><p><sup><i>Workers AI usage in the last month: 51.47B input tokens, 361.12M output tokens.</i></sup></p><p><a href="https://blog.cloudflare.com/workers-ai-large-models/"><b><u>Kimi K2.5</u></b></a>, launched on Workers AI in March 2026, is a frontier-scale open-source model with a 256k context window, tool calling, and structured outputs. As we described in our <a href="https://blog.cloudflare.com/workers-ai-large-models/"><u>Kimi K2.5 launch post</u></a>, we have a security agent that processes over 7 billion tokens per day on Kimi. That would cost an estimated $2.4M per year on a mid-tier proprietary model. But on Workers AI, it's 77% cheaper.</p><p>Beyond security, we use Workers AI for documentation review in our CI pipeline, for generating AGENTS.md context files across thousands of repositories, and for lightweight inference tasks where same-network latency matters more than peak model capability.</p><p>As open-source models continue to improve, we expect Workers AI to handle a growing share of our internal workloads. </p><p>One thing we got right early: routing through a single proxy Worker from day one. We could have had clients connect directly to AI Gateway, which would have been simpler to set up initially. But centralizing through a Worker meant we could add per-user attribution, model catalog management, and permission enforcement later without touching any client configs. Every feature described in the bootstrap section below exists because we had that single choke point. The proxy pattern gives you a control plane that direct connections don't, and if we plug in additional coding assistant tools later, the same Worker and discovery endpoint will handle them.</p>
    <div>
      <h4>How it works: one URL to configure everything</h4>
      <a href="#how-it-works-one-url-to-configure-everything">
        
      </a>
    </div>
    <p>The entire setup starts with one command:</p>
            <pre><code>opencode auth login https://opencode.internal.domain
</code></pre>
            <p>That command triggers a chain that configures providers, models, MCP servers, agents, commands, and permissions, without the user touching a config file.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3sK7wVF3QbeNJ0rjLBabXh/05b9098197b7a3d084e6d860ed22b169/BLOG-3270_6.png" />
          </figure><p><b>Step 1: Discover auth requirements.</b> OpenCode fetches <a href="https://opencode.ai/docs/config/"><u>config</u></a> from a URL like <code>https://opencode.internal.domain/.well-known/opencode</code>. </p><p>This discovery endpoint is served by a Worker and the response has an <code>auth</code> block telling OpenCode how to authenticate, along with a <code>config</code> block with providers, MCP servers, agents, commands, and default permissions:</p>
            <pre><code>{
  "auth": {
    "command": ["cloudflared", "access", "login", "..."],
    "env": "TOKEN"
  },
  "config": {
    "provider": { "..." },
    "mcp": { "..." },
    "agent": { "..." },
    "command": { "..." },
    "permission": { "..." }
  }
}
</code></pre>
            <p><b>Step 2: Authenticate via Cloudflare Access.</b> OpenCode runs the auth command and the user authenticates through the same SSO they use for everything else at Cloudflare. <code>cloudflared</code> returns a signed JWT. OpenCode stores it locally and automatically attaches it to every subsequent provider request.</p><p><b>Step 3: Config is merged into OpenCode.</b> The config provided is shared defaults for the entire organization, but local configs always take priority. Users can override the default model, add their own agents, or adjust project and user scoped permissions without affecting anyone else.</p><p><b>Inside the proxy Worker.</b> The Worker is a simple Hono app that does three things:</p><ol><li><p><b>Serves the shared config.</b> The config is compiled at deploy time from structured source files and contains placeholder values like {baseURL} for the Worker's origin. At request time, the Worker replaces these, so all provider requests route through the Worker rather than directly to model providers. Each provider gets a path prefix (<code>/anthropic, /openai, /google-ai-studio/v1beta, /compat</code> for Workers AI) that the Worker forwards to the corresponding AI Gateway route.</p></li><li><p><b>Proxies requests to AI Gateway.</b> When OpenCode sends a request like <code>POST /anthropic/v1/messages</code>, the Worker validates the Cloudflare Access JWT, then rewrites headers before forwarding:
</p>
            <pre><code>Stripped:   authorization, cf-access-token, host
Added:      cf-aig-authorization: Bearer &lt;API_KEY&gt;
            cf-aig-metadata: {"userId": "&lt;anonymous-uuid&gt;"}
</code></pre>
            <p>The request goes to AI Gateway, which routes it to the appropriate provider. The response passes straight through with zero buffering. The <code>apiKey</code> field in the client config is empty because the Worker injects the real key server-side. No API keys exist on user machines.</p></li><li><p><b>Keeps the model catalog fresh.</b> An hourly cron trigger fetches the current OpenAI model list from <code>models.dev</code>, caches it in Workers KV, and injects <code>store: false</code> on every model for Zero Data Retention. New models get ZDR automatically without a config redeploy.</p></li></ol><p><b>Anonymous user tracking.</b> After JWT validation, the Worker maps the user's email to a UUID using D1 for persistent storage and KV as a read cache. AI Gateway only ever sees the anonymous UUID in <code>cf-aig-metadata</code>, never the email. This gives us per-user cost tracking and usage analytics without exposing identities to model providers or Gateway logs.</p><p><b>Config-as-code. </b>Agents and commands are authored as markdown files with YAML frontmatter. A build script compiles them into a single JSON config validated against the OpenCode JSON schema. Every new session picks up the latest version automatically.</p><p>The overall architecture is simple and easy for anyone to deploy with our developer platform: a proxy Worker, Cloudflare Access, AI Gateway, and a client-accessible discovery endpoint that configures everything automatically. Users run one command and they're done. There’s nothing for them to configure manually, no API keys on laptops or MCP server connections to manually set up. Making changes to our agentic tools and updating what 3,000+ people get in their coding environment is just a <code>wrangler deploy</code> away.</p>
    <div>
      <h3>The MCP Server Portal: one OAuth, multiple MCP tools</h3>
      <a href="#the-mcp-server-portal-one-oauth-multiple-mcp-tools">
        
      </a>
    </div>
    <p>We described our full approach to governing MCP at enterprise scale in a <a href="https://blog.cloudflare.com/enterprise-mcp/"><u>separate post</u></a>, including how we use <a href="https://developers.cloudflare.com/cloudflare-one/access-controls/ai-controls/mcp-portals/"><u>MCP Server Portals</u></a>, Cloudflare Access, and Code Mode together. Here's the short version of what we built internally.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/36gUHwTs8CzZeS03l9yno1/42953a6dc2ac944f4dbe31e3ae51570e/BLOG-3270_7.png" />
          </figure><p>Our internal portal aggregates 13 production MCP servers exposing 182+ tools across Backstage, GitLab, Jira, Sentry, Elasticsearch, Prometheus, Google Workspace, our internal Release Manager, and more. This unifies access and simplifies everything giving us one endpoint and one Cloudflare Access flow governing access to every tool.</p><p>Each MCP server is built on the same foundation: McpAgent from the Agents SDK, <a href="https://github.com/cloudflare/workers-oauth-provider"><b><u>workers-oauth-provider</u></b></a> for OAuth, and Cloudflare Access for identity. The whole thing lives in a single monorepo with shared auth infrastructure, Bazel builds, CI/CD pipelines, and <code>catalog-info.yaml </code>for Backstage registration. Adding a new server is mostly copying an existing one and changing the API it wraps. For more on how this works and the security architecture behind it, see <a href="https://blog.cloudflare.com/enterprise-mcp/"><u>our enterprise MCP reference architecture</u></a>.</p>
    <div>
      <h3>Code Mode at the portal layer</h3>
      <a href="#code-mode-at-the-portal-layer">
        
      </a>
    </div>
    <p>MCP is the right protocol for connecting AI agents to tools, but it has a practical problem: every tool definition consumes context window tokens before the model even starts working. As the number of MCP servers and tools grows, so does the token overhead, and at scale, this becomes a real cost. <a href="https://blog.cloudflare.com/code-mode/"><u>Code Mode </u></a>is the emerging fix: instead of loading every tool schema up front, the model discovers and calls tools through code.</p><p>Our GitLab MCP server originally exposed 34 individual tools (<code>get_merge_request</code>, <code>list_pipelines</code>, <code>get_file_content</code>, and so on). Those 34 tool schemas consumed roughly 15,000 tokens of context window per request. On a 200K context window, that's 7.5% of the budget gone before asking a question. Multiplied across every request, every engineer, every day, it adds up.</p><p>MCP Server Portals now support Code Mode proxying, which lets us solve that problem centrally instead of one server at a time. Rather than exposing every upstream tool definition to the client, the portal collapses them into two portal-level tools: <code>portal_codemode_search and portal_codemode_execute</code>.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7jQa4HPmrOaOhUojZCJQ8q/4e81201065a50dd67c07e257b725e8b8/BLOG-3270_8.png" />
          </figure><p>The nice thing about doing this at the portal layer is that it scales cleanly. Without Code Mode, every new MCP server adds more schema overhead to every request. With portal-level Code Mode, the client still only sees two tools even as we connect more servers behind the portal. That means less context bloat, lower token cost, and a cleaner architecture overall.</p>
    <div>
      <h2>Act 2: The knowledge layer</h2>
      <a href="#act-2-the-knowledge-layer">
        
      </a>
    </div>
    
    <div>
      <h3>Backstage: the knowledge graph underneath all of it</h3>
      <a href="#backstage-the-knowledge-graph-underneath-all-of-it">
        
      </a>
    </div>
    <p>Before the iMARS team could build MCP servers that were actually useful, we needed to solve a more fundamental problem: structured data about our services and infrastructure. We need our agents to understand context outside the code base, like who owns what, how services depend on each other, where the documentation lives, and what databases a service talks to.</p><p>We run <a href="https://backstage.io/"><u>Backstage</u></a>, the open-source internal developer portal originally built by Spotify, as our service catalog. It's self-hosted (not on Cloudflare products, for the record) and it tracks things like:</p><ul><li><p>2,055 services, 167 libraries, and 122 packages</p></li><li><p>228 APIs with schema definitions</p></li><li><p>544 systems (products) across 45 domains</p></li><li><p>1,302 databases, 277 ClickHouse tables, 173 clusters</p></li><li><p>375 teams and 6,389 users with ownership mappings</p></li><li><p>Dependency graphs connecting services to the databases, Kafka topics, and cloud resources they rely on</p></li></ul><p>Our Backstage MCP server (13 tools) is available through our MCP Portal, and an agent can look up who owns a service, check what it depends on, find related API specs, and pull Tech Insights scores, all without leaving the coding session.</p><p>Without this structured data, agents are working blind. They can read the code in front of them, but they can't see the system around it. The catalog turns individual repos into a connected map of the engineering organization.</p>
    <div>
      <h3>AGENTS.md: getting thousands of repos ready for AI</h3>
      <a href="#agents-md-getting-thousands-of-repos-ready-for-ai">
        
      </a>
    </div>
    <p>Early in the rollout, we kept seeing the same failure mode: coding agents produced changes that looked plausible and were still wrong for the repo. Usually the problem was local context: the model didn't know the right test command, the team's current conventions, or which parts of the codebase were off-limits. That pushed us toward AGENTS.md: a short, structured file in each repo that tells coding agents how the codebase actually works and forces teams to make that context explicit.</p>
    <div>
      <h4>What AGENTS.md looks like</h4>
      <a href="#what-agents-md-looks-like">
        
      </a>
    </div>
    <p>We built a system that generates AGENTS.md files across our GitLab instance. Because these files sit directly in the model's context window, we wanted them to stay short and high-signal. A typical file looks like this:</p>
            <pre><code># AGENTS.md

## Repository
- Runtime: cloudflare workers
- Test command: `pnpm test`
- Lint command: `pnpm lint`

## How to navigate this codebase
- All cloudflare workers  are in src/workers/, one file per worker
- MCP server definitions are in src/mcp/, each tool in a separate file
- Tests mirror source: src/foo.ts -&gt; tests/foo.test.ts

## Conventions
- Testing: use Vitest with `@cloudflare/vitest-pool-workers` (Codex: RFC 021, RFC 042)
- API patterns: Follow internal REST conventions (Codex: API-REST-01)

## Boundaries
- Do not edit generated files in `gen/`
- Do not introduce new background jobs without updating `config/`

## Dependencies
- Depends on: auth-service, config-service
- Depended on by: api-gateway, dashboard
</code></pre>
            <p></p><p>When an agent reads this file, it doesn't have to infer the repo from scratch. It knows how the codebase is organized, which conventions to follow and which Engineering Codex rules apply.</p>
    <div>
      <h4>How we generate them at scale</h4>
      <a href="#how-we-generate-them-at-scale">
        
      </a>
    </div>
    <p>The generator pipeline pulls entity metadata from our Backstage service catalog (ownership, dependencies, system relationships), analyzes the repository structure to detect the language, build system, test framework, and directory layout, then maps the detected stack to relevant Engineering Codex standards. A capable model then generates the structured document, and the system opens a merge request so the owning team can review and refine it.</p><p>We've processed roughly 3,900 repositories this way. The first pass wasn't always perfect, especially for polyglot repos or unusual build setups, but even that baseline was much better than asking agents to infer everything from scratch.</p><p>The initial merge request solved the bootstrap problem, but keeping these files current mattered just as much. A stale AGENTS.md can be worse than no file at all. We closed that loop with the AI Code Reviewer, which can flag when repository changes suggest that AGENTS.md should be updated.</p>
    <div>
      <h2>Act 3: The enforcement layer</h2>
      <a href="#act-3-the-enforcement-layer">
        
      </a>
    </div>
    
    <div>
      <h3>The AI Code Reviewer</h3>
      <a href="#the-ai-code-reviewer">
        
      </a>
    </div>
    <p>Every merge request at Cloudflare gets an AI code review. Integration is straightforward: teams add a single CI component to their pipeline, and from that point every MR is reviewed automatically.</p><p>We use GitLab's self-hosted solution as our CI/CD platform. The reviewer is implemented as a GitLab CI component that teams include in their pipeline. When an MR is opened or updated, the CI job runs <a href="https://opencode.ai/"><u>OpenCode</u></a> with a multi-agent review coordinator. The coordinator classifies the MR by risk tier (trivial, lite, or full) and delegates to specialized review agents: code quality, security, codex compliance, documentation, performance, and release impact. Each agent connects to the AI Gateway for model access, pulls Engineering Codex rules from a central repo, and reads the repository's AGENTS.md for codebase context. Results are posted back as structured MR comments.</p><p>A separate Workers-based config service handles centralized model selection per reviewer agent, so we can shift models without changing the CI template. The review process itself runs in the CI runner and is stateless per execution.</p>
    <div>
      <h3>The output format</h3>
      <a href="#the-output-format">
        
      </a>
    </div>
    
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5CKN6frPHygTkAHBDrY3y7/e04eb00702a25627291c904fbde2e7ea/BLOG-3270_9.png" />
          </figure><p>We spent time getting the output format right. Reviews are broken into categories (Security, Code Quality, Performance) so engineers can scan headers rather than reading walls of text. Each finding has a severity level (Critical, Important, Suggestion, or Optional Nits) that makes it immediately clear what needs attention versus what's informational.</p><p>The reviewer maintains context across iterations. If it flagged something in a previous review round that has since been fixed, it acknowledges that rather than re-raising the same issue. And when a finding maps to an Engineering Codex rule, it cites the specific rule ID, turning an AI suggestion into a reference to an organizational standard.</p><p>Workers AI handles about 15% of the reviewer's traffic, primarily for documentation review tasks where Kimi K2.5 performs well at a fraction of the cost of frontier models. Models like Opus 4.6 and GPT 5.4 handle security-sensitive and architecturally complex reviews where reasoning capability matters most.</p><p>Over the last 30 days:</p><ul><li><p><b>100%</b> AI code reviewer coverage across all repos on our standard CI pipeline.</p></li><li><p>5.47M AI Gateway requests</p></li><li><p>24.77B tokens processed</p></li></ul><p>We're releasing a <a href="http://blog.cloudflare.com/ai-code-review"><u>detailed technical blog post</u></a> alongside this one that covers the reviewer's internal architecture, including how we route between models, the multi-agent orchestration, and the cost optimization strategies we've developed.</p>
    <div>
      <h3>Engineering Codex: engineering standards as agent skills</h3>
      <a href="#engineering-codex-engineering-standards-as-agent-skills">
        
      </a>
    </div>
    <p>The <b>Engineering Codex</b> is Cloudflare's new internal standards system where our core engineering standards live. We have a multi-stage AI distillation process, which outputs a set of codex rules ("If you need X, use Y. You must do X, if you are doing Y or Z.") along with an agent skill that uses progressive disclosure and nested hierarchical information directories and links across markdown files. </p><p>This skill is available for engineers to use locally as they build with prompts like “how should I handle errors in my Rust service?” or “review this TypeScript code for compliance.” Our Network Firewall team audited <code>rampartd</code> using a multi-agent consensus process where every requirement was scored COMPLIANT, PARTIAL, or NON-COMPLIANT with specific violation details and remediation steps reducing what previously required weeks of manual work to a structured, repeatable process.</p><p>At review time, the AI Code Reviewer cites specific Codex rules in its feedback.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6vXFmpfwqv1Xqo5CszCsEB/ada6417470b2b7533a12244578cbd9d0/BLOG-3270_10.png" />
          </figure><p><sup> </sup><sup><i>AI Code Review: showing categorized findings (Codex Compliance in this case) noting the codex RFC violation.</i></sup></p><p>None of these pieces are especially novel on their own. Plenty of companies run service catalogs, ship reviewer bots, or publish engineering standards. The difference is the wiring. When an agent can pull context from Backstage, read AGENTS.md for the repo it’s editing, and get reviewed against Codex rules by the same toolchain, the first draft is usually close enough to ship. That wasn’t true six months ago.</p>
    <div>
      <h2>The scoreboard</h2>
      <a href="#the-scoreboard">
        
      </a>
    </div>
    <p>From launching this effort to 93% R&amp;D adoption took less than a year.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1OtTpHobBRjorxOV2dWG8w/9b9dc80cba9741ee65e8565f63091e1a/BLOG-3270_11.png" />
          </figure><p><b>Company-wide adoption (Feb 5 – April 15, 2026):</b></p><table><tr><th><p>Metric</p></th><th><p>Value</p></th></tr><tr><td><p>Active users</p></td><td><p><b>3,683</b> (60% of the company)</p></td></tr><tr><td><p>R&amp;D team adoption</p></td><td><p><b>93%</b></p></td></tr><tr><td><p>AI messages</p></td><td><p><b>47.95M</b></p></td></tr><tr><td><p>Teams with AI activity</p></td><td><p><b>295</b></p></td></tr><tr><td><p>OpenCode messages</p></td><td><p><b>27.08M</b></p></td></tr><tr><td><p>Windsurf messages</p></td><td><p><b>434.9K</b></p></td></tr></table><p><b>AI Gateway (last 30 days, combined):</b></p><table><tr><th><p>Metric</p></th><th><p>Value</p></th></tr><tr><td><p>Requests</p></td><td><p><b>20.18M</b></p></td></tr><tr><td><p>Tokens</p></td><td><p><b>241.37B</b></p></td></tr></table><p><b>Workers AI (last 30 days):</b></p><table><tr><th><p>Metric</p></th><th><p>Value</p></th></tr><tr><td><p>Input tokens</p></td><td><p><b>51.47B</b></p></td></tr><tr><td><p>Output tokens</p></td><td><p><b>361.12M</b></p></td></tr></table>
    <div>
      <h2>What's next: background agents</h2>
      <a href="#whats-next-background-agents">
        
      </a>
    </div>
    <p>The next evolution in our internal engineering stack will include background agents: agents that can be spun up on demand with the same tools available locally (MCP portal, git, test runners) but running entirely in the cloud. The architecture uses Durable Objects and the Agents SDK for orchestration, delegating to Sandbox containers when the job requires a full development environment like cloning a repo, installing dependencies, or running tests. The <a href="https://blog.cloudflare.com/sandbox-ga/"><u>Sandbox SDK went GA</u></a> during Agents Week.</p><p>Long-running agents, <a href="https://blog.cloudflare.com/project-think/"><u>shipped natively into the Agents SDK</u></a> during Agents Week, solve the durable session problem that previously required workarounds. The SDK now supports sessions that run for extended periods without eviction, enough for an agent to clone a large repo, run a full test suite, iterate on failures, and open a MR in a single session.</p><p>This represents an eleven-month effort to rethink not just how code gets written, but how it gets reviewed, how standards are enforced, and how changes ship safely across thousands of repos. Every layer runs on the same products our customers use.</p>
    <div>
      <h2>Start building</h2>
      <a href="#start-building">
        
      </a>
    </div>
    <p>Agents Week just <a href="https://blogs.cloudflare.com/agents-week-in-review"><u>shipped</u></a> everything you need. The platform is here.</p>
            <pre><code>npx create-cloudflare@latest --template cloudflare/agents-starter
</code></pre>
            <p>That agents starter gets you running. The diagram below is the full architecture for when you’re ready to grow it, your tools layer on top (chatbot, web UI, CLI, browser extension), the Agents SDK handling session state and orchestration in the middle, and the Cloudflare services you call from it underneath.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3Ebzq5neZbshbrhb5WysYY/54335791b2f731e2095fd57a4fe11cf3/image11.png" />
          </figure><p><b>Docs:</b> <a href="https://developers.cloudflare.com/agents/"><u>Agents SDK</u></a> · <a href="https://developers.cloudflare.com/sandbox/"><u>Sandbox SDK</u></a> · <a href="https://developers.cloudflare.com/ai-gateway/"><u>AI Gateway</u></a> · <a href="https://developers.cloudflare.com/workers-ai/"><u>Workers AI</u></a> · <a href="https://developers.cloudflare.com/workflows/"><u>Workflows</u></a> · <a href="https://developers.cloudflare.com/agents/api-reference/codemode/"><u>Code Mode</u></a> · <a href="https://blog.cloudflare.com/model-context-protocol/"><u>MCP on Cloudflare</u></a></p><p><b>Repos:</b> <a href="https://github.com/cloudflare/agents"><u>cloudflare/agents</u></a> · <a href="https://github.com/cloudflare/sandbox-sdk"><u>cloudflare/sandbox-sdk</u></a> · <a href="https://github.com/cloudflare/mcp-server-cloudflare"><u>cloudflare/mcp-server-cloudflare</u></a> · <a href="https://github.com/cloudflare/skills"><u>cloudflare/skills</u></a></p><p>For more on how we’re using AI at Cloudflare, read the post on <a href="http://blog.cloudflare.com/ai-code-review"><u>our process for AI Code Review</u></a>. And check out <a href="https://www.cloudflare.com/agents-week/updates/"><u>everything we shipped during Agents Week</u></a>.</p><p>We’d love to hear what you build. Find us on <a href="https://discord.cloudflare.com/"><u>Discord</u>,</a> <a href="https://x.com/CloudflareDev"><u>X</u>, and</a> <a href="https://bsky.app/profile/cloudflare.social"><u>Bluesky</u></a>.</p><p><sup><i>Ayush Thakur built the AGENTS.md system and the AI Gateway integration for the OpenCode infrastructure, Scott Roemeschke is the Engineering Manager of the Developer Productivity team at Cloudflare, Rajesh Bhatia leads the Productivity Platform function at Cloudflare. This post was a collaborative effort across the Devtools team, with help from volunteers across the company through the iMARS (Internal MCP Agent/Server Rollout Squad) tiger team.</i></sup></p> ]]></content:encoded>
            <category><![CDATA[Agents Week]]></category>
            <category><![CDATA[Agents]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[SASE]]></category>
            <category><![CDATA[MCP]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Cloudflare Gateway]]></category>
            <category><![CDATA[Product News]]></category>
            <category><![CDATA[Workers AI]]></category>
            <guid isPermaLink="false">4z7ZJ5ZbY3e3YwJtkg6NiD</guid>
            <dc:creator>Ayush Thakur</dc:creator>
            <dc:creator>Scott Roe-Meschke</dc:creator>
            <dc:creator>Rajesh Bhatia</dc:creator>
        </item>
        <item>
            <title><![CDATA[Introducing Flagship: feature flags built for the age of AI]]></title>
            <link>https://blog.cloudflare.com/flagship/</link>
            <pubDate>Fri, 17 Apr 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ We are launching Flagship, a native feature flag service built on Cloudflare’s global network to eliminate the latency of third-party providers. By using KV and Durable Objects, Flagship allows for sub-millisecond flag evaluation. ]]></description>
            <content:encoded><![CDATA[ <p>AI is writing more code than ever. AI-assisted contributions now account for a rapidly growing share of new code across the platform. Agentic coding tools like OpenCode and Claude Code are shipping entire features in minutes.</p><p>AI-generated code entering production is only going to accelerate. But the bigger shift isn't just speed — it's autonomy.</p><p>Today, an AI agent writes code and a human reviews, merges, and deploys it. Tomorrow, the agent does all of that itself. The question becomes: how do you let an agent ship to production without removing every safety net?</p><p>Feature flags are the answer. An agent writes a new code path behind a flag and deploys it — the flag is off, so nothing changes for users. The agent then enables the flag for itself or a small test cohort, exercises the feature in production, and observes the results. If metrics look good, it ramps the rollout. If something breaks, it disables the flag. The human doesn't need to be in the loop for every step — they set the boundaries, and the flag controls the blast radius.</p><p>This is the workflow feature flags were always building toward: not just decoupling deployment from release, but decoupling human attention from every stage of the shipping process. The agent moves fast because the flag makes it safe to move fast.</p><p>Today, we're announcing Flagship — Cloudflare's native feature flag service, built on <a href="https://openfeature.dev/"><u>OpenFeature</u></a>, the CNCF open standard for feature flag evaluation. It works everywhere — Workers, Node.js, Bun, Deno, and the browser — but it's fastest on Workers, where flags are evaluated within the Cloudflare network. With the Flagship binding and OpenFeature, integration looks like this:</p>
            <pre><code>await OpenFeature.setProviderAndWait(
    new FlagshipServerProvider({ binding: env.FLAGS })
);</code></pre>
            <p>Flagship is now available in closed beta.</p>
    <div>
      <h3>The problem with feature flags on Workers</h3>
      <a href="#the-problem-with-feature-flags-on-workers">
        
      </a>
    </div>
    <p>Many Cloudflare developers have resorted to the pragmatic workaround: hardcoding flag logic directly into their Workers. And honestly, it works well enough in the beginning. Workers deploy in seconds, so flipping a boolean in code and pushing it to production is fast enough for most situations.</p><p>But it doesn't stay simple. One hardcoded flag becomes ten. Ten becomes fifty, owned by different teams, with no central view of what's on or off. There's no audit trail — when something breaks, you're searching <code>git blame</code> to figure out who toggled what.</p>
    <div>
      <h4>Network call to external services</h4>
      <a href="#network-call-to-external-services">
        
      </a>
    </div>
    <p>Another common pattern used on workers is to make an HTTP request to an external service in the following manner:</p>
            <pre><code>const response = await fetch("https://flags.example-service.com/v1/evaluate", {
      ...
      body: JSON.stringify({
        flagKey: "new-checkout-flow",
        context: {
          ...
        },
      }),
    });
const { value } = await response.json();
if (value === true) {
    return handleNewCheckout(request);
}
return handleLegacyCheckout(request);</code></pre>
            <p>That outbound request sits on the critical path of every single user request. It could add considerable latency depending on how far the user is from the flag service's region.</p><p>This is a strange situation. Your application runs at the edge, milliseconds from the user. But the feature flag check forces it to reach back across the Internet to another API before it can decide what to render.</p>
    <div>
      <h4>Why local evaluation doesn't solve the problem</h4>
      <a href="#why-local-evaluation-doesnt-solve-the-problem">
        
      </a>
    </div>
    <p>Some feature flag services offer a "local evaluation" SDK. Instead of calling a remote API on every request, the SDK downloads the full set of flag rules into memory and evaluates them locally. No outbound request per evaluation and the flag decision happens in-process.</p><p>On Workers, none of these assumptions hold. There is no long-lived process: a Worker isolate can be created, serve a request, and be evicted between one request and the next. A new invocation could mean re-initializing the SDK from scratch.</p><p>On a serverless platform, you need a distribution primitive that's already at the edge, one where the caching is managed for you, reads are local, and you don't need a persistent connection to keep things up to date.</p><p>Cloudflare KV is a great primitive for this!</p>
    <div>
      <h3>How Flagship works</h3>
      <a href="#how-flagship-works">
        
      </a>
    </div>
    <p>Flagship is built entirely on Cloudflare's infrastructure — Workers, Durable Objects, and KV. There are no external databases, no third-party services, and no centralized origin servers in the evaluation path.</p><p>When you create or update a flag, the control plane writes the change atomically to a <a href="https://developers.cloudflare.com/durable-objects/"><u>Durable Object</u></a> — a SQLite-backed, globally unique instance that serves as the source of truth for that app's flag configuration and changelog. Within seconds, the updated flag config is synced to <a href="https://developers.cloudflare.com/kv/"><u>Workers KV</u></a>, Cloudflare's globally distributed key-value store, where it's replicated across Cloudflare's network.</p><p>When a request evaluates a flag, Flagship reads the flag config directly from KV at the edge — the same Cloudflare location already handling the request. The evaluation engine then runs right there in an <a href="https://developers.cloudflare.com/workers/reference/how-workers-works/#isolates"><u>isolate</u></a>: it matches the request context against the flag's targeting rules, resolves the rollout percentage, and returns a variation. Both the data and the logic live at the edge — nothing is sent elsewhere to be evaluated.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1nKDxB6OkUr1fpVKGWEugC/8fc4dbf0bd1ec45e30010a223982487e/image2.png" />
          </figure>
    <div>
      <h3>Using Flagship: the Worker binding</h3>
      <a href="#using-flagship-the-worker-binding">
        
      </a>
    </div>
    <p>For teams running Cloudflare Workers, Flagship offers a direct binding that evaluates flags inside the Workers runtime — no HTTP round-trip, no SDK overhead. Add the binding to your <code>wrangler.jsonc</code> and your Worker is connected:</p>
            <pre><code>{
  "flagship": [
    {
      "binding": "FLAGS",
      "app_id": "&lt;APP_ID&gt;"
    }
  ]
}</code></pre>
            <p>That's it. Your account ID is inferred from your Cloudflare account, and the <code>app_id</code> ties the binding to a specific Flagship app. In your Worker, you just ask for a flag value:</p>
            <pre><code>export default {
  async fetch(request: Request, env: Env) {
    // Simple boolean check
    const showNewUI = await env.FLAGS.getBooleanValue('new-ui', false, {
      userId: 'user-42',
      plan: 'enterprise',
    });
    // Full evaluation details when you need them
    const details = await env.FLAGS.getStringDetails('checkout-flow', 'v1', {
      userId: 'user-42',
    });
    // details.value = "v2", details.variant = "new", details.reason = "TARGETING_MATCH"
  },
};</code></pre>
            <p>The binding supports typed accessors for every variation type - <code>getBooleanValue()</code>, <code>getStringValue()</code>, <code>getNumberValue()</code>, <code>getObjectValue()</code> - plus <code>*Details()</code> variants that return the resolved value alongside the matched variant and the reason it was selected. On evaluation errors, the default value is returned gracefully. On type mismatches, the binding throws an exception — that's a bug in your code, not a transient failure.</p>
    <div>
      <h3>The SDK: OpenFeature-native</h3>
      <a href="#the-sdk-openfeature-native">
        
      </a>
    </div>
    <p>Most feature flag SDKs come with their own interfaces and evaluation patterns. Over time, those become deeply embedded in your codebase — and switching providers means rewriting every call site.</p><p>We didn't want to build another one of those. Flagship is built on <a href="https://openfeature.dev/">OpenFeature</a>, the CNCF open standard for feature flag evaluation. OpenFeature defines a common interface for flag evaluation across languages and providers — it's the same relationship that OpenTelemetry has to observability. You write your evaluation code once against the standard, and swap providers by changing a single line of configuration.</p>
            <pre><code>import { OpenFeature } from '@openfeature/server-sdk';
import { FlagshipServerProvider } from '@cloudflare/flagship/server';
await OpenFeature.setProviderAndWait(
  new FlagshipServerProvider({
    appId: 'your-app-id',
    accountId: 'your-account-id',
    authToken: 'your-cloudflare-api-token',
  })
);
const client = OpenFeature.getClient();
const showNewCheckout = await client.getBooleanValue(
  'new-checkout-flow',
  false,
  {
    targetingKey: 'user-42',
    plan: 'enterprise',
    country: 'US',
  }
);</code></pre>
            <p>If you're running on Workers with the Flagship binding, you can pass it directly to the OpenFeature provider. The binding already carries your account context, so there's nothing to configure — authentication is implicit.</p>
            <pre><code>import { OpenFeature } from '@openfeature/server-sdk';
import { FlagshipProvider } from '@cloudflare/flagship/server';
let initialized = false;
export default {
  async fetch(request: Request, env: Env) {
    if (!initialized) {
      await OpenFeature.setProviderAndWait(
        new FlagshipServerProvider({ binding: env.FLAGS })
      );
      initialized = true;
    }
    const client = OpenFeature.getClient();
    const showNewCheckout = await client.getBooleanValue('new-checkout-flow', false, {
      targetingKey: 'user-42',
      plan: 'enterprise',
    });
  },
};</code></pre>
            <p>Your evaluation code doesn't change — the OpenFeature interface is identical. But under the hood, Flagship evaluates flags through the binding instead of over HTTP. You get the portability of the standard with the performance of the binding.</p><p>A client-side provider is also available for browsers. It pre-fetches the flags you specify, caches them with a configurable TTL, and serves evaluations synchronously from that cache.</p>
    <div>
      <h3>What you can do with Flagship</h3>
      <a href="#what-you-can-do-with-flagship">
        
      </a>
    </div>
    <p>Flagship supports the patterns you'd expect from a feature flag service and the ones that become critical when AI-generated code is landing in production daily.</p><p>Flag values can be boolean, strings, numbers, or full JSON objects — useful for configuration blocks, UI theme definitions, or routing users to different API versions without maintaining separate code paths.</p>
    <div>
      <h4>Targeting Rules</h4>
      <a href="#targeting-rules">
        
      </a>
    </div>
    <p>Each flag can have multiple rules, evaluated in priority order. The first rule that matches wins.</p><p>A rule consists of:</p><ul><li><p>Conditions that determine whether the rule applies to a given context</p></li><li><p>A flag variation to serve when the rule matches</p></li><li><p>An optional rollout for percentage-based delivery</p></li><li><p>A priority that determines evaluation order when multiple rules are present (lower number = higher priority)</p></li></ul>
    <div>
      <h4>Nested Logical Conditions</h4>
      <a href="#nested-logical-conditions">
        
      </a>
    </div>
    <p>Conditions can be composed using AND/OR logic, nested up to five levels deep. A single rule can express things like:</p>
            <pre><code>(plan == “enterprise” AND region == “us” ) OR (user.email.endsWith(“@cloudflare.com”))
= serve (“premium”)</code></pre>
            <p>At the top level of a rule, multiple conditions are combined with implicit AND where all conditions must pass for the rule to match. Within each condition, you can nest AND/OR groups for more complex logic.</p>
    <div>
      <h4>Flag Rollouts by Percentage</h4>
      <a href="#flag-rollouts-by-percentage">
        
      </a>
    </div>
    <p>Unlike <a href="https://developers.cloudflare.com/workers/configuration/versions-and-deployments/gradual-deployments/"><u>gradual deployments</u></a>, which split traffic between different uploaded versions of your Worker, feature flags let you roll out behavior by percentage within a single version that is serving 100% of traffic. </p><p>Any rule can include a percentage rollout. Instead of serving a variation to everyone who matches the conditions, you serve it to a percentage of them. </p><p>Rollouts use consistent hashing on the specified context attribute. The same attribute value (userId, for example) always hashes to the same bucket, so they won't flip between variations across requests. You can ramp from 5% to 10% to 50% to 100% of users, so those who were already in the rollout stay in it.</p>
    <div>
      <h3>Built for what comes next</h3>
      <a href="#built-for-what-comes-next">
        
      </a>
    </div>
    <p>AI-generated code entering production is only going to accelerate. Agentic workflows will push it further — agents that autonomously deploy, test, and iterate on code in production. The teams that thrive in this world won't be the ones shipping the fastest. They'll be the ones who can ship fast and still maintain control over what their users see, roll back in seconds when something breaks, and gradually expose new code paths with confidence.</p><p>That's what Flagship is built for:</p><ul><li><p><b>Evaluation across region Earth,</b> cached globally using K/V.</p></li><li><p><b>A full audit trail.</b> Every flag change is recorded with field-level diffs, so you know who changed what and when.</p></li><li><p><b>Dashboard integration</b>. Anyone on the team can toggle a flag or adjust a rollout without touching code.</p></li><li><p><b>OpenFeature compatibility.</b> Adopt Flagship without rewriting your evaluation code. Leave without rewriting it either.</p></li></ul>
    <div>
      <h3>Get started with Flagship</h3>
      <a href="#get-started-with-flagship">
        
      </a>
    </div>
    <p>Starting today, Flagship is in private beta. You can request for access <a href="https://www.cloudflare.com/lp/flagship-private-beta/"><u>here</u></a>. We'll share more details on pricing as we approach general availability.</p><ul><li><p>Visit the<a href="https://dash.cloudflare.com"> <u>Cloudflare dashboard</u></a> to create your first Flagship app</p></li><li><p>Install the SDK: <code>npm i @cloudflare/flagship</code>; or use the Worker binding directly in your Worker</p></li><li><p>Read the <a href="https://developers.cloudflare.com/flagship/get-started/"><u>documentation</u></a> for integration guides and API reference</p></li><li><p>Check out the <a href="https://github.com/cloudflare/flagship"><u>source code</u></a> for examples and to contribute</p></li></ul><p>If you're currently hardcoding flags in your Workers, or evaluating flags through an external service that adds latency to every request, give Flagship a try. We'd love to hear what you build.</p> ]]></content:encoded>
            <category><![CDATA[Agents Week]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Durable Objects]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Feature Flags]]></category>
            <guid isPermaLink="false">3d8fI4f0doBxr4ybfKEymB</guid>
            <dc:creator>Rohan Mukherjee</dc:creator>
            <dc:creator>Abhishek Kankani</dc:creator>
        </item>
        <item>
            <title><![CDATA[Artifacts: versioned storage that speaks Git]]></title>
            <link>https://blog.cloudflare.com/artifacts-git-for-agents-beta/</link>
            <pubDate>Thu, 16 Apr 2026 13:01:00 GMT</pubDate>
            <description><![CDATA[ Give your agents, developers, and automations a home for code and data. We’ve just launched Artifacts: Git-compatible versioned storage built for agents. Create tens of millions of repos, fork from any remote, and hand off a URL to any Git client.
 ]]></description>
            <content:encoded><![CDATA[ <p>Agents have changed how we think about source control, file systems, and persisting state. Developers and agents are generating more code than ever — more code will be written over the next 5 years than in all of programming history — and it’s driven an order-of-magnitude change in the scale of the systems needed to meet this demand. Source control platforms are especially struggling here: they were built to meet the needs of humans, not a 10x change in volume driven by agents who never sleep, can work on several issues at once, and never tire.</p><p>We think there’s a need for a new primitive: a distributed, versioned filesystem that’s built for agents first and foremost, and that can serve the types of applications that are being built today.</p><p>We’re calling this Artifacts: a versioned file system that speaks Git. You can create repositories programmatically, alongside your agents, sandboxes, Workers, or any other compute paradigm, and connect to it from any regular Git client.</p><p>Want to give every agent session a repo? Artifacts can do it. Every sandbox instance? Also Artifacts. Want to create 10,000 forks from a known-good starting point? You guessed it: Artifacts again. Artifacts exposes a REST API and native Workers API for creating repositories, generating credentials, and commits for environments where a Git client isn’t the right fit (i.e. in any serverless function).</p><p>Artifacts is available in private beta and we’re aiming to open this up as a public beta by early May.</p>
            <pre><code>// Create a repo
const repo = await env.AGENT_REPOS.create(name)
// Pass back the token &amp; remote to your agent
return { repo.remote, repo.token }</code></pre>
            
            <pre><code># Clone it and use it like any regular git remote
$ git clone https://x:${TOKEN}@123def456abc.artifacts.cloudflare.net/git/repo-13194.git
</code></pre>
            <p>That’s it. A bare repo, ready to go, created on the fly, that any git client can operate it against.</p><p>And if you want to bootstrap an Artifacts repo from an existing git repository so that your agent can work on it independently and push independent changes, you can do that too with .import():</p>
            <pre><code>interface Env {
  ARTIFACTS: Artifacts
}

export default {
  async fetch(request: Request, env: Env) {
    // Import from GitHub
    const { remote, token } = await env.ARTIFACTS.import({
      source: {
        url: "https://github.com/cloudflare/workers-sdk",
        branch: "main",
      },
      target: {
        name: "workers-sdk",
      },
    })

    // Get a handle to the imported repo
    const repo = await env.ARTIFACTS.get("workers-sdk")

    // Fork to an isolated, read-only copy
    const fork = await repo.fork("workers-sdk-review", {
      readOnly: true,
    })

    return Response.json({ remote: fork.remote, token: fork.token })
  },
}</code></pre>
            <p><a href="http://developers.cloudflare.com/artifacts/"><u>Check out the documentation</u></a> to get started, or if you want to understand how Artifacts is being used, how it was built, and how it works under the hood: read on.</p>
    <div>
      <h2>Why Git? What’s a versioned file system?</h2>
      <a href="#why-git-whats-a-versioned-file-system">
        
      </a>
    </div>
    <p>Agents know Git. It’s deep in the training data of most models. The happy path <i>and </i>the edge cases are well known to agents, and code-optimized models (and/or harnesses) are particularly good at using git.</p><p>Further, Git’s data model is not only good for source control, but for <i>anything</i> where you need to track state, time travel, and persist large amounts of small data. Code, config, session prompts and agent history: all of these are things (“objects”) that you often want to store in small chunks (“commits”) and be able to revert or otherwise roll back to (“history”). </p><p>We could have invented an entirely new, bespoke protocol… but then you have the bootstrap problem. AI models don’t know it, so you have to distribute skills, or a CLI, or hope that users are plugged into your docs MCP… all of that adds friction.

If we can just give agents an authenticated, secure HTTPS Git remote URL and have them operate as if it were a Git repo, though? That turns out to work pretty well. And for non-Git-speaking clients — such as a Cloudflare Worker, a Lambda function, or a Node.js app — we’ve exposed a REST API and (soon) language-specific SDKs. Those clients can also use <a href="https://isomorphic-git.org/"><u>isomorphic-git</u></a>, but in many cases a simpler TypeScript API can reduce the API surface needed.</p>
    <div>
      <h3>Not just for source control</h3>
      <a href="#not-just-for-source-control">
        
      </a>
    </div>
    <p>Artifacts’ Git API might make you think it’s just for source control, but it turns out that the Git API and data model is a powerful way to persist state in a way that allows you to fork, time-travel and diff state for <i>any</i> data.</p><p>Inside Cloudflare, we’re using Artifacts for our internal agents: automatically persisting the current state of the filesystem <i>and</i> the session history in a per-session Artifacts repo. This enables us to:</p><ul><li><p>Persist sandbox state without having to provision (and keep) block storage around.</p></li><li><p>Share sessions with others and allow them to time-travel back through both session (prompt) state <i>and</i> file state, irrespective of whether there were commits to the “actual” repository (source control).</p></li><li><p>And the best: <i>fork</i> a session from any point, allowing our team to share sessions with a co-worker and have them pick it up from them. Debugging something and want another set of eyes? Send a URL and fork it. Want to riff on an API? Have a co-worker fork it and pick up from where you left off.</p></li></ul><p>We’ve also spoken to teams who want to use Artifacts in cases where the Git protocol isn’t a requirement at all, but the semantics (reverting, cloning, diffing) <i>are</i>. Storing per-customer config as part of your product, and want the ability to roll back? Artifacts can be a good representation of this.</p><p>We’re excited to see teams explore the non-Git use-cases around Artifacts just as much as the Git-focused ones.</p>
    <div>
      <h2>Under the hood</h2>
      <a href="#under-the-hood">
        
      </a>
    </div>
    <p>Artifacts are built on top of Durable Objects. The ability to create millions (or tens of millions+) of instances of stateful, isolated compute is inherent to how Durable Objects work today, and that’s exactly what we needed for supporting millions of Git repos per namespace.</p><p>Major League Baseball (for live game fan-out), Confluence Whiteboards, and our own <a href="https://developers.cloudflare.com/agents/"><u>Agents SDK</u></a> use Durable Objects under the hood at significant scale, and so we’re building this on a primitive that we’ve had in production for some time.</p><p>What we did need, however, was a Git server implementation that could run on Cloudflare Workers. It needed to be small, as complete as possible, extensible (<a href="https://git-scm.com/docs/git-notes"><u>notes</u></a>, <a href="https://git-lfs.com/"><u>LFS</u></a>), and efficient. So we built one in <a href="https://ziglang.org/"><u>Zig</u></a>, and compiled it to Wasm.</p><p>Why did we use Zig? Three reasons:</p><ol><li><p> The entire git protocol engine is written in pure Zig (no libc), compiled to a ~100KB WASM binary (with room for optimization!). It implements SHA-1, zlib inflate/deflate, delta encoding/decoding, pack parsing, and the full git smart HTTP protocol — all from scratch, with zero external dependencies other than the standard library.</p></li><li><p> Zig gives us  manual control over memory allocation which is important in constrained environments like Durable Objects. The Zig Build System lets us easily share  code between the WASM runtime (production) and native builds (testing against libgit2 for correctness verification).</p></li><li><p>The WASM module communicates with the JS host via a thin callback interface: 11 host-imported functions for storage operations (host_get_object, host_put_object, etc.) and one for streaming output (host_emit_bytes). The WASM side is fully testable in isolation.</p></li></ol><p>Under the hood, Artifacts also uses R2 (for snapshots) and KV (for tracking auth tokens):</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/35SxJbQfntIscpotc0GBt8/48ae11213d7483c9b488321baacf78e7/BLOG-3269_2.png" />
          </figure><p><sup><code><i>How Artifacts works (Workers, Durable Objects, and WebAssembly)</i></code></sup></p><p>A Worker acts as the front-end, handling authentication &amp; authorization, key metrics (errors, latency) and looking up each Artifacts repository (Durable Object) on the fly. </p><p>Specifically:</p><ul><li><p>Files are stored in the underlying Durable Object’s SQLite database.</p><ul><li><p>Durable Object storage has a 2MB max row size, so large Git objects are chunked and stored across multiple rows.</p></li><li><p>We make use of the sync KV API (state.storage.kv)  which is backed by SQLite under the hood.</p></li></ul></li><li><p> DOs have ~128MB memory limits: this means we can spawn tens of millions of them (they’re fast and light) but have to work within those limits.</p><ul><li><p>We make heavy use of streaming in both the fetch and push paths, directly returning a `ReadableStream&lt;Uint8Array&gt;` built from the raw WASM output chunks.</p></li><li><p>We avoid calculating our own git deltas, instead,  the raw deltas and base hashes are persisted alongside the resolved object. On fetch, if the requesting client already has the base object, Zig emits the delta instead of the full object, which saves bandwidth <i>and</i> memory.</p></li></ul></li><li><p>Support for both v1 and v2 of the git protocol.</p><ul><li><p>We support capabilities including ls-refs, shallow clones (deepen, deepen-since, deepen-relative), and incremental fetch with have/want negotiation.</p></li><li><p>We have an extensive test suite with conformance tests against git clients and verification tests against a libgit2 server designed to validate protocol support.</p></li></ul></li></ul><p>On top of this, we have native support for <a href="https://git-scm.com/docs/git-notes"><u>git-notes</u></a>. Artifacts is designed to be agent-first, and notes enable agents to add notes (metadata) to Git objects. This includes prompts, agent attribution and other metadata that can be read/written from the repo without mutating the objects themselves.</p>
    <div>
      <h2>Big repos, big problems? Meet ArtifactFS.</h2>
      <a href="#big-repos-big-problems-meet-artifactfs">
        
      </a>
    </div>
    <p>Most repos aren’t that big, and Git is <a href="https://github.blog/open-source/git/gits-database-internals-i-packed-object-store/"><u>designed to be extremely efficient</u></a> in terms of storage: most repositories take only a few seconds to clone at most, and that’s dominated by network setup time, auth, and <a href="https://git-scm.com/book/ms/v2/Git-Internals-Git-Objects"><u>checksumming</u></a>. In most agent or sandbox scenarios, that’s workable: just clone the repo as the sandbox starts and get to work.</p><p>But what about a multi-GB repository and/or repos with millions of objects? How can we clone that repo quickly, without blocking the agent’s ability to get to work for minutes and consuming compute?</p><p>A popular web framework (at 2.4GB and with a long history!) takes close to 2 minutes to clone. A shallow clone is faster, but not enough to get down to single digit seconds, and we don’t always want to omit history (agents find it useful).</p><p>Can we get large repos down to ~10-15 seconds so that our agent can get to work? Well, yes: with a few tricks.</p><p>As part of our launch of Artifacts, <a href="https://github.com/cloudflare/artifact-fs"><u>we’re open-sourcing ArtifactFS</u></a>, a filesystem driver designed to mount large Git repos as quickly as possible, hydrating file contents on the fly instead of blocking on the initial clone. It's ideal for agents, sandboxes, containers and other use cases where startup time is critical. If you can shave ~90-100 seconds off your sandbox startup time for every large repo, and you’re running 10,000 of those sandbox jobs per month: that’s 2,778 sandbox hours saved.</p><p>You can think of ArtifactFS as “Git clone but async”:</p><ul><li><p>ArtifactFS runs a blobless clone of a git repository: it fetches the file tree and refs, but not the file contents. It can do that during sandbox startup, which then allows your agent harness to get to work.</p></li><li><p>In the background, it starts to hydrate (download) file contents concurrently via a lightweight daemon.</p></li><li><p>It prioritizes files that agents typically want to operate on first: package manifests (<code>package.json, go.mod</code>), configuration files, and code, deprioritizing binary blobs (images, executables and other non-text-files) where possible so that agents can scan the file tree as the files themselves are hydrated.</p></li><li><p>If a file isn’t fully hydrated when the agent tries to read it, the read will block until it has.</p></li></ul><p>The filesystem does not attempt to “sync” files back to the remote repository: with thousands or millions of objects, that’s typically very slow, and since we’re speaking git, we don’t need to. Your agent just needs to commit and push, as it would with any repository. No new APIs to learn.</p><p>Importantly, ArtifactFS works with any Git remote, not just our own Artifacts. If you’re cloning large repos from GitHub, GitLab, or self-hosted Git infrastructure: you can still use ArtifactFS.</p>
    <div>
      <h2>What’s coming?</h2>
      <a href="#whats-coming">
        
      </a>
    </div>
    <p>Our release today is just the beta, and we’re already working on a number of features that you’ll see land over the next few weeks:</p><ul><li><p>Expanding the <a href="https://developers.cloudflare.com/artifacts/observability/metrics/"><u>available metrics</u></a> we expose. Today we’re shipping metrics for key operations counts per namespace, repo and stored bytes per repo, so that managing millions of Artifacts isn’t toilsome.</p></li><li><p>Support for <a href="https://developers.cloudflare.com/queues/event-subscriptions/"><u>Event Subscriptions</u></a> for repo-level events so that we can emit events on pushes, pulls, clones, and forks to any repository within a namespace. This will also allow you to consume events, write webhooks, and use those events to notify end-users, drive lifecycle events within your products, and/or run post-push jobs (like CI/CD).</p></li><li><p>Native TypeScript, Go and Python client SDKs for interacting with the Artifacts API</p></li><li><p>Repo-level search APIs and namespace-wide search APIs, e.g. “find all the repos with a <code>package.json</code> file”. </p></li></ul><p>We’re also planning an API for <a href="https://developers.cloudflare.com/workers/ci-cd/builds/"><u>Workers Builds</u></a>, allowing you to run CI/CD jobs on any agent-driven workflow.</p>
    <div>
      <h2>What will it cost me?</h2>
      <a href="#what-will-it-cost-me">
        
      </a>
    </div>
    <p>We’re still early with Artifacts, but want our pricing to work at agent-scale: it needs to be cost effective to have millions of repos, unused (or rarely used) repos shouldn’t be a drag, and our pricing should match the massively-single-tenant nature of agents.</p><p>You also shouldn’t have to think about whether a repo is going to be used or not, whether it’s hot or cold, and/or whether an agent is going to wake it up. We’ll charge you for the storage you consume and the operations (e.g. clones, forks, pushes &amp; pulls) against each repo.</p><table><tr><th><p></p></th><th><p><b>$/unit</b></p></th><th><p><b>Included</b></p></th></tr><tr><td><p><b>Operations</b></p></td><td><p>$0.15 per 1,000 operations</p></td><td><p>First 10k included (per month)</p></td></tr><tr><td><p><b>Storage</b></p></td><td><p>$0.50/GB-mo</p></td><td><p>First 1GB included.</p></td></tr></table><p>Big, busy repos will cost more than smaller, less-often-used repos, whether you have 1,000, 100,000, or 10 million of them.</p><p>We’ll also be bringing Artifacts to the Workers Free plan (with some fair limits) as the beta progresses, and we’ll provide updates throughout the beta should this pricing change and ahead of billing any usage.</p>
    <div>
      <h2>Where do I start? </h2>
      <a href="#where-do-i-start">
        
      </a>
    </div>
    
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3xLMMKCN1HNGWbkSyq0tDZ/2a8d49383957804f3ce204783e11ae80/BLOG-3269_3.png" />
          </figure><p>Artifacts is launching in private beta, and we expect public beta to be ready in early May (2026, to be clear!). We’ll be allowing customers in progressively over the next few weeks, and <a href="https://forms.gle/DwBoPRa3CWQ8ajFp7"><u>you can register interest for the private beta</u></a> directly.</p><p>In the meantime, you can learn more about Artifacts by:</p><ul><li><p>Reading the <a href="http://developers.cloudflare.com/artifacts/get-started/workers/"><u>getting started guide</u></a> in the docs.</p></li><li><p>Visiting the Cloudflare dashboard (Build &gt; Storage &amp; Databases &gt; Artifacts)</p></li><li><p>Reading through the <a href="http://developers.cloudflare.com/artifacts/api/rest-api/"><u>REST API examples</u></a></p></li><li><p>Learning more about <a href="http://developers.cloudflare.com/artifacts/concepts/how-artifacts-works/"><u>how Artifacts works</u></a> under the hood</p></li></ul><p>Follow <a href="http://developers.cloudflare.com/changelog/product/artifacts/"><u>the changelog</u></a> to track the beta as it progresses.</p>
    <div>
      <h2>Watch on Cloudflare TV</h2>
      <a href="#watch-on-cloudflare-tv">
        
      </a>
    </div>
    <div>
  
</div>
<p></p> ]]></content:encoded>
            <category><![CDATA[Agents Week]]></category>
            <category><![CDATA[Agents]]></category>
            <category><![CDATA[GitHub]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Storage]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Developers]]></category>
            <guid isPermaLink="false">2sshzOlmGVsrtBz2mgeceE</guid>
            <dc:creator>Dillon Mulroy</dc:creator>
            <dc:creator>Matt Carey</dc:creator>
            <dc:creator>Matt Silverlock</dc:creator>
        </item>
        <item>
            <title><![CDATA[Deploy Postgres and MySQL databases with PlanetScale + Workers]]></title>
            <link>https://blog.cloudflare.com/deploy-planetscale-postgres-with-workers/</link>
            <pubDate>Thu, 16 Apr 2026 13:00:22 GMT</pubDate>
            <description><![CDATA[ Learn how to deploy PlanetScale Postgres and MySQL databases via Cloudflare and connect Cloudflare Workers. ]]></description>
            <content:encoded><![CDATA[ <p>Cloudflare announced our PlanetScale partnership last September to give <a href="https://workers.cloudflare.com/"><u>Cloudflare Workers</u></a> direct access to Postgres and MySQL databases for fast, full-stack applications.</p><p>Soon, we’re bringing our technologies even closer: you’ll be able to create PlanetScale Postgres and MySQL databases directly from the Cloudflare dashboard and API, and have them billed to your Cloudflare account. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5Tj4gJrV5hxlIWxlmoXVZe/7661c1e47c0c868b88301b5f4aca4441/BLOG-3213_2.png" />
          </figure><p>You choose the data storage that fits your Worker application needs and keep a single system for billing as a Cloudflare self-serve or enterprise customer. Cloudflare credits like those given in our <a href="https://www.cloudflare.com/forstartups/"><u>startup program</u></a> or Cloudflare committed spend can be used towards PlanetScale databases.</p>
    <div>
      <h2>Postgres &amp; MySQL for Workers</h2>
      <a href="#postgres-mysql-for-workers">
        
      </a>
    </div>
    <p>SQL relational databases like Postgres and MySQL are a foundation of modern applications. In particular, Postgres has risen in developer popularity with its rich tooling ecosystem (ORMs, GUIs, etc) and extensions like <a href="https://github.com/pgvector/pgvector/"><u>pgvector</u></a> for building vector search in AI-driven applications. Postgres is the default choice for most developers who need a powerful, flexible, and scalable database to power their applications.</p><p>You can already connect your PlanetScale account and create Postgres databases directly from the <a href="https://dash.cloudflare.com/?to=/:account/workers/hyperdrive?modal=1">Cloudflare dashboard</a> for your Workers. Starting next month, a new Cloudflare subscription will bill for new PlanetScale databases direct to your Cloudflare account as a self-serve or enterprise user.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1CHTq1qoaNGSNO5atsiS8J/a8eba618b77362aa467d94c4f625c600/BLOG-3213_3.png" />
          </figure><p><sup><i>How to create PlanetScale databases via </i></sup><a href="https://dash.cloudflare.com/?to=/:account/workers/hyperdrive?modal=1"><sup><i><u>Cloudflare dashboard</u></i></sup></a><sup><i> after your PlanetScale account is connected. Cloudflare billing is coming next month.</i></sup></p><p>With our built-in integration, PlanetScale databases automatically work with Workers using Hyperdrive, our database connectivity service. <a href="https://blog.cloudflare.com/how-hyperdrive-speeds-up-database-access/"><u>Hyperdrive</u></a> service manages database connection pools and query caching to make database queries fast and reliable. You just add a <a href="https://developers.cloudflare.com/workers/runtime-apis/bindings/"><u>binding</u></a> to your Worker’s <a href="https://developers.cloudflare.com/workers/wrangler/configuration/#hyperdrive"><u>config file</u></a>: </p>
            <pre><code>// wrangler.jsonc file
{
  "hyperdrive": [
    {
      "binding": "DATABASE",
      "id": &lt;AUTO_CREATED_ID&gt;
    }
  ]
}
</code></pre>
            <p>And start running SQL queries via your Worker with your Postgres client of choice:</p>
            <pre><code>import { Client } from "pg";

export default {
  async fetch(request, env, ctx) {
   
    const client = new Client({ connectionString: env.DATABASE.connectionString });
    await client.connect();

    const result = await client.query("SELECT * FROM pg_tables");
    ...
}
</code></pre>
            
    <div>
      <h2>PlanetScale developer experience</h2>
      <a href="#planetscale-developer-experience">
        
      </a>
    </div>
    <p>PlanetScale was the obvious choice to provide to the Workers community due to it’s unrivaled performance and reliability. Developers can choose from two of the most popular relational databases <a href="https://planetscale.com/docs/postgres/postgres-compatibility"><u>with Postgres</u></a> or Vitess MySQL. PlanetScale matches how Cloudflare treats performance and reliability as key features of a developer platform. And with features like <a href="https://planetscale.com/docs/postgres/monitoring/query-insights"><u>query insights</u></a> and <a href="https://planetscale.com/docs/connect/ai-tooling"><u>agent driven</u></a> workflows for improving SQL query performance and <a href="https://planetscale.com/docs/postgres/branching"><u>branching</u></a> for deploying code safely, including database changes, the PlanetScale database developer experience is first-class.</p><p>Cloudflare users get the exact same PlanetScale database developer experience. Your PlanetScale databases can be deployed directly from Cloudflare with connections managed via Hyperdrive, which already makes your existing regional databases fast with global Workers. This means access to the same PlanetScale <a href="https://planetscale.com/docs/plans/planetscale-skus"><u>database clusters</u></a> at standard PlanetScale <a href="https://planetscale.com/pricing"><u>pricing</u></a> with all features included like query insights and detailed breakdown of <a href="https://planetscale.com/docs/billing#organization-usage-and-billing-page"><u>usage and costs</u></a>.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2Pfh4oM8zQSUGJKGEsxF3W/700f627c38279d9d90337b38de72b44e/BLOG-3213_4.png" />
          </figure><p><sup><i>A single node on PlanetScale Postgres starts at </i></sup><a href="https://planetscale.com/blog/5-dollar-planetscale"><sup><i><u>$5/month</u></i></sup></a><sup><i>.</i></sup></p>
    <div>
      <h2>Workers placement</h2>
      <a href="#workers-placement">
        
      </a>
    </div>
    <p>With centralized databases, Workers can run right next to your primary database to reduce latency with an <a href="https://developers.cloudflare.com/workers/configuration/placement/#configure-explicit-placement-hints"><u>explicit placement hint</u></a>. By default, Workers execute closest to a user request, which adds network latency when querying a central database especially for multiple queries. Instead, you can configure your Worker to execute in the closest Cloudflare data center to your PlanetScale database. In the future, Cloudflare can automatically set a placement hint based on the location of your PlanetScale database and reduce network latency to single digit milliseconds.</p>
            <pre><code>{
  "placement": {
    "region": "aws:us-east-1"
  }
}
</code></pre>
            
    <div>
      <h2>Coming soon</h2>
      <a href="#coming-soon">
        
      </a>
    </div>
    <p>You can deploy a PlanetScale Postgres database or connect an existing PlanetScale database to Workers today via the <a href="https://dash.cloudflare.com/?to=/:account/workers/hyperdrive?modal=1"><u>Cloudflare dashboard</u></a>. Everything today is still billed via PlanetScale.</p><p>Launching next month, new PlanetScale databases can be billed to your Cloudflare account. </p><p>We are building more with our PlanetScale partners, such as Cloudflare API integration, so tell us what you’d like to see next.</p> ]]></content:encoded>
            <category><![CDATA[Agents Week]]></category>
            <category><![CDATA[SQL]]></category>
            <category><![CDATA[Database]]></category>
            <category><![CDATA[Storage]]></category>
            <category><![CDATA[Postgres]]></category>
            <category><![CDATA[MySQL]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[PlanetScale]]></category>
            <guid isPermaLink="false">1IGJnHwj5QVRJm9iCdEqYV</guid>
            <dc:creator>Vy Ton</dc:creator>
            <dc:creator>Matt Silverlock</dc:creator>
        </item>
        <item>
            <title><![CDATA[Project Think: building the next generation of AI agents on Cloudflare]]></title>
            <link>https://blog.cloudflare.com/project-think/</link>
            <pubDate>Wed, 15 Apr 2026 13:01:00 GMT</pubDate>
            <description><![CDATA[ Announcing a preview of the next edition of the Agents SDK — from lightweight primitives to a batteries-included platform for AI agents that think, act, and persist.
 ]]></description>
            <content:encoded><![CDATA[ <p>Today, we're introducing Project Think: the next generation of the <a href="https://developers.cloudflare.com/agents/"><u>Agents SDK</u></a>. Project Think is a set of new primitives for building long-running agents (durable execution, sub-agents, sandboxed code execution, persistent sessions) and an opinionated base class that wires them all together. Use the primitives to build exactly what you need, or use the base class to get started fast.</p><p>Something happened earlier this year that changed how we think about AI. Tools like <a href="https://github.com/badlogic/pi-mono"><u>Pi</u></a>, <a href="https://github.com/openclaw"><u>OpenClaw</u></a>, <a href="https://docs.anthropic.com/en/docs/agents"><u>Claude Code</u></a>, and <a href="https://openai.com/codex"><u>Codex</u></a> proved a simple but powerful idea: give an LLM the ability to read files, write code, execute it, and remember what it learned, and you get something that looks less like a developer tool and more like a general-purpose assistant.</p><p>These coding agents aren't just writing code anymore. People are using them to manage calendars, analyze datasets, negotiate purchases, file taxes, and automate entire business workflows. The pattern is always the same: the agent reads context, reasons about it, writes code to take action, observes the result, and iterates. Code is the universal medium of action.</p><p>Our team has been using these coding agents every day. And we kept running into the same walls:</p><ul><li><p><b>They only run on your laptop or an expensive VPS:</b> there's no sharing, no collaboration, no handoff between devices.</p></li><li><p><b>They're expensive when idle</b>: a fixed monthly cost whether the agent is working or not. Scale that to a team, or a company, and it adds up fast.</p></li><li><p><b>They require management and manual setup</b>: installing dependencies, managing updates, configuring identity and secrets.</p></li></ul><p>And there's a deeper structural issue. Traditional applications serve many users from one instance. As mentioned in our Welcome to Agents Week post, <a href="https://blog.cloudflare.com/welcome-to-agents-week/"><u>agents are one-to-one</u></a>. Each agent is a unique instance, serving one user, running one task. A restaurant has a menu and a kitchen optimized to churn out dishes at volume. An agent is more like a personal chef: different ingredients, different techniques, different tools every time.</p><p>That fundamentally changes the scaling math. If a hundred million knowledge workers each use an agentic assistant at even modest concurrency, you need capacity for tens of millions of simultaneous sessions. At current per-container costs, that's unsustainable. We need a different foundation.</p><p>That's what we've been building.</p>
    <div>
      <h2>Introducing Project Think</h2>
      <a href="#introducing-project-think">
        
      </a>
    </div>
    <p>Project Think ships a set of new primitives for the Agents SDK:</p><ul><li><p><b>Durable execution</b> with fibers: crash recovery, checkpointing, automatic keepalive</p></li><li><p><b>Sub-agents</b>: isolated child agents with their own SQLite and typed RPC</p></li><li><p><b>Persistent sessions</b>: tree-structured messages, forking, compaction, full-text search</p></li><li><p><b>Sandboxed code execution</b>: Dynamic Workers, codemode, runtime npm resolution</p></li><li><p><b>The execution ladder</b>: workspace, isolate, npm, browser, sandbox</p></li><li><p><b>Self-authored extensions</b>: agents that write their own tools at runtime</p></li></ul><p>Each of these is usable directly with the Agent base class. Build exactly what you need with the primitives, or use the Think base class to get started fast. Let's look at what each one does.</p>
    <div>
      <h2>Long-running agents</h2>
      <a href="#long-running-agents">
        
      </a>
    </div>
    <p>Agents, as they exist today, are ephemeral. They run for a session, tied to a single process or device, and then they are gone. A coding agent that dies when your laptop sleeps, that’s a tool. An agent that persists — that can wake up on demand, continue work after interruptions, and carry forward the state without depending on your local runtime — that starts to look like infrastructure. And it changes the scaling model for agents completely.</p><p>The Agents SDK builds on <a href="https://developers.cloudflare.com/durable-objects/"><u>Durable Objects</u></a> to give every agent an identity, persistent state, and the ability to wake on message. This is the <a href="https://en.wikipedia.org/wiki/Actor_model"><u>actor model</u></a>: each agent is an addressable entity with its own SQLite database. It consumes zero compute when hibernated. When something happens (an HTTP request, a WebSocket message, a scheduled alarm, an inbound email) the platform wakes the agent, loads its state, and hands it the event. The agent does its work, then goes back to sleep.</p><table><tr><th><p>
</p></th><th><p><b>VMs / Containers</b></p></th><th><p><b>Durable Objects</b></p></th></tr><tr><td><p><b>Idle cost</b></p></td><td><p>Full compute cost, always</p></td><td><p>Zero (hibernated)</p></td></tr><tr><td><p><b>Scaling</b></p></td><td><p>Provision and manage capacity</p></td><td><p>Automatic, per-agent</p></td></tr><tr><td><p><b>State</b></p></td><td><p>External database required</p></td><td><p>Built-in SQLite</p></td></tr><tr><td><p><b>Recovery</b></p></td><td><p>You build it (process managers, health checks)</p></td><td><p>Platform restarts, state survives</p></td></tr><tr><td><p><b>Identity / routing</b></p></td><td><p>You build it (load balancers, sticky sessions)</p></td><td><p>Built-in (name → agent)</p></td></tr><tr><td><p><b>10,000 agents, each active 1% of the time</b></p></td><td><p>10,000 always-on instances</p></td><td><p>~100 active at any moment</p></td></tr></table><p>This changes the economics of running agents at scale. Instead of "one expensive agent per power user," you can build "one agent per customer" or "one agent per task" or "one agent per email thread." The marginal cost of spawning a new agent is effectively zero.</p>
    <div>
      <h3>Surviving crashes: durable execution with fibers</h3>
      <a href="#surviving-crashes-durable-execution-with-fibers">
        
      </a>
    </div>
    <p>An LLM call takes 30 seconds. A multi-turn agent loop can run for much longer. At any point during that window, the execution environment can vanish: a deploy, a platform restart, hitting resource limits. The upstream connection to the model provider is severed permanently, in-memory state is lost, and connected clients see the stream stop with no explanation.</p><p><code></code><a href="https://developers.cloudflare.com/agents/api-reference/durable-execution/"><code><u>runFiber()</u></code></a> solves this. A fiber is a durable function invocation: registered in SQLite before execution begins, checkpointable at any point via <code>stash()</code>, and recoverable on restart via <code>onFiberRecovered</code>.</p>
            <pre><code>import { Agent } from "agents";

export class ResearchAgent extends Agent {
  async startResearch(topic: string) {
    void this.runFiber("research", async (ctx) =&gt; {
      const findings = [];

      for (let i = 0; i &lt; 10; i++) {
        const result = await this.callLLM(`Research step ${i}: ${topic}`);
        findings.push(result);

        // Checkpoint: if evicted, we resume from here
        ctx.stash({ findings, step: i, topic });

        this.broadcast({ type: "progress", step: i });
      }

      return { findings };
    });
  }

  async onFiberRecovered(ctx) {
    if (ctx.name === "research" &amp;&amp; ctx.snapshot) {
      const { topic } = ctx.snapshot;
      await this.startResearch(topic);
    }
  }
}
</code></pre>
            <p>The SDK keeps the agent alive automatically during fiber execution, no special configuration needed. For work measured in minutes, keepAlive() / keepAliveWhile() prevents eviction during active work. For longer operations (CI pipelines, design reviews, video generation) the agent starts the work, persists the job ID, hibernates, and wakes on callback.</p>
    <div>
      <h3>Delegating work: sub-agents via Facets</h3>
      <a href="#delegating-work-sub-agents-via-facets">
        
      </a>
    </div>
    <p>A single agent shouldn't do everything itself. <a href="https://developers.cloudflare.com/agents/api-reference/sub-agents/"><u>Sub-agents</u></a> are child Durable Objects colocated with the parent via <a href="https://blog.cloudflare.com/durable-object-facets-dynamic-workers/"><u>Facets</u></a>, each with their own isolated SQLite and execution context:</p>
            <pre><code>import { Agent } from "agents";

export class ResearchAgent extends Agent {
  async search(query: string) { /* ... */ }
}

export class ReviewAgent extends Agent {
  async analyze(query: string) { /* ... */ }
}

export class Orchestrator extends Agent {
  async handleTask(task: string) {
    const researcher = await this.subAgent(ResearchAgent, "research");
    const reviewer = await this.subAgent(ReviewAgent, "review");

    const [research, review] = await Promise.all([
      researcher.search(task),
      reviewer.analyze(task)
    ]);

    return this.synthesize(research, review);
  }
}
</code></pre>
            <p>Sub-agents are isolated at the storage level. Each one gets its own SQLite database, and there’s no implicit sharing of data between them. This is enforced by the runtime where sub-agent RPC latency is a function call. TypeScript catches misuse at compile time.</p>
    <div>
      <h3>Conversations that persist: the Session API</h3>
      <a href="#conversations-that-persist-the-session-api">
        
      </a>
    </div>
    <p>Agents that run for days or weeks need more than the typical flat list of messages. The experimental <a href="https://developers.cloudflare.com/agents/api-reference/sessions/"><u>Session API</u></a> models this explicitly. Available on the Agent base class, conversations are stored as trees, where each message has a parent_id. This enables forking (explore an alternative without losing the original path), non-destructive compaction (summarize older messages rather than deleting them), and full-text search across conversation history via <a href="https://www.sqlite.org/fts5.html"><u>FTS5</u></a>.</p>
            <pre><code>import { Agent } from "agents";
import { Session, SessionManager } from "agents/experimental/memory/session";

export class MyAgent extends Agent {
  sessions = SessionManager.create(this);

  async onStart() {
    const session = this.sessions.create("main");
    const history = session.getHistory();
    const forked = this.sessions.fork(session.id, messageId, "alternative-approach");
  }
}
</code></pre>
            <p>Session is usable directly with <code>Agent</code>, and it's the storage layer that the <code>Think</code> base class builds on.</p>
    <div>
      <h2>From tool calls to code execution</h2>
      <a href="#from-tool-calls-to-code-execution">
        
      </a>
    </div>
    <p>Conventional tool-calling has an awkward shape. The model calls a tool, pulls the result back through the context window, calls another tool, pulls that back, and so on. As the tool surface grows, this gets both expensive and clumsy. A hundred files means a hundred round-trips through the model.</p><p>But <a href="https://blog.cloudflare.com/code-mode/"><u>models are better at writing code to use a system than they are at playing the tool-calling game</u></a>. This is the insight behind <a href="https://github.com/cloudflare/agents/tree/main/packages/codemode"><u>@cloudflare/codemode</u></a>: instead of sequential tool calls, the LLM writes a single program that handles the entire task.</p>
            <pre><code>// The LLM writes this. It runs in a sandboxed Dynamic Worker.
const files = await tools.find({ pattern: "**/*.ts" });
const results = [];
for (const file of files) {
  const content = await tools.read({ path: file });
  if (content.includes("TODO")) {
    results.push({ file, todos: content.match(/\/\/ TODO:.*/g) });
  }
}
return results;
</code></pre>
            <p>Instead of 100 round-trips to the model, you just run a single program. This leads to fewer tokens used, faster execution, and better results. The <a href="https://github.com/cloudflare/mcp"><u>Cloudflare API MCP server</u></a> demonstrates this at scale. We expose only two tools <code>(search()</code> and <code>execute())</code>, which consume ~1,000 tokens, vs. ~1.17 million tokens for the naive tool-per-endpoint equivalent. This is a 99.9% reduction.</p>
    <div>
      <h3>The missing primitive: safe sandboxes</h3>
      <a href="#the-missing-primitive-safe-sandboxes">
        
      </a>
    </div>
    <p>Once you accept that models should write code on behalf of users, the question becomes: where does that code run? Not eventually, not after a product team turns it into a roadmap item. Right now, for this user, against this system, with tightly defined permissions.</p><p><a href="https://blog.cloudflare.com/dynamic-workers/"><u>Dynamic Workers</u></a> are that sandbox. A fresh V8 isolate spun up at runtime, in milliseconds, with a few megabytes of memory. That's roughly 100x faster and up to 100x more memory-efficient than a container. You can start a new one for every single request, run a snippet of code, and throw it away.</p><p>The critical design choice is the capability model. Instead of starting with a general-purpose machine and trying to constrain it, Dynamic Workers begin with almost no ambient authority (<code>globalOutbound: null</code>, no network access) and the developer grants capabilities explicitly, resource by resource, through bindings. We go from asking "how do we stop this thing from doing too much?" to "what exactly do we want this thing to be able to do?"</p><p>This is the right question for agent infrastructure.</p>
    <div>
      <h3>The execution ladder</h3>
      <a href="#the-execution-ladder">
        
      </a>
    </div>
    <p>This capability model leads naturally to a spectrum of compute environments, an <b>execution ladder</b> that the agent escalates through as needed:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6yokfTVcg8frH4snf7c4sp/2306d721650b4956b28e2198f7cf915d/BLOG-3200_2.png" />
          </figure><p><b>Tier 0</b> is the Workspace, a durable virtual filesystem backed by SQLite and R2. Read, write, edit, search, grep, diff. Powered by <a href="https://www.npmjs.com/package/@cloudflare/shell"><code><u>@cloudflare/shell</u></code></a>.</p><p><b>Tier 1</b> is a Dynamic Worker: LLM-generated JavaScript running in a sandboxed isolate with no network access. Powered by <a href="https://www.npmjs.com/package/@cloudflare/codemode"><code><u>@cloudflare/codemode</u></code></a>.</p><p><b>Tier 2</b> adds npm. <a href="https://github.com/cloudflare/agents/tree/main/packages/worker-bundler"><code><u>@cloudflare/worker-bundler</u></code></a> fetches packages from the registry, bundles them with esbuild, and loads the result into the Dynamic Worker. The agent writes <code>import { z } from "zod"</code> and it just works.</p><p><b>Tier 3</b> is a headless browser via <a href="https://developers.cloudflare.com/browser-rendering/"><u>Cloudflare Browser Run</u></a>. Navigate, click, extract, screenshot. Useful when the service doesn't support agents yet via MCP or APIs.</p><p><b>Tier 4</b> is a <a href="https://developers.cloudflare.com/sandbox/"><u>Cloudflare Sandbox</u></a> configured with your toolchains, repos, and dependencies: <code>git clone, npm test, cargo build</code>, synced bidirectionally with the Workspace.</p><p>The key design principle: <b>the agent should be useful at Tier 0 alone, where each tier is additive.</b> The user can add capabilities as they go.</p>
    <div>
      <h3>Building blocks, not a framework</h3>
      <a href="#building-blocks-not-a-framework">
        
      </a>
    </div>
    <p>All of these primitives are available as standalone packages. <a href="https://blog.cloudflare.com/dynamic-workers/"><u>Dynamic Workers</u></a>, <a href="https://github.com/cloudflare/agents/tree/main/packages/codemode"><code><u>@cloudflare/codemode</u></code></a>, <a href="https://github.com/cloudflare/agents/tree/main/packages/worker-bundler"><code><u>@cloudflare/worker-bundler</u></code></a>, and <a href="https://www.npmjs.com/package/@cloudflare/shell"><code><u>@cloudflare/shell</u></code></a> (a durable filesystem with tools) are all usable directly with the Agent base class. You can combine them to give any agent a workspace, code execution, and runtime package resolution without adopting an opinionated framework.</p>
    <div>
      <h2>The platform</h2>
      <a href="#the-platform">
        
      </a>
    </div>
    <p>Here's the complete stack for building agents on Cloudflare:</p><table><tr><th><p><b>Capability</b></p></th><th><p><b>What it does</b></p></th><th><p><b>Powered by</b></p></th></tr><tr><td><p>Per-agent isolation</p></td><td><p>Every agent is its own world</p></td><td><p><a href="https://developers.cloudflare.com/durable-objects/"><u>Durable Objects</u></a> (DOs)</p></td></tr><tr><td><p>Zero cost when idle</p></td><td><p>$0 until the agent wakes up</p></td><td><p><a href="https://developers.cloudflare.com/durable-objects/best-practices/websockets/#websocket-hibernation-api"><u>DO Hibernation</u></a></p></td></tr><tr><td><p>Persistent state</p></td><td><p>Queryable, transactional storage</p></td><td><p><a href="https://developers.cloudflare.com/durable-objects/best-practices/access-durable-objects-storage/"><u>DO SQLite</u></a></p></td></tr><tr><td><p>Durable filesystem</p></td><td><p>Files that survive restarts</p></td><td><p>Workspace (SQLite + <a href="https://developers.cloudflare.com/r2/"><u>R2</u></a>)</p></td></tr><tr><td><p>Sandboxed code execution</p></td><td><p>Run LLM-generated code safely</p></td><td><p><a href="https://blog.cloudflare.com/dynamic-workers/"><u>Dynamic Workers</u></a> + <a href="https://github.com/cloudflare/agents/tree/main/packages/codemode"><code><u>@cloudflare/codemode</u></code></a></p></td></tr><tr><td><p>Runtime dependencies</p></td><td><p><code>import * from react</code> just works</p></td><td><p><a href="https://github.com/cloudflare/agents/tree/main/packages/worker-bundler"><code><u>@cloudflare/worker-bundler</u></code></a></p></td></tr><tr><td><p>Web automation</p></td><td><p>Browse, navigate, fill forms</p></td><td><p><a href="https://developers.cloudflare.com/browser-rendering/"><u>Browser Run</u></a></p></td></tr><tr><td><p>Full OS access</p></td><td><p>git, compilers, test runners</p></td><td><p><a href="https://developers.cloudflare.com/sandbox/"><u>Sandboxes</u></a></p></td></tr><tr><td><p>Scheduled execution</p></td><td><p>Proactive, not just reactive</p></td><td><p><a href="https://developers.cloudflare.com/durable-objects/api/alarms/"><u>DO Alarms + Fibers</u></a></p></td></tr><tr><td><p>Real-time streaming</p></td><td><p>Token-by-token to any client</p></td><td><p>WebSockets</p></td></tr><tr><td><p>External tools</p></td><td><p>Connect to any tool server</p></td><td><p>MCP</p></td></tr><tr><td><p>Agent coordination</p></td><td><p>Typed RPC between agents</p></td><td><p>Sub-agents (<a href="https://developers.cloudflare.com/dynamic-workers/usage/durable-object-facets/"><u>Facets</u></a>)</p></td></tr><tr><td><p>Model access</p></td><td><p>Connect to an LLM to power the agent</p></td><td><p><a href="https://developers.cloudflare.com/ai-gateway/"><u>AI Gateway</u></a> + <a href="https://developers.cloudflare.com/workers-ai/"><u>Workers AI</u></a> (or Bring Your Own Model)</p></td></tr></table><p>Each of these is a building block. Together, they form something new: a platform where anyone can build, deploy, and run AI agents as capable as the ones running on your local machine today, but <a href="https://www.cloudflare.com/learning/serverless/what-is-serverless/"><u>serverless</u></a>, durable, and safe by construction. </p>
    <div>
      <h2>The Think base class</h2>
      <a href="#the-think-base-class">
        
      </a>
    </div>
    <p>Now that you've seen the primitives, here's what happens when you wire them all together.</p><p><code>Think</code> is an opinionated harness that handles the full chat lifecycle: agentic loop, message persistence, streaming, tool execution, stream resumption, and extensions. You focus on what makes your agent unique.</p><p>The minimal subclass looks like this:</p>
            <pre><code>import { Think } from "@cloudflare/think";
import { createWorkersAI } from "workers-ai-provider";

export class MyAgent extends Think&lt;Env&gt; {
  getModel() {
    return createWorkersAI({ binding: this.env.AI })(
      "@cf/moonshotai/kimi-k2.5"
    );
  }
}
</code></pre>
            <p>That’s effectively all you need to have a working chat agent with streaming, persistence, abort/cancel, error handling, resumable streams, and a built-in workspace filesystem. Deploy with <code>npx wrangler deploy</code>.</p><p>Think makes decisions for you. When you need more control, you can override the ones you care about:</p><table><tr><td><p><b>Override</b></p></td><td><p><b>Purpose</b></p></td></tr><tr><td><p><code>getModel()</code></p></td><td><p>Return the <code>LanguageModel</code> to use</p></td></tr><tr><td><p><code>getSystemPrompt()</code></p></td><td><p>System prompt</p></td></tr><tr><td><p><code>getTools()</code></p></td><td><p>AI SDK compatible <code>ToolSet</code> for the agentic loop</p></td></tr><tr><td><p><code>maxSteps</code></p></td><td><p>Max tool-call rounds per turn</p></td></tr><tr><td><p><code>configureSession()</code></p></td><td><p>Context blocks, compaction, search, skills</p></td></tr></table><p>Under the hood, Think runs the complete agentic loop on every turn: it assembles the context (base instructions + tool descriptions + skills + memory + conversation history), calls <code>streamText</code>, executes tool calls (with output truncation to prevent context blowup), appends results, loops until the model is done or the step limit is reached. All messages are persisted after each turn.</p>
    <div>
      <h3>Lifecycle hooks</h3>
      <a href="#lifecycle-hooks">
        
      </a>
    </div>
    <p>Think gives you hooks at every stage of the chat turn, without requiring you to own the whole pipeline:</p>
            <pre><code>beforeTurn()
  → streamText()
    → beforeToolCall()
    → afterToolCall()
  → onStepFinish()
→ onChatResponse()
</code></pre>
            <p>Switch to a lower cost model for follow-up turns, limit the tools it can use, and pass in client-side context on each turn. Also log every tool call to analytics and automatically trigger one more follow-up turn after the model completes, all without replacing <code>onChatMessage</code>.</p>
    <div>
      <h3>Persistent memory and long conversations</h3>
      <a href="#persistent-memory-and-long-conversations">
        
      </a>
    </div>
    <p>Think builds on <a href="https://developers.cloudflare.com/agents/api-reference/sessions/?cf_target_id=E7A3D837FA7DC4C7DDA822B3DE0F831B"><u>Session API</u></a> as its storage layer, giving you tree-structured messages with branching built in.</p><p>On top of that, it adds persistent memory through <b>context blocks</b>. These are structured sections of the system prompt that the model can read and update over time, and they persist across hibernation<b>.</b>The model sees "MEMORY (Important facts, use set_context to update) [42%, 462/1100 tokens]" and can proactively remember things.</p>
            <pre><code>configureSession(session: Session) {
  return session
    .withContext("soul", {
      provider: { get: async () =&gt; "You are a helpful coding assistant." }
    })
    .withContext("memory", {
      description: "Important facts learned during conversation.",
      maxTokens: 2000
    })
    .withCachedPrompt();
}
</code></pre>
            <p>Sessions are flexible. You can run multiple conversations per agent and fork them to try a different direction without losing the original.<b> </b></p><p>As context grows, Think handles limits with non-destructive compaction. Older messages are summarized instead of removed, while the full history remains stored in SQLite.<b> </b></p><p>Search is built in as well. Using FTS5, you can query conversation history within a session or across all the sessions. The agent is also able to search its own past using<b> </b><code>search_context</code> tool.</p>
    <div>
      <h3>The full execution ladder, wired in</h3>
      <a href="#the-full-execution-ladder-wired-in">
        
      </a>
    </div>
    <p>Think integrates the entire execution ladder into a single <code>getTools()</code> return:</p>
            <pre><code>import { Think } from "@cloudflare/think";
import { createWorkspaceTools } from "@cloudflare/think/tools/workspace";
import { createExecuteTool } from "@cloudflare/think/tools/execute";
import { createBrowserTools } from "@cloudflare/think/tools/browser";
import { createSandboxTools } from "@cloudflare/think/tools/sandbox";
import { createExtensionTools } from "@cloudflare/think/tools/extensions";

export class MyAgent extends Think&lt;Env&gt; {
  extensionLoader = this.env.LOADER;

  getModel() {
    /* ... */
  }

  getTools() {
    return {
      execute: createExecuteTool({
        tools: createWorkspaceTools(this.workspace),
        loader: this.env.LOADER
      }),
      ...createBrowserTools(this.env.BROWSER),
      ...createSandboxTools(this.env.SANDBOX), // configured per-agent: toolchains, repos, snapshots
      ...createExtensionTools({ manager: this.extensionManager! }),
      ...this.extensionManager!.getTools()
    };
  }
}
</code></pre>
            
    <div>
      <h3>Self-authored extensions</h3>
      <a href="#self-authored-extensions">
        
      </a>
    </div>
    <p>Think takes code execution one step further. An agent can write its own extensions: TypeScript programs that run in Dynamic Workers, declaring permissions for network access and workspace operations.</p>
            <pre><code>{
  "name": "github",
  "description": "GitHub integration: PRs, issues, repos",
  "tools": ["create_pr", "list_issues", "review_pr"],
  "permissions": {
    "network": ["api.github.com"],
    "workspace": "read-write"
  }
}
</code></pre>
            <p>Think's <code>ExtensionManager</code> bundles the extension (optionally with npm deps via <code>@cloudflare/worker-bundler</code>), loads it into a Dynamic Worker, and registers the new tools. The extension persists in DO storage and survives hibernation. The next time the user asks about pull requests, the agent has a <code>github_create_pr </code>tool that didn't exist 30 seconds ago.</p><p>This is the kind of self-improvement loop that makes agents genuinely more useful over time. Not through fine-tuning or RLHF, but through code. The agent is able to write new capabilities for itself, all in sandboxed, auditable, and revocable TypeScript.</p>
    <div>
      <h3>Sub-agent RPC</h3>
      <a href="#sub-agent-rpc">
        
      </a>
    </div>
    <p>Think also works as a sub-agent, called via <code>chat()</code> over RPC from a parent, with streaming events via callback:</p>
            <pre><code>const researcher = await this.subAgent(ResearchSession, "research");
const result = await researcher.chat(`Research this: ${task}`, streamRelay);
</code></pre>
            <p>Each child gets its own conversation tree, memory, tools, and model. The parent doesn't need to know the details.</p>
    <div>
      <h3>Getting started</h3>
      <a href="#getting-started">
        
      </a>
    </div>
    <p>Project Think is experimental. The API surface is stable but will continue to evolve in the coming days and weeks. We're already using it internally to build our own background agent infrastructure, and we're sharing it early so you can build alongside us.</p>
            <pre><code>npm install @cloudflare/think agents ai @cloudflare/shell zod workers-ai-provider</code></pre>
            
            <pre><code>// src/server.ts
import { Think } from "@cloudflare/think";
import { createWorkersAI } from "workers-ai-provider";
import { routeAgentRequest } from "agents";

export class MyAgent extends Think&lt;Env&gt; {
  getModel() {
    return createWorkersAI({ binding: this.env.AI })(
      "@cf/moonshotai/kimi-k2.5"
    );
  }
}

export default {
  async fetch(request: Request, env: Env) {
    return (
      (await routeAgentRequest(request, env)) ||
      new Response("Not found", { status: 404 })
    );
  }
} satisfies ExportedHandler&lt;Env&gt;;
</code></pre>
            
            <pre><code>// src/client.tsx
import { useAgent } from "agents/react";
import { useAgentChat } from "@cloudflare/ai-chat/react";

function Chat() {
  const agent = useAgent({ agent: "MyAgent" });
  const { messages, sendMessage, status } = useAgentChat({ agent });
  // Render your chat UI
}
</code></pre>
            <p>Think speaks the same WebSocket protocol as <code>@cloudflare/ai-chat</code>, so existing UI components work out of the box. If you've built on <a href="https://developers.cloudflare.com/agents/api-reference/chat-agents/"><code><u>AIChatAgent</u></code></a>, your client code doesn't change.</p>
    <div>
      <h2>The third wave</h2>
      <a href="#the-third-wave">
        
      </a>
    </div>
    <p>We see three waves of AI agents:</p><p><b>The first wave was chatbots.</b> They were stateless, reactive, and fragile. Every conversation started from scratch with no memory, no tools, and no ability to act. This made them useful for answering questions, but limited them to only answering questions.</p><p><b>The second wave was coding agents.</b> These are stateful, tool-using and far more capable tools like Pi, Claude Code, OpenClaw, and Codex. These agents can read codebases, write code, execute it, and iterate. These proved that an LLM with the right tools is a general-purpose machine, but they run on your laptop, for one user, with no durability guarantees.</p><p><b>Now we are entering the third wave: agents as infrastructure.</b> Durable, distributed, structurally safe, and serverless. These are agents that run on the Internet, survive failures, cost nothing when idle, and enforce security through architecture rather than behavior. Agents that any developer can build and deploy for any number of users.</p><p>This is the direction we’re betting on.</p><p>The Agents SDK is already powering thousands of production agents. With Project Think and the the primitives it introduces, we're adding the missing pieces to make those agents dramatically more capable: persistent workspaces, sandboxed code execution, durable long-running tasks, structural security, sub-agent coordination, and self-authored extensions.</p><p>It's available today in preview. We're building alongside you, and we'd genuinely love to see what you (and your coding agent) create with it.</p><hr /><p><sup><i>Think is part of the Cloudflare Agents SDK, available as @cloudflare/think. The features described in this post are in preview. APIs may change as we incorporate feedback. Check the </i></sup><a href="https://github.com/cloudflare/agents/blob/main/docs/think/index.md"><sup><i><u>documentation</u></i></sup></a><sup><i> and </i></sup><a href="https://github.com/cloudflare/agents/tree/main/examples/assistant"><sup><i><u>example</u></i></sup></a><sup><i> to get started.</i></sup></p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/161Wz7Tf8Cpzn2u2cBCH3V/37633c016734590005edd280732e89b9/BLOG-3200_3.png" />
          </figure><p></p> ]]></content:encoded>
            <category><![CDATA[Agents Week]]></category>
            <category><![CDATA[Agents]]></category>
            <category><![CDATA[Storage]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Durable Objects]]></category>
            <category><![CDATA[AI]]></category>
            <guid isPermaLink="false">3r2ykMs0LTSPwVHmVWldCy</guid>
            <dc:creator>Sunil Pai</dc:creator>
            <dc:creator>Kate Reznykova</dc:creator>
        </item>
        <item>
            <title><![CDATA[Browser Run: give your agents a browser]]></title>
            <link>https://blog.cloudflare.com/browser-run-for-ai-agents/</link>
            <pubDate>Wed, 15 Apr 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ Browser Rendering is now Browser Run, with Live View, Human in the Loop, CDP access, session recordings, and 4x higher concurrency limits for AI agents. ]]></description>
            <content:encoded><![CDATA[ <p>AI agents need to interact with the web. To do that, they need a browser. They need to navigate sites, read pages, fill forms, extract data, and take screenshots. They need to observe whether things are working as expected, with a way for their humans to step in if needed. And they need to do all of this at scale.</p><p>Today, we’re renaming Browser Rendering to <b>Browser Run</b>, and shipping key features that make it <i>the</i> browser for <a href="https://www.cloudflare.com/learning/ai/what-is-agentic-ai/"><u>AI agents</u></a>. The name Browser Rendering never fully captured what the product does. Browser Run lets you run full browser sessions on Cloudflare's global network, drive them with code or AI, record and replay sessions, crawl pages for content, debug in real time, and let humans intervene when your agent needs help. </p><p>Here’s what’s new:</p><ul><li><p><b>Live View</b>: see what your agent sees and is doing, in real time. Know instantly if things are working, and when they’re not, see exactly why.</p></li><li><p><b>Human in the Loop</b>: when your agent hits a snag like a login page or unexpected edge case, it can hand off to a human instead of failing. The human steps in, resolves, then hands back control.</p></li><li><p><b>Chrome DevTools Protocol (CDP) Endpoint</b>: the Chrome DevTools Protocol is how agents control browsers. Browser Run now exposes it directly, so agents get maximum control over the browser and existing CDP scripts work on Cloudflare.</p></li><li><p><b>MCP Client Support:</b> AI coding agents like Claude Desktop, Cursor, and OpenCode can now use Browser Run as their remote browser.</p></li><li><p><b>WebMCP Support</b>: agents will outnumber humans using the web. WebMCP allows websites to declare what actions are available for agents to discover and call, making navigation more reliable.</p></li><li><p><b>Session Recordings</b>: capture every browser session for debugging purposes. When something goes wrong, you have the full recording with DOM changes, user interactions, and page navigation.</p></li><li><p><b>Higher limits</b>: run more tasks at once with 120 concurrent browsers, up from 30.  </p></li></ul><div>
  
</div><p><sup><i>An AI agent searching Google Hotels for up to date pricing for stays in Kyoto</i></sup></p>
    <div>
      <h2>Everything an agent needs</h2>
      <a href="#everything-an-agent-needs">
        
      </a>
    </div>
    <p>Let’s think about what agents need when browsing the web and how each feature fits in:</p>
<div><table><colgroup>
<col></col>
<col></col>
</colgroup>
<thead>
  <tr>
    <th><span>What an agent needs</span></th>
    <th><span>Browser Run (formerly Browser Rendering)</span></th>
  </tr></thead>
<tbody>
  <tr>
    <td><span>1) Browsers on-demand</span></td>
    <td><span>Chrome browser on Cloudflare’s global network</span></td>
  </tr>
  <tr>
    <td><span>2) A way to control the browser</span></td>
    <td><span>Take actions like navigate, click, fill forms, screenshot, and more with Puppeteer, Playwright, </span><span>CDP (new)</span><span>, </span><span>MCP Client Support (new)</span><span> and </span><span>WebMCP (new)</span></td>
  </tr>
  <tr>
    <td><span>3) Observability</span></td>
    <td><span>Live View (new)</span><span>, </span><span>Session Recordings (new)</span><span>, and </span><span>Dashboard redesign (new)</span></td>
  </tr>
  <tr>
    <td><span>4) Human intervention</span></td>
    <td><span>Human in the Loop (new)</span></td>
  </tr>
  <tr>
    <td><span>5) Scale</span></td>
    <td><span>10 requests/second for Quick Actions, </span><span>120 concurrent browsers (4x increase)</span></td>
  </tr>
</tbody></table></div>
    <div>
      <h2>1) Open a browser</h2>
      <a href="#1-open-a-browser">
        
      </a>
    </div>
    <p>First, an agent needs a browser. With Browser Run, agents can spin up a headless Chrome instance on Cloudflare’s global network, on demand. No infrastructure to manage, no Chrome versions to maintain. Browser sessions open near users for low latency, and scale up and down as needed. Pair Browser Run with the <a href="https://developers.cloudflare.com/agents/api-reference/browse-the-web/"><u>Agents SDK</u></a> to build long-running agents that browse the web, remember everything, and act on their own. </p>
    <div>
      <h2>2) Take actions</h2>
      <a href="#2-take-actions">
        
      </a>
    </div>
    <p>Once your agent has a browser, it needs ways to control it. Browser Run supports multiple approaches: new low-level protocol access with the Chrome DevTools Protocol (CDP) and WebMCP, in addition to existing higher-level automation using <a href="https://developers.cloudflare.com/browser-rendering/puppeteer/"><u>Puppeteer</u></a> and <a href="https://developers.cloudflare.com/browser-rendering/playwright/"><u>Playwright</u></a>, and <a href="https://developers.cloudflare.com/browser-rendering/rest-api/"><u>Quick Actions</u></a> for simple tasks. Let’s look at the details.</p>
    <div>
      <h3>Chrome DevTools Protocol (CDP) endpoint</h3>
      <a href="#chrome-devtools-protocol-cdp-endpoint">
        
      </a>
    </div>
    <p>The <a href="https://chromedevtools.github.io/devtools-protocol/"><u>Chrome DevTools Protocol (CDP)</u></a> is the low-level protocol that powers browser automation. Exposing CDP directly means the growing ecosystem of agent tools and existing CDP automation scripts can use Browser Run. When you open Chrome DevTools and inspect a page, CDP is what's running underneath. Puppeteer, Playwright, and most agent frameworks are built on top of it.</p><p>Every way that you have been using Browser Run has actually been through CDP already. What’s new is that we're now <a href="https://developers.cloudflare.com/browser-rendering/cdp/"><u>exposing CDP directly</u></a> as an endpoint. This matters for agents because CDP gives agents the most control possible over the browser. Agent frameworks already speak CDP natively, and can now connect to Browser Run directly. CDP also unlocks browser actions that aren't available through Puppeteer or Playwright, like JavaScript debugging. And because you're working with raw CDP messages instead of going through higher-level libraries, you can pass messages directly to models for more token-efficient browser control.</p><p>If you already have CDP automation scripts running against self-hosted Chrome, they work on Browser Run with a one-line config change. Point your WebSocket URL at Browser Run and stop managing your own browser infrastructure.</p>
            <pre><code>// Before: connecting to self-hosted Chrome
const browser = await puppeteer.connect({
  browserWSEndpoint: 'ws://localhost:9222/devtools/browser'
});

// After: connecting to Browser Run
const browser = await puppeteer.connect({
  browserWSEndpoint: 'wss://api.cloudflare.com/client/v4/accounts/&lt;ACCOUNT_ID&gt;/browser-rendering/devtools/browser',
  headers: { 'Authorization': 'Bearer &lt;API_TOKEN&gt;' }
});
</code></pre>
            <p>The CDP endpoint also makes Browser Run more accessible. You can now connect from any language, any environment, without needing to write a <a href="https://developers.cloudflare.com/workers/"><u>Cloudflare Worker</u></a>. (If you're already using Workers, nothing changes.)</p>
    <div>
      <h4>Using Browser Run with MCP Clients</h4>
      <a href="#using-browser-run-with-mcp-clients">
        
      </a>
    </div>
    <p>Now that Browser Run exposes the Chrome DevTools Protocol (CDP), MCP clients including Claude Desktop, Cursor, Codex, and OpenCode can use Browser Run as their remote browser. The <a href="https://github.com/ChromeDevTools/chrome-devtools-mcp"><u>chrome-devtools-mcp package</u></a> from the Chrome DevTools team is an MCP server that gives your AI coding assistant access to the full power of Chrome DevTools for reliable automation, in-depth debugging, and performance analysis.</p><p>Here’s an example of how to configure Browser Run for Claude Desktop:</p>
            <pre><code>{
  "mcpServers": {
    "browser-rendering": {
      "command": "npx",
      "args": [
        "-y",
        "chrome-devtools-mcp@latest",
        "--wsEndpoint=wss://api.cloudflare.com/client/v4/accounts/&lt;ACCOUNT_ID&gt;/browser-rendering/devtools/browser?keep_alive=600000",
        "--wsHeaders={\"Authorization\":\"Bearer &lt;API_TOKEN&gt;\"}"
      ]
    }
  }
}
</code></pre>
            <p>For other MCP clients, see <a href="https://developers.cloudflare.com/browser-rendering/cdp/mcp-clients/"><u>documentation for using Browser Run with MCP clients</u></a>.</p>
    <div>
      <h3>WebMCP support</h3>
      <a href="#webmcp-support">
        
      </a>
    </div>
    <p>The Internet was built for humans, so navigating as an AI agent today is unreliable. We’re betting on a future where more agents use the web than humans. In that world, sites need to be agent-friendly.</p><p>That’s why we’re launching support for <a href="https://developer.chrome.com/blog/webmcp-epp"><u>WebMCP</u></a>, a new browser API from the Google Chrome team that landed in Chromium 146+. WebMCP lets websites expose tools directly to AI agents, declaring what actions are available for agents to discover and call on each page. This helps agents navigate the web more reliably. Instead of agents needing to figure out how to use a site, websites can expose their tools for agents to discover and call</p><p>Two APIs make this work:</p><ul><li><p><code>navigator.modelContext</code> allows websites to register their tools</p></li><li><p><code>navigator.modelContextTesting</code> allows agents to discover and execute those tools</p></li></ul><p>Today, an agent visiting a travel booking site has to figure out the UI by looking at it. With WebMCP, the site declares “here’s a search_flights tool that takes an origin, destination, and date.” The agent calls it directly, without having to loop through slow screenshot-analyze-click loops. This makes navigation more reliable regardless of potential changes to the UI.</p><p>Tools are discovered on the page rather than preloaded. This matters for the long tail of the web, where preloading an MCP server for every possible site is not feasible and would bloat the context window. </p><div>
  
</div><p><sup><i>Using WebMCP to book a hotel through the Chrome DevTools console, discovering available tools with listTools()</i></sup></p><p>We have an experimental pool with browser instances running Chrome beta so you can test emerging browser features before they reach stable Chrome. We also just shipped <a href="https://developers.cloudflare.com/browser-rendering/reference/wrangler-commands/"><u>Wrangler browser commands</u></a> that let you manage browser sessions directly from the CLI, letting you create, manage, and view browser sessions directly from your terminal. To <a href="https://developers.cloudflare.com/browser-run/features/webmcp/"><u>access WebMCP-enabled browsers</u></a>, use the following Wrangler command to create a session in the experimental pool:</p>
            <pre><code>npm i -g wrangler@latest
wrangler browser create --lab --keepAlive 300  
</code></pre>
            
    <div>
      <h3>Existing ways to use Browser Run</h3>
      <a href="#existing-ways-to-use-browser-run">
        
      </a>
    </div>
    <p>While CDP and WebMCP are new, you could already use <a href="https://developers.cloudflare.com/browser-rendering/puppeteer/"><u>Puppeteer</u></a>, <a href="https://developers.cloudflare.com/browser-rendering/playwright/"><u>Playwright</u></a>, or <a href="https://developers.cloudflare.com/browser-rendering/stagehand/"><u>Stagehand</u></a> for full browser automation through Browser Run. And for simple tasks like <a href="https://developers.cloudflare.com/browser-rendering/rest-api/screenshot-endpoint/"><u>capturing screenshots</u></a>, <a href="https://developers.cloudflare.com/browser-rendering/rest-api/pdf-endpoint/"><u>generating PDFs</u></a>, and <a href="https://developers.cloudflare.com/browser-rendering/rest-api/markdown-endpoint/"><u>extracting markdown</u></a>, there are the <a href="https://developers.cloudflare.com/browser-rendering/rest-api/"><u>Quick Action endpoints</u></a>. </p>
    <div>
      <h4>/crawl endpoint — crawl web content</h4>
      <a href="#crawl-endpoint-crawl-web-content">
        
      </a>
    </div>
    <p>We also recently shipped a <a href="https://developers.cloudflare.com/browser-rendering/rest-api/crawl-endpoint/"><u>/crawl endpoint</u></a> that lets you crawl entire sites with a single API call. Give it a starting URL and pages are automatically discovered and scraped, then returned in your preferred format (HTML, Markdown, and structured JSON), with additional parameters to control crawl depth and scope, skip pages that haven’t changed, and specify certain paths to include or exclude. </p><p>We intentionally built /crawl to be a <a href="https://developers.cloudflare.com/browser-run/faq/#will-browser-run-be-detected-by-bot-management"><u>well-behaved crawler</u></a>. That means it respects site owner’s preferences out of the box, is a <a href="https://developers.cloudflare.com/bots/concepts/bot/signed-agents/"><u>signed agent</u></a> with a distinct bot ID that is cryptographically signed using <a href="https://developers.cloudflare.com/bots/reference/bot-verification/web-bot-auth/"><u>Web Bot Auth</u></a>, a non-customizable <a href="https://developers.cloudflare.com/browser-rendering/rest-api/crawl-endpoint/#user-agent"><u>User-Agent</u></a>, and follows robots.txt and <a href="https://www.cloudflare.com/ai-crawl-control/"><u>AI Crawl Control</u></a>. It does not bypass Cloudflare’s bot protections or CAPTCHAs. Site owners choose whether their content is accessible and /crawl respects it. </p>
            <pre><code># Initiate a crawl
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/{account_id}/browser-rendering/crawl' \
  -H 'Authorization: Bearer &lt;apiToken&gt;' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://blog.cloudflare.com/"
  }'
</code></pre>
            
    <div>
      <h2>3) Observe</h2>
      <a href="#3-observe">
        
      </a>
    </div>
    <p>Things don’t always go right the first try. We kept hearing from customers that when their automations failed, they had no idea why. That’s why we’ve added multiple ways to observe what’s happening, so you can see exactly what your agent sees, both live and after the fact. </p>
    <div>
      <h3>Live View</h3>
      <a href="#live-view">
        
      </a>
    </div>
    <p><a href="https://developers.cloudflare.com/browser-run/features/live-view/"><u>Live View</u></a> lets you watch your agent’s browser session in real time. Whether you’re debugging an agent or running a long automation script, you see exactly what’s happening as it happens. This includes the page itself, as well as the DOM, console, and network requests. When something goes wrong — the expected button isn't there, the page needs authentication, or a CAPTCHA appears — you can catch it immediately.</p><p>There are two ways to access Live View. From code, obtain the <code>session_id</code> of the browser you want to inspect and open the <code>devtoolsFrontendURL</code> from the response in Chrome. Or from the Cloudflare dashboard, open the new Live Sessions tab in the Browser Run section and click into any active session.</p><div>
  
</div><p><sup><i>Live View of an AI agent booking a hotel, showing real-time browser activity</i></sup></p>
    <div>
      <h3>Session Recordings</h3>
      <a href="#session-recordings">
        
      </a>
    </div>
    <p>Live View is great when you’re available, but you can’t watch every session. <a href="https://developers.cloudflare.com/browser-run/features/session-recording/"><u>Session Recordings</u></a> captures DOM changes, mouse and keyboard events, and page navigation as structured JSON so you can replay any session after it ends. </p><p>Enable Session Recordings by passing <code>recording:true</code> when launching a browser. After the session closes, you can access the recording in the Cloudflare dashboard from the Runs tab or retrieve recordings via API and replay them with the <a href="https://github.com/rrweb-io/rrweb/tree/master/packages/rrweb-player"><u>rrweb-player</u></a>. Next, we’re adding the ability to inspect DOM state and console output at any point during the recording.   </p><div>
  
</div><p><sup><i>Session recording replay of a browser automation browsing the Sentry Shop and adding a bomber jacket to the cart </i></sup></p>
    <div>
      <h3>Dashboard Redesign</h3>
      <a href="#dashboard-redesign">
        
      </a>
    </div>
    <p>Previously, the <a href="https://dash.cloudflare.com/?to=/:account/workers/browser-run"><u>Browser Run dashboard</u></a> only showed logs from browser sessions. Requests for screenshots, PDFs, markdown, and crawls were not visible. The redesigned dashboard changes that. The new Runs tab shows every request. You can filter by endpoint and view details including target URLs, status, and duration.  </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7eExar2kjoc2QSq6skzTf6/ba4ad79fa01eb060f8b14cd5afa342e5/BLOG-3221_2.png" />
          </figure><p><sup><i>The Browser Run dashboard Runs tab showing browser sessions and quick actions like PDF, Screenshot, and Crawl in a single view, with a crawl job expanded to show its progress</i></sup></p>
    <div>
      <h2>4) Intervene</h2>
      <a href="#4-intervene">
        
      </a>
    </div>
    <p>Agents are good, but they’re not perfect. Sometimes they need their human to step in. Browser Run supports Human in the Loop workflows where a human can take control of a live browser session, handle what the automation cannot, then let the session continue. </p>
    <div>
      <h3>Human in the Loop</h3>
      <a href="#human-in-the-loop">
        
      </a>
    </div>
    <p>When automation hits a wall, you don't have to restart. With <a href="https://developers.cloudflare.com/browser-run/features/human-in-the-loop/"><u>Human in the Loop</u></a>, you can step in and interact with the page directly to click, type, navigate, enter credentials, or submit forms. This unlocks workflows that agents cannot handle.</p><p>Today, you can step in by opening the Live View URL for any active session. Next, we’re adding a handoff flow where the agent can signal that it needs help, notify a human to step in, then hand control back to the agent once the issue is resolved.</p><div>
  
</div><p><sup><i>An AI agent searching Amazon for an orange lava lamp, comparing options, and handing off to a human when sign-in is required to complete the purchase</i></sup></p>
    <div>
      <h2>5) Scale</h2>
      <a href="#5-scale">
        
      </a>
    </div>
    <p>Customers have asked us to raise limits so that they can do more, faster.</p>
    <div>
      <h3>Higher limits</h3>
      <a href="#higher-limits">
        
      </a>
    </div>
    <p>We've quadrupled the <a href="https://developers.cloudflare.com/browser-rendering/limits/"><u>default concurrent browser limit from 30 to 120</u></a>. Every session gives you instant access to a browser from a global pool of warm instances, so there's no cold start waiting for a browser to spin up. In March, we also <a href="https://developers.cloudflare.com/changelog/post/2026-03-04-br-rest-api-limit-increase/"><u>increased limits for Quick Actions</u></a> to 10 requests per second. If you need higher limits, they're available by request.</p>
    <div>
      <h2>What's next</h2>
      <a href="#whats-next">
        
      </a>
    </div>
    <ul><li><p><b>Human in the Loop Handoff</b>: today you can intervene in a browser session through Live View. Soon, the agent will be able to signal when it needs help, so you can build in notifications to alert a human to step in.</p></li><li><p><b>Session Recordings Inspection</b>: you can already scrub through the timeline and replay any session. Soon, you’ll be able to inspect DOM state and console output as well.</p></li><li><p><b>Traces and Browser Logs</b>: access debugging information without instrumenting your code. Console logs, network requests, timing data. If something broke, you'll know where.</p></li><li><p><b>Screenshot, PDF, and markdown directly from Workers</b>: the same simple tasks available through the <a href="https://developers.cloudflare.com/browser-rendering/rest-api/"><u>REST API</u></a> are coming to <a href="https://developers.cloudflare.com/browser-rendering/workers-bindings/"><u>Workers Bindings</u></a>. e<code>nv.BROWSER.screenshot()</code> just works, with no API tokens needed.</p></li></ul>
    <div>
      <h2>Get started</h2>
      <a href="#get-started">
        
      </a>
    </div>
    <p>Browser Run is available today on both the Workers Free and Workers Paid plans. Everything we shipped today — Live View, Human in the Loop, Session Recordings, and higher concurrency limits — is ready to use. </p><p>If you were already using Browser Rendering, everything works the same, just with a new name and more features.  </p><p>Check out the <a href="https://developers.cloudflare.com/browser-rendering/"><u>documentation</u></a> to get started. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6mQq5rxfDK3oU80JCRUL7P/7842cac72f0f4170cc697011230146ab/BLOG-3221_3.png" />
          </figure><p></p> ]]></content:encoded>
            <category><![CDATA[Agents Week]]></category>
            <category><![CDATA[Chrome]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Browser Rendering]]></category>
            <category><![CDATA[Agents]]></category>
            <category><![CDATA[Browser Run]]></category>
            <guid isPermaLink="false">160lCssR1GA8lEUV718ev5</guid>
            <dc:creator>Kathy Liao</dc:creator>
        </item>
        <item>
            <title><![CDATA[Rearchitecting the Workflows control plane for the agentic era]]></title>
            <link>https://blog.cloudflare.com/workflows-v2/</link>
            <pubDate>Wed, 15 Apr 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ Cloudflare Workflows, a durable execution engine for multi-step applications, now supports higher concurrency and creation rate limits through a rearchitectured control plane, helping scale to meet the use cases for durable background agents.
 ]]></description>
            <content:encoded><![CDATA[ <p>When we originally built <a href="https://developers.cloudflare.com/workflows/"><u>Workflows</u></a>, our durable execution engine for multi-step applications, it was designed for a world in which workflows were triggered by human actions, like a user signing up or placing an order. For use cases like onboarding flows, workflows only had to support one instance per person — and people can only click so fast. </p><p>Over time, what we’ve actually seen is a quantitative shift in the workload and access pattern: fewer human-triggered workflows, and more agent-triggered workflows, created at machine speed. </p><p>As agents become persistent and autonomous infrastructure, operating on behalf of users for hours or days, they need a durable, asynchronous execution engine for the work they are doing. Workflows provides exactly that: every step is independently retryable, the workflow can pause for human-in-the-loop approval, and each instance survives failures without losing progress.  </p><p>Moreover, workflows themselves are being used to implement agent loops and serve as the durable harnesses that manage and keep agents alive. Our<a href="https://developers.cloudflare.com/changelog/post/2026-02-03-agents-workflows-integration/"> <u>Agents SDK integration</u></a> accelerated this, making it easy for agents to spawn workflow instances and get real-time progress back. A single agent session can now kick off dozens of workflows, and many agents running concurrently means thousands of instances created in seconds. With <a href="https://blog.cloudflare.com/project-think"><u>Project Think</u></a> now available, we anticipate that velocity will only increase.</p><p>To help developers scale their agents and applications on Workflows, we are excited to announce that we now support:</p><ul><li><p>50,000 concurrent instances (number of workflow executions running in parallel), <a href="https://developers.cloudflare.com/changelog/post/2025-02-25-workflows-concurrency-increased/"><u>originally 4,500</u></a></p></li><li><p>300 instances/second created per account, previously 100</p></li><li><p>2 million queued instances (meaning instances that have been created or awoken and are waiting for a concurrency slot) per workflow, up from 1 million</p></li></ul><p>We redesigned the Workflows control plane from usage data and first principles to support these increases. For V1 of the control plane, a single Durable Object (DO) could serve as the central registry and coordinator of an entire account. For V2, we built two new components to help horizontally scale the system and alleviate the bottlenecks that V1 introduced, before migrating all customers — with live traffic — seamlessly onto the new version.</p>
    <div>
      <h2>V1: initial architecture of Workflows</h2>
      <a href="#v1-initial-architecture-of-workflows">
        
      </a>
    </div>
    <p>As described in our <a href="https://blog.cloudflare.com/building-workflows-durable-execution-on-workers/#building-cloudflare-on-cloudflare"><u>public beta blog post</u></a>, we built <a href="https://www.cloudflare.com/developer-platform/products/workflows/"><u>Workflows</u></a> entirely on our own developer platform. Fundamentally, a workflow is a series of durable steps, each independently retryable, that can execute tasks, wait for external events, or sleep until a predetermined time. </p>
            <pre><code>export class MyWorkflow extends WorkflowEntrypoint {

  async run(event, step) {
    const data = await step.do("fetch-data", async () =&gt; {
      return fetchFromAPI();
    });

    const approval = await step.waitForEvent("approval", {
      type: "approval",
      timeout: "24 hours",
    });

    await step.do("process-and-save", async () =&gt; {
      return store(transform(data));
    });
  }
}
</code></pre>
            <p>To trigger each instance, execute its logic, and store its metadata, we leverage SQLite-backed <a href="https://www.cloudflare.com/developer-platform/products/durable-objects/"><u>Durable Objects</u></a>, which are a simple but powerful primitive for coordination and storage within a distributed system. </p><p>In the control plane, some Durable Objects — like the <i>Engine</i>, which executes the actual workflow instance, including its step, retry, and sleep logic — are spun up at a ratio of 1:1 per instance. On the other hand, the <i>Account</i> is an account-level Durable Object that manages all workflows and workflow instances for that account.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/55bqaUjc30HJHe9spWYTo8/d8053955660553db8b64a484fb321ec7/BLOG-3116_2.png" />
          </figure><p>To learn more about the V1 control plane, refer to our <a href="https://blog.cloudflare.com/building-workflows-durable-execution-on-workers/"><u>Workflows announcement blog post</u></a>.</p><p>After we launched Workflows into beta, we were thrilled to see customers quickly scaling their use of the product, but we also realized that having a single Durable Object to store all that account-level information introduced a bottleneck. Many customers needed to create and execute hundreds or even thousands of Workflow instances per minute, which could quickly overwhelm the <i>Account</i> in our original architecture. The original rate limits — 4,500 concurrency slots and 100 instance creations per 10 seconds — were a result of this limitation. </p><p>On the V1 control plane, these limits were a hard cap. Any and all operations depending on <i>Account</i>, including create, update, and list, had to go through that single DO. Users with high concurrency workloads could have thousands of instances starting and ending at any given moment, building up to thousands of requests per second to <i>Account</i>. To solve for this, we rearchitected the workflow control plane such that it horizontally scales to higher concurrency and creation rate limits. </p>
    <div>
      <h2>V2: horizontal scale for higher throughput</h2>
      <a href="#v2-horizontal-scale-for-higher-throughput">
        
      </a>
    </div>
    <p>For the new version, we rethought every single operation from the ground up with the goal of optimizing for high-volume workflows. Ultimately, Workflows should scale to support whatever developers need – whether that is thousands of instances created per second or millions of instances running at a time. We also wanted to ensure that V2 allowed for flexible limits, which we can toggle and continue increasing, rather than the hard cap which V1 limits imposed. After many design iterations, we settled on the following pillars for our new architecture: </p><ul><li><p>The source of truth for the existence of a given instance should be its <i>Engine</i> and nothing else. </p><ul><li><p>In the V1 control plane architecture, we lacked a check before queuing the instance as to whether its <i>Engine</i> actually existed. This allowed for a bad state where an instance may have been queued without its corresponding <i>Engine </i>having spun up. </p></li><li><p>Instance lifecycle and liveness mechanisms must be horizontally scalable per-workflow and distributed throughout many regions.</p></li></ul></li><li><p>The new Account singleton should only store the minimum necessary metadata and have an invariant maximum amount of concurrent requests.</p></li></ul>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1txhhObwwIcV8C2gr9Hjfe/df7ea739567c7e42471458357c16583d/unnamed.png" />
          </figure><p>There are two new, critical components in the V2 control plane which allowed us to improve the scalability of Workflows: <i>SousChef</i> and <i>Gatekeeper</i>. The first component, <i>SousChef</i>, is a “second in command” to the <i>Account</i>. Recall that previously, the <i>Account</i> managed the metadata and lifecycle for all of the instances across all of the workflows within a given account. <i>SousChef</i> was introduced to keep track of metadata and lifecycle on a <b>subset</b> of instances in a given workflow. Within an account, a distribution of <i>SousChefs</i> can then report back to <i>Account</i> in a more efficient and manageable way. (An added benefit of this design: not only did we already have per-account isolation, but we also inadvertently gained “per-workflow” isolation within the same account, since each <i>SousChef</i> only takes care of one specific workflow).</p><p>The second component, <i>Gatekeeper</i>, is a mechanism to distribute concurrency “slots” (derived from concurrency limits) across all <i>SousChefs</i> within the account. It acts as a leasing system. When an instance is created, it is randomly assigned to one of the <i>SousChefs</i> within that account. Then the <i>SousChef</i> makes a request to <i>Account</i> to trigger that instance. Either a slot is granted, or the instance is queued. Once the slot is granted, the <i>SousChef</i> triggers execution of the instance and assumes responsibility that the instance never gets stuck. </p><p><i>Gatekeeper</i> was needed to make sure that <i>Engines</i> never overloaded their <i>Account</i> (a pressing risk on V1) so every communication between <i>SousChefs</i> and their <i>Account</i> happens on a periodic cycle, once per second — each cycle will also batch all slot requests, ensuring that only one JSRPC call is made. This ensures the instance creation rate can never overload or influence the most important component, <i>Account</i> (as an aside: if the <i>SousChef </i>count is too high, we rate-limit calls or spread across different <i>SousChefs</i> throughout different time periods). Also, this periodic property allows us to preserve fairness on older instances and to ensure max-min fairness through the many <i>SousChefs</i>, allowing them all to progress. For example, if an instance wakes up, it should be prioritized for a slot over a newly created instance, but each <i>SousChef</i> ensures that its own instances do not get stuck.</p><p>This architecture is more distributed, and therefore, more scalable. Now, when an instance is created, the request path is:</p><ol><li><p>Check control plane version</p></li><li><p>Check if a cached version of the workflow and version details is available in that location</p><ol><li><p>If not, check <i>Account</i> to get workflow name, unique ID, and version, and cache that information</p></li></ol></li><li><p>Store only necessary metadata (instance payload, creation date) onto its own <i>Engine</i></p></li></ol><p>So, how does <i>Engine</i> tell the control plane that it now exists? That happens in the background after instance metadata is set. As background operations on a Durable Object can fail, due to eviction or server failure, we also set an “alarm” on <i>Engine</i> in the creation hot-path. That way, if the background task does not finish, the alarm <b>ensures</b> that the instance will begin. </p><p>A <a href="https://developers.cloudflare.com/durable-objects/api/alarms/"><u>Durable Object alarm</u></a> allows a Durable Object instance to be awakened at a fine-grained time in the future with an<b> at-least-once </b>execution model, with automatic retries built in. We extensively use this combination of background “tasks” and alarms to remove operations off the hot-path while still ensuring that everything will happen as planned. That’s how we keep critical operations like <i>creating an instance</i> fast without ever compromising on reliability. </p><p>Other than unlocking scale, this version of the control plane means that: </p><ul><li><p>Instance listing performance is faster, and actually consistent with cursor pagination; </p></li><li><p>Any operation on an instance does exactly one network hop (as it can go directly to its <i>Engine</i>, ensuring that eyeball request latency is as small as we can manage);</p></li><li><p>We can ensure that more instances are actually behaving correctly (by running on time) concurrently (and correct them if not, making sure that <i>Engines</i> are never late to continue execution).</p></li></ul>
    <div>
      <h2>V1 → V2 migration</h2>
      <a href="#v1-v2-migration">
        
      </a>
    </div>
    <p>Now that we had a new version of the Workflows control plane that can handle a higher volume of user load, we needed to do the “boring” part: migrating our customers and instances to the new system. At Cloudflare’s scale, this becomes a problem in and of itself, so the “boring” part becomes the biggest challenge. Well before its one-year mark, Workflows had already racked up millions of instances and thousands of customers. Also, some tech debt on V1’s control plane meant that a queued instance might not have its own <i>Engine</i> Durable Object created yet, complicating matters further.</p><p>Such a migration is tricky because customers might have instances running at any given moment; we needed a way to add the <i>SousChef</i> and <i>Gatekeeper</i> components into older accounts without causing any disruption or downtime.</p><p>We ultimately decided that we would migrate existing <i>Accounts </i>(which we’ll refer to as <i>AccountOlds) </i>to behave like <i>SousChefs. </i>By persisting the <i>Account</i> DOs, we maintained the instance metadata, and simply converted the DO into a <i>SousChef</i> “DO”: </p>
            <pre><code>// You might be wondering what's this SousChef class? This is the SousChef DO class!
import { SousChef } from "@repo/souschef";

class AccountOld extends DurableObject {
  constructor(state: DurableObjectState, env: Env) {
    // We added the following snippet to the end of our AccountOld DO's
    // constructor. This ensures that if we want, we can use any primitive
    // that is available on SousChef DO
    if (this.currentVersion === ControlPlaneVersions.SOUS_CHEFS) {
      this.sousChef = new SousChef(this.ctx, this.env);
      await this.sousChef.setup()
    }
  }

  async updateInstance(params: UpdateInstanceParams) {
    if (this.currentVersion === ControlPlaneVersions.SOUS_CHEFS) {
      assert(this.sousChef !== undefined, 'SousChef must exist on v2');
      return this.sousChef.updateInstance(params);
    }

    // old logic remains the same
  }

  @RequiresVersion&lt;AccountOld&gt;(ControlPlaneVersions.V1)
  async getMetadata() {
    // this method can only be run if 
    // this.currentVersion === ControlPlaneVersions.V1
  }
}</code></pre>
            <p>We can instantiate the <i>SousChef</i> class within the <i>AccountOld</i> because the SQL tables that track instance metadata, on both <i>SousChefs</i> and <i>AccountOld</i> DOs, are the same on both. As such, we could just decide which version of the code to use. If this hadn’t been the case, we would have been forced to migrate the metadata of millions of instances, which would have made the migration more difficult and longer running for each account. So, how did the migration work?</p><p>First, we prepared <i>AccountOld</i> DOs to be switched to behave as <i>SousChefs</i> (which meant creating a release with a version of the snippet above). Then, we enabled control plane V2 per account, which triggered the next three steps roughly at the same time:</p><ul><li><p>All new instance creation requests are now routed to the new <i>SousChefs</i> (<i>SousChefs</i> are created when they receive the first request), new instances never go to <i>AccountOld</i> again;</p></li><li><p><i>AccountOld</i> DOs start migrating themselves to behave like <i>SousChefs</i>;</p></li><li><p>The new <i>Account</i> DO is spun up with the corresponding metadata.</p></li></ul><p>After all accounts were migrated to the new control plane version, we were able to sunset <i>AccountOld</i> DOs as their instance retention periods expired. Once all instances on all accounts on <i>AccountOlds</i> were migrated, we could spin down those DOs permanently. The migration was completed with no downtime in a process that truly felt like changing a car’s wheels while driving.</p>
    <div>
      <h2>Try it out</h2>
      <a href="#try-it-out">
        
      </a>
    </div>
    <p>If you are new to Workflows, try our <a href="https://developers.cloudflare.com/workflows/get-started/guide/"><u>Get Started guide</u></a> or <a href="https://developers.cloudflare.com/workflows/get-started/durable-agents/"><u>build your first durable agent</u></a> with Workflows.</p><p>If your use case requires higher limits than our new defaults — a concurrency limit of 50,000 slots and account-level creation rate limit of 300 instances per second, 100 per workflow — reach out via your account team or the <a href="https://forms.gle/ukpeZVLWLnKeixDu7"><u>Workers Limit Request Form</u></a>. You can also reach out with feedback, feature requests, or just to share how you are using Workflows on our <a href="https://discord.com/channels/595317990191398933/1296923707792560189"><u>Discord server</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[Agents Week]]></category>
            <category><![CDATA[Agents]]></category>
            <category><![CDATA[Durable Objects]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Developers]]></category>
            <guid isPermaLink="false">5R3ZpKlSDaSxbIwmpXwWYJ</guid>
            <dc:creator>Luís Duarte</dc:creator>
            <dc:creator>Mia Malden</dc:creator>
            <dc:creator>André Venceslau</dc:creator>
        </item>
        <item>
            <title><![CDATA[Add voice to your agent]]></title>
            <link>https://blog.cloudflare.com/voice-agents/</link>
            <pubDate>Wed, 15 Apr 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ An experimental voice pipeline for the Agents SDK enables real-time voice interactions over WebSockets. Developers can now build agents with continuous STT and TTS in just ~30 lines of server-side code.
 ]]></description>
            <content:encoded><![CDATA[ <p>For many of us, our first experiences with AI agents have been through typing into a chat box. And for those of us using agents day to day, we have likely gotten very good at writing detailed prompts or markdown files to guide them.</p><p>But some of the moments where agents would be most useful are not always text-first. You might be on a long commute, juggling a few open sessions, or just wanting to speak naturally to an agent, have it speak back, and continue the interaction.</p><p>Adding voice to an agent should not require moving that agent into a separate voice framework. Today, we are releasing an experimental voice pipeline for the <a href="https://developers.cloudflare.com/agents/api-reference/voice/"><u>Agents SDK</u></a>.</p><p>With <code>@cloudflare/voice</code>, you can add real-time voice to the same Agent architecture you already use. Voice just becomes another way you can talk to the same Durable Object, with the same tools, persistence, and WebSocket connection model that the Agents SDK already provides. </p><p><code>@cloudflare/voice</code> is an experimental package for the Agents SDK that provides: </p><ul><li><p><code>withVoice(Agent)</code> for full conversation voice agents</p></li><li><p><code>withVoiceInput(Agent)</code> for speech-to-text-only use cases, like dictation or voice search </p></li><li><p><code>useVoiceAgent</code> and <code>useVoiceInput</code> hooks for React apps </p></li><li><p><code>VoiceClient</code> for framework-agnostic clients </p></li><li><p>Built-in <a href="https://developers.cloudflare.com/workers-ai/"><u>Workers AI</u></a> providers, so that you can get started without external API keys: </p><ul><li><p>Continuous STT with <a href="https://developers.cloudflare.com/workers-ai/models/flux/"><u>Deepgram Flux</u></a></p></li><li><p>Continuous STT with<a href="https://developers.cloudflare.com/workers-ai/models/nova-3/"><u> Deepgram Nova 3</u></a></p></li><li><p>Text-to-speech with <a href="https://developers.cloudflare.com/workers-ai/models/aura-1/"><u>Deepgram Aura</u></a></p></li></ul></li></ul><p>This means you can now build an agent that users can talk to in real time over a single WebSocket connection, while keeping the same Agent class, Durable Object instance, and the same SQLite-backed conversation history. </p><p>Just as importantly, we want this to be bigger than one fixed default stack. The provider interfaces in <code>@cloudflare/voice</code> are intentionally small, and we want speech, telephony, and transport providers to build with us, so developers can mix and match the right components for their use case, instead of being locked into a single voice architecture.</p>
    <div>
      <h2>Get started with voice</h2>
      <a href="#get-started-with-voice">
        
      </a>
    </div>
    <p>Here’s the minimal server-side pattern for a voice agent in the Agents SDK: </p>
            <pre><code>import { Agent, routeAgentRequest } from "agents";
import {
  withVoice,
  WorkersAIFluxSTT,
  WorkersAITTS,
  type VoiceTurnContext
} from "@cloudflare/voice";

const VoiceAgent = withVoice(Agent);

export class MyAgent extends VoiceAgent&lt;Env&gt; {
  transcriber = new WorkersAIFluxSTT(this.env.AI);
  tts = new WorkersAITTS(this.env.AI);

  async onTurn(transcript: string, context: VoiceTurnContext) {
    return `You said: ${transcript}`;
  }
}

export default {
  async fetch(request: Request, env: Env) {
    return (
      (await routeAgentRequest(request, env)) ??
      new Response("Not found", { status: 404 })
    );
  }
} satisfies ExportedHandler&lt;Env&gt;;

</code></pre>
            <p>That’s the whole server. You add a continuous transcriber, a text-to-speech provider, and implement <code>onTurn()</code>. 

On the client side, you can connect to it with a React hook: </p>
            <pre><code>import { useVoiceAgent } from "@cloudflare/voice/react";

function App() {
  const {
    status,
    transcript,
    interimTranscript,
    startCall,
    endCall,
    toggleMute
  } = useVoiceAgent({ agent: "my-agent" });

  return (
    &lt;div&gt;
      &lt;p&gt;Status: {status}&lt;/p&gt;
      {interimTranscript &amp;&amp; &lt;p&gt;&lt;em&gt;{interimTranscript}&lt;/em&gt;&lt;/p&gt;}
      &lt;ul&gt;
        {transcript.map((msg, i) =&gt; (
          &lt;li key={i}&gt;
            &lt;strong&gt;{msg.role}:&lt;/strong&gt; {msg.text}
          &lt;/li&gt;
        ))}
      &lt;/ul&gt;
      &lt;button onClick={startCall}&gt;Start Call&lt;/button&gt;
      &lt;button onClick={endCall}&gt;End Call&lt;/button&gt;
      &lt;button onClick={toggleMute}&gt;Mute / Unmute&lt;/button&gt;
    &lt;/div&gt;
  );
}
</code></pre>
            <p>If you are not using React, you can use <code>VoiceClient</code> directly from <code>@cloudflare/voice/client</code>. </p>
    <div>
      <h2>How the voice pipeline works</h2>
      <a href="#how-the-voice-pipeline-works">
        
      </a>
    </div>
    <p>With the <a href="https://github.com/cloudflare/agents"><u>Agents SDK</u></a>, every agent is a <a href="https://developers.cloudflare.com/durable-objects/"><u>Durable Object</u></a> — a stateful, addressable server instance with its own <a href="https://developers.cloudflare.com/agents/api-reference/store-and-sync-state/"><u>SQLite database</u></a>, <a href="https://developers.cloudflare.com/agents/api-reference/websockets/"><u>WebSocket connections</u></a>, and application logic. The voice pipeline extends this model instead of replacing it. </p><p>At a high level, the flow looks like this:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5IBi8TsQiJ18Um47zpqAIr/d284d47a8653c35bc7c027438a5a7a2c/unnamed__55_.png" />
          </figure><p>Here’s how the pipeline breaks down, step by step: </p><ol><li><p><b>Audio transport: </b>The browser captures microphone audio and streams 16 kHz mono PCM over the same WebSocket connection the agent already uses. </p></li><li><p><b>STT session setup:  </b>When the call starts, the agent creates a continuous transcriber session that lives for the duration of the call. </p></li><li><p><b>STT input: </b>Audio streams continuously into that session.</p></li><li><p><b>STT turn detection: </b>The speech-to-text model itself decides when the user has finished an utterance and emits a stable transcript for that turn. </p></li><li><p><b>LLM/application logic: </b>The voice pipeline passes that transcript to your <code>onTurn(</code>) method. </p></li><li><p><b>TTS output: </b>Your response is synthesized to audio and sent back to the client. If <code>onTurn()</code> returns a stream, the pipeline sentence-chunks it and starts sending audio as sentences are ready. </p></li><li><p><b>Persistence: </b>The user and agent messages are persisted in SQLite, so conversation history survives reconnections and deployments.  </p></li></ol>
    <div>
      <h2>Why voice should grow with the rest of your agent</h2>
      <a href="#why-voice-should-grow-with-the-rest-of-your-agent">
        
      </a>
    </div>
    <p>Many voice frameworks focus on the voice loop itself: audio in, transcription, model response, audio out. Those are important primitives, but there’s a lot more to an agent than just voice. </p><p>Real agents running in production will grow. They need state, scheduling, persistence, tools, workflows, telephony, and ways to keep all of that consistent across channels. As your agent grows in complexity, voice stops being a standalone feature and becomes part of a larger system. </p><p>We wanted voice in the Agents SDK to start from that assumption. Instead of building voice as a separate stack, we built it on top of the same Durable Object-based agent platform, so you can pull in the rest of the primitives you need without re-architecting the application later.</p>
    <div>
      <h3>Voice and text share the same state</h3>
      <a href="#voice-and-text-share-the-same-state">
        
      </a>
    </div>
    <p>A user might start by typing, switch to voice, and go back to text. With Agents SDK, these are all just different inputs to the same agent. The same conversation history lives in SQLite, and the same tools are available. This gives you both a cleaner mental model and a much simpler application architecture to reason about. </p>
    <div>
      <h2>Lower latency comes from...</h2>
      <a href="#lower-latency-comes-from">
        
      </a>
    </div>
    
    <div>
      <h3>a shorter network path </h3>
      <a href="#a-shorter-network-path">
        
      </a>
    </div>
    <p>Voice experiences feel good or bad very quickly. Once a user stops speaking, the system needs to transcribe, think, and start speaking back fast enough to feel conversational. </p><p>A lot of voice latency is not pure model time. It’s the cost of bouncing audio and text between different services in different places. Audio needs to go to STT, transcripts go to an LLM, and responses go to a TTS model – and each handoff adds network overhead. </p><p>With the Agents SDK voice pipeline, the agent runs on Cloudflare’s network, and the built-in providers use Workers AI bindings. That keeps the pipeline tighter and reduces the amount of infrastructure you have to stitch together yourself. </p>
    <div>
      <h3>built-in streaming</h3>
      <a href="#built-in-streaming">
        
      </a>
    </div>
    <p>A voice agent interaction feels much more natural if it speaks the first sentence quickly (also called Time-to-First Audio). When <code>onTurn()</code> returns a stream, the pipeline chunks it into sentences and starts synthesis as sentences complete. That means the user can hear the beginning of the answer while the rest is still being generated. </p>
    <div>
      <h2>A more realistic backend </h2>
      <a href="#a-more-realistic-backend">
        
      </a>
    </div>
    <p>Here is a fuller example that streams an LLM response and starts speaking it back, sentence by sentence:</p>
            <pre><code>import { Agent, routeAgentRequest } from "agents";
import {
  withVoice,
  WorkersAIFluxSTT,
  WorkersAITTS,
  type VoiceTurnContext
} from "@cloudflare/voice";
import { streamText } from "ai";
import { createWorkersAI } from "workers-ai-provider";

const VoiceAgent = withVoice(Agent);

export class MyAgent extends VoiceAgent&lt;Env&gt; {
  transcriber = new WorkersAIFluxSTT(this.env.AI);
  tts = new WorkersAITTS(this.env.AI);

  async onTurn(transcript: string, context: VoiceTurnContext) {
    const ai = createWorkersAI({ binding: this.env.AI });

    const result = streamText({
      model: ai("@cf/cloudflare/gpt-oss-20b"),
      system: "You are a helpful voice assistant. Be concise.",
      messages: [
        ...context.messages.map((m) =&gt; ({
          role: m.role as "user" | "assistant",
          content: m.content
        })),
        { role: "user" as const, content: transcript }
      ],
      abortSignal: context.signal
    });

    return result.textStream;
  }
}

export default {
  async fetch(request: Request, env: Env) {
    return (
      (await routeAgentRequest(request, env)) ??
      new Response("Not found", { status: 404 })
    );
  }
} satisfies ExportedHandler&lt;Env&gt;;
</code></pre>
            <p><code>Context.messages</code> gives you recent SQLite-backed conversation history, and <code>context.signal</code> lets the pipeline abort the LLM call if the user interrupts. </p>
    <div>
      <h2>Voice as an input: <code>withVoiceInput</code></h2>
      <a href="#voice-as-an-input-withvoiceinput">
        
      </a>
    </div>
    <p>Not every speech interface needs to speak back. Sometimes you might want dictation, transcription, or voice search. For these use cases, you can use <code>withVoiceInput</code></p>
            <pre><code>import { Agent, type Connection } from "agents";
import { withVoiceInput, WorkersAINova3STT } from "@cloudflare/voice";

const InputAgent = withVoiceInput(Agent);

export class DictationAgent extends InputAgent&lt;Env&gt; {
  transcriber = new WorkersAINova3STT(this.env.AI);

  onTranscript(text: string, _connection: Connection) {
    console.log("User said:", text);
  }
}
</code></pre>
            <p>On the client, <code>useVoiceInput</code> gives you a lightweight interface centered on transcriptions: </p>
            <pre><code>import { useVoiceInput } from "@cloudflare/voice/react";

const { transcript, interimTranscript, isListening, start, stop, clear } =
  useVoiceInput({ agent: "DictationAgent" });
</code></pre>
            <p>This is useful when speech is an input method, and you don’t need a full conversational loop. </p>
    <div>
      <h2>Voice and text on the same connection</h2>
      <a href="#voice-and-text-on-the-same-connection">
        
      </a>
    </div>
    <p>The same client can call <code>sendText(“What’s the weather?”)</code>, which bypasses STT and sends the text directly to <code>onTurn()</code>. During an active call, the response can be spoken and shown as text. Outside a call, it can remain text-only. </p><p>This gives you a genuinely multimodal agent, without splitting the implementation into different code paths. </p>
    <div>
      <h2>What else can you build? </h2>
      <a href="#what-else-can-you-build">
        
      </a>
    </div>
    <p>Because a voice agent is still an agent, all the normal Agents SDK capabilities still apply. </p>
    <div>
      <h3>Tools and scheduling</h3>
      <a href="#tools-and-scheduling">
        
      </a>
    </div>
    <p>You can greet a caller when a session starts: </p>
            <pre><code>import { Agent, type Connection } from "agents";
import { withVoice, WorkersAIFluxSTT, WorkersAITTS } from "@cloudflare/voice";

const VoiceAgent = withVoice(Agent);

export class MyAgent extends VoiceAgent&lt;Env&gt; {
  transcriber = new WorkersAIFluxSTT(this.env.AI);
  tts = new WorkersAITTS(this.env.AI);

  async onTurn(transcript: string) {
    return `You said: ${transcript}`;
  }

  async onCallStart(connection: Connection) {
    await this.speak(connection, "Hi! How can I help you today?");
  }
}
</code></pre>
            <p>You can schedule spoken reminders and expose tools to your LLM just like any other agent: </p>
            <pre><code>import { Agent } from "agents";
import {
  withVoice,
  WorkersAIFluxSTT,
  WorkersAITTS,
  type VoiceTurnContext
} from "@cloudflare/voice";
import { streamText, tool } from "ai";
import { createWorkersAI } from "workers-ai-provider";
import { z } from "zod";

const VoiceAgent = withVoice(Agent);

export class MyAgent extends VoiceAgent&lt;Env&gt; {
  transcriber = new WorkersAIFluxSTT(this.env.AI);
  tts = new WorkersAITTS(this.env.AI);

  async speakReminder(payload: { message: string }) {
    await this.speakAll(`Reminder: ${payload.message}`);
  }

  async onTurn(transcript: string, context: VoiceTurnContext) {
    const ai = createWorkersAI({ binding: this.env.AI });

    const result = streamText({
      model: ai("@cf/cloudflare/gpt-oss-20b"),
      messages: [
        ...context.messages.map((m) =&gt; ({
          role: m.role as "user" | "assistant",
          content: m.content
        })),
        { role: "user" as const, content: transcript }
      ],
      tools: {
        set_reminder: tool({
          description: "Set a spoken reminder after a delay",
          inputSchema: z.object({
            message: z.string(),
            delay_seconds: z.number()
          }),
          execute: async ({ message, delay_seconds }) =&gt; {
            await this.schedule(delay_seconds, "speakReminder", { message });
            return { confirmed: true };
          }
        })
      },
      abortSignal: context.signal
    });

    return result.textStream;
  }
}
</code></pre>
            
    <div>
      <h3>Runtime model switching</h3>
      <a href="#runtime-model-switching">
        
      </a>
    </div>
    <p>The voice pipeline also lets you choose a transcription model dynamically per connection. </p><p>For example, you might prefer Flux for conversational turn-taking and Nova 3 for higher-accuracy dictation. You can switch at runtime by overriding <code>createTranscriber()</code>: </p>
            <pre><code>import { Agent, type Connection } from "agents";
import {
  withVoice,
  WorkersAIFluxSTT,
  WorkersAINova3STT,
  WorkersAITTS,
  type Transcriber
} from "@cloudflare/voice";

export class MyAgent extends VoiceAgent&lt;Env&gt; {
  tts = new WorkersAITTS(this.env.AI);

  createTranscriber(connection: Connection): Transcriber {
    const url = new URL(connection.url ?? "http://localhost");
    const model = url.searchParams.get("model");
    if (model === "nova-3") {
      return new WorkersAINova3STT(this.env.AI);
    }
    return new WorkersAIFluxSTT(this.env.AI);
  }
}
</code></pre>
            <p>On the client, you can pass query parameters through the hook: </p>
            <pre><code>const voiceAgent = useVoiceAgent({
  agent: "my-voice-agent",
  query: { model: "nova-3" }
});
</code></pre>
            
    <div>
      <h2>Pipeline hooks</h2>
      <a href="#pipeline-hooks">
        
      </a>
    </div>
    <p>You can also intercept data between stages: </p><ul><li><p><code>afterTranscribe(transcript, connection)</code></p></li><li><p><code>beforeSynthesize(text, connection)</code></p></li><li><p><code>afterSynthesize(audio, text, connection)</code></p></li></ul><p>These hooks are useful for content filtering, text normalization, language-specific transformations, or custom logging. </p>
    <div>
      <h2>Telephone and transport options</h2>
      <a href="#telephone-and-transport-options">
        
      </a>
    </div>
    <p>By default, the voice pipeline uses a single WebSocket connection as the simplest path for 1:1 voice agents. But that’s not the only option. </p>
    <div>
      <h3>Phone calls via Twilio</h3>
      <a href="#phone-calls-via-twilio">
        
      </a>
    </div>
    <p>You can connect phone calls to the same agent using the Twilio adapter:</p>
            <pre><code>import { TwilioAdapter } from "@cloudflare/voice-twilio";

export default {
  async fetch(request: Request, env: Env) {
    if (new URL(request.url).pathname === "/twilio") {
      return TwilioAdapter.handleRequest(request, env, "MyAgent");
    }

    return (
      (await routeAgentRequest(request, env)) ??
      new Response("Not found", { status: 404 })
    );
  }
};
</code></pre>
            <p>This lets the same agent handle web voice, text input, and phone calls. </p><p>One caveat: the default Workers AI TTS provider returns MP3, while Twilio expects mulaw 8kHz audio. For production telephony, you may want to use a TTS provider that outputs PCM or mulaw directly. </p>
    <div>
      <h3>WebRTC</h3>
      <a href="#webrtc">
        
      </a>
    </div>
    <p>If you need a transport that is better suited to difficult network conditions or will include multiple participants, the voice package also includes SFU utilities and supports custom transports. The default model is WebSocket-native today, but we plan to develop more adapters to connect to our <a href="https://developers.cloudflare.com/realtime/sfu/"><u>global SFU infrastructure</u></a>. </p>
    <div>
      <h2>Build with us </h2>
      <a href="#build-with-us">
        
      </a>
    </div>
    <p>The voice pipeline is provider-agnostic by design. </p><p>Under the hood, each stage is defined by a small interface: a transcriber opens a continuous session and accepts audio frames as they arrive, while a TTS provider takes text and returns audio. If a provider can stream audio output, the pipeline can use that too.</p>
            <pre><code>interface Transcriber {
  createSession(options?: TranscriberSessionOptions): TranscriberSession;
}

interface TranscriberSession {
  feed(chunk: ArrayBuffer): void;
  close(): void;
}

interface TTSProvider {
  synthesize(text: string, signal?: AbortSignal): Promise&lt;ArrayBuffer | null&gt;;
}
</code></pre>
            <p>We didn’t want voice support in Agents SDK to only work with one fixed combination of models and transports. We wanted the default path to be simple, while still making it easy to plug in other providers as the ecosystem grows. </p><p>The built-in providers use Workers AI, so you can get started without external API keys:</p><ul><li><p><code>WorkersAIFluxSTT</code> for conversational streaming STT</p></li><li><p><code>WorkersAINova3STT</code> for dictation-style streaming STT</p></li><li><p><code>WorkersAITTS</code> for text-to-speech</p></li></ul><p>But the bigger goal is interoperability. If you maintain a speech or voice service, these interfaces are small enough to implement without needing to understand the rest of the SDK internals. If your STT provider accepts streaming audio and can detect utterance boundaries, it can satisfy the transcriber interface. If your TTS provider can stream audio output, even better. </p><p>We would love to work on interoperability with:</p><ul><li><p>STT providers like AssemblyAI, Rev.ai, Speechmatics, or any service with a real-time transcription API</p></li><li><p>TTS providers like PlayHT, LMNT, Cartesia, Coqui, Amazon Polly, or Google Cloud TTS</p></li><li><p>telephony adapters for platforms like Vonage, Telnyx, or Bandwidth</p></li><li><p>transport implementations for WebRTC data channels, SFU bridges, and other audio transport layers</p></li></ul><p>We are also interested in collaborations that go beyond individual providers:</p><ul><li><p>latency benchmarking across STT + LLM + TTS combinations</p></li><li><p>multilingual support and better documentation for non-English voice agents</p></li><li><p>accessibility work, especially around multimodal interfaces and speech impairments</p></li></ul><p>If you are building voice infrastructure and want to see a first-class integration, <a href="https://github.com/cloudflare/agents/pulls"><u>open a PR</u></a> or reach out.</p>
    <div>
      <h2>Try it now</h2>
      <a href="#try-it-now">
        
      </a>
    </div>
    <p>The voice pipeline is available today as an experimental package:</p>
            <pre><code>npm create cloudflare@latest -- --template cloudflare/agents-starter
</code></pre>
            <p>Add <code>@cloudflare/voice</code>, give your agent a transcriber and a TTS provider, deploy it, and start talking to it. You can also read the <a href="https://developers.cloudflare.com/agents/api-reference/voice/"><u>API reference</u></a>. </p><p>If you build something interesting, open an issue or PR on <a href="https://github.com/cloudflare/agents"><u>github.com/cloudflare/agents</u></a>. Voice should not require a separate stack, and we think the best voice agents will be the ones built on the same durable application model as everything else.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1SLeHwqXY0ehOUsy5Nzzq2/c16244f45b8411f8817b9a21b45ed4b8/BLOG-3198_3.png" />
          </figure><p></p> ]]></content:encoded>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Agents Week]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Agents]]></category>
            <category><![CDATA[Durable Objects]]></category>
            <guid isPermaLink="false">1t4PF8FLpPMGpgBd6UjBBv</guid>
            <dc:creator>Sunil Pai</dc:creator>
            <dc:creator>Korinne Alpers</dc:creator>
        </item>
        <item>
            <title><![CDATA[Scaling MCP adoption: Our reference architecture for simpler, safer and cheaper enterprise deployments of MCP]]></title>
            <link>https://blog.cloudflare.com/enterprise-mcp/</link>
            <pubDate>Tue, 14 Apr 2026 13:00:10 GMT</pubDate>
            <description><![CDATA[ We share Cloudflare's internal strategy for governing MCP using Access, AI Gateway, and MCP server portals. We also launch Code Mode to slash token costs and recommend new rules for detecting Shadow MCP in Cloudflare Gateway.
 ]]></description>
            <content:encoded><![CDATA[ <p>We at Cloudflare have aggressively adopted <a href="https://modelcontextprotocol.io/"><u>Model Context Protocol (MCP)</u></a> as a core part of our AI strategy. This shift has moved well beyond our engineering organization, with employees across product, sales, marketing, and finance teams now using agentic workflows to drive efficiency in their daily tasks. But the adoption of agentic workflow with MCP is not without its <a href="https://www.cloudflare.com/learning/ai/what-is-ai-security/">security risks</a>. These range from authorization sprawl, <a href="https://www.cloudflare.com/learning/ai/prompt-injection/"><u>prompt injection</u></a>, and <a href="https://www.cloudflare.com/learning/security/what-is-a-supply-chain-attack/"><u>supply chain risks</u></a>. To secure this broad company-wide adoption, we have integrated a suite of security controls from both our <a href="https://www.cloudflare.com/sase/"><u>Cloudflare One (SASE) platform</u></a> and our <a href="https://workers.cloudflare.com/"><u>Cloudflare Developer platform</u></a>, allowing us to govern AI usage with MCP without slowing down our workforce.  </p><p>In this blog we’ll walk through our own best practices for securing MCP workflows, by putting different parts of our platform together to create a unified security architecture for the era of autonomous AI. We’ll also share two new concepts that support enterprise MCP deployments:</p><ul><li><p>We are launching <a href="https://developers.cloudflare.com/cloudflare-one/access-controls/ai-controls/mcp-portals/#code-mode"><u>Code Mode with MCP server portals</u></a>, to drastically reduce token costs associated with MCP usage; </p></li><li><p>We describe how to use <a href="https://developers.cloudflare.com/cloudflare-wan/zero-trust/cloudflare-gateway/"><u>Cloudflare Gateway</u></a> for Shadow MCP detection, to discover use of unauthorized remote MCP servers.</p></li></ul><p>We also talk about how our organization approached deploying MCP, and how we built out our MCP security architecture using Cloudflare products including <a href="https://developers.cloudflare.com/agents/guides/remote-mcp-server/"><u>remote MCP servers</u></a>, <a href="https://www.cloudflare.com/sase/products/access/"><u>Cloudflare Access</u></a>, <a href="https://developers.cloudflare.com/cloudflare-one/access-controls/ai-controls/mcp-portals/"><u>MCP server portals</u></a> and <a href="https://www.google.com/search?q=https://www.cloudflare.com/developer-platform/ai-gateway/"><u>AI Gateway</u></a>. </p>
    <div>
      <h2>Remote MCP servers provide better visibility and control</h2>
      <a href="#remote-mcp-servers-provide-better-visibility-and-control">
        
      </a>
    </div>
    <p><a href="https://www.cloudflare.com/learning/ai/what-is-model-context-protocol-mcp/"><u>MCP</u></a> is an open standard that enables developers to build a two-way connection between AI applications and the data sources they need to access. In this architecture, the MCP client is the integration point with the <a href="https://www.cloudflare.com/learning/ai/what-is-large-language-model/"><u>LLM</u></a> or other <a href="https://www.cloudflare.com/learning/ai/what-is-agentic-ai/"><u>AI agent</u></a>, and the MCP server sits between the <a href="https://www.cloudflare.com/learning/ai/mcp-client-and-server/"><u>MCP client</u></a> and the corporate resources.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/73kpNxOQIlM0UOnty9qGXS/53fe1b92e299b52363ac1870df52f2e9/BLOG-3252_2.png" />
          </figure><p>The separation between MCP clients and MCP servers allows agents to autonomously pursue goals and take actions while maintaining a clear boundary between the AI (integrated at the MCP client) and the credentials and APIs of the corporate resource (integrated at the MCP server). </p><p>Our workforce at Cloudflare is constantly using MCP servers to access information in various internal resources, including our project management platform, our internal wiki, documentation and code management platforms, and more. </p><p>Very early on, we realized that locally-hosted MCP servers were a security liability. Local MCP server deployments may rely on unvetted software sources and versions, which increases the risk of <a href="https://owasp.org/www-project-mcp-top-10/2025/MCP04-2025%E2%80%93Software-Supply-Chain-Attacks&amp;Dependency-Tampering"><u>supply chain attacks</u></a> or <a href="https://owasp.org/www-community/attacks/MCP_Tool_Poisoning"><u>tool injection attacks</u></a>. They prevent IT and security administrators from administrating these servers, leaving it up to individual employees and developers to choose which MCP servers they want to run and how they want to keep them up to date. This is a losing game.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4mngDeTGrsah2DiN7IAnrf/575075096e72d6b967df4327a99c35fc/BLOG-3252_3.png" />
          </figure><p>Instead, we have a centralized team at Cloudflare that manages our MCP server deployment across the enterprise. This team built a shared MCP platform inside our monorepo that provides governed infrastructure out of the box. When an employee wants to expose an internal resource via MCP, they first get approval from our AI governance team, and then they copy a template, write their tool definitions, and deploy, all the while inheriting default-deny write controls with audit logging, auto-generated <a href="https://www.cloudflare.com/learning/serverless/glossary/what-is-ci-cd/"><u>CI/CD pipelines</u></a>, and <a href="https://www.cloudflare.com/learning/security/glossary/secrets-management/"><u>secrets management</u></a> for free. This means standing up a new governed MCP server is minutes of scaffolding. The governance is baked into the platform itself, which is what allowed adoption to spread so quickly. </p><p>Our CI/CD pipeline deploys them as <a href="https://developers.cloudflare.com/agents/guides/remote-mcp-server/"><u>remote MCP servers</u></a> on custom domains on <a href="https://www.cloudflare.com/developer-platform/"><u>Cloudflare’s developer platform</u></a>. This gives us visibility into which MCPs servers are being used by our employees, while maintaining control over software sources. As an added bonus, every remote MCP server on the Cloudflare developer platform is automatically deployed across our global network of data centers, so MCP servers can be accessed by our employees with low latency, regardless of where they might be in the world.</p>
    <div>
      <h3>Cloudflare Access provides authentication</h3>
      <a href="#cloudflare-access-provides-authentication">
        
      </a>
    </div>
    <p>Some of our MCP servers sit in front of public resources, like our <a href="https://docs.mcp.cloudflare.com/mcp"><u>Cloudflare documentation MCP server</u></a> or <a href="https://radar.mcp.cloudflare.com/mcp"><u>Cloudflare Radar MCP server</u></a>, and thus we want them to be accessible to anyone. But many of the MCP servers used by our workforce are sitting in front of our private corporate resources. These MCP servers require user authentication to ensure that they are off limits to everyone but authorized Cloudflare employees. To achieve this, our monorepo template for MCP servers integrates <a href="https://www.cloudflare.com/sase/products/access/"><u>Cloudflare Access</u></a> as the OAuth provider. Cloudflare Access secures login flows and issues access tokens to resources, while acting as an identity aggregator that verifies end user <a href="https://www.cloudflare.com/learning/access-management/what-is-sso/"><u>single-sign on (SSO)</u></a>, <a href="https://www.cloudflare.com/learning/access-management/what-is-multi-factor-authentication/"><u>multifactor authentication (MFA)</u></a>, and a variety of contextual attributes such as IP addresses, location, or device certificates. </p>
    <div>
      <h2>MCP server portals centralize discovery and governance</h2>
      <a href="#mcp-server-portals-centralize-discovery-and-governance">
        
      </a>
    </div>
    
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/70s5yIwdzoaYQIoRF4L0mH/018dae32529c30639a2af48ee4031bc6/BLOG-3252_4.png" />
          </figure><p><sup><i>MCP server portals unify governance and control for all AI activity.</i></sup></p><p>As the number of our remote MCP servers grew, we hit a new wall: discovery. We wanted to make it easy for every employee (especially those that are new to MCP) to find and work with all the MCP servers that are available to them. Our MCP server portals product provided a convenient solution. The employee simply connects their MCP client to the MCP server portal, and the portal immediately reveals every internal and third-party MCP servers they are authorized to use. </p><p>Beyond this, our MCP server portals provide centralized logging, consistent policy enforcement and <a href="https://www.cloudflare.com/learning/access-management/what-is-dlp/"><u>data loss prevention</u></a> (DLP guardrails). Our administrators can see who logged into what MCP portal and create DLP rules that prevent certain data, like personally identifiable data (PII), from being shared with certain MCP servers.</p><p>We can also create policies that control who has access to the portal itself, and what tools from each MCP server should be exposed. For example, we could set up one MCP server portal that is only accessible to employees that are part of our <i>finance </i>group that exposes just the read-only tools for the MCP server in front of our internal code repository. Meanwhile, a different MCP server portal, accessible only to employees on their corporate laptops that are in our <i>engineering </i>team, could expose more powerful read/write tools to our code repository MCP server.</p><p>An overview of our MCP server portal architecture is shown above. The portal supports both remote MCP servers hosted on Cloudflare, and third-party MCP servers hosted anywhere else. What makes this architecture uniquely performant is that all these security and networking components run on the same physical machine within our global network. When an employee's request moves through the MCP server portal, a Cloudflare-hosted remote MCP server, and Cloudflare Access, their traffic never needs to leave the same physical machine. </p>
    <div>
      <h2>Code Mode with MCP server portals reduces costs</h2>
      <a href="#code-mode-with-mcp-server-portals-reduces-costs">
        
      </a>
    </div>
    <p>After months of high-volume MCP deployments, we’ve paid out our fair share of tokens. We’ve also started to think most people are doing MCP wrong.</p><p>The standard approach to MCP requires defining a separate tool for every API operation that is exposed via an MCP server. But this static and exhaustive approach quickly exhausts an agent’s context window, especially for large platforms with thousands of endpoints.</p><p>We previously wrote about how we used server-side <a href="https://blog.cloudflare.com/code-mode-mcp/"><u>Code Mode to power Cloudflare’s MCP server</u></a>, allowing us to expose <a href="https://developers.cloudflare.com/api/?cf_target_id=C3927C0A6A2E9B823D2DF3F28E5F0D30"><u>the thousands of end-points in Cloudflare API</u></a> while reducing token use by 99.9%. The Cloudflare MCP server exposes just two tools: a <code>search</code> tool lets the model write JavaScript to explore what’s available, and an <code>execute</code> tool lets it write JavaScript to call the tools it finds. The model discovers what it needs on demand, rather than receiving everything upfront.</p><p>We like this pattern so much, we had to make it available for everyone. So we have now launched the ability to use the “Code Mode” pattern with <a href="https://developers.cloudflare.com/cloudflare-one/access-controls/ai-controls/mcp-portals/"><u>MCP server portals</u></a>. Now you can front all of your MCP servers with a centralized portal that performs audit controls and progressive tool disclosure, in order to reduce token costs.</p><p>Here is how it works. Instead of exposing every tool definition to a client, all of your underlying MCP servers collapse into just two MCP portal tools: <code>portal_codemode_search</code> and <code>portal_codemode_execute</code>. The <code>search</code> tool gives the model access to a <code>codemode.tools()</code> function that returns all the tool definitions from every connected upstream MCP server. The model then writes JavaScript to filter and explore these definitions, finding exactly the tools it needs without every schema being loaded into context. The <code>execute</code> tool provides a <code>codemode</code> proxy object where each upstream tool is available as a callable function. The model writes JavaScript that calls these tools directly, chaining multiple operations, filtering results, and handling errors in code. All of this runs in a sandboxed environment on the MCP server portal powered by <a href="https://developers.cloudflare.com/dynamic-workers/"><u>Dynamic Workers</u></a>. </p><p>Here is an example of an agent that needs to find a Jira ticket and update it with information from Google Drive. It first searches for the right tools:</p>
            <pre><code>// portal_codemode_search
async () =&gt; {
 const tools = await codemode.tools();
 return tools
  .filter(t =&gt; t.name.includes("jira") || t.name.includes("drive"))
  .map(t =&gt; ({ name: t.name, params: Object.keys(t.inputSchema.properties || {}) }));
}
</code></pre>
            <p> The model now knows the exact tool names and parameters it needs, without the full schemas of tools ever entering its context. It then writes a single <code>execute</code> call to chain the operations together:</p>
            <pre><code>// portal_codemode_execute
async () =&gt; {
 const tickets = await codemode.jira_search_jira_with_jql({
  jql: ‘project = BLOG AND status = “In Progress”’,
  fields: [“summary”, “description”]
 });
 const doc = await codemode.google_workspace_drive_get_content({
  fileId: “1aBcDeFgHiJk”
 });
 await codemode.jira_update_jira_ticket({
  issueKey: tickets[0].key,
  fields: { description: tickets[0].description + “\n\n” + doc.content }
 });
 return { updated: tickets[0].key };
}
</code></pre>
            <p>This is just two tool calls. The first discovers what's available, the second does the work. Without Code Mode, this same workflow would have required the model to receive the full schemas of every tool from both MCP servers upfront, and then make three separate tool invocations.</p><p>Let’s put the savings in perspective: when our internal MCP server portal is connected to just four of our internal MCP servers, it exposes 52 tools that consume approximately 9,400 tokens of context just for their definitions. With Code Mode enabled, those 52 tools collapse into 2 portal tools consuming roughly 600 tokens, a 94% reduction. And critically, this cost stays fixed. As we connect more MCP servers to the portal, the token cost of Code Mode doesn’t grow.</p><p>Code Mode can be activated on an MCP server portal by adding a query parameter to the URL. Instead of connecting to your portal over its usual URL (e.g. <code>https://myportal.example.com/mcp</code>), you attach <code>?codemode=search_and_execute</code> to the URL (e.g. <code>https://myportal.example.com/mcp?codemode=search_and_execute</code>).</p>
    <div>
      <h2>AI Gateway provides extensibility and cost controls</h2>
      <a href="#ai-gateway-provides-extensibility-and-cost-controls">
        
      </a>
    </div>
    <p>We aren’t done yet. We plug <a href="https://www.cloudflare.com/developer-platform/products/ai-gateway/"><u>AI Gateway</u></a> into our architecture by positioning it on the connection between the MCP client and the LLM. This allows us to quickly switch between various LLM providers (to prevent vendor lock-in) and to enforce cost controls (by limiting the number of tokens each employee can burn through). The full architecture is shown below. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3epLuGydMO1YxkdhkcqGPG/74c26deb712d383942e79699b4fb71da/BLOG-3252_5.png" />
          </figure>
    <div>
      <h2>Cloudflare Gateway discovers and blocks shadow MCP</h2>
      <a href="#cloudflare-gateway-discovers-and-blocks-shadow-mcp">
        
      </a>
    </div>
    <p>Now that we’ve provided governed access to authorized MCP servers, let’s look into dealing with unauthorized MCP servers. We can perform shadow MCP discovery using <a href="https://developers.cloudflare.com/cloudflare-wan/zero-trust/cloudflare-gateway/"><u>Cloudflare Gateway</u></a>. Cloudflare Gateway is our comprehensive secure web gateway that provides enterprise security teams with visibility and control over their employees’ Internet traffic.</p><p>We can use the Cloudflare Gateway API to perform a multi-layer scan to find remote MCP servers that are not being accessed via an MCP server portal. This is possible using a variety of existing Gateway and Data Loss Prevention (DLP) selectors, including:</p><ul><li><p>Using the Gateway <code>httpHost</code> selector to scan for </p><ul><li><p>known MCP server hostnames using (like <a href="http://mcp.stripe.com"><u>mcp.stripe.com</u></a>)</p></li><li><p>mcp.* subdomains using wildcard hostname patterns </p></li></ul></li><li><p>Using the Gateway <code>httpRequestURI</code> selector to scan for MCP-specific URL paths like /mcp and /mcp/sse </p></li><li><p>Using DLP-based body inspection to find MCP traffic, even if that traffic uses URI that do not contain the telltale mentions of <code>mcp</code> or <code>sse</code>. Specifically, we use the fact that MCP uses JSON-RPC over HTTP, which means every request contains a "method" field with values like "tools/call", "prompts/get", or "initialize." Here are some regex rules that can be used to detect MCP traffic in the HTTP body:</p></li></ul>
            <pre><code>const DLP_REGEX_PATTERNS = [
  {
    name: "MCP Initialize Method",
    regex: '"method"\\s{0,5}:\\s{0,5}"initialize"',
  },
  {
    name: "MCP Tools Call",
    regex: '"method"\\s{0,5}:\\s{0,5}"tools/call"',
  },
  {
    name: "MCP Tools List",
    regex: '"method"\\s{0,5}:\\s{0,5}"tools/list"',
  },
  {
    name: "MCP Resources Read",
    regex: '"method"\\s{0,5}:\\s{0,5}"resources/read"',
  },
  {
    name: "MCP Resources List",
    regex: '"method"\\s{0,5}:\\s{0,5}"resources/list"',
  },
  {
    name: "MCP Prompts List",
    regex: '"method"\\s{0,5}:\\s{0,5}"prompts/(list|get)"',
  },
  {
    name: "MCP Sampling Create Message",
    regex: '"method"\\s{0,5}:\\s{0,5}"sampling/createMessage"',
  },
  {
    name: "MCP Protocol Version",
    regex: '"protocolVersion"\\s{0,5}:\\s{0,5}"202[4-9]',
  },
  {
    name: "MCP Notifications Initialized",
    regex: '"method"\\s{0,5}:\\s{0,5}"notifications/initialized"',
  },
  {
    name: "MCP Roots List",
    regex: '"method"\\s{0,5}:\\s{0,5}"roots/list"',
  },
];
</code></pre>
            <p>The Gateway API supports additional automation. For example, one can use the custom DLP profile we defined above to block traffic, or redirect it, or just to log and inspect MCP payloads. Put this together, and Gateway can be used to provide comprehensive detection of unauthorized remote MCP servers accessed via an enterprise network. </p><p>For more information on how to build this out, see this <a href="https://developers.cloudflare.com/cloudflare-one/tutorials/detect-mcp-traffic-gateway-logs/"><u>tutorial</u></a>. </p>
    <div>
      <h2>Public-facing MCP Servers are protected with AI Security for Apps</h2>
      <a href="#public-facing-mcp-servers-are-protected-with-ai-security-for-apps">
        
      </a>
    </div>
    <p>So far, we’ve been focused on protecting our workforce’s access to our internal MCP servers. But, like many other organizations, we also have public-facing MCP servers that our customers can use to agentically administer and operate Cloudflare products. These MCP servers are hosted on Cloudflare’s developer platform. (You can find a list of individual MCPs for specific products <a href="https://developers.cloudflare.com/agents/model-context-protocol/mcp-servers-for-cloudflare/"><u>here</u></a>, or refer back to our new approach for providing more efficient access to the entire Cloudflare API using <a href="https://blog.cloudflare.com/code-mode/"><u>Code Mode</u></a>.)</p><p>We believe that every organization should publish official, first-party MCP servers for their products. The alternative is that your customers source unvetted servers from public repositories where packages may contain <a href="https://www.docker.com/blog/mcp-horror-stories-the-supply-chain-attack/"><u>dangerous trust assumptions</u></a>, undisclosed data collection, and any range of unsanctioned behaviors. By publishing your own MCP servers, you control the code, update cadence, and security posture of the tools your customers use.</p><p>Since every remote MCP server is an HTTP endpoint, we can put it behind the <a href="https://www.cloudflare.com/application-services/products/waf/"><u>Cloudflare Web Application Firewall (WAF)</u></a>. Customers can enable the <a href="https://developers.cloudflare.com/waf/detections/ai-security-for-apps/"><u>AI Security for Apps</u></a> feature within the WAF to automatically inspect inbound MCP traffic for prompt injection attempts, sensitive data leakage, and topic classification. Public facing MCPs are protected just as any other web API.  </p>
    <div>
      <h2>The future of MCP in the enterprise</h2>
      <a href="#the-future-of-mcp-in-the-enterprise">
        
      </a>
    </div>
    <p>We hope our experience, products, and reference architectures will be useful to other organizations as they continue along their own journey towards broad enterprise-wide adoption of MCP.</p><p>We’ve secured our own MCP workflows by: </p><ul><li><p>Offering our developers a templated framework for building and deploying remote MCP servers on our developer platform using Cloudflare Access for authentication</p></li><li><p>Ensuring secure, identity-based access to authorized MCP servers by connecting our entire workforce to MCP server portals</p></li><li><p>Controlling costs using AI Gateway to mediate access to the LLMs powering our workforce’s MCP clients, and using Code Mode in MCP server portals to reduce token consumption and context bloat</p></li><li><p><a href="https://developers.cloudflare.com/cloudflare-one/tutorials/detect-mcp-traffic-gateway-logs/"><u>Discovering</u></a> shadow MCP usage by Cloudflare Gateway </p></li></ul><p>For organizations advancing on their own enterprise MCP journeys, we recommend starting by putting your existing remote and third-party MCP servers behind <a href="https://developers.cloudflare.com/cloudflare-one/access-controls/ai-controls/mcp-portals/"><u> Cloudflare MCP server portals</u></a> and enabling Code Mode to start benefitting for cheaper, safer and simpler enterprise deployments of MCP.   </p><p><sub><i>Acknowledgements:  This reference architecture and blog represents this work of many people across many different roles and business units at Cloudflare. This is just a partial list of contributors: Ann Ming Samborski,  Kate Reznykova, Mike Nomitch, James Royal, Liam Reese, Yumna Moazzam, Simon Thorpe, Rian van der Merwe, Rajesh Bhatia, Ayush Thakur, Gonzalo Chavarri, Maddy Onyehara, and Haley Campbell.</i></sub></p><div>  </div><p></p> ]]></content:encoded>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Security]]></category>
            <category><![CDATA[Cloudflare One]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[MCP]]></category>
            <category><![CDATA[Cloudflare Access]]></category>
            <category><![CDATA[Cloudflare Gateway]]></category>
            <category><![CDATA[Agents Week]]></category>
            <guid isPermaLink="false">73AaroR7GH8sXdbfIV99Fl</guid>
            <dc:creator>Sharon Goldberg</dc:creator>
            <dc:creator>Matt Carey</dc:creator>
            <dc:creator>Ivan Anguiano</dc:creator>
        </item>
        <item>
            <title><![CDATA[Secure private networking for everyone: users, nodes, agents, Workers — introducing Cloudflare Mesh]]></title>
            <link>https://blog.cloudflare.com/mesh/</link>
            <pubDate>Tue, 14 Apr 2026 13:00:09 GMT</pubDate>
            <description><![CDATA[ Cloudflare Mesh provides secure, private network access for users, nodes, and autonomous AI agents. By integrating with Workers VPC, developers can now grant agents scoped access to private databases and APIs without manual tunnels.
 ]]></description>
            <content:encoded><![CDATA[ <p>AI agents have changed how teams think about private network access. Your coding agent needs to query a staging database. Your production agent needs to call an internal API. Your personal AI assistant needs to reach a service running on your home network. The clients are no longer just humans or services. They're <a href="https://www.cloudflare.com/learning/ai/what-is-agentic-ai/"><u>agents</u></a>, running autonomously, making requests you didn't explicitly approve, against infrastructure you need to keep secure.</p><p>Each of these workflows has the same underlying problem: agents need to reach private resources, but the tools for doing that were built for humans, not autonomous software. VPNs require interactive login. SSH tunnels require manual setup. Exposing services publicly is a security risk. And none of these approaches give you visibility into what the agent is actually doing once it's connected.</p><p>Today, <b>we're introducing Cloudflare Mesh</b> to connect your private networks together and provide secure access for your agents. We're also integrating Mesh with <a href="https://www.cloudflare.com/developer-platform/"><u>Cloudflare Developer Platform</u></a> so that <a href="https://www.cloudflare.com/developer-platform/products/workers/"><u>Workers</u></a>, <a href="https://www.cloudflare.com/developer-platform/products/durable-objects/"><u>Durable Objects</u></a>, and agents built with the Agents SDK can reach your private infrastructure directly.</p><p>If you’re using <a href="https://www.cloudflare.com/sase/"><u>Cloudflare One’s SASE and Zero Trust suite</u></a>, you already have access to Mesh. You don’t need a new technology paradigm to secure agentic workloads. You need a SASE that was built for the agentic era, and that’s Cloudflare One. Cloudflare Mesh is a new experience with a simpler setup that leverages the on-ramps you’re already familiar with: WARP Connector <i>(now called a Cloudflare Mesh node)</i> and WARP Client <i>(now called Cloudflare One Client)</i>. Together, these create a private network for human, developer, and agent traffic. Mesh is directly integrated into your existing Cloudflare One deployment. Your existing Gateway policies, Access rules, and device posture checks apply to Mesh traffic automatically.</p><p>If you're a developer who just wants private networking for your agents, services, and team, Mesh is where you start. Set it up in minutes, connect your networks, and secure your traffic. And because Mesh runs on the <a href="https://developers.cloudflare.com/cloudflare-one/"><u>Cloudflare One</u></a> platform, you can grow into more advanced capabilities over time: <a href="https://www.cloudflare.com/sase/products/gateway/"><u>Gateway</u></a> network, DNS, and HTTP policies for fine-grained traffic control, <a href="https://www.cloudflare.com/sase/use-cases/infrastructure-access/"><u>Access for Infrastructure</u></a> for SSH and RDP session management, <a href="https://www.cloudflare.com/sase/products/browser-isolation/"><u>Browser Isolation</u></a> for safe web access, <a href="https://developers.cloudflare.com/cloudflare-one/data-loss-prevention/"><u>DLP</u></a> to prevent sensitive data from leaving your network, and <a href="https://www.cloudflare.com/sase/products/casb/"><u>CASB</u></a> for SaaS security. You won’t have to plan for all of this on day one. You just don't have to migrate when you need it.</p>
    <div>
      <h2>New agentic workflows</h2>
      <a href="#new-agentic-workflows">
        
      </a>
    </div>
    <p>Private networking has always been about connecting clients to resources — SSH into a server, query a database, access an internal API. What's changed is who the clients are. A year ago, the answer was your developers and your services. Today, it's increasingly your agents.</p><p>This isn't theoretical. Look at the ecosystem: the explosion of <a href="https://blog.cloudflare.com/remote-model-context-protocol-servers-mcp/"><u>MCP (Model Context Protocol) servers</u></a> providing tool access, coding agents that need to read from private repos and databases, personal assistants running on home hardware. Each of these patterns assumes the agent can reach the resources it needs. When those resources are isolated in private networks, the agent is stuck. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/12RTZcOBkkKwmaRe5VYn9k/e1b00401bee274a7643a4dfd4139e2df/BLOG-3215_2.png" />
          </figure><p>This creates three workflows that are hard to secure today:</p><ol><li><p>Accessing a personal agent from a mobile device. You're running OpenClaw on a Mac mini at home. You want to reach it from your phone, your laptop at a coffee shop, or your work machine. But exposing it to the public Internet (even behind a password) can leave some gaps exposed. Your agent has shell access, file system access, and network access to your home network. One misconfiguration and anyone can reach it.</p></li><li><p>Letting a coding agent access your staging environment. You're using Claude Code, Cursor, or Codex on your laptop. You ask it to check deployment status, query analytics from a staging database, or read from an internal object store. But those services live in a private cloud VPC, so your agent can't reach them without exposing them to the Internet or tunneling your entire laptop into the VPC.</p></li><li><p>Connecting deployed agents to private services. You're building agents into your product using the <a href="https://developers.cloudflare.com/agents/"><u>Agents SDK</u></a> on Cloudflare Workers. Those agents need to call internal APIs, query databases, and access services that aren't on the public Internet. They need private access, but with scoped permissions, audit trails, and no credential leakage.</p></li></ol>
    <div>
      <h2>Cloudflare Mesh: one private network for users, nodes, and agents</h2>
      <a href="#cloudflare-mesh-one-private-network-for-users-nodes-and-agents">
        
      </a>
    </div>
    <p>Cloudflare Mesh is developer-friendly private networking. One lightweight connector, one binary, connects everything: your personal devices, your remote servers, your user endpoints. You don't need to install separate tools for each pattern. One connector on your network, and every access pattern works.</p><p>Once connected, devices in your private network can talk to each other over private IPs, routed through Cloudflare’s global network across 330+ cities giving you better reliability and control over your network.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/ENfLb0kr1Zesh7PnuCvDK/830afd93d49b658023a9ece55e0e505b/BLOG-3215_3.gif" />
          </figure><p>Now, with Mesh, a single solution can solve all of the agent scenarios we mentioned above:</p><ul><li><p>With <b>Cloudflare One Client for iOS</b> on your phone, you can securely connect your mobile devices to your local Mac mini running OpenClaw via a Mesh private network.</p></li><li><p>With <b>Cloudflare One Client for macOS</b> on your laptop, you can connect your laptop to your private network so your coding agents can reach staging databases or APIs and query them.</p></li><li><p>With <b>Mesh nodes</b> on your Linux servers, you can connect VPCs in external clouds together, letting agents access resources and MCPs in external private networks.</p></li></ul><p>Because Mesh is powered by <a href="https://developers.cloudflare.com/cloudflare-one/team-and-resources/devices/cloudflare-one-client/"><u>Cloudflare One Client</u></a>, every connection inherits the security controls of the Cloudflare One platform. Gateway policies apply to Mesh traffic. Device posture checks validate connecting devices. DNS filtering catches suspicious lookups. You get this without additional configuration: the same policies that protect your human traffic protect your agent traffic.</p>
    <div>
      <h2>Choosing between Mesh and Tunnel</h2>
      <a href="#choosing-between-mesh-and-tunnel">
        
      </a>
    </div>
    <p>With the introduction of Mesh, you might ask: when should I use Mesh instead of Tunnel? Both connect external networks privately to Cloudflare, but they serve different purposes. <a href="https://developers.cloudflare.com/cloudflare-one/networks/connectors/cloudflare-tunnel/"><u>Cloudflare Tunnel</u></a> is the ideal solution for unidirectional traffic, where Cloudflare proxies the traffic from the edge to specific private services (like a web server or a database). </p><p>Cloudflare Mesh, on the other hand, provides a full bidirectional, many-to-many network. Every device and node on your Mesh can access one another using their private IPs. An application or agent running in your network can discover and access any other resource on the Mesh without each resource needing its own Tunnel. </p>
    <div>
      <h2>Using the power of Cloudflare’s network</h2>
      <a href="#using-the-power-of-cloudflares-network">
        
      </a>
    </div>
    <p>Cloudflare Mesh gives you the benefits of a mesh network (resiliency, high scalability, low latency and high performance), but, by routing everything through Cloudflare, it resolves a key challenge of mesh networks: NAT traversal.</p><p>Most of the Internet is behind NAT (Network Address Translation). This mechanism allows an entire local network of devices to share a single public IP address by mapping traffic between public headers and private internal addresses. When two devices are behind NAT, direct connections can fail and traffic has to fall back to relay servers. If your relay infrastructure has limited points of presence, a meaningful fraction of your traffic hits those relays, adding latency and reducing reliability. And while it can be possible to self-host your own relay servers to compensate, that means taking on the burden of managing additional infrastructure just to connect your existing network.</p><p>Cloudflare Mesh takes a different approach. All Mesh traffic routes through Cloudflare's global network, the same infrastructure that serves traffic for some of the largest websites of the Internet. For cross-region or multi-cloud traffic, this consistently beats public Internet routing. There's no degraded fallback path, because the Cloudflare edge is the path.</p><p>Routing through Cloudflare also means every packet passes through Cloudflare's security stack. This is the key advantage of building Mesh on the Cloudflare One platform: security isn't a separate product you bolt on later. And by leveraging this same global backbone, we can provide these core pillars to every team from day one:</p><p><b>50 nodes and 50 users free. </b>Your whole team and your whole staging environment on one private network, included with every Cloudflare account. </p><p><b>Global edge routing.</b> 330+ cities, optimized backbone routing. No relay servers with limited points of presence. No degraded fallback paths.</p><p><b>Security controls from day one.</b> Mesh runs on Cloudflare One. Gateway policies, DNS filtering, DLP, traffic inspection, and device posture checks are all available on the same platform. Start with simple private connectivity. Turn on Gateway policies when you need traffic filtering. Enable Access for Infrastructure when you need session-level controls for SSH and RDP. Add DLP when you need to prevent sensitive data from leaving your network. Every capability is one toggle away.</p><p><b>High availability.</b> Create a Mesh node with high availability enabled and spin up multiple connectors using the same token in active-passive mode. They advertise the same IP routes, so if one goes down, traffic fails over automatically.</p>
    <div>
      <h2>Integrated with the Developer Platform with Workers VPC</h2>
      <a href="#integrated-with-the-developer-platform-with-workers-vpc">
        
      </a>
    </div>
    <p>Mesh connects your agents and resources across external clouds, but you also need to be able to connect from your agents built on Workers with Agents SDK as well. To enable this, we’ve extended <a href="https://blog.cloudflare.com/workers-vpc-open-beta/"><u>Workers VPC</u></a> to make your entire Mesh network accessible to Workers and Durable Objects.</p><p>That means that you can connect to your Cloudflare Mesh network from Workers, making the entire network accessible from a single binding’s <code>fetch()</code> call. This complements Workers VPC’s existing support for Cloudflare Tunnel, giving you more choice over how you want to secure your networks. Now, you can specify entire networks that you want to connect to in your <code>wrangler.jsonc</code> file. To bind to your Mesh network, use the <code>cf1:network</code> reserved keyword that binds to the Mesh network of your account:</p>
            <pre><code>"vpc_networks": [
  { "binding": "MESH", "network_id": "cf1:network", "remote": true },
  { "binding": "AWS_VPC", "tunnel_id": "350fd307-...", "remote": true }
]
</code></pre>
            <p>Then, you can use it within your Worker or agent code:</p>
            <pre><code>export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext) {
    // Reach any internal host on your Mesh, no pre-registration required
    const apiResponse = await env.MESH.fetch("http://10.0.1.50/api/data");

    // Internal hostname resolved via tunnel's private DNS resolver
    const dbResponse = await env.AWS_VPC.fetch("http://internal-db.corp.local:5432");

    return new Response(await apiResponse.text());
  },
};
</code></pre>
            <p>By connecting the Developer Platform to your Mesh networks, you can build Workers that have secure access to your private databases, internal APIs and MCPs, allowing you to build cross-cloud agents and MCPs that provide agentic capabilities to your app. But it also opens up a world where agents can autonomously observe your entire stack end-to-end, cross-reference logs and suggest optimizations in real-time.</p>
    <div>
      <h2>How it all fits together</h2>
      <a href="#how-it-all-fits-together">
        
      </a>
    </div>
    <p>Together, Cloudflare Mesh, Workers VPC, and the Agents SDK provide a unified private network for your agents that spans both Cloudflare and your external clouds. We’ve merged connectivity and compute so your agents can securely reach the resources they need, wherever they live, across the globe.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6tYFghWEN1H3jIwV0Kejc9/04dcaedda5692a9726a7081d19bc413c/BLOG-3215_4.png" />
          </figure><p><b>Mesh nodes</b> are your servers, VMs, and containers. They run a headless version of Cloudflare One Client and get a Mesh IP. Services talk to services over private IPs, bidirectionally, routed through Cloudflare's edge. </p><p><b>Devices</b> are your laptops and phones. They run the Cloudflare One Client and reach Mesh nodes directly: SSH, database queries, API calls, all over private IPs. Your local coding agents use this connection to access private resources. </p><p><b>Agents on Workers</b> reach private services through Workers VPC Network bindings. They get scoped access to entire networks, mediated by MCP. The network enforces what the agent can reach. The MCP server enforces what the agent can do. </p>
    <div>
      <h2>What’s next</h2>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>The current version of Mesh provides the foundation for secure, unified connectivity. But as agentic workflows become more complex, we’re focused on moving beyond simple connectivity toward a network that is more intuitive to manage and more granularly aware of who, or what, is talking to your services. Here is what we are building for the rest of the year.</p>
    <div>
      <h4>Hostname routing</h4>
      <a href="#hostname-routing">
        
      </a>
    </div>
    <p>We're extending Cloudflare Tunnel's <a href="https://blog.cloudflare.com/tunnel-hostname-routing/"><u>hostname routing</u></a> to Mesh this summer. Your Mesh nodes will be able to attract traffic for private hostnames like <code>wiki.local</code> or <code>api.staging.internal</code>, without you having to manage IP lists or worry about how those hostnames resolve on the Cloudflare edge. Route traffic to services by name, not by IP. If your infrastructure uses dynamic IPs, auto-scaling groups, or ephemeral containers, this removes an entire class of routing headaches.</p>
    <div>
      <h4>Mesh DNS</h4>
      <a href="#mesh-dns">
        
      </a>
    </div>
    <p>Today, you reach Mesh nodes by their Mesh IPs: <code>ssh 100.64.0.5</code>. That works, but it's not how you think about your infrastructure. You think in names: <code>postgres-staging</code>, <code>api-prod</code>, <code>nikitas-openclaw</code>.</p><p>Later this year we're building Mesh DNS so that every node and device that joins your Mesh automatically gets a routable internal hostname. No DNS configuration or manual records. Add a node named <code>postgres-staging</code>, and <code>postgres-staging.mesh</code> resolves to the right Mesh IP from any device on your Mesh.</p><p>Combined with hostname routing, you'll be able to <code>ssh postgres-staging.mesh</code> or <code>curl http://api-prod.mesh:3000/health</code> without ever knowing or managing an IP address.</p>
    <div>
      <h4>Identity-aware routing</h4>
      <a href="#identity-aware-routing">
        
      </a>
    </div>
    <p>Today, Mesh nodes authenticate to the Cloudflare edge, but they share an identity at the network layer. Devices authenticate with user identity via the Cloudflare One Client, but nodes don't yet carry distinct, routable identities that Gateway policies can differentiate.</p><p>We want to change that. The goal is identity-aware routing for Mesh, where each node, each device, and eventually each agent gets a distinct identity that policies can evaluate. Instead of writing rules based on IP ranges, you write rules based on who or what is connecting.</p><p>This matters most for agents. Today, when an agent running on Workers calls a tool through a VPC binding, the target service sees a Worker making a request. It doesn't know which agent is calling, who authorized it, or what scope was granted. On the Mesh side, when a local coding agent on your laptop reaches a staging service, Gateway sees your device identity but not the agent's.</p><p>We're working toward a model where agents carry their own identity through the network:</p><ul><li><p>Principal / Sponsor: The human who authorized the action (Nikita from the platform team)</p></li><li><p>Agent: The AI system performing it (the deployment assistant, session #abc123)</p></li><li><p>Scope: What the agent is allowed to do (read deployments, trigger rollbacks, nothing else)</p></li></ul><p>This would let you write policies like: reads from Nikita's agents are allowed, but writes require Nikita directly. Agent traffic can be filtered independently from human traffic. An agent's network access can be revoked without touching Nikita's.</p><p>The infrastructure for this is in place. Mesh nodes provision with per-node tokens, devices authenticate with per-user identity, and Workers VPC bindings scope per-service access. The missing piece is making these identities visible to the policy layer so Gateway can make routing and access decisions based on them. That's what we're building.</p>
    <div>
      <h4>Mesh in containers</h4>
      <a href="#mesh-in-containers">
        
      </a>
    </div>
    <p>Today, Mesh nodes run on VMs and bare-metal Linux servers. But modern infrastructure increasingly runs in containers: Kubernetes pods, Docker Compose stacks, ephemeral CI/CD runners. We're building a Mesh Docker image that lets you add a Mesh node to any containerized environment.</p><p>This means you'll be able to include a Mesh sidecar in your Docker Compose stack and give every service in that stack private network access. A microservice running in a container in your staging cluster could reach a database in your production VPC over Mesh, without either service needing a public endpoint. </p><p>It is also useful for CI/CD pipelines that can access private infrastructure during builds and tests: your GitHub Actions runner pulls the Mesh container image, joins your network, runs integration tests against your staging environment, and tears down. All without VPN credentials to manage or persistent tunnels to maintain: the node disappears when the container exits.</p><p>We expect the Mesh Docker image to be available later this year.</p>
    <div>
      <h2>Get started</h2>
      <a href="#get-started">
        
      </a>
    </div>
    <p>While we continue to evolve these identity and routing capabilities, the foundation for secure, unified networking is available today. You can start bridging your clouds and securing your agents in just a few minutes.</p><p><b>Get started Cloudflare Mesh</b>: Head to Networking &gt; Mesh in the <a href="https://dash.cloudflare.com/?to=/:account/mesh"><u>Cloudflare dashboard</u></a>. Free for up to 50 nodes and 50 users.</p><p><b>Build agents with Agents SDK and Workers VPC: </b>Install the Agents SDK (`npm i agents`), follow the <a href="https://developers.cloudflare.com/workers-vpc/get-started/"><u>Workers VPC quickstart</u></a>, and build a <a href="https://developers.cloudflare.com/agents/guides/remote-mcp-server/"><u>remote MCP server</u></a> with private backend access.</p><p><b>Already on Cloudflare One?</b> Mesh works with your existing setup. Your Gateway policies, device posture checks, and access rules apply to Mesh traffic automatically. See <a href="https://developers.cloudflare.com/cloudflare-one/networks/connectors/cloudflare-mesh/"><u>the Mesh documentation</u></a> to add your first node.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4KMBRDqq5wBjdLHyhM3LNW/98064f81453b6ff29d0bf4b86cfc251a/image1.png" />
          </figure>
    <div>
      <h4>Watch on Cloudflare TV</h4>
      <a href="#watch-on-cloudflare-tv">
        
      </a>
    </div>
    <div>
  
</div>
<p></p> ]]></content:encoded>
            <category><![CDATA[Agents Week]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Workers AI]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Zero Trust]]></category>
            <category><![CDATA[Cloudflare One]]></category>
            <category><![CDATA[SASE]]></category>
            <guid isPermaLink="false">iAIJH3zvDaXoiDMugrwOv</guid>
            <dc:creator>Nikita Cano</dc:creator>
            <dc:creator>Thomas Gauvin</dc:creator>
        </item>
    </channel>
</rss>