
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
    <channel>
        <title><![CDATA[ The Cloudflare Blog ]]></title>
        <description><![CDATA[ Get the latest news on how products at Cloudflare are built, technologies used, and join the teams helping to build a better Internet. ]]></description>
        <link>https://blog.cloudflare.com</link>
        <atom:link href="https://blog.cloudflare.com/" rel="self" type="application/rss+xml"/>
        <language>en-us</language>
        <image>
            <url>https://blog.cloudflare.com/favicon.png</url>
            <title>The Cloudflare Blog</title>
            <link>https://blog.cloudflare.com</link>
        </image>
        <lastBuildDate>Wed, 08 Apr 2026 14:16:05 GMT</lastBuildDate>
        <item>
            <title><![CDATA[How we use Abstract Syntax Trees (ASTs) to turn Workflows code into visual diagrams ]]></title>
            <link>https://blog.cloudflare.com/workflow-diagrams/</link>
            <pubDate>Fri, 27 Mar 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ Workflows are now visualized via step diagrams in the dashboard. Here’s how we translate your TypeScript code into a visual representation of the workflow.  ]]></description>
            <content:encoded><![CDATA[ <p><a href="https://www.cloudflare.com/developer-platform/products/workflows/"><u>Cloudflare Workflows</u></a> is a durable execution engine that lets you chain steps, retry on failure, and persist state across long-running processes. Developers use Workflows to power background agents, manage data pipelines, build human-in-the-loop approval systems, and more.</p><p>Last month, we <a href="https://developers.cloudflare.com/changelog/post/2026-02-03-workflows-visualizer/"><u>announced</u></a> that every workflow deployed to Cloudflare now has a complete visual diagram in the dashboard.</p><p>We built this because being able to visualize your applications is more important now than ever before. Coding agents are writing code that you may or may not be reading. However, the shape of what gets built still matters: how the steps connect, where they branch, and what's actually happening.</p><p>If you've seen diagrams from visual workflow builders before, those are usually working from something declarative: JSON configs, YAML, drag-and-drop. However, Cloudflare Workflows are just code. They can include <a href="https://developers.cloudflare.com/workflows/build/workers-api/"><u>Promises, Promise.all, loops, conditionals,</u></a> and/or be nested in functions or classes. This dynamic execution model makes rendering a diagram a bit more complicated.</p><p>We use Abstract Syntax Trees (ASTs) to statically derive the graph, tracking <code>Promise</code> and <code>await</code> relationships to understand what runs in parallel, what blocks, and how the pieces connect. </p><p>Keep reading to learn how we built these diagrams, or deploy your first workflow and see the diagram for yourself.</p><a href="https://deploy.workers.cloudflare.com/?url=https://github.com/cloudflare/templates/tree/main/workflows-starter-template"><img src="https://deploy.workers.cloudflare.com/button" /></a>
<p>Here’s an example of a diagram generated from Cloudflare Workflows code:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/44NnbqiNda2vgzIEneHQ3W/044856325693fbeb75ed1ab38b4db1c2/image1.png" />
          </figure>
    <div>
      <h3>Dynamic workflow execution</h3>
      <a href="#dynamic-workflow-execution">
        
      </a>
    </div>
    <p>Generally, workflow engines can execute according to either dynamic or sequential (static) execution order. Sequential execution might seem like the more intuitive solution: trigger workflow → step A → step B → step C, where step B starts executing immediately after the engine completes step A, and so forth.</p><p><a href="https://developers.cloudflare.com/workflows/"><u>Cloudflare Workflows</u></a> follow the dynamic execution model. Since workflows are just code, the steps execute as the runtime encounters them. When the runtime discovers a step, that step gets passed over to the workflow engine, which manages its execution. The steps are not inherently sequential unless awaited — the engine executes all unawaited steps in parallel. This way, you can write your workflow code as flow control without additional wrappers or directives. Here’s how the handoff works:</p><ol><li><p>An <i>engine</i>, which is a “supervisor” Durable Object for that instance, spins up. The engine is responsible for the logic of the actual workflow execution. </p></li><li><p>The engine triggers a <a href="https://developers.cloudflare.com/cloudflare-for-platforms/workers-for-platforms/how-workers-for-platforms-works/#user-workers"><u>user worker</u></a> via <a href="https://developers.cloudflare.com/cloudflare-for-platforms/workers-for-platforms/configuration/dynamic-dispatch/"><u>dynamic dispatch</u></a>, passing control over to the Workers runtime.</p></li><li><p>When the runtime encounters a <code>step.do</code>, it passes the execution back to the engine.</p></li><li><p>The engine executes the step, persists the result (or throws an error, if applicable), and triggers the user Worker again.</p></li></ol><p>With this architecture, the engine does not inherently “know” the order of the steps that it is executing — but for a diagram, the order of steps becomes crucial information. The challenge lies in translating the vast majority of workflows accurately into a diagnostically helpful graph; while the diagrams are in beta, we will continue to iterate on and improve these representations.</p>
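<p>The execution model above can be sketched with a mock <code>step.do</code> (an illustration, not the real engine), showing that unawaited steps start concurrently while an awaited step only starts after prior work resolves:</p>

```typescript
// A minimal sketch (a mock, not the real engine): record when each step
// starts and finishes to show that unawaited steps run concurrently,
// while an awaited step only starts after the prior work resolves.
const order: string[] = [];

const step = {
  async do(name: string, fn: () => Promise<string>): Promise<string> {
    order.push(`start:${name}`);
    const result = await fn();
    order.push(`done:${name}`);
    return result;
  },
};

async function run() {
  // Unawaited: a and b both start before either finishes.
  const a = step.do("a", async () => "a");
  const b = step.do("b", async () => "b");
  await Promise.all([a, b]);
  // Awaited: c starts only after a and b have resolved.
  await step.do("c", async () => "c");
}

await run();
console.log(order);
// → ["start:a", "start:b", "done:a", "done:b", "start:c", "done:c"]
```

<p>The same ordering behavior is what the real engine observes: nothing about the code declares sequencing explicitly, so it must be inferred from <code>await</code> relationships.</p>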
    <div>
      <h3>Parsing the code</h3>
      <a href="#parsing-the-code">
        
      </a>
    </div>
    <p>Fetching the script at <a href="https://developers.cloudflare.com/workers/get-started/guide/#4-deploy-your-project"><u>deploy time</u></a>, instead of run time, allows us to parse the workflow in its entirety to statically generate the diagram. </p><p>Taking a step back, here is the life of a workflow deployment:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1zoOCYji26ahxzh594VavQ/63ad96ae033653ffc7fd98df01ea6e27/image5.png" />
          </figure><p>To create the diagram, we fetch the script after it has been bundled by the internal configuration service that deploys Workers (step 2 under Workflow deployment). Then, we use a parser to create an abstract syntax tree (AST) representing the workflow, and our internal service generates and traverses an intermediate graph with all WorkflowEntrypoints and calls to workflow steps. Our API then renders the diagram from the final result. </p><p>When a Worker is deployed, the configuration service bundles (using <a href="https://esbuild.github.io/"><u>esbuild</u></a> by default) and minifies the code <a href="https://developers.cloudflare.com/workers/wrangler/configuration/#inheritable-keys"><u>unless specified otherwise</u></a>. This presents another challenge — while Workflows in TypeScript follow an intuitive pattern, their minified JavaScript (JS) can be dense and indigestible. The code can also be minified in different ways, depending on the bundler. </p><p>Here’s an example of Workflow code that shows <b>agents executing in parallel</b>:</p>
            <pre><code>const summaryPromise = step.do(
         `summary agent (loop ${loop})`,
         async () =&gt; {
           return runAgentPrompt(
             this.env,
             SUMMARY_SYSTEM,
             buildReviewPrompt(
               'Summarize this text in 5 bullet points.',
               draft,
               input.context
             )
           );
         }
       );
        const correctnessPromise = step.do(
         `correctness agent (loop ${loop})`,
         async () =&gt; {
           return runAgentPrompt(
             this.env,
             CORRECTNESS_SYSTEM,
             buildReviewPrompt(
               'List correctness issues and suggested fixes.',
               draft,
               input.context
             )
           );
         }
       );
        const clarityPromise = step.do(
         `clarity agent (loop ${loop})`,
         async () =&gt; {
           return runAgentPrompt(
             this.env,
             CLARITY_SYSTEM,
             buildReviewPrompt(
               'List clarity issues and suggested fixes.',
               draft,
               input.context
             )
           );
         }
       );</code></pre>
            <p>Bundling with <a href="https://rspack.rs/"><u>rspack</u></a>, a snippet of the minified code looks like this:</p>
            <pre><code>class pe extends e{async run(e,t){de("workflow.run.start",{instanceId:e.instanceId});const r=await t.do("validate payload",async()=&gt;{if(!e.payload.r2Key)throw new Error("r2Key is required");if(!e.payload.telegramChatId)throw new Error("telegramChatId is required");return{r2Key:e.payload.r2Key,telegramChatId:e.payload.telegramChatId,context:e.payload.context?.trim()}}),s=await t.do("load source document from r2",async()=&gt;{const e=await this.env.REVIEW_DOCUMENTS.get(r.r2Key);if(!e)throw new Error(`R2 object not found: ${r.r2Key}`);const t=(await e.text()).trim();if(!t)throw new Error("R2 object is empty");return t}),n=Number(this.env.MAX_REVIEW_LOOPS??"5"),o=this.env.RESPONSE_TIMEOUT??"7 days",a=async(s,i,c)=&gt;{if(s&gt;n)return le("workflow.loop.max_reached",{instanceId:e.instanceId,maxLoops:n}),await t.do("notify max loop reached",async()=&gt;{await se(this.env,r.telegramChatId,`Review stopped after ${n} loops for ${e.instanceId}. Start again if you still need revisions.`)}),{approved:!1,loops:n,finalText:i};const h=t.do(`summary agent (loop ${s})`,async()=&gt;te(this.env,"You summarize documents. Keep the output short, concrete, and factual.",ue("Summarize this text in 5 bullet points.",i,r.context)))...</code></pre>
            <p>Or, bundling with <a href="https://vite.dev/"><u>vite</u></a>, here is a minified snippet:</p>
            <pre><code>class ht extends pe {
  async run(e, r) {
    b("workflow.run.start", { instanceId: e.instanceId });
    const s = await r.do("validate payload", async () =&gt; {
      if (!e.payload.r2Key)
        throw new Error("r2Key is required");
      if (!e.payload.telegramChatId)
        throw new Error("telegramChatId is required");
      return {
        r2Key: e.payload.r2Key,
        telegramChatId: e.payload.telegramChatId,
        context: e.payload.context?.trim()
      };
    }), n = await r.do(
      "load source document from r2",
      async () =&gt; {
        const i = await this.env.REVIEW_DOCUMENTS.get(s.r2Key);
        if (!i)
          throw new Error(`R2 object not found: ${s.r2Key}`);
        const c = (await i.text()).trim();
        if (!c)
          throw new Error("R2 object is empty");
        return c;
      }
    ), o = Number(this.env.MAX_REVIEW_LOOPS ?? "5"), l = this.env.RESPONSE_TIMEOUT ?? "7 days", a = async (i, c, u) =&gt; {
      if (i &gt; o)
        return H("workflow.loop.max_reached", {
          instanceId: e.instanceId,
          maxLoops: o
        }), await r.do("notify max loop reached", async () =&gt; {
          await J(
            this.env,
            s.telegramChatId,
            `Review stopped after ${o} loops for ${e.instanceId}. Start again if you still need revisions.`
          );
        }), {
          approved: !1,
          loops: o,
          finalText: c
        };
      const h = r.do(
        `summary agent (loop ${i})`,
        async () =&gt; _(
          this.env,
          et,
          K(
            "Summarize this text in 5 bullet points.",
            c,
            s.context
          )
        )
      )...</code></pre>
            <p>Minified code can get pretty gnarly — and depending on the bundler, it can get gnarly in a bunch of different directions.</p><p>We needed a way to parse the various forms of minified code quickly and precisely. We decided <code>oxc-parser</code> from the <a href="https://oxc.rs/"><u>JavaScript Oxidation Compiler</u></a> (OXC) was perfect for the job. We first tested this idea by having a container running Rust. Every script ID was sent to a <a href="https://developers.cloudflare.com/queues/"><u>Cloudflare Queue</u></a>, after which messages were popped and sent to the container to process. Once we confirmed this approach worked, we moved to a Worker written in Rust. Workers supports running <a href="https://developers.cloudflare.com/workers/languages/rust/"><u>Rust via WebAssembly</u></a>, and the package was small enough to make this straightforward.</p><p>The Rust Worker is responsible for first converting the minified JS into AST node types, then converting the AST node types into the graphical version of the workflow that is rendered on the dashboard. To do this, we generate a graph of pre-defined <a href="https://developers.cloudflare.com/workflows/build/visualizer/"><u>node types</u></a> for each workflow and translate into our graph representation through a series of node mappings. </p>
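<p>To make the step-collection pass concrete, here is a toy version in TypeScript. The real implementation is in Rust on top of <code>oxc-parser</code>; the AST below is a hand-built, ESTree-style fragment used purely for illustration:</p>

```typescript
// A toy sketch (not the actual Rust/oxc implementation): given an
// ESTree-style AST, walk it recursively and collect the name of every
// `<something>.do("name", ...)` call, regardless of what identifier the
// step object was minified to.
function collectStepNames(node: any, names: string[] = []): string[] {
  if (Array.isArray(node)) {
    node.forEach((child) => collectStepNames(child, names));
    return names;
  }
  if (node === null || typeof node !== "object") return names;
  if (
    node.type === "CallExpression" &&
    node.callee?.type === "MemberExpression" &&
    node.callee.property?.name === "do" &&
    node.arguments?.[0]?.type === "Literal"
  ) {
    names.push(node.arguments[0].value); // first argument is the step name
  }
  for (const key of Object.keys(node)) {
    if (key !== "type") collectStepNames(node[key], names);
  }
  return names;
}

// Hand-built AST fragment for: t.do("validate payload", async () => {})
const ast = {
  type: "ExpressionStatement",
  expression: {
    type: "CallExpression",
    callee: {
      type: "MemberExpression",
      object: { type: "Identifier", name: "t" },
      property: { type: "Identifier", name: "do" },
    },
    arguments: [
      { type: "Literal", value: "validate payload" },
      { type: "ArrowFunctionExpression", params: [], body: { type: "BlockStatement", body: [] } },
    ],
  },
};

console.log(collectStepNames(ast)); // → ["validate payload"]
```

<p>Matching on the member-expression shape rather than on a variable name is what makes this robust to minification: <code>t.do(...)</code>, <code>r.do(...)</code>, and <code>step.do(...)</code> all collapse to the same pattern.</p>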
    <div>
      <h3>Rendering the diagram</h3>
      <a href="#rendering-the-diagram">
        
      </a>
    </div>
    <p>There were two challenges to rendering a diagram version of the workflow: how to track step and function relationships correctly, and how to define the workflow node types as simply as possible while covering all the surface area.</p><p>To guarantee that step and function relationships are tracked correctly, we needed to collect both the function and step names. As we discussed earlier, the engine only has information about the steps, but a step may be dependent on a function, or vice versa. For example, developers might wrap steps in functions or define functions as steps. They could also call steps imported from different <a href="https://blog.cloudflare.com/workers-javascript-modules/"><u>modules</u></a> within a function, or rename steps. </p><p>Although the library passes the initial hurdle by giving us the AST, we still have to decide how to parse it. Some code patterns require additional creativity. Take functions: within a <code>WorkflowEntrypoint</code>, there can be functions that call steps directly, indirectly, or not at all. Consider <code>functionA</code>, which contains <code>console.log(await functionB(), await functionC())</code>, where <code>functionB</code> calls a <code>step.do()</code>. In that case, both <code>functionA</code> and <code>functionB</code> should be included on the workflow diagram; however, <code>functionC</code> should not. To catch all functions that include direct and indirect step calls, we create a subgraph for each function and check whether it contains a step call itself or whether it calls another function that might. Those subgraphs are represented by a function node, which contains all of its relevant nodes. If a function node is a leaf of the graph, meaning it has no direct or indirect workflow steps within it, it is trimmed from the final output.</p><p>We check for other patterns as well, including static lists of steps from which we can infer the workflow diagram, and variables that can be defined in up to ten different ways. If your script contains multiple workflows, we follow a similar pattern to the subgraphs created for functions, abstracted one level higher. </p><p>For every AST node type, we had to consider every way it could be used inside a workflow: loops, branches, promises, parallels, awaits, arrow functions… the list goes on. Even within these paths, there are dozens of possibilities. Consider just a few of the possible ways to loop:</p>
            <pre><code>// for...of
for (const item of items) {
	await step.do(`process ${item}`, async () =&gt; item);
}
// while
while (shouldContinue) {
	await step.do('poll', async () =&gt; getStatus());
}
// map
await Promise.all(
	items.map((item) =&gt; step.do(`map ${item}`, async () =&gt; item)),
);
// forEach
await items.forEach(async (item) =&gt; {
	await step.do(`each ${item}`, async () =&gt; item);
});</code></pre>
            <p>And beyond looping, how to handle branching:</p>
            <pre><code>// switch / case
switch (action.type) {
	case 'create':
		await step.do('handle create', async () =&gt; {});
		break;
	default:
		await step.do('handle unknown', async () =&gt; {});
		break;
}

// if / else if / else
if (status === 'pending') {
	await step.do('pending path', async () =&gt; {});
} else if (status === 'active') {
	await step.do('active path', async () =&gt; {});
} else {
	await step.do('fallback path', async () =&gt; {});
}

// ternary operator
await (cond
	? step.do('ternary true branch', async () =&gt; {})
	: step.do('ternary false branch', async () =&gt; {}));

// nullish coalescing with step on RHS
const myStepResult =
	variableThatCanBeNullUndefined ??
	(await step.do('nullish fallback step', async () =&gt; 'default'));

// try/catch with finally
try {
	await step.do('try step', async () =&gt; {});
} catch (_e) {
	await step.do('catch step', async () =&gt; {});
} finally {
	await step.do('finally step', async () =&gt; {});
}</code></pre>
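<p>Stepping back to the function-trimming rule described earlier (the <code>functionA</code>/<code>functionB</code>/<code>functionC</code> example): it amounts to a reachability check over a call graph. Here is a simplified sketch, not the actual implementation:</p>

```typescript
// A simplified sketch (not the actual implementation): a function is kept
// in the diagram only if it calls a step directly, or reaches a function
// that does through its call graph; step-free leaves are trimmed.
type FnInfo = { callsStep: boolean; calls: string[] };

function keepsStep(
  name: string,
  fns: Record<string, FnInfo>,
  seen: Set<string> = new Set()
): boolean {
  if (seen.has(name)) return false; // guard against recursive call cycles
  seen.add(name);
  const fn = fns[name];
  if (!fn) return false;
  if (fn.callsStep) return true;
  return fn.calls.some((callee) => keepsStep(callee, fns, seen));
}

// Mirrors the functionA/functionB/functionC example from earlier:
const fns: Record<string, FnInfo> = {
  functionA: { callsStep: false, calls: ["functionB", "functionC"] },
  functionB: { callsStep: true, calls: [] },  // calls step.do() directly
  functionC: { callsStep: false, calls: [] }, // no steps anywhere: trimmed
};

const kept = Object.keys(fns).filter((name) => keepsStep(name, fns));
console.log(kept); // → ["functionA", "functionB"]
```
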
            <p>Our goal was to create a concise API that communicated what developers need to know without overcomplicating it. But converting a workflow into a diagram meant accounting for every possible pattern (whether it follows best practices or not) and edge case. As we discussed earlier, by default no step is explicitly sequential to any other step. If a workflow does not use <code>await</code> and <code>Promise.all()</code>, we assume that the steps will execute in the order in which they are encountered. But if a workflow includes <code>await</code>, <code>Promise</code>, or <code>Promise.all()</code>, we need a way to track those relationships.</p><p>We decided on tracking execution order, where each node has a <code>starts:</code> and <code>resolves:</code> field. The <code>starts</code> and <code>resolves</code> indices tell us when a promise started executing and when it resolves, relative to the first promise that started without being immediately awaited. This correlates to vertical positioning in the diagram UI (i.e., all steps with <code>starts:1</code> will be inline). If steps are awaited when they are declared, then <code>starts</code> and <code>resolves</code> will be undefined, and the workflow will execute in the order in which the steps appear to the runtime.</p><p>While parsing, when we encounter an unawaited <code>Promise</code> or <code>Promise.all()</code>, that node (or those nodes) is marked with an entry number, surfaced in the <code>starts</code> field. If we encounter an <code>await</code> on that promise, the entry number is incremented by one and saved as the exit number (the value in <code>resolves</code>). This lets us know which promises run at the same time, and when they’ll complete in relation to each other.</p>
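<p>This counter can be sketched as a small state machine over parse events. The event stream below is a hypothetical stand-in for what the parser encounters, not its real input:</p>

```typescript
// A simplified model of the entry/exit counter: an unawaited promise is
// stamped with the current entry number (`starts`); awaiting it bumps the
// counter and records the result as `resolves`.
type ParseEvent = { kind: "promise" | "await"; id: string };

function assignOrder(events: ParseEvent[]) {
  let entry = 1;
  const marks: Record<string, { starts: number; resolves?: number }> = {};
  for (const ev of events) {
    if (ev.kind === "promise") {
      marks[ev.id] = { starts: entry }; // promise created, not yet awaited
    } else {
      entry += 1; // an await concludes this "column" of parallel work
      marks[ev.id].resolves = entry;
    }
  }
  return marks;
}

// Tasks a and b start together; c starts alongside them and is then awaited.
const marks = assignOrder([
  { kind: "promise", id: "a" },
  { kind: "promise", id: "b" },
  { kind: "promise", id: "c" },
  { kind: "await", id: "c" },
]);
console.log(marks);
// → a: starts 1; b: starts 1; c: starts 1, resolves 2
```
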
            <pre><code>export class ImplicitParallelWorkflow extends WorkflowEntrypoint&lt;Env, Params&gt; {
 async run(event: WorkflowEvent&lt;Params&gt;, step: WorkflowStep) {
   const branchA = async () =&gt; {
     const a = step.do("task a", async () =&gt; "a"); //starts 1
     const b = step.do("task b", async () =&gt; "b"); //starts 1
     const c = await step.waitForEvent("task c", { type: "my-event", timeout: "1 hour" }); //starts 1 resolves 2
     await step.do("task d", async () =&gt; JSON.stringify(c)); //starts 2 resolves 3
     return Promise.all([a, b]); //resolves 3
   };

   const branchB = async () =&gt; {
     const e = step.do("task e", async () =&gt; "e"); //starts 1
     const f = step.do("task f", async () =&gt; "f"); //starts 1
     return Promise.all([e, f]); //resolves 2
   };

   await Promise.all([branchA(), branchB()]);

   await step.sleep("final sleep", 1000);
 }
}</code></pre>
            <p>You can see the steps’ alignment in the diagram:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6EZJ38J3H55yH0OnT11vgg/6dde06725cd842725ee3af134b1505c0/image3.png" />
          </figure><p>After accounting for all of those patterns, we settled on the following list of node types: 	</p>
            <pre><code>| StepSleep
| StepDo
| StepWaitForEvent
| StepSleepUntil
| LoopNode
| ParallelNode
| TryNode
| BlockNode
| IfNode
| SwitchNode
| StartNode
| FunctionCall
| FunctionDef
| BreakNode;</code></pre>
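<p>In TypeScript terms, the list above behaves like a discriminated union. The shape below is illustrative only, with a few representative variants and fields modeled loosely on the API output samples that follow (not the exact schema):</p>

```typescript
// Illustrative node types only, modeled loosely on the API samples in this
// post; the real schema has more variants and fields.
type WorkflowNode =
  | { type: "step_do"; name: string; nodes: WorkflowNode[]; starts?: number; resolves?: number }
  | { type: "step_sleep"; name: string }
  | { type: "step_wait_for_event"; name: string; starts?: number; resolves?: number }
  | { type: "parallel"; kind: "all"; nodes: WorkflowNode[] }
  | { type: "if"; branches: { condition: string; nodes: WorkflowNode[] }[] };

// Two parallel-eligible steps grouped under a parallel node:
const node: WorkflowNode = {
  type: "parallel",
  kind: "all",
  nodes: [{ type: "step_do", name: "task a", nodes: [], starts: 1 }],
};
console.log(node.type); // → "parallel"
```
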
            <p>Here are a few samples of API output for different behaviors: </p><p><code>function</code> call:</p>
            <pre><code>{
  "functions": {
    "runLoop": {
      "name": "runLoop",
      "nodes": []
    }
  }
}</code></pre>
            <p><code>if</code> condition branching to <code>step.do</code>:</p>
            <pre><code>{
  "type": "if",
  "branches": [
    {
      "condition": "loop &gt; maxLoops",
      "nodes": [
        {
          "type": "step_do",
          "name": "notify max loop reached",
          "config": {
            "retries": {
              "limit": 5,
              "delay": 1000,
              "backoff": "exponential"
            },
            "timeout": 10000
          },
          "nodes": []
        }
      ]
    }
  ]
}</code></pre>
            <p><code>parallel</code> with <code>step.do</code> and <code>waitForEvent</code>:</p>
            <pre><code>{
  "type": "parallel",
  "kind": "all",
  "nodes": [
    {
      "type": "step_do",
      "name": "correctness agent (loop ${...})",
      "config": {
        "retries": {
          "limit": 5,
          "delay": 1000,
          "backoff": "exponential"
        },
        "timeout": 10000
      },
      "nodes": [],
      "starts": 1
    },
...
    {
      "type": "step_wait_for_event",
      "name": "wait for user response (loop ${...})",
      "options": {
        "event_type": "user-response",
        "timeout": "unknown"
      },
      "starts": 3,
      "resolves": 4
    }
  ]
}</code></pre>
            
    <div>
      <h3>What’s next</h3>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>Ultimately, the goal of these Workflow diagrams is to serve as a full-service debugging tool. That means you’ll be able to:</p><ul><li><p>Trace an execution through the graph in real time</p></li><li><p>Discover errors, wait for human-in-the-loop approvals, and skip steps for testing</p></li><li><p>Access visualizations in local development</p></li></ul><p>Check out the diagrams on your <a href="https://dash.cloudflare.com/?to=/:account/workers/workflows"><u>Workflow overview pages</u></a>. If you have any feature requests or notice any bugs, share your feedback directly with the Cloudflare team by joining the <a href="https://discord.cloudflare.com/"><u>Cloudflare Developers community on Discord</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[Workflows]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Developers]]></category>
            <guid isPermaLink="false">4HOWpzOgT3eVU2wFa4adFU</guid>
            <dc:creator>André Venceslau</dc:creator>
            <dc:creator>Mia Malden</dc:creator>
        </item>
        <item>
            <title><![CDATA[A closer look at Python Workflows, now in beta]]></title>
            <link>https://blog.cloudflare.com/python-workflows/</link>
            <pubDate>Mon, 10 Nov 2025 14:00:00 GMT</pubDate>
            <description><![CDATA[ Cloudflare Workflows, our durable execution engine for running multi-step applications, now supports Python. That means less friction, more possibilities, and another reason to build on Cloudflare. ]]></description>
            <content:encoded><![CDATA[ <p>Developers can <a href="https://blog.cloudflare.com/building-workflows-durable-execution-on-workers/"><u>already</u></a> use Cloudflare Workflows to build long-running, multi-step applications on Workers. Now, Python Workflows are here, meaning you can use your language of choice to orchestrate multi-step applications.</p><p>With <a href="https://developers.cloudflare.com/workflows/"><u>Workflows</u></a>, you can automate a sequence of idempotent steps in your application with built-in error handling and retry behavior. But Workflows were originally supported only in TypeScript. Since Python is the de facto language of choice for data pipelines, artificial intelligence/machine learning, and task automation – all of which heavily rely on orchestration – this created friction for many developers.</p><p>Over the years, we’ve been giving developers the tools to build these applications in Python, on Cloudflare. In 2020, we brought <a href="https://blog.cloudflare.com/cloudflare-workers-announces-broad-language-support/"><u>Python to Workers via Transcrypt</u></a> before directly integrating Python into <a href="https://github.com/cloudflare/workerd?cf_target_id=33101FA5C99A5BD54E7D452C9B282CD8"><u>workerd</u></a> in 2024. Earlier this year, we built support for <a href="https://developers.cloudflare.com/workers/languages/python/stdlib/"><u>CPython</u></a> along with <a href="https://pyodide.org/en/stable/usage/packages-in-pyodide.html"><u>any packages built in Pyodide</u></a>, like matplotlib and pandas, in Workers. Now, Python Workflows are supported as well, so developers can create robust applications using the language they know best.</p>
    <div>
      <h2>Why Python for Workflows?</h2>
      <a href="#why-python-for-workflows">
        
      </a>
    </div>
    <p>Imagine you’re training an <a href="https://www.cloudflare.com/learning/ai/what-is-large-language-model/"><u>LLM</u></a>. You need to label the dataset, feed data, wait for the model to run, evaluate the loss, adjust the model, and repeat. Without automation, you’d need to start each step, monitor manually until completion, and then start the next one. Instead, you could use a workflow to orchestrate the training of the model, triggering each step pending the completion of its predecessor. For any manual adjustments needed, like evaluating the loss and adjusting the model accordingly, you can implement a step that notifies you and waits for the necessary input.</p><p>Consider data pipelines, which are a top Python use case for ingesting and processing data. By automating the data pipeline through a defined set of idempotent steps, developers can deploy a workflow that handles the entire data pipeline for them.</p><p>Take another example: building <a href="https://www.cloudflare.com/learning/ai/what-is-agentic-ai/"><u>AI agents</u></a>, such as an agent to manage your groceries. Each week, you input your list of recipes, and the agent (1) compiles the list of necessary ingredients, (2) checks what ingredients you have left over from previous weeks, and (3) orders the differential for pickup from your local grocery store. 
Using a Workflow, this could look like:</p><ol><li><p><code>await step.wait_for_event()</code>: the user inputs the grocery list</p></li><li><p><code>step.do()</code>: compile the list of necessary ingredients</p></li><li><p><code>step.do()</code>: check the list of necessary ingredients against leftover ingredients</p></li><li><p><code>step.do()</code>: make an API call to place the order</p></li><li><p><code>step.do()</code>: proceed with payment</p></li></ol><p>Using workflows as a tool to <a href="https://agents.cloudflare.com/"><u>build agents on Cloudflare</u></a> can simplify agents’ architecture and improve their odds of reaching completion through individual step retries and state persistence. Support for Python Workflows means building agents with Python is easier than ever.</p>
    <div>
      <h3>How Python Workflows work</h3>
      <a href="#how-python-workflows-work">
        
      </a>
    </div>
    <p>Python Workflows use the same underlying infrastructure that we created for durable execution, while providing an idiomatic way for Python users to write their workflows. In addition, we aimed for complete feature parity between the JavaScript and Python SDKs. This is possible because Cloudflare Workers supports Python directly in the runtime itself. </p>
    <div>
      <h4>Creating a Python Workflow</h4>
      <a href="#creating-a-python-workflow">
        
      </a>
    </div>
    <p>Cloudflare Workflows are fully built on top of <a href="https://www.cloudflare.com/developer-platform/products/workers/"><u>Workers</u></a> and <a href="https://www.cloudflare.com/developer-platform/products/durable-objects/"><u>Durable Objects</u></a>. Each element plays a part in storing Workflow metadata and instance-level information. For more detail on how the Workflows platform works, <a href="https://blog.cloudflare.com/building-workflows-durable-execution-on-workers/"><u>check out this blog post</u></a>.</p><p>At the very bottom of the Workflows control plane sits the user Worker, which is the <code>WorkflowEntrypoint</code>. When the Workflow instance is ready to run, the Workflow engine will call into the <code>run</code> method of the user Worker via RPC, which in this case will be a Python Worker.</p><p>This is an example skeleton for a Workflow declaration, provided by the official documentation:</p>
            <pre><code>export class MyWorkflow extends WorkflowEntrypoint&lt;Env, Params&gt; {
  async run(event: WorkflowEvent&lt;Params&gt;, step: WorkflowStep) {
    // Steps here
  }
}</code></pre>
            <p>The <code>run</code> method, as illustrated above, provides a <a href="https://developers.cloudflare.com/workflows/build/workers-api/#workflowstep"><u>WorkflowStep</u></a> parameter that implements the durable execution APIs. This is what users rely on for at-most-once execution. These APIs are implemented in JavaScript and need to be accessed in the context of the Python Worker.</p><p>A <code>WorkflowStep</code> must cross the RPC barrier, meaning the engine (caller) exposes it as an <code>RpcTarget</code>. This setup allows the user's Workflow (callee) to substitute the parameter with a stub. This stub then enables the use of durable execution APIs for Workflows by RPCing back to the engine. To read more about RPC serialization and how functions can be passed between caller and callee, read the <a href="https://developers.cloudflare.com/workers/runtime-apis/rpc/"><u>remote procedure call documentation</u></a>.</p><p>All of this is true for both Python and JavaScript Workflows, since we don’t really change how the user Worker is called from the Workflows side. However, in the Python case, there is another barrier – language bridging between Python and the JavaScript module. When an RPC request targets a Python Worker, there is a JavaScript entrypoint module responsible for proxying the request to be handled by the Python script, and then returned to the caller. This process typically involves type translation before and after handling the request.</p>
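<p>The stub pattern can be illustrated with a plain <code>Proxy</code>. This is a simplification: real Workers RPC crosses an isolate boundary via an <code>RpcTarget</code> and handles serialization and stub lifetimes, while this sketch only mimics the call shape:</p>

```typescript
// Simplified illustration of the stub pattern: method calls on the stub are
// forwarded to a stand-in "engine" object. Real Workers RPC crosses an
// isolate boundary; this Proxy only mimics the call shape.
const engine = {
  log: [] as string[],
  async do(name: string): Promise<string> {
    engine.log.push(name); // the engine executes and persists the step
    return `ran ${name}`;
  },
};

// The callee never sees the engine itself, only this stub; every method
// call "RPCs" back to the caller's object.
const stepStub: any = new Proxy({}, {
  get: (_target, prop) => (...args: any[]) => (engine as any)[prop](...args),
});

const result = await stepStub.do("validate payload");
console.log(result, engine.log); // → "ran validate payload" ["validate payload"]
```
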
    <div>
      <h4>Overcoming the language barrier</h4>
      <a href="#overcoming-the-language-barrier">
        
      </a>
    </div>
    <p>Python workers rely on <a href="https://pyodide.org/en/stable/"><u>Pyodide</u></a>, which is a port of CPython to WebAssembly. Pyodide provides a foreign function interface (FFI) to JavaScript which allows for calling into JavaScript methods from Python. This is the mechanism that allows other bindings and Python packages to work within the Workers platform. Therefore, we use this FFI layer not only to allow using the Workflow binding directly, but also to provide <code>WorkflowStep</code> methods in Python. In other words, by considering that <code>WorkflowEntrypoint</code> is a special class for the runtime, the run method is manually wrapped so that <code>WorkflowStep</code> is exposed as a <a href="https://pyodide.org/en/stable/usage/api/python-api/ffi.html?cf_target_id=B32B42023AAEDEF833BCC2D9FD6096A3#pyodide.ffi.JsProxy"><u>JsProxy</u></a> instead of being type translated like other JavaScript objects. Moreover, by wrapping the APIs from the perspective of the user Worker, we allow ourselves to make some adjustments to the overall development experience, instead of simply exposing a JavaScript SDK to a different language with different semantics. </p>
    <div>
      <h4>Making the Python Workflows SDK Pythonic</h4>
      <a href="#making-the-python-workflows-sdk-pythonic">
        
      </a>
    </div>
    <p>A big part of porting Workflows to Python is exposing an interface that feels familiar and natural to Python users, just as our JavaScript APIs do for JavaScript developers. Let's take a step back and look at a snippet of a Workflow definition written in TypeScript.</p>
            <pre><code>import { WorkflowEntrypoint, WorkflowStep, WorkflowEvent } from 'cloudflare:workers';
 
export class MyWorkflow extends WorkflowEntrypoint {
    async run(event: WorkflowEvent&lt;YourEventType&gt;, step: WorkflowStep) {
        let state = await step.do("my first step", async () =&gt; {
            // Access your properties via event.payload
            let userEmail = event.payload.userEmail
            let createdTimestamp = event.payload.createdTimestamp
            return {"userEmail": userEmail, "createdTimestamp": createdTimestamp}
        })
 
        await step.sleep("my first sleep", "30 minutes");
 
        await step.waitForEvent&lt;EventType&gt;("receive example event", { type: "simple-event", timeout: "1 hour" })
 
        const developerWeek = Date.parse("22 Sept 2025 13:00:00 UTC");
        await step.sleepUntil("sleep until X times out", developerWeek)
    }
}</code></pre>
            <p>The Python implementation of the Workflows API requires modifying the <code>do</code> method. Unlike JavaScript, Python does not easily support multi-line anonymous callbacks. Instead, this pattern is achieved through <a href="https://www.w3schools.com/python/python_decorators.asp"><u>decorators</u></a>, which in this case allow us to intercept the method and expose it idiomatically. In other words, all parameters maintain their original order, with the decorated method serving as the callback.</p><p>The methods <code>waitForEvent</code>, <code>sleep</code>, and <code>sleepUntil</code> can retain their original signatures, as long as their names are converted to snake_case.</p><p>Here’s the corresponding Python version for the same workflow, achieving similar behavior:</p>
            <pre><code>from datetime import datetime, timezone

from workers import WorkflowEntrypoint
 
class MyWorkflow(WorkflowEntrypoint):
    async def run(self, event, step):
        @step.do("my first step")
        async def my_first_step():
            user_email = event["payload"]["userEmail"]
            created_timestamp = event["payload"]["createdTimestamp"]
            return {
                "userEmail": user_email,
                "createdTimestamp": created_timestamp,
            }
 
        await my_first_step()
 
        await step.sleep("my first sleep", "30 minutes")
 
        await step.wait_for_event(
            "receive example event",
            "simple-event",
            timeout="1 hour",
        )
 
        developer_week = datetime(2024, 10, 24, 13, 0, 0, tzinfo=timezone.utc)
        await step.sleep_until("sleep until X times out", developer_week)</code></pre>
            
    <div>
      <h4>DAG Workflows</h4>
      <a href="#dag-workflows">
        
      </a>
    </div>
    <p>When designing Workflows, we’re often managing dependencies between steps even when some of these tasks can be handled concurrently. Even though we’re not thinking about it, many Workflows have a directed acyclic graph (DAG) execution flow. Concurrency is achievable in the first iteration of Python Workflows (i.e., a minimal port to Python Workers) because Pyodide captures JavaScript thenables and proxies them into Python awaitables.</p><p>Consequently, <code>asyncio.gather</code> works as a counterpart to <code>Promise.all</code>. Although this is perfectly fine and ready to be used in the SDK, we also support a declarative approach.</p><p>One of the advantages of decorating the <code>do</code> method is that we can essentially provide further abstractions on the original API, and have them work on the entrypoint wrapper. Here’s an example of a Python API making use of the DAG capabilities introduced:</p>
            <pre><code>from workers import Response, WorkflowEntrypoint

class PythonWorkflowDAG(WorkflowEntrypoint):
    async def run(self, event, step):

        @step.do('dependency 1')
        async def dep_1():
            # does stuff
            print('executing dep1')

        @step.do('dependency 2')
        async def dep_2():
            # does stuff
            print('executing dep2')

        @step.do('demo do', depends=[dep_1, dep_2], concurrent=True)
        async def final_step(res1=None, res2=None):
            # does stuff
            print('something')

        await final_step()</code></pre>
            <p>This kind of approach makes the Workflow declaration much cleaner, leaving state management to the Workflows engine data plane, as well as the Python Workers Workflow wrapper. Note that even though multiple steps can share the same name, the engine slightly modifies each step's name to ensure uniqueness. In Python Workflows, a dependency is considered resolved once the initial step involving it has been successfully completed.</p>
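<p>For comparison with the declarative <code>depends</code> form above, the imperative route runs independent steps concurrently with <code>asyncio.gather</code>, just as <code>Promise.all</code> would in JavaScript. Here is a self-contained sketch with plain coroutines standing in for decorated steps (in a real Workflow each callback would be wrapped with <code>@step.do</code> and executed durably):</p>

```python
import asyncio

# Plain coroutines stand in for @step.do-decorated callbacks.
async def dep_1():
    await asyncio.sleep(0)  # simulate I/O-bound work
    return "dep1 done"

async def dep_2():
    await asyncio.sleep(0)
    return "dep2 done"

async def final_step(res1, res2):
    return f"combined: {res1}, {res2}"

async def main():
    # asyncio.gather awaits both coroutines concurrently, like Promise.all.
    res1, res2 = await asyncio.gather(dep_1(), dep_2())
    return await final_step(res1, res2)

print(asyncio.run(main()))  # -> combined: dep1 done, dep2 done
```

<p>The declarative <code>depends</code> API expresses the same DAG, but moves the gathering and ordering into the wrapper instead of the user's code.</p>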
    <div>
      <h3>Try it out</h3>
      <a href="#try-it-out">
        
      </a>
    </div>
    <p>Check out <a href="https://developers.cloudflare.com/workers/languages/python/"><u>writing Workers in Python</u></a> and <a href="https://developers.cloudflare.com/workflows/python/"><u>create your first Python Workflow</u></a> today! If you have any feature requests or notice any bugs, share your feedback directly with the Cloudflare team by joining the <a href="https://discord.cloudflare.com/"><u>Cloudflare Developers community on Discord</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[Workflows]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Python]]></category>
            <guid isPermaLink="false">JmiSM0vpXaKtbQJ49ehaB</guid>
            <dc:creator>Caio Nogueira</dc:creator>
            <dc:creator>Mia Malden</dc:creator>
        </item>
        <item>
            <title><![CDATA[Building a better testing experience for Workflows, our durable execution engine for multi-step applications]]></title>
            <link>https://blog.cloudflare.com/better-testing-for-workflows/</link>
            <pubDate>Tue, 04 Nov 2025 14:00:00 GMT</pubDate>
            <description><![CDATA[ End-to-end testing for Cloudflare Workflows was challenging. We're introducing first-class support for Workflows in cloudflare:test, enabling full introspection, mocking, and isolated, reliable tests for your most complex applications. ]]></description>
            <content:encoded><![CDATA[ <p><a href="https://www.cloudflare.com/developer-platform/products/workflows/"><u>Cloudflare Workflows</u></a> is our take on "Durable Execution." They provide a serverless engine, powered by the <a href="https://www.cloudflare.com/developer-platform/"><u>Cloudflare Developer Platform</u></a>, for building long-running, multi-step applications that persist through failures. When Workflows became <a href="https://blog.cloudflare.com/workflows-ga-production-ready-durable-execution/"><u>generally available</u></a> earlier this year, they allowed developers to orchestrate complex processes that would be difficult or impossible to manage with traditional stateless functions. Workflows handle state, retries, and long waits, allowing you to focus on your business logic.</p><p>However, complex orchestrations require robust testing to be reliable. Until now, testing Workflows was a black-box process. Although you could test whether a Workflow instance reached completion by <code>await</code>ing its status, there was no visibility into the intermediate steps. This made debugging really difficult. Did the payment processing step succeed? Did the confirmation email step receive the correct data? You couldn't be sure without inspecting external systems or logs. </p>
    <div>
      <h3>Why was this necessary?</h3>
      <a href="#why-was-this-necessary">
        
      </a>
    </div>
    <p>As developers ourselves, we understand the need to ensure reliable code, and we heard your feedback loud and clear: the developer experience for testing Workflows needed to be better.</p><p>The black box nature of testing was one part of the problem. Beyond that, though, the limited testing offered came at a high cost. If you added a workflow to your project, even if you weren't testing the workflow directly, you were required to disable isolated storage because we couldn't guarantee isolation between tests. Isolated storage is a vitest-pool-workers feature to guarantee that each test runs in a clean, predictable environment, free from the side effects of other tests. Being forced to have it disabled meant that state could leak between tests, leading to flaky, unpredictable, and hard-to-debug failures.</p><p>This created a difficult choice for developers building complex applications. If your project used <a href="https://www.cloudflare.com/developer-platform/products/workers/"><u>Workers</u></a>, <a href="https://www.cloudflare.com/developer-platform/products/durable-objects/"><u>Durable Objects</u></a>, and <a href="https://www.cloudflare.com/developer-platform/products/r2/"><u>R2</u></a> alongside Workflows, you had to either abandon isolated testing for your <i>entire project</i> or skip testing. This friction resulted in a poor testing experience, which in turn discouraged the adoption of Workflows. Solving this wasn't just an improvement, it was a critical <i>step</i> in making Workflows part of any well-tested Cloudflare application.</p>
    <div>
      <h3>Introducing isolated testing for Workflows</h3>
      <a href="#introducing-isolated-testing-for-workflows">
        
      </a>
    </div>
    <p>We're introducing a new set of APIs that enable comprehensive, granular, and isolated testing for your Workflows, all running locally and offline with <code>vitest-pool-workers</code>, our testing framework that supports running tests in the Workers runtime <code>workerd</code>. This enables fast, reliable, and cheap test runs that don't depend on a network connection.</p><p>They are available through the <code>cloudflare:test</code> module, with <code>@cloudflare/vitest-pool-workers</code> version <b>0.9.0</b> and above. The new test module provides two primary functions to introspect your Workflows:</p><ul><li><p><code>introspectWorkflowInstance</code>: useful for unit tests with known instance IDs</p></li><li><p><code>introspectWorkflow</code>: useful for integration tests where IDs are typically generated dynamically.</p></li></ul><p>Let's walk through a practical example.</p>
    <div>
      <h3>A practical example: testing a blog moderation workflow</h3>
      <a href="#a-practical-example-testing-a-blog-moderation-workflow">
        
      </a>
    </div>
    <p>Imagine a simple Workflow for moderating a blog. When a user submits a comment, the Workflow requests a review from Workers AI. Based on the violation score returned, it then waits for a moderator to approve or deny the comment. If approved, it calls a <code>step.do</code> to publish the comment via an external API.</p><p>Testing this without our new APIs would be impossible. You'd have no direct way to simulate the steps’ outcomes or the moderator's approval. Now, you can mock everything.</p><p>Here’s the test code using <code>introspectWorkflowInstance</code> with a known instance ID:</p>
            <pre><code>import { env, introspectWorkflowInstance } from "cloudflare:test";

it("should mock an ambiguous score, approve the comment, and complete", async () =&gt; {
   // CONFIG
   await using instance = await introspectWorkflowInstance(
       env.MODERATOR,
       "my-workflow-instance-id-123"
   );
   await instance.modify(async (m) =&gt; {
       await m.mockStepResult({ name: "AI content scan" }, { violationScore: 50 });
       await m.mockEvent({ 
           type: "moderation-approval", 
           payload: { action: "approved" },
       });
       await m.mockStepResult({ name: "publish comment" }, { status: "published" });
   });

   await env.MODERATOR.create({ id: "my-workflow-instance-id-123" });
   
   // ASSERTIONS
   expect(await instance.waitForStepResult({ name: "AI content scan" })).toEqual(
       { violationScore: 50 }
   );
   expect(
       await instance.waitForStepResult({ name: "publish comment" })
   ).toEqual({ status: "published" });

   await expect(instance.waitForStatus("complete")).resolves.not.toThrow();
});</code></pre>
            <p>This test mocks the outcomes of steps that require external API calls, such as the 'AI content scan', which calls <a href="https://www.cloudflare.com/developer-platform/products/workers-ai/"><u>Workers AI</u></a>, and the 'publish comment' step, which calls an external blog API.</p><p>If the instance ID is not known, for example because a Worker request starts one or more Workflow instances with randomly generated IDs, you can call <code>introspectWorkflow(env.MY_WORKFLOW)</code>. Here’s the test code for that scenario, where only one Workflow instance is created:</p>
            <pre><code>it("should mock a non-violation score and complete successfully", async () =&gt; {
   // CONFIG
   await using introspector = await introspectWorkflow(env.MODERATOR);
   await introspector.modifyAll(async (m) =&gt; {
       await m.disableSleeps();
       await m.mockStepResult({ name: "AI content scan" }, { violationScore: 0 });
   });

   await SELF.fetch(`https://mock-worker.local/moderate`);

   const instances = introspector.get();
   expect(instances.length).toBe(1);

   // ASSERTIONS
   const instance = instances[0];
   expect(await instance.waitForStepResult({ name: "AI content scan"  })).toEqual({ violationScore: 0 });
   await expect(instance.waitForStatus("complete")).resolves.not.toThrow();
});</code></pre>
            <p>Notice how in both examples we’re calling the introspectors with <code>await using</code> - this is the <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Resource_management#the_using_and_await_using_declarations"><u>Explicit Resource Management</u></a> syntax from modern JavaScript. It is crucial here because when the introspector objects go out of scope at the end of the test, their disposal methods are automatically called. This is how we ensure each test works with its own isolated storage.</p><p>The <code>modify</code> and <code>modifyAll</code> functions are the gateway to controlling instances. Inside their callbacks, you get access to a modifier object with methods to inject behavior, such as mocking step results, mocking events, and disabling sleeps.</p><p>You can find detailed documentation on the <a href="https://developers.cloudflare.com/workers/testing/vitest-integration/test-apis/#workflows"><u>Workers Cloudflare Docs</u></a>.</p><p><b>How we connected Vitest to the Workflows Engine</b></p><p>To understand the solution, you first need to understand the local architecture. When you run <code>wrangler dev</code>, your Workflows are powered by Miniflare, a simulator for testing Cloudflare Workers, and <code>workerd</code>. Each running workflow instance is backed by its own SQLite Durable Object, which we call the "Engine DO". This Engine DO is responsible for executing steps, persisting state, and managing the instance's lifecycle. It lives inside the local isolated Workers runtime.</p><p>Meanwhile, the Vitest test runner is a separate Node.js process living outside of <code>workerd</code>. This is why we have a custom Vitest pool, vitest-pool-workers, that allows tests to run inside <code>workerd</code>. vitest-pool-workers has a Runner Worker, a Worker that runs the tests with bindings to everything specified in the user's wrangler.json file. This Worker has access to the APIs under the “cloudflare:test” module. It communicates with Node.js through a special DO called the Runner Object via WebSocket/RPC.</p><p>The first approach we considered was to use the test Runner Worker. In its current state, the Runner Worker has access to bindings for the Workflows defined in the wrangler file. We considered also binding each Workflow's Engine DO namespace to this Runner Worker. This would give vitest-pool-workers direct access to the Engine DOs, making it possible to call Engine methods directly.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3ptKRqwpfvK1dxY6T5Kuin/fbf92915b2d2a95542bf6bec8addd5ad/image1.png" />
          </figure><p>While promising, this approach would have required undesirable changes to the core of Miniflare and vitest-pool-workers, making it too invasive for this single feature. </p><p>Firstly, we would have needed to add a new <i>unsafe</i> field to Miniflare's Durable Objects. Its sole purpose would be to specify the service name of our Engines, preventing Miniflare from applying its default user prefix, which would otherwise prevent the Durable Objects from being found.</p><p>Secondly, vitest-pool-workers would have been forced to bind every Engine DO from the Workflows in the project to its runner, even those not being tested. This would introduce unwanted bindings into the test environment, requiring an additional cleanup to ensure they were not exposed to the user's test environment.</p><p><b>The breakthrough</b></p><p>The solution is a combination of privileged local-only APIs and Remote Procedure Calls (RPC).</p><p>First, we added a set of <code>unsafe</code> functions to the <i>local</i> implementation of the Workflows binding, functions that are not available in the production environment. They act as a controlled access point, accessible from the test environment, allowing the test runner to get a stub to a specific Engine DO by providing its instance ID.</p><p>Once the test runner has this stub, it uses RPC to call specific, trusted methods on the Engine DO via a special <code>RpcTarget</code> called <code>WorkflowInstanceModifier</code>. When an instance of a class that extends <code>RpcTarget</code> is sent over RPC, it is replaced by a stub. Calling a method on this stub, in turn, makes an RPC back to the original object.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3AObAsJuBplii3aeqMw2bn/74b21880b09a293fef6f84de1ae1318e/image2.png" />
          </figure><p>This simpler approach is far less invasive because it's confined to the Workflows environment, which also ensures any future feature changes are safely isolated.</p><p><b>Introspecting Workflows with unknown IDs</b></p><p>When creating Workflows instances (either by <code>create()</code> or <code>createBatch()</code>) developers can provide a specific ID or have it automatically generated for them. This ID identifies the Workflow instance and is then used to create the associated Engine DO ID.</p><p>The logical starting point for implementation was <code>introspectWorkflowInstance(binding, instanceID)</code>, as the instance ID is known in advance. This allows us to generate the Engine DO ID required to identify the engine associated with that Workflow instance.</p><p>But often, one part of your application (like an HTTP endpoint) will create a Workflow instance with a randomly generated ID. How can we introspect an instance when we don't know its ID until after it's created?</p><p>The answer was to use a powerful feature of JavaScript: <code>Proxy</code> objects.</p><p>When you use <code>introspectWorkflow(binding)</code>, we wrap the Workflow binding in a Proxy. This proxy non-destructively intercepts all calls to the binding, specifically looking for <code>.create()</code> and <code>.createBatch()</code>. When your test triggers a workflow creation, the proxy inspects the call. It captures the instance ID — either one you provided or the random one generated — and immediately sets up the introspection on that ID, applying all the modifications you defined in the <code>modifyAll</code> call. The original creation call then proceeds as normal.</p>
            <pre><code>env[workflow] = new Proxy(env[workflow], {
  get(target, prop) {
    if (prop === "create") {
      return new Proxy(target.create, {
        async apply(_fn, _this, [opts = {}]) {

          // 1. Ensure an ID exists
          const optsWithId = "id" in opts ? opts : { id: crypto.randomUUID(), ...opts };

          // 2. Apply test modifications before creation
          await introspectAndModifyInstance(optsWithId.id);

          // 3. Call the original 'create' method
          return target.create(optsWithId);
        },
      });
    }

    // Same logic for createBatch(); everything else passes through
    return Reflect.get(target, prop);
  },
});</code></pre>
            <p>When the <code>await using</code> block from <code>introspectWorkflow()</code> finishes, or the <code>dispose()</code> method is called at the end of the test, the introspector is disposed of, and the proxy is removed, leaving the binding in its original state. It’s a low-impact approach that prioritizes developer experience and long-term maintainability.</p>
    <div>
      <h3>Get started with testing Workflows</h3>
      <a href="#get-started-with-testing-workflows">
        
      </a>
    </div>
    <p>Ready to add tests to your Workflows? Here’s how to get started:</p><ol><li><p><b>Update your dependencies:</b> Make sure you are using <code>@cloudflare/vitest-pool-workers</code> version <b>0.9.0</b> or newer. Run the following command in your project: <code>npm install @cloudflare/vitest-pool-workers@latest</code></p></li><li><p><b>Configure your test environment:</b> If you're new to testing on Workers, follow our <a href="https://developers.cloudflare.com/workers/testing/vitest-integration/write-your-first-test/"><u>guide to write your first test</u></a>.</p></li><li><p><b>Start writing tests:</b> Import <code>introspectWorkflowInstance</code> or <code>introspectWorkflow</code> from <code>cloudflare:test</code> in your test files and use the patterns shown in this post to mock, control, and assert on your Workflow's behavior. Also check out the official <a href="https://developers.cloudflare.com/workers/testing/vitest-integration/test-apis/#workflows"><u>API reference</u></a>.</p></li></ol> ]]></content:encoded>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Internship Experience]]></category>
            <category><![CDATA[Product News]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Workflows]]></category>
            <guid isPermaLink="false">5Kq3w0WQ8bFIvLmxsDpIjO</guid>
            <dc:creator>Olga Silva</dc:creator>
            <dc:creator>Mia Malden</dc:creator>
        </item>
        <item>
            <title><![CDATA[How we simplified NCMEC reporting with Cloudflare Workflows]]></title>
            <link>https://blog.cloudflare.com/simplifying-ncmec-reporting-with-cloudflare-workflows/</link>
            <pubDate>Fri, 11 Apr 2025 14:00:00 GMT</pubDate>
            <description><![CDATA[ We transitioned to Cloudflare Workflows to manage complex, multi-step processes more efficiently. This shift replaced our National Center for Missing & Exploited Children (NCMEC) reporting system. ]]></description>
            <content:encoded><![CDATA[ <p>Cloudflare plays a significant role in supporting the Internet’s infrastructure. <a href="https://w3techs.com/technologies/history_overview/proxy/all/q"><u>As a reverse proxy used by approximately 20% of all websites</u></a>, we sit directly in the request path between users and the origin, helping to improve performance, security, and reliability at scale. Beyond that, our global network powers services like <a href="https://www.cloudflare.com/en-gb/application-services/products/cdn/"><u>delivery</u></a>, <a href="https://workers.cloudflare.com/"><u>Workers</u></a>, and <a href="https://www.cloudflare.com/en-gb/developer-platform/products/r2/"><u>R2</u></a> — making Cloudflare not just a passive intermediary, but an active platform for delivering and hosting content across the Internet.</p><p>Since Cloudflare’s launch in 2010, we have collaborated with the National Center for Missing and Exploited Children (<a href="https://www.missingkids.org/home"><u>NCMEC</u></a>), a US-based clearinghouse for reporting child sexual abuse material (CSAM), and are committed to doing what we can to support identification and removal of CSAM content.</p><p>Members of the public, <a href="https://blog.cloudflare.com/cloudflares-response-to-csam-online/"><u>customers, and trusted organizations can submit reports</u></a> of abuse observed on Cloudflare’s network. A minority of these reports relate to CSAM, and those are triaged with the highest priority by Cloudflare’s Trust &amp; Safety team. We also forward details of the report, along with relevant files (where applicable) and supplemental information, to NCMEC.</p><p>The process to generate and submit reports to NCMEC involves multiple steps, dependencies, and error handling, which quickly became complex under our original queue-based architecture. 
In this blog post, we discuss how Cloudflare <a href="https://developers.cloudflare.com/workflows/"><u>Workflows</u></a> helped streamline this process and simplify the code behind it.</p>
    <div>
      <h2>Life before Cloudflare Workflows</h2>
      <a href="#life-before-cloudflare-workflows">
        
      </a>
    </div>
    <p>When we designed our latest NCMEC reporting system in early 2024, <a href="https://blog.cloudflare.com/building-workflows-durable-execution-on-workers/"><u>Cloudflare Workflows</u></a> did not exist yet. We used <a href="https://developers.cloudflare.com/queues/"><b><u>Queues</u></b></a>, the Workers platform's messaging service, to manage asynchronous tasks, and structured our system around them.</p><p>Our goal was to ensure reliability, fault tolerance, and automatic retries. However, without an orchestrator, we had to manually handle state, retries, and inter-queue messaging. While Queues worked, we needed something more explicit to help debug and observe the more complex asynchronous workflows we were building on top of the messaging system that Queues gave us.</p><p>In our queue-based architecture, each report would go through multiple steps:</p><ol><li><p><b>Validate input</b>: Ensure the report has all necessary details.</p></li><li><p><b>Initiate report</b>: Call the NCMEC API to create a report.</p></li><li><p><b>Fetch impounded files (if applicable)</b>: Retrieve files stored in R2.</p></li><li><p><b>Upload files</b>: Send files to NCMEC via API.</p></li><li><p><b>Finalize report</b>: Mark the report as completed.</p></li></ol>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7n99a6YkThlegGitE2i7iv/a53e70ac11e21025d436c27dce7aaf3a/image2.png" />
          </figure><p><sup><i>A diagram of our queue-based architecture </i></sup></p><p>Each of these steps was handled by a separate queue, and if an error occurred, the system would retry the message several times before marking the report as failed. But errors weren’t always straightforward — for instance, if an external API call consistently failed due to bad input or returned an unexpected response shape, retries wouldn’t help. In those cases, the report could get stuck in an intermediate state, and we’d often have to manually dig through logs across different queues to figure out what went wrong.</p><p>Even more frustrating, when handling failed reports, we relied on a "Reaper" — a cron job that ran every hour to resubmit failed reports. Since a report could fail at any step, the Reaper had to deduce which queue failed and send a message to begin reprocessing. This meant:</p><ul><li><p><b>Debugging was a nightmare</b>: Tracing the journey of a single report meant jumping between logs for multiple queues.</p></li><li><p><b>Retries were unreliable</b>: Some queues had retry logic, while others relied on the Reaper, leading to inconsistencies.</p></li><li><p><b>State management was painful</b>: We had no clear way to track whether a report was halfway through the pipeline or completely lost, except by looking through the logs.</p></li><li><p><b>Operational overhead was high</b>: Developers frequently had to manually inspect failed reports and resubmit them.</p></li></ul><p>Queues gave us a solid foundation for moving messages around, but it wasn’t meant to handle orchestration. What we’d really done was build a bunch of loosely connected steps on top of a message bus and hoped it would all hold together. It worked, for the most part, but it was clunky, hard to reason about, and easy to break. 
Just understanding how a single report moved through the system meant tracing messages across multiple queues and digging through logs.</p><p>We knew we needed something better: a way to define workflows explicitly, with clear visibility into where things were and what had failed. But back then, we didn’t have a good way to do that without bringing in heavyweight tools or writing a bunch of glue code ourselves. When Cloudflare Workflows came along, it felt like the missing piece, finally giving us a simple, reliable way to orchestrate everything without duct tape.</p>
    <div>
      <h2>The solution: Cloudflare Workflows</h2>
      <a href="#the-solution-cloudflare-workflows">
        
      </a>
    </div>
    <p>Once <a href="https://developers.cloudflare.com/workflows/"><u>Cloudflare Workflows</u></a> was <a href="https://blog.cloudflare.com/building-workflows-durable-execution-on-workers/"><u>announced</u></a>, we saw an immediate opportunity to replace our queue-based architecture with a more structured, observable, and retryable system. Instead of relying on a web of multiple queues passing messages to each other, we now have a single workflow that orchestrates the entire process from start to finish. Critically, if any step failed, the Workflow could pick back up from where it left off, without having to repeat earlier processing steps, re-parsing files, or duplicating uploads.</p><p>With Cloudflare Workflows, each report follows a clear sequence of steps:</p><ol><li><p><b>Creating the report</b>: The system validates the incoming report and initiates it with NCMEC.</p></li><li><p><b>Checking for impounded files</b>: If there are impounded files associated with the report, the workflow proceeds to file collection.</p></li><li><p><b>Gathering files</b>: The system retrieves impounded files stored in R2 and prepares them for upload.</p></li><li><p><b>Uploading files to NCMEC</b>: Each file is uploaded to NCMEC using their API, ensuring all relevant evidence is submitted.</p></li><li><p><b>Adding file metadata</b>: Metadata about the uploaded files (hashes, timestamps, etc.) is attached to the report.</p></li><li><p><b>Finalizing the report</b>: Once all files are processed, the report is finalized and marked as complete.</p></li></ol><p>Here’s a simplified version of the orchestrator:</p>
            <pre><code>import { WorkflowEntrypoint, WorkflowEvent, WorkflowStep } from 'cloudflare:workers';

export class ReportWorkflow extends WorkflowEntrypoint&lt;Env, ReportType&gt; {
  async run(event: WorkflowEvent&lt;ReportType&gt;, step: WorkflowStep) {
    const reportToCreate: ReportType = event.payload;
    let reportId: number | undefined;

    try {
      await step.do('Create Report', async () =&gt; {
        const createdReport = await createReportStep(reportToCreate, this.env);
        reportId = createdReport?.id;
      });

      if (reportToCreate.hasImpoundedFiles) {
        await step.do('Gather Files', async () =&gt; {
          if (!reportId) throw new Error('Report ID is undefined.');
          await gatherFilesStep(reportId, this.env);
        });

        await step.do('Upload Files', async () =&gt; {
          if (!reportId) throw new Error('Report ID is undefined.');
          await uploadFilesStep(reportId, this.env);
        });

        await step.do('Add File Metadata', async () =&gt; {
          if (!reportId) throw new Error('Report ID is undefined.');
          await addFilesInfoStep(reportId, this.env);
        });
      }

      await step.do('Finalize Report', async () =&gt; {
        if (!reportId) throw new Error('Report ID is undefined.');
        await finalizeReportStep(reportId, this.env);
      });
    } catch (error) {
      console.error(error);
      throw error;
    }
  }
}</code></pre>
            <p>Not only can tasks be broken into discrete steps, but the Workflows dashboard gives us real-time visibility into each report processed and the status of each step in the workflow!</p><p>This allows us to easily see active and completed workflows, identify which steps failed and where, and retry failed steps or terminate workflows. These features have transformed how we troubleshoot: we can dig into any issue that arises and retry a step with a single click.</p><p>Below are two dashboard screenshots: the first of our running workflows, the second an inspection of the successes and failures of each step in the workflow. Some workflows look slow or “stuck”; that’s because failed steps are retried with exponential backoff, which smooths over transient issues like flaky APIs without manual intervention.</p>
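<p>For intuition, the effect of exponential backoff on a step’s retry schedule can be sketched in a few lines. The base delay and cap below are illustrative values, not Workflows’ defaults:</p>

```typescript
// Sketch of an exponential backoff schedule for a retried step.
// The base delay and cap are illustrative values, not Workflows' defaults.
function backoffDelays(baseMs: number, retries: number, capMs: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    delays.push(Math.min(baseMs * 2 ** attempt, capMs));
  }
  return delays;
}

// A step with a 5-second base delay and 5 retries waits roughly
// 5s, 10s, 20s, 40s, then 80s between attempts, which is why a
// retrying workflow can look "stuck" in the dashboard for a while.
console.log(backoffDelays(5_000, 5, 3_600_000));
```

The doubling means most of the total wait is concentrated in the last couple of attempts, so a workflow that has been retrying for a few minutes is usually only a handful of attempts in.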
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2DjVg3WMp8e5QGy19TuHMj/69e611c9267598c44e5a2b120f0f59ac/image4.png" />
          </figure><p><sup><i>Cloudflare Workflows Dashboard for our NCMEC Workflow</i></sup></p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5ElqnGMtnJQumNhuWZI3nb/6866cc9aa2b27856a8730a9faebc1747/image3.png" />
          </figure><p><sup><i>Cloudflare Workflows Dashboard containing a breakout of the NCMEC Workflow Steps</i></sup></p><p>Cloudflare Workflows transformed how we handle NCMEC incident reports. What was once a complex, queue-based architecture is now a structured, retryable, and observable process. Debugging is easier, error handling is more robust, and monitoring is seamless. </p>
    <div>
      <h3>Deploy your own Workflows</h3>
      <a href="#deploy-your-own-workflows">
        
      </a>
    </div>
    <p>If you’re also building larger, multi-step applications, or have an existing Workers application that has started to approach what we ended up with for our incident reporting process, then you can typically wrap that code within a Workflow with minimal changes. <a href="https://developers.cloudflare.com/workflows/examples/backup-d1/"><u>Workflows can read from R2, write to KV, query D1</u></a> and call other APIs just like any other Worker, but are designed to help orchestrate asynchronous, long-running tasks.</p><p>To get started with Workflows, you can head to the <a href="https://developers.cloudflare.com/workflows/"><u>Workflows developer documentation</u></a> and/or pull down the starter project and dive into the code immediately:</p>
            <pre><code>$ npm create cloudflare@latest workflows-starter -- \
  --template="cloudflare/workflows-starter"
</code></pre>
            <p><i>Learn more about </i><a href="https://developers.cloudflare.com/workers/workflows"><i><u>Cloudflare Workflows</u></i></a><i>, and about using </i><a href="https://developers.cloudflare.com/cache/reference/csam-scanning/"><i><u>the Cloudflare CSAM Scanning Tool</u></i></a><i>.</i></p> ]]></content:encoded>
            <category><![CDATA[Developer Week]]></category>
            <category><![CDATA[Workflows]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[CSAM Reporting]]></category>
            <category><![CDATA[Automation]]></category>
            <category><![CDATA[Security]]></category>
            <guid isPermaLink="false">32j7ZR5lpPUtSjC9lwtY0t</guid>
            <dc:creator>Mahmoud Salem</dc:creator>
            <dc:creator>Rachael Truong</dc:creator>
        </item>
        <item>
            <title><![CDATA[Cloudflare Workflows is now GA: production-ready durable execution]]></title>
            <link>https://blog.cloudflare.com/workflows-ga-production-ready-durable-execution/</link>
            <pubDate>Mon, 07 Apr 2025 14:00:00 GMT</pubDate>
            <description><![CDATA[ Workflows — a durable execution engine built directly on top of Workers — is now Generally Available. We’ve landed new human-in-the-loop capabilities, more scale, and more metrics. ]]></description>
            <content:encoded><![CDATA[ <p>Betas are useful for feedback and iteration, but at the end of the day, not everyone is willing to be a guinea pig or can tolerate the occasional sharp edge that comes along with beta software. Sometimes you need that big, shiny “Generally Available” label (or blog post), and now it’s Workflows’ turn.</p><p><a href="https://developers.cloudflare.com/workflows/"><u>Workflows</u></a>, our serverless durable execution engine that allows you to build long-running, multi-step applications (some call them “step functions”) on Workers, is now GA.</p><p>In short, that means it’s <i>production ready</i> —  but it also doesn’t mean Workflows is going to ossify. We’re continuing to scale Workflows (including more concurrent instances), bring new capabilities (like the new <code>waitForEvent</code> API), and make it easier to build <a href="https://www.cloudflare.com/learning/ai/what-is-agentic-ai/">AI agents</a> with <a href="https://developers.cloudflare.com/agents/api-reference/run-workflows/"><u>our Agents SDK and Workflows</u></a>.</p><p>If you prefer code to prose, you can quickly install the Workflows starter project and start exploring the code and the API with a single command:</p>
            <pre><code>npm create cloudflare@latest workflows-starter -- \
  --template="cloudflare/workflows-starter"</code></pre>
            <p>How does Workflows work? What can I build with it? How do I think about building AI agents with Workflows and the <a href="https://developers.cloudflare.com/agents/"><u>Agents SDK</u></a>? Well, read on.</p>
    <div>
      <h2>Building with Workflows</h2>
      <a href="#building-with-workflows">
        
      </a>
    </div>
    <p>Workflows is a durable execution engine built on Cloudflare Workers that allows you to build resilient, multi-step applications.</p><p>At its core, Workflows implements a step-based architecture where each step in your application is independently retriable, with state automatically persisted between steps. This means that even if a step fails due to a transient error or network issue, Workflows can retry just that step without needing to restart your entire application from the beginning.</p><p>When you define a Workflow, you break your application into logical steps.</p><ul><li><p>Each step can either execute code (<code>step.do</code>), put your Workflow to sleep (<code>step.sleep</code> or <code>step.sleepUntil</code>), or wait on an event (<code>step.waitForEvent</code>).</p></li><li><p>As your Workflow executes, it automatically persists the state returned from each step, ensuring that your application can continue exactly where it left off, even after failures or hibernation periods. </p></li><li><p>This durable execution model is particularly powerful for applications that coordinate between multiple systems, process data in sequence, or need to handle long-running tasks that might span minutes, hours, or even days.</p></li></ul><p>Workflows are particularly useful at handling complex business processes that traditional stateless functions struggle with.</p><p>For example, an e-commerce order processing workflow might check inventory, charge a payment method, send an email confirmation, and update a database — all as separate steps. If the payment processing step fails due to a temporary outage, Workflows will automatically retry just that step when the payment service is available again, without duplicating the inventory check or restarting the entire process. </p><p>You can see how this works below: each call to a service can be modelled as a step, independently retried, and if needed, recovered from that step onwards:</p>
            <pre><code>import { WorkflowEntrypoint, WorkflowStep, WorkflowEvent } from 'cloudflare:workers';

// The params we expect when triggering this Workflow
type OrderParams = {
	orderId: string;
	customerId: string;
	items: Array&lt;{ productId: string; quantity: number }&gt;;
	paymentMethod: {
		type: string;
		id: string;
	};
};

// Our Workflow definition
export class OrderProcessingWorkflow extends WorkflowEntrypoint&lt;Env, OrderParams&gt; {
	async run(event: WorkflowEvent&lt;OrderParams&gt;, step: WorkflowStep) {
		// Step 1: Check inventory
		const inventoryResult = await step.do('check-inventory', async () =&gt; {
			console.log(`Checking inventory for order ${event.payload.orderId}`);

			// Mock: In a real workflow, you'd query your inventory system
			const inventoryCheck = await this.env.INVENTORY_SERVICE.checkAvailability(event.payload.items);

			// Return inventory status as state for the next step
			return {
				inStock: true,
				reservationId: 'inv-123456',
				itemsChecked: event.payload.items.length,
			};
		});

		// Exit workflow if items aren't in stock
		if (!inventoryResult.inStock) {
			return { status: 'failed', reason: 'out-of-stock' };
		}

		// Step 2: Process payment
		// Configure specific retry logic for payment processing
		const paymentResult = await step.do(
			'process-payment',
			{
				retries: {
					limit: 3,
					delay: '30 seconds',
					backoff: 'exponential',
				},
				timeout: '2 minutes',
			},
			async () =&gt; {
				console.log(`Processing payment for order ${event.payload.orderId}`);

				// Mock: In a real workflow, you'd call your payment processor
				const paymentResponse = await this.env.PAYMENT_SERVICE.processPayment({
					customerId: event.payload.customerId,
					orderId: event.payload.orderId,
					amount: calculateTotal(event.payload.items),
					paymentMethodId: event.payload.paymentMethod.id,
				});

				// If payment failed, throw an error that will trigger retry logic
				if (paymentResponse.status !== 'success') {
					throw new Error(`Payment failed: ${paymentResponse.message}`);
				}

				// Return payment info as state for the next step
				return {
					transactionId: 'txn-789012',
					amount: 129.99,
					timestamp: new Date().toISOString(),
				};
			},
		);

		// Step 3: Send email confirmation
		await step.do('send-confirmation-email', async () =&gt; {
			console.log(`Sending confirmation email for order ${event.payload.orderId}`);
			console.log(`Including payment confirmation ${paymentResult.transactionId}`);
			return await this.env.EMAIL_SERVICE.sendOrderConfirmation({ ... })
		});

		// Step 4: Update database
		const dbResult = await step.do('update-database', async () =&gt; {
			console.log(`Updating database for order ${event.payload.orderId}`);
			await this.updateOrderStatus(...)

			return { dbUpdated: true };
		});

		// Return final workflow state
		return {
			orderId: event.payload.orderId,
			processedAt: new Date().toISOString(),
		};
	}
}</code></pre>
            <p>This combination of durability, automatic retries, and state persistence makes Workflows ideal for building reliable distributed applications that can handle real-world failures gracefully.</p>
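<p>One way to build intuition for the persistence model is a step runner that caches each step’s result by name, so a replay after a crash skips work that already succeeded. The sketch below is a toy, in-memory stand-in for what Workflows does with durable storage:</p>

```typescript
// Toy sketch of durable execution: each step's result is cached by name, so
// a replay after a failure re-runs only the steps that never completed.
// Real Workflows persist this state durably; a Map is used here for brevity.
class ToyStepRunner {
  private completed = new Map<string, unknown>();

  async do<T>(name: string, fn: () => Promise<T>): Promise<T> {
    if (this.completed.has(name)) {
      // This step already succeeded on a previous run: reuse the saved state.
      return this.completed.get(name) as T;
    }
    const result = await fn();
    this.completed.set(name, result);
    return result;
  }
}

// Replaying the same step name does not re-run the body.
const step = new ToyStepRunner();
let runs = 0;
await step.do('check-inventory', async () => { runs++; return { inStock: true }; });
await step.do('check-inventory', async () => { runs++; return { inStock: true }; });
console.log(runs); // the step body executed once
```

This is also why each step’s callback should be side-effect-complete on its own: on replay, the body is skipped entirely and only its saved return value is handed to later steps.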
    <div>
      <h2>Human-in-the-loop</h2>
      <a href="#human-in-the-loop">
        
      </a>
    </div>
    <p>Workflows are just code, and that makes them extremely powerful: you can define steps dynamically and on-the-fly, conditionally branch, and make API calls to any system you need. But sometimes you also need a Workflow to wait for something to happen in the real world.</p><p>For example:</p><ul><li><p>Approval from a human to progress.</p></li><li><p>An incoming webhook, like from a Stripe payment or a GitHub event. </p></li><li><p>A state change, such as a file upload to R2 that triggers an <a href="https://developers.cloudflare.com/r2/buckets/event-notifications/"><u>Event Notification</u></a>, and then pushes a reference to the file to the Workflow, so it can process the file (or run it through an AI model).</p></li></ul><p>The new <code>waitForEvent</code> API in Workflows allows you to do just that: </p>
            <pre><code>let event = await step.waitForEvent&lt;IncomingStripeWebhook&gt;("receive invoice paid webhook from Stripe", { type: "stripe-webhook", timeout: "1 hour" }) </code></pre>
            <p>You can then send an event to a specific instance from any external service that can make an HTTP request:</p>
            <pre><code>curl -d '{"transaction":"complete","id":"1234-6789"}' \
  -H "Authorization: Bearer ${CF_TOKEN}" \
  "https://api.cloudflare.com/client/v4/accounts/{account_id}/workflows/{workflow_name}/instances/{instance_id}/events/{event_type}"</code></pre>
            <p>… or via the <a href="https://developers.cloudflare.com/workflows/build/workers-api/#workflowinstance"><u>Workers API</u></a> within a Worker itself:</p>
            <pre><code>interface Env {
  MY_WORKFLOW: Workflow;
}

interface Payload {
  transaction: string;
  id: string;
}

export default {
  async fetch(req: Request, env: Env) {
    const instanceId = new URL(req.url).searchParams.get("instanceId")
    const webhookPayload = await req.json&lt;Payload&gt;()

    let instance = await env.MY_WORKFLOW.get(instanceId);
    // Send our event, with `type` matching the event type defined in
    // our step.waitForEvent call
    await instance.sendEvent({type: "stripe-webhook", payload: webhookPayload})
    
    return Response.json({
      status: await instance.status(),
    });
  },
};</code></pre>
            <p>You can even wait for multiple events, using the <code>type</code> parameter, and/or race multiple events using <code>Promise.race</code> to continue on depending on which event was received first:</p>
            <pre><code>export class MyWorkflow extends WorkflowEntrypoint&lt;Env, Params&gt; {
	async run(event: WorkflowEvent&lt;Params&gt;, step: WorkflowStep) {
		let state = await step.do("get some data", () =&gt; { /* step call here */ })
		// Race the events, resolving the Promise based on which event
		// we receive first
		let value = await Promise.race([
			step.waitForEvent("payment success", { type: "payment-success-webhook", timeout: "4 hours" }),
			step.waitForEvent("payment failure", { type: "payment-failure-webhook", timeout: "4 hours" }),
		])
		// Continue on based on the value and event received
	}
}</code></pre>
            <p>To visualize <code>waitForEvent</code> in a bit more detail, let’s assume we have a Workflow that is triggered by a code review agent that watches a GitHub repository.</p><p>Without the ability to wait on events, our Workflow can’t easily get human approval to write suggestions back (or even submit a PR of its own). It <i>could</i> potentially poll for some state that was updated, but that means we have to call <code>step.sleep</code> for arbitrary periods of time, poll a storage service for an updated value, and repeat if it’s not there. That’s a lot of code and room for error:</p>
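<p>That sleep-and-poll workaround looks roughly like the loop below. The <code>readValue</code> and <code>sleep</code> parameters stand in for a storage read (e.g. a KV get) and <code>step.sleep</code>; both are illustrative assumptions, not a real Workflows API:</p>

```typescript
// Sketch of the sleep-then-poll pattern described above. `readValue` stands
// in for a storage read (e.g. a KV get) and `sleep` for step.sleep; both
// are illustrative assumptions, not a real Workflows API.
async function pollUntil<T>(
  readValue: () => Promise<T | null>,
  sleep: (ms: number) => Promise<void>,
  intervalMs: number,
  maxAttempts: number,
): Promise<T | null> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const value = await readValue();
    if (value !== null) return value; // the state finally changed
    await sleep(intervalMs); // spend a step doing nothing but waiting
  }
  return null; // gave up: more code, and more room for error
}

// Demo with stubbed storage: the value appears on the third read.
let reads = 0;
const approval = await pollUntil(
  async () => (++reads >= 3 ? 'approved' : null),
  async () => {}, // no-op sleep so the demo runs instantly
  60_000,
  10,
);
console.log(approval, reads);
```

Every iteration burns a step and a storage read, and picking the interval is a trade-off between latency and cost, which is exactly the boilerplate <code>waitForEvent</code> removes.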
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/64dgTwe9V6bAfKUDQgJ1z3/e0a897623a8ca452139f00dd2cff9733/1.png" />
          </figure><p><sup><i>Without waitForEvent, it’s harder to send data to a Workflow instance that’s running</i></sup></p><p>If we modified that same example to incorporate the new waitForEvent API, we could use it to wait for human approval before making a mutating change:  </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2BIuiSytb7roytyDhHVioz/0e005829fea9e60d772dcb6888acac2c/2.png" />
          </figure><p><sup><i>Adding waitForEvent to our code review Workflow, so it can seek explicit approval.</i></sup></p><p>You could even imagine an AI agent itself sending and/or acting on behalf of a human here: <code>waitForEvent</code> simply exposes a way for a Workflow to pause until something in the world changes before it continues (or not).</p><p>Critically, you can call <code>waitForEvent</code> just like any other step in Workflows: you can call it conditionally, and/or multiple times, and/or in a loop. Workflows are just Workers: you have the full power of a programming language and are not restricted by a <a href="https://en.wikipedia.org/wiki/Domain-specific_language"><u>domain specific language (DSL)</u></a> or config language.</p>
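<p>The shape of that API can be mimicked with a plain Promise: a waiter registers interest in an event type and blocks until a matching event arrives or a timeout fires. This in-memory sketch mirrors the API shape only; the real implementation persists waiters durably:</p>

```typescript
// Minimal in-memory sketch of waitForEvent/sendEvent semantics: a waiter
// registers interest in an event `type` and blocks until a matching event
// arrives or the timeout fires. This mimics the API shape only; the real
// implementation persists waiters durably across restarts.
type Resolver = (payload: unknown) => void;

class ToyEventBus {
  private waiters = new Map<string, Resolver[]>();

  waitForEvent(type: string, timeoutMs: number): Promise<unknown> {
    return new Promise((resolve, reject) => {
      const timer = setTimeout(
        () => reject(new Error(`timed out waiting for ${type}`)),
        timeoutMs,
      );
      const list = this.waiters.get(type) ?? [];
      list.push((payload) => { clearTimeout(timer); resolve(payload); });
      this.waiters.set(type, list);
    });
  }

  sendEvent(type: string, payload: unknown): void {
    for (const resolve of this.waiters.get(type) ?? []) resolve(payload);
    this.waiters.delete(type);
  }
}

// Because the waiter is just a Promise, it composes with conditionals,
// loops, and Promise.race like any other value.
const bus = new ToyEventBus();
const waiting = bus.waitForEvent('approval', 5_000);
bus.sendEvent('approval', { approvedBy: 'reviewer' });
console.log(await waiting);
```

The key design point carries over to the real API: because waiting is expressed as an awaited Promise rather than a DSL construct, event handling inherits all of JavaScript’s control flow for free.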
    <div>
      <h2>Pricing</h2>
      <a href="#pricing">
        
      </a>
    </div>
    <p>Good news: we haven’t changed much since our original beta announcement! We’re adding storage pricing for state stored by your Workflows, and retaining our CPU-based and request (invocation) based pricing as follows:</p><table><tr><td><p><b>Unit</b></p></td><td><p><b>Workers Free</b></p></td><td><p><b>Workers Paid</b></p></td></tr><tr><td><p><b>CPU time (ms)</b></p></td><td><p>10 ms per Workflow</p></td><td><p>30 million CPU milliseconds included per month</p><p>+$0.02 per additional million CPU milliseconds</p></td></tr><tr><td><p><b>Requests</b></p></td><td><p>100,000 Workflow invocations per day (<a href="https://developers.cloudflare.com/workers/platform/pricing/#workers"><u>shared with Workers</u></a>)</p></td><td><p>10 million included per month</p><p>+$0.30 per additional million</p></td></tr><tr><td><p><b>Storage (GB)</b></p></td><td><p>1 GB</p></td><td><p>1 GB included per month
+ $0.20 per GB-month</p></td></tr></table><p>Because the storage pricing is new, we will not actively bill for storage until September 15, 2025. We will notify users above the included 1 GB limit ahead of charging for storage, and by default, Workflows will expire stored state after three (3) days (Free plan) or thirty (30) days (Paid plan).</p><p>If you’re wondering what “CPU time” is here: it’s the time your Workflow is actively consuming compute resources. It <i>doesn’t</i> include time spent waiting on API calls, reasoning LLMs, or other I/O (like writing to a database). That might seem like a small distinction, but in practice it matters: most applications use single-digit milliseconds of CPU time but multiple seconds of wall time, and an API call or two taking 100-250 ms each to respond adds up!</p>
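<p>As a back-of-the-envelope check on why this matters, here is the Paid-plan CPU math from the table above (30 million CPU milliseconds included, then $0.02 per additional million) applied to a hypothetical month of invocations:</p>

```typescript
// Back-of-the-envelope CPU cost using the Workers Paid rates in the table:
// 30M CPU ms included per month, then $0.02 per additional million.
function monthlyCpuCostUSD(totalCpuMs: number): number {
  const includedMs = 30_000_000;
  const ratePerMillionMs = 0.02;
  const billableMs = Math.max(0, totalCpuMs - includedMs);
  return (billableMs / 1_000_000) * ratePerMillionMs;
}

// 10 million invocations at 5 ms of CPU each is 50M CPU ms: 20M billable,
// or about $0.40. The seconds each instance spends awaiting APIs or LLMs
// contribute nothing to this number.
console.log(monthlyCpuCostUSD(10_000_000 * 5));
```

Under wall-clock billing, the same instances idling for even a second each would be billed for roughly a thousand times more time, which is the difference the CPU-time model is designed to avoid.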
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6zRZ3gFQ0TrCetwlW0bqWG/87e41b7ab75ae48a4f2a6655d8ac2a86/3.png" />
          </figure><p><sup><i>Bill for CPU, not for time spent when a Workflow is idle or waiting.</i></sup></p><p>Workflow engines, especially, tend to spend a lot of time waiting: reading data from <a href="https://www.cloudflare.com/learning/cloud/what-is-object-storage/">object storage</a> (like <a href="https://www.cloudflare.com/developer-platform/products/r2/"><u>Cloudflare R2</u></a>), calling third-party APIs or LLMs like o3-mini or Claude 3.7, even querying databases like <a href="https://developers.cloudflare.com/d1/"><u>D1</u></a>, Postgres, or MySQL. With Workflows, just like Workers: you don’t pay for time your application is just waiting.</p>
    <div>
      <h2>Start building</h2>
      <a href="#start-building">
        
      </a>
    </div>
    <p>So you’ve got a good handle on Workflows, how it works, and want to get building. What next?</p><ol><li><p><a href="https://developers.cloudflare.com/workflows/"><u>Visit the Workflows documentation</u></a> to learn how it works, understand the Workflows API, and review best practices</p></li><li><p>Review the code in the <a href="https://github.com/cloudflare/workflows-starter"><u>starter project</u></a></p></li><li><p>And lastly, deploy the starter to your own Cloudflare account with a few clicks:</p></li></ol><a href="https://deploy.workers.cloudflare.com/?url=https://github.com/cloudflare/workflows-starter"><img src="https://deploy.workers.cloudflare.com/button" /></a><p></p> ]]></content:encoded>
            <category><![CDATA[Developer Week]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Workflows]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <guid isPermaLink="false">7ju3oFGzR3iR8gO2TmMleF</guid>
            <dc:creator>Sid Chatterjee</dc:creator>
            <dc:creator>Matt Silverlock</dc:creator>
        </item>
        <item>
            <title><![CDATA[Build durable applications on Cloudflare Workers: you write the Workflows, we take care of the rest]]></title>
            <link>https://blog.cloudflare.com/building-workflows-durable-execution-on-workers/</link>
            <pubDate>Thu, 24 Oct 2024 13:05:00 GMT</pubDate>
            <description><![CDATA[ Cloudflare Workflows is now in open beta! Workflows allows you to build reliable, repeatable, long-lived multi-step applications that can automatically retry, persist state, and scale out. Read on to learn how Workflows works, how we built it on top of Durable Objects, and how you can deploy your first Workflows application. ]]></description>
            <content:encoded><![CDATA[ <p>Workflows, Cloudflare’s durable execution engine that allows you to build reliable, repeatable multi-step applications that scale for you, is now in open beta. Any developer with a free or paid <a href="https://workers.cloudflare.com/"><u>Workers</u></a> plan can build and deploy a Workflow right now: no waitlist, no sign-up form, no fake line around the block.</p><p>If you learn by doing, you can create your first Workflow via a single command (or <a href="https://developers.cloudflare.com/workflows/get-started/guide/"><u>visit the docs for the full guide</u></a>):</p>
            <pre><code>npm create cloudflare@latest workflows-starter -- \
  --template "cloudflare/workflows-starter"</code></pre>
            <p>Open the <code>src/index.ts</code> file, poke around, start extending it, and deploy it with a quick <code>wrangler deploy</code>.</p><p>If you want to learn more about how Workflows works, how you can use it to build applications, and how we built it, read on.</p>
    <div>
      <h2>Workflows? Durable Execution?</h2>
      <a href="#workflows-durable-execution">
        
      </a>
    </div>
    <p>Workflows—which we <a href="https://blog.cloudflare.com/data-anywhere-events-pipelines-durable-execution-workflows/#durable-execution"><u>announced back during Developer Week</u></a> earlier this year—is our take on the concept of “Durable Execution”: the ability to build and execute applications that are <i>durable</i> in the face of errors, network issues, upstream API outages, rate limits, and (most importantly) infrastructure failure.</p><p>As <a href="https://cloudflare.tv/event/xvm4qdgm?startTime=8m5s"><u>over 2.4 million developers</u></a> continue to build applications on top of Cloudflare Workers, R2, and Workers AI, we’ve noticed more developers building multi-step applications and workflows that process user data, transform unstructured data into structured data, export metrics, persist state as they progress, and automatically retry &amp; restart. But writing any non-trivial application and making it <i>durable</i> in the face of failure is hard: this is where Workflows comes in. Workflows manages the retries, emits the metrics, and durably stores the state (without you having to stand up your own database) as the Workflow progresses.</p><p>What makes Workflows different from other takes on “Durable Execution” is that we manage the underlying compute and storage infrastructure for you. You’re not left managing a compute cluster and hoping it scales both up (on a Monday morning) and down (during quieter periods) to manage costs, or ensuring that you have compute running in the right locations. Workflows is built on Cloudflare Workers — our job is to run your code and operate the infrastructure for you.</p><p>As an example of how Workflows can help you build durable applications, assume you want to post-process file uploads from your users that were uploaded to an R2 bucket directly via <a href="https://developers.cloudflare.com/r2/api/s3/presigned-urls/"><u>a pre-signed URL</u></a>. 
That post-processing could involve multiple actions: text extraction via a <a href="https://developers.cloudflare.com/workers-ai/models/"><u>Workers AI model</u></a>, calls to a third-party API to validate data, updating or querying rows in a database once the file has been processed… the list goes on.</p><p>But what each of these actions has in common is that it could <i>fail</i>. Maybe that upstream API is unavailable, maybe you get rate-limited, maybe your database is down. Having to write extensive retry logic around each action, manage backoffs, and (importantly) ensure your application doesn’t have to start from scratch when a later <i>step</i> fails is more boilerplate to write and more code to test and debug.</p><p>What’s a <i>step</i>, you ask? The core building block of every Workflow is the step: an individually retriable component of your application that can optionally emit state. That state is then persisted, even if subsequent steps were to fail. This means that your application doesn’t have to restart, allowing it not only to recover more quickly from failure scenarios, but also to avoid doing redundant work. You don’t want your application hammering an expensive third-party API (or getting you rate limited) because it’s naively retrying an API call that it doesn’t have to.</p>
            <pre><code>export class MyWorkflow extends WorkflowEntrypoint&lt;Env, Params&gt; {
	async run(event: WorkflowEvent&lt;Params&gt;, step: WorkflowStep) {
		const files = await step.do('my first step', async () =&gt; {
			return {
				inputParams: event,
				files: [
					'doc_7392_rev3.pdf',
					'report_x29_final.pdf',
					'memo_2024_05_12.pdf',
					'file_089_update.pdf',
					'proj_alpha_v2.pdf',
					'data_analysis_q2.pdf',
					'notes_meeting_52.pdf',
					'summary_fy24_draft.pdf',
				],
			};
		});

		// Other steps...
	}
}
</code></pre>
            <p>Notably, a Workflow can have hundreds of steps: one of the <a href="https://developers.cloudflare.com/workflows/build/rules-of-workflows/"><u>Rules of Workflows</u></a> is to encapsulate every API call or stateful action within your application into its own step. Each step can also define its own retry strategy, automatically backing off, adding a delay and/or (eventually) giving up after a set number of attempts.</p>
            <pre><code>await step.do(
	'make a call to write that could maybe, just might, fail',
	// Define a retry strategy
	{
		retries: {
			limit: 5,
			delay: '5 seconds',
			backoff: 'exponential',
		},
		timeout: '15 minutes',
	},
	async () =&gt; {
		// Do stuff here, with access to the state from our previous steps
		if (Math.random() &gt; 0.5) {
			throw new Error('API call to $STORAGE_SYSTEM failed');
		}
	},
);
</code></pre>
            <p>To illustrate this further, imagine you have an application that reads text files from an R2 storage bucket, pre-processes the text into chunks, generates text embeddings <a href="https://developers.cloudflare.com/workers-ai/models/bge-large-en-v1.5/"><u>using Workers AI</u></a>, and then inserts those into a vector database (like <a href="https://developers.cloudflare.com/vectorize/"><u>Vectorize</u></a>) for semantic search.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7b9m0rPDlGvIiTnhguyvzI/3f27678b141ce600f1f54eb999e9d671/WORKFLOWS.png" />
          </figure><p>In the Workflows programming model, each of those is a discrete step, and each can emit state. For example, each of the four actions below can be a discrete <code>step.do</code> call in a Workflow:</p><ol><li><p>Reading the files from storage and emitting the list of filenames</p></li><li><p>Chunking the text and emitting the results</p></li><li><p>Generating text embeddings</p></li><li><p>Upserting them into Vectorize and capturing the result of a test query</p></li></ol><p>You can also start to imagine that some steps, such as chunking text or generating text embeddings, can be broken down into even more steps — a step per file that we chunk, or a step per API call to our text embedding model, so that our application is even more resilient to failure.</p><p>Steps can be created programmatically or conditionally based on input, allowing you to dynamically create steps based on the number of inputs your application needs to process. You do not need to define all steps ahead of time, and each instance of a Workflow may choose to conditionally create steps on the fly.</p>
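<p>Sketched in code, dynamic step creation is just a loop: one step per input, named after the input so each is independently retriable. The <code>Step</code> interface below is a stub standing in for the <code>WorkflowStep</code> a real Workflow receives in <code>run()</code>:</p>

```typescript
// Sketch of dynamic step creation: one step per file, named after the file
// so each chunking operation is independently retriable. `Step` is a stub
// standing in for the WorkflowStep a real Workflow receives.
interface Step {
  do<T>(name: string, fn: () => Promise<T>): Promise<T>;
}

async function chunkAllFiles(step: Step, files: string[]): Promise<string[]> {
  const stepNames: string[] = [];
  for (const file of files) {
    const name = `chunk ${file}`;
    stepNames.push(name);
    await step.do(name, async () => {
      // Per-file chunking work would go here; a failure retries only
      // this file's step, not the whole list.
      return file.length;
    });
  }
  return stepNames;
}

// Stub runner that simply invokes each step body.
const stub: Step = { do: (_name, fn) => fn() };
console.log(await chunkAllFiles(stub, ['doc_7392_rev3.pdf', 'report_x29_final.pdf']));
```

Because step names double as identity on replay, naming each step after its input (rather than reusing one generic name) is what makes per-file recovery possible.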
    <div>
      <h2>Building Cloudflare on Cloudflare</h2>
      <a href="#building-cloudflare-on-cloudflare">
        
      </a>
    </div>
    <p>As the Cloudflare Developer platform <a href="https://www.cloudflare.com/birthday-week/"><u>continues to grow</u></a>, almost all of our own products are built on top of it. Workflows is yet another example of how we built a new product from scratch using nothing but Workers and its vast catalog of features and APIs. This section of the blog has two goals: to explain how we built it, and to demonstrate that anyone can create a complex application or platform with demanding requirements and multiple architectural layers on our stack, too.</p><p>If you’re wondering how Workflows manages to make durable execution easy, how it persists state, and how it automatically scales: it’s because we built it on Cloudflare Workers, including the brand-new <a href="https://blog.cloudflare.com/sqlite-in-durable-objects/"><u>zero-latency SQLite storage we recently introduced to Durable Objects</u></a>.
</p><p>To understand how Workflows uses Workers &amp; Durable Objects, here’s the high-level overview of our architecture:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7pknYk0Sshxka3iPbxBCRj/bb8b75986601e38b6b69fe8d849c0cbe/image9.png" />
          </figure><p>There are three main blocks in this diagram:</p><p>The user-facing APIs are where the user interacts with the platform, creating and deploying new workflows or instances, controlling them, and accessing their state and activity logs. These operations can be executed through our public <a href="https://developers.cloudflare.com/api/"><u>API gateway</u></a> using REST calls, a Worker script using bindings, <a href="https://blog.cloudflare.com/wrangler3"><u>Wrangler</u></a> (Cloudflare's developer platform command line tool), or via the <a href="https://dash.cloudflare.com/"><u>Dashboard</u></a> user interface.</p><p>The managed platform holds the internal configuration APIs running on a Worker implementing a catalog of REST endpoints, the binding shim, which is supported by another dedicated Worker, every account controller, and their corresponding workflow engines, all powered by SQLite-backed Durable Objects. This is where all the magic happens and what we are sharing more details about in this technical blog.</p><p>Finally, there are the workflow instances, essentially independent clones of the workflow application. Instances are user account-owned and have a one-to-one relationship with a managed engine that powers them. You can run as many instances and engines as you want concurrently.</p><p>Let's get into more detail…</p>
    <div>
      <h3>Configuration API and Binding Shim</h3>
      <a href="#configuration-api-and-binding-shim">
        
      </a>
    </div>
    
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2qEGr9M8KwgPS66Ju8mELL/189db9764392c00ae34dd3a44eeb1ed7/image6.png" />
          </figure><p>The Configuration API and the Binding Shim are two stateless Workers; one receives REST API calls from clients calling our <a href="https://developers.cloudflare.com/api/"><u>API Gateway</u></a> directly, using <a href="https://developers.cloudflare.com/workers/wrangler/"><u>Wrangler</u></a>, or navigating the <a href="https://dash.cloudflare.com/"><u>Dashboard</u></a> UI, and the other is the endpoint for the Workflows <a href="https://developers.cloudflare.com/workers/runtime-apis/bindings/"><u>binding</u></a>, an efficient and authenticated interface to interact with the Cloudflare Developer Platform resources from a Workers script.</p><p>The configuration API worker uses <a href="https://hono.dev/docs/getting-started/cloudflare-workers"><u>HonoJS</u></a> and <a href="https://hono.dev/examples/zod-openapi"><u>Zod</u></a> to implement the REST endpoints, which are declared in an <a href="https://swagger.io/specification/"><u>OpenAPI</u></a> schema and exported to our API Gateway, thus adding our methods to the Cloudflare API <a href="https://developers.cloudflare.com/api/"><u>catalog</u></a>.</p>
            <pre><code>import { swaggerUI } from '@hono/swagger-ui';
import { createRoute, OpenAPIHono, z } from '@hono/zod-openapi';
import { Hono } from 'hono';

...

api.openapi(
  createRoute({
    method: 'get',
    path: '/',
    request: {
      query: PaginationParams,
    },
    responses: {
      200: {
        content: {
          'application/json': {
             schema: APISchemaSuccess(z.array(WorkflowWithInstancesCountSchema)),
          },
        },
        description: 'List of all Workflows belonging to an account.',
      },
    },
  }),
  async (ctx) =&gt; {
    ...
  },
);

...

api.route('/:workflow_name', routes.workflows);
api.route('/:workflow_name/instances', routes.instances);
api.route('/:workflow_name/versions', routes.versions);</code></pre>
            <p>These Workers perform two different functions, but they share a large portion of their code and implement similar logic; once the request is authenticated and ready to travel to the next stage, they use the account ID to delegate the operation to a Durable Object called Account Controller.</p>
            <pre><code>// env.ACCOUNTS is the Account Controllers Durable Objects namespace
const accountStubId = c.env.ACCOUNTS.idFromName(accountId.toString());
const accountStub = c.env.ACCOUNTS.get(accountStubId);</code></pre>
            <p>As you can see, every account has its own Account Controller Durable Object.</p>
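<p>As a plain-TypeScript sketch of that addressing model (illustrative only: <code>NamespaceSim</code> and <code>AccountControllerSim</code> are hypothetical stand-ins, not Cloudflare APIs), deriving a stub by name always resolves to the same single object:</p>

```typescript
// Hypothetical stand-in for a Durable Object namespace: the same name
// always maps to the same single live object, which is why each account
// gets exactly one Account Controller.
class AccountControllerSim {
  constructor(public readonly accountId: string) {}
}

class NamespaceSim {
  #instances = new Map<string, AccountControllerSim>();

  // Roughly analogous to idFromName(name) followed by get(id).
  get(name: string): AccountControllerSim {
    let stub = this.#instances.get(name);
    if (!stub) {
      stub = new AccountControllerSim(name);
      this.#instances.set(name, stub);
    }
    return stub;
  }
}

const ns = new NamespaceSim();
const a1 = ns.get("account-123");
const a2 = ns.get("account-123");
const b = ns.get("account-456");
console.log(a1 === a2); // true: same account, same controller
console.log(a1 === b); // false: different accounts, different controllers
```

The real namespace does this across our whole network with strong consistency guarantees; the <code>Map</code> here only illustrates the one-object-per-name contract.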
    <div>
      <h3>Account Controllers</h3>
      <a href="#account-controllers">
        
      </a>
    </div>
    <p>The Account Controller is a dedicated persisted database that stores the list of all the account’s workflows, versions, and instances. We scale to millions of account controllers, one per every Cloudflare account using Workflows, by leveraging the power of <a href="https://developers.cloudflare.com/durable-objects/best-practices/access-durable-objects-storage/#sqlite-storage-backend"><u>Durable Objects with SQLite backend</u></a>.</p><p><a href="https://developers.cloudflare.com/durable-objects/"><u>Durable Objects</u></a> (DOs) are single-threaded singletons that run in our data centers and are bound to a stateful storage API, in this case, SQLite. They are also Workers, just a special kind, and have access to all of our other APIs. This makes it easy to build consistent, highly available distributed applications with them.</p><p>Here’s what we get for free by using one Durable Object per Workflows account:</p><ul><li><p>Sharding based on account boundaries aligns perfectly with the way we manage resources at Cloudflare internally. 
The nature of DOs also gives us fault isolation for free: should any bugs or state inconsistencies surface during the beta (not that we expect them), they are confined to the affected account and don’t impact everyone else.</p></li><li><p>DO instances run close to the end user; Alice is in London and will call the config API through our <a href="https://www.cloudflare.com/en-gb/network/"><u>LHR data center</u></a>, while Bob is in Lisbon and will connect to LIS.</p></li><li><p>Because every Account Controller is a Worker, we can gradually upgrade controllers to new versions, starting with internal accounts, reducing the risk to real customers.</p></li></ul><p>Before SQLite, our only option was to use the Durable Object's <a href="https://developers.cloudflare.com/durable-objects/api/storage-api/#get"><u>key-value</u></a> storage API, but having a relational database at our fingertips, with the ability to create tables and run complex queries, is a significant enabler. For example, take a look at how we implement the internal method <code>getWorkflow()</code>:</p>
            <pre><code>async function getWorkflow(accountId: number, workflowName: string) {
  const begin = Date.now(); // start time, reported to analytics below
  try {
    const res = this.ctx.storage.transactionSync(() =&gt; {
      const cursor = Array.from(
        this.ctx.storage.sql.exec(
          `
                    SELECT *,
                    (SELECT class_name
                        FROM   versions
                        WHERE  workflow_id = w.id
                        ORDER  BY created_on DESC
                        LIMIT  1) AS class_name
                    FROM   workflows w
                    WHERE  w.name = ? 
                    `,
          workflowName
        )
      )[0] as Workflow;

      return cursor;
    });

    this.sendAnalytics(accountId, begin, "getWorkflow");
    return res as Workflow | undefined;
  } catch (err) {
    this.sendErrorAnalytics(accountId, begin, "getWorkflow");
    throw err;
  }
}
</code></pre>
            <p>The other thing we take advantage of in Workflows is using the recently <a href="https://blog.cloudflare.com/javascript-native-rpc/"><u>announced</u></a> JavaScript-native RPC feature when communicating between components.</p><p>Before <a href="https://developers.cloudflare.com/workers/runtime-apis/rpc/"><u>RPC</u></a>, we had to <code>fetch()</code> between components, make HTTP requests, and serialize and deserialize the parameters and the payload. Now, we can asynchronously call the remote object's method as if it were local. Not only does this feel more natural and simplify our logic, but it's also more efficient, and we can take advantage of TypeScript type-checking when writing code.</p><p>This is how the Configuration API would call the Account Controller’s <code>countWorkflows()</code> method before:</p>
            <pre><code>const resp = await accountStub.fetch(
  "https://controller/count-workflows",
  {
    method: "POST",
    headers: {
      "Content-Type": "application/json; charset=utf-8",
    },
    body: JSON.stringify({ accountId }),
  },
);

if (!resp.ok) {
  return new Response("Internal Server Error", { status: 500 });
}

const result = await resp.json();
const total_count = result.total_count;</code></pre>
            <p>This is how we do it using RPC:</p>
            <pre><code>const total_count = await accountStub.countWorkflows(accountId);</code></pre>
            <p>The other powerful feature of our RPC system is that it supports passing not only <a href="https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm#supported_types"><u>Structured Cloneable</u></a> objects back and forth but also entire classes. More on this later.</p><p>Let’s move on to Engine.</p>
    <div>
      <h3>Engine and instance</h3>
      <a href="#engine-and-instance">
        
      </a>
    </div>
    <p>Every instance of a workflow runs alongside an Engine instance. The Engine is responsible for starting up the user’s workflow entry point, executing the steps on behalf of the user, handling their results, and tracking the workflow state until completion.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6yrKsuF501oRCDujckr3yM/bde40097ec5bedda07793375e53e99b9/image1.png" />
          </figure><p>When we started thinking about the Engine, we thought about modeling it after a <a href="https://en.wikipedia.org/wiki/Finite-state_machine"><u>state machine</u></a>, and that was what our initial prototypes looked like. However, state machines require an ahead-of-time understanding of the userland code, which implies having a build step before running them. This is costly at scale and introduces additional complexity.</p><p>A few iterations later, we had another idea. What if we could model the engine as a game loop?</p><p>Unlike other computer programs, games operate regardless of a user's input. The game loop is essentially a sequence of tasks that implement the game's logic and update the display, typically one loop per video frame. Here’s an example of a game loop in pseudo-code:</p>
            <pre><code>while (game is running)
    check for user input
    move graphics
    play sounds
end while</code></pre>
            <p>Well, an oversimplified version of our Workflow engine would look like this:</p>
            <pre><code>while (last step not completed)
    iterate every step
       use memoized cache as response if the step has run already
       continue running step or timer if it hasn't finished yet
end while</code></pre>
            <p>A workflow is indeed a loop that keeps on going, performing the same sequence of logical tasks until the last step completes.</p><p>The Engine and the instance run hand-in-hand in a one-to-one relationship. The first is managed, and part of the platform. It uses SQLite and other platform APIs internally, and we can constantly add new features, fix bugs, and deploy new versions, while keeping everything transparent to the end user. The second is the actual account-owned Worker script that declares the Workflow steps.</p><p>For example, when someone passes a callback into <code>step.do()</code>:</p>
            <pre><code>export class MyWorkflow extends WorkflowEntrypoint&lt;Env, Params&gt; {
  async run(event: WorkflowEvent&lt;Params&gt;, step: WorkflowStep) {
    await step.do('step1', async () =&gt; { ... });
  }
}</code></pre>
            <p>We switch execution over to the Engine. Again, this is possible because of the power of JS RPC. Besides passing Structured Cloneable objects back and forth, JS RPC allows us to <a href="https://developers.cloudflare.com/workers/runtime-apis/rpc/#send-functions-as-parameters-of-rpc-methods"><u>create and pass entire application-defined classes</u></a> that extend the built-in RpcTarget. So this is what happens behind the scenes when your Instance calls <code>step.do()</code> (simplified):</p>
            <pre><code>export class Context extends RpcTarget {

  async do&lt;T&gt;(name: string, callback: () =&gt; Promise&lt;T&gt;): Promise&lt;T&gt; {

    // First, check whether this step.do() already has a cached result
    const maybeResult = await this.#state.storage.get(name);

    // Return the cached result if it exists
    if (maybeResult !== undefined) { return maybeResult; }

    // Otherwise, run the user callback
    return doWrapper(callback);
  }

}
</code></pre>
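<p>To see why this cache-then-run contract makes re-running an instance safe, here is a self-contained sketch in plain TypeScript (no Durable Objects: an in-memory <code>Map</code> plays the part of the Engine’s SQLite storage, and <code>EngineSim</code> is an illustrative name, not the real class):</p>

```typescript
// Simulated step cache; the real Engine persists results in SQLite so
// they survive eviction, but a Map is enough to show the contract.
class EngineSim {
  #cache = new Map<string, unknown>();
  callbackRuns = 0; // counts how often user callbacks actually execute

  async do<T>(name: string, callback: () => Promise<T>): Promise<T> {
    // Serve the memoized result if this step already completed.
    if (this.#cache.has(name)) return this.#cache.get(name) as T;
    // Otherwise run the user callback once and record its output.
    this.callbackRuns++;
    const result = await callback();
    this.#cache.set(name, result);
    return result;
  }
}

// A two-step workflow written against the simulated engine.
async function runWorkflow(step: EngineSim): Promise<string> {
  const cart = await step.do("retrieve cart", async () => ({ id: "c1" }));
  return step.do("send email", async () => `emailed ${cart.id}`);
}

const engine = new EngineSim();
await runWorkflow(engine); // first lifetime: both callbacks run
const replay = await runWorkflow(engine); // re-run: served from cache
console.log(replay, engine.callbackRuns); // "emailed c1" 2 (not 4)
```

The re-run produces the same result without executing the callbacks again, which is exactly what lets the Engine restart an instance at any point.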
            <p>Here’s a more complete diagram of the Engine’s <code>step.do()</code> lifecycle:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4MymVGS7BxwityCRlWcBOX/136d4dcf0affce04164f87b6bbe8b12a/image5.png" />
          </figure><p>Again, this diagram only partially represents everything we do in the Engine; things like logging for observability or handling exceptions are missing, and we don't get into the details of how queuing is implemented. However, it gives you a good idea of how the Engine abstracts and handles all the complexities of completing a step under the hood, allowing us to expose a simple-to-use API to end users.</p><p>Also, it's worth reiterating that every workflow instance is an Engine behind the scenes, and every Engine is an SQLite-backed Durable Object. This ensures that every instance’s runtime and state are isolated and independent of each other, and that we can effortlessly scale to run billions of workflow instances, a solved problem for Durable Objects.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4uEoEAtsjNquPCD3F50S9d/006556baf2a0478d1de10e4514843baa/image3.png" />
          </figure>
    <div>
      <h3>Durability</h3>
      <a href="#durability">
        
      </a>
    </div>
    <p>Durable Execution is all the rage now when we talk about workflow engines, and ours is no exception. Workflows are typically long-lived processes that run multiple functions in sequence where anything can happen. Those functions can time out or fail because of a remote server error or a network issue and need to be retried. A workflow engine ensures that your application runs smoothly and completes regardless of the problems it encounters.</p><p>Durability means that if and when a workflow fails, the Engine can re-run it, resume from the last recorded step, and deterministically re-calculate the state from all the successful steps' cached responses. This is possible because steps are stateful and idempotent; they produce the same result no matter how many times we run them, thus not causing unintended duplicate effects like sending the same invoice to a customer multiple times.</p>
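<p>The replay mechanics can be sketched in a few lines of plain TypeScript. Everything here is illustrative (the <code>completed</code> map stands in for the step cache in SQLite, and the flaky <code>sendInvoice</code> is a hypothetical upstream call), but it shows how a re-run resumes from cached results instead of repeating side effects:</p>

```typescript
// Results of successfully completed steps (the Engine keeps these in SQLite).
const completed = new Map<string, unknown>();
let computeRuns = 0; // how often the first step's callback executes
let invoicesSent = 0; // how many invoices actually go out
let attempts = 0;

async function step<T>(name: string, fn: () => Promise<T>): Promise<T> {
  // Replay from cache if this step already succeeded in a previous run.
  if (completed.has(name)) return completed.get(name) as T;
  const out = await fn();
  completed.set(name, out); // only successful steps are recorded
  return out;
}

// Hypothetical side-effecting call that fails once, then succeeds.
async function sendInvoice(): Promise<string> {
  if (attempts++ === 0) throw new Error("upstream timeout");
  invoicesSent++;
  return "invoice-42";
}

async function workflow(): Promise<string> {
  const total = await step("compute total", async () => {
    computeRuns++;
    return 99;
  });
  const inv = await step("send invoice", sendInvoice);
  return `${inv}: ${total}`;
}

// The first run fails inside "send invoice"; the second run replays
// "compute total" from cache and retries only the failed step, so
// exactly one invoice is ever sent.
let result = "";
try {
  result = await workflow();
} catch {
  result = await workflow();
}
console.log(result, computeRuns, invoicesSent); // "invoice-42: 99" 1 1
```

This is the "no duplicate invoices" property in miniature: the completed step's result is replayed, and only the failed step's side effect happens on the retry.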
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1R5UfQfNMKI7hB6QXJfCUr/242e85f2b5287394871e916844359bd4/image7.png" />
          </figure><p>We ensure durability and handle failures and retries with the same technique we use for a <code>step.sleep()</code> that requires sleeping for days or months: a combination of <code>scheduler.wait()</code>, a method of the <a href="https://github.com/WICG/scheduling-apis"><u>upcoming WICG Scheduling API</u></a> that we already <a href="https://developers.cloudflare.com/workers/platform/changelog/historical-changelog/#2021-12-10"><u>support</u></a>, and <a href="https://developers.cloudflare.com/durable-objects/api/alarms/"><u>Durable Objects alarms</u></a>, which allow you to schedule the Durable Object to be woken up at a time in the future.</p><p>These two APIs allow us to overcome the lack of guarantees that a Durable Object runs forever, giving us complete control of its lifecycle. Since every state transition through userland code persists in the Engine’s strongly consistent SQLite, we track timestamps for when a step begins execution, for its attempts (if it needs retries), and for its completion.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6FSCXRt9fO4EaaBP7hLV8x/a59de27dfbe18f39addd4eb8240b9df9/image10.png" />
          </figure><p>This means that steps still pending when a Durable Object is <a href="https://developers.cloudflare.com/durable-objects/reference/in-memory-state/"><u>evicted</u></a> (perhaps due to a two-month-long timer) are re-run in the Engine’s next lifetime, with the cache from the previous lifetime hydrated. That next lifetime is triggered by an alarm set to the timestamp of the next expected state transition.</p>
    <div>
      <h2>Real-life workflow, step by step</h2>
      <a href="#real-life-workflow-step-by-step">
        
      </a>
    </div>
    <p>Let's walk through an example of a real-life application. You run an e-commerce website and would like to send email reminders to your customers for forgotten carts that haven't been checked out in a few days.</p><p>What would typically have to be a combination of a queue, a cron job, and querying a database table periodically can now simply be a Workflow that we start on every new cart:</p>
            <pre><code>import {
  WorkflowEntrypoint,
  WorkflowEvent,
  WorkflowStep,
} from "cloudflare:workers";
import { sendEmail } from "./legacy-email-provider";

type Params = {
  cartId: string;
};

type Env = {
  DB: D1Database;
};

export class Purchase extends WorkflowEntrypoint&lt;Env, Params&gt; {
  async run(
    event: WorkflowEvent&lt;Params&gt;,
    step: WorkflowStep
  ): Promise&lt;unknown&gt; {
    await step.sleep("wait for three days", "3 days");

    // Retrieve cart from D1
    const cart = await step.do("retrieve cart from database", async () =&gt; {
      const { results } = await this.env.DB.prepare(`SELECT * FROM cart WHERE id = ?`)
        .bind(event.payload.cartId)
        .all();
      return results[0];
    });

    if (!cart.checkedOut) {
      await step.do("send an email", async () =&gt; {
        await sendEmail("reminder", cart);
      });
    }
  }
}
</code></pre>
            <p>This works great. However, sometimes the <code>sendEmail</code> function fails due to an upstream provider erroring out. While <code>step.do</code> automatically retries with a reasonable default configuration, we can define our settings:</p>
            <pre><code>if (!cart.checkedOut) {
  await step.do(
    "send an email",
    {
      retries: {
        limit: 5,
        delay: "1 min",
        backoff: "exponential",
      },
    },
    async () =&gt; {
      await sendEmail("reminder", cart);
    }
  );
}
</code></pre>
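<p>For intuition about what that configuration means, here is a small helper that expands a retry policy into the wait before each attempt. The multipliers are a textbook reading of constant/linear/exponential backoff; the engine’s exact delay and jitter behavior is not specified in this post, so treat the numbers as illustrative:</p>

```typescript
type Backoff = "constant" | "linear" | "exponential";

// Multiplier applied to the base delay before retry n (1-based).
const factor: Record<Backoff, (n: number) => number> = {
  constant: () => 1,
  linear: (n) => n,
  exponential: (n) => 2 ** (n - 1),
};

// Expand a { limit, delay, backoff } retry config into per-retry delays.
function retryDelays(limit: number, delayMin: number, backoff: Backoff): number[] {
  return Array.from({ length: limit }, (_, i) => delayMin * factor[backoff](i + 1));
}

console.log(retryDelays(5, 1, "exponential")); // [ 1, 2, 4, 8, 16 ]
console.log(retryDelays(5, 1, "constant")); // [ 1, 1, 1, 1, 1 ]
```

Under this reading, a persistently failing <code>sendEmail</code> with the config above would be retried after roughly 1, 2, 4, 8, and 16 minutes before the step is marked as failed.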
            
    <div>
      <h3>Managing Workflows</h3>
      <a href="#managing-workflows">
        
      </a>
    </div>
    <p>Workflows allows us to create and manage workflows using four different interfaces:</p><ul><li><p>Using our REST HTTP API available on <a href="https://developers.cloudflare.com/api/"><u>Cloudflare’s API catalog</u></a></p></li><li><p>Using <a href="https://developers.cloudflare.com/workers/wrangler/"><u>Wrangler</u></a>, Cloudflare's developer platform command-line tool</p></li><li><p>Programmatically inside a Worker using <a href="https://developers.cloudflare.com/workers/runtime-apis/bindings/"><u>bindings</u></a></p></li><li><p>Using our Web UI in the <a href="https://dash.cloudflare.com/"><u>dashboard</u></a></p></li></ul><p>The HTTP API makes it easy to trigger new instances of workflows from any system, even if it isn’t on Cloudflare, or from the command line. For example:</p>
            <pre><code>curl --request POST \
  --url https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/workflows/purchase-workflow/instances/$CART_INSTANCE_ID \
  --header "Authorization: Bearer $ACCOUNT_TOKEN" \
  --header "Content-Type: application/json" \
  --data "{
    \"id\": \"$CART_INSTANCE_ID\",
    \"params\": {
      \"cartId\": \"f3bcc11b-2833-41fb-847f-1b19469139d1\"
    }
  }"</code></pre>
            <p>Wrangler goes one step further, giving us a friendlier set of commands for interacting with workflows, with nicely formatted output and no need to authenticate with tokens. Type <code>npx wrangler workflows</code> for help, or:</p>
            <pre><code>npx wrangler workflows trigger purchase-workflow '{ "cartId": "f3bcc11b-2833-41fb-847f-1b19469139d1" }'</code></pre>
            <p>Furthermore, Workflows has first-party support in wrangler, and you can test your instances locally. A Workflow is similar to a regular<a href="https://developers.cloudflare.com/workers/runtime-apis/bindings/service-bindings/rpc/"><u> WorkerEntrypoint</u></a> in your Worker, which means that <code>wrangler dev</code> just naturally works.</p>
            <pre><code>❯ npx wrangler dev

 ⛅️ wrangler 3.82.0
----------------------------

Your worker has access to the following bindings:
- Workflows:
  - CART_WORKFLOW: EcommerceCartWorkflow
⎔ Starting local server...
[wrangler:inf] Ready on http://localhost:8787
╭───────────────────────────────────────────────╮
│  [b] open a browser, [d] open devtools        │
╰───────────────────────────────────────────────╯
</code></pre>
            <p>Workflow APIs are also available as a Worker binding. You can interact with the platform programmatically from another Worker script in the same account without worrying about permissions or authentication. You can even have workflows that call and interact with other workflows.</p>
            <pre><code>import { WorkerEntrypoint } from "cloudflare:workers";

type Env = { DEMO_WORKFLOW: Workflow };
export default class extends WorkerEntrypoint&lt;Env&gt; {
  async fetch() {
    // Pass in a user defined name for this instance
    // In this case, we use the same as the cartId
    const instance = await this.env.DEMO_WORKFLOW.create({
      id: "f3bcc11b-2833-41fb-847f-1b19469139d1",
      params: {
        cartId: "f3bcc11b-2833-41fb-847f-1b19469139d1",
      },
    });
    return Response.json({ id: instance.id });
  }
  async scheduled() {
    // Restart errored out instances in a cron
    const instance = await this.env.DEMO_WORKFLOW.get(
      "f3bcc11b-2833-41fb-847f-1b19469139d1"
    );
    const status = await instance.status();
    if (status.error) {
      await instance.restart();
    }
  }
}</code></pre>
            
    <div>
      <h3>Observability </h3>
      <a href="#observability">
        
      </a>
    </div>
    <p>Having good <a href="https://www.cloudflare.com/learning/performance/what-is-observability/">observability</a> and data on often long-lived asynchronous tasks is crucial to understanding how we're doing under normal operation and, more importantly, when things go south and we need to troubleshoot problems, or when we are iterating on code changes.</p><p>We designed Workflows around the philosophy that there is no such thing as too much logging. You can get all the SQLite data for your workflow and its instances by calling the REST APIs. Here is the output of an instance:</p>
            <pre><code>{
  "success": true,
  "errors": [],
  "messages": [],
  "result": {
    "status": "running",
    "params": {},
    "trigger": { "source": "api" },
    "versionId": "ae042999-39ff-4d27-bbcd-22e03c7c4d02",
    "queued": "2024-10-21 17:15:09.350",
    "start": "2024-10-21 17:15:09.350",
    "end": null,
    "success": null,
    "steps": [
      {
        "name": "send email",
        "start": "2024-10-21 17:15:09.411",
        "end": "2024-10-21 17:15:09.678",
        "attempts": [
          {
            "start": "2024-10-21 17:15:09.411",
            "end": "2024-10-21 17:15:09.678",
            "success": true,
            "error": null
          }
        ],
        "config": {
          "retries": { "limit": 5, "delay": 1000, "backoff": "constant" },
          "timeout": "15 minutes"
        },
        "output": "celso@example.com",
        "success": true,
        "type": "step"
      },
      {
        "name": "sleep-1",
        "start": "2024-10-21 17:15:09.763",
        "end": "2024-10-21 17:17:09.763",
        "finished": false,
        "type": "sleep",
        "error": null
      }
    ],
    "error": null,
    "output": null
  }
}</code></pre>
            <p>As you can see, this is essentially a dump of the instance engine's SQLite in JSON. You have the <b>errors</b>, <b>messages</b>, current <b>status</b>, and what happened with <b>every step</b>, all timestamped to the millisecond.</p><p>It's one thing to get data about a specific workflow instance, but it's another to zoom out and look at aggregated statistics of all your workflows and instances over time. Workflows data is available through our <a href="https://developers.cloudflare.com/analytics/graphql-api/"><u>GraphQL Analytics API</u></a>, so you can query it in aggregate and generate valuable insights and reports. In this example, we ask for aggregated analytics about the wall time of all the instances of the “e-commerce-carts” workflow:</p>
            <pre><code>{
  viewer {
    accounts(filter: { accountTag: "febf0b1a15b0ec222a614a1f9ac0f0123" }) {
      wallTime: workflowsAdaptiveGroups(
        limit: 10000
        filter: {
          datetimeHour_geq: "2024-10-20T12:00:00.000Z"
          datetimeHour_leq: "2024-10-21T12:00:00.000Z"
          workflowName: "e-commerce-carts"
        }
        orderBy: [count_DESC]
      ) {
        count
        sum {
          wallTime
        }
        dimensions {
          date: datetimeHour
        }
      }
    }
  }
}
</code></pre>
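<p>To run that query from a script, you POST it as JSON to Cloudflare’s GraphQL endpoint (<code>https://api.cloudflare.com/client/v4/graphql</code>) with an API token. Here is a minimal sketch; <code>buildGraphQLRequest</code> is a hypothetical helper, and the account tag, dates, and workflow name are the placeholder values from the query above:</p>

```typescript
// The wall-time query from above, embedded as a string.
const wallTimeQuery = `{
  viewer {
    accounts(filter: { accountTag: "febf0b1a15b0ec222a614a1f9ac0f0123" }) {
      wallTime: workflowsAdaptiveGroups(
        limit: 10000
        filter: {
          datetimeHour_geq: "2024-10-20T12:00:00.000Z"
          datetimeHour_leq: "2024-10-21T12:00:00.000Z"
          workflowName: "e-commerce-carts"
        }
        orderBy: [count_DESC]
      ) {
        count
        sum { wallTime }
        dimensions { date: datetimeHour }
      }
    }
  }
}`;

// Hypothetical helper: wrap the query in the POST request the API expects.
function buildGraphQLRequest(token: string): {
  method: string;
  headers: Record<string, string>;
  body: string;
} {
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ query: wallTimeQuery }),
  };
}

// Usage (requires a real token):
// const res = await fetch("https://api.cloudflare.com/client/v4/graphql",
//   buildGraphQLRequest(API_TOKEN));
```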
            <p>For convenience, you can of course also use Wrangler to describe a workflow or an instance and get an instant, nicely formatted response:</p>
            <pre><code>sid ~ npx wrangler workflows instances describe purchase-workflow latest

 ⛅️ wrangler 3.80.4

Workflow Name:         purchase-workflow
Instance Id:           d4280218-7756-41d2-bccd-8d647b82d7ce
Version Id:            0c07dbc4-aaf3-44a9-9fd0-29437ed11ff6
Status:                ✅ Completed
Trigger:               🌎 API
Queued:                14/10/2024, 16:25:17
Success:               ✅ Yes
Start:                 14/10/2024, 16:25:17
End:                   14/10/2024, 16:26:17
Duration:              1 minute
Last Successful Step:  wait for three days
Output:                false
Steps:

  Name:      wait for three days
  Type:      💤 Sleeping
  Start:     14/10/2024, 16:25:17
  End:       17/10/2024, 16:25:17
  Duration:  3 day</code></pre>
            <p>And finally, we worked really hard to get you the best dashboard UI experience when navigating Workflows data.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/64XUtBwldkSXUTJ5xEJBgo/2aa861583c8c56c19194cb0869a15a2a/image8.png" />
          </figure>
    <div>
      <h2>So, how much does it cost?</h2>
      <a href="#so-how-much-does-it-cost">
        
      </a>
    </div>
    <p>It’d be painful if we introduced a powerful new way to build Workers applications but made it cost prohibitive.</p><p>Workflows is <a href="https://developers.cloudflare.com/workers/platform/pricing/#workers"><u>priced</u></a> just like Cloudflare Workers, where we <a href="https://blog.cloudflare.com/workers-pricing-scale-to-zero/"><u>introduced CPU-based pricing</u></a>: only on active CPU time and requests, not duration (aka: wall time).</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/11WroT4xt0zPj6bsou4u3X/8f2775569f280107345322cb97603b3e/image4.png" />
          </figure><p><sup><i>Workers Standard pricing model</i></sup></p><p>This is especially advantageous when building the long-running, multi-step applications that Workflows enables: if you had to pay while your Workflow was sleeping, waiting on an event, or making a network call to an API, writing the “right” code would be at odds with writing affordable code.</p><p>There’s also no need to keep a Kubernetes cluster or a group of virtual machines running (and burning a hole in your wallet): we manage the infrastructure, and you only pay for the compute your Workflows consume.   </p>
    <div>
      <h2>What’s next?</h2>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>Today, after months of developing the platform, we are announcing the open beta program, and we couldn't be more excited to see how you will be using Workflows. Looking forward, we want to do things like triggering instances from queue messages and have other ideas, but at the same time, we are certain that your feedback will help us shape the roadmap ahead.</p><p>We hope that this blog post gets you thinking about how to use Workflows for your next application, but also that it inspires you on what you can build on top of Workers. Workflows as a platform is entirely built on top of Workers, its resources, and APIs. Anyone can do it, too.</p><p>To chat with the team and other developers building on Workflows, join the #workflows-beta channel on the<a href="https://discord.cloudflare.com/"> <u>Cloudflare Developer Discord</u></a>, and keep an eye on the<a href="https://developers.cloudflare.com/workflows/reference/changelog/"> <u>Workflows changelog</u></a> during the beta. Otherwise,<a href="https://developers.cloudflare.com/workflows/get-started/guide/"> visit the Workflows tutorial</a> to get started.</p><p>If you're an engineer, <a href="https://www.cloudflare.com/en-gb/careers/jobs/"><u>look for opportunities</u></a> to work with us and help us improve Workflows or build other products.</p> ]]></content:encoded>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Durable Objects]]></category>
            <category><![CDATA[Workflows]]></category>
            <guid isPermaLink="false">1YRfz7LKvAGrEMbRGhNrFP</guid>
            <dc:creator>Sid Chatterjee</dc:creator>
            <dc:creator>Matt Silverlock</dc:creator>
            <dc:creator>Celso Martinho</dc:creator>
        </item>
    </channel>
</rss>