
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
    <channel>
        <title><![CDATA[ The Cloudflare Blog ]]></title>
        <description><![CDATA[ Get the latest news on how products at Cloudflare are built, technologies used, and join the teams helping to build a better Internet. ]]></description>
        <link>https://blog.cloudflare.com</link>
        <atom:link href="https://blog.cloudflare.com/" rel="self" type="application/rss+xml"/>
        <language>en-us</language>
        <image>
            <url>https://blog.cloudflare.com/favicon.png</url>
            <title>The Cloudflare Blog</title>
            <link>https://blog.cloudflare.com</link>
        </image>
        <lastBuildDate>Sat, 04 Apr 2026 09:30:18 GMT</lastBuildDate>
        <item>
            <title><![CDATA[Introducing Markdown for Agents]]></title>
            <link>https://blog.cloudflare.com/markdown-for-agents/</link>
            <pubDate>Thu, 12 Feb 2026 14:03:00 GMT</pubDate>
            <description><![CDATA[ The way content is discovered online is shifting, from traditional search engines to AI agents that need structured data from a Web built for humans. It’s time to consider not just human visitors, but start to treat agents as first-class citizens. Markdown for Agents automatically converts any HTML page requested from our network to markdown. ]]></description>
            <content:encoded><![CDATA[ <p>The way content and businesses are discovered online is changing rapidly. In the past, traffic originated from traditional search engines, and SEO determined who got found first. Now the traffic is increasingly coming from AI crawlers and agents that demand structured data within the often-unstructured Web that was built for humans.</p><p>As a business, staying ahead now means considering not just human visitors and traditional SEO wisdom, but treating agents as first-class citizens.</p>
    <div>
      <h2>Why markdown is important</h2>
      <a href="#why-markdown-is-important">
        
      </a>
    </div>
    <p>Feeding raw HTML to an AI is like paying by the word to read packaging instead of the letter inside. A simple <code>## About Us</code> on a page in markdown costs roughly 3 tokens; its HTML equivalent – <code>&lt;h2 class="section-title" id="about"&gt;About Us&lt;/h2&gt;</code> – burns 12-15, and that's before you account for the <code>&lt;div&gt;</code> wrappers, nav bars, and script tags that pad every real web page and have zero semantic value.</p><p>This blog post you’re reading takes 16,180 tokens in HTML and 3,150 tokens when converted to markdown. <b>That’s an 80% reduction in token usage</b>.</p><p><a href="https://en.wikipedia.org/wiki/Markdown"><u>Markdown</u></a> has quickly become the <i>lingua franca</i> for agents and AI systems as a whole. The format’s explicit structure makes it ideal for AI processing, ultimately producing better results while minimizing token waste.</p><p>The problem is that the Web is made of HTML, not markdown, and page weight has been <a href="https://almanac.httparchive.org/en/2025/page-weight#page-weight-over-time"><u>steadily increasing</u></a> over the years, making pages hard to parse. An agent’s first job is to filter out the non-essential elements and get to the relevant content.</p><p>The conversion of HTML to markdown is now a common step for any AI pipeline. Still, this process is far from ideal: it wastes computation, adds costs and processing complexity, and above all, it may not be how the content creator intended their content to be used in the first place.</p><p>What if AI agents could bypass the complexities of intent analysis and document conversion, and instead receive structured markdown directly from the source?</p>
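<p>To make the arithmetic concrete, here's a rough sketch (our own illustration, not part of the product) using the common rule of thumb of about four characters per English token:</p>

```typescript
// Rough token estimate: ~4 characters per token is a common rule of thumb
// for English text. This is an approximation for illustration, not a real
// tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

const markdown = "## About Us";
const html = '<h2 class="section-title" id="about">About Us</h2>';

console.log(estimateTokens(markdown)); // 3
console.log(estimateTokens(html));     // 13
```

<p>Real tokenizers vary by model, but the ratio between the two versions stays roughly the same.</p>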
    <div>
      <h2>Convert HTML to markdown, automatically</h2>
      <a href="#convert-html-to-markdown-automatically">
        
      </a>
    </div>
    <p>Cloudflare's network now supports real-time content conversion at the source for <a href="https://developers.cloudflare.com/fundamentals/reference/markdown-for-agents/"><u>enabled zones</u></a>, using <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Content_negotiation"><u>content negotiation</u></a> headers. Now when AI systems request pages from any website that uses Cloudflare and has Markdown for Agents enabled, they can express a preference for <code>text/markdown</code> in the request. Our network will automatically and efficiently convert the HTML to markdown, when possible, on the fly.</p><p>Here’s how it works. To fetch the markdown version of any page from a zone with Markdown for Agents enabled, the client needs to add the <b>Accept</b> negotiation header with <code>text/markdown</code> as one of the options. Cloudflare will detect this, fetch the original HTML version from the origin, and convert it to markdown before serving it to the client.</p><p>Here's a curl example with the Accept negotiation header requesting a page from our developer documentation:</p>
            <pre><code>curl https://developers.cloudflare.com/fundamentals/reference/markdown-for-agents/ \
  -H "Accept: text/markdown"
</code></pre>
            <p>Or if you’re building an AI Agent using Workers, you can use TypeScript:</p>
            <pre><code>const r = await fetch(
  `https://developers.cloudflare.com/fundamentals/reference/markdown-for-agents/`,
  {
    headers: {
      Accept: "text/markdown, text/html",
    },
  },
);
const tokenCount = r.headers.get("x-markdown-tokens");
const markdown = await r.text();
</code></pre>
            <p>We already see some of the most popular coding agents today – like Claude Code and OpenCode – send these Accept headers with their requests for content. The response to such a request is formatted in markdown. It's that simple.</p>
            <pre><code>HTTP/2 200
date: Wed, 11 Feb 2026 11:44:48 GMT
content-type: text/markdown; charset=utf-8
content-length: 2899
vary: accept
x-markdown-tokens: 725
content-signal: ai-train=yes, search=yes, ai-input=yes

---
title: Markdown for Agents · Cloudflare Agents docs
---

## What is Markdown for Agents

The ability to parse and convert HTML to Markdown has become foundational for AI.
...
</code></pre>
            <p>Note that we include an <code>x-markdown-tokens</code> header with the converted response that indicates the estimated number of tokens in the markdown document. You can use this value in your flow, for example to calculate the size of a context window or to decide on your chunking strategy.</p><p>Here’s a diagram of how it works:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6Zw1Q5kBBqTrouN1362H5I/3080d74a2a971be1f1e7e0ba79611998/BLOG-3162_2.png" />
          </figure>
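<p>As a sketch of how an agent might act on the <code>x-markdown-tokens</code> header (our own illustration, not an official API; the token budgets below are hypothetical values):</p>

```typescript
// Sketch: use the estimated token count from the x-markdown-tokens response
// header to decide how many chunks a document needs for a given context
// budget. The budget values are assumptions for illustration.
function planChunks(markdownTokens: number, contextBudget: number): number {
  return Math.max(1, Math.ceil(markdownTokens / contextBudget));
}

// The example response above reported x-markdown-tokens: 725.
console.log(planChunks(725, 500));  // 2 chunks with a 500-token budget
console.log(planChunks(725, 8000)); // 1 chunk, fits in a single pass
```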
    <div>
      <h3>Content Signals Policy</h3>
      <a href="#content-signals-policy">
        
      </a>
    </div>
    <p>During our last Birthday Week, Cloudflare <a href="https://blog.cloudflare.com/content-signals-policy/"><u>announced</u></a> Content Signals — <a href="http://contentsignals.org"><u>a framework</u></a> that allows anyone to express their preferences for how their content can be used after it has been accessed.</p><p>When you return markdown, you still want to express how your content may be used by the agent or AI crawler. That’s why Markdown for Agents converted responses include the <code>Content-Signal: ai-train=yes, search=yes, ai-input=yes</code> header, indicating that the content can be used for AI training, search results, and AI input, which includes agentic use. Markdown for Agents will provide options to define custom Content Signal policies in the future.</p><p>Check our dedicated <a href="https://contentsignals.org/"><u>Content Signals</u></a> page for more information on this framework.</p>
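<p>An agent that wants to honor these preferences could parse the header with a few lines of TypeScript (a sketch of our own, not an official parser):</p>

```typescript
// Sketch: parse a Content-Signal header such as
// "ai-train=yes, search=yes, ai-input=yes" into a simple lookup table.
function parseContentSignal(header: string): Record<string, string> {
  const signals: Record<string, string> = {};
  for (const part of header.split(",")) {
    const [key, value] = part.trim().split("=");
    if (key && value) signals[key.trim()] = value.trim();
  }
  return signals;
}

const signals = parseContentSignal("ai-train=yes, search=yes, ai-input=yes");
console.log(signals["ai-input"]); // "yes": agentic use is allowed
```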
    <div>
      <h3>Try it with the Cloudflare Blog &amp; Developer Documentation </h3>
      <a href="#try-it-with-the-cloudflare-blog-developer-documentation">
        
      </a>
    </div>
    <p>We enabled this feature in our <a href="https://developers.cloudflare.com/"><u>Developer Documentation</u></a> and our <a href="https://blog.cloudflare.com/"><u>Blog</u></a>, inviting all AI crawlers and agents to consume our content using markdown instead of HTML.</p><p>Try it out now by requesting this blog with <code>Accept: text/markdown</code>.</p>
            <pre><code>curl https://blog.cloudflare.com/markdown-for-agents/ \
  -H "Accept: text/markdown"</code></pre>
            <p>The result is:</p>
            <pre><code>---
description: The way content is discovered online is shifting, from traditional search engines to AI agents that need structured data from a Web built for humans. It’s time to consider not just human visitors, but start to treat agents as first-class citizens. Markdown for Agents automatically converts any HTML page requested from our network to markdown.
title: Introducing Markdown for Agents
image: https://blog.cloudflare.com/images/markdown-for-agents.png
---

# Introducing Markdown for Agents

The way content and businesses are discovered online is changing rapidly. In the past, traffic originated from traditional search engines and SEO determined who got found first. Now the traffic is increasingly coming from AI crawlers and agents that demand structured data within the often-unstructured Web that was built for humans.

...</code></pre>
            
    <div>
      <h3>Other ways to convert to Markdown</h3>
      <a href="#other-ways-to-convert-to-markdown">
        
      </a>
    </div>
    <p>If you’re building AI systems that need to convert arbitrary documents from outside Cloudflare, or the content source doesn’t have Markdown for Agents available, we provide other ways to convert documents to markdown for your applications:</p><ul><li><p>Workers AI <a href="https://developers.cloudflare.com/workers-ai/features/markdown-conversion/"><u>AI.toMarkdown()</u></a> supports multiple document types, not just HTML, as well as summarization.</p></li><li><p>The Browser Rendering <a href="https://developers.cloudflare.com/browser-rendering/rest-api/markdown-endpoint/"><u>/markdown</u></a> REST API supports markdown conversion when you need to render a dynamic page or application in a real browser before converting it.</p></li></ul>
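<p>For example, the Browser Rendering <code>/markdown</code> endpoint can be called with a simple REST request. The sketch below builds such a request in TypeScript; the account ID and API token are placeholders, and you should check the endpoint documentation for the authoritative request shape:</p>

```typescript
// Sketch: build a request for the Browser Rendering /markdown REST endpoint.
// ACCOUNT_ID and API_TOKEN are placeholders, not real credentials.
function buildMarkdownRequest(accountId: string, apiToken: string, targetUrl: string) {
  return {
    endpoint: `https://api.cloudflare.com/client/v4/accounts/${accountId}/browser-rendering/markdown`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ url: targetUrl }),
    },
  };
}

const req = buildMarkdownRequest("ACCOUNT_ID", "API_TOKEN", "https://example.com/");
// fetch(req.endpoint, req.init) would return the page converted to markdown.
```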
    <div>
      <h2>Tracking markdown usage</h2>
      <a href="#tracking-markdown-usage">
        
      </a>
    </div>
    <p>Anticipating a shift in how AI systems browse the Web, Cloudflare Radar now includes content type insights for AI bot and crawler traffic, both globally on the <a href="https://radar.cloudflare.com/ai-insights#content-type"><u>AI Insights</u></a> page and in the <a href="https://radar.cloudflare.com/bots/directory/gptbot"><u>individual bot</u></a> information pages.</p><p>The new <code>content_type</code> dimension and filter shows the distribution of content types returned to AI agents and crawlers, grouped by <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/MIME_types"><u>MIME type</u></a> category.  </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7vQzvzHsTLPXGhoQK0Xbr5/183129a8947990bc4ee5bb5ca7ba71b5/BLOG-3162_3.png" />
          </figure><p>You can also see the requests for markdown filtered by a specific agent or crawler. Here are the requests that return markdown to OAI-Searchbot, the crawler used by OpenAI to power ChatGPT’s search: </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7Ah99DWLxnYjadW6xJhAXg/afef4a29ae504d4fe69df4f9823dd103/BLOG-3162_4.png" />
          </figure><p>This new data will allow us to track the evolution of how AI bots, crawlers, and agents are consuming Web content over time. As always, everything on Radar is freely accessible via the <a href="https://developers.cloudflare.com/api/resources/radar/"><u>public APIs</u></a> and the <a href="https://radar.cloudflare.com/explorer?dataSet=ai.bots&amp;groupBy=content_type&amp;filters=userAgent%253DGPTBot&amp;timeCompare=1"><u>Data Explorer</u></a>. </p>
    <div>
      <h2>Start using today</h2>
      <a href="#start-using-today">
        
      </a>
    </div>
    <p>To enable Markdown for Agents for your zone, log into the Cloudflare <a href="https://dash.cloudflare.com/"><u>dashboard</u></a>, select your account and zone, look for Quick Actions, and toggle Markdown for Agents on. This feature is available today in beta at no cost for Pro, Business, and Enterprise plans, as well as for SSL for SaaS customers.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1UqzmHrNa1UdCCI6eXIfmn/3da0ff51dd94219d8af87c172d83fc72/BLOG-3162_5.png" />
          </figure><p>You can find more information about Markdown for Agents in our <a href="https://developers.cloudflare.com/fundamentals/reference/markdown-for-agents/">Developer Docs</a>. We welcome your feedback as we continue to refine and enhance this feature. We’re curious to see how AI crawlers and agents navigate and adapt to the unstructured nature of the Web as it evolves.</p> ]]></content:encoded>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Agents]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <guid isPermaLink="false">5uEb99xvnHVk3QfN0KMjb6</guid>
            <dc:creator>Celso Martinho</dc:creator>
            <dc:creator>Will Allen</dc:creator>
        </item>
        <item>
            <title><![CDATA[Introducing Moltworker: a self-hosted personal AI agent, minus the minis]]></title>
            <link>https://blog.cloudflare.com/moltworker-self-hosted-ai-agent/</link>
            <pubDate>Thu, 29 Jan 2026 14:00:00 GMT</pubDate>
            <description><![CDATA[ Moltworker is a middleware Worker and a set of adapted scripts that let you run OpenClaw (formerly Moltbot, formerly Clawdbot) on Cloudflare's Sandbox SDK and our Developer Platform APIs. So you can self-host an AI personal assistant — without any new hardware. ]]></description>
            <content:encoded><![CDATA[ <p><i>Editorial note: As of January 30, 2026, Moltbot has been </i><a href="https://openclaw.ai/blog/introducing-openclaw"><i><u>renamed</u></i></a><i> to OpenClaw.</i></p><p>The Internet woke up this week to a flood of people <a href="https://x.com/AlexFinn/status/2015133627043270750"><u>buying Mac minis</u></a> to run <a href="https://github.com/moltbot/moltbot"><u>Moltbot</u></a> (formerly Clawdbot), an open-source, self-hosted AI agent designed to act as a personal assistant. Moltbot runs in the background on a user's own hardware, has a sizable and growing list of integrations for chat applications, AI models, and other popular tools, and can be controlled remotely. Moltbot can help you with your finances and your social media, and organize your day — all through your favorite messaging app.</p><p>But what if you don’t want to buy new dedicated hardware? And what if you could still run your Moltbot efficiently and securely online? Meet <a href="https://github.com/cloudflare/moltworker"><u>Moltworker</u></a>, a middleware Worker and a set of adapted scripts that let you run Moltbot on Cloudflare's Sandbox SDK and our Developer Platform APIs.</p>
    <div>
      <h2>A personal assistant on Cloudflare — how does that work? </h2>
      <a href="#a-personal-assistant-on-cloudflare-how-does-that-work">
        
      </a>
    </div>
    <p>Cloudflare Workers has never been <a href="https://developers.cloudflare.com/workers/runtime-apis/nodejs/"><u>as compatible</u></a> with Node.js as it is now. Where in the past we had to mock APIs to get some packages running, now those APIs are supported natively by the Workers Runtime.</p><p>This has changed how we can build tools on Cloudflare Workers. When we first implemented <a href="https://developers.cloudflare.com/browser-rendering/playwright/"><u>Playwright</u></a>, a popular framework for web testing and automation that runs on <a href="https://developers.cloudflare.com/browser-rendering/"><u>Browser Rendering</u></a>, we had to rely on <a href="https://www.npmjs.com/package/memfs"><u>memfs</u></a>. This was bad because not only is memfs a hack and an external dependency, but it also forced us to drift away from the official Playwright codebase. Thankfully, with more Node.js compatibility, we were able to start using <a href="https://github.com/cloudflare/playwright/pull/62/changes"><u>node:fs natively</u></a>, reducing complexity and improving maintainability, which makes upgrading to the latest versions of Playwright easy.</p><p>The list of Node.js APIs we support natively keeps growing. The blog post “<a href="https://blog.cloudflare.com/nodejs-workers-2025/"><u>A year of improving Node.js compatibility in Cloudflare Workers</u></a>” provides an overview of where we are and what we’re doing.</p><p>We measure this progress, too. We recently ran an experiment where we took the 1,000 most popular npm packages, installed them, and let AI loose trying to run them in Cloudflare Workers, <a href="https://ghuntley.com/ralph/"><u>Ralph Wiggum as a "software engineer"</u></a> style, and the results were surprisingly good. Excluding the packages that don’t apply because they are build tools, CLI tools, or browser-only, only 15 packages genuinely didn’t work. <b>That's 1.5%</b>.</p><p>Here’s a graphic of our Node.js API support over time:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5GhwKJq2A2wG79I3NdhhDl/e462c30daf46b1b36d3f06bff479596b/image9.png" />
          </figure><p>We put together a page with the results of our internal experiment on npm package support <a href="https://worksonworkers.southpolesteve.workers.dev/"><u>here</u></a>, so you can check for yourself.</p><p>Moltbot doesn’t necessarily require a lot of Workers Node.js compatibility, because most of the code runs in a container anyway, but we thought it was important to highlight how far we’ve come in supporting so many packages using native APIs. This matters because when starting a new AI agent application from scratch, we can actually run a lot of the logic in Workers, closer to the user.</p><p>The other important part of the story is that the list of <a href="https://developers.cloudflare.com/directory/?product-group=Developer+platform"><u>products and APIs</u></a> on our Developer Platform has grown to the point where anyone can <a href="https://www.cloudflare.com/developer-platform/solutions/hosting/">build and run any kind of application</a> — even the most complex and demanding ones — on Cloudflare. And once launched, every application running on our Developer Platform immediately benefits from our secure and scalable global network.</p><p>Those products and services gave us the ingredients we needed to get started. First, we now have <a href="https://sandbox.cloudflare.com/"><u>Sandboxes</u></a>, where you can run untrusted code securely in isolated environments, providing a place to run the service. Next, we have <a href="https://developers.cloudflare.com/browser-rendering/"><u>Browser Rendering</u></a>, where you can programmatically control and interact with headless browser instances. And finally, <a href="https://developers.cloudflare.com/r2/"><u>R2</u></a>, where you can store objects persistently. With those building blocks available, we could begin work on adapting Moltbot.</p>
    <div>
      <h2>How we adapted Moltbot to run on us</h2>
      <a href="#how-we-adapted-moltbot-to-run-on-us">
        
      </a>
    </div>
    <p>Moltbot on Workers, or Moltworker, is an entrypoint Worker that acts as an API router and a proxy between our APIs and the isolated environment, with both protected by Cloudflare Access. It also provides an administration UI and connects to the Sandbox container where the standard Moltbot Gateway runtime and its integrations run, using R2 for persistent storage.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3OD2oHgy5ilHpQO2GJvcLU/836a55b67a626d2cd378a654ad47901d/newdiagram.png" />
          </figure><p><sup>High-level architecture diagram of Moltworker.</sup></p><p>Let's dive in more.</p>
    <div>
      <h3>AI Gateway</h3>
      <a href="#ai-gateway">
        
      </a>
    </div>
    <p>Cloudflare AI Gateway acts as a proxy between your AI applications and any popular <a href="https://developers.cloudflare.com/ai-gateway/usage/providers/"><u>AI provider</u></a>, and gives our customers centralized visibility and control over the requests going through.</p><p>Recently we announced support for <a href="https://developers.cloudflare.com/changelog/2025-08-25-secrets-store-ai-gateway/"><u>Bring Your Own Key (BYOK)</u></a>: instead of passing your provider secrets in plain text with every request, we centrally manage the secrets for you and use them with your gateway configuration.</p><p>An even better option, where you don’t have to manage AI providers' secrets at all, is <a href="https://developers.cloudflare.com/ai-gateway/features/unified-billing/"><u>Unified Billing</u></a>. In this case you top up your account with credits and use AI Gateway with any of the supported providers directly; Cloudflare gets charged, and we deduct credits from your account.</p><p>To make Moltbot use AI Gateway, we first create a new gateway instance, enable the Anthropic provider for it, and either add our Claude key or purchase credits for Unified Billing. Then all we need to do is set the <code>ANTHROPIC_BASE_URL</code> environment variable so Moltbot uses the AI Gateway endpoint. That’s it, no code changes necessary.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/cMWRXgHR0mFLc5kp74nYk/a47fa09bdbb6acb3deb60fb16537945d/image11.png" />
          </figure><p>Once Moltbot starts using AI Gateway, you’ll have full visibility on costs and have access to logs and analytics that will help you understand how your AI agent is using the AI providers.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5GOrNdgtdwMcU4bE8oLE19/6bc29bcac643125f5332a8ffba9d1322/image1.png" />
          </figure><p>Note that Anthropic is one option; Moltbot supports <a href="https://www.molt.bot/integrations"><u>other</u></a> AI providers, and so does <a href="https://developers.cloudflare.com/ai-gateway/usage/providers/"><u>AI Gateway</u></a>. The advantage of using AI Gateway is that if a better model comes along from any provider, you don’t have to swap keys in your AI agent configuration and redeploy — you can simply switch the model in your gateway configuration. Better still, you can specify model or provider <a href="https://developers.cloudflare.com/ai-gateway/configuration/fallbacks/"><u>fallbacks</u></a> to handle request failures and ensure reliability.</p>
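<p>For reference, AI Gateway endpoints follow the pattern <code>https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_name}/{provider}</code>, so pointing Moltbot at the gateway is a one-line helper (the account and gateway names below are placeholders):</p>

```typescript
// Sketch: build the AI Gateway base URL for the Anthropic provider.
// This is the value Moltbot's ANTHROPIC_BASE_URL environment variable
// should point at; the account and gateway names are placeholders.
function anthropicGatewayUrl(accountId: string, gatewayName: string): string {
  return `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayName}/anthropic`;
}

console.log(anthropicGatewayUrl("ACCOUNT_ID", "moltworker-gateway"));
// https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/moltworker-gateway/anthropic
```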
    <div>
      <h3>Sandboxes</h3>
      <a href="#sandboxes">
        
      </a>
    </div>
    <p>Last year we anticipated the growing need for AI agents to run untrusted code securely in isolated environments, and we <a href="https://developers.cloudflare.com/changelog/2025-06-24-announcing-sandboxes/"><u>announced</u></a> the <a href="https://developers.cloudflare.com/sandbox/"><u>Sandbox SDK</u></a>. This SDK is built on top of <a href="https://developers.cloudflare.com/containers/"><u>Cloudflare Containers</u></a>, but it provides a simple API for executing commands, managing files, running background processes, and exposing services — all from your Workers applications.</p><p>In short, instead of having to deal with the lower-level Container APIs, the Sandbox SDK gives you developer-friendly APIs for secure code execution and handles the complexity of container lifecycle, networking, file systems, and process management — letting you focus on building your application logic with just a few lines of TypeScript. Here’s an example:</p>
            <pre><code>import { getSandbox } from '@cloudflare/sandbox';
export { Sandbox } from '@cloudflare/sandbox';

export default {
  async fetch(request: Request, env: Env): Promise&lt;Response&gt; {
    const sandbox = getSandbox(env.Sandbox, 'user-123');

    // Create a project structure
    await sandbox.mkdir('/workspace/project/src', { recursive: true });

    // Check node version
    const version = await sandbox.exec('node -v');

    // Run some python code
    const ctx = await sandbox.createCodeContext({ language: 'python' });
    await sandbox.runCode('import math; radius = 5', { context: ctx });
    const result = await sandbox.runCode('math.pi * radius ** 2', { context: ctx });

    return Response.json({ version, result });
  }
};</code></pre>
            <p>This fits like a glove for Moltbot. Instead of running Docker on your local Mac mini, we run Docker on Containers, use the Sandbox SDK to issue commands into the isolated environment, and use callbacks to our entrypoint Worker, effectively establishing a two-way communication channel between the two systems.</p>
    <div>
      <h3>R2 for persistent storage</h3>
      <a href="#r2-for-persistent-storage">
        
      </a>
    </div>
    <p>The good thing about running things on your local computer or a VPS is that you get persistent storage for free. Containers, however, are inherently <a href="https://developers.cloudflare.com/containers/platform-details/architecture/"><u>ephemeral</u></a>, meaning data generated within them is lost upon deletion. Fear not, though — the Sandbox SDK provides <code>sandbox.mountBucket()</code>, which you can use to automatically, well, mount your R2 bucket as a filesystem partition when the container starts.</p><p>Once we have a local directory that is guaranteed to survive the container lifecycle, Moltbot can use it to store session memory files, conversations, and other assets that need to persist.</p>
    <div>
      <h3>Browser Rendering for browser automation</h3>
      <a href="#browser-rendering-for-browser-automation">
        
      </a>
    </div>
    <p>AI agents rely heavily on browsing the sometimes not-so-structured web. Moltbot utilizes dedicated Chromium instances to perform actions, navigate the web, fill out forms, take snapshots, and handle tasks that require a web browser. Sure, we can run Chromium on Sandboxes too, but what if we could simplify and use an API instead?</p><p>With Cloudflare’s <a href="https://developers.cloudflare.com/browser-rendering/"><u>Browser Rendering</u></a>, you can programmatically control and interact with headless browser instances running at scale in our edge network. We support <a href="https://developers.cloudflare.com/browser-rendering/puppeteer/"><u>Puppeteer</u></a>, <a href="https://developers.cloudflare.com/browser-rendering/stagehand/"><u>Stagehand</u></a>, <a href="https://developers.cloudflare.com/browser-rendering/playwright/"><u>Playwright</u></a> and other popular packages so that developers can onboard with minimal code changes. We even support <a href="https://developers.cloudflare.com/browser-rendering/playwright/playwright-mcp/"><u>MCP</u></a> for AI.</p><p>In order to get Browser Rendering to work with Moltbot we do two things:</p><ul><li><p>First we create a <a href="https://github.com/cloudflare/moltworker/blob/main/src/routes/cdp.ts"><u>thin CDP proxy</u></a> (<a href="https://chromedevtools.github.io/devtools-protocol/"><u>CDP</u></a> is the protocol that allows instrumenting Chromium-based browsers) from the Sandbox container to the Moltbot Worker, back to Browser Rendering using the Puppeteer APIs.</p></li><li><p>Then we inject a <a href="https://github.com/cloudflare/moltworker/pull/20"><u>Browser Rendering skill</u></a> into the runtime when the Sandbox starts.</p></li></ul>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1ZvQa7vS1T9Mm3nywqarQZ/9dec3d8d06870ee575a519440d34c499/image12.png" />
          </figure><p>From the Moltbot runtime perspective, it has a local CDP port it can connect to and perform browser tasks.</p>
    <div>
      <h3>Zero Trust Access for authentication policies</h3>
      <a href="#zero-trust-access-for-authentication-policies">
        
      </a>
    </div>
    <p>Next up we want to protect our APIs and Admin UI from unauthorized access. Doing authentication from scratch is hard, and is typically the kind of wheel you don’t want to reinvent or have to deal with. Zero Trust Access makes it incredibly easy to protect your application by defining specific policies and login methods for the endpoints. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1MDXXjbMs4PViN3kp9iFBY/a3095f07c986594d0c07d0276dbf22cc/image3.png" />
          </figure><p><sup>Zero Trust Access Login methods configuration for the Moltworker application.</sup></p><p>Once the endpoints are protected, Cloudflare will handle authentication for you and automatically include a <a href="https://developers.cloudflare.com/cloudflare-one/access-controls/applications/http-apps/authorization-cookie/application-token/"><u>JWT token</u></a> with every request to your origin endpoints. You can then <a href="https://developers.cloudflare.com/cloudflare-one/access-controls/applications/http-apps/authorization-cookie/validating-json/"><u>validate</u></a> that JWT for extra protection, to ensure that the request came from Access and not a malicious third party.</p><p>As with AI Gateway, once all your APIs are behind Access, you get great observability into who your users are and what they are doing with your Moltbot instance.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3BV4eqxKPXTiq18vvVpmZh/e034b7e7ea637a00c73c2ebe4d1400aa/image8.png" />
          </figure>
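<p>Inspecting the claims of the Access JWT can be sketched as follows. Note that this only decodes the payload; real validation must verify the token's signature against your team's Access public keys, as described in the validation docs linked above:</p>

```typescript
// Sketch: decode the payload of an Access JWT to inspect its claims.
// Decoding alone is NOT validation; the signature must still be verified
// against your Access team's public keys.
function decodeJwtPayload(token: string): Record<string, any> {
  const payload = token.split(".")[1];
  return JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
}

// A fabricated token (header.payload.signature) for illustration:
const body = Buffer.from(JSON.stringify({ email: "user@example.com" })).toString("base64url");
const claims = decodeJwtPayload(`eyJhbGciOiJSUzI1NiJ9.${body}.sig`);
console.log(claims.email); // "user@example.com"
```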
    <div>
      <h2>Moltworker in action</h2>
      <a href="#moltworker-in-action">
        
      </a>
    </div>
    <p>Demo time. We’ve put up a Slack instance where we can play with our own instance of Moltbot on Workers. Here are some of the fun things we’ve done with it.</p><p>We hate bad news.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4FxN935AgINZ8953WSswKB/e52d3eb268aa0732c5e6aa64a8e2adba/image6.png" />
          </figure><p>Here’s a chat session where we ask Moltbot to find the shortest route between Cloudflare in London and Cloudflare in Lisbon using Google Maps and post a screenshot in a Slack channel. It goes through a sequence of steps using Browser Rendering to navigate Google Maps and does a pretty good job of it. Also look at Moltbot’s memory in action when we ask it a second time.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1phWt3cVUwxe9tvCYpuAW3/97f456094ede6ca8fb55bf0dddf65d5b/image10.png" />
          </figure><p>We’re in the mood for some Asian food today, so let’s put Moltbot to work.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6nJY7GOCopGnMy4IY7KMcf/0d57794df524780c3f4b27e65c968e19/image5.png" />
          </figure><p>We eat with our eyes too.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5BzB9pqJhuevRbOSJloeG0/23c2905f0c12c1e7f104aa28fcc1f595/image7.png" />
          </figure><p>Let’s get more creative and ask Moltbot to create a video where it browses our developer documentation. As you can see, it downloads and runs ffmpeg to generate the video out of the frames it captured in the browser.</p><div>
  
</div>
    <div>
      <h2>Run your own Moltworker</h2>
      <a href="#run-your-own-moltworker">
        
      </a>
    </div>
    <p>We open-sourced our implementation and made it available at <a href="https://github.com/cloudflare/moltworker"><u>https://github.com/cloudflare/moltworker</u></a>, so you can deploy and run your own Moltbot on top of Workers today.</p><p>The <a href="https://github.com/cloudflare/moltworker/blob/main/README.md">README</a> guides you through the necessary setup steps. You will need a Cloudflare account and a <a href="https://developers.cloudflare.com/workers/platform/pricing/"><u>Workers Paid plan</u></a> to access Sandbox Containers; however, all other products are either entirely free (like <a href="https://developers.cloudflare.com/ai-gateway/reference/pricing/"><u>AI Gateway</u></a>) or include generous <a href="https://developers.cloudflare.com/r2/pricing/#free-tier"><u>free tiers</u></a> that allow you to get started and scale under reasonable limits.</p><p><b>Note that Moltworker is a proof of concept, not a Cloudflare product</b>. Our goal is to showcase some of the most exciting features of our <a href="https://developers.cloudflare.com/learning-paths/workers/devplat/intro-to-devplat/">Developer Platform</a> that can be used to run AI agents and unsupervised code efficiently and securely, with great observability, while taking advantage of our global network.</p><p>Feel free to contribute to or fork our <a href="https://github.com/cloudflare/moltworker"><u>GitHub</u></a> repository; we will keep an eye on it and provide support for a while. In parallel, we are also considering contributing Cloudflare skills upstream to the official project.</p>
    <div>
      <h2>Conclusion</h2>
      <a href="#conclusion">
        
      </a>
    </div>
    <p>We hope you enjoyed this experiment, and that we’ve convinced you that Cloudflare is the perfect place to run your AI applications and agents. We’ve been working relentlessly to anticipate the future and release features like the <a href="https://developers.cloudflare.com/agents/"><u>Agents SDK</u></a> that you can use to build your first agent <a href="https://developers.cloudflare.com/agents/guides/slack-agent/"><u>in minutes</u></a>, <a href="https://developers.cloudflare.com/sandbox/"><u>Sandboxes</u></a> where you can run arbitrary code in an isolated environment without the complications of managing a container’s lifecycle, and <a href="https://developers.cloudflare.com/ai-search/"><u>AI Search</u></a>, Cloudflare’s managed vector-based search service, to name a few.</p><p>Cloudflare now offers a complete toolkit for AI development: inference, storage APIs, databases, durable execution for stateful workflows, and built-in AI capabilities. Together, these building blocks make it possible to build and run even the most demanding AI applications on our global edge network.</p><p>If you're excited about AI and want to help us build the next generation of products and APIs, we're <a href="https://www.cloudflare.com/en-gb/careers/jobs/?department=Engineering"><u>hiring</u></a>.</p>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Agents]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Containers]]></category>
            <category><![CDATA[Sandbox]]></category>
            <guid isPermaLink="false">45LuZGCXAcs7EMnB64zTQm</guid>
            <dc:creator>Celso Martinho</dc:creator>
            <dc:creator>Brian Brunner</dc:creator>
            <dc:creator>Sid Chatterjee</dc:creator>
            <dc:creator>Andreas Jansson</dc:creator>
        </item>
        <item>
            <title><![CDATA[An AI Index for all our customers]]></title>
            <link>https://blog.cloudflare.com/an-ai-index-for-all-our-customers/</link>
            <pubDate>Fri, 26 Sep 2025 14:00:00 GMT</pubDate>
            <description><![CDATA[ Cloudflare will soon automatically create an AI-optimized search index for your domain, and expose a set of ready-to-use standard APIs and tools including an MCP server, LLMs.txt, and a search API. ]]></description>
            <content:encoded><![CDATA[ <p>Today, we’re announcing the <b>private beta</b> of <b>AI Index </b>for domains on Cloudflare, a new type of web index that gives content creators the tools to make their data discoverable by AI, and gives AI builders access to better data for fair compensation.</p><p>With AI Index enabled on your domain, we will automatically create an AI-optimized search index for your website, and expose a set of ready-to-use standard APIs and tools including an MCP server, LLMs.txt, and a search API. Our customers will own and control that index and how it’s used, and you will have the ability to monetize access through <a href="https://developers.cloudflare.com/ai-crawl-control/features/pay-per-crawl/what-is-pay-per-crawl/"><u>Pay per crawl</u></a> and the new <a href="https://blog.cloudflare.com/x402/"><u>x402 integrations</u></a>. You will be able to use it to build modern search experiences on your own site, and more importantly, interact with external AI and Agentic providers to make your content more discoverable while being fairly compensated.</p><p>For AI builders—whether developers creating agentic applications, or AI platform companies providing foundational LLM models—Cloudflare will offer a new way to discover and retrieve web content: direct <b>pub/sub connections</b> to individual websites with AI Index. Instead of indiscriminate crawling, builders will be able to subscribe to specific sites that have opted in for discovery, receive structured updates as soon as content changes, and pay fairly for each access. Access is always at the discretion of the site owner.</p><p>From the individual indexes, Cloudflare will also build an aggregated layer, the <b>Open Index</b>, that bundles together participating sites. Builders get a single place to search across collections or the broader web, while every site still retains control and can earn from participation. </p>
    <div>
      <h3>Why build an AI Index?</h3>
      <a href="#why-build-an-ai-index">
        
      </a>
    </div>
    <p>AI platforms are quickly becoming one of the main ways people discover information online. Whether asking a chatbot to summarize a news article or find a product recommendation, the path to that answer almost always starts with crawling original content and indexing it or using it for training. Today, however, that process is largely controlled by the platforms: what gets crawled, how often, and whether the site owner has any input in the matter.</p><p>Although Cloudflare now offers tools to monitor how AI services access your content and to control whether they respect your access policies, it's still challenging to make new content visible. Content creators have no efficient way to signal to AI builders when a page is published or updated. AI builders, on the other hand, find crawling and recrawling unstructured content costly and wasteful, especially when they don't know its quality in advance.</p><p>We need a fairer and healthier ecosystem for content discovery and usage that bridges the gap between content creators and AI builders.</p>
    <div>
      <h3>How AI Index will work</h3>
      <a href="#how-ai-index-will-work">
        
      </a>
    </div>
    <p>When you onboard a domain to Cloudflare, or if you have an existing domain on Cloudflare, you will have the choice to enable an AI Index. If enabled, we will automatically create an AI-optimized search index for your domain that you own and control.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3kV7Oru6D5jPWeGeWDQDsi/7d738250f24250cf98db2e96222319ec/image1.png" />
          </figure><p>As your site updates and grows, the index will evolve with it. New or updated pages will be processed in real-time using the same technology that powers Cloudflare <a href="https://developers.cloudflare.com/ai-search/"><u>AI Search (formerly AutoRAG)</u></a> and its <a href="https://developers.cloudflare.com/ai-search/configuration/data-source/website/"><u>Website</u></a> data source. Best of all, we will manage everything; you won't have to worry about the individual components: compute, storage, databases, embeddings, chunking, or AI models. Everything will happen behind the scenes, automatically.</p><p>Importantly, you will have control over what content to <b>include or exclude</b> from your website's index, and <b>who</b> can access your content via <b>AI Crawl Control</b>, ensuring that only the data you want to expose is made searchable and accessible. You will also be able to opt out of the AI Index completely; it will all be up to you.</p><p>When your AI Index is set up, you will get a set of ready-to-use APIs:</p><ul><li><p><b>An MCP Server: </b>Agentic applications will be able to connect directly to your site using the <a href="https://www.cloudflare.com/learning/ai/what-is-model-context-protocol-mcp/"><u>Model Context Protocol (MCP)</u></a>, making your content discoverable to agents in a standardized way. This includes support for <a href="https://developers.cloudflare.com/ai-search/how-to/nlweb/"><u>NLWeb</u></a> tools, an open project developed by Microsoft that defines a standard protocol for natural language queries on websites.</p></li><li><p><b>A flexible search API: </b>This endpoint will return relevant results in structured JSON. 
</p></li><li><p><b>LLMs.txt and LLMs-full.txt: </b>Standard files that provide LLMs with a machine-readable map of your site, following <a href="https://github.com/AnswerDotAI/llms-txt"><u>emerging open standards</u></a>. These will help models understand how to use your site’s content at inference time. An example of <a href="https://developers.cloudflare.com/llms.txt"><u>llms.txt</u></a> exists in the Cloudflare Developer Documentation.</p></li><li><p><b>A bulk data API: </b>An endpoint for transferring large amounts of content efficiently, available under the rules you set. Instead of querying for every document, AI providers will be able to ingest it in one shot.</p></li><li><p><b>Pub-sub subscriptions: </b>AI platforms will be able to subscribe to your site’s index and receive events and content updates directly from Cloudflare in a structured format in real-time, making it easy for them to stay current without re-crawling.</p></li><li><p><b>Discoverability directives: </b>Entries in robots.txt and well-known URIs that let AI agents and crawlers visiting your site discover and use the available APIs automatically.</p></li></ul>
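<p>As a sketch of what querying one of these search APIs might look like from code: the endpoint path, query parameter, and response shape below are assumptions for illustration, not the final AI Index API, which is still in private beta.</p>

```javascript
// Hypothetical: "/api/ai-index/search" and the "q" parameter are
// illustrative assumptions, not the published AI Index API.
function buildSearchUrl(domain, query) {
  const url = new URL("/api/ai-index/search", `https://${domain}`);
  url.searchParams.set("q", query);
  return url.toString();
}

// Usage from a Worker or any fetch-capable runtime:
async function searchSiteIndex(domain, query) {
  const res = await fetch(buildSearchUrl(domain, query));
  if (!res.ok) throw new Error(`Search failed: ${res.status}`);
  return res.json(); // assumed shape: { results: [{ url, title, snippet }] }
}
```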
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4Hr3EhsMBH0oVwMVKywwre/2a01efbe03d67a8154123b63c05c000f/image3.png" />
          </figure><p>The index will integrate directly with <a href="https://developers.cloudflare.com/ai-crawl-control/"><u>AI Crawl Control</u></a>, so you will be able to see who’s accessing your content, set rules, and manage permissions. And with <a href="https://developers.cloudflare.com/ai-crawl-control/features/pay-per-crawl/what-is-pay-per-crawl/"><u>Pay per crawl</u></a> and <a href="https://blog.cloudflare.com/x402/"><u>x402 integrations</u></a>, you can choose to directly monetize access to your content. </p>
    <div>
      <h3>A feed of the web for AI builders</h3>
      <a href="#a-feed-of-the-web-for-ai-builders">
        
      </a>
    </div>
    <p>As an AI builder, you will be able to discover and subscribe to high-quality, permissioned web data through individual sites’ AI indexes. Instead of sending crawlers blindly across the open Internet, you will connect via a pub/sub model: participating websites will expose structured updates whenever their content changes, and you will be able to subscribe to receive those updates in real-time. With this model, your new workflow may look something like this:</p><ol><li><p><b>Discover websites that have opted in: </b>Browse and filter through a directory of websites that make their indexes available through Cloudflare.</p></li><li><p><b>Evaluate content with metadata and metrics: </b>Get content metadata on various metrics (e.g., uniqueness, depth, contextual relevance, popularity) before accessing it.</p></li><li><p><b>Pay fairly for access:</b> When content is valuable, platforms can compensate creators directly through Pay per crawl. These payments not only enable access but also support the continued creation of original content, helping to sustain a healthier ecosystem for discovery.</p></li><li><p><b>Subscribe to updates: </b>Use pub-sub subscriptions to receive events about changes made by the website, so you know when to retrieve or crawl new content without wasting resources on constant re-crawling.</p></li></ol><p>By shifting from blind crawling to a permissioned pub/sub system for the web, AI builders save time, cut costs, and gain access to cleaner, high-quality data, while content creators remain in control and are fairly compensated.</p>
    <div>
      <h3>The aggregated Open Index</h3>
      <a href="#the-aggregated-open-index">
        
      </a>
    </div>
    <p>Individual indexes provide AI platforms with the ability to access data directly from specific sites, allowing them to subscribe for updates, evaluate value, and pay for full content access on a per-site basis. But when builders need to work at a larger scale, managing dozens or hundreds of separate subscriptions can become complex. The <b>Open Index </b>will provide an additional option: a bundled, opt-in collection of those indexes, with sophisticated filters for quality, uniqueness, originality, and depth of content, all accessible in one place.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6rjkK5UCh9BLSqceUuG0RI/92413aed318baced0ee8812bec511cfb/image2.png" />
          </figure><p>The Open Index is designed to make content discovery at scale easier:</p><ul><li><p><b>Get unified access: </b>Query and retrieve data across many participating sites simultaneously. This reduces integration overhead and enables builders to plug into a curated collection of data, or use it as a ready-made web search layer that can be accessed at query time.</p></li><li><p><b>Discover broader scopes: </b>Work with topic-specific bundles (e.g., news, documentation, scientific research) or a general discovery index covering the broader web. This makes it simple to explore new content sources you may not have identified individually.</p></li><li><p><b>Bottom-up monetization: </b>Results still originate from an individual site’s AI index, with monetization flowing back to that site through Pay per crawl, helping preserve fairness and sustainability at scale.</p></li></ul><p>Together, per-site AI indexes and the Open Index will provide flexibility and precise control when you want full content from individual sites (i.e., for training, AI agents, or search experiences), and broad search coverage when you need a unified search across the web.</p>
    <div>
      <h3>How you can participate in the shift</h3>
      <a href="#how-you-can-participate-in-the-shift">
        
      </a>
    </div>
    <p>With AI Index and the Cloudflare Open Index, we’re creating a model where websites decide how their content is accessed, and AI builders receive structured, reliable data at scale to build a fairer and healthier ecosystem for content discovery and usage on the Internet.</p><p>We’re starting with a <b>private beta</b>. If you want to enroll your website into the AI Index or access the pub/sub web feed as an AI builder, you can <a href="https://www.cloudflare.com/aiindex-signup/"><b><u>sign up today</u></b></a>.</p> ]]></content:encoded>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Pay Per Crawl]]></category>
            <category><![CDATA[AI Search]]></category>
            <category><![CDATA[MCP]]></category>
            <guid isPermaLink="false">7rcW6x4j6v7O6ZEHir5fmK</guid>
            <dc:creator>Celso Martinho</dc:creator>
            <dc:creator>Anni Wang</dc:creator>
        </item>
        <item>
            <title><![CDATA[Announcing Cloudflare Email Service’s private beta]]></title>
            <link>https://blog.cloudflare.com/email-service/</link>
            <pubDate>Thu, 25 Sep 2025 14:00:00 GMT</pubDate>
            <description><![CDATA[ Today, we’re launching Cloudflare Email Service. Send and receive email directly from your Workers with native bindings—no API keys needed. Sign up for the private beta. ]]></description>
            <content:encoded><![CDATA[ <p>If you are building an application, you rely on email to communicate with your users. You validate their signup, notify them about events, and send them invoices through email. The service continues to find new purpose with agentic workflows and other AI-powered tools that rely on a simple email as an input or output.</p><p>And it is a pain for developers to manage. It’s frequently the most annoying burden for most teams. Developers deserve a solution that is simple, reliable, and deeply integrated into their workflow. </p><p>Today, we're excited to announce just that: the private beta of Email Sending, a new capability that allows you to send transactional emails directly from Cloudflare Workers. Email Sending joins and expands our popular <a href="https://developers.cloudflare.com/email-routing/"><u>Email Routing</u></a> product, and together they form the new Cloudflare Email Service — a single, unified developer experience for all your email needs.</p><p>With Cloudflare Email Service, we’re distilling our years of experience <a href="https://developers.cloudflare.com/cloudflare-one/email-security/"><u>securing</u></a> and <a href="https://developers.cloudflare.com/email-routing/"><u>routing</u></a> emails, and combining it with the power of the developer platform. Now, sending an email is as easy as adding a binding to a Worker and calling <code>send</code>:</p>
            <pre><code>export default {
  async fetch(request, env, ctx) {

    await env.SEND_EMAIL.send({
      to: [{ email: "hello@example.com" }],
      from: { email: "api-sender@your-domain.com", name: "Your App" },
      subject: "Hello World",
      text: "Hello World!"
    });

    return new Response(`Successfully sent email!`);
  },
};</code></pre>
            
    <div>
      <h3>Email experience is user experience</h3>
      <a href="#email-experience-is-user-experience">
        
      </a>
    </div>
    <p>Email is a core part of your user experience. It’s how you stay in touch with your users when they are outside your applications. Users rely on email to tell them when they need to take action: password resets, purchase receipts, magic login links, and onboarding flows. When these emails fail, your application fails.</p><p>That means it’s crucial that emails land in your users’ inboxes, both reliably and quickly. A magic link that arrives ten minutes late is a lost user. An email delivered to a spam folder breaks user flows and can erode trust in your product. That’s why we’re focusing on deliverability and time-to-inbox with Cloudflare Email Service. </p><p>To do this, we’re tightly integrating with DNS to automatically configure the necessary DNS records — like SPF, DKIM and DMARC — so that email providers can verify your sending domain and trust your emails. Plus, in true Cloudflare fashion, Email Service is a global service. That means we can deliver your emails with low latency anywhere in the world, without the complexity of managing servers across regions.</p>
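<p>For context, these authentication records are DNS TXT entries like the following. The values here are illustrative placeholders (the DKIM public key is truncated); Cloudflare provisions the real records for your domain automatically.</p>

```txt
example.com.                TXT "v=spf1 include:_spf.mx.cloudflare.net ~all"
cf._domainkey.example.com.  TXT "v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3..."
_dmarc.example.com.         TXT "v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.com"
```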
    <div>
      <h3>Simple and flexible for developers</h3>
      <a href="#simple-and-flexible-for-developers">
        
      </a>
    </div>
    <p>Treating email as a core piece of your application also means building for every touchpoint in your development workflow. We’re building Email Service as part of the Cloudflare stack so that developing with email feels as natural as writing a Worker. </p><p>In practice, that means solving for every part of the transactional email workflow:</p><ul><li><p>Starting with Email Service is easy. Instead of managing API keys and secrets, you can add the <code>Email</code> binding to your <code>wrangler.jsonc</code> and send emails securely, with no risk of leaked credentials. </p></li><li><p>You can use Workers to process incoming mail, store attachments in R2, and add tasks to Queues to get email sending off the hot path of your application. And you can use <code>wrangler</code> to emulate Email Sending locally, allowing you to test your user journeys without jumping between tools and environments.</p></li><li><p>In production, you have clear <a href="https://www.cloudflare.com/learning/performance/what-is-observability/">observability</a> over your emails with bounce rates and delivery events. And, when a user reports a missing email, you can dive into the delivery status to debug issues quickly and help get your user back on track.</p></li></ul><p>We’re also making sure Email Service seamlessly fits into your existing applications. If you need to send emails from external services, you can do so using either REST APIs or SMTP. Likewise, if you’ve been leaning on existing email frameworks (like <a href="https://react.email/"><u>React Email</u></a>) to send rich, HTML-rendered emails to users, you can continue to use them with Email Service. Import the library, render your template, and pass it to the <code>send</code> method just as you would elsewhere.</p>
            <pre><code>import { render, pretty, toPlainText } from '@react-email/render';
import { SignupConfirmation } from './templates';

export default {
  async fetch(request, env, ctx) {

    // Convert React Email template to html
    const html = await pretty(await render(&lt;SignupConfirmation url="https://your-domain.com/confirmation-id"/&gt;));

    // Use the Email Sending binding to send emails
    await env.SEND_EMAIL.send({
      to: [{ email: "hello@example.com" }],
      from: { email: "api-sender@your-domain.com", name: "Welcome" },
      subject: "Signup Confirmation",
      html,
      text: toPlainText(html)
    });

    return new Response(`Successfully sent email!`);
  }
};</code></pre>
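<p>For reference, the <code>SEND_EMAIL</code> binding used in these examples is declared in your Worker’s configuration. A minimal sketch, assuming the <code>send_email</code> binding shape that Email Workers use today (names and the compatibility date are illustrative):</p>

```jsonc
// wrangler.jsonc (sketch): binds env.SEND_EMAIL in the Workers above
{
  "name": "my-email-app",
  "main": "src/index.js",
  "compatibility_date": "2025-09-25",
  "send_email": [
    { "name": "SEND_EMAIL" }
  ]
}
```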
            
    <div>
      <h3>Email Routing and Email Sending: Better together</h3>
      <a href="#email-routing-and-email-sending-better-together">
        
      </a>
    </div>
    <p>Sending email is only half the story. Applications often need to receive and parse emails to create powerful workflows. By combining Email Sending with our existing <a href="https://developers.cloudflare.com/email-routing"><u>Email Routing</u></a> capabilities, we're providing a complete, end-to-end solution for all your application's email needs.</p><p>Email Routing allows you to create custom email addresses on your domain and handle incoming messages programmatically with a Worker, which can enable powerful application flows such as:</p><ul><li><p>Using <a href="https://developers.cloudflare.com/workers-ai/"><u>Workers AI</u></a> to parse, summarize and even label incoming emails: flagging security events from customers, early signs of a bug or incident, and/or generating automatic responses based on those incoming emails.</p></li><li><p>Creating support tickets in systems like JIRA or Linear from emails sent to <code>support@your-domain.com</code>.</p></li><li><p>Processing invoices sent to <code>invoices@your-domain.com</code> and storing attachments in R2.</p></li></ul><p>To use Email Routing, add the <code>email</code> handler to your Worker application and process it as needed:</p>
            <pre><code>export default {
  // Create an email handler to process emails delivered to your Worker
  async email(message, env, ctx) {

    // Read the raw message body (message.raw is a ReadableStream)
    const text = await new Response(message.raw).text();

    // Classify incoming emails using Workers AI, which returns an
    // array of { label, score } classifications; take the first
    const [{ score, label }] = await env.AI.run("@cf/huggingface/distilbert-sst-2-int8", { text });

    await env.PROCESSED_EMAILS.send({ score, label, from: message.from });
  },
};  </code></pre>
            <p>When you combine inbound routing with outbound sending, you can close the loop entirely within Cloudflare. Imagine a user emails your support address. A Worker can receive the email, parse its content, call a third-party API to create a ticket, and then use the Email Sending binding to send an immediate confirmation back to the user with their ticket number. That’s the power of a unified Email Service.</p><p>Email Sending will require a paid Workers subscription, and we'll be charging based on messages sent. We're still finalizing the packaging, and we'll update our documentation and <a href="https://developers.cloudflare.com/changelog/"><u>changelog</u></a>, and notify users, as soon as we have final pricing and long before we start charging. Email Routing limits will remain unchanged.</p>
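<p>The support-ticket loop described above could look something like this in a Worker. This is a sketch under stated assumptions: the binding names match the earlier examples, and <code>createTicket</code> is a hypothetical helper wrapping your ticketing system’s API, not a real library call.</p>

```javascript
// Build the confirmation payload for a freshly created ticket.
// The shape matches the send() examples above.
function buildConfirmation(ticketId, toEmail) {
  return {
    to: [{ email: toEmail }],
    from: { email: "support@your-domain.com", name: "Support" },
    subject: `We received your request (ticket #${ticketId})`,
    text: `Thanks! Your ticket number is ${ticketId}.`,
  };
}

// Hypothetical Worker wiring (exported as the Worker's default export
// in practice); createTicket is an illustrative helper, not a real API.
const supportWorker = {
  async email(message, env, ctx) {
    const body = await new Response(message.raw).text();
    const ticket = await createTicket({ from: message.from, body });
    await env.SEND_EMAIL.send(buildConfirmation(ticket.id, message.from));
  },
};
```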
    <div>
      <h3>What’s next</h3>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>Email is core to your application today, and it's becoming essential for the next generation of AI agents, background tasks, and automated workflows. We built the Cloudflare Email Service to be the engine for this new era of applications, and we’ll be making it available in private beta this November.</p><ul><li><p>Interested in Email Sending? <a href="https://forms.gle/BX6ECfkar3oVLQxs7"><u>Sign up to the waitlist here.</u></a> </p></li><li><p>Want to start processing inbound emails? Get started with <a href="https://developers.cloudflare.com/email-routing/"><u>Email Routing</u></a>, which is available now, remains free, and will be folded into the new email sending APIs as they arrive.</p></li></ul><p>We’re excited to be adding Email Service to our Developer Platform, and we’re looking forward to seeing how you reimagine user experiences that increasingly rely on email!</p>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Cloudflare Email Service]]></category>
            <guid isPermaLink="false">3yl6uG1uh1UE5rplzBlLad</guid>
            <dc:creator>Thomas Gauvin</dc:creator>
            <dc:creator>Celso Martinho</dc:creator>
        </item>
        <item>
            <title><![CDATA[Robotcop: enforcing your robots.txt policies and stopping bots before they reach your website]]></title>
            <link>https://blog.cloudflare.com/ai-audit-enforcing-robots-txt/</link>
            <pubDate>Tue, 10 Dec 2024 14:00:00 GMT</pubDate>
            <description><![CDATA[ The AI Crawl Control (formerly AI Audit) now allows you to quickly see which AI services are honoring your robots.txt policies and then automatically enforce the policies against those that aren’t.
 ]]></description>
            <content:encoded><![CDATA[ <p>Cloudflare’s <a href="https://blog.cloudflare.com/cloudflare-ai-audit-control-ai-content-crawlers/"><u>AI Crawl Control </u><i><u>(formerly AI Audit)</u></i></a><i> </i>dashboard allows you to easily understand how AI companies and services access your content. AI Crawl Control gives a summary of request counts broken out by bot, detailed path summaries for more granular insights, and the ability to filter by categories like <b>AI Search</b> or <b>AI Crawler</b>.</p><p>Today, we're going one step further. You can now quickly see which AI services are honoring your robots.txt policies, which aren’t, and then programmatically enforce these policies. </p>
    <div>
      <h3>What is robots.txt?</h3>
      <a href="#what-is-robots-txt">
        
      </a>
    </div>
    <p><a href="https://www.cloudflare.com/learning/bots/what-is-robots-txt/"><u>Robots.txt</u></a> is a plain text file hosted on your domain that implements the <a href="https://www.rfc-editor.org/rfc/rfc9309.html"><u>Robots Exclusion Protocol</u></a>, a standard that has been around since 1994. This file tells crawlers like Google, Bing, and many others which parts of your site, if any, they are allowed to access. </p><p>There are many reasons why site owners would want to define which portions of their websites crawlers are allowed to access: they might not want certain content available on search engines or social networks, they might trust one platform more than another, or they might simply want to reduce automated traffic to their servers.</p><p>With the advent of <a href="https://www.cloudflare.com/learning/ai/what-is-generative-ai/"><u>generative AI</u></a>, AI services have started crawling the Internet to collect training data for their models. These models are often proprietary and commercial and are used to generate new content. Many content creators and publishers that want to exercise control over how their content is used have started using robots.txt to declare policies that cover these AI bots, in addition to the traditional search engines.</p><p>Here’s an abbreviated real-world example of the robots.txt policy from a top online news site:</p>
            <pre><code>User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /
</code></pre>
            <p>This policy declares that the news site doesn't want ChatGPT, Anthropic AI, Google Gemini, or ByteDance’s Bytespider to crawl any of their content.</p>
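<p>To make the semantics of rules like these concrete, here is a minimal sketch of how a robots.txt file can be parsed and matched against a bot’s user agent and request path. It is simplified relative to the full Robots Exclusion Protocol (RFC 9309): no wildcards, no crawl-delay, and only basic group handling.</p>

```javascript
// Simplified robots.txt matcher: parse rules per user agent, then let
// the longest matching path rule decide access (per RFC 9309).
function parseRobots(txt) {
  const groups = {};   // lowercased user agent -> [{ allow, path }]
  let agents = [];
  let inRules = false;
  for (const raw of txt.split("\n")) {
    const line = raw.split("#")[0].trim();
    const m = line.match(/^(user-agent|allow|disallow)\s*:\s*(.*)$/i);
    if (!m) continue;
    const field = m[1].toLowerCase();
    const value = m[2].trim();
    if (field === "user-agent") {
      if (inRules) { agents = []; inRules = false; }  // a new group starts
      agents.push(value.toLowerCase());
      if (!groups[value.toLowerCase()]) groups[value.toLowerCase()] = [];
    } else {
      inRules = true;
      if (field === "disallow" && value === "") continue; // empty Disallow = allow all
      for (const a of agents) groups[a].push({ allow: field === "allow", path: value });
    }
  }
  return groups;
}

function isAllowed(groups, userAgent, path) {
  const rules = groups[userAgent.toLowerCase()] || groups["*"] || [];
  let best = null; // the longest matching rule wins
  for (const r of rules) {
    if (path.startsWith(r.path) && (!best || r.path.length > best.path.length)) best = r;
  }
  return best ? best.allow : true; // no matching rule means allowed
}
```

<p>With the news site’s policy above, <code>isAllowed(groups, "GPTBot", path)</code> is <code>false</code> for every path, while a bot with no matching group falls back to the <code>*</code> rules, or is allowed if none exist.</p>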
    <div>
      <h3>From voluntary compliance to enforcement</h3>
      <a href="#from-voluntary-compliance-to-enforcement">
        
      </a>
    </div>
    <p>Compliance with the Robots Exclusion Protocol has historically been voluntary. </p><p>That’s where our new feature comes in. We’ve extended <a href="https://blog.cloudflare.com/cloudflare-ai-audit-control-ai-content-crawlers/"><u>AI Crawl Control</u></a> to give our customers both the visibility into how AI services providers honor their robots.txt policies <i>and</i> the ability to enforce those policies at the network level in your <a href="https://developers.cloudflare.com/waf/"><u>WAF</u></a>. </p><p>Your robots.txt file declares your policy, but now we can help you enforce it. You might even call it … your Robotcop.  </p>
    <div>
      <h3>How it works</h3>
      <a href="#how-it-works">
        
      </a>
    </div>
    <p>AI Crawl Control takes the robots.txt files from your web properties, parses them, and then matches their rules against the AI bot traffic we see for the selected property. The summary table gives you an aggregated view of the number of requests and violations we see for every bot across all paths. If you hover over the Robots.txt column, a tooltip shows the policies defined for each bot. You can also filter by violations from the top of the page. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/o2hHH0Nm68muUzaxmbx7E/0b9c2acfb33f2ca2d59e00625b4d0fc7/BLOG-2619_2.png" />
          </figure><p>In the “Most popular paths” section, whenever a path in your site gets traffic that has violated your policy, we flag it for visibility. Ideally, you wouldn't see violations in the Robots.txt column — if you do see them, someone's not complying.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1o5sChT2d6QK8JNPejImVk/79590e1721644a2fd067784bb9ce862e/BLOG-2619_3.png" />
          </figure><p>But that's not all… More importantly, AI Crawl Control allows you to enforce your robots.txt policy at the network level. When you press the "Enforce robots.txt rules" button at the top of the summary table, we automatically translate the rules defined for AI bots in your robots.txt into an advanced firewall rule, redirect you to the WAF configuration screen, and allow you to deploy the rule in our network.</p><p>This is how the robots.txt policy mentioned above looks after translation:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5qYJG3RcvrDxzVDtb28Q2J/d73d7dcea94acb261e9fc525427c2e77/BLOG-2619_4.png" />
          </figure><p>Once you deploy a WAF rule built from your robots.txt policies, you are no longer simply requesting that AI services respect your policy, you're enforcing it.</p>
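<p>For a sense of what that translation produces, a robots.txt group such as <code>User-agent: GPTBot</code> / <code>Disallow: /private/</code> might become a custom rule expression along these lines (a hypothetical sketch; the exact expression the dashboard generates may differ):</p>

```
(http.user_agent contains "GPTBot" and starts_with(http.request.uri.path, "/private/"))
```

<p>Deployed as a WAF custom rule with the Block action, requests matching this expression are rejected at the network edge regardless of whether the crawler chooses to honor robots.txt.</p>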
    <div>
      <h3>Conclusion</h3>
      <a href="#conclusion">
        
      </a>
    </div>
    <p>With AI Crawl Control, we are giving our customers even more visibility into how AI services access their content, helping them define their policies and then enforcing them at the network level.</p><p>This feature is live today for all Cloudflare customers. Simply log into the dashboard and navigate to your domain to begin auditing the bot traffic from AI services and enforcing your robots.txt directives.</p> ]]></content:encoded>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Network Services]]></category>
            <category><![CDATA[Application Services]]></category>
            <category><![CDATA[security.txt]]></category>
            <guid isPermaLink="false">6Bi6mGvw8vrskNZ7Mmp73F</guid>
            <dc:creator>Celso Martinho</dc:creator>
            <dc:creator>Will Allen</dc:creator>
            <dc:creator>Nelson Duarte</dc:creator>
        </item>
        <item>
            <title><![CDATA[Build durable applications on Cloudflare Workers: you write the Workflows, we take care of the rest]]></title>
            <link>https://blog.cloudflare.com/building-workflows-durable-execution-on-workers/</link>
            <pubDate>Thu, 24 Oct 2024 13:05:00 GMT</pubDate>
            <description><![CDATA[ Cloudflare Workflows is now in open beta! Workflows allows you to build reliable, repeatable, long-lived multi-step applications that can automatically retry, persist state, and scale out. Read on to learn how Workflows works, how we built it on top of Durable Objects, and how you can deploy your first Workflows application. ]]></description>
            <content:encoded><![CDATA[ <p>Workflows, Cloudflare’s durable execution engine that allows you to build reliable, repeatable multi-step applications that scale for you, is now in open beta. Any developer with a free or paid <a href="https://workers.cloudflare.com/"><u>Workers</u></a> plan can build and deploy a Workflow right now: no waitlist, no sign-up form, no fake line around-the-block.</p><p>If you learn by doing, you can create your first Workflow via a single command (or <a href="https://developers.cloudflare.com/workflows/get-started/guide/"><u>visit the docs for the full guide)</u></a>:</p>
            <pre><code>npm create cloudflare@latest workflows-starter -- \
  --template "cloudflare/workflows-starter"</code></pre>
            <p>Open the <code>src/index.ts</code> file, poke around, start extending it, and deploy it with a quick <code>wrangler deploy</code>.</p><p>If you want to learn more about how Workflows works, how you can use it to build applications, and how we built it, read on.</p>
    <div>
      <h2>Workflows? Durable Execution?</h2>
      <a href="#workflows-durable-execution">
        
      </a>
    </div>
    <p>Workflows—which we <a href="https://blog.cloudflare.com/data-anywhere-events-pipelines-durable-execution-workflows/#durable-execution"><u>announced back during Developer Week</u></a> earlier this year—is our take on the concept of “Durable Execution”: the ability to build and execute applications that are <i>durable</i> in the face of errors, network issues, upstream API outages, rate limits, and (most importantly) infrastructure failure.</p><p>As <a href="https://cloudflare.tv/event/xvm4qdgm?startTime=8m5s"><u>over 2.4 million developers</u></a> continue to build applications on top of Cloudflare Workers, R2, and Workers AI, we’ve noticed more developers building multi-step applications and workflows that process user data, transform unstructured data into structured, export metrics, persist state as they progress, and automatically retry &amp; restart. But writing any non-trivial application and making it <i>durable</i> in the face of failure is hard: this is where Workflows comes in. Workflows manages the retries, emits the metrics, and durably stores the state (without you having to stand up your own database) as the Workflow progresses.</p><p>What makes Workflows different from other takes on “Durable Execution” is that we manage the underlying compute and storage infrastructure for you. You’re not left managing a compute cluster and hoping it scales both up (on a Monday morning) and down (during quieter periods) to manage costs, or ensuring that you have compute running in the right locations. Workflows is built on Cloudflare Workers — our job is to run your code and operate the infrastructure for you.</p><p>As an example of how Workflows can help you build durable applications, assume you want to post-process file uploads from your users that were uploaded to an R2 bucket directly via <a href="https://developers.cloudflare.com/r2/api/s3/presigned-urls/"><u>a pre-signed URL</u></a>. 
That post-processing could involve multiple actions: text extraction via a <a href="https://developers.cloudflare.com/workers-ai/models/"><u>Workers AI model</u></a>, calls to a third-party API to validate data, updating or querying rows in a database once the file has been processed… the list goes on.</p><p>But what each of these actions has in common is that it could <i>fail</i>. Maybe that upstream API is unavailable, maybe you get rate-limited, maybe your database is down. Having to write extensive retry logic around each action, manage backoffs, and (importantly) ensure your application doesn’t have to start from scratch when a later <i>step</i> fails is more boilerplate to write and more code to test and debug.</p><p>What’s a <i>step</i>, you ask? The core building block of every Workflow is the step: an individually retriable component of your application that can optionally emit state. That state is then persisted, even if subsequent steps were to fail. This means that your application doesn’t have to restart from scratch, allowing it not only to recover more quickly from failure scenarios but also to avoid doing redundant work. You don’t want your application hammering an expensive third-party API (or getting you rate limited) because it’s naively retrying an API call that it doesn’t have to.</p>
            <pre><code>export class MyWorkflow extends WorkflowEntrypoint&lt;Env, Params&gt; {
	async run(event: WorkflowEvent&lt;Params&gt;, step: WorkflowStep) {
		const files = await step.do('my first step', async () =&gt; {
			return {
				inputParams: event,
				files: [
					'doc_7392_rev3.pdf',
					'report_x29_final.pdf',
					'memo_2024_05_12.pdf',
					'file_089_update.pdf',
					'proj_alpha_v2.pdf',
					'data_analysis_q2.pdf',
					'notes_meeting_52.pdf',
					'summary_fy24_draft.pdf',
				],
			};
		});

		// Other steps...
	}
}
</code></pre>
            <p>Notably, a Workflow can have hundreds of steps: one of the <a href="https://developers.cloudflare.com/workflows/build/rules-of-workflows/"><u>Rules of Workflows</u></a> is to encapsulate every API call or stateful action within your application into its own step. Each step can also define its own retry strategy, automatically backing off, adding a delay and/or (eventually) giving up after a set number of attempts.</p>
            <pre><code>await step.do(
	'make a call to write that could maybe, just might, fail',
	// Define a retry strategy
	{
		retries: {
			limit: 5,
			delay: '5 seconds',
			backoff: 'exponential',
		},
		timeout: '15 minutes',
	},
	async () =&gt; {
		// Do stuff here, with access to the state from our previous steps
		if (Math.random() &gt; 0.5) {
			throw new Error('API call to $STORAGE_SYSTEM failed');
		}
	},
);
</code></pre>
            <p>To illustrate this further, imagine you have an application that reads text files from an R2 storage bucket, pre-processes the text into chunks, generates text embeddings <a href="https://developers.cloudflare.com/workers-ai/models/bge-large-en-v1.5/"><u>using Workers AI</u></a>, and then inserts those into a vector database (like <a href="https://developers.cloudflare.com/vectorize/"><u>Vectorize</u></a>) for semantic search.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7b9m0rPDlGvIiTnhguyvzI/3f27678b141ce600f1f54eb999e9d671/WORKFLOWS.png" />
          </figure><p>In the Workflows programming model, each of those is a discrete step, and each can emit state. For example, each of the four actions below can be a discrete <code>step.do</code> call in a Workflow:</p><ol><li><p>Reading the files from storage and emitting the list of filenames</p></li><li><p>Chunking the text and emitting the results</p></li><li><p>Generating text embeddings</p></li><li><p>Upserting them into Vectorize and capturing the result of a test query</p></li></ol><p>You can also start to imagine that some steps, such as chunking text or generating text embeddings, can be broken down into even more steps — a step per file that we chunk, or a step per API call to our text embedding model, so that our application is even more resilient to failure.</p><p>Steps can be created programmatically or conditionally based on input, allowing you to dynamically create steps based on the number of inputs your application needs to process. You do not need to define all steps ahead of time, and each instance of a Workflow may choose to conditionally create steps on the fly.</p>
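<p>Dynamic step creation can be sketched as a plain loop over the input. In the sketch below, <code>StepLike</code> and <code>makeStubStep</code> are illustrative stand-ins for the platform's <code>WorkflowStep</code> (so it runs outside the Workers runtime); inside a real Workflow you would use the <code>step</code> object passed to <code>run()</code>:</p>

```typescript
// Sketch of creating steps dynamically: one step per input file, so each
// chunking call is individually retriable. StepLike mirrors the shape of
// step.do(); it is not the real platform type.
interface StepLike {
  do<T>(name: string, callback: () => Promise<T>): Promise<T>;
}

// Minimal stand-in step runner: executes the callback and records which
// steps ran (the real platform would also persist each result).
function makeStubStep(ran: string[]): StepLike {
  return {
    async do<T>(name: string, callback: () => Promise<T>): Promise<T> {
      ran.push(name);
      return callback();
    },
  };
}

// One step per file, created on the fly from the input list.
async function chunkAll(step: StepLike, files: string[]): Promise<string[][]> {
  const chunks: string[][] = [];
  for (const file of files) {
    const fileChunks = await step.do(`chunk ${file}`, async () => {
      // Pretend "chunking" splits on underscores; real code would read the
      // file from R2 and split its text into passages.
      return file.split("_");
    });
    chunks.push(fileChunks);
  }
  return chunks;
}
```

<p>Because each file gets its own named step, a failure while chunking the fifth file does not redo the first four on retry.</p>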
    <div>
      <h2>Building Cloudflare on Cloudflare</h2>
      <a href="#building-cloudflare-on-cloudflare">
        
      </a>
    </div>
    <p>As the Cloudflare Developer platform <a href="https://www.cloudflare.com/birthday-week/"><u>continues to grow</u></a>, almost all of our own products are built on top of it. Workflows is yet another example of how we built a new product from scratch using nothing but Workers and its vast catalog of features and APIs. This section of the blog has two goals: to explain how we built it, and to demonstrate that anyone can create a complex application or platform with demanding requirements and multiple architectural layers on our stack, too.</p><p>If you’re wondering how Workflows manages to make durable execution easy, how it persists state, and how it automatically scales: it’s because we built it on Cloudflare Workers, including the brand-new <a href="https://blog.cloudflare.com/sqlite-in-durable-objects/"><u>zero-latency SQLite storage we recently introduced to Durable Objects</u></a>.
</p><p>To understand how Workflows uses Workers &amp; Durable Objects, here’s the high-level overview of our architecture:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7pknYk0Sshxka3iPbxBCRj/bb8b75986601e38b6b69fe8d849c0cbe/image9.png" />
          </figure><p>There are three main blocks in this diagram:</p><p>The user-facing APIs are where the user interacts with the platform, creating and deploying new workflows or instances, controlling them, and accessing their state and activity logs. These operations can be executed through our public <a href="https://developers.cloudflare.com/api/"><u>API gateway</u></a> using REST calls, a Worker script using bindings, <a href="https://blog.cloudflare.com/wrangler3"><u>Wrangler</u></a> (Cloudflare's developer platform command line tool), or via the <a href="https://dash.cloudflare.com/"><u>Dashboard</u></a> user interface.</p><p>The managed platform holds the internal configuration APIs, running on a Worker that implements a catalog of REST endpoints; the binding shim, supported by another dedicated Worker; and every account controller and its corresponding workflow engines, all powered by SQLite-backed Durable Objects. This is where all the magic happens and what we are sharing more details about in this technical blog.</p><p>Finally, there are the workflow instances, essentially independent clones of the workflow application. Instances are user account-owned and have a one-to-one relationship with a managed engine that powers them. You can run as many instances and engines as you want concurrently.</p><p>Let's get into more detail…</p>
    <div>
      <h3>Configuration API and Binding Shim</h3>
      <a href="#configuration-api-and-binding-shim">
        
      </a>
    </div>
    
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2qEGr9M8KwgPS66Ju8mELL/189db9764392c00ae34dd3a44eeb1ed7/image6.png" />
          </figure><p>The Configuration API and the Binding Shim are two stateless Workers; one receives REST API calls from clients calling our <a href="https://developers.cloudflare.com/api/"><u>API Gateway</u></a> directly, using <a href="https://developers.cloudflare.com/workers/wrangler/"><u>Wrangler</u></a>, or navigating the <a href="https://dash.cloudflare.com/"><u>Dashboard</u></a> UI, and the other is the endpoint for the Workflows <a href="https://developers.cloudflare.com/workers/runtime-apis/bindings/"><u>binding</u></a>, an efficient and authenticated interface to interact with the Cloudflare Developer Platform resources from a Workers script.</p><p>The configuration API worker uses <a href="https://hono.dev/docs/getting-started/cloudflare-workers"><u>HonoJS</u></a> and <a href="https://hono.dev/examples/zod-openapi"><u>Zod</u></a> to implement the REST endpoints, which are declared in an <a href="https://swagger.io/specification/"><u>OpenAPI</u></a> schema and exported to our API Gateway, thus adding our methods to the Cloudflare API <a href="https://developers.cloudflare.com/api/"><u>catalog</u></a>.</p>
            <pre><code>import { swaggerUI } from '@hono/swagger-ui';
import { createRoute, OpenAPIHono, z } from '@hono/zod-openapi';
import { Hono } from 'hono';

...

api.openapi(
  createRoute({
    method: 'get',
    path: '/',
    request: {
      query: PaginationParams,
    },
    responses: {
      200: {
        content: {
          'application/json': {
             schema: APISchemaSuccess(z.array(WorkflowWithInstancesCountSchema)),
          },
        },
        description: 'List of all Workflows belonging to an account.',
      },
    },
  }),
  async (ctx) =&gt; {
    ...
  },
);

...

api.route('/:workflow_name', routes.workflows);
api.route('/:workflow_name/instances', routes.instances);
api.route('/:workflow_name/versions', routes.versions);</code></pre>
            <p>These Workers perform two different functions, but they share a large portion of their code and implement similar logic; once the request is authenticated and ready to travel to the next stage, they use the account ID to delegate the operation to a Durable Object called Account Controller.</p>
            <pre><code>// env.ACCOUNTS is the Account Controllers Durable Objects namespace
const accountStubId = c.env.ACCOUNTS.idFromName(accountId.toString());
const accountStub = c.env.ACCOUNTS.get(accountStubId);</code></pre>
            <p>As you can see, every account has its own Account Controller Durable Object.</p>
    <div>
      <h3>Account Controllers</h3>
      <a href="#account-controllers">
        
      </a>
    </div>
    <p>The Account Controller is a dedicated persisted database that stores the list of all the account’s workflows, versions, and instances. We scale to millions of account controllers, one per every Cloudflare account using Workflows, by leveraging the power of <a href="https://developers.cloudflare.com/durable-objects/best-practices/access-durable-objects-storage/#sqlite-storage-backend"><u>Durable Objects with SQLite backend</u></a>.</p><p><a href="https://developers.cloudflare.com/durable-objects/"><u>Durable Objects</u></a> (DOs) are single-threaded singletons that run in our data centers and are bound to a stateful storage API, in this case, SQLite. They are also Workers, just a special kind, and have access to all of our other APIs. This makes it easy to build consistent, highly available distributed applications with them.</p><p>Here’s what we get for free by using one Durable Object per Workflows account:</p><ul><li><p>Sharding based on account boundaries aligns perfectly with the way we manage resources at Cloudflare internally. 
Also, due to the nature of DOs, this model limits the blast radius for free: not that we expect them, but any bugs or state inconsistencies during the beta are confined to the affected account and don’t impact everyone.</p></li><li><p>DO instances run close to the end user; Alice is in London and will call the config API through our <a href="https://www.cloudflare.com/en-gb/network/"><u>LHR data center</u></a>, while Bob is in Lisbon and will connect to LIS.</p></li><li><p>Because every account controller is a Worker, we can gradually upgrade them to new versions, starting with internal users, thus derisking real customers.</p></li></ul><p>Before SQLite, our only option was to use the Durable Object's <a href="https://developers.cloudflare.com/durable-objects/api/storage-api/#get"><u>key-value</u></a> storage API, but having a relational database at our fingertips and being able to create tables and do complex queries is a significant enabler. For example, take a look at how we implement the internal method getWorkflow():</p>
            <pre><code>async getWorkflow(accountId: number, workflowName: string) {
  // Capture the start time for the analytics calls below
  const begin = Date.now();
  try {
    const res = this.ctx.storage.transactionSync(() =&gt; {
      const cursor = Array.from(
        this.ctx.storage.sql.exec(
          `
                    SELECT *,
                    (SELECT class_name
                        FROM   versions
                        WHERE  workflow_id = w.id
                        ORDER  BY created_on DESC
                        LIMIT  1) AS class_name
                    FROM   workflows w
                    WHERE  w.name = ? 
                    `,
          workflowName
        )
      )[0] as Workflow;

      return cursor;
    });

    this.sendAnalytics(accountId, begin, "getWorkflow");
    return res as Workflow | undefined;
  } catch (err) {
    this.sendErrorAnalytics(accountId, begin, "getWorkflow");
    throw err;
  }
}
</code></pre>
            <p>The other thing we take advantage of in Workflows is using the recently <a href="https://blog.cloudflare.com/javascript-native-rpc/"><u>announced</u></a> JavaScript-native RPC feature when communicating between components.</p><p>Before <a href="https://developers.cloudflare.com/workers/runtime-apis/rpc/"><u>RPC</u></a>, we had to <code>fetch()</code> between components, make HTTP requests, and serialize and deserialize the parameters and the payload. Now, we can async call the remote object's method as if it was local. Not only does this feel more natural and simplify our logic, but it's also more efficient, and we can take advantage of TypeScript type-checking when writing code.</p><p>This is how the Configuration API would call the Account Controller’s <code>countWorkflows()</code> method before:</p>
            <pre><code>const resp = await accountStub.fetch(
      "https://controller/count-workflows",
      {
        method: "POST",
        headers: {
          "Content-Type": "application/json; charset=utf-8",
        },
        body: JSON.stringify({ accountId }),
      },
    );

if (!resp.ok) {
  return new Response("Internal Server Error", { status: 500 });
}

const result = await resp.json();
const total_count = result.total_count;</code></pre>
            <p>This is how we do it using RPC:</p>
            <pre><code>const total_count = await accountStub.countWorkflows(accountId);</code></pre>
            <p>The other powerful feature of our RPC system is that it supports passing not only <a href="https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm#supported_types"><u>Structured Cloneable</u></a> objects back and forth but also entire classes. More on this later.</p><p>Let’s move on to Engine.</p>
    <div>
      <h3>Engine and instance</h3>
      <a href="#engine-and-instance">
        
      </a>
    </div>
    <p>Every instance of a workflow runs alongside an Engine instance. The Engine is responsible for starting up the user’s workflow entry point, executing the steps on behalf of the user, handling their results, and tracking the workflow state until completion.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6yrKsuF501oRCDujckr3yM/bde40097ec5bedda07793375e53e99b9/image1.png" />
          </figure><p>When we started thinking about the Engine, we thought about modeling it after a <a href="https://en.wikipedia.org/wiki/Finite-state_machine"><u>state machine</u></a>, and that was what our initial prototypes looked like. However, state machines require an ahead-of-time understanding of the userland code, which implies having a build step before running them. This is costly at scale and introduces additional complexity.</p><p>A few iterations later, we had another idea. What if we could model the engine as a game loop?</p><p>Unlike other computer programs, games operate regardless of a user's input. The game loop is essentially a sequence of tasks that implement the game's logic and update the display, typically one loop per video frame. Here’s an example of a game loop in pseudo-code:</p>
            <pre><code>while (game is running)
    check for user input
    move graphics
    play sounds
end while</code></pre>
            <p>Well, an oversimplified version of our Workflow engine would look like this:</p>
            <pre><code>while (last step not completed)
    iterate every step
       use memoized cache as response if the step has run already
       continue running step or timer if it hasn't finished yet
end while</code></pre>
            <p>A workflow is indeed a loop that keeps on going, performing the same sequence of logical tasks until the last step completes.</p><p>The Engine and the instance run hand-in-hand in a one-to-one relationship. The first is managed, and part of the platform. It uses SQLite and other platform APIs internally, and we can constantly add new features, fix bugs, and deploy new versions, while keeping everything transparent to the end user. The second is the actual account-owned Worker script that declares the Workflow steps.</p><p>For example, when someone passes a callback into <code>step.do()</code>:</p>
            <pre><code>export class MyWorkflow extends WorkflowEntrypoint&lt;Env, Params&gt; {
  async run(event: WorkflowEvent&lt;Params&gt;, step: WorkflowStep) {
    step.do('step1', () =&gt; { ... });
  }
}</code></pre>
            <p>We switch execution over to the Engine. Again, this is possible because of the power of JS RPC. Besides passing Structured Cloneable objects back and forth, JS RPC allows us to <a href="https://developers.cloudflare.com/workers/runtime-apis/rpc/#send-functions-as-parameters-of-rpc-methods"><u>create and pass entire application-defined classes</u></a> that extend the built-in RpcTarget. So this is what happens behind the scenes when your Instance calls <code>step.do()</code> (simplified):</p>
            <pre><code>export class Context extends RpcTarget {

  async do&lt;T&gt;(name: string, callback: () =&gt; Promise&lt;T&gt;): Promise&lt;T&gt; {

    // First we check whether a cached result for this step.do() exists already
    const maybeResult = await this.#state.storage.get(name);

    // We return the cached result if it exists
    if (maybeResult !== undefined) { return maybeResult; }

    // Else we run the user callback
    return doWrapper(callback);
  }

}
</code></pre>
            <p>Here’s a more complete diagram of the Engine’s <code>step.do()</code> lifecycle:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4MymVGS7BxwityCRlWcBOX/136d4dcf0affce04164f87b6bbe8b12a/image5.png" />
          </figure><p>Again, this diagram only partially represents everything we do in the Engine; things like logging for observability or handling exceptions are missing, and we don't get into the details of how queuing is implemented. However, it gives you a good idea of how the Engine abstracts and handles all the complexities of completing a step under the hood, allowing us to expose a simple-to-use API to end users.</p><p>Also, it's worth reiterating that every workflow instance is an Engine behind the scenes, and every Engine is an SQLite-backed Durable Object. This ensures that every instance runtime and state are isolated and independent of each other and that we can effortlessly scale to run billions of workflow instances, a solved problem for Durable Objects.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4uEoEAtsjNquPCD3F50S9d/006556baf2a0478d1de10e4514843baa/image3.png" />
          </figure>
    <div>
      <h3>Durability</h3>
      <a href="#durability">
        
      </a>
    </div>
    <p>Durable Execution is all the rage now when we talk about workflow engines, and ours is no exception. Workflows are typically long-lived processes that run multiple functions in sequence where anything can happen. Those functions can time out or fail because of a remote server error or a network issue and need to be retried. A workflow engine ensures that your application runs smoothly and completes regardless of the problems it encounters.</p><p>Durability means that if and when a workflow fails, the Engine can re-run it, resume from the last recorded step, and deterministically re-calculate the state from all the successful steps' cached responses. This is possible because steps are stateful and idempotent; they produce the same result no matter how many times we run them, thus not causing unintended duplicate effects like sending the same invoice to a customer multiple times.</p>
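<p>The replay behaviour can be sketched with a toy memoizing engine. Everything below is illustrative, not the real Workflows internals: an in-memory <code>Map</code> stands in for the Engine's SQLite storage.</p>

```typescript
// Toy model of replay-based durability: completed step results persist in a
// cache, so re-running a workflow after a failure skips the steps that have
// already succeeded instead of redoing their work.
class ReplayEngine {
  private cache = new Map<string, unknown>();

  async do<T>(name: string, callback: () => Promise<T>): Promise<T> {
    if (this.cache.has(name)) {
      // Memoized: replay the recorded result instead of re-running the step.
      return this.cache.get(name) as T;
    }
    const result = await callback();
    this.cache.set(name, result);
    return result;
  }
}

let step1Runs = 0;

// A workflow whose second step fails on its first attempt only.
async function runWorkflow(step: ReplayEngine, attempt: number): Promise<number> {
  const a = await step.do("step 1", async () => {
    step1Runs++; // counts real executions, not replays
    return 40;
  });
  const b = await step.do("step 2", async () => {
    if (attempt === 1) throw new Error("transient upstream failure");
    return 2;
  });
  return a + b;
}
```

<p>On the re-run, the callback for <code>step 1</code> never executes again; its result is replayed from the cache. That is the same property that keeps a real Workflow from hammering an upstream API when a later step has to retry.</p>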
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1R5UfQfNMKI7hB6QXJfCUr/242e85f2b5287394871e916844359bd4/image7.png" />
          </figure><p>We ensure durability and handle failures and retries by sharing the same technique we use for a <code>step.sleep()</code> that requires sleeping for days or months: a combination of using <code>scheduler.wait()</code>, a method of the <a href="https://github.com/WICG/scheduling-apis"><u>upcoming WICG Scheduling API</u></a> that we already <a href="https://developers.cloudflare.com/workers/platform/changelog/historical-changelog/#2021-12-10"><u>support</u></a>, and <a href="https://developers.cloudflare.com/durable-objects/api/alarms/"><u>Durable Objects alarms</u></a>, which allow you to schedule the Durable Object to be woken up at a time in the future.</p><p>These two APIs allow us to overcome the lack of guarantees that a Durable Object runs forever, giving us complete control of its lifecycle. Since every state transition through userland code persists in the Engine’s strongly consistent SQLite, we track timestamps when a step begins execution, its attempts (if it needs retries), and its completion.</p>
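<p>A minimal sketch of that combination: persist the wake-up timestamp, then schedule an alarm for it. <code>StorageLike</code> mirrors the shape of the Durable Object storage API (<code>put</code>, <code>get</code>, and <code>setAlarm</code> are real methods on it), while <code>MemoryStorage</code> and <code>SleepSketch</code> are hypothetical stand-ins so the sketch runs outside the Workers runtime; the real Engine's internals differ.</p>

```typescript
// Sketch of how a long step.sleep() can survive eviction: record when the
// sleep should end, then ask the runtime to wake the object at that time.
interface StorageLike {
  put(key: string, value: unknown): Promise<void>;
  get(key: string): Promise<unknown>;
  setAlarm(time: number): Promise<void>;
}

// In-memory stand-in for the Durable Object's SQLite-backed storage.
class MemoryStorage implements StorageLike {
  private data = new Map<string, unknown>();
  alarmAt: number | null = null;

  async put(key: string, value: unknown): Promise<void> {
    this.data.set(key, value);
  }
  async get(key: string): Promise<unknown> {
    return this.data.get(key);
  }
  async setAlarm(time: number): Promise<void> {
    this.alarmAt = time;
  }
}

class SleepSketch {
  constructor(private storage: StorageLike) {}

  // Persist the wake-up time so a future lifetime can resume from it, then
  // schedule an alarm so the runtime wakes the object even if it has been
  // evicted in the meantime. The alarm handler would re-enter the engine loop.
  async scheduleSleep(name: string, wakeAt: number): Promise<void> {
    await this.storage.put(`sleep:${name}`, wakeAt);
    await this.storage.setAlarm(wakeAt);
  }
}
```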
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6FSCXRt9fO4EaaBP7hLV8x/a59de27dfbe18f39addd4eb8240b9df9/image10.png" />
          </figure><p>This means that steps still pending when a Durable Object is <a href="https://developers.cloudflare.com/durable-objects/reference/in-memory-state/"><u>evicted</u></a> — perhaps due to a two-month-long timer — get rerun in the Engine’s next lifetime (with the cache from the previous lifetime hydrated), which is triggered by an alarm set to the timestamp of the next expected state transition. </p>
    <div>
      <h2>Real-life workflow, step by step</h2>
      <a href="#real-life-workflow-step-by-step">
        
      </a>
    </div>
    <p>Let's walk through an example of a real-life application. You run an e-commerce website and would like to send email reminders to your customers for forgotten carts that haven't been checked out in a few days.</p><p>What would typically have to be a combination of a queue, a cron job, and querying a database table periodically can now simply be a Workflow that we start on every new cart:</p>
            <pre><code>import {
  WorkflowEntrypoint,
  WorkflowEvent,
  WorkflowStep,
} from "cloudflare:workers";
import { sendEmail } from "./legacy-email-provider";

type Params = {
  cartId: string;
};

type Env = {
  DB: D1Database;
};

export class Purchase extends WorkflowEntrypoint&lt;Env, Params&gt; {
  async run(
    event: WorkflowEvent&lt;Params&gt;,
    step: WorkflowStep
  ): Promise&lt;unknown&gt; {
    await step.sleep("wait for three days", "3 days");

    // Retrieve cart from D1
    const cart = await step.do("retrieve cart from database", async () =&gt; {
      const { results } = await this.env.DB.prepare(`SELECT * FROM cart WHERE id = ?`)
        .bind(event.payload.cartId)
        .all();
      return results[0];
    });

    if (!cart.checkedOut) {
      await step.do("send an email", async () =&gt; {
        await sendEmail("reminder", cart);
      });
    }
  }
}
</code></pre>
            <p>This works great. However, sometimes the <code>sendEmail</code> function fails due to an upstream provider erroring out. While <code>step.do</code> automatically retries with a reasonable default configuration, we can also define our own settings:</p>
            <pre><code>if (!cart.checkedOut) {
  await step.do(
    "send an email",
    {
      retries: {
        limit: 5,
        delay: "1 min",
        backoff: "exponential",
      },
    },
    async () =&gt; {
      await sendEmail("reminder", cart);
    }
  );
}
</code></pre>
            
    <div>
      <h3>Managing Workflows</h3>
      <a href="#managing-workflows">
        
      </a>
    </div>
    <p>Workflows allows us to create and manage workflows using four different interfaces:</p><ul><li><p>Using our REST HTTP API available on <a href="https://developers.cloudflare.com/api/"><u>Cloudflare’s API catalog</u></a></p></li><li><p>Using <a href="https://developers.cloudflare.com/workers/wrangler/"><u>Wrangler</u></a>, Cloudflare's developer platform command-line tool</p></li><li><p>Programmatically inside a Worker using <a href="https://developers.cloudflare.com/workers/runtime-apis/bindings/"><u>bindings</u></a></p></li><li><p>Using our Web UI in the <a href="https://dash.cloudflare.com/"><u>dashboard</u></a></p></li></ul><p>The HTTP API makes it easy to trigger new instances of workflows from any system, even if it isn’t on Cloudflare, or from the command line. For example:</p>
            <pre><code>curl --request POST \
  --url https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/workflows/purchase-workflow/instances/$CART_INSTANCE_ID \
  --header "Authorization: Bearer $ACCOUNT_TOKEN" \
  --header 'Content-Type: application/json' \
  --data '{
	"id": "$CART_INSTANCE_ID",
	"params": {
		"cartId": "f3bcc11b-2833-41fb-847f-1b19469139d1"
	}
  }'</code></pre>
            <p>Wrangler goes one step further and gives us a friendlier set of commands for interacting with workflows, complete with nicely formatted output and no need to authenticate with tokens. Type <code>npx wrangler workflows</code> for help, or:</p>
            <pre><code>npx wrangler workflows trigger purchase-workflow '{ "cartId": "f3bcc11b-2833-41fb-847f-1b19469139d1" }'</code></pre>
            <p>Furthermore, Workflows has first-party support in Wrangler, and you can test your instances locally. A Workflow is similar to a regular <a href="https://developers.cloudflare.com/workers/runtime-apis/bindings/service-bindings/rpc/"><u>WorkerEntrypoint</u></a> in your Worker, which means that <code>wrangler dev</code> just naturally works.</p>
            <pre><code>❯ npx wrangler dev

 ⛅️ wrangler 3.82.0
----------------------------

Your worker has access to the following bindings:
- Workflows:
  - CART_WORKFLOW: EcommerceCartWorkflow
⎔ Starting local server...
[wrangler:inf] Ready on http://localhost:8787
╭───────────────────────────────────────────────╮
│  [b] open a browser, [d] open devtools        │
╰───────────────────────────────────────────────╯
</code></pre>
            <p>Workflow APIs are also available as a Worker binding. You can interact with the platform programmatically from another Worker script in the same account without worrying about permissions or authentication. You can even have workflows that call and interact with other workflows.</p>
            <pre><code>import { WorkerEntrypoint } from "cloudflare:workers";

type Env = { DEMO_WORKFLOW: Workflow };
export default class extends WorkerEntrypoint&lt;Env&gt; {
  async fetch() {
    // Pass in a user-defined name for this instance
    // In this case, we use the same as the cartId
    const instance = await this.env.DEMO_WORKFLOW.create({
      id: "f3bcc11b-2833-41fb-847f-1b19469139d1",
      params: {
          cartId: "f3bcc11b-2833-41fb-847f-1b19469139d1",
      }
    });
    // A fetch handler must return a Response
    return Response.json({ id: instance.id });
  }
  async scheduled() {
    // Restart errored out instances in a cron
    const instance = await this.env.DEMO_WORKFLOW.get(
      "f3bcc11b-2833-41fb-847f-1b19469139d1"
    );
    const status = await instance.status();
    if (status.error) {
      await instance.restart();
    }
  }
}</code></pre>
            
    <div>
      <h3>Observability </h3>
      <a href="#observability">
        
      </a>
    </div>
    <p>Having good <a href="https://www.cloudflare.com/learning/performance/what-is-observability/">observability</a> and data on often long-lived asynchronous tasks is crucial to understanding how they behave under normal operation and, more importantly, to troubleshooting problems when things go south or when iterating on code changes.</p><p>We designed Workflows around the philosophy that there is no such thing as too much logging. You can get all the SQLite data for your workflow and its instances by calling the REST APIs. Here is the output of an instance:</p>
            <pre><code>{
  "success": true,
  "errors": [],
  "messages": [],
  "result": {
    "status": "running",
    "params": {},
    "trigger": { "source": "api" },
    "versionId": "ae042999-39ff-4d27-bbcd-22e03c7c4d02",
    "queued": "2024-10-21 17:15:09.350",
    "start": "2024-10-21 17:15:09.350",
    "end": null,
    "success": null,
    "steps": [
      {
        "name": "send email",
        "start": "2024-10-21 17:15:09.411",
        "end": "2024-10-21 17:15:09.678",
        "attempts": [
          {
            "start": "2024-10-21 17:15:09.411",
            "end": "2024-10-21 17:15:09.678",
            "success": true,
            "error": null
          }
        ],
        "config": {
          "retries": { "limit": 5, "delay": 1000, "backoff": "constant" },
          "timeout": "15 minutes"
        },
        "output": "celso@example.com",
        "success": true,
        "type": "step"
      },
      {
        "name": "sleep-1",
        "start": "2024-10-21 17:15:09.763",
        "end": "2024-10-21 17:17:09.763",
        "finished": false,
        "type": "sleep",
        "error": null
      }
    ],
    "error": null,
    "output": null
  }
}</code></pre>
            <p>As you can see, this is essentially a dump of the instance engine's SQLite state as JSON. You have the <b>errors</b>, <b>messages</b>, current <b>status</b>, and what happened with <b>every step</b>, all timestamped to the millisecond.</p><p>It's one thing to get data about a specific workflow instance, but it's another to zoom out and look at aggregated statistics of all your workflows and instances over time. Workflows data is available through our <a href="https://developers.cloudflare.com/analytics/graphql-api/"><u>GraphQL Analytics API</u></a>, so you can query it in aggregate and generate valuable insights and reports. In this example we ask for aggregated analytics about the wall time of all the instances of the “e-commerce-carts” workflow:</p>
            <pre><code>{
  viewer {
    accounts(filter: { accountTag: "febf0b1a15b0ec222a614a1f9ac0f0123" }) {
      wallTime: workflowsAdaptiveGroups(
        limit: 10000
        filter: {
          datetimeHour_geq: "2024-10-20T12:00:00.000Z"
          datetimeHour_leq: "2024-10-21T12:00:00.000Z"
          workflowName: "e-commerce-carts"
        }
        orderBy: [count_DESC]
      ) {
        count
        sum {
          wallTime
        }
        dimensions {
          date: datetimeHour
        }
      }
    }
  }
}
</code></pre>
            <p>For convenience, you can of course also use Wrangler to describe a workflow or an instance and get an instant, nicely formatted response:</p>
            <pre><code>sid ~ npx wrangler workflows instances describe purchase-workflow latest

 ⛅️ wrangler 3.80.4

Workflow Name:         purchase-workflow
Instance Id:           d4280218-7756-41d2-bccd-8d647b82d7ce
Version Id:            0c07dbc4-aaf3-44a9-9fd0-29437ed11ff6
Status:                ✅ Completed
Trigger:               🌎 API
Queued:                14/10/2024, 16:25:17
Success:               ✅ Yes
Start:                 14/10/2024, 16:25:17
End:                   14/10/2024, 16:26:17
Duration:              1 minute
Last Successful Step:  wait for three days
Output:                false
Steps:

  Name:      wait for three days
  Type:      💤 Sleeping
  Start:     14/10/2024, 16:25:17
  End:       17/10/2024, 16:25:17
  Duration:  3 day</code></pre>
            <p>And finally, we worked really hard to get you the best dashboard UI experience when navigating Workflows data.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/64XUtBwldkSXUTJ5xEJBgo/2aa861583c8c56c19194cb0869a15a2a/image8.png" />
          </figure>
    <div>
      <h2>So, how much does it cost?</h2>
      <a href="#so-how-much-does-it-cost">
        
      </a>
    </div>
    <p>It’d be painful if we introduced a powerful new way to build Workers applications but made it cost-prohibitive.</p><p>Workflows is <a href="https://developers.cloudflare.com/workers/platform/pricing/#workers"><u>priced</u></a> just like Cloudflare Workers, where we <a href="https://blog.cloudflare.com/workers-pricing-scale-to-zero/"><u>introduced CPU-based pricing</u></a>: you pay only for active CPU time and requests, not for duration (aka wall time).</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/11WroT4xt0zPj6bsou4u3X/8f2775569f280107345322cb97603b3e/image4.png" />
          </figure><p><sup><i>Workers Standard pricing model</i></sup></p><p>This is especially advantageous when building the long-running, multi-step applications that Workflows enables: if you had to pay while your Workflow was sleeping, waiting on an event, or making a network call to an API, writing the “right” code would be at odds with writing affordable code.</p><p>There’s also no need to keep a Kubernetes cluster or a group of virtual machines running (and burning a hole in your wallet): we manage the infrastructure, and you only pay for the compute your Workflows consume.   </p>
    <div>
      <h2>What’s next?</h2>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>Today, after months of developing the platform, we are announcing the open beta program, and we couldn't be more excited to see how you will be using Workflows. Looking ahead, we want to add capabilities like triggering instances from queue messages, and we have other ideas too, but we are certain that your feedback will help us shape the roadmap.</p><p>We hope that this blog post gets you thinking about how to use Workflows for your next application, but also that it inspires you on what you can build on top of Workers. Workflows as a platform is entirely built on top of Workers, its resources, and APIs. Anyone can do it, too.</p><p>To chat with the team and other developers building on Workflows, join the #workflows-beta channel on the<a href="https://discord.cloudflare.com/"> <u>Cloudflare Developer Discord</u></a>, and keep an eye on the<a href="https://developers.cloudflare.com/workflows/reference/changelog/"> <u>Workflows changelog</u></a> during the beta. Otherwise,<a href="https://developers.cloudflare.com/workflows/get-started/guide/"> visit the Workflows tutorial</a> to get started.</p><p>If you're an engineer, <a href="https://www.cloudflare.com/en-gb/careers/jobs/"><u>look for opportunities</u></a> to work with us and help us improve Workflows or build other products.</p> ]]></content:encoded>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Durable Objects]]></category>
            <category><![CDATA[Workflows]]></category>
            <guid isPermaLink="false">1YRfz7LKvAGrEMbRGhNrFP</guid>
            <dc:creator>Sid Chatterjee</dc:creator>
            <dc:creator>Matt Silverlock</dc:creator>
            <dc:creator>Celso Martinho</dc:creator>
        </item>
        <item>
            <title><![CDATA[Browser Rendering API GA, rolling out Cloudflare Snippets, SWR, and bringing Workers for Platforms to all users]]></title>
            <link>https://blog.cloudflare.com/browser-rendering-api-ga-rolling-out-cloudflare-snippets-swr-and-bringing-workers-for-platforms-to-our-paygo-plans/</link>
            <pubDate>Fri, 05 Apr 2024 13:01:00 GMT</pubDate>
            <description><![CDATA[ Browser Rendering API is now available to all paid Workers customers with improved session management ]]></description>
            <content:encoded><![CDATA[ <p></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5kiBNiPfz0fqooxige54uO/378848632e2d4633c9f41678f1cff82c/Workers-for-Platforms-now-available-for-PAYGO.png" />
            
            </figure>
    <div>
      <h3>Browser Rendering API is now available to all paid Workers customers with improved session management</h3>
      <a href="#browser-rendering-api-is-now-available-to-all-paid-workers-customers-with-improved-session-management">
        
      </a>
    </div>
    <p>In May 2023, we <a href="/browser-rendering-open-beta">announced</a> the open beta program for the <a href="https://developers.cloudflare.com/browser-rendering/">Browser Rendering API</a>. Browser Rendering allows developers to programmatically control and interact with a headless browser instance and create automation flows for their applications and products.</p><p>At the same time, we launched a version of the <a href="https://developers.cloudflare.com/browser-rendering/platform/puppeteer/">Puppeteer library</a> that works with Browser Rendering. With that, developers can use a familiar API on top of Cloudflare Workers to create all sorts of workflows, such as taking screenshots of pages or automated software testing.</p><p>Today, we take Browser Rendering one step further, taking it out of beta and making it available to all paid Workers plans. Furthermore, we are enhancing our API and introducing a new feature that we've been discussing for a long time in the open beta community: session management.</p>
    <div>
      <h3>Session Management</h3>
      <a href="#session-management">
        
      </a>
    </div>
    <p>Session management allows developers to reuse previously opened browsers across Workers scripts. Reusing browser sessions means you don't need to instantiate a new browser for every request and every task, which drastically improves performance and lowers costs.</p><p>Before, to keep a browser instance alive and reuse it, you'd have to implement complex code using Durable Objects. Now, we've simplified that for you by keeping your browsers running in the background and extending the Puppeteer API with new <a href="https://developers.cloudflare.com/browser-rendering/platform/puppeteer/#session-management">session management methods</a> that give you access to all of your running sessions, activity history, and active limits.</p><p>Here’s how you can list your active sessions:</p>
            <pre><code>const sessions = await puppeteer.sessions(env.RENDERING);
console.log(sessions);
[
   {
      "connectionId": "2a2246fa-e234-4dc1-8433-87e6cee80145",
      "connectionStartTime": 1711621704607,
      "sessionId": "478f4d7d-e943-40f6-a414-837d3736a1dc",
      "startTime": 1711621703708
   },
   {
      "sessionId": "565e05fb-4d2a-402b-869b-5b65b1381db7",
      "startTime": 1711621703808
   }
]</code></pre>
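            <p>Given a sessions list like the one above, a Worker can prefer reusing an idle session over launching a new browser. The helper below is a hypothetical sketch (not part of the Puppeteer API): entries without a <code>connectionId</code> are not currently claimed by a Worker.</p>

```typescript
// Hypothetical helper (not part of @cloudflare/puppeteer): pick a session
// that no Worker is currently connected to, so it can be reused instead
// of paying the cost of launching a fresh browser.
type SessionInfo = {
  sessionId: string;
  startTime: number;
  connectionId?: string; // present only while a Worker is attached
};

function pickFreeSession(sessions: SessionInfo[]): string | undefined {
  const free = sessions.filter((s) => s.connectionId === undefined);
  return free.length > 0 ? free[0].sessionId : undefined;
}
```

            <p>Applied to the example output above, this would pick <code>565e05fb-4d2a-402b-869b-5b65b1381db7</code>, the one session with no active connection.</p>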
            <p>We have added a Worker script <a href="https://developers.cloudflare.com/browser-rendering/get-started/reuse-sessions/#4-code">example on how to use session management</a> to the Developer Documentation.</p>
    <div>
      <h3>Analytics and logs</h3>
      <a href="#analytics-and-logs">
        
      </a>
    </div>
    <p>Observability is an essential part of any Cloudflare product. You can find detailed analytics and logs of your Browser Rendering usage in the dashboard under your account's Workers &amp; Pages section.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2jlU3vFhUa0fXCF7lKYq73/9e63676a0dc7bc54da3ab4cf5efd85dd/image4-10.png" />
            
            </figure><p>Browser Rendering is now available to all customers with a paid Workers plan. Each account is <a href="https://developers.cloudflare.com/browser-rendering/platform/limits/">limited</a> to running two new browsers per minute and two concurrent browsers at no cost during this period. Check our <a href="https://developers.cloudflare.com/browser-rendering/get-started/">developers page</a> to get started.</p>
    <div>
      <h3>We are rolling out access to Cloudflare Snippets</h3>
      <a href="#we-are-rolling-out-access-to-cloudflare-snippets">
        
      </a>
    </div>
    <p>Powerful, programmable, and free of charge, Snippets are the best way to perform complex HTTP request and response modifications on Cloudflare. What was once too advanced to achieve using Rules products is now possible with Snippets. Since the initial <a href="/snippets-announcement">announcement</a> during Developer Week 2022, the promise of extending out-of-the-box Rules functionality by writing simple JavaScript code is keeping the Cloudflare community excited.</p><p>During the first 3 months of 2024 alone, the amount of traffic going through Snippets increased over 7x, from an average of 2,200 requests per second in early January to more than 17,000 in March.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6XCqU9QOeEcg9KoaOShf4x/94bba253b62bf832126baf20b18f5cb4/image2-14.png" />
            
            </figure><p>However, instead of opening the floodgates and letting millions of Cloudflare users in to test (and potentially break) Snippets in the most unexpected ways, we are going to pace ourselves and opt for a phased rollout, much like the newly released <a href="/workers-production-safety">Gradual Rollouts</a> for Workers.</p><p>In the next few weeks, 5% of Cloudflare users will start seeing “Snippets” under the Rules tab of the zone-level menu in their dashboard. If you happen to be part of the first 5%, snip into action and try out how fast and powerful Snippets are even for <a href="/cloudflare-snippets-alpha#what-can-you-build-with-cloudflare-snippets">advanced use cases</a> like dynamically changing the date in headers or A/B testing leveraging the <code>Math.random</code> function. Whatever you use Snippets for, just keep one thing in mind: this is still an alpha, so please do not use Snippets for production traffic just yet.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/45mBS7TWL4BL6skGoXRDn3/a99f87e1d885e457b1bb35af5773fdb2/Screenshot-2024-04-04-at-6.12.42-PM.png" />
            
            </figure><p>Until then, keep an eye out for the new Snippets tab in the Cloudflare dashboard and, in the meantime, learn how powerful and flexible Snippets are in the <a href="https://developers.cloudflare.com/rules/snippets">developer documentation</a>.</p>
    <div>
      <h3>Coming soon: asynchronous revalidation with stale-while-revalidate</h3>
      <a href="#coming-soon-asynchronous-revalidation-with-stale-while-revalidate">
        
      </a>
    </div>
    <p>One of the features most requested by our customers is the asynchronous revalidation with stale-while-revalidate (SWR) cache directive, and we will be bringing this to you in the second half of 2024. This functionality will be available by design as part of our new CDN architecture that is being built using Rust with performance and memory safety top of mind.</p><p>Currently, when a client requests a resource, such as a web page or an image, Cloudflare checks to see if the asset is in cache and provides a cached copy if available. If the file is not in the cache or has expired and become stale, Cloudflare connects to the origin server to check for a fresh version of the file and forwards this fresh version to the end user. This wait time adds latency to these requests and impacts performance.</p><p>Stale-while-revalidate is a cache directive that allows the expired or stale version of the asset to be served to the end user while simultaneously allowing Cloudflare to check the origin to see if there's a fresher version of the resource available. If an updated version exists, the origin forwards it to Cloudflare, updating the cache in the process. This mechanism allows the client to receive a response quickly from the cache while ensuring that it always has access to the most up-to-date content. Stale-while-revalidate strikes a balance between serving content efficiently and ensuring its freshness, resulting in improved performance and a smoother user experience.</p><p>Customers who want to be part of our beta testers and “cache” in on the fun can register <a href="https://forms.gle/EEFDtB97sLG5G5Ui9">here</a>, and we will let you know when the feature is ready for testing!</p>
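    <p>The revalidation flow described above can be sketched as a simple decision function. This is an illustration of the cache-directive semantics, not Cloudflare's CDN implementation; the type names and thresholds are our own.</p>

```typescript
// Illustrative sketch of stale-while-revalidate semantics (not
// Cloudflare's CDN code). Given a cached entry's age and the directives
// max-age and stale-while-revalidate, decide how to serve the request.
type CacheEntry = { body: string; ageSeconds: number };
type Decision =
  | "serve-fresh"                // within max-age: serve from cache
  | "serve-stale-and-revalidate" // stale but within the SWR window
  | "fetch-from-origin";         // no usable copy: wait on the origin

function swrDecision(
  entry: CacheEntry | undefined,
  maxAge: number,
  staleWhileRevalidate: number
): Decision {
  if (entry === undefined) return "fetch-from-origin";
  if (entry.ageSeconds <= maxAge) return "serve-fresh";
  if (entry.ageSeconds <= maxAge + staleWhileRevalidate) {
    // Serve the stale copy immediately; refresh from origin in the background.
    return "serve-stale-and-revalidate";
  }
  return "fetch-from-origin";
}
```

    <p>Only the middle case removes origin latency from the response path: the client gets the stale copy instantly while the cache refreshes asynchronously.</p>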
    <div>
      <h3>Coming on April 16, 2024: Workers for Platforms for our pay-as-you-go plan</h3>
      <a href="#coming-on-april-16-2024-workers-for-platforms-for-our-pay-as-you-go-plan">
        
      </a>
    </div>
    <p>Today, we’re excited to share that on April 16th, Workers for Platforms will be available to all developers through our new $25 pay-as-you-go plan!</p><p>Workers for Platforms is changing the way we build software – it gives you the ability to embed personalization and customization directly into your product. With Workers for Platforms, you can deploy custom code on behalf of your users or let your users directly deploy their own code to your platform, without you or your users having to manage any infrastructure. You can use Workers for Platforms with all the exciting announcements that have come out this Developer Week – it supports all the <a href="https://developers.cloudflare.com/workers/configuration/bindings/">bindings</a> that come with Workers (including <a href="https://developers.cloudflare.com/workers-ai/">Workers AI</a>, <a href="https://developers.cloudflare.com/d1/">D1</a> and <a href="https://developers.cloudflare.com/durable-objects/">Durable Objects</a>) as well as <a href="https://developers.cloudflare.com/workers/languages/python/">Python Workers</a>.  </p><p>Here’s what some of our customers – ranging from enterprises to startups – are building on Workers for Platforms:</p><ul><li><p><a href="https://www.shopify.com/plus/solutions/headless-commerce">Shopify Oxygen</a> is a hosting platform for their Remix-based eCommerce framework Hydrogen, and it’s built on Workers for Platforms! The Hydrogen/Oxygen combination gives Shopify merchants control over their buyer experience without the restrictions of generic storefront templates.</p></li><li><p><a href="https://grafbase.com/">Grafbase</a> is a data platform for developers to create a serverless GraphQL API that unifies data sources across a business under one endpoint. 
They use Workers for Platforms to give their developers the control and flexibility to deploy their own code written in JavaScript/TypeScript or WASM.</p></li><li><p><a href="https://www.triplit.dev/">Triplit</a> is an open-source database that syncs data between server and browser in real-time. It allows users to build low latency, real-time applications with features like relational querying, schema management and server-side storage built in. Their query and sync engine is built on top of Durable Objects, and they’re using Workers for Platforms to allow their customers to package custom Javascript alongside their Triplit DB instance.</p></li></ul>
    <div>
      <h3>Tools for observability and platform level controls</h3>
      <a href="#tools-for-observability-and-platform-level-controls">
        
      </a>
    </div>
    <p>Workers for Platforms doesn’t just allow you to deploy Workers to your platform – we also know how important it is to have observability and control over your users’ Workers. We have a few solutions that help with this:</p><ul><li><p><a href="https://developers.cloudflare.com/cloudflare-for-platforms/workers-for-platforms/platform/custom-limits/">Custom Limits</a>: Set CPU time or subrequest caps on your users’ Workers. Can be used to set limits in order to control your costs on Cloudflare and/or shape your own pricing and packaging model. For example, if you run a freemium model on your platform, you can lower the CPU time limit for customers on your free tier.</p></li><li><p><a href="https://developers.cloudflare.com/workers/observability/logging/tail-workers/">Tail Workers</a>: Tail Worker events contain metadata about the Worker, console.log() messages, and capture any unhandled exceptions. They can be used to provide your developers with live logging in order to monitor for errors and troubleshoot in real time.</p></li><li><p><a href="https://developers.cloudflare.com/cloudflare-for-platforms/workers-for-platforms/reference/outbound-workers/">Outbound Workers</a>: Get visibility into all outgoing requests from your users’ Workers. Outbound Workers sit between user Workers and the fetch() requests they make, so you get full visibility over the request before it’s sent out to the Internet.</p></li></ul>
    <div>
      <h3>Pricing</h3>
      <a href="#pricing">
        
      </a>
    </div>
    <p>We wanted to make sure that Workers for Platforms was affordable for hobbyists, solo developers, and indie developers. Workers for Platforms is part of a new $25 pay-as-you-go plan, and it includes the following:</p>
<table>
<thead>
  <tr>
    <th></th>
    <th><span>Included Amounts</span></th>
  </tr>
</thead>
<tbody>
  <tr>
    <td><span>Requests</span></td>
    <td><span>20 million requests/month </span><br /><span>+$0.30 per additional million</span></td>
  </tr>
  <tr>
    <td><span>CPU time</span></td>
    <td><span>60 million CPU milliseconds/month</span><br /><span>+$0.02 per additional million CPU milliseconds</span></td>
  </tr>
  <tr>
    <td><span>Scripts</span></td>
    <td><span>1000 scripts</span><br /><span>+$0.02 per additional script/month</span></td>
  </tr>
</tbody>
</table>
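<p>To make the table concrete, here is a hedged sketch of how a monthly bill could be estimated from those included amounts (the helper and the example usage numbers are ours, not an official calculator):</p>

```typescript
// Rough bill estimator for the pay-as-you-go plan above. The rates come
// from the table; the function itself is a hypothetical illustration.
function estimateMonthlyCost(
  requestsMillions: number,
  cpuMsMillions: number,
  scripts: number
): number {
  const base = 25; // monthly plan price in dollars
  const extraRequests = Math.max(0, requestsMillions - 20) * 0.3;
  const extraCpu = Math.max(0, cpuMsMillions - 60) * 0.02;
  const extraScripts = Math.max(0, scripts - 1000) * 0.02;
  return base + extraRequests + extraCpu + extraScripts;
}

// e.g. 30M requests, 100M CPU-ms, and 1,000 scripts come to about $28.80
```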
    <div>
      <h3>Workers for Platforms will be available to purchase on April 16, 2024!</h3>
      <a href="#workers-for-platforms-will-be-available-to-purchase-on-april-16-2024">
        
      </a>
    </div>
    <p>Workers for Platforms will be available to purchase under the Workers for Platforms tab on the Cloudflare Dashboard on April 16, 2024.</p><p>In the meantime, to learn more about Workers for Platforms, check out our <a href="https://github.com/cloudflare/workers-for-platforms-example">starter project</a> and <a href="https://developers.cloudflare.com/cloudflare-for-platforms/workers-for-platforms/">developer documentation</a>.</p>
            <category><![CDATA[Developer Week]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Application Services]]></category>
            <category><![CDATA[Product News]]></category>
            <category><![CDATA[General Availability]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <guid isPermaLink="false">2wPhlTmw4FThQkJsChhkwy</guid>
            <dc:creator>Tanushree Sharma</dc:creator>
            <dc:creator>Celso Martinho</dc:creator>
            <dc:creator>Nikita Cano</dc:creator>
            <dc:creator>Matt Bullock</dc:creator>
            <dc:creator>Tim Kornhammar</dc:creator>
        </item>
        <item>
            <title><![CDATA[Mitigating a token-length side-channel attack in our AI products]]></title>
            <link>https://blog.cloudflare.com/ai-side-channel-attack-mitigated/</link>
            <pubDate>Thu, 14 Mar 2024 12:30:30 GMT</pubDate>
            <description><![CDATA[ The Workers AI and AI Gateway team recently collaborated closely with security researchers at Ben Gurion University regarding a report submitted through our Public Bug Bounty program. Through this process, we discovered and fully patched a vulnerability affecting all LLM providers. Here’s the story ]]></description>
            <content:encoded><![CDATA[ <p></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5do9zHtgVCZfCILMjoXAmV/0f7e2e3b4bdb298d7fd8c0a97d3b2a19/Mitigating-a-Token-Length-Side-Channel-attack-in-our-AI-products.png" />
            
            </figure><p>Since the discovery of <a href="https://en.wikipedia.org/wiki/CRIME">CRIME</a>, <a href="https://breachattack.com/">BREACH</a>, <a href="https://media.blackhat.com/eu-13/briefings/Beery/bh-eu-13-a-perfect-crime-beery-wp.pdf">TIME</a>, <a href="https://en.wikipedia.org/wiki/Lucky_Thirteen_attack">LUCKY-13</a> etc., length-based side-channel attacks have been considered practical. Even though packets were encrypted, attackers were able to infer information about the underlying plaintext by analyzing metadata like the packet length or timing information.</p><p>Cloudflare was recently contacted by a group of researchers at <a href="https://cris.bgu.ac.il/en/">Ben Gurion University</a> who wrote a paper titled “<a href="https://cdn.arstechnica.net/wp-content/uploads/2024/03/LLM-Side-Channel.pdf">What Was Your Prompt? A Remote Keylogging Attack on AI Assistants</a>” that describes “a novel side-channel that can be used to read encrypted responses from AI Assistants over the web”.</p><p>The Workers AI and AI Gateway team collaborated closely with these security researchers through our <a href="/cloudflare-bug-bounty-program/">Public Bug Bounty program</a>, discovering and fully patching a vulnerability that affects LLM providers. You can read the detailed research paper <a href="https://cdn.arstechnica.net/wp-content/uploads/2024/03/LLM-Side-Channel.pdf">here</a>.</p><p>Since being notified about this vulnerability, we've implemented a mitigation to help secure all Workers AI and AI Gateway customers. As far as we could assess, there was no outstanding risk to Workers AI and AI Gateway customers.</p>
    <div>
      <h3>How does the side-channel attack work?</h3>
      <a href="#how-does-the-side-channel-attack-work">
        
      </a>
    </div>
    <p>In the paper, the authors describe a method in which they intercept the stream of a chat session with an LLM provider, use the network packet headers to infer the length of each token, extract and segment their sequence, and then use their own dedicated LLMs to infer the response.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6EeuXpPSqqvqIZKZUFPKEY/951a777d273caf172933639d9f5d6f12/pasted-image-0--2--3.png" />
            
            </figure><p>The two main requirements for a successful attack are an AI chat client running in <b>streaming</b> mode and a malicious actor capable of capturing network traffic between the client and the AI chat service. In streaming mode, the LLM tokens are emitted sequentially, introducing a token-length side-channel. Malicious actors could eavesdrop on packets via public networks or within an ISP.</p><p>An example request vulnerable to the side-channel attack looks like this:</p>
            <pre><code>curl -X POST \
https://api.cloudflare.com/client/v4/accounts/&lt;account-id&gt;/ai/run/@cf/meta/llama-2-7b-chat-int8 \
  -H "Authorization: Bearer &lt;Token&gt;" \
  -d '{"stream":true,"prompt":"tell me something about portugal"}'</code></pre>
            <p>Let’s use <a href="https://www.wireshark.org/">Wireshark</a> to inspect the network packets on the LLM chat session while streaming:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6sII07hkJGaVXBKlWoBoEW/a1c3be395e0bee3ec5ed690947737d51/media.png" />
            
            </figure><p>The first packet has a length of 95 and corresponds to the token "Port" which has a length of four. The second packet has a length of 93 and corresponds to the token "ug" which has a length of two, and so on. By removing the likely token envelope from the network packet length, it is easy to infer how many tokens were transmitted, as well as their sequence and individual lengths, just by sniffing encrypted network data.</p><p>Since the attacker needs the sequence of individual token lengths, this vulnerability only affects text generation models using streaming. This means that AI inference providers that use streaming — the most common way of interacting with LLMs — like Workers AI, are potentially vulnerable.</p><p>This method requires that the attacker is on the same network or in a position to observe the communication traffic, and its accuracy depends on knowing the target LLM’s writing style. In ideal conditions, the researchers claim that their system “can reconstruct 29% of an AI assistant’s responses and successfully infer the topic from 55% of them”. It’s also important to note that unlike other side-channel attacks, in this case the attacker has no way of evaluating its prediction against the ground truth. That means that we are as likely to get a sentence with near-perfect accuracy as one where the only words that match are conjunctions.</p>
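            <p>The arithmetic in the example above can be sketched directly. The envelope size (91 bytes here) is an assumed constant for this particular stream, and the helper is ours, not the researchers' tooling:</p>

```typescript
// Sketch of the length-inference step: with a fixed per-message envelope,
// token lengths fall out of encrypted packet lengths by subtraction.
const ENVELOPE_BYTES = 91; // assumed constant framing overhead per packet

function inferTokenLengths(packetLengths: number[]): number[] {
  return packetLengths.map((len) => len - ENVELOPE_BYTES);
}

// For the capture above: packets of 95 and 93 bytes yield token
// lengths of 4 ("Port") and 2 ("ug").
```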
    <div>
      <h3>Mitigating LLM side-channel attacks</h3>
      <a href="#mitigating-llm-side-channel-attacks">
        
      </a>
    </div>
    <p>Since this type of attack relies on the length of tokens being inferred from the packet, it can be just as easily mitigated by obscuring token size. The researchers suggested a few strategies to mitigate these side-channel attacks, one of which is the simplest: padding the token responses with random-length noise to obscure the length of the token so that responses cannot be inferred from the packets. While we immediately added the mitigation to our own inference product, Workers AI, we wanted to help customers secure their LLMs regardless of where they are running them by adding it to our AI Gateway.</p><p>As of today, all users of Workers AI and AI Gateway are now automatically protected from this side-channel attack.</p>
    <div>
      <h3>What we did</h3>
      <a href="#what-we-did">
        
      </a>
    </div>
    <p>Once we got word of this research work and how exploiting the technique could potentially impact our AI products, we did what we always do in situations like this: we assembled a team of systems engineers, security engineers, and product managers and started discussing risk mitigation strategies and next steps. We also had a call with the researchers, who kindly attended, presented their conclusions, and answered questions from our teams.</p><p>The research team provided a testing notebook that we could use to validate the attack's results. While we were able to reproduce the results for the notebook's examples, we found that the accuracy varied immensely with our tests using different prompt responses and different LLMs. Nonetheless, the paper has merit, and the risks are not negligible.</p><p>We decided to incorporate the first mitigation suggestion in the paper: including random padding to each message to hide the actual length of tokens in the stream, thereby complicating attempts to infer information based solely on network packet size.</p>
    <div>
      <h3>Workers AI, our inference product, is now protected</h3>
      <a href="#workers-ai-our-inference-product-is-now-protected">
        
      </a>
    </div>
    <p>With our inference-as-a-service product, anyone can use the <a href="https://developers.cloudflare.com/workers-ai/">Workers AI</a> platform and make API calls to our supported AI models. This means that we oversee the inference requests being made to and from the models. As such, we have a responsibility to ensure that the service is secure and protected from potential vulnerabilities. We immediately rolled out a fix once we were notified of the research, and all Workers AI customers are now automatically protected from this side-channel attack. We have not seen any malicious attacks exploiting this vulnerability, other than the ethical testing from the researchers.</p><p>Our solution for Workers AI is a variation of the mitigation strategy suggested in the research document. Since we stream JSON objects rather than the raw tokens, instead of padding the tokens with whitespace characters, we added a new property, "p" (for padding), whose string value has a variable random length.</p><p>Example streaming response using the <a href="https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events">SSE</a> syntax:</p>
            <pre><code>data: {"response":"portugal","p":"abcdefghijklmnopqrstuvwxyz0123456789a"}
data: {"response":" is","p":"abcdefghij"}
data: {"response":" a","p":"abcdefghijklmnopqrstuvwxyz012"}
data: {"response":" southern","p":"ab"}
data: {"response":" European","p":"abcdefgh"}
data: {"response":" country","p":"abcdefghijklmno"}
data: {"response":" located","p":"abcdefghijklmnopqrstuvwxyz012345678"}</code></pre>
            <p>This has the advantage that no modifications are required in the SDK or the client code, the changes are invisible to the end-users, and no action is required from our customers. By adding random variable length to the JSON objects, we introduce the same network-level variability, and the attacker essentially loses the required input signal. Customers can continue using Workers AI as usual while benefiting from this protection.</p>
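            <p>Because the padding lives in its own JSON property, clients that only read <code>response</code> are unaffected. A quick illustration of parsing one padded SSE line:</p>
            <pre><code>// One data line from the padded stream, as shown above.
const line = 'data: {"response":" southern","p":"ab"}';
// Strip the SSE "data: " prefix and parse the JSON payload.
const payload = JSON.parse(line.slice("data: ".length));
console.log(payload.response); // " southern" -- the "p" property is simply ignored</code></pre>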
    <div>
      <h3>One step further: AI Gateway protects users of any inference provider</h3>
      <a href="#one-step-further-ai-gateway-protects-users-of-any-inference-provider">
        
      </a>
    </div>
    <p>We added protection to our AI inference product, but we also have a product that proxies requests to any provider — <a href="https://developers.cloudflare.com/ai-gateway/">AI Gateway</a>. AI Gateway acts as a proxy between a user and supported inference providers, helping developers gain control, performance, and <a href="https://www.cloudflare.com/learning/performance/what-is-observability/">observability</a> over their AI applications. In line with our mission to help build a better Internet, we wanted to quickly roll out a fix that can help all our customers using text generation AIs, regardless of which provider they use or whether that provider has mitigated the attack. To do this, we implemented a similar solution that pads all streaming responses proxied through AI Gateway with random noise of variable length.</p><p>Our AI Gateway customers are now automatically protected against this side-channel attack, even if the upstream inference providers have not yet mitigated the vulnerability. If you are unsure whether your inference provider has patched this vulnerability yet, use AI Gateway to proxy your requests and ensure that you are protected.</p>
    <div>
      <h3>Conclusion</h3>
      <a href="#conclusion">
        
      </a>
    </div>
    <p>At Cloudflare, our mission is to help build a better Internet – that means that we care about all citizens of the Internet, regardless of what their tech stack looks like. We are proud to be able to improve the security of our AI products in a way that is transparent and requires no action from our customers.</p><p>We are grateful to the researchers who discovered this vulnerability and have been very collaborative in helping us understand the problem space. If you are a security researcher who is interested in helping us make our products more secure, check out our Bug Bounty program at <a href="http://hackerone.com/cloudflare">hackerone.com/cloudflare</a>.</p> ]]></content:encoded>
            <category><![CDATA[Bug Bounty]]></category>
            <category><![CDATA[LLM]]></category>
            <category><![CDATA[Vulnerabilities]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Workers AI]]></category>
            <category><![CDATA[AI Gateway]]></category>
            <category><![CDATA[SASE]]></category>
            <guid isPermaLink="false">1R32EruY6C8Pu6LrFCGXwy</guid>
            <dc:creator>Celso Martinho</dc:creator>
            <dc:creator>Michelle Chen</dc:creator>
        </item>
        <item>
            <title><![CDATA[Streaming and longer context lengths for LLMs on Workers AI]]></title>
            <link>https://blog.cloudflare.com/workers-ai-streaming/</link>
            <pubDate>Tue, 14 Nov 2023 14:00:33 GMT</pubDate>
            <description><![CDATA[ Workers AI now supports streaming text responses for the LLM models in our catalog, including Llama-2, using server-sent events ]]></description>
            <content:encoded><![CDATA[ <p></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6hqH5G1qi0RIrmIsdkb1Ql/0d7746c5af2fe23d347ef7192d868b36/pasted-image-0--3--2.png" />
            
            </figure><p>Workers AI is our serverless GPU-powered inference platform running on top of Cloudflare’s global network. It provides a growing catalog of off-the-shelf models that run seamlessly with Workers and enable developers to build powerful and scalable AI applications in minutes. We’ve already seen developers doing amazing things with Workers AI, and we can’t wait to see what they do as we continue to expand the platform. To that end, today we’re excited to announce some of our most-requested new features: streaming responses for all <a href="https://www.cloudflare.com/learning/ai/what-is-large-language-model/">Large Language Models</a> (LLMs) on Workers AI, larger context and sequence windows, and a full-precision <a href="https://developers.cloudflare.com/workers-ai/models/llm/">Llama-2</a> model variant.</p><p>If you’ve used ChatGPT before, then you’re familiar with the benefits of response streaming, where responses flow in token by token. LLMs work internally by generating responses sequentially using a process of repeated inference — the full output of an LLM is essentially a sequence of hundreds or thousands of individual prediction tasks. For this reason, while it only takes a few milliseconds to generate a single token, generating the full response takes longer, on the order of seconds. The good news is we can start displaying the response as soon as the first tokens are generated, and append each additional token until the response is complete. This yields a much better experience for the end user — displaying text incrementally as it's generated not only provides instant responsiveness, but also gives the end user time to read and interpret the text.</p><p>As of today, you can now use response streaming for any LLM model in our catalog, including the very popular <a href="https://developers.cloudflare.com/workers-ai/models/llm/">Llama-2 model</a>. Here’s how it works.</p>
    <div>
      <h3>Server-sent events: a little gem in the browser API</h3>
      <a href="#server-sent-events-a-little-gem-in-the-browser-api">
        
      </a>
    </div>
    <p><a href="https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events">Server-sent events</a> are easy to use, simple to implement on the server side, standardized, and broadly available across many platforms, natively or as a polyfill. Server-sent events fill the niche of handling a stream of updates from the server, removing the need for the boilerplate code that would otherwise be necessary.</p>
<table>
<thead>
  <tr>
    <th></th>
    <th><span>Easy-to-use</span></th>
    <th><span>Streaming</span></th>
    <th><span>Bidirectional</span></th>
  </tr>
</thead>
<tbody>
  <tr>
    <td><span>fetch</span></td>
    <td><span>✅</span></td>
    <td></td>
    <td></td>
  </tr>
  <tr>
    <td><span>Server-sent events</span></td>
    <td><span>✅</span></td>
    <td><span>✅</span></td>
    <td></td>
  </tr>
  <tr>
    <td><span>Websockets</span></td>
    <td></td>
    <td><span>✅</span></td>
    <td><span>✅</span></td>
  </tr>
</tbody>
</table><p><sup>Comparing fetch, server-sent events, and websockets</sup></p><p>To get started using streaming on Workers AI’s text generation models with server-sent events, set the “stream” parameter to true in the input of the request. This will change the response format and <code>mime-type</code> to <code>text/event-stream</code>.</p><p>Here’s an example of using streaming with the <a href="https://developers.cloudflare.com/workers-ai/get-started/rest-api/">REST API</a>:</p>
            <pre><code>curl -X POST \
"https://api.cloudflare.com/client/v4/accounts/&lt;account&gt;/ai/run/@cf/meta/llama-2-7b-chat-int8" \
-H "Authorization: Bearer &lt;token&gt;" \
-H "Content-Type:application/json" \
-d '{ "prompt": "where is new york?", "stream": true }'

data: {"response":"New"}

data: {"response":" York"}

data: {"response":" is"}

data: {"response":" located"}

data: {"response":" in"}

data: {"response":" the"}

...

data: [DONE]</code></pre>
            <p>And here’s an example using a Worker script:</p>
            <pre><code>import { Ai } from "@cloudflare/ai";
export default {
    async fetch(request, env, ctx) {
        const ai = new Ai(env.AI, { sessionOptions: { ctx: ctx } });
        const stream = await ai.run(
            "@cf/meta/llama-2-7b-chat-int8",
            { prompt: "where is new york?", stream: true  }
        );
        return new Response(stream,
            { headers: { "content-type": "text/event-stream" } }
        );
    }
}</code></pre>
            <p>If you want to consume the output event-stream from this Worker in a browser page, the client-side JavaScript is something like:</p>
            <pre><code>// "el" is assumed to be the page element that displays the response,
// for example a div with id="response".
const el = document.getElementById("response");

const source = new EventSource("/worker-endpoint");
source.onmessage = (event) =&gt; {
    if (event.data == "[DONE]") {
        // SSE spec says the connection is restarted
        // if we don't explicitly close it
        source.close();
        return;
    }
    const data = JSON.parse(event.data);
    el.innerHTML += data.response;
}</code></pre>
            <p>You can use this simple code with any HTML page or with complex SPAs using React or other Web frameworks.</p><p>This creates a much more interactive experience for the user, who now sees the page update as the response is incrementally generated, instead of waiting with a spinner until the entire response sequence is complete. Try out streaming on <a href="https://ai.cloudflare.com">ai.cloudflare.com</a>.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6VIIO6crNIkpaz8hG9n8jg/c703ab696213d0fa814aff31d6d36d09/llama-streaming.gif" />
            
            </figure><p>Workers AI supports streaming text responses for the <a href="https://developers.cloudflare.com/workers-ai/models/llm/">Llama-2</a> model and any future LLM models we add to our catalog.</p><p>But this is not all.</p>
    <div>
      <h3>Higher precision, longer context and sequence lengths</h3>
      <a href="#higher-precision-longer-context-and-sequence-lengths">
        
      </a>
    </div>
    <p>Another top request we heard from our community after the launch of Workers AI was for longer questions and answers in our Llama-2 model. In LLM terminology, this translates to a higher context length (the number of tokens the model takes as input before making the prediction) and a higher sequence length (the number of tokens the model generates in the response).</p><p>We’re listening, and in conjunction with streaming, today we are adding a 16-bit full-precision Llama-2 variant to the catalog, and increasing the context and sequence lengths for the existing 8-bit version.</p>
<table>
<thead>
  <tr>
    <th><span>Model</span></th>
    <th><span>Context length (in)</span></th>
    <th><span>Sequence length (out)</span></th>
  </tr>
</thead>
<tbody>
  <tr>
    <td><span>@cf/meta/llama-2-7b-chat-int8</span></td>
    <td><span>2048 (768 before)</span></td>
    <td><span>1800 (256 before)</span></td>
  </tr>
  <tr>
    <td><span>@cf/meta/llama-2-7b-chat-fp16</span></td>
    <td><span>3072</span></td>
    <td><span>2500</span></td>
  </tr>
</tbody>
</table><p>Streaming, higher precision, and longer context and sequence lengths provide a better user experience and enable new, richer applications using large language models in Workers AI.</p><p>Check the Workers AI <a href="https://developers.cloudflare.com/workers-ai">developer documentation</a> for more information and options. If you have any questions or feedback about Workers AI, please come see us in the <a href="https://community.cloudflare.com/">Cloudflare Community</a> and the <a href="https://discord.gg/cloudflaredev">Cloudflare Discord</a>. If you are interested in machine learning and serverless AI, the Cloudflare Workers AI team is building a global-scale platform and tools that enable our customers to run fast, low-latency inference tasks on top of our network. Check our <a href="https://www.cloudflare.com/careers/jobs/">jobs page</a> for opportunities.</p> ]]></content:encoded>
            <category><![CDATA[Workers AI]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[JavaScript]]></category>
            <category><![CDATA[Serverless]]></category>
            <category><![CDATA[1.1.1.1]]></category>
            <guid isPermaLink="false">4RWvzttPkO6JoYsMwoovJ8</guid>
            <dc:creator>Jesse Kipp</dc:creator>
            <dc:creator>Celso Martinho</dc:creator>
        </item>
        <item>
            <title><![CDATA[Email Routing subdomain support, new APIs and security protocols]]></title>
            <link>https://blog.cloudflare.com/email-routing-subdomains/</link>
            <pubDate>Thu, 26 Oct 2023 13:10:06 GMT</pubDate>
            <description><![CDATA[ It's been two years since we announced Email Routing, our solution to create custom email addresses for your domains and route incoming emails to your preferred mailbox. Since then, the team has worked hard to evolve the product and add more powerful features to meet our users' expectations.  ]]></description>
            <content:encoded><![CDATA[ <p></p><p>It's been two years since we announced Email Routing, our solution to create custom email addresses for your domains and route incoming emails to your preferred mailbox. Since then, the team has worked hard to evolve the product and add more powerful features to meet our users' expectations. Examples include <a href="/announcing-route-to-workers/">Route to Workers</a>, which allows you to <a href="https://developers.cloudflare.com/email-routing/email-workers/">process your Emails programmatically</a> using Workers scripts, <a href="/email-routing-leaves-beta/">Public APIs</a>, Audit Logs, or <a href="/dmarc-management/">DMARC Management</a>.</p><p>We also made significant progress in supporting more email security extensions and protocols, protecting our customers from unwanted traffic, and keeping our IP space reputation for email egress impeccable to maximize our deliverability rates to whatever inbox upstream provider you chose.</p><p>Since <a href="/email-routing-leaves-beta/">leaving beta</a>, Email Routing has grown into one of our most popular products; it’s used by more than one million different customer zones globally, and we forward around 20 million messages daily to every major email platform out there. Our product is mature, robust enough for general usage, and suitable for any production environment. And it keeps evolving: today, we announce three new features that will help make Email Routing more secure, flexible, and powerful than ever.</p>
    <div>
      <h2>New security protocols</h2>
      <a href="#new-security-protocols">
        
      </a>
    </div>
    <p>The SMTP email protocol has been around since the early 80s. Naturally, it wasn't designed with the best security practices and requirements in mind, at least not the ones that the Internet expects today. For that reason, several protocol revisions and extensions have been standardized and adopted by the community over the years. Cloudflare is known for being an early adopter of promising emerging technologies; Email Routing already <a href="https://developers.cloudflare.com/email-routing/postmaster/">supports</a> things like SPF, DKIM signatures, DMARC policy enforcement, TLS transport, STARTTLS, and IPv6 egress, to name a few. Today, we are introducing support for two new standards to help <a href="https://www.cloudflare.com/zero-trust/products/email-security/">increase email security</a> and improve deliverability to third-party upstream email providers.</p>
    <div>
      <h3>ARC</h3>
      <a href="#arc">
        
      </a>
    </div>
    <p><a href="https://arc-spec.org/">Authenticated Received Chain</a> (ARC) is an email authentication system designed to allow an intermediate email server (such as Email Routing) to preserve email authentication results. In other words, with ARC, we can securely preserve the results of validating sender authentication mechanisms like SPF and DKIM, which we support when the email is received, and transport that information to the upstream provider when we forward the message. ARC establishes a chain of trust across all the hops the message has passed through, so if the message was tampered with or changed at one of the hops, following that chain makes it possible to see where.</p><p>We began rolling out ARC support to Email Routing a few weeks ago. Here’s how it works:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/67xk7IFzgYjOSwQEqUSbY/d48e08b735580f20fcafca988bb43748/pasted-image-0--1--2.png" />
            
            </figure><p>As you can see, <code>joe@example.com</code> sends an email to <code>henry@domain.example</code>, an Email Routing address, which in turn is forwarded to the final address, <code>example@gmail.com</code>.</p><p>Email Routing will use <code>@example.com</code>’s DMARC policy to check the SPF and DKIM alignments (SPF, DKIM, and DMARC <a href="https://www.cloudflare.com/learning/email-security/dmarc-dkim-spf/">help authenticate</a> email senders by verifying that the emails came from the domain that they claim to be from). It then stores this authentication result by adding an <code>ARC-Authentication-Results</code> header to the message:</p>
            <pre><code>ARC-Authentication-Results: i=1; mx.cloudflare.net; dkim=pass header.d=cloudflare.com header.s=example09082023 header.b=IRdayjbb; dmarc=pass header.from=example.com policy.dmarc=reject; spf=none (mx.cloudflare.net: no SPF records found for postmaster@example.com) smtp.helo=smtp.example.com; spf=pass (mx.cloudflare.net: domain of joe@example.com designates 2a00:1440:4824:20::32e as permitted sender) smtp.mailfrom=joe@example.com; arc=none smtp.remote-ip=2a00:1440:4824:20::32e</code></pre>
            <p>Then we take a snapshot of all the headers and the body of the original message, and we generate an <code>ARC-Message-Signature</code> header with a DKIM-like cryptographic signature (in fact, ARC uses the same DKIM keys):</p>
            <pre><code>ARC-Message-Signature: i=1; a=rsa-sha256; s=2022; d=email.cloudflare.net; c=relaxed/relaxed; h=To:Date:Subject:From:reply-to:cc:resent-date:resent-from:resent-to :resent-cc:in-reply-to:references:list-id:list-help:list-unsubscribe :list-subscribe:list-post:list-owner:list-archive; t=1697709687; bh=sN/+...aNbf==;</code></pre>
            <p>Finally, before forwarding the message to <code>example@gmail.com</code>, Email Routing generates the <code>ARC-Seal</code> header, another DKIM-like signature computed over the <code>ARC-Authentication-Results</code> and <code>ARC-Message-Signature</code> headers, and cryptographically “seals” the message:</p>
            <pre><code>ARC-Seal: i=1; a=rsa-sha256; s=2022; d=email.cloudflare.net; cv=none; b=Lx35lY6..t4g==;</code></pre>
            <p>When Gmail receives the message from Email Routing, it not only authenticates the last hop, the domain.example domain (Email Routing uses <a href="https://developers.cloudflare.com/email-routing/postmaster/#sender-rewriting">SRS</a>), but also checks the ARC-Seal header, which carries the authentication results of the original sender.</p><p>ARC increases the traceability of the message path through email intermediaries, allowing for more informed delivery decisions by those who receive emails, as well as higher deliverability rates for those who transport them, like Email Routing. It has been adopted by all the major email providers, including <a href="https://support.google.com/a/answer/175365?hl=en">Gmail</a> and Microsoft. You can read more about the ARC protocol in <a href="https://datatracker.ietf.org/doc/html/rfc8617">RFC 8617</a>.</p>
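            <p>Each set of ARC headers carries an instance number (<code>i=</code>) that increments at every hop, which is what makes the chain traceable. Here is a toy illustration of reading that index from an <code>ARC-Seal</code> header value; a real validator must also verify the cryptographic signatures of every instance in the chain:</p>
            <pre><code>// Toy example: extract the ARC instance number ("i=") from a header value.
// This only reads the chain index; it performs no cryptographic validation.
function arcInstance(header) {
  const match = header.match(/(?:^|;)\s*i=(\d+)/);
  return match ? Number(match[1]) : null;
}

const seal = "i=1; a=rsa-sha256; s=2022; d=email.cloudflare.net; cv=none; b=Lx35lY6..t4g==;";
console.log(arcInstance(seal)); // 1 -- the first (and here only) hop in the chain</code></pre>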
    <div>
      <h3>MTA-STS</h3>
      <a href="#mta-sts">
        
      </a>
    </div>
    <p>As we said earlier, SMTP is an old protocol. Initially, email communications were sent in the clear: plain text, unencrypted. In the late 90s, the email provider community standardized STARTTLS, also known as Opportunistic TLS. The <a href="https://datatracker.ietf.org/doc/html/rfc3207">STARTTLS extension</a> allows a client in an SMTP session to upgrade to TLS-encrypted communications.</p><p>While at the time this seemed like a step in the right direction, we later learned that because STARTTLS begins with an unencrypted plain-text connection, which can be hijacked, the protocol is <a href="https://lwn.net/Articles/866481/">susceptible to man-in-the-middle attacks</a>.</p><p>A few years ago, MTA Strict Transport Security (<a href="https://datatracker.ietf.org/doc/html/rfc8461">MTA-STS</a>) was introduced by email service providers including Microsoft, Google, and Yahoo to protect against downgrade and man-in-the-middle attacks in SMTP sessions, and to address the lack of security-first communication standards in email.</p><p>Suppose that <code>example.com</code> uses Email Routing. Here’s how you can enable MTA-STS for it.</p><p>First, log in to the <a href="https://dash.cloudflare.com/">Cloudflare dashboard</a> and select your account and zone. Then go to <b>DNS</b> &gt; <b>Records</b> and create a new CNAME record with the name “<code>_mta-sts</code>” that points to Cloudflare’s record “<code>_mta-sts.mx.cloudflare.net</code>”. Make sure to disable the proxy mode.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4czTYhSi9X5kPU3TZ0m861/e7d8162ff6f40494ce6d11fbf5899dad/pasted-image-0-2.png" />
            
            </figure><p>Confirm that the record was created:</p>
            <pre><code>$ dig txt _mta-sts.example.com
_mta-sts.example.com.	300	IN	CNAME	_mta-sts.mx.cloudflare.net.
_mta-sts.mx.cloudflare.net. 300	IN	TXT	"v=STSv1; id=20230615T153000;"</code></pre>
            <p>This tells a connecting client that we support MTA-STS.</p><p>Next, you need an HTTPS endpoint at <code>mta-sts.example.com</code> to serve your policy file. This file defines the mail servers in the domain that use MTA-STS. HTTPS is used here instead of DNS because not everyone uses DNSSEC yet, and we want to avoid adding another MITM attack vector.</p><p>To do this, deploy a very simple Worker that allows email clients to fetch Cloudflare’s Email Routing <a href="https://mta-sts.mx.cloudflare.net/.well-known/mta-sts.txt">policy</a> file using the <a href="https://en.wikipedia.org/wiki/Well-known_URI">“well-known” URI</a> convention. Go to your <b>Account</b> &gt; <b>Workers &amp; Pages</b> and press <b>Create Application</b>. Pick the “MTA-STS” template from the list.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6BBFtG8hiHehJw74L2DbHX/d2afee1d61f266382082c08681e05e1a/pasted-image-0--2--2.png" />
            
            </figure><p>This Worker simply proxies <code>https://mta-sts.mx.cloudflare.net/.well-known/mta-sts.txt</code> to your own domain. After deploying it, go to the Worker configuration, then <b>Triggers</b> &gt; <b>Custom Domains</b> and <b>Add Custom Domain</b>.</p>
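            <p>The template is essentially a small proxy. A minimal hand-written sketch of the same idea could look like this (illustrative only; the actual template may differ in its details):</p>
            <pre><code>// Sketch of an MTA-STS policy proxy Worker: serve Cloudflare's Email
// Routing policy file from your own mta-sts hostname via the
// "well-known" URI, and return 404 for everything else.
function policyTarget(pathname) {
  return pathname === "/.well-known/mta-sts.txt"
    ? "https://mta-sts.mx.cloudflare.net/.well-known/mta-sts.txt"
    : null;
}

export default {
  async fetch(request) {
    const target = policyTarget(new URL(request.url).pathname);
    return target ? fetch(target) : new Response("Not found", { status: 404 });
  },
};</code></pre>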
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7MWHc7AuevDzxafJ0gfaFb/f659d8c0ae8c30f9a1457bc4b20f3535/customdomains.png" />
            
            </figure><p>You can then confirm that your policy file is working:</p>
            <pre><code>$ curl https://mta-sts.example.com/.well-known/mta-sts.txt
version: STSv1
mode: enforce
mx: *.mx.cloudflare.net
max_age: 86400</code></pre>
            <p>This says that we enforce MTA-STS. Capable email clients will only deliver email to this domain over a secure connection to the specified MX servers; if no secure connection can be established, the email will not be delivered.</p><p>Email Routing also supports MTA-STS upstream, which greatly improves security when forwarding your emails to service providers like <a href="https://support.google.com/a/answer/9261504?hl=en">Gmail</a>, <a href="https://learn.microsoft.com/en-us/purview/enhancing-mail-flow-with-mta-sts">Microsoft</a>, and others.</p><p>While enabling MTA-STS involves a few steps today, we plan to simplify things and automatically configure MTA-STS for your domains from the Email Routing dashboard as a future improvement.</p>
    <div>
      <h2>Sending emails and replies from Workers</h2>
      <a href="#sending-emails-and-replies-from-workers">
        
      </a>
    </div>
    <p>Last year we announced <a href="https://developers.cloudflare.com/email-routing/email-workers/">Email Workers</a>, allowing anyone using Email Routing to associate a Worker script with an email address rule and programmatically process their incoming emails in any way they want. <a href="https://developers.cloudflare.com/workers/">Workers</a> is our serverless compute platform; it provides hundreds of features and APIs, like <a href="https://developers.cloudflare.com/workers/databases/">databases</a> and <a href="https://developers.cloudflare.com/r2/api/workers/workers-api-reference/">storage</a>. Email Workers opened the door to a flood of use cases and applications that weren’t possible before, like implementing allow/block lists, advanced rules, notifications to messaging applications, honeypot aggregators, and more.</p><p>Still, you could only act on the incoming email event. You could read and process the email message, and you could even manipulate and create some headers, but you couldn’t rewrite the body of the message or create new emails from scratch.</p><p>Today we’re announcing two new powerful Email Workers APIs that further enhance what you can do with Email Routing and Workers.</p>
    <div>
      <h3>Send emails from Workers</h3>
      <a href="#send-emails-from-workers">
        
      </a>
    </div>
    <p>Now you can send an email from any Worker, from scratch, whenever you want, not just when you receive incoming messages, to any email address verified on Email Routing under your account. Here are a few practical examples where sending email from Workers to your verified addresses can be helpful:</p><ul><li><p>Daily digests with the news from your favorite publications.</p></li><li><p>Alert messages whenever the weather conditions are adverse.</p></li><li><p>Automatic notifications when systems complete tasks.</p></li><li><p>A message composed from the inputs of an online contact form.</p></li></ul><p>Let's see a simple example of a Worker sending an email. First, you need to create a “<code>send_email</code>” binding in your wrangler.toml configuration:</p>
            <pre><code>send_email = [
  { type = "send_email", name = "EMAIL_OUT" }
]</code></pre>
            <p>Then, creating a new message and sending it from a Worker is as simple as:</p>
            <pre><code>import { EmailMessage } from "cloudflare:email";
import { createMimeMessage } from "mimetext";

export default {
 async fetch(request, env) {
   const msg = createMimeMessage();
   msg.setSender({ name: "Workers AI story", addr: "joe@example.com" });
   msg.setRecipient("mary@domain.example");
   msg.setSubject("An email generated in a worker");
   msg.addMessage({
       contentType: 'text/plain',
       data: `Congratulations, you just sent an email from a worker.`
   });

   const message = new EmailMessage(
     "joe@example.com",
     "mary@domain.example",
     msg.asRaw()
   );
   try {
     await env.EMAIL_OUT.send(message);
   } catch (e) {
     return new Response(e.message);
   }

   return new Response("email sent!");
 },
};</code></pre>
            <p>This example makes use of <a href="https://muratgozel.github.io/MIMEText/">mimetext</a>, an open-source raw email message generator.</p><p>Again, for security reasons, you can only send emails to the addresses for which you confirmed ownership in Email Routing under your Cloudflare account. If you’re looking to send email campaigns or newsletters to destination addresses that you do not control, or to larger subscription groups, you should consider other options like our <a href="/sending-email-from-workers-with-mailchannels/">MailChannels integration</a>.</p><p>Since sending emails from Workers is not tied to the EmailEvent, you can send them from any type of Worker, including <a href="https://developers.cloudflare.com/workers/configuration/cron-triggers/">Cron Triggers</a> and <a href="https://developers.cloudflare.com/durable-objects/">Durable Objects</a>, whenever you want; you control all the logic.</p>
    <div>
      <h3>Reply to emails</h3>
      <a href="#reply-to-emails">
        
      </a>
    </div>
    <p>One of our most-requested features has been to provide a way to programmatically respond to incoming emails. It has been possible to do this with Email Workers in a very limited capacity by returning a permanent SMTP error message — but this may or may not be visible to the end user depending on the client implementation.</p>
            <pre><code>export default {
  async email(message, env, ctx) {
      message.setReject("Address not allowed");
  }
}
</code></pre>
            <p>As of today, you can now truly reply to incoming emails with another new message and implement smart auto-responders programmatically, adding any content and context in the main body of the message. Think of a customer support email automatically generating a ticket and returning the link to the sender, an out-of-office reply with instructions when you're on vacation, or a detailed explanation of why you rejected an email. Here’s a code example:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4NgbXFwy3Xw0VHLemZ4smZ/682a581c21af850880fada5bbc17e99f/Screenshot-2023-10-26-at-12.05.33.png" />
            
            </figure><p>To mitigate security risks and abuse, replying to incoming emails has a few requirements:</p><ul><li><p>The incoming email has to have valid DMARC.</p></li><li><p>The email can only be replied to once.</p></li><li><p>The <code>In-Reply-To</code> header of the reply message must match the <code>Message-ID</code> of the incoming message.</p></li><li><p>The recipient of the reply must match the incoming sender.</p></li><li><p>The outgoing sender domain must match the domain that received the email.</p></li></ul><p>If these and other internal conditions are not met, then <code>reply()</code> will fail with an exception; otherwise, you can freely compose your reply message and send it back to the original sender.</p><p>For more information, see the documentation for these APIs in our <a href="https://developers.cloudflare.com/email-routing/email-workers/runtime-api/">Developer Docs</a>.</p>
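            <p>As a sketch, an auto-responder that respects these requirements could look like the following. It reuses the mimetext library from the sending example above; the addresses and reply text are hypothetical placeholders:</p>
            <pre><code>import { EmailMessage } from "cloudflare:email";
import { createMimeMessage } from "mimetext";

export default {
  async email(message, env, ctx) {
    const msg = createMimeMessage();
    // The In-Reply-To header must match the Message-ID of the incoming email.
    msg.setHeader("In-Reply-To", message.headers.get("Message-ID"));
    // The sender must be on the same domain that received the email
    // (addresses here are hypothetical placeholders).
    msg.setSender({ name: "Support", addr: "support@example.com" });
    // The recipient of the reply must match the original sender.
    msg.setRecipient(message.from);
    msg.setSubject("We received your message");
    msg.addMessage({
      contentType: "text/plain",
      data: "Thanks for reaching out. We have opened a ticket for your request.",
    });

    await message.reply(
      new EmailMessage("support@example.com", message.from, msg.asRaw())
    );
  },
};</code></pre>
            <p>This sketch only runs inside the Workers runtime, since it depends on the <code>cloudflare:email</code> module and the incoming email event.</p>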
    <div>
      <h2>Subdomains support</h2>
      <a href="#subdomains-support">
        
      </a>
    </div>
    <p>This is a big one.</p><p>Email Routing is a <a href="https://developers.cloudflare.com/fundamentals/concepts/accounts-and-zones/#zones">zone-level</a> feature. A zone has a <a href="https://www.cloudflare.com/learning/dns/top-level-domain/">top-level domain</a> (the same as the zone name) and it can have subdomains (managed under the DNS feature). As an example, I can have the <code>example.com</code> zone, and then the <code>mail.example.com</code> and <code>corp.example.com</code> subdomains under it. Until now, however, Email Routing could only be used with the top-level domain of the zone, <code>example.com</code> in this example. While this is fine for the vast majority of use cases, some customers — particularly bigger organizations with complex email requirements — have asked for more flexibility.</p><p>This changes today. Now you can use Email Routing with any subdomain of any zone in your account. To make this possible we redesigned the dashboard UI experience to make it easier to get you started and manage all your Email Routing domains and subdomains, rules and destination addresses in a single place. Let’s see how it works.</p><p>To add Email Routing features to a new subdomain, log in to the <a href="https://dash.cloudflare.com/">Cloudflare dashboard</a> and select your account and zone. Then go to <b>Email</b> &gt; <b>Email Routing</b> &gt; <b>Settings</b> and click “Add subdomain”.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1WwS0LP1o8Ijlk0IzcqzCE/8528ed0f90a34029777d66b411d9e696/prev-req-rec.png" />
            
            </figure><p>Once the subdomain is added and the DNS records are configured, you can see it in the <b>Settings</b> list under the <b>Subdomains</b> section:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7gwBTgYQ36QxcvCGHfBqEd/450707647df2a8277eb0dc66e966088e/Domain.png" />
            
            </figure><p>Now you can go to <b>Email</b> &gt; <b>Email Routing</b> &gt; <b>Routing rules</b> and create new custom addresses that will show you the option of using either the top domain of the zone or any other configured subdomain.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1KJ9AIM6MpcaYeV5IrVZQw/1e306de0bd46177eb2601e8e4e600930/Screenshot-2023-10-25-at-11.55.31-AM.png" />
            
            </figure><p>After the new custom address for the subdomain is created you can see it in the list with all the other addresses, and manage it from there.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6vEJFroWoVivSr9n6SwPVl/28a4938f201e4153c964895d4687f1b2/custom-addresses.png" />
            
            </figure><p>It’s this easy.</p>
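<p>The same can also be done programmatically. As a hedged sketch, a routing rule for a subdomain custom address can be created through the Email Routing API; the endpoint path and payload shape below are assumptions based on the Email Routing API documentation (and <code>ZONE_ID</code>/<code>API_TOKEN</code> are placeholders), so verify the schema in the Developer Docs before relying on it.</p>

```javascript
// Build the JSON payload for an Email Routing rule that forwards mail
// sent to a custom address on a subdomain to a destination address.
function buildRoutingRule(customAddress, destination) {
  return {
    name: `Forward ${customAddress}`,
    enabled: true,
    matchers: [{ type: "literal", field: "to", value: customAddress }],
    actions: [{ type: "forward", value: [destination] }],
  };
}

const rule = buildRoutingRule("hello@mail.example.com", "me@example.net");

// Hypothetical API call (requires a real zone ID and API token):
// await fetch(
//   `https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/email/routing/rules`,
//   {
//     method: "POST",
//     headers: {
//       Authorization: `Bearer ${API_TOKEN}`,
//       "Content-Type": "application/json",
//     },
//     body: JSON.stringify(rule),
//   }
// );
```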
    <div>
      <h2>Final words</h2>
      <a href="#final-words">
        
      </a>
    </div>
    <p>We hope you enjoy the new features that we are announcing today. And to be clear: there are no changes in pricing, and Email Routing is still free for Cloudflare customers.</p><p>Ever since Email Routing launched, we’ve been listening to customers’ feedback and adjusting our roadmap to reflect both our own requirements and their ideas and requests. Email shouldn't be difficult; our goal is to listen, learn, and keep improving the <a href="https://www.cloudflare.com/zero-trust/solutions/email-security-services/">email security service</a> with better, more powerful features.</p><p>You can find detailed information about the new features and more in our Email Routing <a href="https://developers.cloudflare.com/email-routing">Developer Docs</a>.</p><p>If you have any questions or feedback about Email Routing, please come see us in the <a href="https://community.cloudflare.com/new-topic?category=Feedback/Previews%20%26%20Betas&amp;tags=email">Cloudflare Community</a> and the <a href="https://discord.gg/cloudflaredev">Cloudflare Discord</a>.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1OKqc3VieWKGRFBDtPU7io/18e8d2db548d341b0cb78a111aaa8480/Email-Routing-spot.png" />
            
            </figure><p></p> ]]></content:encoded>
            <category><![CDATA[Email Routing]]></category>
            <category><![CDATA[Email Workers]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Developers]]></category>
            <guid isPermaLink="false">54W5SKQEt6kELFJMaWSRyh</guid>
            <dc:creator>Celso Martinho</dc:creator>
            <dc:creator>André Cruz</dc:creator>
            <dc:creator>Nelson Duarte</dc:creator>
        </item>
        <item>
            <title><![CDATA[You can now use WebGPU in Cloudflare Workers]]></title>
            <link>https://blog.cloudflare.com/webgpu-in-workers/</link>
            <pubDate>Wed, 27 Sep 2023 13:00:56 GMT</pubDate>
            <description><![CDATA[ Today, we are introducing WebGPU support to Cloudflare Workers. This blog will explain why it's important, why we did it, how you can use it, and what comes next ]]></description>
            <content:encoded><![CDATA[ <p>The browser as an app platform is real and stronger every day; long gone are the Browser Wars. Vendors and standards bodies have done amazingly well over the past several years, working together and advancing web standards with new <a href="https://www.cloudflare.com/learning/security/api/what-is-an-api/">APIs</a> that allow developers to build fast and powerful applications, finally comparable to those we're used to seeing in native OS environments.</p><p>Today, browsers can render web pages and run code that interfaces with an <a href="https://developer.mozilla.org/en-US/docs/Web/API">extensive catalog of modern Web APIs</a>. Things like networking, rendering accelerated graphics, or even accessing low-level hardware features like USB devices are all now possible within the browser sandbox.</p><p>One of the most exciting new browser APIs that browser vendors have been rolling out over the last few months is WebGPU, a modern, low-level GPU programming interface designed for high-performance 2D and 3D graphics and general-purpose GPU compute.</p><p>Today, we are introducing <a href="https://developer.chrome.com/blog/webgpu-release/">WebGPU</a> support to Cloudflare Workers. This blog will explain why it's important, why we did it, how you can use it, and what comes next.</p>
    <div>
      <h3>The history of the GPU in the browser</h3>
      <a href="#the-history-of-the-gpu-in-the-browser">
        
      </a>
    </div>
    <p>To understand why WebGPU is a big deal, we must revisit history and see how browsers went from relying only on the CPU for everything in the early days to taking advantage of GPUs over the years.</p><p>In 2011, <a href="https://en.wikipedia.org/wiki/WebGL">WebGL 1</a>, a limited port of <a href="https://www.khronos.org/opengles/">OpenGL ES 2.0</a>, was introduced, providing an API for fast, accelerated 3D graphics in the browser for the first time. At the time, this was something of a revolution in enabling gaming and 3D visualizations in the browser. Some of the most popular 3D animation frameworks, like <a href="https://threejs.org/">Three.js</a>, launched in the same period. Who doesn't remember going to the (now defunct) <a href="https://en.wikipedia.org/wiki/Google_Chrome_Experiments">Google Chrome Experiments</a> page and spending hours in awe exploring the demos? Another option then was using the Flash Player, which was still dominant in the desktop environment, and its <a href="https://en.wikipedia.org/wiki/Stage3D">Stage 3D</a> API.</p><p>Later, in 2017, building on the learnings and shortcomings of its predecessor, WebGL 2 arrived as a significant upgrade, bringing more advanced GPU capabilities like shaders and more flexible textures and rendering.</p><p>WebGL, however, presents a steep and complex learning curve for developers who want to take control, do low-level 3D graphics on the GPU, and avoid third-party abstraction libraries.</p><p>Furthermore and more importantly, with the advent of <a href="https://www.cloudflare.com/learning/ai/what-is-machine-learning/">machine learning</a> and cryptography, we discovered that GPUs are great not only at drawing graphics but also at other workloads that benefit from high memory bandwidth and blazing-fast matrix multiplications, and one can use them to perform general computation. 
This became known as <a href="https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units">GPGPU</a>, short for general-purpose computing on graphics processing units.</p><p>With this in mind, in the native desktop and mobile operating system worlds, developers started using more advanced frameworks like <a href="https://en.wikipedia.org/wiki/CUDA">CUDA</a>, <a href="https://developer.apple.com/metal/">Metal</a>, <a href="https://en.wikipedia.org/wiki/DirectX#DirectX_12">DirectX 12</a>, or <a href="https://www.vulkan.org/learn#key-resources">Vulkan</a>. WebGL lagged behind. To fill this void and bring the browser up to date, in 2017, organizations like Google, Apple, Intel, Microsoft, Khronos, and Mozilla created the <a href="https://www.w3.org/community/gpu/">GPU for Web Community Working Group</a> to collaboratively design the successor of WebGL and create the next modern 3D graphics and computation capabilities APIs for the Web.</p>
    <div>
      <h3>What is WebGPU</h3>
      <a href="#what-is-webgpu">
        
      </a>
    </div>
    <p>WebGPU was developed with the following advantages in mind:</p><ul><li><p><b>Lower Level Access</b> - WebGPU provides lower-level, direct access to the GPU vs. the high-level abstractions in WebGL. This enables more control over GPU resources.</p></li><li><p><b>Multi-Threading</b> - WebGPU can leverage multi-threaded rendering and compute, allowing improved CPU/GPU parallelism compared to WebGL, which relies on a single thread.</p></li><li><p><b>Compute Shaders</b> - First-class support for general-purpose compute shaders for GPGPU tasks, not just graphics. WebGL compute is limited.</p></li><li><p><b>Safety</b> - WebGPU ensures memory and GPU access safety, avoiding common WebGL pitfalls.</p></li><li><p><b>Portability</b> - WGSL shader language targets cross-API portability across GPU vendors vs. GLSL in WebGL.</p></li><li><p><b>Reduced Driver Overhead</b> - The lower level Vulkan/Metal/D3D12 basis improves overhead vs. OpenGL drivers in WebGL.</p></li><li><p><b>Pipeline State Objects</b> - Predefined pipeline configs avoid per-draw driver overhead in WebGL.</p></li><li><p><b>Memory Management</b> - Finer-grained buffer and resource management vs. WebGL.</p></li></ul><p>The “too long didn't read” version is that WebGPU provides lower-level control over the GPU hardware with reduced overhead. It's safer, has multi-threading, is focused on compute, not just graphics, and has portability advantages compared to WebGL.</p><p>If these aren't reasons enough to get excited, developers are also looking at WebGPU as an option for native platforms, not just the Web. For instance, you can use this <a href="https://github.com/webgpu-native/webgpu-headers/blob/main/webgpu.h">C API</a> that mimics the JavaScript specification. 
Combine this with the power of WebAssembly, and you effectively have a truly platform-agnostic GPU abstraction layer that you can use to <a href="https://developer.chrome.com/blog/webgpu-cross-platform/">develop</a> applications for any operating system or browser.</p>
    <div>
      <h3>More than just graphics</h3>
      <a href="#more-than-just-graphics">
        
      </a>
    </div>
    <p>As explained above, besides being a graphics API, WebGPU makes it possible to perform tasks such as:</p><ul><li><p><b>Machine Learning</b> - Implement ML applications like <a href="https://www.cloudflare.com/learning/ai/what-is-neural-network/">neural networks</a> and computer vision algorithms using WebGPU compute shaders and matrices.</p></li><li><p><b>Scientific Computing</b> - Perform complex scientific computation like physics simulations and mathematical modeling using the GPU.</p></li><li><p><b>High Performance Computing</b> - Unlock breakthrough performance for parallel workloads by connecting WebGPU to languages like Rust, C/C++ via <a href="https://webassembly.org/">WebAssembly</a>.</p></li></ul><p><a href="https://gpuweb.github.io/gpuweb/wgsl/">WGSL</a>, the shader language for WebGPU, is what enables the general-purpose compute feature. Shaders, or more precisely, <a href="https://www.khronos.org/opengl/wiki/Compute_Shader">compute shaders</a>, have no user-defined inputs or outputs and are used for computing arbitrary information. Here are <a href="https://webgpufundamentals.org/webgpu/lessons/webgpu-compute-shaders.html">some examples</a> of simple WebGPU compute shaders if you want to learn more.</p>
    <div>
      <h3>WebGPU in Workers</h3>
      <a href="#webgpu-in-workers">
        
      </a>
    </div>
    <p>We've been watching WebGPU since the API was published. Its general-purpose compute features perfectly fit our Workers' ecosystem and capabilities and align well with our vision of providing our customers multiple compute and hardware options and bringing GPU workloads to our global network, close to clients.</p><p>Cloudflare also has a track record of pioneering support for emerging web standards on our network and services, accelerating their adoption for our customers. Examples include the <a href="https://developers.cloudflare.com/workers/runtime-apis/web-crypto/">Web Crypto API</a>, <a href="/introducing-http2/">HTTP/2</a>, <a href="/http3-the-past-present-and-future/">HTTP/3</a>, <a href="/introducing-tls-1-3/">TLS 1.3</a>, and <a href="/early-hints/">Early Hints</a>, but <a href="https://developers.cloudflare.com/workers/runtime-apis/">there are more</a>.</p><p>Bringing WebGPU to Workers was both natural and timely. Today, we are announcing that we have released a version of <a href="https://github.com/cloudflare/workerd">workerd</a>, the open-source JavaScript/Wasm runtime that powers Cloudflare Workers, with <a href="https://github.com/cloudflare/workerd/tree/main/src/workerd/api/gpu">WebGPU support</a>, that you can start developing applications with locally.</p><p>Starting today, anyone can run this on their personal computer and experiment with WebGPU-enabled Workers. Implementing local development first allows us to put this API in the hands of our customers and developers earlier and get feedback that will guide the development of this feature for production use.</p><p>But before we dig into code examples, let's explain how we built it.</p>
    <div>
      <h3>How we built WebGPU on top of Workers</h3>
      <a href="#how-we-built-webgpu-on-top-of-workers">
        
      </a>
    </div>
    
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2rem1ZyAcVa3LO7ue6OTC7/debccf54201fe93a9221a6dd01bc5338/image2-22.png" />
            
            </figure><p>To implement the WebGPU API, we took advantage of <a href="https://dawn.googlesource.com/dawn/">Dawn</a>, an open-source library backed by Google, the same one used in Chromium and Chrome, that provides applications with an implementation of the WebGPU standard. It also provides the <a href="https://github.com/webgpu-native/webgpu-headers/blob/main/webgpu.h">webgpu.h</a> header file, the de facto reference for all the other implementations of the standard.</p><p>Dawn can interoperate with Linux, macOS, and Windows GPUs by interfacing with each platform's native GPU frameworks. For example, when an application makes a WebGPU draw call, Dawn will convert that draw command into the equivalent Vulkan, Metal, or Direct3D 12 API call, depending on the platform.</p><p>From an application standpoint, Dawn handles the interactions with the underlying native graphics APIs that communicate directly with the GPU drivers. Dawn essentially acts as a middle layer that translates the WebGPU API calls into calls for the platform's native graphics API.</p><p>Cloudflare <a href="/workerd-open-source-workers-runtime/">workerd</a> is the underlying open-source runtime engine that executes Workers code. It shares most of its code with the same runtime that powers Cloudflare Workers' production environment but with some changes designed to make it more portable to other environments. We then have release cycles that aim to synchronize both codebases; more on that later. 
Workerd is also used with <a href="https://github.com/cloudflare/workers-sdk">wrangler</a>, our command-line tool for building and interacting with Cloudflare Workers, to support local development.</p><p>The WebGPU code that interfaces with the Dawn library can be found <a href="https://github.com/cloudflare/workerd/tree/main/src/workerd/api/gpu">here</a>, and can be enabled with a compatibility flag, which is checked <a href="https://github.com/cloudflare/workerd/blob/main/src/workerd/api/global-scope.c%2B%2B#L728">here</a>.</p>
            <pre><code>jsg::Ref&lt;api::gpu::GPU&gt; Navigator::getGPU(CompatibilityFlags::Reader flags) {
  // is this a durable object?
  KJ_IF_MAYBE (actor, IoContext::current().getActor()) {
    JSG_REQUIRE(actor-&gt;getPersistent() != nullptr, TypeError,
                "webgpu api is only available in Durable Objects (no storage)");
  } else {
    JSG_FAIL_REQUIRE(TypeError, "webgpu api is only available in Durable Objects");
  };

  JSG_REQUIRE(flags.getWebgpu(), TypeError, "webgpu needs the webgpu compatibility flag set");

  return jsg::alloc&lt;api::gpu::GPU&gt;();
}</code></pre>
            <p>The WebGPU API can only be accessed using <a href="https://developers.cloudflare.com/durable-objects/">Durable Objects</a>, which are essentially global singleton instances of Cloudflare Workers. There are two important reasons for this:</p><ul><li><p>WebGPU code typically wants to store state between requests, for example, loading an <a href="https://www.cloudflare.com/learning/ai/what-is-artificial-intelligence/">AI model</a> into GPU memory once and using it multiple times for inference.</p></li><li><p>Not all Cloudflare servers have GPUs yet, so although the worker that receives the request is typically the closest one available, the Durable Object that uses WebGPU will be instantiated where there are GPU resources available, which may not be on the same machine.</p></li></ul><p>Using Durable Objects instead of regular Workers allows us to address both of these issues.</p>
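<p>The "load once, reuse across requests" pattern behind the first point can be sketched in plain JavaScript as a lazily initialized field. The class and method names here are hypothetical stand-ins (the real Worker would asynchronously load a model into GPU memory), but the control flow is the same:</p>

```javascript
// Sketch of lazy one-time initialization, the pattern a WebGPU Durable
// Object uses to keep expensive state (e.g. an AI model in GPU memory)
// alive across requests. initCount tracks how often the expensive load
// runs; in a real Worker ensureModel() would be async.
let initCount = 0;

class ModelHost {
  constructor() {
    this.model = null; // nothing loaded when the object is instantiated
  }
  ensureModel() {
    if (this.model === null) {
      initCount += 1; // the expensive load happens exactly once
      this.model = { name: "squeezenet", ready: true };
    }
    return this.model;
  }
  handleRequest() {
    const model = this.ensureModel();
    return `classified with ${model.name}`;
  }
}

// Two requests hit the same long-lived instance; the model loads once.
const host = new ModelHost();
const first = host.handleRequest();
const second = host.handleRequest();
```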
    <div>
      <h3>The WebGPU Hello World in Workers</h3>
      <a href="#the-webgpu-hello-world-in-workers">
        
      </a>
    </div>
    <p>Wrangler uses Miniflare 3, a <a href="/wrangler3/">fully-local simulator for Workers</a>, which in turn is powered by workerd. This means you can start experimenting and writing WebGPU code locally on your machine right now, before we prepare things in our production environment.</p><p>Let’s get coding then.</p><p>Since Workers doesn't render graphics yet, we started by implementing the general-purpose GPU (GPGPU) APIs in the <a href="https://www.w3.org/TR/webgpu/">WebGPU specification</a>. In other words, we fully support the part of the API that the <a href="https://www.w3.org/TR/webgpu/#gpucomputepipeline">compute shaders and the compute pipeline</a> require, but we are not yet focused on fragment or vertex shaders used in rendering pipelines.</p><p>Here’s a typical “hello world” in WebGPU. This Durable Object script will output the name of the GPU device that workerd found on your machine to your console.</p>
            <pre><code>const adapter = await navigator.gpu.requestAdapter();
const adapterInfo = await adapter.requestAdapterInfo(["device"]);
console.log(adapterInfo.device);</code></pre>
            <p>A more interesting example, though, is a simple compute shader. In this case, we will fill a results buffer with an incrementing value taken from the iteration number via <code>global_invocation_id</code>.</p><p>For this, we need two buffers, one to store the results of the computations as they happen (<code>storageBuffer</code>) and another to copy the results at the end (<code>mappedBuffer</code>).</p><p>We then dispatch four workgroups, meaning that the increments can happen in parallel. This parallelism and programmability are two key reasons why compute shaders and GPUs provide an advantage for things like machine learning inference workloads. Other advantages are:</p><ul><li><p><b>Bandwidth</b> - GPUs have a very high memory bandwidth, up to 10-20x more than CPUs. This allows fast reading and writing of all the model parameters and data needed for inference.</p></li><li><p><b>Floating-point performance</b> - GPUs are optimized for high floating point operation throughput, which are used extensively in neural networks. They can deliver much higher <a href="https://www.tomshardware.com/reviews/gpu-hierarchy,4388.html">TFLOPs than CPUs</a>.</p></li></ul><p>Let’s look at the code:</p>
            <pre><code>// Create device and command encoder
const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();
const encoder = device.createCommandEncoder();

// Storage buffer
const storageBuffer = device.createBuffer({
  size: 4 * Float32Array.BYTES_PER_ELEMENT, // 4 float32 values
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC,
});

// Mapped buffer
const mappedBuffer = device.createBuffer({
  size: 4 * Float32Array.BYTES_PER_ELEMENT,
  usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
});

// Create shader that writes incrementing numbers to storage buffer
const computeShaderCode = `
    @group(0) @binding(0)
    var&lt;storage, read_write&gt; result : array&lt;f32&gt;;

    @compute @workgroup_size(1)
    fn main(@builtin(global_invocation_id) gid : vec3&lt;u32&gt;) {
      result[gid.x] = f32(gid.x);
    }
`;

// Create compute pipeline
const computePipeline = device.createComputePipeline({
  layout: "auto",
  compute: {
    module: device.createShaderModule({ code: computeShaderCode }),
    entryPoint: "main",
  },
});

// Bind group
const bindGroup = device.createBindGroup({
  layout: computePipeline.getBindGroupLayout(0),
  entries: [{ binding: 0, resource: { buffer: storageBuffer } }],
});

// Dispatch compute work
const computePass = encoder.beginComputePass();
computePass.setPipeline(computePipeline);
computePass.setBindGroup(0, bindGroup);
computePass.dispatchWorkgroups(4);
computePass.end();

// Copy from storage to mapped buffer
encoder.copyBufferToBuffer(
  storageBuffer,
  0,
  mappedBuffer,
  0,
  4 * Float32Array.BYTES_PER_ELEMENT //mappedBuffer.size
);

// Submit and read back result
const gpuBuffer = encoder.finish();
device.queue.submit([gpuBuffer]);

await mappedBuffer.mapAsync(GPUMapMode.READ);
console.log(new Float32Array(mappedBuffer.getMappedRange()));
// [0, 1, 2, 3]</code></pre>
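<p>For reference, the compute pass above boils down to the following CPU loop: dispatching four workgroups with <code>@workgroup_size(1)</code> runs the shader entry point four times, each with a distinct <code>global_invocation_id</code>, and each invocation writes its own index into the result buffer.</p>

```javascript
// CPU equivalent of the WGSL shader above: dispatchWorkgroups(4) with
// @workgroup_size(1) invokes main() four times, once per gid.x, and
// each invocation performs result[gid.x] = f32(gid.x).
const result = new Float32Array(4);
for (let gidX = 0; gidX < 4; gidX++) {
  result[gidX] = gidX; // result[gid.x] = f32(gid.x)
}
console.log(result); // matches the GPU readback: [0, 1, 2, 3]
```

The difference, of course, is that on the GPU the four invocations can run in parallel rather than sequentially.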
            <p>Now that we covered the basics of WebGPU and compute shaders, let's move to something more demanding. What if we could perform machine learning inference using Workers and GPUs?</p>
    <div>
      <h3>ONNX WebGPU demo</h3>
      <a href="#onnx-webgpu-demo">
        
      </a>
    </div>
    <p>The <a href="https://github.com/microsoft/onnxruntime">ONNX runtime</a> is a popular open-source, cross-platform, high-performance machine learning inference accelerator. <a href="https://github.com/webonnx/wonnx">Wonnx</a> is a GPU-accelerated ONNX inference engine written in Rust that can be compiled to WebAssembly and take advantage of WebGPU in the browser. We are going to run it in Workers using a combination of <a href="https://github.com/cloudflare/workers-rs">workers-rs</a>, our Rust bindings for Cloudflare Workers, and the workerd WebGPU APIs.</p><p>For this demo, we are using <a href="https://www.kdnuggets.com/2016/09/deep-learning-reading-group-squeezenet.html">SqueezeNet</a>. This small image classification model can run with limited resources while achieving accuracy on the <a href="https://en.wikipedia.org/wiki/ImageNet">ImageNet</a> image classification validation dataset similar to that of larger models like <a href="https://en.wikipedia.org/wiki/AlexNet">AlexNet</a>.</p><p>In essence, our worker will receive any uploaded image and attempt to classify it according to the 1000 ImageNet classes. Once ONNX runs the machine learning model using the GPU, it will return the list of classes with the highest probability scores. Let’s go step by step.</p><p>First, we load the model from R2 into GPU memory the first time the Durable Object is called:</p>
            <pre><code>#[durable_object]
pub struct Classifier {
    env: Env,
    session: Option&lt;wonnx::Session&gt;,
}

impl Classifier {
    async fn ensure_session(&amp;mut self) -&gt; Result&lt;()&gt; {
        match self.session {
            Some(_) =&gt; worker::console_log!("DO already has a session"),
            None =&gt; {
                // No session, so this should be the first request. In this case
                // we will fetch the model from R2, build a wonnx session, and
                // store it for subsequent requests.
                let model_bytes = fetch_model(&amp;self.env).await?;
                let session = wonnx::Session::from_bytes(&amp;model_bytes)
                    .await
                    .map_err(|err| err.to_string())?;
                worker::console_log!("session created in DO");
                self.session = Some(session);
            }
        };
        Ok(())
    }
}</code></pre>
            <p>This is only required once, when the Durable Object is instantiated. For subsequent requests, we retrieve the model input tensor, call the existing session for the inference, and return to the calling worker the result tensor converted to JSON:</p>
            <pre><code>        let request_data: ArrayBase&lt;OwnedRepr&lt;f32&gt;, Dim&lt;[usize; 4]&gt;&gt; =
            serde_json::from_str(&amp;req.text().await?)?;
        let mut input_data = HashMap::new();
        input_data.insert("data".to_string(), request_data.as_slice().unwrap().into());

        let result = self
            .session
            .as_ref()
            .unwrap() // we know the session exists
            .run(&amp;input_data)
            .await
            .map_err(|err| err.to_string())?;
...
        let probabilities: Vec&lt;f32&gt; = result
            .into_iter()
            .next()
            .ok_or("did not obtain a result tensor from session")?
            .1
            .try_into()
            .map_err(|err: TensorConversionError| err.to_string())?;

        let do_response = serde_json::to_string(&amp;probabilities)?;
        Response::ok(do_response)</code></pre>
            <p>On the Worker script itself, we load the uploaded image and pre-process it into a model input tensor:</p>
            <pre><code>    let image_file: worker::File = match req.form_data().await?.get("file") {
        Some(FormEntry::File(buf)) =&gt; buf,
        Some(_) =&gt; return Response::error("`file` part of POST form must be a file", 400),
        None =&gt; return Response::error("missing `file`", 400),
    };
    let image_content = image_file.bytes().await?;
    let image = load_image(&amp;image_content)?;</code></pre>
            <p>Finally, we call the GPU Durable Object, which runs the model and returns the most likely classes of our image:</p>
            <pre><code>    let probabilities = execute_gpu_do(image, stub).await?;
    let mut probabilities = probabilities.iter().enumerate().collect::&lt;Vec&lt;_&gt;&gt;();
    probabilities.sort_unstable_by(|a, b| b.1.partial_cmp(a.1).unwrap());
    Response::ok(LABELS[probabilities[0].0])</code></pre>
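<p>The ranking step above is just an argmax over the probability vector: pair each probability with its index, sort descending, and map the top index to its label. The same selection in plain JavaScript, with a tiny hypothetical stand-in for the 1000-entry ImageNet label table, looks like this:</p>

```javascript
// JavaScript equivalent of the Rust ranking snippet above. LABELS is a
// three-entry stand-in for the real 1000 ImageNet classes.
const LABELS = ["n02051845 pelican", "n01440764 tench", "n01530575 brambling"];

function topLabel(probabilities) {
  // Pair each probability with its index, then sort highest-first.
  const indexed = probabilities.map((p, i) => [i, p]);
  indexed.sort((a, b) => b[1] - a[1]);
  return LABELS[indexed[0][0]];
}

const best = topLabel([0.91, 0.06, 0.03]);
```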
            <p>We packaged this demo in a public repository, so you can also run it. Make sure that you have a <a href="https://www.rust-lang.org/">Rust</a> compiler, <a href="https://nodejs.org/en">Node.js</a>, <a href="https://git-scm.com/">Git</a> and <a href="https://curl.se/">curl</a> installed, then clone the repository:</p>
            <pre><code>git clone https://github.com/cloudflare/workers-wonnx.git
cd workers-wonnx</code></pre>
            <p>Upload the model to the local R2 simulator:</p>
            <pre><code>npx wrangler@latest r2 object put model-bucket-dev/opt-squeeze.onnx --local --file models/opt-squeeze.onnx</code></pre>
            <p>And then run the Worker locally:</p>
            <pre><code>npx wrangler@latest dev</code></pre>
            <p>With the Worker running and waiting for requests, you can then open another terminal window and upload one of the example images in the same repository using curl:</p>
            <pre><code>&gt; curl -F "file=@images/pelican.jpeg" http://localhost:8787
n02051845 pelican</code></pre>
            <p>If everything goes according to plan, the result of the curl command will be the most likely class of the image.</p>
    <div>
      <h3>Next steps and final words</h3>
      <a href="#next-steps-and-final-words">
        
      </a>
    </div>
    <p>Over the upcoming weeks, we will merge the workerd WebGPU code into the Cloudflare Workers production environment and make it available globally, on top of our growing fleet of GPU nodes. We didn't do it earlier because that environment is subject to strict security and isolation requirements. For example, we can't break the <a href="https://developers.cloudflare.com/workers/learning/security-model/">security model</a> of our process sandbox and have V8 talking to the GPU hardware directly; instead, we must create a configuration where another process sits closer to the GPU and use IPC (inter-process communication) to talk to it. Other things like managing resource allocation and billing are being sorted out.</p><p>For now, we wanted to get the good news out that we will support WebGPU in Cloudflare Workers and ensure that you can start playing and coding with it today and learn from it. WebGPU and general-purpose computing on GPUs are still in their early days. We presented a machine-learning demo, but we can imagine other applications taking advantage of this new feature, and we hope you can show us some of them.</p><p>As usual, you can talk to us on our <a href="https://discord.cloudflare.com/">Developers Discord</a> or the <a href="https://community.cloudflare.com/c/developers/39">Community forum</a>; the team will be listening. We are eager to hear from you and learn about what you're building.</p>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Standards]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <guid isPermaLink="false">4osLizDbNHndEk9BG23KFi</guid>
            <dc:creator>André Cruz</dc:creator>
            <dc:creator>Celso Martinho</dc:creator>
        </item>
        <item>
            <title><![CDATA[Workers AI: serverless GPU-powered inference on Cloudflare’s global network]]></title>
            <link>https://blog.cloudflare.com/workers-ai/</link>
            <pubDate>Wed, 27 Sep 2023 13:00:47 GMT</pubDate>
            <description><![CDATA[ We are excited to launch Workers AI - an AI inference as a service platform, empowering developers to run AI models with just a few lines of code, all powered by our global network of GPUs ]]></description>
            <content:encoded><![CDATA[ <p></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1kH38tclcLOGwYv40vTHNy/300956275074e73dd480a93898d43c08/image1-29.png" />
            
            </figure><p>If you're anywhere near the developer community, it's almost impossible to avoid the impact that AI’s recent advancements have had on the ecosystem. Whether you're using <a href="https://www.cloudflare.com/learning/ai/what-is-artificial-intelligence/">AI</a> in your workflow to improve productivity, or you’re shipping AI-based features to your users, it’s everywhere. The focus on AI improvements is extraordinary, and we’re super excited about the opportunities that lie ahead, but it's not enough.</p><p>Not too long ago, if you wanted to leverage the power of AI, you needed to know the ins and outs of <a href="https://www.cloudflare.com/learning/ai/what-is-machine-learning/">machine learning</a>, and be able to manage the infrastructure to power it.</p><p>As a developer platform with over one million active developers, we believe there is so much potential yet to be unlocked, so we’re changing the way AI is delivered to developers. Many of the current solutions, while powerful, are based on closed, proprietary models and don't address the privacy needs that developers and users demand. Alternatively, the open source scene is exploding with powerful models, but they’re simply not accessible enough to every developer. Imagine being able to run a model, from your code, wherever it’s <a href="https://www.cloudflare.com/developer-platform/solutions/hosting/">hosted</a>, and never needing to find GPUs or deal with setting up the infrastructure to support it.</p><p>That's why we are excited to launch Workers AI - an AI inference as a service platform, empowering developers to run AI models with just a few lines of code, all powered by our global network of GPUs. It's open and accessible, serverless, privacy-focused, runs near your users, pay-as-you-go, and it's built from the ground up for a best-in-class developer experience.</p>
    <div>
      <h2>Workers AI - making inference <b>just work</b></h2>
      <a href="#workers-ai-making-inference-just-work">
        
      </a>
    </div>
    <p>We’re launching Workers AI to put AI inference in the hands of every developer, and to actually deliver on that goal, it should <b>just work</b> out of the box. How do we achieve that?</p><ul><li><p>At the core of everything, it runs on the right infrastructure - our world-class network of GPUs</p></li><li><p>We provide off-the-shelf models that run seamlessly on our infrastructure</p></li><li><p>Finally, deliver it to the end developer, in a way that’s delightful. A developer should be able to build their first Workers AI app in minutes, and say “Wow, that’s kinda magical!”.</p></li></ul><p>So what exactly is Workers AI? It’s another building block that we’re adding to our developer platform - one that helps developers run well-known AI models on serverless GPUs, all on Cloudflare’s trusted global network. As one of the latest additions to our developer platform, it works seamlessly with Workers + Pages, but to make it truly accessible, we’ve made it platform-agnostic, so it also works everywhere else, made available via a REST API.</p>
    <div>
      <h2>Models you know and love</h2>
      <a href="#models-you-know-and-love">
        
      </a>
    </div>
    <p>We’re launching with a curated set of popular open source models that cover a wide range of inference tasks:</p><ul><li><p><b>Text generation (large language model):</b> meta/llama-2-7b-chat-int8</p></li><li><p><b>Automatic speech recognition (ASR):</b> openai/whisper</p></li><li><p><b>Translation:</b> meta/m2m100-1.2b</p></li><li><p><b>Text classification:</b> huggingface/distilbert-sst-2-int8</p></li><li><p><b>Image classification:</b> microsoft/resnet-50</p></li><li><p><b>Embeddings:</b> baai/bge-base-en-v1.5</p></li></ul><p>You can browse all available models in your Cloudflare dashboard, and soon you’ll be able to dive into logs and analytics on a per-model basis!</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3iLFApyCjCwTCEtV8QRhke/91793f5eaabe3c426cf5fb7f421f4508/image4-14.png" />
            
            </figure><p>This is just the start, and we’ve got big plans. After launch, we’ll continue to expand based on community feedback. Even more exciting - in an effort to take our catalog from zero to sixty, we’re announcing a partnership with Hugging Face, a leading AI community + hub. The partnership is multifaceted, and you can read more about it <a href="/best-place-region-earth-inference">here</a>, but soon you’ll be able to browse and run a subset of the Hugging Face catalog directly in Workers AI.</p>
    <div>
      <h2>Accessible to everyone</h2>
      <a href="#accessible-to-everyone">
        
      </a>
    </div>
    <p>Part of the mission of our developer platform is to provide <b>all</b> the building blocks that developers need to build the applications of their dreams. Having access to the right blocks is just one part of it — as a developer your job is to put them together into an application. Our goal is to make that as easy as possible.</p><p>To make sure you can use Workers AI easily regardless of entry point, we provide access via Workers or Pages, to make it easy to use within the Cloudflare ecosystem, and via a REST API, if you want to use Workers AI with your current stack.</p><p>Here’s a quick curl example that translates some text from English to French:</p>
            <pre><code>curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/meta/m2m100-1.2b \
  -H "Authorization: Bearer {API_TOKEN}" \
  -d '{ "text": "I'\''ll have an order of the moule frites", "target_lang": "french" }'</code></pre>
            <p>And here’s what the response looks like:</p>
            <pre><code>{
  "result": {
    "answer": "Je vais commander des moules frites"
  },
  "success": true,
  "errors":[],
  "messages":[]
}</code></pre>
            <p>Use it with any stack, anywhere - your favorite Jamstack framework, Python + Django/Flask, Node.js, Ruby on Rails - the possibilities are endless.</p>
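            <p>For instance, if you’re calling the REST API from Node.js or TypeScript, the curl example above translates to a <code>fetch</code> call like this (a sketch: <code>accountId</code> and <code>apiToken</code> are placeholders for your own values, and the small <code>buildRunUrl</code> helper is introduced here purely for illustration):</p>
            <pre><code>// Sketch of the curl example above using fetch (Node.js 18+ or a browser).
// buildRunUrl is a helper introduced here for illustration only.
const API_BASE = "https://api.cloudflare.com/client/v4";

export function buildRunUrl(accountId: string, model: string): string {
  return `${API_BASE}/accounts/${accountId}/ai/run/${model}`;
}

export async function translate(accountId: string, apiToken: string, text: string) {
  const res = await fetch(buildRunUrl(accountId, "@cf/meta/m2m100-1.2b"), {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    // Same payload as the curl example above.
    body: JSON.stringify({ text, target_lang: "french" }),
  });
  return res.json(); // the { result, success, errors, messages } envelope shown above
}</code></pre>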
    <div>
      <h2>Designed for developers</h2>
      <a href="#designed-for-developers">
        
      </a>
    </div>
    <p>Developer experience is really important to us. In fact, most of this post has been about just that. Making sure it works out of the box. Providing popular models that just work. Being accessible to all developers whether you build and deploy with Cloudflare or elsewhere. But it’s more than that - the experience should be frictionless, zero to production should be fast, and it should feel good along the way.</p><p>Let’s walk through another example to show just how easy it is to use! We’ll run Llama 2, a popular <a href="https://www.cloudflare.com/learning/ai/what-is-large-language-model/">large language model</a> open sourced by Meta, in a worker.</p><p>We’ll assume you have some of the basics already complete (Cloudflare account, Node, NPM, etc.), but if you don’t <a href="https://developers.cloudflare.com/workers-ai/get-started/local-dev-setup/">this guide</a> will get you properly set up!</p>
    <div>
      <h3>1. Create a Workers project</h3>
      <a href="#1-create-a-workers-project">
        
      </a>
    </div>
    <p>Create a new project named workers-ai by running:</p>
            <pre><code>$ npm create cloudflare@latest</code></pre>
            <p>When setting up your workers-ai worker, answer the setup questions as follows:</p><ul><li><p>Enter <b>workers-ai</b> for the app name</p></li><li><p>Choose <b>Hello World</b> script for the type of application</p></li><li><p>Select <b>yes</b> to using TypeScript</p></li><li><p>Select <b>yes</b> to using Git</p></li><li><p>Select <b>no</b> to deploying</p></li></ul><p>Lastly, navigate to your new app directory:</p>
            <pre><code>cd workers-ai</code></pre>
            
    <div>
      <h3>2. Connect Workers AI to your worker</h3>
      <a href="#2-connect-workers-ai-to-your-worker">
        
      </a>
    </div>
    <p>Create a Workers AI binding, which allows your worker to access the Workers AI service without having to manage an API key yourself.</p><p>To bind Workers AI to your worker, add the following to the end of your <b>wrangler.toml</b> file:</p>
            <pre><code>[ai]
binding = "AI" #available in your worker via env.AI</code></pre>
            <p>You can also bind Workers AI to a Pages Function. For more information, refer to <a href="https://developers.cloudflare.com/pages/platform/functions/bindings/#ai">Functions Bindings</a>.</p>
    <div>
      <h3>3. Install the Workers AI client library</h3>
      <a href="#3-install-the-workers-ai-client-library">
        
      </a>
    </div>
    
            <pre><code>npm install @cloudflare/ai</code></pre>
            
    <div>
      <h3>4. Run an inference task in your worker</h3>
      <a href="#4-run-an-inference-task-in-your-worker">
        
      </a>
    </div>
    <p>Update <b>src/index.ts</b> with the following code:</p>
            <pre><code>import { Ai } from '@cloudflare/ai'
export default {
  async fetch(request, env) {
    const ai = new Ai(env.AI);
    const input = { prompt: "What's the origin of the phrase 'Hello, World'" };
    const output = await ai.run('@cf/meta/llama-2-7b-chat-int8', input );
    return new Response(JSON.stringify(output));
  },
};</code></pre>
            
    <div>
      <h3>5. Develop locally with Wrangler</h3>
      <a href="#5-develop-locally-with-wrangler">
        
      </a>
    </div>
    <p>While in your project directory, test Workers AI locally by running:</p>
            <pre><code>$ npx wrangler dev --remote</code></pre>
            <p><b>Note -</b> These models currently only run on Cloudflare’s network of GPUs (and not locally), so setting <code>--remote</code> above is a must, and you’ll be prompted to log in at this point.</p><p>Wrangler will give you a URL (most likely localhost:8787). Visit that URL, and you’ll see a response like this:</p>
            <pre><code>{
  "response": "Hello, World is a common phrase used to test the output of a computer program, particularly in the early stages of programming. The phrase \"Hello, World!\" is often the first program that a beginner learns to write, and it is included in many programming language tutorials and textbooks as a way to introduce basic programming concepts. The origin of the phrase \"Hello, World!\" as a programming test is unclear, but it is believed to have originated in the 1970s. One of the earliest known references to the phrase is in a 1976 book called \"The C Programming Language\" by Brian Kernighan and Dennis Ritchie, which is considered one of the most influential books on the development of the C programming language."
}</code></pre>
            
    <div>
      <h3>6. Deploy your worker</h3>
      <a href="#6-deploy-your-worker">
        
      </a>
    </div>
    <p>Finally, deploy your worker to make your project accessible on the Internet:</p>
            <pre><code>$ npx wrangler deploy
# Outputs: https://workers-ai.&lt;YOUR_SUBDOMAIN&gt;.workers.dev</code></pre>
            <p>And that’s it. You can literally go from zero to deployed AI in minutes. This is obviously a simple example, but shows how easy it is to run Workers AI from any project. </p>
    <div>
      <h2>Privacy by default</h2>
      <a href="#privacy-by-default">
        
      </a>
    </div>
    <p>When Cloudflare was founded, our value proposition had three pillars: more secure, more reliable, and more performant. Over time, we’ve realized that a better Internet is also a more private Internet, and we want to play a role in building it.</p><p>That’s why Workers AI is private by default - we don’t train our models, LLM or otherwise, on your data or conversations, and our models don’t learn from your usage. You can feel confident using Workers AI in both personal and business settings, without having to worry about leaking your data. Other providers only offer this fundamental feature with their enterprise version. With us, it’s built in for everyone.</p><p>We’re also excited to support data localization in the future. To make this happen, we have an ambitious GPU rollout plan - we’re launching with seven sites today, roughly 100 by the end of 2023, and nearly everywhere by the end of 2024. Ultimately, this will empower developers to keep delivering killer AI features to their users, while staying compliant with their end users’ data localization requirements.</p>
    <div>
      <h2>The power of the platform</h2>
      <a href="#the-power-of-the-platform">
        
      </a>
    </div>
    
    <div>
      <h4>Vector database - Vectorize</h4>
      <a href="#vector-database-vectorize">
        
      </a>
    </div>
    <p>Workers AI is all about running Inference, and making it really easy to do so, but sometimes inference is only part of the equation. Large language models are trained on a fixed set of data, based on a snapshot at a specific point in the past, and have no context on your business or use case. When you submit a prompt, information specific to you can increase the quality of results, making it more useful and relevant. That’s why we’re also launching Vectorize, our <a href="https://www.cloudflare.com/learning/ai/what-is-vector-database/">vector database</a> that’s designed to work seamlessly with Workers AI. Here’s a quick overview of how you might use Workers AI + Vectorize together.</p><p>Example: Use your data (knowledge base) to provide additional context to an LLM when a user is chatting with it.</p><ol><li><p><b>Generate initial embeddings:</b> run your data through Workers AI using an <a href="https://www.cloudflare.com/learning/ai/what-are-embeddings/">embedding model</a>. The output will be embeddings, which are numerical representations of those words.</p></li><li><p><b>Insert those embeddings into Vectorize:</b> this essentially seeds the vector database with your data, so we can later use it to retrieve embeddings that are similar to your users’ query</p></li><li><p><b>Generate embedding from user question:</b> when a user submits a question to your AI app, first, take that question, and run it through Workers AI using an embedding model.</p></li><li><p><b>Get context from Vectorize:</b> use that embedding to query Vectorize. This should output embeddings that are similar to your user’s question.</p></li><li><p><b>Create context aware prompt:</b> Now take the original text associated with those embeddings, and create a new prompt combining the text from the vector search, along with the original question</p></li><li><p><b>Run prompt:</b> run this prompt through Workers AI using an LLM model to get your final result</p></li></ol>
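    <p>The query-time half of that flow (steps 3 through 6) can be sketched as a couple of functions. This is a hypothetical sketch: the Workers AI and Vectorize bindings are passed in as plain parameters so the flow is easy to follow, and the shape of the query results - in particular, the original text being stored as metadata on each match - is an assumption that may differ from the actual Vectorize API:</p>
            <pre><code>// Steps 3-6 of the retrieval flow above, with the Workers AI binding (ai)
// and the Vectorize index (index) injected. Model names are from this post;
// the result shapes (data, matches, metadata) are assumptions of this sketch.
export function buildPrompt(contextChunks: string[], question: string): string {
  return `Context:\n${contextChunks.join("\n")}\n\nQuestion: ${question}`;
}

export async function answerWithContext(ai: any, index: any, question: string) {
  // 3. Generate an embedding from the user question.
  const { data } = await ai.run("@cf/baai/bge-base-en-v1.5", { text: [question] });

  // 4. Get context from Vectorize: find embeddings similar to the question.
  const { matches } = await index.query(data[0], { topK: 3 });

  // 5. Create a context-aware prompt from the text behind those matches
  //    (assumed here to be stored as metadata alongside each vector).
  const context = matches.map((m: any) => m.metadata.text);
  const prompt = buildPrompt(context, question);

  // 6. Run the prompt through an LLM to get the final result.
  return ai.run("@cf/meta/llama-2-7b-chat-int8", { prompt });
}</code></pre>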
    <div>
      <h4>AI Gateway</h4>
      <a href="#ai-gateway">
        
      </a>
    </div>
    <p>That covers a more advanced use case. On the flip side, if you are running models elsewhere, but want to get more out of the experience, you can run those APIs through our AI gateway to get features like caching, rate-limiting, analytics and logging. These features can be used to protect your end point, monitor and optimize costs, and also help with data loss prevention. Learn more about AI gateway <a href="/announcing-ai-gateway">here</a>.</p>
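    <p>In practice, using the gateway with an existing provider client is mostly a matter of swapping the base URL. Here’s an illustrative sketch - the URL shape follows the AI Gateway announcement linked above, and the account tag and gateway name are placeholders, so check that post for the exact, current format:</p>
            <pre><code>// Sketch: building an AI Gateway URL to put in front of a provider API.
// The path format below follows the AI Gateway announcement; treat it as
// illustrative and verify against the linked post.
const GATEWAY_BASE = "https://gateway.ai.cloudflare.com/v1";

export function gatewayUrl(accountTag: string, gateway: string, provider: string, path: string): string {
  return `${GATEWAY_BASE}/${accountTag}/${gateway}/${provider}/${path}`;
}

// Point your existing OpenAI-style client at, e.g.:
//   gatewayUrl("my-account-tag", "my-gateway", "openai", "chat/completions")
// and the gateway sits in front of the provider, adding caching,
// rate-limiting, analytics, and logging.</code></pre>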
    <div>
      <h2>Start building today</h2>
      <a href="#start-building-today">
        
      </a>
    </div>
    <p>Try it out for yourself, and let us know what you think. Today we’re launching Workers AI as an open Beta for all Workers plans - free or paid. That said, it’s super early, so…</p>
    <div>
      <h4>Warning - It’s an early beta</h4>
      <a href="#warning-its-an-early-beta">
        
      </a>
    </div>
    <p>Usage is <b>not currently recommended for production apps</b>, and limits + access are subject to change.</p>
    <div>
      <h4>Limits</h4>
      <a href="#limits">
        
      </a>
    </div>
    <p>We’re initially launching with limits on a per-model basis:</p><ul><li><p>@cf/meta/llama-2-7b-chat-int8: 50 reqs/min globally</p></li></ul><p>Check out our <a href="https://developers.cloudflare.com/workers-ai/platform/limits/">docs</a> for a full overview of our limits.</p>
    <div>
      <h4>Pricing</h4>
      <a href="#pricing">
        
      </a>
    </div>
    <p>What we released today is just a small preview to give you a taste of what’s coming (we simply couldn’t hold back), but we’re looking forward to putting the full-throttle version of Workers AI in your hands.</p><p>We realize that as you approach building something, you want to understand: how much is this going to cost me? This is especially true with AI, where costs can easily get out of hand. So we wanted to share the upcoming pricing of Workers AI with you.</p><p>While we won’t be billing on day one, we are announcing what we expect our pricing will look like.</p><p>Users will be able to choose from two ways to run Workers AI:</p><ul><li><p><b>Regular Twitch Neurons (RTN)</b> - running wherever there's capacity at $0.01 / 1k neurons</p></li><li><p><b>Fast Twitch Neurons (FTN)</b> - running at the nearest user location at $0.125 / 1k neurons</p></li></ul><p>You may be wondering — what’s a neuron?</p><p>Neurons are a way to measure AI output that always scales down to zero (if you get no usage, you will be charged for 0 neurons). To give you a sense of what you can accomplish with a thousand neurons: you can generate 130 LLM responses, 830 image classifications, or 1,250 embeddings.</p><p>Our goal is to help our customers pay only for what they use, and choose the pricing that best matches their use case, whether it’s price or latency that is top of mind.</p>
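    <p>To make the math concrete, here’s a small sketch that turns the announced prices into a cost estimate. The per-1k-neuron prices and the “130 LLM responses per thousand neurons” figure are the ones quoted above; actual billing may differ once it goes live:</p>
            <pre><code>// Back-of-envelope cost estimates from the announced pricing above:
// RTN at $0.01 / 1k neurons, FTN at $0.125 / 1k neurons.
const PRICE_PER_1K_NEURONS = { rtn: 0.01, ftn: 0.125 };

export function estimateCostUSD(neurons: number, tier: "rtn" | "ftn"): number {
  return (neurons / 1000) * PRICE_PER_1K_NEURONS[tier];
}

// Using the rough figure of 130 LLM responses per 1,000 neurons,
// 10,000 LLM responses is about 77,000 neurons:
//   estimateCostUSD(77000, "rtn")  -> about $0.77
//   estimateCostUSD(77000, "ftn")  -> about $9.63</code></pre>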
    <div>
      <h3>What’s on the roadmap?</h3>
      <a href="#whats-on-the-roadmap">
        
      </a>
    </div>
    <p>Workers AI is just getting started, and we want your feedback to help us make it great. That said, there are some exciting things on the roadmap.</p>
    <div>
      <h4>More models, please</h4>
      <a href="#more-models-please">
        
      </a>
    </div>
    <p>We're launching with a solid set of models that just work, but will continue to roll out new models based on your feedback. If there’s a particular model you'd love to see on Workers AI, pop into our <a href="https://discord.cloudflare.com/">Discord</a> and let us know!</p><p>In addition to that, we're also announcing a <a href="/best-place-region-earth-inference">partnership with Hugging Face</a>, and soon you'll be able to access and run a subset of the Hugging Face catalog directly from Workers AI.</p>
    <div>
      <h4>Analytics + observability</h4>
      <a href="#analytics-observability">
        
      </a>
    </div>
    <p>Up to this point, we’ve been hyper-focused on one thing - making it really easy for any developer to run powerful AI models in just a few lines of code. But that’s only one part of the story. Up next, we’ll be working on some analytics and <a href="https://www.cloudflare.com/learning/performance/what-is-observability/">observability</a> capabilities to give you insights into your usage + performance + spend on a per-model basis, plus the ability to dig into your logs if you want to do some exploring.</p>
    <div>
      <h4>A road to global GPU coverage</h4>
      <a href="#a-road-to-global-gpu-coverage">
        
      </a>
    </div>
    <p>Our goal is to be the best place to run inference on Region: Earth, so we're adding GPUs to our data centers as fast as we can.</p><p><b>We plan to be in 100 data centers by the end of this year</b></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5A8SGUOEAcs3sjNjv48yIh/bafbc77b256fef490d4357613b036603/image3-28.png" />
            
            </figure><p><b>And nearly everywhere by the end of 2024</b></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2rrL2H0dHYZ4hxOBq0X1pw/f38d122af92f789dc2b31d3bdea1ab06/unnamed-3.png" />
            
            </figure><p><b>We’re really excited to see you build</b> - head over to <a href="https://developers.cloudflare.com/workers-ai/">our docs</a> to get started.</p><p>If you need inspiration, want to share something you’re building, or have a question - pop into our <a href="https://discord.com/invite/cloudflaredev">Developer Discord</a>.</p> ]]></content:encoded>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Database]]></category>
            <category><![CDATA[Vectorize]]></category>
            <category><![CDATA[Developers]]></category>
            <guid isPermaLink="false">6jSrrIFC7yStZxCaqaM0c1</guid>
            <dc:creator>Phil Wittig</dc:creator>
            <dc:creator>Rita Kozlov</dc:creator>
            <dc:creator>Rebecca Weekly</dc:creator>
            <dc:creator>Celso Martinho</dc:creator>
            <dc:creator>Meaghan Choi</dc:creator>
        </item>
        <item>
            <title><![CDATA[Wasm core dumps and debugging Rust in Cloudflare Workers]]></title>
            <link>https://blog.cloudflare.com/wasm-coredumps/</link>
            <pubDate>Mon, 14 Aug 2023 13:00:33 GMT</pubDate>
            <description><![CDATA[ Debugging Rust and Wasm with Cloudflare Workers involves a lot of the good old time-consuming and nerve-wracking printf'ing strategy. What if there’s a better way? This blog is about enabling and using Wasm core dumps and how you can easily debug Rust in Cloudflare Workers ]]></description>
            <content:encoded><![CDATA[ <p>A clear sign of maturing for any new programming language or environment is how easy and efficient it is to debug it. Programming, like any other complex task, involves various challenges and potential pitfalls. Logic errors, off-by-ones, null pointer dereferences, and memory leaks are some examples of things that can make software developers desperate when their workflows and tools don't let them pinpoint and fix these issues quickly.</p><p><a href="https://webassembly.org/">WebAssembly</a> (Wasm) is a binary instruction format designed to be a portable and efficient target for the compilation of high-level languages like <a href="https://www.rust-lang.org/">Rust</a>, C, C++, and others. In recent years, it has gained significant traction for building high-performance applications in web and serverless environments.</p><p>Cloudflare Workers has had <a href="https://github.com/cloudflare/workers-rs">first-party support for Rust and Wasm</a> for quite some time. We've been using this powerful combination to bootstrap and build some of our most recent services, like <a href="/introducing-d1/">D1</a>, <a href="/introducing-constellation/">Constellation</a>, and <a href="/automatic-signed-exchanges/">Signed Exchanges</a>, to name a few.</p><p>Using tools like <a href="https://github.com/cloudflare/workers-sdk">Wrangler</a>, our command-line tool for building with Cloudflare developer products, makes streaming real-time logs from our applications running remotely easy. Still, to be honest, debugging Rust and Wasm with Cloudflare Workers involves a lot of the good old time-consuming and nerve-wracking <a href="https://news.ycombinator.com/item?id=26925570">printf'ing</a> strategy.</p><p>What if there’s a better way? This blog is about enabling and using Wasm core dumps and how you can easily debug Rust in Cloudflare Workers.</p>
    <div>
      <h3>What are core dumps?</h3>
      <a href="#what-are-core-dumps">
        
      </a>
    </div>
    <p>In computing, a <a href="https://en.wikipedia.org/wiki/Core_dump">core dump</a> consists of the recorded state of the working memory of a computer program at a specific time, generally when the program has crashed or otherwise terminated abnormally. Core dumps also include things like the processor registers, stack pointer, program counter, and other information that may be relevant to fully understanding why the program crashed.</p><p>Depending on the system’s configuration, core dumps are usually initiated by the operating system in response to a program crash. You can then use a debugger like <a href="https://linux.die.net/man/1/gdb">gdb</a> to examine what happened and hopefully determine the cause of a crash. <a href="https://linux.die.net/man/1/gdb">gdb</a> allows you to run the executable to try to replicate the crash in a more controlled environment, inspecting the variables, and much more. The Windows equivalent of a core dump is a <a href="https://learn.microsoft.com/en-us/troubleshoot/windows-client/performance/read-small-memory-dump-file">minidump</a>. Other mature languages that are interpreted, like Python, or languages that run inside a virtual machine, like <a href="https://docs.oracle.com/javase/8/docs/technotes/guides/visualvm/coredumps.html">Java</a>, also have their ways of generating core dumps for post-mortem analysis.</p><p>Core dumps are particularly useful for post-mortem debugging, determining the conditions that led to a failure after it has occurred.</p>
    <div>
      <h3>WebAssembly core dumps</h3>
      <a href="#webassembly-core-dumps">
        
      </a>
    </div>
    <p>WebAssembly has had a <a href="https://github.com/WebAssembly/tool-conventions/blob/main/Coredump.md">proposal for implementing core dumps</a> in discussion for a while. It's a work-in-progress experimental specification, but it provides basic support for the main ideas of post-mortem debugging, including using the <a href="https://yurydelendik.github.io/webassembly-dwarf/">DWARF</a> (debugging with attributed record formats) debug format, the same that Linux and gdb use. Some of the most popular Wasm runtimes, like <a href="https://github.com/bytecodealliance/wasmtime/pull/5868">Wasmtime</a> and <a href="https://github.com/wasmerio/wasmer/pull/3626">Wasmer</a>, have experimental flags that you can enable and start playing with Wasm core dumps today.</p><p>If you run Wasmtime or Wasmer with the flag:</p>
            <pre><code>--coredump-on-trap=/path/to/coredump/file</code></pre>
            <p>The core dump file will be emitted at that location path if a crash happens. You can then use tools like <a href="https://github.com/xtuc/wasm-coredump/tree/main/bin/wasmgdb">wasmgdb</a> to inspect the file and debug the crash.</p><p>But let's dig into how the core dumps are generated in WebAssembly, and what’s inside them.</p>
    <div>
      <h3>How are Wasm core dumps generated</h3>
      <a href="#how-are-wasm-core-dumps-generated">
        
      </a>
    </div>
    <p>(and what’s inside them)</p><p>When WebAssembly terminates execution due to abnormal behavior, we say that it entered a trap. With Rust, examples of operations that can trap are accessing out-of-bounds addresses or a division by zero arithmetic call. You can read about the <a href="https://webassembly.org/docs/security/">security model of WebAssembly</a> to learn more about traps.</p><p>The core dump specification plugs into the trap workflow. When WebAssembly crashes and enters a trap, core dumping support kicks in and starts unwinding the call <a href="https://webassembly.github.io/spec/core/exec/runtime.html#stack">stack</a> gathering debugging information. For each frame in the stack, it collects the <a href="https://webassembly.github.io/spec/core/syntax/modules.html#syntax-func">function</a> parameters and the values stored in locals and in the stack, along with binary offsets that help us map to exact locations in the source code. Finally, it snapshots the <a href="https://webassembly.github.io/spec/core/syntax/modules.html#syntax-mem">memory</a> and captures information like the <a href="https://webassembly.github.io/spec/core/syntax/modules.html#syntax-table">tables</a> and the <a href="https://webassembly.github.io/spec/core/syntax/modules.html#syntax-global">global variables</a>.</p><p><a href="https://dwarfstd.org/">DWARF</a> is used by many mature languages like C, C++, Rust, Java, or Go. By emitting DWARF information into the binary at compile time a debugger can provide information such as the source name and the line number where the exception occurred, function and argument names, and more. 
Without DWARF, the core dumps would be just pure assembly code without any contextual information or metadata related to the source code that generated it before compilation, and they would be much harder to debug.</p><p>WebAssembly <a href="https://webassembly.github.io/spec/core/appendix/custom.html#name-section">uses a (lighter) version of DWARF</a> that maps functions, or a module and local variables, to their names in the source code (you can read about the <a href="https://webassembly.github.io/spec/core/appendix/custom.html#name-section">WebAssembly name section</a> for more information), and naturally core dumps use this information.</p><p>All this information for debugging is then bundled together and saved to the file, the core dump file.</p><p>The <a href="https://github.com/WebAssembly/tool-conventions/blob/main/Coredump.md#coredump-file-format">core dump structure</a> has multiple sections, but the most important are:</p><ul><li><p>General information about the process;</p></li><li><p>The <a href="https://webassembly.github.io/threads/core/">threads</a> and their stack frames (note that WebAssembly is <a href="https://developers.cloudflare.com/workers/runtime-apis/webassembly/#threading">single threaded</a> in Cloudflare Workers);</p></li><li><p>A snapshot of the WebAssembly linear memory or only the relevant regions;</p></li><li><p>Optionally, other sections like globals, data, or table.</p></li></ul><p>Here’s the thread definition from the core dump specification:</p>
            <pre><code>corestack   ::= customsec(thread-info vec(frame))
thread-info ::= 0x0 thread-name:name ...
frame       ::= 0x0 ... funcidx:u32 codeoffset:u32 locals:vec(value)
                stack:vec(value)</code></pre>
            <p>A thread is a custom section called <code>corestack</code>. A corestack section contains the thread name and a vector (or array) of frames. Each frame contains the function index in the WebAssembly module (<code>funcidx</code>), the code offset relative to the function's start (<code>codeoffset</code>), the list of locals, and the list of values in the stack.</p><p>Values are defined as follows:</p>
            <pre><code>value ::= 0x01       =&gt; ∅
        | 0x7F n:i32 =&gt; n
        | 0x7E n:i64 =&gt; n
        | 0x7D n:f32 =&gt; n
        | 0x7C n:f64 =&gt; n</code></pre>
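            <p>In code, decoding a value is a dispatch on that leading tag byte. Here’s an illustrative sketch - note that it reads fixed-width little-endian numbers purely to show the tag dispatch, whereas the actual specification encodes numbers in the standard WebAssembly binary format:</p>
            <pre><code>// Sketch: decoding one core-dump value from a byte stream, following the
// tag grammar above (0x01 = missing/∅, 0x7F = i32, 0x7E = i64,
// 0x7D = f32, 0x7C = f64). Fixed-width little-endian reads are used here
// for clarity; real encodings follow the Wasm binary format instead.
export function decodeValue(view: DataView, offset: number) {
  const tag = view.getUint8(offset);
  switch (tag) {
    case 0x01: // ∅ - no value recorded
      return { value: undefined, next: offset + 1 };
    case 0x7f: // i32
      return { value: view.getInt32(offset + 1, true), next: offset + 5 };
    case 0x7e: // i64
      return { value: view.getBigInt64(offset + 1, true), next: offset + 9 };
    case 0x7d: // f32
      return { value: view.getFloat32(offset + 1, true), next: offset + 5 };
    case 0x7c: // f64
      return { value: view.getFloat64(offset + 1, true), next: offset + 9 };
    default:
      throw new Error(`unknown value tag 0x${tag.toString(16)}`);
  }
}</code></pre>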
            <p>At the time of this writing, these are the possible <a href="https://webassembly.github.io/spec/core/binary/types.html#binary-numtype">number types</a> in a value. Again, we wanted to describe the basics; you should track the full <a href="https://github.com/WebAssembly/tool-conventions/blob/main/Coredump.md#coredump-file-format">specification</a> to get more detail or find information about future changes. WebAssembly core dump support is in its early stages of specification and implementation; things will get better, and things might change.</p><p>This is all great news. Unfortunately, however, the Cloudflare Workers <a href="https://github.com/cloudflare/workerd">runtime</a> doesn’t support WebAssembly core dumps yet. There is no technical impediment to adding this feature to <a href="https://github.com/cloudflare/workerd">workerd</a>; after all, it's <a href="https://developers.cloudflare.com/workers/learning/how-workers-works/">based on V8</a>, but since it powers a critical part of our production infrastructure and products, we tend to be conservative when it comes to adding specifications or standards that are still considered experimental and going through the definition phase.</p><p>So, how do we get Wasm core dumps in Cloudflare Workers today?</p>
    <div>
      <h3>Polyfilling</h3>
      <a href="#polyfilling">
        
      </a>
    </div>
    <p>Polyfilling means using userland code to provide modern functionality in older environments that do not natively support it. <a href="https://developer.mozilla.org/en-US/docs/Glossary/Polyfill">Polyfills</a> are widely popular in the JavaScript community and the browser environment; they've been used extensively to address cases where browser vendors haven't yet caught up with the latest standards, where they implement the same features in different ways, or where old browsers can never support a new standard.</p><p>Meet <a href="https://github.com/xtuc/wasm-coredump/tree/main/bin/rewriter">wasm-coredump-rewriter</a>, a tool that you can use to rewrite a Wasm module and inject the core dump runtime functionality into the binary. This runtime code will catch most traps (exceptions in host functions are not yet caught, and memory violations are not caught by default) and generate a standard core dump file. To some degree, this is similar to how Binaryen's <a href="https://github.com/WebAssembly/binaryen/blob/main/src/passes/Asyncify.cpp">Asyncify</a> <a href="https://kripken.github.io/blog/wasm/2019/07/16/asyncify.html">works</a>.</p><p>Let’s look at code and see how this works. Here’s some simple pseudo code:</p>
            <pre><code>export function entry(v1, v2) {
    return addTwo(v1, v2)
}

function addTwo(v1, v2) {
  res = v1 + v2;
  throw "something went wrong";

  return res
}</code></pre>
            <p>An imaginary compiler could take that source and generate the following Wasm binary code:</p>
            <pre><code>  (func $entry (param i32 i32) (result i32)
    (local.get 0)
    (local.get 1)
    (call $addTwo)
  )

  (func $addTwo (param i32 i32) (result i32)
    (local.get 0)
    (local.get 1)
    (i32.add)
    (unreachable) ;; something went wrong
  )

  (export "entry" (func $entry))</code></pre>
            <p><i>“;;” is used to denote a comment.</i></p><p><code>entry()</code> is the Wasm function <a href="https://webassembly.github.io/spec/core/exec/runtime.html#syntax-hostfunc">exported to the host</a>. In an environment like the browser, JavaScript (being the host) can call entry().</p><p>Irrelevant parts of the code have been snipped for brevity, but this is what the Wasm code will look like after <a href="https://github.com/xtuc/wasm-coredump/tree/main/bin/rewriter">wasm-coredump-rewriter</a> rewrites it:</p>
            <pre><code>  (func $entry (type 0) (param i32 i32) (result i32)
    ...
    local.get 0
    local.get 1
    call $addTwo ;; see the addTwo function below
    global.get 2 ;; is unwinding?
    if  ;; label = @1
      i32.const x ;; code offset
      i32.const 0 ;; function index
      i32.const 2 ;; local_count
      call $coredump/start_frame
      local.get 0
      call $coredump/add_i32_local
      local.get 1
      call $coredump/add_i32_local
      ...
      call $coredump/write_coredump
      unreachable
    end)

  (func $addTwo (type 0) (param i32 i32) (result i32)
    local.get 0
    local.get 1
    i32.add
    ;; the unreachable instruction was here before
    call $coredump/unreachable_shim
    i32.const 1 ;; funcidx
    i32.const 2 ;; local_count
    call $coredump/start_frame
    local.get 0
    call $coredump/add_i32_local
    local.get 1
    call $coredump/add_i32_local
    ...
    return)

  (export "entry" (func $entry))</code></pre>
            <p>As you can see, a few things changed:</p><ol><li><p>The <code>(unreachable)</code> instruction in <code>addTwo()</code> was replaced by a call to <code>$coredump/unreachable_shim</code>, which starts the unwinding process. Then, the location and debugging data is captured, and the function returns normally to the <code>entry()</code> caller.</p></li><li><p>Code has been added after the <code>addTwo()</code> call instruction in <code>entry()</code> that detects whether we have an unwinding process in progress. If we do, it also captures the local debugging data, writes the core dump file, and then, finally, executes the unconditional trap <code>unreachable</code>.</p></li></ol><p>In short, we unwind until the host function <code>entry()</code> gets destroyed by calling <code>unreachable</code>.</p><p>Let’s go over the runtime functions that we inject, for more clarity:</p><ul><li><p><code>$coredump/start_frame(funcidx, local_count)</code> starts a new frame in the coredump.</p></li><li><p><code>$coredump/add_*_local(value)</code> captures the values of function arguments and locals (currently, capturing values from the stack isn’t implemented).</p></li><li><p><code>$coredump/write_coredump</code> is used at the end and writes the core dump in memory. We take advantage of the first 1 KiB of the Wasm linear memory, which is unused, to store our core dump.</p></li></ul><p>A diagram is worth a thousand words:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/27DxZQioAhBsBiltjwIiyL/2dc57b370b6741120a5bb263c2795652/image2-7.png" />
            
            </figure><p>Wait, what’s this about the first 1 KiB of the memory being unused, you ask? Well, it turns out that most WebAssembly linkers and tools, including <a href="https://github.com/emscripten-core/emscripten/issues/5775#issuecomment-344049528">Emscripten</a> and <a href="https://github.com/llvm/llvm-project/blob/121e15f96ce401c875e717992a4d054e308ba775/lld/wasm/Writer.cpp#L366">LLVM’s WebAssembly backend</a>, don’t use the first 1 KiB of memory. <a href="https://github.com/rust-lang/rust/issues/50543">Rust</a> and <a href="https://github.com/ziglang/zig/issues/4496">Zig</a> also use LLVM, but they changed the default. This isn’t pretty, but the hugely popular Asyncify polyfill relies on the same trick, so there’s reasonable support until we find a better way.</p><p>But we digress; let’s continue. After the crash, the host, typically JavaScript in the browser, can now catch the exception and extract the core dump from the Wasm instance’s memory:</p>
            <pre><code>try {
    wasmInstance.exports.someExportedFunction();
} catch(err) {
    const image = new Uint8Array(wasmInstance.exports.memory.buffer);
    writeFile("coredump." + Date.now(), image);
}</code></pre>
            <p>If you're curious about the actual details of the core dump implementation, you can find the <a href="https://github.com/xtuc/wasm-coredump/blob/main/lib/asc-coredump/assembly/coredump.ts">source code here</a>. It was written in <a href="https://www.assemblyscript.org/">AssemblyScript</a>, a TypeScript-like language for WebAssembly.</p><p>This is how we use the polyfilling technique to implement Wasm core dumps when the runtime doesn’t support them yet. Interestingly, some Wasm runtimes, being optimizing compilers, are likely to make debugging more difficult because function arguments, locals, or functions themselves can be optimized away. Polyfilling or rewriting the binary could actually preserve more source-level information for debugging.</p><p>You might be asking: what about performance? We did some testing and found that the <a href="https://github.com/xtuc/wasm-coredump-bench/blob/main/results.md">impact is negligible</a>; the cost-benefit of being able to debug our crashes is positive. Also, you can easily turn Wasm core dumps on or off for specific builds or environments; deciding when you need them is up to you.</p>
    <div>
      <h3>Debugging from a core dump</h3>
      <a href="#debugging-from-a-core-dump">
        
      </a>
    </div>
    <p>We now know how to generate a core dump, but how do we use it to diagnose and debug a software crash?</p><p>Similarly to <a href="https://en.wikipedia.org/wiki/GNU_Debugger">gdb</a> (GNU Project Debugger) on Linux, <a href="https://github.com/xtuc/wasm-coredump/blob/main/bin/wasmgdb/README.md">wasmgdb</a> is the tool you can use to parse and make sense of core dumps in WebAssembly; it understands the file structure, uses DWARF to provide naming and contextual information, and offers interactive commands to navigate the data. To exemplify how it works, <a href="https://github.com/xtuc/wasm-coredump/blob/main/bin/wasmgdb/demo.md">wasmgdb has a demo</a> of a Rust application that deliberately crashes; we will use it.</p><p>Let's imagine that our Wasm program crashed, wrote a core dump file, and we want to debug it.</p>
            <pre><code>$ wasmgdb source-program.wasm /path/to/coredump
wasmgdb&gt;</code></pre>
            <p>When you fire up wasmgdb, you enter a <a href="https://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop">REPL (Read-Eval-Print Loop)</a> interface, and you can start typing commands. The tool tries to mimic the gdb command syntax; you can find the <a href="https://github.com/xtuc/wasm-coredump/blob/main/bin/wasmgdb/README.md#commands">list here</a>.</p><p>Let's examine the backtrace using the <code>bt</code> command:</p>
            <pre><code>wasmgdb&gt; bt
#18     000137 as __rust_start_panic () at library/panic_abort/src/lib.rs
#17     000129 as rust_panic () at library/std/src/panicking.rs
#16     000128 as rust_panic_with_hook () at library/std/src/panicking.rs
#15     000117 as {closure#0} () at library/std/src/panicking.rs
#14     000116 as __rust_end_short_backtrace&lt;std::panicking::begin_panic_handler::{closure_env#0}, !&gt; () at library/std/src/sys_common/backtrace.rs
#13     000123 as begin_panic_handler () at library/std/src/panicking.rs
#12     000194 as panic_fmt () at library/core/src/panicking.rs
#11     000198 as panic () at library/core/src/panicking.rs
#10     000012 as calculate (value=0x03000000) at src/main.rs
#9      000011 as process_thing (thing=0x2cff0f00) at src/main.rs
#8      000010 as main () at src/main.rs
#7      000008 as call_once&lt;fn(), ()&gt; (???=0x01000000, ???=0x00000000) at /rustc/b833ad56f46a0bbe0e8729512812a161e7dae28a/library/core/src/ops/function.rs
#6      000020 as __rust_begin_short_backtrace&lt;fn(), ()&gt; (f=0x01000000) at /rustc/b833ad56f46a0bbe0e8729512812a161e7dae28a/library/std/src/sys_common/backtrace.rs
#5      000016 as {closure#0}&lt;()&gt; () at /rustc/b833ad56f46a0bbe0e8729512812a161e7dae28a/library/std/src/rt.rs
#4      000077 as lang_start_internal () at library/std/src/rt.rs
#3      000015 as lang_start&lt;()&gt; (main=0x01000000, argc=0x00000000, argv=0x00000000, sigpipe=0x00620000) at /rustc/b833ad56f46a0bbe0e8729512812a161e7dae28a/library/std/src/rt.rs
#2      000013 as __original_main () at &lt;directory not found&gt;/&lt;file not found&gt;
#1      000005 as _start () at &lt;directory not found&gt;/&lt;file not found&gt;
#0      000264 as _start.command_export at &lt;no location&gt;</code></pre>
            <p>Each line represents a frame from the program's call <a href="https://webassembly.github.io/spec/core/exec/runtime.html#stack">stack</a>; see frame #3:</p>
            <pre><code>#3      000015 as lang_start&lt;()&gt; (main=0x01000000, argc=0x00000000, argv=0x00000000, sigpipe=0x00620000) at /rustc/b833ad56f46a0bbe0e8729512812a161e7dae28a/library/std/src/rt.rs</code></pre>
            <p>The funcidx, function name, argument names and values, and source location are all present. Now let's select frame #9 and inspect the locals, which include the function arguments:</p>
            <pre><code>wasmgdb&gt; f 9
000011 as process_thing (thing=0x2cff0f00) at src/main.rs
wasmgdb&gt; info locals
thing: *MyThing = 0xfff1c</code></pre>
            <p>Let’s use the <code>p</code> command to inspect the content of the thing argument:</p>
            <pre><code>wasmgdb&gt; p (*thing)
thing (0xfff2c): MyThing = {
    value (0xfff2c): usize = 0x00000003
}</code></pre>
            <p>You can also use the <code>p</code> command to inspect the value of the variable, which can be useful for nested structures:</p>
            <pre><code>wasmgdb&gt; p (*thing)-&gt;value
value (0xfff2c): usize = 0x00000003</code></pre>
            <p>And you can use p to inspect memory addresses. Let’s point at <code>0xfff2c</code>, the start of the <code>MyThing</code> structure, and inspect:</p>
            <pre><code>wasmgdb&gt; p (MyThing) 0xfff2c
0xfff2c (0xfff2c): MyThing = {
    value (0xfff2c): usize = 0x00000003
}</code></pre>
            <p>All this information in every step of the stack is very helpful to determine the cause of a crash. In our test case, if you look at frame #10, we triggered an integer overflow. Once you get comfortable walking through wasmgdb and using its commands to inspect the data, debugging core dumps will be another powerful skill under your belt.</p>
    <div>
      <h3>Tidying up everything in Cloudflare Workers</h3>
      <a href="#tidying-up-everything-in-cloudflare-workers">
        
      </a>
    </div>
    <p>We learned about core dumps and how they work, and we know how to make Cloudflare Workers generate them using the wasm-coredump-rewriter polyfill, but how does all this work in practice end to end?</p><p>We've been dogfooding the technique described in this blog at Cloudflare for a while now. Wasm core dumps have been invaluable in helping us debug Rust-based services running on top of Cloudflare Workers like <a href="/introducing-d1/">D1</a>, <a href="/privacy-edge-making-building-privacy-first-apps-easier/">Privacy Edge</a>, <a href="/announcing-amp-real-url/">AMP</a>, or <a href="/introducing-constellation/">Constellation</a>.</p><p>Today we're open-sourcing the <a href="https://github.com/cloudflare/wasm-coredump">Wasm Coredump Service</a> and enabling anyone to deploy it. This service collects the Wasm core dumps originating from your projects and applications when they crash, parses them, prints an exception with the stack information in the logs, and can optionally store the full core dump in a file in an <a href="https://developers.cloudflare.com/r2/">R2 bucket</a> (which you can then use with wasmgdb) or send the exception to <a href="https://sentry.io/">Sentry</a>.</p><p>We use a <a href="https://developers.cloudflare.com/workers/configuration/bindings/about-service-bindings/">service binding</a> to facilitate the communication between your application Worker and the Coredump service Worker. A Service binding allows you to send HTTP requests to another Worker without those requests going over the Internet, thus avoiding network latency or having to deal with authentication. Here’s a diagram of how it works:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/gntGbV7rjDOncMZhFP7x1/3429a64f297c0edbf3327c677d56e0d3/image1-12.png" />
            
            </figure><p>Using it is as simple as npm/yarn installing <code>@cloudflare/wasm-coredump</code>, configuring a few options, and then adding a few lines of code to your other applications running in Cloudflare Workers, in the exception handling logic:</p>
            <pre><code>import shim, { getMemory, wasmModule } from "../build/worker/shim.mjs"

const timeoutSecs = 20;

async function fetch(request, env, ctx) {
    try {
        // see https://github.com/rustwasm/wasm-bindgen/issues/2724.
        return await Promise.race([
            shim.fetch(request, env, ctx),
            new Promise((r, e) =&gt; setTimeout(() =&gt; e("timeout"), timeoutSecs * 1000))
        ]);
    } catch (err) {
      const memory = getMemory();
      const coredumpService = env.COREDUMP_SERVICE;
      await recordCoredump({ memory, wasmModule, request, coredumpService });
      throw err;
    }
}</code></pre>
            <p>The <code>../build/worker/shim.mjs</code> import comes from the <a href="https://github.com/cloudflare/workers-rs/tree/main/worker-build">worker-build</a> tool, from the <a href="https://github.com/cloudflare/workers-rs/tree/main">workers-rs</a> packages, and is automatically generated when <a href="https://developers.cloudflare.com/workers/wrangler/install-and-update/">wrangler</a> builds your Rust-based Cloudflare Workers project. If the Wasm throws an exception, we catch it, extract the core dump from memory, and send it to our Core dump service.</p><p>You might have noticed that we race the <a href="https://github.com/cloudflare/workers-rs">workers-rs</a> <code>shim.fetch()</code> entry point with another Promise to generate a timeout exception if the Rust code doesn't respond earlier. This is because currently, <a href="https://github.com/rustwasm/wasm-bindgen/">wasm-bindgen</a>, which generates the glue between JavaScript and Rust and is used by workers-rs, has <a href="https://github.com/rustwasm/wasm-bindgen/issues/2724">an issue</a> where a Promise might not be rejected if Rust panics asynchronously (leading to the Workers runtime killing the Worker with “Error: The script will never generate a response”). This can block the wasm-coredump code and make the core dump generation flaky.</p><p>We are working to improve this, but in the meantime, make sure to adjust <code>timeoutSecs</code> to something slightly bigger than the typical response time of your application.</p><p>Here’s an example of a Wasm core dump exception in Sentry:</p>
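<p>The race-with-a-timer pattern can be isolated into a tiny generic helper for clarity. This is a sketch of the technique only; the <code>withTimeout</code> name is ours and is not part of workers-rs or the wasm-coredump package:</p>

```javascript
// Generic sketch of the timeout race used above: run the real work against a
// timer that rejects, so a hung call can't stall the handler forever.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("timeout")), ms);
  });
  // Clear the timer either way so it can't fire (or keep the process alive) later.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```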
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/gqZyPFslc9uqCV7jEgaqW/9425701b4209952518e3aef155d9b572/image3-4.png" />
            
            </figure><p>You can find a <a href="https://github.com/cloudflare/wasm-coredump/tree/main/example">working example</a>, the Sentry and R2 configuration options, and more details in the <a href="https://github.com/cloudflare/wasm-coredump">@cloudflare/wasm-coredump</a> GitHub repository.</p>
    <div>
      <h3>Too big to fail</h3>
      <a href="#too-big-to-fail">
        
      </a>
    </div>
    <p>It's worth mentioning one corner case of this debugging technique and the solution: sometimes your codebase is so big that adding core dump and DWARF debugging information might result in a Wasm binary that is too big to fit in a Cloudflare Worker. Well, worry not; we have a solution for that too.</p><p>Fortunately the DWARF for WebAssembly specification also supports <a href="https://yurydelendik.github.io/webassembly-dwarf/#external-DWARF">external DWARF files</a>. To make this work, we have a tool called <a href="https://github.com/xtuc/wasm-coredump/tree/main/bin/debuginfo-split">debuginfo-split</a> that you can add to the build command in the <code>wrangler.toml</code> configuration:</p>
            <pre><code>command = "... &amp;&amp; debuginfo-split ./build/worker/index.wasm"</code></pre>
            <p>This strips the debugging information from the Wasm binary and writes it to a new, separate file called <code>debug-{UUID}.wasm</code>. You then need to upload this file to the same R2 bucket used by the Wasm Coredump Service (you can automate this as part of your CI or build scripts). The same UUID is also injected into the main Wasm binary; this allows us to correlate the Wasm binary with its corresponding DWARF debugging information. Problem solved.</p><p>Binaries without DWARF information can be significantly smaller. Here’s our example:</p>
<table>
<thead>
  <tr>
    <th><span>4.5 MiB</span></th>
    <th><span>debug-63372dbe-41e6-447d-9c2e-e37b98e4c656.wasm</span></th>
  </tr>
</thead>
<tbody>
  <tr>
    <td><span>313 KiB</span></td>
    <td><span>build/worker/index.wasm</span></td>
  </tr>
</tbody>
</table>
    <div>
      <h3>Final words</h3>
      <a href="#final-words">
        
      </a>
    </div>
    <p>We hope you enjoyed reading this blog as much as we did writing it and that it can help you take your Wasm debugging journeys, using Cloudflare Workers or not, to another level.</p><p>Note that while the examples used here were around using Rust and WebAssembly because that's a common pattern, you can use the same techniques if you're compiling WebAssembly from other languages like C or C++.</p><p>Also, note that the WebAssembly core dump standard is a hot topic, and its implementations and adoption are evolving quickly. We will continue improving the <a href="https://github.com/xtuc/wasm-coredump/tree/main/bin/rewriter">wasm-coredump-rewriter</a>, <a href="https://github.com/xtuc/wasm-coredump/tree/main/bin/debuginfo-split">debuginfo-split</a>, and <a href="https://github.com/xtuc/wasm-coredump/tree/main/bin/wasmgdb">wasmgdb</a> tools and the <a href="https://github.com/cloudflare/wasm-coredump">wasm-coredump service</a>. More and more runtimes, including V8, will eventually support core dumps natively, thus eliminating the need to use polyfills, and the tooling, in general, will get better; that's a certainty. For now, we present you with a solution that works today, and we have strong incentives to keep supporting it.</p><p>As usual, you can talk to us on our <a href="https://discord.cloudflare.com/">Developers Discord</a> or the <a href="https://community.cloudflare.com/c/developers/constellation/97">Community forum</a> or open issues or PRs in our GitHub repositories; the team will be listening.</p> ]]></content:encoded>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[WASM]]></category>
            <category><![CDATA[WebAssembly]]></category>
            <category><![CDATA[Rust]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <guid isPermaLink="false">7xtevgzV4ycZa3fIFTQOP5</guid>
            <dc:creator>Sven Sauleau</dc:creator>
            <dc:creator>Celso Martinho</dc:creator>
        </item>
        <item>
            <title><![CDATA[Cloudflare Radar's new BGP origin hijack detection system]]></title>
            <link>https://blog.cloudflare.com/bgp-hijack-detection/</link>
            <pubDate>Fri, 28 Jul 2023 13:00:26 GMT</pubDate>
            <description><![CDATA[ BGP origin hijacks allow attackers to intercept, monitor, redirect, or drop traffic destined for the victim's networks. We explain how Cloudflare built its BGP hijack detection system, from its design and implementation to its integration on Cloudflare Radar ]]></description>
            <content:encoded><![CDATA[ <p></p><p><a href="https://www.cloudflare.com/learning/security/glossary/what-is-bgp/">Border Gateway Protocol</a> (BGP) is the de facto inter-domain routing protocol used on the Internet. It enables networks and organizations to exchange reachability information for blocks of IP addresses (IP prefixes) among each other, thus allowing routers across the Internet to forward traffic to its destination. BGP was designed with the assumption that networks do not intentionally propagate falsified information, but unfortunately that’s not a valid assumption on today’s Internet.</p><p>Malicious actors on the Internet who control BGP routers can perform BGP hijacks by falsely announcing ownership of groups of IP addresses that they do not own, control, or route to. By doing so, an attacker is able to redirect traffic destined for the victim network to itself, and monitor and intercept its traffic. A BGP hijack is much like if someone were to change out all the signs on a stretch of freeway and reroute automobile traffic onto incorrect exits.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3eRfapZmJQLB67OmDnppNJ/29c5285c15fc25ad65a2b615b8abe131/image11.png" />
            
            </figure><p>You can learn more about <a href="https://www.cloudflare.com/learning/security/glossary/what-is-bgp/">BGP</a> and <a href="https://www.cloudflare.com/learning/security/glossary/bgp-hijacking/">BGP hijacking</a> and its consequences in our learning center.</p><p>At Cloudflare, we have long been monitoring suspicious BGP anomalies internally. With our recent efforts, we are bringing BGP origin hijack detection to the <a href="https://radar.cloudflare.com/security-and-attacks">Cloudflare Radar</a> platform, sharing our detection results with the public. In this blog post, we will explain how we built our detection system and how people can use Radar and its APIs to integrate our data into their own workflows.</p>
    <div>
      <h2>What is BGP origin hijacking?</h2>
      <a href="#what-is-bgp-origin-hijacking">
        
      </a>
    </div>
    <p>Services and devices on the Internet locate each other using IP addresses. A block of IP addresses is called an IP prefix (or just prefix for short), and multiple prefixes from the same organization are aggregated into an <a href="https://www.cloudflare.com/learning/network-layer/what-is-an-autonomous-system/">autonomous system</a> (AS).</p>
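<p>As a rough illustration of what a prefix means in practice, the sketch below checks whether an IPv4 address falls inside a prefix written in CIDR notation. The helper names are our own, purely for illustration, not code from Radar:</p>

```javascript
// Illustrative helpers (not Radar code): test IPv4 prefix membership.
function ipToInt(ip) {
  // "192.0.2.42" -> unsigned 32-bit integer
  return ip.split(".").reduce((acc, octet) => (acc << 8) | parseInt(octet, 10), 0) >>> 0;
}

function inPrefix(ip, prefix) {
  const [base, lenStr] = prefix.split("/");
  const len = parseInt(lenStr, 10);
  // Build the network mask; /0 must be special-cased because << is modulo 32 in JS.
  const mask = len === 0 ? 0 : (~0 << (32 - len)) >>> 0;
  return (ipToInt(ip) & mask) === (ipToInt(base) & mask);
}
```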
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1V7wyuIuBZwT9BV8uQjBRs/93167df81577e93c6e01dcae78883700/Screenshot-2023-07-26-at-18.26.17.png" />
            
            </figure><p>Using the BGP protocol, ASes announce which routes can be imported or exported to other ASes and routers from their routing tables. This is called the AS routing policy. Without this routing information, operating the Internet on a large scale would quickly become impractical: data packets would get lost or take too long to reach their destinations.</p><p>During a BGP origin hijack, an attacker creates fake announcements for a targeted prefix, falsely identifying an <a href="https://developers.cloudflare.com/radar/glossary/#autonomous-systems">autonomous system (AS)</a> under their control as the origin of the prefix.</p><p>In the following graph, we show an example where <code>AS 4</code> announces the prefix <code>P</code> that was previously originated by <code>AS 1</code>. The receiving parties, i.e. <code>AS 2</code> and <code>AS 3</code>, accept the hijacked routes and forward traffic toward prefix <code>P</code> to <code>AS 4</code> instead.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4868WfpNphX6eCHwMpZKGr/9edb6690b917913bdfcbc361eadbbae3/image2-15.png" />
            
            </figure><p>As you can see, the normal and hijacked traffic flows back in the opposite direction of the BGP announcements we receive.</p><p>If successful, this type of attack will result in the dissemination of the falsified prefix origin announcement throughout the Internet, causing network traffic previously intended for the victim network to be redirected to the AS controlled by the attacker. As an example of a famous BGP hijack attack, in 2018 <a href="/bgp-leaks-and-crypto-currencies/">someone was able</a> to convince parts of the Internet to reroute traffic for AWS to malicious servers where they used DNS to redirect MyEtherWallet.com, a popular crypto wallet, to a hacked page.</p>
    <div>
      <h2>Prevention mechanisms and why they’re not perfect (yet)</h2>
      <a href="#prevention-mechanisms-and-why-theyre-not-perfect-yet">
        
      </a>
    </div>
    <p>The key difficulty in preventing BGP origin hijacks is that the BGP protocol itself does not provide a mechanism to validate the announcement content. In other words, the original BGP protocol does not provide any authentication or ownership safeguards; any route can be originated and announced by any random network, independent of its rights to announce that route.</p><p>To address this problem, operators and researchers have proposed the <a href="https://en.wikipedia.org/wiki/Resource_Public_Key_Infrastructure">Resource Public Key Infrastructure (RPKI)</a> to store and validate prefix-to-origin mapping information. With RPKI, operators can prove the ownership of their network resources and create ROAs, short for Route Origin Authorisations, cryptographically signed objects that define which Autonomous System (AS) is authorized to originate a specific prefix.</p><p>Cloudflare has been <a href="/rpki/">committed to supporting RPKI</a> since the early days of the <a href="https://datatracker.ietf.org/doc/html/rfc6480">RFC</a>. With RPKI, IP prefix owners can store and share the ownership information securely, and other operators can validate BGP announcements by checking the prefix origin against the information stored in RPKI. Any hijacking attempt to announce an IP prefix with an incorrect origin AS will result in invalid validation results, and such invalid BGP messages will be discarded. This validation process is referred to as route origin validation (ROV).</p><p>In order to further advocate for RPKI deployment and filtering of RPKI invalid announcements, Cloudflare has been providing an RPKI test service, <a href="https://isbgpsafeyet.com/">Is BGP Safe Yet?</a>, allowing users to test whether their ISP filters RPKI invalid announcements. We also provide rich information with regard to the RPKI status of individual prefixes and ASes at <a href="https://rpki.cloudflare.com/">https://rpki.cloudflare.com/</a>.</p>
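<p>To make the ROV logic concrete, here is a simplified IPv4-only sketch loosely following RFC 6811 semantics: an announcement is valid if some covering ROA matches its origin AS and its prefix length does not exceed the ROA's maxLength, invalid if ROAs cover the prefix but none match, and not-found otherwise. The object shapes and helper names are our own assumptions, and this is not Cloudflare's implementation:</p>

```javascript
// Simplified route origin validation (ROV) sketch for IPv4, loosely following
// RFC 6811. ROAs are assumed to look like { prefix, maxLength, asn }.
function ipToInt(ip) {
  return ip.split(".").reduce((acc, octet) => (acc << 8) | parseInt(octet, 10), 0) >>> 0;
}

// Does the ROA's prefix cover the announced prefix?
function covers(roaPrefix, annPrefix) {
  const [roaBase, roaLenStr] = roaPrefix.split("/");
  const [annBase, annLenStr] = annPrefix.split("/");
  const roaLen = parseInt(roaLenStr, 10);
  if (roaLen > parseInt(annLenStr, 10)) return false; // ROA is more specific
  const mask = roaLen === 0 ? 0 : (~0 << (32 - roaLen)) >>> 0;
  return (ipToInt(annBase) & mask) === (ipToInt(roaBase) & mask);
}

function validateOrigin(roas, announcement) {
  const covering = roas.filter((roa) => covers(roa.prefix, announcement.prefix));
  if (covering.length === 0) return "not-found"; // no ROA covers this prefix
  const annLen = parseInt(announcement.prefix.split("/")[1], 10);
  const ok = covering.some(
    (roa) => roa.asn === announcement.originAsn && annLen <= roa.maxLength
  );
  return ok ? "valid" : "invalid";
}
```

<p>A hijacked announcement, i.e. one with the wrong origin AS or an overly specific prefix, comes back "invalid" and would be discarded by a router performing ROV.</p>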
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4MNPvwC6PpJCPQCIQhC8sl/82309707277d649b0b810ed6e3028947/image8-1.png" />
            
            </figure><p><b>However</b>, the effectiveness of RPKI in preventing BGP origin hijacks depends on two factors:</p><ol><li><p>The ratio of prefix owners that register their prefixes in RPKI;</p></li><li><p>The ratio of networks that perform route origin validation.</p></li></ol><p>Unfortunately, neither ratio is at a satisfactory level yet. As of today, July 27, 2023, only about 45% of the IP prefixes routable on the Internet are covered by some ROA in RPKI. The remaining prefixes are highly vulnerable to BGP origin hijacks. Even the 45% of prefixes that are covered by some ROA can still be affected by origin hijack attempts, due to the low ratio of networks that perform route origin validation (ROV). Based on our <a href="/rpki-updates-data/">recent study</a>, only 6.5% of Internet users are protected by ROV from BGP origin hijacks.</p><p>Despite the benefits of RPKI and RPKI ROAs, their effectiveness in preventing BGP origin hijacks is limited by the slow adoption and deployment of these technologies. Until we achieve a high rate of RPKI ROA registration and RPKI invalid filtering, BGP origin hijacks will continue to pose a significant threat to the daily operations of the Internet and the security of everyone connected to it. Therefore, it’s also essential to prioritize developing and deploying BGP monitoring and detection tools to enhance the security and stability of the Internet's routing infrastructure.</p>
    <div>
      <h2>Design of Cloudflare’s BGP hijack detection system</h2>
      <a href="#design-of-cloudflares-bgp-hijack-detection-system">
        
      </a>
    </div>
    <p>Our system comprises multiple data sources and three distinct modules that work together to detect and analyze potential BGP hijack events: prefix origin change detection, hijack detection, and alerts storage and delivery.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/67bYKbfT4zBONmISARpcta/8790aaad3f3db54ba012bd751dd393c9/image6-7.png" />
            
            </figure><p>The Prefix Origin Change Detection module provides the data, the Hijack Detection module analyzes the data, and the Alerts Storage and Delivery module stores and provides access to the results. These modules work in tandem to provide a comprehensive system for detecting and analyzing potential BGP hijack events.</p>
    <div>
      <h3>Prefix origin change detection module</h3>
      <a href="#prefix-origin-change-detection-module">
        
      </a>
    </div>
    <p>At its core, the BGP protocol involves:</p><ol><li><p>Exchanging prefix reachability (routing) information;</p></li><li><p>Deciding where to forward traffic based on the reachability information received.</p></li></ol><p>The reachability change information is encoded in BGP update messages while the routing decision results are encoded as a route information base (RIB) on the routers, also known as the <a href="https://en.wikipedia.org/wiki/Routing_table">routing table</a>.</p><p>In our origin hijack detection system, we focus on investigating BGP <a href="https://datatracker.ietf.org/doc/html/rfc4271">update messages</a> that contain changes to the origin ASes of any IP prefixes. There are two types of BGP update messages that could indicate prefix origin changes: <b>announcements</b> and <b>withdrawals</b>.</p><p>Announcements include an AS-level path toward one or more prefixes. The path tells the receiving parties through which sequence of networks (ASes) one can reach the corresponding prefixes. The last hop of an AS path is the origin AS. In the following diagram, AS 1 is the origin AS of the announced path.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4bF9IfSM2X5mtlWgqsKt4e/8a30a1f33717082d8e5d0f0ba35ae68a/image4-6.png" />
            
            </figure><p>Withdrawals, on the other hand, simply inform the receiving parties that the prefixes are no longer reachable.</p><p>Both types of messages are stateless. They inform us of the current route changes, but provide no information about the previous states. As a result, detecting origin changes is not as straightforward as one may think. Our system needs to keep track of historical BGP updates and build state over time so that we can verify whether a BGP update contains origin changes.</p><p>We didn't want to deal with a complex system like a database to manage the state of every prefix we see across all the BGP updates we receive. Fortunately, computer science gives us the <a href="https://en.wikipedia.org/wiki/Trie">prefix trie</a>, a data structure for storing and looking up string-indexed data, which is ideal for our use case. We developed a fast, custom, Rust-based IP prefix trie that holds the relevant information for each IP prefix, such as the origin ASN and the AS path, and that can be updated based on BGP announcements and withdrawals.</p><p>The figure below shows an example of the AS path information for prefix <code>192.0.2.0/24</code> stored on a prefix trie. When updating the information on the prefix trie, if we see a change of origin ASN for any given prefix, we record the BGP message as well as the change and create an <code>Origin Change Signal</code>.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2tZK0inZhfzbvgpQIBKkFE/4215c1630012b91785c9a79866117eae/Screenshot-2023-07-26-at-18.20.07.png" />
            
            </figure><p>The prefix origin change detection module collects and processes live-stream and historical BGP data from various sources. For <a href="https://www.cloudflare.com/developer-platform/solutions/live-streaming/">live streams</a>, our system applies a thin layer of data processing to translate BGP messages into our internal data structure. At the same time, for historical archives, we use a dedicated deployment of the <a href="https://bgpkit.com/broker">BGPKIT broker</a> and <a href="https://bgpkit.com/parser">parser</a> to convert MRT files from <a href="https://www.routeviews.org/">RouteViews</a> and <a href="https://www.ripe.net/analyse/internet-measurements/routing-information-service-ris">RIPE RIS</a> into BGP message streams as they become available.</p><p>After the data is collected, consolidated, and normalized, the module creates, maintains, and destroys the prefix tries so that we know what changed relative to previous BGP announcements from the same peers. Based on these comparisons, it sends enriched messages downstream to be analyzed.</p>
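<p>To make the statefulness requirement concrete, here is a minimal sketch of how per-prefix state turns stateless BGP updates into origin change signals. Our implementation is a Rust prefix trie; this illustrative version uses a plain JavaScript map keyed by the prefix string, and all names are our own for the example:</p>

```javascript
// Sketch only: a real system keys a binary trie on prefix bits and tracks
// state per peer. A Map keyed on the prefix string is enough to show the idea.
function createOriginTracker() {
  const state = new Map(); // prefix -> { origin, path }

  return {
    // Apply an announcement; return a signal if the origin AS changed.
    announce(prefix, path) {
      const origin = path[path.length - 1]; // the origin AS is the last hop
      const prev = state.get(prefix);
      state.set(prefix, { origin, path });
      if (prev && prev.origin !== origin) {
        return { prefix, oldOrigin: prev.origin, newOrigin: origin };
      }
      return null; // no origin change
    },
    // Apply a withdrawal: the prefix is no longer reachable.
    withdraw(prefix) {
      state.delete(prefix);
    },
  };
}
```

<p>Announcing <code>192.0.2.0/24</code> with path <code>[4, 3, 2, 1]</code> and later with <code>[4, 3, 666]</code> would produce a signal recording the origin change from AS 1 to AS 666.</p>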
    <div>
      <h3>Hijack detection module</h3>
      <a href="#hijack-detection-module">
        
      </a>
    </div>
    <p>Determining whether BGP messages suggest a hijack is a complex task, and no common scoring mechanism can be used to provide a definitive answer. Fortunately, there are several types of data sources that can collectively provide a relatively good idea of whether a BGP announcement is legitimate or not. These data sources can be categorized into two types: inter-AS relationships and prefix-origin binding.</p><p>The inter-AS relationship datasets include AS2org and AS2rel datasets from <a href="https://www.caida.org/">CAIDA/UCSD</a>, AS2rel datasets from <a href="https://bgpkit.com/">BGPKIT</a>, AS organization datasets from <a href="https://www.peeringdb.com/">PeeringDB</a>, and <a href="/route-leak-detection-with-cloudflare-radar/#route-leak-detection">per-prefix AS relationship data</a> built at Cloudflare. These datasets provide information about the relationship between autonomous systems, such as whether they are upstream or downstream from one another, or if the origins of any change signal belong to the same organization.</p><p>Prefix-to-origin binding datasets include live RPKI validated ROA payload (VRP) from the <a href="https://rpki.cloudflare.com/">Cloudflare RPKI portal</a>, daily Internet Routing Registry (IRR) dumps curated and cleaned up by <a href="https://www.manrs.org/">MANRS</a>, and prefix and AS <a href="https://en.wikipedia.org/wiki/Bogon_filtering">bogon</a> lists (private and reserved addresses defined by <a href="https://datatracker.ietf.org/doc/html/rfc1918">RFC 1918</a>, <a href="https://datatracker.ietf.org/doc/html/rfc5735">RFC 5735</a>, and <a href="https://datatracker.ietf.org/doc/html/rfc6598">RFC 6598</a>). These datasets provide information about the ownership of prefixes and the ASes that are authorized to originate them.</p><p>By combining all these data sources, it is possible to collect information about each BGP announcement and answer questions programmatically. 
For this, we have a scoring function that takes all the evidence gathered for a specific BGP event as input and runs it through a sequence of checks. Each check returns a neutral, positive, or negative weight that is added to the final score. The higher the score, the more likely it is that the event is a hijack attempt.</p><p>The following diagram illustrates this sequence of checks:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4y6YiYezGew65wYeNFPOqs/f2883b0c385966acbc991be815bb4a1c/image1-12.png" />
            
            </figure><p>As you can see, for each event, several checks are involved that help calculate the final score: RPKI, Internet Routing Registry (IRR), bogon prefix and ASN lists, AS relationships, and AS path.</p><p>Our guiding principles are: if the newly announced origins are RPKI or IRR invalid, it’s more likely that it’s a hijack, but if the old origins are also invalid, then it’s less likely. We discard events about private and reserved ASes and prefixes. If the new and old origins have a direct business relationship, then it’s less likely that it’s a hijack. If the new AS path indicates that the traffic still goes through the old origin, then it’s probably not a hijack.</p><p>Signals that are deemed legitimate are discarded, while signals with a high enough confidence score are flagged as potential hijacks and sent downstream for further analysis.</p><p>It's important to reiterate that the decision is not binary but a score. There will be situations where we find false negatives or false positives. The advantage of this framework is that we can easily monitor the results, learn from additional datasets, and conduct the occasional manual inspection, which allows us to adjust the weights, add new conditions, and continue improving the score precision over time.</p>
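<p>A weighted scoring pass of this kind can be sketched as follows. The checks, field names, weights, and any threshold are illustrative only, not the values our production system uses:</p>

```javascript
// Each check inspects the gathered evidence and returns a positive weight
// (more likely a hijack), a negative weight (less likely), or zero (neutral).
const CHECKS = [
  // New origin fails RPKI validation, but the old origin was valid.
  (e) => (e.newOriginRpkiInvalid && !e.oldOriginRpkiInvalid ? 4 : 0),
  // New origin fails IRR validation, but the old origin was valid.
  (e) => (e.newOriginIrrInvalid && !e.oldOriginIrrInvalid ? 2 : 0),
  // Old and new origins have a direct business relationship.
  (e) => (e.originsAreRelated ? -3 : 0),
  // The new AS path still traverses the old origin.
  (e) => (e.pathContainsOldOrigin ? -4 : 0),
];

function scoreEvent(evidence) {
  // Events about bogon (private/reserved) prefixes or ASNs are discarded.
  if (evidence.isBogon) return null;
  return CHECKS.reduce((score, check) => score + check(evidence), 0);
}
```

<p>An event that is RPKI invalid but whose path still passes through the old origin ends up near neutral, which matches the intuition that such traffic probably still reaches its legitimate destination.</p>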
    <div>
      <h4>Aggregating BGP hijack events</h4>
      <a href="#aggregating-bgp-hijack-events">
        
      </a>
    </div>
    <p>Our BGP hijack detection system provides fast response time and requires minimal resources by operating on a per-message basis.</p><p>However, when a hijack is happening, the number of hijack signals can be overwhelming for operators to manage. To address this issue, we designed a method to aggregate individual hijack messages into <b>BGP hijack events</b>, thereby reducing the number of alerts triggered.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/20mBP4fcykkGkNIXwXqlAa/326f9aeed83dfd6c337bf374ae0f233b/image10.png" />
            
            </figure><p>An event aggregates BGP messages that come from the same hijacker and relate to prefixes from the same victim. The start date is the date of the first suspicious signal. To determine the end of an event, we look for one of the following conditions:</p><ul><li><p>A BGP withdrawal message for the hijacked prefix: regardless of who sends the withdrawal, the route towards the prefix is no longer via the hijacker, and the hijack announcement is therefore considered finished.</p></li><li><p>A new BGP announcement message with the previous (legitimate) network as the origin: this indicates that the route towards the prefix has reverted to the state before the hijack, and the hijack is therefore considered finished.</p></li></ul><p>If all BGP messages for an event have been withdrawn or reverted, and there are no new suspicious origin changes from the hijacker ASN for <b>six hours</b>, we mark the event as finished and set the end date.</p><p>Hijack events can capture both small-scale and large-scale attacks. Alerts are then based on these aggregated events, not individual messages, making it easier for operators to manage and respond appropriately.</p>
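<p>The event-closing rule above can be sketched in a few lines. The field names (<code>withdrawn</code>, <code>revertedToExpectedOrigin</code>, <code>lastSuspiciousSignalAt</code>) are illustrative, not the internal schema:</p>

```javascript
const QUIET_PERIOD_MS = 6 * 60 * 60 * 1000; // six hours of quiet required

function isEventFinished(event, now) {
  // Every message must be withdrawn, or re-announced with the legitimate origin.
  const allResolved = event.messages.every(
    (m) => m.withdrawn || m.revertedToExpectedOrigin
  );
  // ...and no new suspicious origin change from the hijacker ASN for six hours.
  const quietLongEnough = now - event.lastSuspiciousSignalAt >= QUIET_PERIOD_MS;
  return allResolved && quietLongEnough;
}
```
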
    <div>
      <h3>Alerts, Storage and Notifications module</h3>
      <a href="#alerts-storage-and-notifications-module">
        
      </a>
    </div>
    <p>This module provides access to detected BGP hijack events and sends out notifications to relevant parties. It handles storage of all detected events and provides a user interface for easy access and search of historical events. It also generates notifications and delivers them to the relevant parties, such as network administrators or security analysts, when a potential BGP hijack event is detected. Additionally, this module can build dashboards to display high-level information and visualizations of detected events to facilitate further analysis.</p>
    <div>
      <h3>Lightweight and portable implementation</h3>
      <a href="#lightweight-and-portable-implementation">
        
      </a>
    </div>
    <p>Our BGP hijack detection system is implemented as a Rust-based command-line application that is lightweight and portable. The whole detection pipeline runs off a single binary that connects to a PostgreSQL database, essentially forming a complete, self-contained BGP data pipeline. And if you are wondering: yes, the full system, including the database, runs well on a laptop.</p><p>The runtime cost mainly comes from maintaining the in-memory prefix tries for each full-feed router, each costing roughly 200 MB of RAM. For the beta deployment, we use about 170 full-feed peers, and the whole system runs well on a single 32 GB node with 12 threads.</p>
    <div>
      <h2>Using the BGP Hijack Detection</h2>
      <a href="#using-the-bgp-hijack-detection">
        
      </a>
    </div>
    <p>The BGP Hijack Detection results are now available on both the <a href="https://radar.cloudflare.com/security-and-attacks">Cloudflare Radar</a> website and the <a href="https://developers.cloudflare.com/api/operations/radar-get-bgp-hijacks-events">Cloudflare Radar API</a>.</p>
    <div>
      <h3>Cloudflare Radar</h3>
      <a href="#cloudflare-radar">
        
      </a>
    </div>
    <p>Under the “Security &amp; Attacks” section of Cloudflare Radar, for both the global and ASN views, we now display the BGP origin hijacks table. In this table, we show a list of detected potential BGP hijack events with the following information:</p><ul><li><p>The detected and expected origin ASes;</p></li><li><p>The start time and event duration;</p></li><li><p>The number of BGP messages and route collector peers that saw the event;</p></li><li><p>The announced prefixes;</p></li><li><p>Evidence tags and confidence level (on the likelihood of the event being a hijack).</p></li></ul>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1VwQO8aPGpngp78MrDyCmH/5b27df69735cf09eb21dea04385d8bcc/image3-6.png" />
            
            </figure><p>For each BGP event, our system generates relevant evidence tags to indicate why the event is considered suspicious or not. These tags are used to inform the confidence score assigned to each event. Red tags indicate evidence that increases the likelihood of a hijack event, while green tags indicate the opposite.</p><p>For example, the red tag "RPKI INVALID" indicates an event is likely a hijack, as it suggests that the RPKI validation failed for the announcement. Conversely, the tag "SIBLING ORIGINS" is a green tag that indicates the detected and expected origins belong to the same organization, making it less likely for the event to be a hijack.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/44EZVKQqIpl7O5tS7QrweM/1a763e761fda826d151fd43e951c6167/Screenshot-2023-07-26-at-18.22.35.png" />
            
            </figure><p>Users can now access the BGP hijacks table in the following ways:</p><ol><li><p>Global view under <a href="https://radar.cloudflare.com/security-and-attacks">Security &amp; Attacks</a> page without location filters. This view lists the most recent 150 detected BGP hijack events globally.</p></li><li><p>When filtered by a specific ASN, the table will appear on Overview, Traffic, and Traffic &amp; Attacks tabs.</p></li></ol>
    <div>
      <h3>Cloudflare Radar API</h3>
      <a href="#cloudflare-radar-api">
        
      </a>
    </div>
    <p>We also provide programmable access to the BGP hijack detection results via the Cloudflare Radar API, which is freely available under <a href="https://radar.cloudflare.com/about">CC BY-NC 4.0 license</a>. The API documentation is available at the <a href="https://developers.cloudflare.com/api/operations/radar-get-bgp-hijacks-events">Cloudflare API portal</a>.</p><p>The following <code>curl</code> command fetches the most recent 10 BGP hijack events relevant to AS64512.</p>
            <pre><code>curl -X GET "https://api.cloudflare.com/client/v4/radar/bgp/hijacks/events?involvedAsn=64512&amp;format=json&amp;per_page=10" \
    -H "Authorization: Bearer &lt;API_TOKEN&gt;"</code></pre>
            <p>Users can further filter for high-confidence events by specifying the <code>minConfidence</code> parameter with a value from 0 to 10, where a higher value indicates higher confidence that the events are hijacks. The following example expands on the previous one by adding a minimum confidence score of 8 to the query:</p>
            <pre><code>curl -X GET "https://api.cloudflare.com/client/v4/radar/bgp/hijacks/events?involvedAsn=64512&amp;format=json&amp;per_page=10&amp;minConfidence=8" \
    -H "Authorization: Bearer &lt;API_TOKEN&gt;"</code></pre>
            <p>Additionally, users can also quickly build custom hijack alerters using a Cloudflare <a href="https://developers.cloudflare.com/workers/wrangler/workers-kv/#workers-kv">Workers + KV combination</a>. We have a full tutorial on building alerters that send out webhook-based messages or emails (with <a href="https://developers.cloudflare.com/email-routing/">Email Routing</a>) available on the <a href="https://developers.cloudflare.com/radar/investigate/bgp-anomalies/">Cloudflare Radar documentation site</a>.</p>
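<p>The core of such an alerter is simply a diff between the latest events returned by the API and the event IDs you have already alerted on. A minimal sketch (in a Worker, <code>events</code> would come from the Radar API response and <code>seenIds</code> from Workers KV; the <code>id</code> field name here is illustrative):</p>

```javascript
// Return only the events that have not been alerted on yet. In a Worker you
// would fetch `events` from the Radar API, load `seenIds` from Workers KV,
// send the new events to a webhook or Email Routing address, and write the
// new IDs back to KV.
function findNewEvents(events, seenIds) {
  return events.filter((event) => !seenIds.has(event.id));
}
```
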
    <div>
      <h2>More routing security on Cloudflare Radar</h2>
      <a href="#more-routing-security-on-cloudflare-radar">
        
      </a>
    </div>
    <p>As we continue improving Cloudflare Radar, we are planning to introduce additional Internet routing and security data. For example, Radar will soon get a dedicated routing section to provide digestible BGP information for given networks or regions, such as distinct routable prefixes, RPKI valid/invalid/unknown routes, distribution of IPv4/IPv6 prefixes, etc. Our goal is to provide the best data and tools for routing security to the community, so that we can build a better and more secure Internet together.</p><p>Visit <a href="https://radar.cloudflare.com/">Cloudflare Radar</a> for additional insights around Internet disruptions, routing issues, Internet traffic trends, attacks, Internet quality, and more. Follow us on social media at <a href="https://twitter.com/CloudflareRadar">@CloudflareRadar</a> (Twitter), <a href="https://noc.social/@cloudflareradar">https://noc.social/@cloudflareradar</a> (Mastodon), and <a href="https://bsky.app/profile/radar.cloudflare.com">radar.cloudflare.com</a> (Bluesky), or contact us via <a>e-mail</a>.</p> ]]></content:encoded>
            <category><![CDATA[Radar]]></category>
            <category><![CDATA[BGP]]></category>
            <category><![CDATA[Radar Alerts]]></category>
            <guid isPermaLink="false">33xptAfGQ0z94EAn4h1oKn</guid>
            <dc:creator>Mingwei Zhang</dc:creator>
            <dc:creator>Celso Martinho</dc:creator>
        </item>
        <item>
            <title><![CDATA[Globally distributed AI and a Constellation update]]></title>
            <link>https://blog.cloudflare.com/globally-distributed-ai-and-a-constellation-update/</link>
            <pubDate>Thu, 22 Jun 2023 13:00:05 GMT</pubDate>
            <description><![CDATA[ Today we’re announcing new Constellation features, explaining why it’s the first globally distributed AI platform and why deploying your machine learning tasks on our global network is advantageous. ]]></description>
            <content:encoded><![CDATA[ 
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/d7mzaMT9WUUuX6PKMqP0o/f33c7430024dca3b9eeb0b65b8ee4d21/image2-27.png" />
            
            </figure><p>During Cloudflare's 2023 Developer Week, we announced <a href="/introducing-constellation/">Constellation</a>, a set of APIs that allow everyone to run fast, low-latency inference tasks using pre-trained machine learning/AI models, directly on Cloudflare’s network.</p>
    <div>
      <h3>Constellation update</h3>
      <a href="#constellation-update">
        
      </a>
    </div>
    <p>We now have a few thousand accounts onboarded in the Constellation private beta and have been listening to our customers' feedback to evolve and improve the platform. Today, one month after the announcement, we are upgrading Constellation with three new features.</p><p><b>Bigger models</b></p><p>We are increasing the size limit of your models from 10 MB to 50 MB. While still somewhat conservative during the private beta, this new limit opens doors to more pre-trained and optimized models you can use with Constellation.</p><p><b>Tensor caching</b></p><p>When you run a Constellation inference task, you pass multiple tensor objects as inputs, sometimes creating big data payloads. These inputs travel over the wire back and forth every time you repeat the same task, even when the changes to the input between runs are minimal, creating unnecessary network and data-parsing overhead.</p><p>The client API now supports caching input tensors, resulting in even better network latency and faster inference times.</p><p><b>XGBoost runtime</b></p><p>Constellation started with the ONNX runtime, but our vision is to support multiple runtimes under a common API. Today we're adding the XGBoost runtime to the list.</p><p><a href="https://xgboost.ai/">XGBoost</a> is an optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable, and it's known for its performance on structured and tabular data tasks.</p><p>You can start uploading and using XGBoost models today.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1KCtc0U75hfGd2cztvKt2H/89eda58cb5e59d749cfe84549fde59af/SRYFZbgQfXB60T7k7-Fj0PAP3cWbivZm6oU_jQFzoDbQQSF4IHkCmfugCHSHLwL4_pfE6-tlYfgHrsNg7Z5z5JLvjlUp8aPOiXPVHqGOT_mfAVgB70rn8_QxvekK.png" />
            
            </figure><p>You can find the updated documentation with these new features and an example on how to use the XGBoost runtime with Constellation in our <a href="https://developers.cloudflare.com/constellation">Developers Documentation</a>.</p>
    <div>
      <h3>An era of globally distributed AI</h3>
      <a href="#an-era-of-globally-distributed-ai">
        
      </a>
    </div>
    <p>Since Cloudflare’s network is globally distributed, Constellation is our first public release of globally distributed machine learning.</p><p>But what does this mean? You may not think of a global network as the place to deploy your machine learning tasks, but machine learning has been a core part of what’s enabled much of Cloudflare’s functionality for many years. And we run it across our <a href="https://www.cloudflare.com/network/">global network in 300 cities</a>.</p><p>Is this large spike in traffic an attack or a Black Friday sale? What’s going to be the best way to route this request based on current traffic patterns? Is this request coming from a human or a bot? Is this HTTP traffic a zero-day? Being able to answer these questions using automated <a href="https://www.cloudflare.com/learning/ai/what-is-machine-learning/">machine learning</a> and <a href="https://www.cloudflare.com/learning/ai/what-is-artificial-intelligence/">AI</a>, rather than human intervention, is one of the things that’s enabled Cloudflare to scale.</p><p>But this is just a small sample of what globally distributed machine learning enables. The reason this was so helpful for us was that we were able to run this machine learning as an integrated part of our stack, which is why we’re now in the process of opening it up to more and more developers with Constellation.</p><p>As Michelle Zatlyn, our co-founder, likes to say, we’re just getting started (in this space) — every day we’re adding hundreds of new users to our <a href="/introducing-constellation/">Constellation beta</a>, testing out and globally deploying new models, and beyond that, deploying new hardware to support the new types of workloads that <a href="https://www.cloudflare.com/learning/ai/what-is-artificial-intelligence/">AI</a> will bring to our global network.</p><p>With that, we wanted to share a few announcements and some use cases that help illustrate why we’re so excited about globally distributed AI. 
And since it’s Speed Week, it should be no surprise that, well, speed is at the crux of it all.</p>
    <div>
      <h3>Custom tailored web experiences, powered by AI</h3>
      <a href="#custom-tailored-web-experiences-powered-by-ai">
        
      </a>
    </div>
    <p>We’ve long known about the importance of performance when it comes to web experiences — in e-commerce, every second of page load time can cause as much as a 7% drop-off in conversion. But being fast is not enough. It’s necessary, but not sufficient. You also have to be accurate.</p><p>Rather than being served one-size-fits-all experiences, users have come to expect that you know what they want before they do.</p><p>So you have to serve personalized experiences, and you have to do it fast. That’s where Constellation comes into play. With Constellation, as part of your e-commerce application that may already be served from Cloudflare’s network through Workers or Pages, or even store data in <a href="https://developers.cloudflare.com/d1/">D1</a>, you can now perform tasks such as categorization (what demographic is this customer most likely in?) and personalization (if you bought this, you may also like that).</p>
    <div>
      <h3>Making devices smarter wherever they are</h3>
      <a href="#making-devices-smarter-wherever-they-are">
        
      </a>
    </div>
    <p>Another use case where performance is critical is interacting with the real world. Imagine a face recognition system that detects whether you’re human or not every time you go into your house. Every second of latency makes a difference (especially if you’re holding heavy groceries).</p><p>Running inference on Cloudflare’s network means that for 95% of the world’s population, compute, and thus a decision, is never more than 50 ms away. This is in huge contrast to centralized compute, where, if you live in Europe but bought a doorbell system from a US-based company, the decision may be hundreds of milliseconds of round trip away.</p><p>You may be thinking, why not just run the compute on the device then?</p><p>For starters, running inference on the device doesn’t guarantee fast performance. Most devices with built-in intelligence run on microcontrollers, often with limited computational abilities (not a high-end GPU or server-grade CPU). Milliseconds become seconds; depending on the volume of workloads you need to process, local inference might not be suitable. The compute that can fit on devices is simply not powerful enough for high-volume complex operations, certainly not for operating at low latency.</p><p>But even user experience aside (some devices don’t interface with a user directly), there are other downsides to running compute directly on devices.</p><p>The first is battery life — the longer the compute, the shorter the battery life. There's always a power-consumption hit, even if you have a custom <a href="https://en.wikipedia.org/wiki/Application-specific_integrated_circuit">ASIC chip</a> or a Tensor Processing Unit (<a href="https://en.wikipedia.org/wiki/Tensor_Processing_Unit">TPU</a>). For consumer products, this means having to switch out your doorbell battery (lest you get locked out). 
For operating fleets of devices at scale (imagine watering devices in a field), this means the cost of keeping up with, and swapping out, batteries.</p><p>Lastly, device hardware, and even software, is harder to update. As new technologies or more efficient chips become available, upgrading fleets of hundreds or thousands of devices is challenging. And while software updates may be easier to manage, they’ll never be as easy as updating on-cloud software, where you can effortlessly ship updates multiple times a day!</p><p>Speaking of shipping software…</p>
    <div>
      <h3>AI applications, easier than ever with Constellation</h3>
      <a href="#ai-applications-easier-than-ever-with-constellation">
        
      </a>
    </div>
    <p>Speed Week is not just about making your applications or devices faster, but also your team!</p><p>For the past six years, our developer platform has been making it easy for developers to ship new code with Cloudflare Workers. With Constellation, it’s now just as easy to add Machine Learning to your existing application, with just a few commands.</p><p>And if you don’t believe us, don’t just take our word for it. We’re now in the process of opening up the beta to more and more customers. To request access, head on over to the Cloudflare Dashboard where you’ll see a new tab for Constellation. We encourage you to check out <a href="https://developers.cloudflare.com/constellation/get-started/first-constellation-worker/">our tutorial</a> for getting started with Constellation — this AI thing may be even easier than you expected it to be!</p>
    <div>
      <h3>We’re just getting started</h3>
      <a href="#were-just-getting-started">
        
      </a>
    </div>
    <p>This is just the beginning of our journey for helping developers build AI driven applications, and we’re already thinking about what’s next.</p><p>We look forward to seeing what you build, and hearing your feedback.</p> ]]></content:encoded>
            <category><![CDATA[Speed Week]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Machine Learning]]></category>
            <category><![CDATA[Product News]]></category>
            <guid isPermaLink="false">38XFu6bOuh2W9Z5jgDbZis</guid>
            <dc:creator>Rita Kozlov</dc:creator>
            <dc:creator>Celso Martinho</dc:creator>
        </item>
        <item>
            <title><![CDATA[Workers Browser Rendering API enters open beta]]></title>
            <link>https://blog.cloudflare.com/browser-rendering-open-beta/</link>
            <pubDate>Fri, 19 May 2023 13:00:32 GMT</pubDate>
            <description><![CDATA[ The Workers Browser Rendering API allows developers to programmatically control and interact with a headless browser instance and create automation flows for their applications and products. Today we enter the open beta and start onboarding our customers. ]]></description>
            <content:encoded><![CDATA[ 
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/48ITDIuGFsDkkeLlfW5wR6/a1a70dec0c91931ba63a9c0a5851ed5e/image1-56.png" />
            
            </figure><p>The Workers Browser Rendering API allows developers to programmatically control and interact with a headless browser instance and create automation flows for their applications and products.</p><p>Since the <a href="/introducing-workers-browser-rendering-api/">private beta announcement</a>, based on the feedback we've been receiving and our own roadmap, the team has been working on the developer experience and improving the platform architecture for the best possible performance and reliability. <b>Today we enter the open beta and will start onboarding customers on the</b> <a href="https://www.cloudflare.com/lp/workers-browser-rendering-api?ref=blog.cloudflare.com"><b>wait list</b></a><b>.</b></p>
    <div>
      <h3>Developer experience</h3>
      <a href="#developer-experience">
        
      </a>
    </div>
    <p>Starting today, <a href="https://developers.cloudflare.com/workers/wrangler/">Wrangler</a>, our command-line tool for configuring, building, and deploying applications with Cloudflare developer products, has support for the Browser Rendering API bindings.</p><p>You can install Wrangler Beta using <a href="https://www.npmjs.com/package/npm">npm</a>:</p>
            <pre><code>npm install wrangler --save-dev</code></pre>
            <p>Bindings allow your Workers to interact with resources on the Cloudflare developer platform. In this case, they will provide your Worker script with an authenticated endpoint to interact with a dedicated Chromium browser instance.</p><p>This is all you need in your <code>wrangler.toml</code> once this service is enabled for your account:</p>
            <pre><code>browser = { binding = "MYBROWSER", type = "browser" }</code></pre>
            <p>Now you can deploy any Worker script that requires Browser Rendering capabilities. You can spawn Chromium instances and interact with them programmatically in any way you would typically do manually in your browser.</p><p>Under the hood, the Browser Rendering API gives you access to a WebSocket endpoint that speaks the <a href="https://chromedevtools.github.io/devtools-protocol/">DevTools Protocol</a>. DevTools is what allows us to instrument a Chromium instance running in our global network, and it's the same protocol that Chrome uses on your computer when you inspect a page.</p>
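<p>At the wire level, "speaking" the DevTools Protocol means exchanging JSON messages, each carrying an id, a method name, and its parameters. A small illustrative sketch of building one such command (libraries like Puppeteer construct and correlate these messages for you):</p>

```javascript
// Build a DevTools Protocol command ready to send over the WebSocket.
// `id` is how the eventual response is matched back to this request.
function devtoolsCommand(id, method, params = {}) {
  return JSON.stringify({ id, method, params });
}

// For example, navigating a page uses the "Page.navigate" method:
const msg = devtoolsCommand(1, "Page.navigate", { url: "https://example.com" });
```
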
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3jPdmcv9Dtzd3yq8R3DFUR/9065566d8deb7780efe494da03cdb1bc/image2-35.png" />
            
            </figure><p>With enough dedication, you can, in fact, implement your own DevTools client and talk the protocol directly. But that'd be crazy; almost no one does that.</p><p>So…</p>
    <div>
      <h3>Puppeteer</h3>
      <a href="#puppeteer">
        
      </a>
    </div>
    <p><a href="https://pptr.dev/">Puppeteer</a> is one of the most popular libraries that abstract the lower-level DevTools protocol from developers and provide a high-level API that you can use to easily instrument Chrome/Chromium and automate browsing sessions. It's widely used for things like creating screenshots, crawling pages, and testing web applications.</p><p>Puppeteer typically <a href="https://pptr.dev/api/puppeteer.puppeteer.connect">connects</a> to a local Chrome or Chromium browser using the DevTools port.</p><p>We forked a version of Puppeteer and patched it to connect to the Workers Browser Rendering API instead. The <a href="https://github.com/cloudflare/puppeteer/blob/main/src/puppeteer-core.ts">changes</a> are minimal; after connecting, developers can use the full Puppeteer API as they would on a standard setup.</p><p>Our version is <a href="https://github.com/cloudflare/puppeteer">open sourced here</a>, and the npm package can be installed from <a href="https://www.npmjs.com/">npmjs</a> as <a href="https://www.npmjs.com/package/@cloudflare/puppeteer">@cloudflare/puppeteer</a>. Using it from a Worker is as easy as:</p>
            <pre><code>import puppeteer from "@cloudflare/puppeteer";</code></pre>
            <p>And then all it takes to launch a browser from your script is:</p>
            <pre><code>const browser = await puppeteer.launch(env.MYBROWSER);</code></pre>
            <p>In the long term, we will keep updating our fork of Puppeteer to match the Chromium versions running in our network.</p>
    <div>
      <h3>Developer documentation</h3>
      <a href="#developer-documentation">
        
      </a>
    </div>
    <p>Following the tradition of our other developer products, we created a dedicated section for the Browser Rendering API in our <a href="https://developers.cloudflare.com/browser-rendering">Developer Documentation site</a>.</p><p>There you can learn more about how the service works, Wrangler support, APIs, and limits, and find starter templates for common applications.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4xCzHB3baMJCw7CRov6P1P/33414eea87ea7b9e226b4dcc62395dcb/download-16.png" />
            
            </figure>
    <div>
      <h3>An example application: taking screenshots</h3>
      <a href="#an-example-application-taking-screenshots">
        
      </a>
    </div>
    <p>Taking screenshots of web pages is one of the most common use cases for browser automation.</p><p>Let's create a Worker that uses the Browser Rendering API to do just that. It's a perfect example of how to set everything up and get an application running in minutes; it will give you a good overview of the steps involved and the basics of the Puppeteer API, and from here you can move on to more sophisticated use cases.</p><p>Step one: start a project and install Wrangler and Cloudflare’s fork of Puppeteer:</p>
            <pre><code>npm init -f
npm install wrangler --save-dev
npm install @cloudflare/puppeteer --save-dev</code></pre>
            <p>Step two, let’s create the simplest possible wrangler.toml configuration file with the Browser Rendering API binding:</p>
            <pre><code>name = "browser-worker"
main = "src/index.ts"
compatibility_date = "2023-03-14"
node_compat = true
workers_dev = true

browser = { binding = "MYBROWSER", type = "browser" }</code></pre>
            <p>Step three, create src/index.ts with your Worker code:</p>
            <pre><code>import puppeteer from "@cloudflare/puppeteer";

export default {
    async fetch(request: Request, env: Env): Promise&lt;Response&gt; {
        const { searchParams } = new URL(request.url);
        let url = searchParams.get("url");
        let img: Buffer;
        if (url) {
            const browser = await puppeteer.launch(env.MYBROWSER);
            const page = await browser.newPage();
            await page.goto(url);
            img = (await page.screenshot()) as Buffer;
            await browser.close();
            return new Response(img, {
                headers: {
                    "content-type": "image/jpeg",
                },
            });
        } else {
            return new Response(
                "Please add the ?url=https://example.com/ parameter"
            );
        }
    },
};</code></pre>
            <p>That's it; no more steps. This Worker instantiates a browser using Puppeteer, opens a new page, navigates to whatever you put in the "url" parameter, takes a screenshot of the page, closes the browser, and responds with the JPEG image of the screenshot. It can't get any easier to get started with the Browser Rendering API.</p><p>Run <code>npx wrangler dev --remote</code> to test it and <code>npx wrangler publish</code> when you’re done.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4SBhHt9GtxSXSEERkrnwSl/d159f82e6319be6d7a37c34709059067/image4-21.png" />
            
            </figure><p>You can explore the <a href="https://github.com/cloudflare/puppeteer/blob/main/docs/api/index.md">entire Puppeteer API</a> and implement other functionality and logic from here. And, because it's Workers, you can combine it with other <a href="https://developers.cloudflare.com/">developer products</a>. You might need a <a href="https://developers.cloudflare.com/d1/">relational database</a>, a <a href="https://developers.cloudflare.com/workers/runtime-apis/kv/#kv">KV store</a> to cache your screenshots, an <a href="https://developers.cloudflare.com/r2/">R2 bucket</a> to archive your crawled pages and assets, a <a href="https://developers.cloudflare.com/workers/runtime-apis/durable-objects/#durable-objects">Durable Object</a> to keep your browser instance alive and share it across multiple requests, or <a href="https://developers.cloudflare.com/queues/">queues</a> to handle your jobs asynchronously. We have all of this and <a href="https://developers.cloudflare.com/">more</a>.</p><p>You can also find this and other examples of how to use Browser Rendering in the <a href="https://developers.cloudflare.com/browser-rendering">Developer Documentation</a>.</p>
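            <p>The KV-caching idea, for instance, can be sketched in a few lines. Everything here that isn't in the example above is an assumption: the <code>SCREENSHOTS</code> binding name, the one-hour TTL, and the <code>takeScreenshot</code> helper are hypothetical.</p>
            <pre><code>// Derive a stable cache key from the requested URL.
function cacheKey(url: string): string {
  const u = new URL(url);
  u.hash = ""; // fragments don't affect the rendered page
  return "screenshot:" + u.toString();
}

export default {
  async fetch(request: any, env: any) {
    const url = new URL(request.url).searchParams.get("url");
    if (!url) return new Response("Please add the ?url= parameter", { status: 400 });

    const key = cacheKey(url);
    const cached = await env.SCREENSHOTS.get(key, "arrayBuffer"); // KV read
    if (cached) {
      return new Response(cached, { headers: { "content-type": "image/jpeg" } });
    }

    // On a miss, render with Puppeteer exactly as in the example above,
    // then cache the bytes for an hour (hypothetical helper):
    //   const img = await takeScreenshot(env.MYBROWSER, url);
    //   await env.SCREENSHOTS.put(key, img, { expirationTtl: 3600 });
    //   return new Response(img, { headers: { "content-type": "image/jpeg" } });
    return new Response("rendering elided in this sketch", { status: 501 });
  },
};</code></pre>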
    <div>
      <h3>How do we use Browser Rendering</h3>
      <a href="#how-do-we-use-browser-rendering">
        
      </a>
    </div>
    <p>Dogfooding our products is one of the best ways to test and improve them, and in some cases, our internal needs dictate or influence our roadmap. Workers Browser Rendering is a good example of that: it was born out of our own needs before we realized it could be a product. We've been using it extensively for things like taking screenshots of pages for social sharing or dashboards, testing web software in CI, and gathering page-load performance metrics for our applications.</p><p>But there's one product we've been using to stress test and push the limits of the Browser Rendering API and drive the engineering sprints that brought us to open the beta to our customers today: the <a href="https://radar.cloudflare.com/scan">Cloudflare Radar URL Scanner</a>.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/wv4BHcjI8iY4ZkhNJSGEK/d824748dab36a93bf8393820f0045a89/image3-22.png" />
            
            </figure><p>The URL Scanner scans any URL and compiles a full report containing technical, performance, privacy, and security details about that page. It currently processes thousands of scans per day. It was built on top of Workers and uses a combination of the Browser Rendering APIs with Puppeteer to create enriched <a href="https://en.wikipedia.org/wiki/HAR_(file_format)">HAR archives</a> and page screenshots, Durable Objects to reuse browser instances, Queues to handle customers' load and execute jobs asynchronously, and <a href="https://www.cloudflare.com/developer-platform/r2/">R2</a> to store the final reports.</p><p>This tool will soon get its own "how we built it" blog post. Still, we wanted to mention it now because it is a good example of how you can build sophisticated applications using the Browser Rendering APIs at scale, starting today.</p>
    <div>
      <h3>Future plans</h3>
      <a href="#future-plans">
        
      </a>
    </div>
    <p>The team will keep improving the Browser Rendering API, but a few things are worth mentioning today.</p><p>First, we are looking into upstreaming the changes in our Puppeteer fork to the main project so that using the official library with the Cloudflare Workers Browser Rendering API becomes as easy as setting a configuration option.</p><p>Second, one of the reasons we decided to expose the raw <a href="https://chromedevtools.github.io/devtools-protocol/">DevTools</a> protocol in the Worker binding is so that it can support other browser automation libraries in the future. <a href="https://playwright.dev/docs/api/class-playwright">Playwright</a> is a good example of another popular library that developers want to use.</p><p>And last, we are also keeping an eye on and testing <a href="https://w3c.github.io/webdriver-bidi/">WebDriver BiDi</a>, a "new standard browser automation protocol that bridges the gap between the WebDriver Classic and CDP (DevTools) protocols." You can read more about the <a href="https://developer.chrome.com/blog/webdriver-bidi-2023/">status of WebDriver BiDi</a>.</p>
    <div>
      <h3>Final words</h3>
      <a href="#final-words">
        
      </a>
    </div>
    <p>The Workers Browser Rendering API enters open beta today. We will gradually enable customers on the <a href="https://www.cloudflare.com/en-gb/lp/workers-browser-rendering-api/?ref=blog.cloudflare.com">waitlist</a> in batches and notify them by email. We look forward to seeing what you will build with it and want to hear from you.</p><p>As usual, you can talk to us on our <a href="https://discord.cloudflare.com/">Developers Discord</a> or the <a href="https://community.cloudflare.com/">Community forum</a>; the team will be listening.</p>
    <div>
      <h3>Watch on Cloudflare TV</h3>
      <a href="#watch-on-cloudflare-tv">
        
      </a>
    </div>
    <div></div> ]]></content:encoded>
            <category><![CDATA[Developer Week]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[API]]></category>
            <category><![CDATA[Product News]]></category>
            <guid isPermaLink="false">49dZCDezJuyDIP4HFOOCS3</guid>
            <dc:creator>Celso Martinho</dc:creator>
            <dc:creator>Joshua Claeys</dc:creator>
        </item>
        <item>
            <title><![CDATA[Introducing Constellation, bringing AI to the Cloudflare stack]]></title>
            <link>https://blog.cloudflare.com/introducing-constellation/</link>
            <pubDate>Mon, 15 May 2023 13:05:00 GMT</pubDate>
            <description><![CDATA[ Today, we're excited to welcome Constellation to the Cloudflare stack. Constellation allows you to run fast, low-latency inference tasks on pre-trained machine learning models natively on Workers ]]></description>
            <content:encoded><![CDATA[ <p></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3R4RgKJy7bJgVYTAAfaVHG/f3f2ef323f76c75d0da732f003a25863/image4-6.png" />
            
            </figure><p>The Cloudflare Workers ecosystem now features products and features ranging from compute and hosting to storage, databases, streaming, networking, security, and <a href="https://developers.cloudflare.com/">much more</a>. Over time, we've been trying to inspire others to switch from traditional software architectures, <a href="/welcome-to-wildebeest-the-fediverse-on-cloudflare/">proving</a> and <a href="/technology-behind-radar2/">documenting</a> how it's possible to build complex applications that scale globally on top of our stack.</p><p>Today, we're excited to welcome Constellation to the Cloudflare stack, enabling developers to run pre-trained machine learning models and inference tasks on Cloudflare's network.</p>
    <div>
      <h2>One more building block in our Supercloud</h2>
      <a href="#one-more-building-block-in-our-supercloud">
        
      </a>
    </div>
    <p><a href="https://www.cloudflare.com/learning/ai/what-is-machine-learning/">Machine learning</a> and <a href="https://www.cloudflare.com/learning/ai/what-is-artificial-intelligence/">AI</a> have been hot topics lately, but the reality is that we have been using these technologies in our daily lives for years now, even if we don't always realize it. Our mobile phones, computers, cars, and home assistants, to name a few examples, all use AI. It's everywhere.</p><p>It isn't a commodity for developers yet, though. They often need to understand the mathematics behind it, the software and tools are dispersed and complex, and the hardware or cloud services needed to run the frameworks and data are expensive.</p><p><b>Today we're introducing another feature to our stack, allowing everyone to run machine learning models and perform inference on top of Cloudflare Workers.</b></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6xTzpRtAzAndPm92lizbSW/8915b8578353576900208e752426c8dc/image2-8.png" />
            
            </figure>
    <div>
      <h2>Introducing Constellation</h2>
      <a href="#introducing-constellation">
        
      </a>
    </div>
    <p>Constellation allows you to run fast, low-latency inference tasks using pre-trained machine learning models natively with Cloudflare Workers scripts.</p><p>Some examples of applications that you can deploy leveraging Constellation are:</p><ul><li><p>Image or audio classification, or object detection</p></li><li><p>Anomaly detection in data</p></li><li><p>Text translation, summarization, or similarity analysis</p></li><li><p>Natural language processing</p></li><li><p>Sentiment analysis</p></li><li><p>Speech recognition or text-to-speech</p></li><li><p>Question answering</p></li></ul><p>Developers can upload any supported model to Constellation. They can train models independently or download pre-trained models from machine learning hubs like <a href="https://huggingface.co/models?library=onnx&amp;sort=downloads">HuggingFace</a> or the <a href="https://github.com/onnx/models">ONNX Zoo</a>.</p><p>However, not everyone will want to train models or browse the Internet for models they haven't tested yet. For that reason, Cloudflare will also maintain a catalog of verified, ready-to-use models.</p><p>We built Constellation with a great developer experience and simple-to-use APIs in mind. Here's an example to get you started.</p>
    <div>
      <h2>Image classification application</h2>
      <a href="#image-classification-application">
        
      </a>
    </div>
    <p>In this example, we will build an image classification app powered by the Constellation inference API and the <a href="https://github.com/onnx/models/blob/main/vision/classification/squeezenet/README.md">SqueezeNet</a> model, a convolutional neural network (CNN) that was pre-trained on more than one million images from the open-source <a href="https://www.image-net.org/">ImageNet</a> database and can classify images into 1,000 object categories.</p><p>SqueezeNet compares to <a href="https://en.wikipedia.org/wiki/AlexNet">AlexNet</a>, one of the original CNNs and a benchmark for image classification, by being much faster (~3x) and much smaller (~500x) while achieving similar levels of accuracy. Its small footprint makes it ideal for running on portable devices with limited resources or on custom hardware.</p><p>First, let's create a new Constellation project using the ONNX runtime. <a href="https://developers.cloudflare.com/workers/wrangler/">Wrangler</a> now has built-in functionality for Constellation with the <code>constellation</code> keyword.</p>
            <pre><code>$ npx wrangler constellation project create "image-classifier" ONNX</code></pre>
            <p>Now let’s create the <code>wrangler.toml</code> configuration file with the project binding:</p>
            <pre><code># Top-level configuration
name = "image-classifier-worker"
main = "src/index.ts"
compatibility_date = "2022-07-12"

constellation = [
    {
      binding = 'CLASSIFIER',
      project_id = '2193053a-af0a-40a6-b757-00fa73908ef6'
    },
]</code></pre>
            <p>Installing the Constellation client API library:</p>
            <pre><code>$ npm install @cloudflare/constellation --save-dev</code></pre>
            <p>Upload the pre-trained SqueezeNet 1.1 ONNX model to the project.</p>
            <pre><code>$ wget https://github.com/microsoft/onnxjs-demo/raw/master/docs/squeezenet1_1.onnx
$ npx wrangler constellation model upload "image-classifier" "squeezenet11" squeezenet1_1.onnx</code></pre>
            <p>As we said above, SqueezeNet classifies images into 1,000 object classes. These classes take the form of a list of synonym rings, or synsets. A <a href="http://wordnet-rdf.princeton.edu/pwn30/01440764-n">synset</a> has an id and a label; the ids derive from Princeton's <a href="https://wordnet.princeton.edu/">WordNet</a> database <a href="https://wordnet.princeton.edu/documentation/">terminology</a>, the same terminology used to label the <a href="https://www.image-net.org/about.php">ImageNet</a> image database.</p><p>To translate SqueezeNet's results into human-readable image classes, we need a file that maps the synset ids (what we get from the model) to their corresponding labels.</p>
            <pre><code>$ mkdir src; cd src
$ wget https://raw.githubusercontent.com/microsoft/onnxjs-demo/master/src/data/imagenet.ts</code></pre>
            <p>And finally, let’s code and deploy our image classification script:</p>
            <pre><code>import { imagenetClasses } from './imagenet';
import { Tensor, run } from '@cloudflare/constellation';

export interface Env {
    CLASSIFIER: any,
}

export default {
    async fetch(request: Request, env: Env, ctx: ExecutionContext) {
        const formData = await request.formData();
        const file = formData.get("file");
        const data = await file.arrayBuffer();
        const result = await processImage(env, data);
        return new Response(JSON.stringify(result));
    },
};

async function processImage(env: Env, data: ArrayBuffer) {
    const input = await decodeImage(data)

    const tensorInput = new Tensor("float32", [1, 3, 224, 224], input)

    const output = await run(env.CLASSIFIER, "MODEL-UUID", tensorInput);

    const probs = output.squeezenet0_flatten0_reshape0.value
    const softmaxResult = softmax(probs)
    const results = imagenetClasses(softmaxResult, 5);
    const topResult = results[0];
    return topResult
}</code></pre>
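            <p>The script above calls a <code>softmax()</code> helper that isn't shown. A minimal implementation might look like this (a sketch; the tutorial's actual helper may differ):</p>
            <pre><code>// Convert raw model logits into probabilities that sum to 1.
function softmax(logits: number[]): number[] {
  // Subtract the max logit before exponentiating, for numerical stability.
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}</code></pre>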
            <p>This script reads an image from the request, decodes it into a multidimensional float32 tensor (right now we only decode PNGs, but we can add other formats), feeds it to the SqueezeNet model running in Constellation, gets the results, matches them with the ImageNet classes list, and returns the human-readable tags for the image.</p><p>Pretty simple, no? Let’s test it:</p>
            <pre><code>$ curl https://ai.cloudflare.com/demos/image-classifier -F file=@images/mountain.png | jq .name

alp

$ curl https://ai.cloudflare.com/demos/image-classifier -F file=@images/car.png | jq .name

convertible

$ curl https://ai.cloudflare.com/demos/image-classifier -F file=@images/dog.png | jq .name

Ibizan hound</code></pre>
            
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7xWxx9wYlUqkpVYUlOOeYL/ee44e25d82b6297eb048938e18384678/Screenshot-2023-05-15-at-12.55.21.png" />
            
            </figure><p>You can see the probabilities in action here. The model is quite sure about the Alp and the Convertible, but the Ibizan hound has a lower probability. Indeed, the dog in the picture is from another breed.</p><p>This small app demonstrates how easy and fast you can start using machine learning models and Constellation when building applications on top of Workers. Check the full source code <a href="https://developers.cloudflare.com/constellation/get-started/first-constellation-worker/">here</a> and deploy it yourself.</p>
    <div>
      <h2>Transformers</h2>
      <a href="#transformers">
        
      </a>
    </div>
    <p><a href="https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)">Transformers</a> were introduced by Google; they are deep-learning models designed to process sequential input data and are commonly used for natural language processing (NLP), like translations, summarizations, or sentiment analysis, and computer vision (CV) tasks, like image classification.</p><p><a href="https://github.com/xenova/transformers.js">Transformers.js</a> is a popular demo that loads transformer models from HuggingFace and runs them inside your browser using the ONNX Runtime compiled to <a href="https://developers.cloudflare.com/workers/platform/web-assembly/">WebAssembly</a>. We ported this demo to use Constellation APIs instead.</p><p>Here's the link to our version: <a href="https://transformers-js.pages.dev/">https://transformers-js.pages.dev/</a></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/75BazjDxuRzLywP7ZwjeMz/d9ebaebfffd4b4d79954fbb2d77bc6c2/image3-5.png" />
            
            </figure>
    <div>
      <h2>Interoperability with Workers</h2>
      <a href="#interoperability-with-workers">
        
      </a>
    </div>
    <p>The other interesting element of Constellation is that because it runs natively in Workers, you can orchestrate it with other products and APIs in our stack. You can use KV, R2, D1, Queues, anything, even Email.</p><p>Here's an example of a Worker that <a href="https://developers.cloudflare.com/email-routing/email-workers/">receives</a> Emails for your domain on Cloudflare using <a href="https://developers.cloudflare.com/email-routing/">Email Routing</a>, runs Constellation using the <a href="https://huggingface.co/Xenova/t5-small/tree/main/onnx">t5-small</a> sentiment analysis model, adds a header with the resulting score, and forwards it to the destination address.</p>
            <pre><code>import { Tensor, run } from '@cloudflare/constellation';
import * as PostalMime from 'postal-mime';

export interface Env {
    SENTIMENT: any,
}

export default {
  async email(message, env, ctx) {
    const rawEmail = await streamToArrayBuffer(message.raw, message.rawSize);
    const parser = new PostalMime.default();
    const parsedEmail = await parser.parse(rawEmail);

    const input = tokenize(parsedEmail.text)
    const output = await run( env.SENTIMENT, "MODEL-UUID", input);


    const headers = new Headers();
    headers.set("X-Sentiment", idToLabel[output.label]);
    await message.forward("gooddestination@example.com", headers);
  }
}</code></pre>
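            <p>The Worker above depends on helpers that aren't shown here: <code>streamToArrayBuffer</code>, <code>tokenize</code>, and <code>idToLabel</code>. A minimal <code>streamToArrayBuffer</code> sketch, assuming the raw message is exposed as a ReadableStream of Uint8Array chunks:</p>
            <pre><code>// Read a stream of known total size into a single Uint8Array.
async function streamToArrayBuffer(stream: any, streamSize: number) {
  const result = new Uint8Array(streamSize);
  let offset = 0;
  const reader = stream.getReader();
  for (;;) {
    const { done, value } = await reader.read();
    if (done || !value) break;
    result.set(value, offset); // copy this chunk into place
    offset += value.byteLength;
  }
  return result;
}</code></pre>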
            <p>Now you can use Gmail or any email client to apply a rule to your messages based on the 'X-Sentiment' header. For example, you might want to move angry emails out of your Inbox and into a different folder on arrival.</p>
    <div>
      <h2>Start using Constellation</h2>
      <a href="#start-using-constellation">
        
      </a>
    </div>
    <p>Constellation starts today in private beta. To join the waitlist, please head to the dashboard, click the Workers tab under your account, and click the "Request access" button under the <a href="https://dash.cloudflare.com/?to=/:account/workers/constellation">Constellation entry</a>. The team will be onboarding accounts in batches; you'll get an email when your account is enabled.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2Gn2kLFDXOMHjK2x0biPlS/9a1af77c70a2ea36f472f0643e13d396/image1-25.png" />
            
            </figure><p>In the meantime, you can read our <a href="https://developers.cloudflare.com/constellation/">Constellation Developer Documentation</a> and learn more about how it works and the APIs. Constellation can be used from Wrangler, our command-line tool for configuring, building, and deploying applications with Cloudflare developer products, or managed directly in the Dashboard UI.</p><p>We are eager to learn how you want to use ML/AI with your applications. Constellation will keep improving with higher limits, more supported runtimes, and larger models, but we want to hear from you. Your feedback will certainly influence our roadmap decisions.</p><p>One last thing: today, we've been talking about how you can write Workers that use Constellation, but here's an inception fact: Constellation itself was built using the power of WebAssembly, Workers, R2, and our APIs. We'll make sure to write a follow-up blog soon about how we built it; stay tuned.</p><p>As usual, you can talk to us on our <a href="https://discord.cloudflare.com">Developers Discord</a> (join the #constellation channel) or the <a href="https://community.cloudflare.com/c/developers/constellation/97">Community forum</a>; the team will be listening.</p>
    <div>
      <h3>Watch on Cloudflare TV</h3>
      <a href="#watch-on-cloudflare-tv">
        
      </a>
    </div>
    <div></div><p></p> ]]></content:encoded>
            <category><![CDATA[Developer Week]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Product News]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <guid isPermaLink="false">7gPZqgjeqOZXqiC7XdNp8p</guid>
            <dc:creator>Celso Martinho</dc:creator>
            <dc:creator>Rita Kozlov</dc:creator>
        </item>
        <item>
            <title><![CDATA[Query Cloudflare Radar and our docs using ChatGPT plugins]]></title>
            <link>https://blog.cloudflare.com/cloudflare-chatgpt-plugins/</link>
            <pubDate>Mon, 15 May 2023 13:00:32 GMT</pubDate>
            <description><![CDATA[ We’re excited to share two new Cloudflare ChatGPT plugins – the Cloudflare Radar plugin and the Cloudflare Docs plugin ]]></description>
            <content:encoded><![CDATA[ 
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/13OKMGwTZyyLLic6XwPfQd/4a942e67cba851aae4420f25a17f94ad/image7-6.png" />
            
            </figure><p>When OpenAI launched ChatGPT plugins in alpha we knew that it opened the door for new possibilities for both Cloudflare users and developers building on Cloudflare. After the launch, our team quickly went to work seeing what we could build, and today we’re very excited to share with you two new Cloudflare ChatGPT plugins – the Cloudflare Radar plugin and the Cloudflare Docs plugin.</p><p>The Cloudflare Radar plugin allows you to talk to ChatGPT about real-time Internet patterns powered by <a href="https://radar.cloudflare.com/">Cloudflare Radar</a>.</p><p>The Cloudflare Docs plugin allows developers to use ChatGPT to help them write and build Cloudflare applications with the most up-to-date information from our documentation. It also serves as an open source example of how to build a ChatGPT plugin with Cloudflare Workers.</p><p>Let’s do a deeper dive into how each of these plugins work and how we built them.</p>
    <div>
      <h3>Cloudflare Radar ChatGPT plugin</h3>
      <a href="#cloudflare-radar-chatgpt-plugin">
        
      </a>
    </div>
    <p>When ChatGPT introduced <a href="https://openai.com/blog/chatgpt-plugins">plugins</a>, one of the use cases was retrieving real-time data from third-party applications and their APIs and letting users ask relevant questions using natural language.</p><p><a href="https://radar.cloudflare.com/">Cloudflare Radar</a> has lots of data about how people use the Internet, a well-documented <a href="https://developers.cloudflare.com/radar/">public API</a>, an OpenAPI specification, and it’s entirely <a href="/technology-behind-radar2/">built on top of Workers</a>, which gives us lots of flexibility for improvements and extensibility. We had all the building blocks to create a ChatGPT plugin quickly, so that's what we did.</p><p>We added an <a href="https://api.radar.cloudflare.com/.well-known/ai-plugin.json">OpenAI manifest endpoint</a> that describes what the plugin does, some branding assets, and an <a href="https://api.radar.cloudflare.com/.well-known/openai-schema.json">enriched OpenAPI schema</a> to tell ChatGPT how to use our data APIs. The longest part of the work was fine-tuning the schema with good descriptions (written in natural language, obviously) and examples of how to query our endpoints.</p><p>Amusingly, the descriptions ended up much improved by the need to explain the API endpoints to ChatGPT; an interesting side effect is that this benefits human readers, too.</p>
            <pre><code>{
    "/api/v1/http/summary/ip_version": {
        "get": {
            "operationId": "get_SummaryIPVersion",
            "parameters": [
                {
                    "description": "Date range from today minus the number of days or weeks specified in this parameter, if not provided always send 14d in this parameter.",
                    "required": true,
                    "schema": {
                        "type": "string",
                        "example": "14d",
                        "enum": ["14d","1d","2d","7d","28d","12w","24w","52w"]
                    },
                    "name": "dateRange",
                    "in": "query"
                }
            ]
        }
    }
}</code></pre>
            <p>Luckily, <a href="https://github.com/cloudflare/itty-router-openapi">itty-router-openapi</a>, an easy and compact OpenAPI 3 schema generator and validator for Cloudflare Workers that we built and <a href="/technology-behind-radar2/">open-sourced</a> when we launched Radar 2.0, made it really easy for us to add the missing parts.</p>
            <pre><code>import { OpenAPIRouter } from '@cloudflare/itty-router-openapi'

const router = OpenAPIRouter({
  aiPlugin: {
    name_for_human: 'Cloudflare Radar API',
    name_for_model: 'cloudflare_radar',
    description_for_human: "Get data insights from Cloudflare's point of view.",
    description_for_model:
      "Plugin for retrieving the data based on Cloudflare Radar's data. Use it whenever a user asks something that might be related to Internet usage, eg. outages, Internet traffic, or Cloudflare Radar's data in particular.",
    contact_email: 'radar@cloudflare.com',
    legal_info_url: 'https://www.cloudflare.com/website-terms/',
    logo_url: 'https://cdn-icons-png.flaticon.com/512/5969/5969044.png',
  },
})</code></pre>
            <p>We incorporated our changes into itty-router-openapi, and now it <a href="https://github.com/cloudflare/itty-router-openapi#aiplugin">supports</a> the OpenAI manifest and route, plus a few other <a href="https://github.com/cloudflare/itty-router-openapi#openai-plugin-support">options</a> that make it possible for anyone to build their own ChatGPT plugin on top of Workers, too.</p><p>The Cloudflare Radar ChatGPT plugin is available to paid ChatGPT users and anyone on OpenAI’s plugins <a href="https://openai.com/waitlist/plugins">waitlist</a>. To use it, simply open <a href="https://chat.openai.com/">ChatGPT</a>, go to the Plugin store, and install Cloudflare Radar.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3R2uxXouXE7NMwAXwzwuLS/4cf0e59cbe09e0f429a06d74e141a206/image6-6.png" />
            
            </figure><p>Once installed, you can talk to it and ask questions about our data using natural language.</p><p>When you add plugins to your account, ChatGPT will prioritize using their data based on what the language model understands from the human-readable descriptions found in the manifest and OpenAPI schema. If ChatGPT doesn't think your prompt can benefit from what a plugin provides, it falls back to its standard capabilities.</p><p>Another interesting thing about plugins is that they extend ChatGPT's limited knowledge of the world and events after 2021 and can provide fresh insights based on recent data.</p><p>Here are a few examples to get you started:</p><p><b>"What is the percentage distribution of traffic per TLS protocol version?"</b></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1emtbmUbCUo2RhILUplHft/3789e6672bd172c5d6e3d17ce56f283c/download--5--3.png" />
            
            </figure><p><b>"What's the HTTP protocol version distribution in Portugal?"</b></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2ADYGYFmtoTb1MUa3YVdzP/5fb35fa683292e30c87a5bbcaa772a2e/download-8.png" />
            
            </figure><p>Now that ChatGPT has context, you can add some variants, like switching the country and the date range.</p><p><b>“How about the US in the last six months?”</b></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3RKxEE4nzqd5cBMOUfVBlH/14812d97f2fc04aa118bcd401275957d/download--1--5.png" />
            
            </figure><p>You can also combine multiple topics (ChatGPT will make multiple API calls behind the scenes and combine the results in the best possible way).</p><p><b>“How do HTTP protocol versions compare with TLS protocol versions?”</b></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5kwMCXTMv3vQHmqdpFft2E/f0a525b78399b761a80d4aba84bb6096/download--2--4.png" />
            
            </figure><p>Out of ideas? Ask it “<b>What can I ask the Radar plugin?</b>”, or “<b>Give me a random insight</b>”.</p><p>Be creative, too; it understands a lot about our data, and we keep improving it. You can also add date or country filters using natural language in your prompts.</p>
    <div>
      <h3>Cloudflare Docs ChatGPT plugin</h3>
      <a href="#cloudflare-docs-chatgpt-plugin">
        
      </a>
    </div>
<p>The Cloudflare Docs plugin is a <a href="https://openai.com/blog/chatgpt-plugins#retrieval">ChatGPT Retrieval Plugin</a> that lets you access the most up-to-date knowledge from our developer documentation using ChatGPT. This means that if you’re using ChatGPT to assist you with building on Cloudflare, the answers you get and the code it generates will be informed by current best practices and the information in our docs. You can set up and run the Cloudflare Docs ChatGPT Plugin by following the README in <a href="https://github.com/cloudflare/chatgpt-plugin/tree/main/example-retrieval-plugin">the example repo</a>.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/HTxO668NzSVaOOZ525ZtE/976dd480de0d58039a57f98ae2accb09/image1-20.png" />
            
            </figure><p>The plugin was built entirely on Workers and uses <a href="https://developers.cloudflare.com/workers/learning/how-kv-works/">KV</a> as a vector store. It can also keep its index up-to-date using <a href="https://developers.cloudflare.com/workers/platform/triggers/cron-triggers/">Cron Triggers</a>, <a href="https://developers.cloudflare.com/queues/">Queues</a> and <a href="https://developers.cloudflare.com/workers/runtime-apis/durable-objects/">Durable Objects</a>.</p><p>The plugin is a Worker that responds to POST requests from ChatGPT to a <code>/query</code> endpoint. When a query comes in, the Worker converts the query text into an <a href="https://platform.openai.com/docs/guides/embeddings">embedding vector via the OpenAI embeddings API</a> and uses this to find, and return, the most relevant document snippets from Cloudflare’s developer documentation.</p><p>The way this is achieved is by first converting every document in Cloudflare’s developer documentation on GitHub into embedding vectors (again using OpenAI’s API) and storing them in KV. This storage format allows you to find semantically similar content by doing a <a href="https://en.wikipedia.org/wiki/Similarity_search">similarity search</a> (we use <a href="https://en.wikipedia.org/wiki/Cosine_similarity">cosine similarity</a>), where two pieces of text that are similar in meaning will result in the two embedding vectors having a high similarity score. Cloudflare’s entire developer documentation compresses to under 5MB when converted to embedding vectors, so fetching these from KV is very quick. We’ve also explored building larger vector stores on Workers, as can be seen in <a href="https://ai.cloudflare.com/demos/vector-store/">this demo of 1 million vectors stored on Durable Object storage</a>. 
We’ll be releasing more open source libraries to support these vector store use cases in the near future.</p><p>ChatGPT will query the plugin when it believes the user’s question is related to Cloudflare’s developer tools, and the plugin will return a list of up-to-date information snippets directly from our documentation. ChatGPT can then decide how to use these snippets to best answer the user’s question.</p><p>The plugin also includes a “Scheduler” Worker that can periodically refresh the documentation embedding vectors, so that the information is always up-to-date. This is advantageous because ChatGPT’s own knowledge has a cutoff of September 2021 – so it’s not aware of changes in documentation, or new Cloudflare products.</p><p>The Scheduler Worker is triggered by a <a href="https://developers.cloudflare.com/workers/platform/triggers/cron-triggers/">Cron Trigger</a>, on a schedule you can set (e.g. hourly), where it will check, via GitHub’s API, which content has changed since it last ran. It then sends these document paths in messages to a <a href="https://developers.cloudflare.com/queues/">Queue</a> to be processed. Workers will batch-process these messages – for each message, the content is fetched from GitHub, and then turned into embedding vectors via OpenAI’s API. A <a href="https://developers.cloudflare.com/workers/runtime-apis/durable-objects/">Durable Object</a> is used to coordinate all the Queue processing so that when all the batches have finished processing, the resulting embedding vectors can be combined and stored in KV, ready for querying by the plugin.</p><p>This is a great example of how Workers can be used not only for front-facing HTTP APIs, but also for scheduled batch-processing use cases.</p>
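<p>To make the similarity search concrete, here is a minimal sketch of cosine similarity and top-k retrieval over stored embedding vectors. It is illustrative only: the <code>Doc</code> shape and tiny two-dimensional vectors are stand-ins, and real embedding vectors returned by OpenAI’s API have on the order of a thousand dimensions.</p>

```typescript
// Illustrative sketch of the cosine-similarity search described above.
// The Doc shape and 2-D vectors are stand-ins for real embeddings.
type Doc = { path: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank every stored document against the query embedding and keep
// the k most similar ones.
function topK(query: number[], docs: Doc[], k: number): Doc[] {
  return docs
    .map((doc) => ({ doc, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.doc);
}
```

<p>Because cosine similarity depends only on vector direction, identical vectors score 1.0 and orthogonal vectors score 0, regardless of magnitude.</p>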
    <div>
      <h3>Let us know what you think</h3>
      <a href="#let-us-know-what-you-think">
        
      </a>
    </div>
<p>We are in a time when technology is constantly changing and evolving, so as you experiment with these new plugins, please let us know what you think. What do you like? What could be better? Since ChatGPT plugins are in alpha, changes to the plugin user interface or performance (i.e. latency) may occur. If you build your own plugin, we’d love to see it, and if it’s open source you can submit a pull request on our <a href="https://github.com/cloudflare/chatgpt-plugin">example repo</a>. You can always find us hanging out in our <a href="http://discord.cloudflare.com/">developer Discord</a>.</p>
    <div>
      <h3>Watch on Cloudflare TV</h3>
      <a href="#watch-on-cloudflare-tv">
        
      </a>
    </div>
 ]]></content:encoded>
            <category><![CDATA[Developer Week]]></category>
            <category><![CDATA[ChatGPT]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[OpenAI]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <guid isPermaLink="false">1V5D6mOrE8HfIU7fkANorJ</guid>
            <dc:creator>Ricky Robinett</dc:creator>
            <dc:creator>Celso Martinho</dc:creator>
            <dc:creator>Michael Hart</dc:creator>
        </item>
        <item>
            <title><![CDATA[Welcome to Wildebeest: the Fediverse on Cloudflare]]></title>
            <link>https://blog.cloudflare.com/welcome-to-wildebeest-the-fediverse-on-cloudflare/</link>
            <pubDate>Wed, 08 Feb 2023 19:00:00 GMT</pubDate>
            <description><![CDATA[ Today we're announcing Wildebeest, an open-source, easy-to-deploy ActivityPub and Mastodon-compatible server built entirely on top of Cloudflare's Supercloud. ]]></description>
<content:encoded><![CDATA[ 
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5BbRRixkMxcIiNYgdA9go/f2d1e27e932958951271d36ccffa2c16/Wildebeest.png" />
            
            </figure><p><a href="https://en.wikipedia.org/wiki/Fediverse">The Fediverse</a> has been a hot topic of discussion lately, with thousands, if not <a href="https://bitcoinhackers.org/@mastodonusercount">millions</a>, of new users creating accounts on platforms like <a href="https://joinmastodon.org/">Mastodon</a> to either move entirely to "the other side" or experiment and learn about this new social network.</p><p>Today we're introducing <a href="https://github.com/cloudflare/wildebeest">Wildebeest</a>, an open-source, easy-to-deploy ActivityPub and Mastodon-compatible server built entirely on top of Cloudflare's Supercloud. If you want to run your own spot in the Fediverse you can now do it entirely on Cloudflare.</p>
    <div>
      <h2>The Fediverse, built on Cloudflare</h2>
      <a href="#the-fediverse-built-on-cloudflare">
        
      </a>
    </div>
<p>Today you're left with two options if you want to join the Mastodon federated network: either you join one of the <a href="https://joinmastodon.org/servers">existing servers</a> (servers are also called communities, and each one has its own infrastructure and rules), or you can run your own self-hosted server.</p><p>There are a few reasons why you'd want to run your own server:</p><ul><li><p>You want to create a new community and attract other users over a common theme and usage rules.</p></li><li><p>You don't want to have to trust third-party servers or abide by their policies and want your server, under your domain, for your personal account.</p></li><li><p>You want complete control over your data, personal information, and content and visibility over what happens with your instance.</p></li></ul><p>The Mastodon gGmbH non-profit organization provides a server implementation using Ruby, Node.js, PostgreSQL and Redis. Running the <a href="https://github.com/mastodon/mastodon">official server</a> can be challenging, though. You need to own or rent a server or VPS somewhere; you have to install and configure the software, set up the database and public-facing web server, and configure and protect your network against attacks or abuse. And then you have to maintain all of that and deal with constant updates. It's a lot of scripting and technical work before you can get it up and running; definitely not something for less technical enthusiasts.</p><p>Wildebeest serves two purposes: you can quickly deploy your Mastodon-compatible server on top of Cloudflare and connect it to the Fediverse in minutes, and you don't need to worry about maintaining or protecting it from abuse or attacks; Cloudflare will do it for you automatically.</p><p>Wildebeest is not a managed service. It's your instance, data, and code running in our cloud under your Cloudflare account. 
Furthermore, it's <a href="https://github.com/cloudflare/wildebeest">open-sourced</a>, which means it keeps evolving with more features, and anyone can <a href="https://github.com/cloudflare/wildebeest/pulls">extend</a> and improve it.</p><p>Here's what we support today:</p><ul><li><p><a href="https://www.w3.org/TR/activitypub/">ActivityPub</a>, <a href="https://www.rfc-editor.org/rfc/rfc7033">WebFinger</a>, <a href="https://github.com/cloudflare/wildebeest/tree/main/functions/nodeinfo">NodeInfo</a>, <a href="https://datatracker.ietf.org/doc/html/rfc8030">WebPush</a> and <a href="https://docs.joinmastodon.org/api/">Mastodon-compatible</a> APIs. Wildebeest can connect to or receive connections from other Fediverse servers.</p></li><li><p>Compatible with the most popular Mastodon <a href="https://github.com/nolanlawson/pinafore">web</a> (like <a href="https://github.com/nolanlawson/pinafore">Pinafore</a>), desktop, and <a href="https://joinmastodon.org/apps">mobile clients</a>. We also provide a simple read-only web interface to explore the timelines and user profiles.</p></li><li><p>You can publish, edit, boost, or delete posts, sorry, toots. We support text, images, and (soon) video.</p></li><li><p>Anyone can follow you; you can follow anyone.</p></li><li><p>You can search for content.</p></li><li><p>You can register one or multiple accounts under your instance. Authentication can be email-based or use any Cloudflare Access-compatible IdP, like GitHub or Google.</p></li><li><p>You can edit your profile information, avatar, and header image.</p></li></ul>
    <div>
      <h2>How we built it</h2>
      <a href="#how-we-built-it">
        
      </a>
    </div>
<p>Our implementation is built entirely on top of our <a href="https://www.cloudflare.com/cloudflare-product-portfolio/">products</a> and <a href="https://developers.cloudflare.com/">APIs</a>. Building Wildebeest was another excellent opportunity to showcase the power and versatility of our technology stack, and to show how anyone can use Cloudflare to build larger applications that involve multiple systems and complex requirements.</p><p>Here's a bird's-eye diagram of Wildebeest's architecture:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/33R5UHXYSBDBUsoFLMkoC8/0304880c93af0a41d168616da4c73b90/Screenshot-2023-02-08-at-10.58.01-AM.png" />
            
            </figure><p>Let's get into the details and get technical now.</p>
    <div>
      <h3>Cloudflare Pages</h3>
      <a href="#cloudflare-pages">
        
      </a>
    </div>
<p>At the core, Wildebeest is a <a href="https://pages.cloudflare.com/">Cloudflare Pages</a> project running its code using <a href="https://developers.cloudflare.com/pages/platform/functions/">Pages Functions</a>. Cloudflare Pages provides an excellent foundation for building and deploying your application and serving your bundled assets, while Functions gives you full access to the Workers ecosystem, where you can run any code.</p><p>Functions has a built-in <a href="https://developers.cloudflare.com/pages/platform/functions/routing/">file-based router</a>. The <a href="https://github.com/cloudflare/wildebeest/tree/main/functions">/functions</a> directory structure, which is uploaded by Wildebeest’s continuous deployment builds, defines your application routes and what files and code will process each HTTP endpoint request. This routing technique is similar to what other frameworks like Next.js <a href="https://nextjs.org/docs/routing/introduction">use</a>.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5NsNlFYtyqKVzhFFBuGeRW/767c9b102b9d97ad067c343df387c5db/2b.png" />
            
            </figure><p>For example, Mastodon’s <a href="https://docs.joinmastodon.org/methods/timelines/#public">/api/v1/timelines/public</a> API endpoint is handled by <a href="https://github.com/cloudflare/wildebeest/blob/main/functions/api/v1/timelines/public.ts">/functions/api/v1/timelines/public.ts</a> with the onRequest method.</p>
<pre><code>export const onRequest = async ({ request, env }) =&gt; {
	const { searchParams } = new URL(request.url)
	const domain = new URL(request.url).hostname
...
	return handleRequest(domain, env.DATABASE, {})
}

export async function handleRequest(
    …
): Promise&lt;Response&gt; {
    …
}
</code></pre>
            <p>Unit testing these endpoints becomes easier too, since we only have to call the handleRequest() function from the testing framework. Check one of our <a href="https://jestjs.io/">Jest</a> tests, <a href="https://github.com/cloudflare/wildebeest/blob/main/backend/test/mastodon.spec.ts">mastodon.spec.ts</a>:</p>
<pre><code>import * as assert from 'node:assert'
import * as v1_instance from 'wildebeest/functions/api/v1/instance'
// domain, env and assertCORS are defined elsewhere in the test suite

describe('Mastodon APIs', () =&gt; {
	describe('instance', () =&gt; {
		test('return the instance infos v1', async () =&gt; {
			const res = await v1_instance.handleRequest(domain, env)
			assert.equal(res.status, 200)
			assertCORS(res)

			const data = await res.json&lt;Data&gt;()
			assert.equal(data.rules.length, 0)
			assert(data.version.includes('Wildebeest'))
		})
       })
})
</code></pre>
            <p>As with any other regular Worker, Functions also lets you set up <a href="https://developers.cloudflare.com/pages/platform/functions/bindings/">bindings</a> to interact with other Cloudflare products and features like <a href="https://developers.cloudflare.com/workers/runtime-apis/kv/">KV</a>, <a href="https://developers.cloudflare.com/r2/data-access/workers-api/workers-api-reference/">R2</a>, <a href="https://developers.cloudflare.com/d1/">D1</a>, <a href="https://developers.cloudflare.com/workers/runtime-apis/durable-objects/">Durable Objects</a>, and more. The list keeps growing.</p><p>We use Functions to implement a large portion of the official <a href="https://docs.joinmastodon.org/api/">Mastodon API</a> specification, making Wildebeest compatible with the existing ecosystem of other servers and client applications, and also to run our own read-only web frontend under the same project codebase.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/Wz8EZKQyMvyEfDvH7cOV9/02183c976fe7c619c2fc4f8e99795463/3b.png" />
            
</figure><p>Wildebeest’s web frontend uses <a href="https://qwik.builder.io/">Qwik</a>, a general-purpose web framework that is optimized for speed, uses modern concepts like the JSX JavaScript syntax extension, and supports server-side rendering (SSR) and static site generation (SSG).</p><p>Qwik provides a <a href="https://qwik.builder.io/integrations/deployments/cloudflare-pages/">Cloudflare Pages Adaptor</a> out of the box, so we use that (check our <a href="https://developers.cloudflare.com/pages/framework-guides/deploy-a-qwik-site/">framework guide</a> to know more about how to deploy a Qwik site on Cloudflare Pages). For styling we use the <a href="https://tailwindcss.com/">Tailwind CSS</a> framework, which Qwik supports natively.</p><p>Our frontend website code and static assets can be found under the <a href="https://github.com/cloudflare/wildebeest/tree/main/frontend">/frontend</a> directory. The application is handled by the <a href="https://github.com/cloudflare/wildebeest/blob/main/functions/%5B%5Bpath%5D%5D.ts">/functions/[[path]].ts</a> dynamic route, which basically catches all the non-API requests, and then <a href="https://github.com/cloudflare/wildebeest/blob/main/frontend/src/entry.cloudflare-pages.tsx">invokes</a> Qwik’s own internal router, <a href="https://qwik.builder.io/qwikcity/routing/overview/">Qwik City</a>, which takes over everything else after that.</p><p>The power and versatility of Pages and Functions routes make it possible to run both the backend APIs and a server-side-rendered dynamic client, effectively a full-stack app, under the same project.</p><p>Let's dig even deeper now, and understand how the server interacts with the other components in our architecture.</p>
    <div>
      <h3>D1</h3>
      <a href="#d1">
        
      </a>
    </div>
    <p>Wildebeest uses <a href="https://developers.cloudflare.com/d1/">D1</a>, <a href="https://www.cloudflare.com/developer-platform/products/d1/">Cloudflare’s first SQL database</a> for the Workers platform built on top of SQLite, now open to everyone in <a href="/d1-open-alpha/">alpha</a>, to store and query data. Here’s our schema:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/67Xq9kbn6qh2XgRveGSdHt/3a020d6c71a89f0020b8fb7e87433601/4b.png" />
            
</figure><p>The schema will probably change in the future, as we add more features. That’s fine: D1 supports <a href="https://developers.cloudflare.com/d1/platform/migrations/">migrations</a>, which are great when you need to update your database schema without losing your data. With each new Wildebeest version, we can create a <a href="https://github.com/cloudflare/wildebeest/blob/main/migrations/0001_add-unique-following.sql">new migration</a> file if the release requires database schema changes.</p>
            <pre><code>-- Migration number: 0001 	 2023-01-16T13:09:04.033Z

CREATE UNIQUE INDEX unique_actor_following ON actor_following (actor_id, target_actor_id);
</code></pre>
            <p>D1 exposes a powerful <a href="https://developers.cloudflare.com/d1/platform/client-api/">client API</a> that developers can use to manipulate and query data from Worker scripts, or in our case, Pages Functions.</p><p>Here’s a simplified example of how we interact with D1 when you start following someone on the Fediverse:</p>
<pre><code>export async function addFollowing(db, actor, target, targetAcct): Promise&lt;UUID&gt; {
	const id = crypto.randomUUID()
	const query = `INSERT OR IGNORE INTO actor_following (id, actor_id, target_actor_id, state, target_actor_acct) VALUES (?, ?, ?, ?, ?)`
	const out = await db
		.prepare(query)
		.bind(id, actor.id.toString(), target.id.toString(), STATE_PENDING, targetAcct)
		.run()
	return id
}
</code></pre>
<p>Cloudflare’s culture of dogfooding and building on top of our own products means that we sometimes experience their shortcomings before our users. We did face a few challenges using D1, which is built on SQLite, to store our data. Here are two examples.</p><p><a href="https://www.w3.org/TR/activitypub/">ActivityPub</a> uses <a href="https://www.rfc-editor.org/rfc/rfc4122.txt">UUIDs</a> to identify objects and reference them in URIs extensively. These objects need to be stored in the database. Other databases like PostgreSQL provide built-in functions to <a href="https://www.postgresql.org/docs/current/functions-uuid.html">generate unique identifiers</a>. SQLite and D1 don't have that yet; it’s on our roadmap.</p><p>Worry not, though: the Workers runtime supports <a href="https://developers.cloudflare.com/workers/runtime-apis/web-crypto/">Web Crypto</a>, so we use crypto.randomUUID() to get our unique identifiers. Check the <a href="https://github.com/cloudflare/wildebeest/blob/main/backend/src/activitypub/actors/inbox.ts">/backend/src/activitypub/actors/inbox.ts</a>:</p>
            <pre><code>export async function addObjectInInbox(db, actor, obj) {
	const id = crypto.randomUUID()
	const out = await db
		.prepare('INSERT INTO inbox_objects(id, actor_id, object_id) VALUES(?, ?, ?)')
		.bind(id, actor.id.toString(), obj.id.toString())
		.run()
}</code></pre>
            <p>Problem solved.</p><p>The other example is that we need to store dates with sub-second resolution. Again, databases like PostgreSQL have that:</p>
            <pre><code>psql&gt; select now();
2023-02-01 11:45:17.425563+00</code></pre>
            <p>However SQLite falls short with:</p>
            <pre><code>sqlite&gt; select datetime();
2023-02-01 11:44:02</code></pre>
            <p>We worked around this problem with a small hack using <a href="https://www.sqlite.org/lang_datefunc.html">strftime()</a>:</p>
            <pre><code>sqlite&gt; select strftime('%Y-%m-%d %H:%M:%f', 'NOW');
2023-02-01 11:49:35.624</code></pre>
            <p>See our <a href="https://github.com/cloudflare/wildebeest/blob/main/migrations/0000_initial.sql">initial SQL schema</a>, look for the <i>cdate</i> defaults.</p>
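<p>For illustration, the same sub-second format can also be produced in a Worker with plain JavaScript when you need to generate the timestamp yourself. This is a hypothetical helper, not part of the Wildebeest codebase:</p>

```typescript
// Hypothetical helper: format a Date to match SQLite's
// strftime('%Y-%m-%d %H:%M:%f') output, i.e. "YYYY-MM-DD HH:MM:SS.SSS" in UTC.
function sqliteTimestamp(date: Date): string {
  // toISOString() yields e.g. "2023-02-01T11:49:35.624Z"; swap the "T"
  // for a space and drop the trailing "Z".
  return date.toISOString().replace('T', ' ').replace('Z', '');
}
```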
    <div>
      <h3>Images</h3>
      <a href="#images">
        
      </a>
    </div>
    <p>Mastodon content has a lot of rich media. We don't need to reinvent the wheel and build an image pipeline; Cloudflare Images <a href="https://developers.cloudflare.com/images/">provides APIs</a> to upload, transform, and serve optimized images from our global CDN, so it's the perfect fit for Wildebeest's requirements.</p><p>Things like posting content images, the profile avatar, or headers, all use the Images APIs. See <a href="https://github.com/cloudflare/wildebeest/blob/main/backend/src/media/image.ts">/backend/src/media/image.ts</a> to understand how we interface with Images.</p>
            <pre><code>async function upload(file: File, config: Config): Promise&lt;UploadResult&gt; {
	const formData = new FormData()
	const url = `https://api.cloudflare.com/client/v4/accounts/${config.accountId}/images/v1`

	formData.set('file', file)

	const res = await fetch(url, {
		method: 'POST',
		body: formData,
		headers: {
			authorization: 'Bearer ' + config.apiToken,
		},
	})

	const data = await res.json()
	return data.result
}</code></pre>
            <p>If you're curious about Images for your next project, here's a tutorial on <a href="https://developers.cloudflare.com/images/cloudflare-images/tutorials/integrate-cloudflare-images/">how to integrate Cloudflare Images</a> on your website.</p><p>Cloudflare Images is also available from the dashboard. You can use it to browse or manage your catalog quickly.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1A4gwBFdbSGDvS4DAJRyhR/95849178c4b10c82d5f619ffc1153ba0/5b.png" />
            
            </figure>
    <div>
      <h3>Queues</h3>
      <a href="#queues">
        
      </a>
    </div>
<p>The <a href="https://www.w3.org/TR/activitypub/">ActivityPub</a> protocol is chatty by design. Depending on the size of your social graph, there might be a lot of back-and-forth HTTP traffic. We can’t have the clients blocked waiting for hundreds of Fediverse message deliveries every time someone posts something.</p><p>We needed a way to work asynchronously and launch background jobs to offload data processing away from the main app and keep the clients snappy. The official Mastodon server has a similar strategy using <a href="https://docs.joinmastodon.org/admin/scaling/#sidekiq">Sidekiq</a> to do background processing.</p><p>Fortunately, we don't need to worry about any of this complexity either. <a href="https://developers.cloudflare.com/queues/">Cloudflare Queues</a> allows developers to send and receive messages with guaranteed delivery, and offload work from your Workers' requests, effectively providing you with asynchronous batch job capabilities.</p><p>To put it simply: you have a queue topic identifier, which is basically a buffered list that scales automatically; one or more producers that produce structured messages (JSON objects, in our case) and put them in the queue (you define their schema); and one or more consumers that subscribe to that queue, receive its messages, and process them at their own speed.</p>
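<p>The moving parts described above can be modeled with a toy in-memory queue. This sketch is illustrative only and is not the Cloudflare Queues API, which gives producers <code>send()</code>/<code>sendBatch()</code> and consumer Workers a <code>queue()</code> handler, as the examples further down show:</p>

```typescript
// Toy in-memory model of the producer/consumer pattern described above;
// illustrative only, not the Cloudflare Queues API.
type Message = { body: unknown };

class ToyQueue {
  private buffer: Message[] = [];

  // Producer side: append a batch of structured messages to the buffer.
  sendBatch(messages: Message[]): void {
    this.buffer.push(...messages);
  }

  // Consumer side: drain up to batchSize messages and process each one
  // at the consumer's own pace. Returns how many were processed.
  consume(batchSize: number, handler: (m: Message) => void): number {
    const batch = this.buffer.splice(0, batchSize);
    for (const message of batch) {
      handler(message);
    }
    return batch.length;
  }

  // Messages still waiting in the buffer.
  get depth(): number {
    return this.buffer.length;
  }
}
```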
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5m1TSSTZesMX1jt7K7YpHS/c192aa543426e12c03b2c753f4e4b8c4/6b.png" />
            
            </figure><p>Here’s the <a href="https://developers.cloudflare.com/queues/learning/how-queues-works/">How Queues works</a> page for more information.</p><p>In our case, the main application produces queue jobs whenever any incoming API call requires long, expensive operations. For example, when someone posts, sorry, <i>toots</i> something, we need to broadcast that to their followers' inboxes, potentially triggering many requests to remote servers. <a href="https://github.com/cloudflare/wildebeest/blob/main/backend/src/activitypub/deliver.ts">Here we are</a> queueing a job for that, thus freeing the APIs to keep responding:</p>
            <pre><code>export async function deliverFollowers(
	db: D1Database,
	from: Actor,
	activity: Activity,
	queue: Queue
) {
	const followers = await getFollowers(db, from)

	const messages = followers.map((id) =&gt; {
		const body = {
			activity: JSON.parse(JSON.stringify(activity)),
			actorId: from.id.toString(),
			toActorId: id,
		}
		return { body }
	})

	await queue.sendBatch(messages)
}</code></pre>
            <p>Similarly, we don't want to stop the main APIs when remote servers deliver messages to our instance inboxes. Here's Wildebeest creating asynchronous jobs when it <a href="https://github.com/cloudflare/wildebeest/blob/main/functions/ap/users/%5Bid%5D/inbox.ts">receives messages</a> in the inbox:</p>
            <pre><code>export async function handleRequest(
	domain: string,
	db: D1Database,
	id: string,
	activity: Activity,
	queue: Queue,
): Promise&lt;Response&gt; {
	const handle = parseHandle(id)

	const actorId = actorURL(domain, handle.localPart)
	const actor = await actors.getPersonById(db, actorId)

	// creates job
	await queue.send({
		type: MessageType.Inbox,
		actorId: actor.id.toString(),
		activity,
	})

	// frees the API
	return new Response('', { status: 200 })
}</code></pre>
<p>And the final piece of the puzzle: our <a href="https://github.com/cloudflare/wildebeest/tree/main/consumer">queue consumer</a> runs in a separate Worker, independently from the Pages project. The consumer listens for new messages and processes them sequentially, at its own pace, freeing everyone else from blocking. When things get busy, the queue grows its buffer. Still, things keep running, and the jobs will eventually get dispatched, freeing the main APIs for the critical stuff: responding to remote servers and clients as quickly as possible.</p>
            <pre><code>export default {
	async queue(batch, env, ctx) {
		for (const message of batch.messages) {
			…

			switch (message.body.type) {
				case MessageType.Inbox: {
					await handleInboxMessage(...)
					break
				}
				case MessageType.Deliver: {
					await handleDeliverMessage(...)
					break
				}
			}
		}
	},
}</code></pre>
            <p>If you want to get your hands dirty with Queues, here’s a simple example on <a href="https://developers.cloudflare.com/queues/examples/send-errors-to-r2/">Using Queues to store data in R2</a>.</p>
    <div>
      <h3>Caching and Durable Objects</h3>
      <a href="#caching-and-durable-objects">
        
      </a>
    </div>
    <p>Caching repetitive operations is yet another strategy for improving performance in complex applications that require data processing. A famous Netscape developer, Phil Karlton, once said: "There are only two hard things in Computer Science: <b>cache invalidation</b> and naming things."</p><p>Cloudflare obviously knows a lot about caching since <a href="https://developers.cloudflare.com/cache/">it's a core feature</a> of our global CDN. We also provide <a href="https://developers.cloudflare.com/workers/learning/how-kv-works/">Workers KV</a> to our customers, a global, low-latency, key-value data store that anyone can use to cache data objects in our data centers and build fast websites and applications.</p><p>However, KV achieves its performance by being eventually consistent. While this is fine for many applications and use cases, it's not ideal for others.</p><p>The ActivityPub protocol is highly transactional and can't afford eventual consistency. Here's an example: generating complete timelines is expensive, so we cache that operation. However, when you post something, we need to invalidate that cache before we reply to the client. Otherwise, the new post won't be in the timeline and the client can fail with an error because it doesn’t see it. This actually happened to us with one of the most popular clients.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7mBZfs5UZumkHzh9ITpUSn/1f9d5e53e7d61417d962a9fd566df9e6/7b.png" />
            
            </figure><p>We needed to get clever. The team discussed a few options. Fortunately, our API catalog has plenty of options. Meet <a href="https://developers.cloudflare.com/workers/learning/using-durable-objects/">Durable Objects</a>.</p><p>Durable Objects are single-instance Workers that provide a transactional storage API. They're ideal when you need central coordination, strong consistency, and state persistence. You can use Durable Objects in cases like handling the state of <a href="https://developers.cloudflare.com/workers/learning/using-websockets/#durable-objects-and-websocket-state">multiple WebSocket</a> connections, coordinating and routing messages in a <a href="https://github.com/cloudflare/workers-chat-demo">chatroom</a>, or even <a href="/doom-multiplayer-workers/">running a multiplayer game like Doom</a>.</p><p>You know where this is going now. Yes, we implemented our key-value caching subsystem for Wildebeest <a href="https://github.com/cloudflare/wildebeest/tree/main/do">on top of a Durable Object</a>. By taking advantage of the DO's native transactional storage API, we can have strong guarantees that whenever we create or change a key, the next read will always return the latest version.</p><p>The idea is so simple and effective that it took us literally a <a href="https://github.com/cloudflare/wildebeest/blob/main/do/src/index.ts">few lines of code</a> to implement a key-value cache with two primitives: HTTP PUT and GET.</p>
            <pre><code>export class WildebeestCache {
	storage: DurableObjectStorage

	constructor(state: DurableObjectState) {
		// The transactional storage API is provided by the Durable Object runtime
		this.storage = state.storage
	}

	async fetch(request: Request): Promise&lt;Response&gt; {
		if (request.method === 'GET') {
			const { pathname } = new URL(request.url)
			const key = pathname.slice(1)
			const value = await this.storage.get(key)
			return new Response(JSON.stringify(value))
		}

		if (request.method === 'PUT') {
			const { key, value } = await request.json()
			await this.storage.put(key, value)
			return new Response('', { status: 201 })
		}

		return new Response('method not allowed', { status: 405 })
	}
}</code></pre>
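            <p>To make this concrete, here is a minimal sketch of how a Worker could talk to that cache over the Durable Object stub's <code>fetch()</code> interface. This is not Wildebeest's actual code; the <code>Fetcher</code> type and helper names are ours:</p>

```typescript
// Hypothetical helpers (not from the Wildebeest repo) wrapping the
// WildebeestCache GET/PUT primitives. A Durable Object stub is anything
// that exposes fetch(Request) => Promise<Response>.
type Fetcher = { fetch(request: Request): Promise<Response> }

// Read a key; the DO encodes the key in the URL path.
export async function cacheGet<T>(stub: Fetcher, key: string): Promise<T | null> {
	const res = await stub.fetch(new Request(`https://cache/${key}`, { method: 'GET' }))
	return ((await res.json()) ?? null) as T | null
}

// Write (or invalidate by overwriting) a key; the DO reads { key, value } from the body.
export async function cachePut(stub: Fetcher, key: string, value: unknown): Promise<void> {
	await stub.fetch(
		new Request('https://cache/', {
			method: 'PUT',
			body: JSON.stringify({ key, value }),
			// Node's fetch requires `duplex` when a Request carries a body; harmless elsewhere.
			duplex: 'half',
		} as any)
	)
}
```

            <p>Because every read and write funnels through the single Durable Object instance, a PUT that refreshes a timeline key is guaranteed to be visible to the very next GET, which is exactly the property the eventually consistent KV store could not give us.</p>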
            <p>Strong consistency it is. Let's move to user registration and authentication now.</p>
    <div>
      <h3>Zero Trust Access</h3>
      <a href="#zero-trust-access">
        
      </a>
    </div>
    <p>The official Mastodon server <a href="https://docs.joinmastodon.org/user/signup/">handles user registrations</a>, typically using email, before you can choose your local username and start using the service. Handling user registration and authentication would be daunting and time-consuming if we had to build it from scratch, though.</p><p>Furthermore, people don't want to create new credentials for every new service they want to use and instead want more convenient OAuth-like authorization and authentication methods so that they can reuse their existing Apple, Google, or GitHub accounts.</p><p>We wanted to simplify things using Cloudflare’s built-in features. Needless to say, we have a product that handles user onboarding, authentication, and <a href="https://developers.cloudflare.com/cloudflare-one/policies/access/policy-management/">access policies</a> to any application behind Cloudflare; it's called <a href="https://developers.cloudflare.com/cloudflare-one/">Zero Trust</a>. So we put Wildebeest behind it.</p><p>Zero Trust Access can either do one-time PIN (<a href="https://en.wikipedia.org/wiki/One-time_password">OTP</a>) authentication using email or single sign-on (SSO) with many identity providers (for example, Google, Facebook, GitHub, or LinkedIn), including any generic one supporting <a href="https://developers.cloudflare.com/cloudflare-one/identity/idp-integration/generic-saml/">SAML 2.0</a>.</p><p>When you start using Wildebeest with a client, you don't need to register at all. Instead, you go straight to log in, which will redirect you to the Access page and handle the authentication according to the policy that you, the owner of your instance, configured.</p><p>The policy defines who can authenticate, and how.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1zDpfgueYrKRmhmNvHCBGX/68b6b579fcb33110566b07ea6e5a3d3e/8b.png" />
            
            </figure><p>When authenticated, Access will redirect you back to Wildebeest. The first time this happens, we will detect that we don't have information about the user and ask for your Username and Display Name. This will be asked only once and is what will be used to create your public Mastodon profile.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/76J7DmTtShD7slpawYXNAE/ccc908ed0dffb75a7ce6afc7b0b55510/9b.png" />
            
            </figure><p>Technically, Wildebeest implements the <a href="https://docs.joinmastodon.org/spec/oauth/#implementation">OAuth 2 specification</a>. <a href="https://www.cloudflare.com/learning/security/glossary/what-is-zero-trust/">Zero Trust</a> protects the <a href="https://github.com/cloudflare/wildebeest/blob/main/functions/oauth/authorize.ts">/oauth/authorize</a> endpoint and issues a valid <a href="https://developers.cloudflare.com/cloudflare-one/identity/authorization-cookie/validating-json/">JWT token</a> in the request headers when the user is authenticated. Wildebeest then reads and verifies the JWT and returns an authorization code in the URL redirect.</p><p>Once the client has an authorization code, it can use the <a href="https://github.com/cloudflare/wildebeest/blob/main/functions/oauth/token.ts">/oauth/token</a> endpoint to obtain an API access token. Subsequent API calls inject a bearer token in the Authorization header:</p><p><code>Authorization: Bearer access_token</code></p>
    <div>
      <h3>Deployment and Continuous Integration</h3>
      <a href="#deployment-and-continuous-integration">
        
      </a>
    </div>
    <p>We didn't want to run a managed service for Mastodon as it would somewhat diminish the concepts of federation and data ownership. Also, we recognize that ActivityPub and Mastodon are emerging, fast-paced technologies that will evolve quickly and in ways that are difficult to predict just yet.</p><p>For these reasons, we thought the best way to help the ecosystem right now would be to provide an open-source software package that anyone could use, customize, improve, and deploy on top of our cloud. Cloudflare will obviously keep improving Wildebeest and support the community, but we want to give our Fediverse maintainers complete control and ownership of their instances and data.</p><p>The remaining question was, how do we distribute the Wildebeest bundle and make it easy to deploy into someone's account when it requires configuring so many Cloudflare features, and how do we facilitate updating the software over time?</p><p>The solution ended up being a clever mix of using GitHub with <a href="https://github.com/features/actions">GitHub Actions</a>, <a href="https://developers.cloudflare.com/workers/platform/deploy-button/">Deploy with Workers</a>, and <a href="https://github.com/cloudflare/terraform-provider-cloudflare">Terraform</a>.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5V8fRfu3U03n2ZVNtsh01L/404637763a8476b425a562ff5bbf8739/Screenshot-2023-02-08-at-11.13.05-AM-1.png" />
            
            </figure><p>The Deploy with Workers button is a specially crafted link that auto-generates a workflow page where the user is asked a few questions. Cloudflare handles authorizing GitHub to deploy to Workers, automatically forks the Wildebeest repository into the user's account, and then configures and deploys the project using a <a href="https://github.com/marketplace/actions/deploy-to-cloudflare-workers-with-wrangler">GitHub Actions</a> workflow.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3MhyoOAbQEjlNnEhwl70Jm/5000c8c1dc1dfea549ee6ca62f8460b4/10b.png" />
            
            </figure><p>A GitHub Actions <a href="https://docs.github.com/en/actions/using-workflows/about-workflows">workflow</a> is a YAML file that declares what to do in every step. Here’s the <a href="https://github.com/cloudflare/wildebeest/blob/main/.github/workflows/deploy.yml">Wildebeest workflow</a> (simplified):</p>
            <pre><code>name: Deploy
on:
  push:
    branches:
      - main
  repository_dispatch:
jobs:
  deploy:
    runs-on: ubuntu-latest
    timeout-minutes: 60
    steps:
      - name: Ensure CF_DEPLOY_DOMAIN and CF_ZONE_ID are defined
        ...
      - name: Create D1 database
        uses: cloudflare/wrangler-action@2.0.0
        with:
          command: d1 create wildebeest-${{ env.OWNER_LOWER }}
        ...
      - name: retrieve Zero Trust organization
        ...
      - name: retrieve Terraform state KV namespace
        ...
      - name: download VAPID keys
        ...
      - name: Publish DO
        ...
      - name: Configure
        run: terraform plan &amp;&amp; terraform apply -auto-approve
      - name: Create Queue
        ...
      - name: Publish consumer
        ...
      - name: Publish
        uses: cloudflare/wrangler-action@2.0.0
        with:
          command: pages publish --project-name=wildebeest-${{ env.OWNER_LOWER }} .</code></pre>
            
    <div>
      <h4>Updating Wildebeest</h4>
      <a href="#updating-wildebeest">
        
      </a>
    </div>
    <p>This workflow runs automatically every time the main branch changes, so updating Wildebeest is as easy as synchronizing the upstream official repository with your fork. You don't even need to use git commands for that; GitHub provides a convenient Sync button in the UI that you can simply click.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6vkcs7XzLMZdihq7z5n2L5/b83e8499970012ebcf0b47e686b6518a/11b.png" />
            
            </figure><p>What's more, updates are incremental and non-destructive. When the GitHub Actions workflow redeploys Wildebeest, we only make the necessary changes to your configuration and nothing else. You don't lose your data; we don't need to delete your existing configurations. Here’s how we achieved this:</p><p>We use <a href="https://registry.terraform.io/providers/cloudflare/cloudflare/latest/docs">Terraform</a>, a declarative configuration language and tool that interacts with our APIs and can query and configure your Cloudflare features. Here's the trick: whenever we apply a new configuration, we keep a copy of the Terraform state for Wildebeest in a <a href="https://developers.cloudflare.com/workers/learning/how-kv-works/">Cloudflare KV</a> key. When a new deployment is triggered, we get that state from the KV copy, calculate the differences, then change only what's necessary.</p><p>Data loss is not a problem either because, as you read above, D1 supports <a href="https://developers.cloudflare.com/d1/platform/migrations/">migrations</a>. If we need to add a new column or a new table, we don't need to destroy the database and create it again; we just apply the SQL necessary for that change.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3UW6Qm1KE662wiFVgrWFPZ/ba80730c09161abc81c85bb89fd758b3/12b.png" />
            
            </figure>
    <div>
      <h3>Protection, optimization and observability, naturally</h3>
      <a href="#protection-optimization-and-observability-naturally">
        
      </a>
    </div>
    <p>Once Wildebeest is up and running, you can protect it from bad traffic and malicious actors. Cloudflare offers you <a href="https://www.cloudflare.com/ddos/">DDoS</a>, <a href="https://www.cloudflare.com/waf/">WAF</a>, and <a href="https://www.cloudflare.com/products/bot-management/">Bot Management</a> protection out of the box at a click's distance.</p><p>Likewise, you'll get instant network and content delivery optimizations from our products and <a href="https://www.cloudflare.com/analytics/">analytics</a> on how your Wildebeest instance is performing and being used.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4EYUh9pE5NNPnpj9mwVSfz/7d97cf99ad29cd9436b8e48c7918cd24/13b.png" />
            
            </figure>
    <div>
      <h3>ActivityPub, WebFinger, NodeInfo and Mastodon APIs</h3>
      <a href="#activitypub-webfinger-nodeinfo-and-mastodon-apis">
        
      </a>
    </div>
    <p>Mastodon popularized the Fediverse concept, but many of the underlying technologies used have been around for quite a while. This is one of those rare moments when everything finally comes together to create a working platform that answers an actual use case for Internet users. Let's quickly go through the protocols that Wildebeest had to implement:</p>
    <div>
      <h4>ActivityPub</h4>
      <a href="#activitypub">
        
      </a>
    </div>
    <p><a href="https://www.w3.org/TR/activitypub/">ActivityPub</a> is a decentralized social networking protocol and has been a W3C recommendation since 2018. It defines client APIs for creating and manipulating content and server-to-server APIs for content exchange and notifications, also known as federation. ActivityPub uses <a href="https://www.w3.org/TR/activitystreams-core/">ActivityStreams</a>, an even older W3C protocol, for its vocabulary.</p><p>The concepts of <a href="https://www.w3.org/TR/activitypub/#actors">Actors</a> (profiles), messages or <a href="https://www.w3.org/TR/activitypub/#obj">Objects</a> (the toots), <a href="https://www.w3.org/TR/activitypub/#inbox">inbox</a> (where you receive toots from people you follow), and <a href="https://www.w3.org/TR/activitypub/#outbox">outbox</a> (where you send your toots to the people you follow), to name just a few of the many actions and activities, are all defined in the ActivityPub specification.</p><p>Here’s our folder with the <a href="https://github.com/cloudflare/wildebeest/tree/main/backend/src/activitypub">ActivityPub implementation</a>.</p>
            <pre><code>import type { APObject } from 'wildebeest/backend/src/activitypub/objects'
import type { Actor } from 'wildebeest/backend/src/activitypub/actors'

export async function addObjectInInbox(db: D1Database, actor: Actor, obj: APObject): Promise&lt;void&gt; {
	const id = crypto.randomUUID()
	await db
		.prepare('INSERT INTO inbox_objects(id, actor_id, object_id) VALUES(?, ?, ?)')
		.bind(id, actor.id.toString(), obj.id.toString())
		.run()
}
</code></pre>
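            <p>For a feel of the vocabulary itself, here is a hand-written, minimal ActivityStreams "Create" activity wrapping a Note, the kind of object that ends up in an inbox during federation. All identifiers and field values below are invented for illustration:</p>

```typescript
// An illustrative (not taken from the Wildebeest repo) minimal
// ActivityStreams "Create" activity carrying a Note.
export const createNote = {
	'@context': 'https://www.w3.org/ns/activitystreams',
	id: 'https://example.com/ap/activities/1',
	type: 'Create',
	actor: 'https://example.com/ap/users/user',
	to: ['https://www.w3.org/ns/activitystreams#Public'],
	object: {
		id: 'https://example.com/ap/notes/1',
		type: 'Note',
		attributedTo: 'https://example.com/ap/users/user',
		content: 'Hello, Fediverse!',
	},
}
```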
            
    <div>
      <h4>WebFinger</h4>
      <a href="#webfinger">
        
      </a>
    </div>
    <p>WebFinger is a simple HTTP protocol used to discover information about any entity, like a profile, a server, or a specific feature. It resolves URIs to resource objects.</p><p>Mastodon uses <a href="https://www.rfc-editor.org/rfc/rfc7033">WebFinger</a> lookups to discover information about remote users. For example, say you want to interact with <code>@user@example.com</code>. Your local server would <a href="https://github.com/cloudflare/wildebeest/blob/main/backend/src/webfinger/index.ts">request</a> <a href="https://example.com/.well-known/webfinger?resource=acct:user@example.com">https://example.com/.well-known/webfinger?resource=acct:user@example.com</a> (using the <a href="https://www.rfc-editor.org/rfc/rfc7565">acct scheme</a>) and get something like this:</p>
            <pre><code>{
    "subject": "acct:user@example.com",
    "aliases": [
        "https://example.com/ap/users/user"
    ],
    "links": [
        {
            "rel": "self",
            "type": "application/activity+json",
            "href": "https://example.com/ap/users/user"
        }
    ]
}
</code></pre>
            <p>Now we know how to interact with <code>@user@example.com</code>, using the <code>https://example.com/ap/users/user</code> endpoint.</p><p>Here’s our WebFinger <a href="https://github.com/cloudflare/wildebeest/blob/main/functions/.well-known/webfinger.ts">response</a>:</p>
            <pre><code>export async function handleRequest(request, db): Promise&lt;Response&gt; {
	…
	const jsonLink = /* … link to actor */

	const res: WebFingerResponse = {
		subject: `acct:...`,
		aliases: [jsonLink],
		links: [
			{
				rel: 'self',
				type: 'application/activity+json',
				href: jsonLink,
			},
		],
	}
	return new Response(JSON.stringify(res), { headers })
}</code></pre>
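            <p>The client side of the same lookup can be sketched as follows. This helper is ours, not Wildebeest's; the injectable <code>fetchFn</code> parameter just makes it easy to exercise without a network:</p>

```typescript
// A hedged sketch: resolve "user@example.com" to its ActivityPub actor URL
// via the /.well-known/webfinger endpoint.
interface WebFingerLink {
	rel: string
	type?: string
	href?: string
}
interface WebFingerResponse {
	subject: string
	aliases?: string[]
	links: WebFingerLink[]
}

export async function resolveActorUrl(
	handle: string,
	fetchFn: typeof fetch = fetch
): Promise<string | null> {
	const [user, domain] = handle.replace(/^@/, '').split('@')
	if (!user || !domain) return null
	const res = await fetchFn(
		`https://${domain}/.well-known/webfinger?resource=acct:${user}@${domain}`
	)
	if (!res.ok) return null
	const jrd = (await res.json()) as WebFingerResponse
	// The "self" link with the ActivityPub media type points at the actor.
	const self = jrd.links.find(
		(link) => link.rel === 'self' && link.type === 'application/activity+json'
	)
	return self?.href ?? null
}
```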
            
    <div>
      <h4>Mastodon API</h4>
      <a href="#mastodon-api">
        
      </a>
    </div>
    <p>Finally, things like server information, profile information, timelines, notifications, and search are all Mastodon-specific APIs. The Mastodon open-source project defines a catalog of REST APIs, and you can find all the documentation for them on <a href="https://docs.joinmastodon.org/api/">their website</a>.</p><p>Our Mastodon API implementation can be found <a href="https://github.com/cloudflare/wildebeest/tree/main/functions/api">here</a> (REST endpoints) and <a href="https://github.com/cloudflare/wildebeest/tree/main/backend/src/mastodon">here</a> (backend primitives). Here’s an example of Mastodon’s server information <a href="https://docs.joinmastodon.org/methods/instance/#v2">/api/v2/instance</a> implemented by <a href="https://github.com/cloudflare/wildebeest/blob/main/functions/api/v2/instance.ts">Wildebeest</a>:</p>
            <pre><code>export async function handleRequest(domain, db, env) {

	const res: InstanceConfigV2 = {
		domain,
		title: env.INSTANCE_TITLE,
		version: getVersion(),
		source_url: 'https://github.com/cloudflare/wildebeest',
		description: env.INSTANCE_DESCR,
		thumbnail: {
			url: DEFAULT_THUMBNAIL,
		},
		languages: ['en'],
		registrations: {
			enabled: false,
		},
		contact: {
			email: env.ADMIN_EMAIL,
		},
		rules: [],
	}

	return new Response(JSON.stringify(res), { headers })
}</code></pre>
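            <p>From the other side of the wire, any Mastodon client can read that same endpoint. Here is a tiny client-side sketch (our own helper, not part of Wildebeest; the injectable <code>fetchFn</code> parameter is there only for testability):</p>

```typescript
// Query a server's /api/v2/instance endpoint and read a few fields back.
// Only a subset of the response fields is modeled here.
interface InstanceInfo {
	domain: string
	title: string
	version: string
}

export async function getInstanceInfo(
	domain: string,
	fetchFn: typeof fetch = fetch
): Promise<InstanceInfo> {
	const res = await fetchFn(`https://${domain}/api/v2/instance`)
	if (!res.ok) throw new Error(`instance query failed: ${res.status}`)
	return (await res.json()) as InstanceInfo
}
```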
            <p>Wildebeest also implements <a href="https://github.com/cloudflare/wildebeest/tree/main/backend/src/webpush">WebPush</a> for client notifications and <a href="https://github.com/cloudflare/wildebeest/tree/main/functions/nodeinfo">NodeInfo</a> for server information.</p><p>Other Mastodon-compatible servers, such as <a href="https://pleroma.social/">Pleroma</a>, had to implement all these protocols too. The community is very active in discussing future enhancements; we will keep improving our compatibility and adding support for more features over time, ensuring that Wildebeest plays well with the emerging Fediverse ecosystem of servers and clients.</p>
    <div>
      <h3>Get started now</h3>
      <a href="#get-started-now">
        
      </a>
    </div>
    <p>Enough about technology; let's get you into the Fediverse. We tried to detail all the steps to deploy your server. To start using Wildebeest, head to the public GitHub repository and check our <a href="https://github.com/cloudflare/wildebeest/blob/main/README.md">Get Started tutorial</a>.</p><p>Most of Wildebeest's dependencies offer a generous free plan that allows you to try them for personal or hobby projects that aren't business-critical; however, you will need to subscribe to an <a href="https://www.cloudflare.com/products/cloudflare-images/">Images</a> plan (the lowest tier should be enough for most needs) and, depending on your server load, <a href="https://developers.cloudflare.com/workers/platform/limits/#unbound-usage-model">Workers Unbound</a> (again, the minimum cost should be plenty for most use cases).</p><p>Following our dogfooding mantra, Cloudflare is also officially joining the Fediverse today. You can start following our Mastodon accounts and get the same experience of having regular updates from Cloudflare as you get from us on other social platforms, using your favorite Mastodon apps. These accounts are entirely running on top of a Wildebeest server:</p><ul><li><p><a href="https://cloudflare.social/@cloudflare">@cloudflare@cloudflare.social</a> - Our main account</p></li><li><p><a href="https://cloudflare.social/@radar">@radar@cloudflare.social</a> - Cloudflare Radar</p></li></ul>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2QJsY2PkGqLfVQCDJc1AlH/d52823cfd8d2d9e7de9845686790a3bf/14b.png" />
            
            </figure><p>Wildebeest is compatible with most client apps; it is confirmed to work with the official Mastodon <a href="https://play.google.com/store/apps/details?id=org.joinmastodon.android">Android</a> and <a href="https://apps.apple.com/us/app/mastodon-for-iphone/id1571998974">iOS</a> apps, <a href="https://pinafore.social/">Pinafore</a>, <a href="https://mastodon.social/@JPEGuin/109315609418460036">Mammoth</a>, and <a href="https://tooot.app/">tooot</a>, and we are looking into others like <a href="https://tapbots.com/ivory/">Ivory</a>. If your favorite isn’t working, please submit an <a href="https://github.com/cloudflare/wildebeest/issues">issue here</a>; we’ll do our best to help support it.</p>
    <div>
      <h3>Final words</h3>
      <a href="#final-words">
        
      </a>
    </div>
    <p>Wildebeest was built entirely on top of our <a href="/welcome-to-the-supercloud-and-developer-week-2022/">Supercloud</a> stack. It was one of the most complete and complex projects we have created that uses various Cloudflare products and features.</p><p>We hope this write-up inspires you to not only try deploying Wildebeest and joining the Fediverse, but also building your next application, however demanding it is, on top of Cloudflare.</p><p>Wildebeest is a minimally viable Mastodon-compatible server right now, but we will keep improving it with more features and supporting it over time; after all, we're using it for our official accounts. It is also open-sourced, meaning you are more than welcome to contribute with pull requests or feedback.</p><p>In the meantime, we opened a <a href="https://discord.com/channels/595317990191398933/1064925651464896552">Wildebeest room</a> on our <a href="https://discord.gg/cloudflaredev">Developers Discord Server</a> and are keeping an eye open on the GitHub repo <a href="https://github.com/cloudflare/wildebeest/issues">issues</a> tab. Feel free to engage with us; the team is eager to know how you use Wildebeest and answer your questions.</p><p><i>PS: The code snippets in this blog were simplified to benefit readability and space (the TypeScript types and error handling code were removed, for example). Please refer to the GitHub repo links for the complete versions.</i></p> ]]></content:encoded>
            <category><![CDATA[Wildebeest]]></category>
            <category><![CDATA[Cloudflare Pages]]></category>
            <category><![CDATA[D1]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Cloudflare Zero Trust]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[SASE]]></category>
            <guid isPermaLink="false">5dmHcGVas7xv8tKbRbWLWN</guid>
            <dc:creator>Celso Martinho</dc:creator>
            <dc:creator>Sven Sauleau</dc:creator>
        </item>
        <item>
            <title><![CDATA[How we detect route leaks and our new Cloudflare Radar route leak service]]></title>
            <link>https://blog.cloudflare.com/route-leak-detection-with-cloudflare-radar/</link>
            <pubDate>Wed, 23 Nov 2022 16:00:00 GMT</pubDate>
            <description><![CDATA[ In this blog post, we will introduce our new system designed to detect route leaks and its integration on Cloudflare Radar and its public API. ]]></description>
            <content:encoded><![CDATA[ <p></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5tjdb8oktiBnsjsr6Fj109/dca51bae6a054cc120d91c11a35c54fe/image5-19.png" />
            
            </figure><p>Today we’re introducing Cloudflare Radar’s route leak data and API so that anyone can get information about route leaks across the Internet. We’ve built a comprehensive system that takes in data from public sources and Cloudflare’s view of the Internet drawn from our massive global network. The system is now feeding route leak data on Cloudflare Radar’s ASN pages and via the API.</p><p>This blog post is in two parts. There’s a discussion of BGP and route leaks followed by details of our route leak detection system and how it feeds Cloudflare Radar.</p>
    <div>
      <h2>About BGP and route leaks</h2>
      <a href="#about-bgp-and-route-leaks">
        
      </a>
    </div>
    <p>Inter-domain routing, i.e., exchanging reachability information among networks, is critical to the health and performance of the Internet. The <a href="https://www.cloudflare.com/learning/security/glossary/what-is-bgp/">Border Gateway Protocol</a> (BGP) is the de facto routing protocol that exchanges routing information among organizations and networks. At its core, BGP assumes the information being exchanged is genuine and trustworthy, which unfortunately is <a href="/rpki/">no longer a valid assumption</a> on the current Internet. In many cases, networks can make mistakes or intentionally lie about the reachability information and propagate that to the rest of the Internet. Such incidents can cause significant disruptions of the normal operations of the Internet. One such type of disruptive incident is the <b>route leak</b>.</p><p>We consider route leaks as the propagation of routing announcements beyond their intended scope (<a href="https://www.rfc-editor.org/rfc/rfc7908.html">RFC7908</a>). Route leaks can cause significant disruption affecting millions of Internet users, as we have seen in many past notable incidents. For example, <a href="/how-verizon-and-a-bgp-optimizer-knocked-large-parts-of-the-internet-offline-today/">in June 2019 a misconfiguration</a> in a small network in Pennsylvania, US (<a href="https://radar.cloudflare.com/traffic/as396531">AS396531</a> - Allegheny Technologies Inc) accidentally leaked a Cloudflare prefix to Verizon, which proceeded to propagate the misconfigured route to the rest of its peers and customers. As a result, the traffic of a large portion of the Internet was squeezed through the limited-capacity links of a small network.
The resulting congestion caused most of Cloudflare traffic to and from the affected IP range to be dropped.</p><p>A similar incident in November 2018 caused widespread unavailability of Google services when a Nigerian ISP (<a href="https://radar.cloudflare.com/traffic/as37282">AS37282</a> - Mainone) <a href="/how-a-nigerian-isp-knocked-google-offline/">accidentally leaked</a> a large number of Google IP prefixes to its peers and providers violating the <a href="https://ieeexplore.ieee.org/document/974527">valley-free principle</a>.</p><p>These incidents illustrate not only that route leaks can be very impactful, but also the snowball effects that misconfigurations in small regional networks can have on the global Internet.</p><p>Despite the criticality of detecting and rectifying route leaks promptly, they are often detected only when users start reporting the noticeable effects of the leaks. The challenge with detecting and preventing route leaks stems from the fact that AS business relationships and BGP routing policies are generally <a href="https://ieeexplore.ieee.org/document/974523">undisclosed</a>, and the affected network is often remote to the root of the route leak.</p><p>In the past few years, solutions have been proposed to prevent the propagation of leaked routes. Such proposals include <a href="https://datatracker.ietf.org/doc/rfc9234/">RFC9234</a> and <a href="https://datatracker.ietf.org/doc/html/draft-ietf-sidrops-aspa-verification">ASPA</a>, which extend BGP to annotate sessions with the relationship type between the two connected AS networks to enable the detection and prevention of route leaks.</p><p>An alternative proposal to implement similar signaling of BGP roles is through the use of <a href="https://en.wikipedia.org/wiki/Border_Gateway_Protocol#Communities">BGP Communities</a>: a transitive attribute used to encode metadata in BGP announcements.
While these directions are promising in the long term, they are still in very preliminary stages and are not expected to be adopted at scale soon.</p><p>At Cloudflare, we have developed a system to detect route leak events automatically and send notifications to multiple channels for visibility. As we continue our efforts to bring more relevant <a href="https://developers.cloudflare.com/radar/">data to the public</a>, we are happy to announce that we are starting an <a href="https://developers.cloudflare.com/api/operations/radar_get_BGPRouteLeakEvents">open data API</a> for our route leak detection results today and integrating the results into <a href="https://radar.cloudflare.com/">Cloudflare Radar</a> pages.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4bWYvHZtatR3ooYMKq2WCb/79ef433f8c46c40aefa2fefe35905aa7/image4-32.png" />
            
            </figure>
    <div>
      <h2>Route leak definition and types</h2>
      <a href="#route-leak-definition-and-types">
        
      </a>
    </div>
    <p>Before we jump into how we design our systems, we will first do a quick primer on what a route leak is, and why it is important to detect it.</p><p>We refer to the published IETF RFC7908 document <a href="https://www.rfc-editor.org/rfc/rfc7908.html"><i>"Problem Definition and Classification of BGP Route Leaks"</i></a> to define route leaks.</p><p>&gt; A route leak is the propagation of routing announcement(s) beyond their intended scope.</p><p>The <i>intended scope</i> is often concretely defined as inter-domain routing policies based on business relationships between Autonomous Systems (ASes). These business relationships <a href="https://ieeexplore.ieee.org/document/974527">are broadly classified into four categories</a>: customers, transit providers, peers and siblings, although more complex arrangements are possible.</p><p>In a customer-provider relationship the customer AS has an agreement with another network to transit its traffic to the global routing table. In a peer-to-peer relationship two ASes agree to free bilateral traffic exchange, but only between their own IPs and the IPs of their customers. Finally, ASes that belong under the same administrative entity are considered siblings, and their traffic exchange is often unrestricted.  The image below illustrates how the three main relationship types translate to export policies.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3XAOHLT8UtzkLQigSbcrUd/34f8de710fcdda2c4feb7fd1eaaec576/image7-7.png" />
            
            </figure><p>By categorizing the types of AS-level relationships and their implications on the propagation of BGP routes, we can define multiple phases of a prefix origination announcement during propagation:</p><ul><li><p>upward: all path segments during this phase are <b>customer to provider</b></p></li><li><p>peering: one peer-peer path segment</p></li><li><p>downward: all path segments during this phase are <b>provider to customer</b></p></li></ul><p>An AS path that follows the <a href="https://ieeexplore.ieee.org/document/6363987"><b>valley-free routing principle</b></a> will have <b>upward, peering, downward</b> phases, <b>all optional</b>, but they have to occur <b>in that order</b>. Here is an example of an AS path that conforms to valley-free routing.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7mm4sD88Ai3cFOS7ugzhoH/d0ea889e5d22d60d4648b7f13a69b08b/image11-4.png" />
            
            </figure><p>In RFC7908, <a href="https://www.rfc-editor.org/rfc/rfc7908.html"><i>"Problem Definition and Classification of BGP Route Leaks"</i></a>, the authors define six types of route leaks, and we refer to these definitions in our system design. Here are illustrations of each of the route leak types.</p>
    <div>
      <h3>Type 1: Hairpin Turn with Full Prefix</h3>
      <a href="#type-1-hairpin-turn-with-full-prefix">
        
      </a>
    </div>
    <p>&gt; A multihomed AS learns a route from one upstream ISP and simply propagates it to another upstream ISP (the turn essentially resembling a hairpin).  Neither the prefix nor the AS path in the update is altered.</p><p>An AS path that contains a provider-customer segment followed by a customer-provider segment is considered a type 1 leak. In the following example, AS4 → AS5 → AS6 forms a type 1 leak.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6Njxtz1neeF3ejVaLH6RPi/4836161f45d7f547608839f3a11467dd/image9-5.png" />
            
            </figure><p>Type 1 is the most recognized type of route leak and is very impactful. In many cases, a customer route is preferred over a peer or a provider route. In this example, AS6 will likely prefer sending traffic via AS5 instead of its other peer or provider routes, causing AS5 to unintentionally become a transit provider. This can significantly affect the performance of the traffic related to the leaked prefix or cause outages if the leaking AS is not provisioned to handle a large influx of traffic.</p><p>In June 2015, Telekom Malaysia (<a href="https://radar.cloudflare.com/traffic/as4788">AS4788</a>), a regional ISP, <a href="https://www.bgpmon.net/massive-route-leak-cause-internet-slowdown/">leaked over 170,000 routes</a> learned from its providers and peers to its other provider Level3 (<a href="https://radar.cloudflare.com/traffic/as3549">AS3549</a>, now Lumen). Level3 accepted the routes and further propagated them to its downstream networks, which in turn caused significant network issues globally.</p>
    <div>
      <h3>Type 2: Lateral ISP-ISP-ISP Leak</h3>
      <a href="#type-2-lateral-isp-isp-isp-leak">
        
      </a>
    </div>
    <p>A type 2 leak is defined as propagating routes obtained from one peer to another peer, creating two or more consecutive peer-to-peer path segments.</p><p>Here is an example: AS3 → AS4 → AS5 forms a type 2 leak.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/18VJCjED1cnQmJd0cXPnBU/f38d68dacc38637c0a9d72e5cdafa5ae/image1-70.png" />
            
            </figure><p>One example of such leaks is <a href="https://archive.nanog.org/meetings/nanog41/presentations/mauch-lightning.pdf">more than three very large networks appearing in sequence</a>. Very large networks (such as Verizon and Lumen) do not purchase transit from each other, and having <a href="https://puck.nether.net/bgp/leakinfo.cgi/">more than three such networks</a> in sequence on the path is often an indication of a route leak.</p><p>However, in the real world, it is not unusual to see multiple small peering networks exchanging routes and passing them on to one another. Legitimate business reasons exist for this type of network path. We are less concerned about this type of route leak compared to type 1.</p>
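<p>The "more than three very large networks in sequence" heuristic above can be sketched in a few lines. This is an illustrative sketch, not our production detector; the <code>TIER1</code> set below is a small sample of well-known transit-free networks, not an authoritative list.</p>

```python
# Illustrative heuristic: flag AS paths where three or more very large
# (transit-free) networks appear consecutively, a common sign of a leak.
# The TIER1 set is a small sample for demonstration only.
TIER1 = {174, 701, 1299, 2914, 3257, 3356, 6453, 6762}

def has_tier1_run(as_path, run=3):
    streak = 0
    for asn in as_path:
        streak = streak + 1 if asn in TIER1 else 0
        if streak >= run:
            return True
    return False
```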
    <div>
      <h3>Type 3 and 4: Provider routes to peer; peer routes to provider</h3>
      <a href="#type-3-and-4-provider-routes-to-peer-peer-routes-to-provider">
        
      </a>
    </div>
    <p>These two types involve propagating routes from a provider or a peer not to a customer, but to another peer or provider. Here are the illustrations of the two types of leaks:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6T2roi9v4ATputaICUVuUv/50831a0a631774e101e5f04abfb25876/image10-3.png" />
            
            </figure>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/18D2LnLHkNkbzLsORnD95y/1c033cd2c3cea76e556013bc777889a9/image13-1.png" />
            
            </figure><p>As in the <a href="/how-a-nigerian-isp-knocked-google-offline/">previously mentioned example</a>, a Nigerian ISP that peers with Google accidentally leaked the routes learned from Google to its provider <a href="https://radar.cloudflare.com/traffic/as4809">AS4809</a>, generating a type 4 route leak. Because customer routes are usually preferred over others, the large provider (AS4809) rerouted its traffic to Google via its customer, i.e. the leaking ASN, overwhelming the small ISP and taking Google offline for over an hour.</p>
    <div>
      <h2>Route leak summary</h2>
      <a href="#route-leak-summary">
        
      </a>
    </div>
    <p>So far, we have looked at the four types of route leaks defined in <a href="https://www.rfc-editor.org/rfc/rfc7908.html">RFC7908</a>. The common thread across the four types is that they are all defined using AS relationships, i.e., peers, customers, and providers. We summarize the types of leaks by categorizing AS path propagation based on where routes are learned from and where they are propagated to. The results are shown in the following table.</p><table>
<thead>
  <tr>
    <th>Routes from / propagates to</th>
    <th>To provider</th>
    <th>To peer</th>
    <th>To customer</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td>From provider</td>
    <td>Type 1</td>
    <td>Type 3</td>
    <td>Normal</td>
  </tr>
  <tr>
    <td>From peer</td>
    <td>Type 4</td>
    <td>Type 2</td>
    <td>Normal</td>
  </tr>
  <tr>
    <td>From customer</td>
    <td>Normal</td>
    <td>Normal</td>
    <td>Normal</td>
  </tr>
</tbody>
</table><p>We can summarize the whole table into one single rule: <b>routes obtained from a non-customer AS can only be propagated to customers</b>.</p><p><i>Note: Type 5 and type 6 route leaks are defined as prefix re-origination and the announcement of private prefixes, respectively. Type 5 is more closely related to</i> <a href="https://www.cloudflare.com/learning/security/glossary/bgp-hijacking/"><i>prefix hijackings</i></a><i>, which we plan to expand our system to cover next, while type 6 leaks are outside the scope of this work. Interested readers can refer to sections 3.5 and 3.6 of</i> <a href="https://www.rfc-editor.org/rfc/rfc7908.html"><i>RFC7908</i></a> <i>for more information.</i></p>
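<p>The summary table lends itself to a compact classifier. The sketch below (with illustrative names and data structures, not our production code) walks an AS path ordered from collector peer to origin and applies the table at each intermediate AS, assuming a relationship map covering every adjacent AS pair:</p>

```python
# Map (how the route was learned, where it was propagated) to a leak type,
# following the RFC7908 summary table; combinations not listed are normal.
LEAK_TABLE = {
    ("provider", "provider"): 1,
    ("peer", "peer"): 2,
    ("provider", "peer"): 3,
    ("peer", "provider"): 4,
}

def classify_leaks(as_path, rel):
    """as_path is ordered from collector peer to origin; rel[(a, b)] gives
    b's role from a's point of view: 'provider', 'peer', or 'customer'."""
    leaks = []
    for i in range(1, len(as_path) - 1):
        learned_from = rel[(as_path[i], as_path[i + 1])]  # route came from here
        sent_to = rel[(as_path[i], as_path[i - 1])]       # and was announced here
        leak_type = LEAK_TABLE.get((learned_from, sent_to))
        if leak_type is not None:
            leaks.append((as_path[i], leak_type))
    return leaks
```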
    <div>
      <h2>The Cloudflare Radar route leak system</h2>
      <a href="#the-cloudflare-radar-route-leak-system">
        
      </a>
    </div>
    <p>Now that we know what a route leak is, let’s talk about how we designed our route leak detection system.</p><p>From a very high level, we compartmentalize our system into three different components:</p><ol><li><p><b>Raw data collection module</b>: responsible for gathering BGP data from multiple sources and providing a BGP message stream to downstream consumers.</p></li><li><p><b>Leak detection module</b>: responsible for determining whether a given AS-level path constitutes a route leak, estimating the confidence of that assessment, and aggregating and providing all external evidence needed for further analysis of the event.</p></li><li><p><b>Storage and notification module</b>: responsible for providing access to detected route leak events and sending out notifications to relevant parties. This could also include building a dashboard for easy access to and search of historical events, and providing the user interface for high-level analysis of an event.</p></li></ol>
    <div>
      <h3>Data collection module</h3>
      <a href="#data-collection-module">
        
      </a>
    </div>
    <p>There are three types of data input we take into consideration:</p><ol><li><p>Historical: BGP archive files for some time range in the past</p><ul><li><p><a href="https://www.routeviews.org/routeviews/">RouteViews</a> and <a href="https://ris.ripe.net/docs/20_raw_data_mrt.html#name-and-location">RIPE RIS</a> BGP archives</p></li></ul></li><li><p>Semi-real-time: BGP archive files as soon as they become available, with a 10-30 minute delay</p><ul><li><p>RouteViews and RIPE RIS archives with a data broker that periodically checks for new files (e.g. <a href="https://bgpkit.com/broker">BGPKIT Broker</a>)</p></li></ul></li><li><p>Real-time: true real-time data sources</p><ul><li><p><a href="https://ris-live.ripe.net/">RIPE RIS Live</a></p></li><li><p>Cloudflare internal BGP sources</p></li></ul></li></ol>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6p32es0tPhqsESHMazR8Ni/96910fef1be1cccf2bd69aa750b063c8/image6-11.png" />
            
            </figure><p>For the current version, we use the semi-real-time data source for the detection system, i.e., the BGP updates files from RouteViews and RIPE RIS. For data completeness, we process data from all public collectors from these two projects (a total of 63 collectors and over 2,400 collector peers) and implement a pipeline that’s capable of handling the BGP data processing as the data files become available.</p><p>For data file indexing and processing, we deployed an on-premises <a href="https://github.com/bgpkit/bgpkit-broker-backend">BGPKIT Broker instance</a> with the Kafka feature enabled for message passing, and a custom concurrent <a href="https://www.rfc-editor.org/rfc/rfc6396.html">MRT</a> data processing pipeline based on the <a href="https://github.com/bgpkit/bgpkit-parser">BGPKIT Parser</a> Rust SDK. The data collection module processes MRT files and converts the results into a BGP message stream of over two billion BGP messages per day (roughly 30,000 messages per second).</p>
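<p>The fan-out pattern behind such a pipeline can be sketched as follows. Here <code>parse_mrt</code> is a stand-in for a real MRT decoder such as BGPKIT Parser; everything in this sketch is illustrative.</p>

```python
# Illustrative sketch: a worker pool turns a list of MRT files into a single
# stream of BGP messages. parse_mrt is a placeholder; a real implementation
# would decode RFC 6396 MRT records from the file.
from concurrent.futures import ThreadPoolExecutor

def parse_mrt(path):
    # Placeholder decoder: emits three dummy records per file.
    return [{"file": path, "seq": i} for i in range(3)]

def message_stream(paths, workers=4):
    # Files are parsed concurrently; results are yielded as one stream,
    # preserving the input file order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in pool.map(parse_mrt, paths):
            yield from batch
```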
    <div>
      <h3>Route leak detection</h3>
      <a href="#route-leak-detection">
        
      </a>
    </div>
    <p>The route leak detection module works at the level of individual BGP announcements. The detection component investigates one BGP message at a time and estimates how likely it is that a given message is the result of a route leak event.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3uLtzc0IV8IAuxze3lYUXY/8eb3553e71a84930fe6851e8732f849f/image8-5.png" />
            
            </figure><p>We base our detection algorithm mainly on the <a href="https://ieeexplore.ieee.org/document/6363987">valley-free model</a>, which we believe can capture most of the notable route leak incidents. As mentioned previously, the key to having low false positives when detecting route leaks with the valley-free model is having accurate AS-level relationships. While those relationship types are not publicized by every AS, there have been over two <a href="https://ieeexplore.ieee.org/document/6027863">decades of research</a> on inferring the relationship types from publicly observed BGP data.</p><p>While state-of-the-art relationship inference algorithms have been shown to be <a href="https://dl.acm.org/doi/10.1145/2504730.2504735">highly accurate</a>, even a small margin of error can still introduce inaccuracies into the detection of route leaks. To alleviate such artifacts, we synthesize multiple data sources for inferring AS-level relationships, including <a href="https://www.caida.org/">CAIDA/UCSD</a>’s <a href="https://www.caida.org/catalog/datasets/as-relationships/">AS relationship</a> data and our in-house AS relationship dataset. Building on top of these two AS-level relationship datasets, we create a much more granular dataset at the per-prefix and per-peer levels. The improved dataset allows us to answer questions like: what is the relationship between AS1 and AS2 with respect to prefix P, as observed by collector peer X? This eliminates much of the ambiguity in cases where networks have multiple different relationships depending on prefixes and geo-locations, and thus helps us reduce the number of false positives in the system. Besides the AS relationship datasets, we also apply the <a href="https://ihr.iijlab.net/ihr/en-us/documentation#AS_dependency">AS Hegemony dataset</a> from <a href="https://ihr.iijlab.net/ihr/en-us/">IHR IIJ</a> to further reduce false positives.</p>
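<p>Conceptually, the layered relationship lookup can be sketched like this; the key structure and names are illustrative, not our production schema.</p>

```python
# Illustrative layered lookup: prefer the most specific relationship entry
# (AS pair + prefix + collector peer), then per-prefix, then fall back to
# the coarse AS-level relationship.
def lookup_relationship(fine, coarse, a, b, prefix=None, peer=None):
    for key in ((a, b, prefix, peer),   # most specific
                (a, b, prefix, None),   # per-prefix only
                (a, b, None, None)):    # fine-grained default, if any
        if key in fine:
            return fine[key]
    return coarse.get((a, b), "unknown")
```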
    <div>
      <h3>Route leak storage and presentation</h3>
      <a href="#route-leak-storage-and-presentation">
        
      </a>
    </div>
    <p>After processing each BGP message, we store the generated route leak entries in a database for long-term storage and exploration. We also aggregate individual route leak BGP announcements, grouping related leaks from the same leaker ASN within a short period into <b>route-leak events</b>. The route leak events are then available for consumption by different downstream applications like web UIs, an <a href="https://developers.cloudflare.com/api/operations/radar_get_BGPRouteLeakEvents">API</a>, or alerts.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7f4kKk1DFYPIltkyzArwDy/4347df7a5bee4ca6686455d6205324f7/image12-2.png" />
            
            </figure>
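<p>The aggregation step can be sketched as follows, assuming leak records of the form (timestamp, leaker ASN, prefix) and an illustrative five-minute gap threshold; the real grouping logic may differ.</p>

```python
# Illustrative aggregation: leak records from the same leaker ASN that fall
# within `window` seconds of the event's last record join the same event.
def group_events(leaks, window=300):
    events = []
    open_by_asn = {}  # most recent open event per leaker ASN
    for ts, asn, prefix in sorted(leaks):
        ev = open_by_asn.get(asn)
        if ev is not None and ts - ev["end"] <= window:
            ev["end"] = ts
            ev["prefixes"].add(prefix)
        else:
            ev = {"asn": asn, "start": ts, "end": ts, "prefixes": {prefix}}
            events.append(ev)
            open_by_asn[asn] = ev
    return events
```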
    <div>
      <h2>Route leaks on Cloudflare Radar</h2>
      <a href="#route-leaks-on-cloudflare-radar">
        
      </a>
    </div>
    <p>At Cloudflare, we aim to help build a better Internet, and that includes sharing our efforts on monitoring and securing Internet routing. Today, we are releasing our route leak detection system as a public beta.</p><p>Starting today, users visiting Cloudflare Radar ASN pages will find a list of the route leaks that affect that AS. We consider an AS to be affected when the leaker AS is within one hop of it on the path, in either direction.</p><p>The Cloudflare Radar ASN page is directly accessible via <a href="https://radar.cloudflare.com/as{ASN}"><b>https://radar.cloudflare.com/as{ASN}</b></a>. For example, one can navigate to <a href="https://radar.cloudflare.com/as174">https://radar.cloudflare.com/as174</a> to view the overview page for Cogent AS174. ASN pages now show a dedicated card for route leaks detected as relevant to the current ASN within the selected time range.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5CoRAfRuDBdZr5zQxCtoGn/1a89ba21ea72c44cbd45da3661705f65/image2-54.png" />
            
            </figure><p>Users can also start using our <a href="https://developers.cloudflare.com/api/operations/radar_get_BGPRouteLeakEvents">public data API</a> to look up route leak events for any given ASN. Our API supports filtering route leak results by time range and by the ASes involved. Here is a screenshot of the <a href="https://developers.cloudflare.com/api/operations/radar_get_BGPRouteLeakEvents">route leak events API documentation page</a> on the <a href="/building-a-better-developer-experience-through-api-documentation/">newly updated API docs site</a>.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5IgJJ1GO4uHepxxQc5vwV7/9e809b7ef9264f9c03d70c70c27d4bb5/image3-44.png" />
            
            </figure>
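<p>For example, a query for leaks involving AS174 over the last 30 days might be assembled like this. The endpoint path and parameter names below are assumptions based on the API documentation linked above; consult the docs for the authoritative schema.</p>

```python
# Hedged sketch of querying the Radar route-leak events API; the endpoint
# path and the "involvedAsn"/"dateRange" parameter names are assumptions,
# not confirmed API details.
import json
import urllib.parse
import urllib.request

BASE = "https://api.cloudflare.com/client/v4/radar/bgp/leaks/events"

def build_leak_events_url(asn, date_range="7d"):
    # Filter by an involved ASN and a relative time range.
    params = urllib.parse.urlencode({"involvedAsn": asn, "dateRange": date_range})
    return f"{BASE}?{params}"

def fetch_leak_events(token, asn, date_range="7d"):
    # Requests are authenticated with a Cloudflare API token.
    req = urllib.request.Request(
        build_leak_events_url(asn, date_range),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```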
    <div>
      <h2>More to come on routing security</h2>
      <a href="#more-to-come-on-routing-security">
        
      </a>
    </div>
    <p>There is a lot more we are planning to do with route-leak detection. More features like a global view page, route leak notifications, more advanced APIs, custom automation scripts, and historical archive datasets will begin to ship on Cloudflare Radar over time. Your feedback and suggestions are also very important in helping us continue to improve our detection results and serve better data to the public.</p><p>Furthermore, we will continue to expand our work on other important topics in Internet routing security, including global BGP hijack detection (not limited to our customer networks), RPKI validation monitoring, open-sourcing tools and architecture designs, and a centralized routing security web gateway. Our goal is to provide the best data and tools for routing security to the communities so that we can build a better and more secure Internet together.</p><p>In the meantime, we opened a <a href="https://discord.com/channels/595317990191398933/1035553707116478495">Radar room</a> on our Developers Discord Server. Feel free to <a href="https://discord.com/channels/595317990191398933/1035553707116478495">join</a> and talk to us; the team is eager to receive feedback and answer questions.</p><p>Visit <a href="https://radar.cloudflare.com/">Cloudflare Radar</a> for more Internet insights. You can also follow us <a href="https://twitter.com/cloudflareradar">on Twitter</a> for more Radar updates.</p> ]]></content:encoded>
            <category><![CDATA[Radar]]></category>
            <category><![CDATA[BGP]]></category>
            <category><![CDATA[Routing Security]]></category>
            <guid isPermaLink="false">72oaP8g7ZckKtIVQxA8EX4</guid>
            <dc:creator>Mingwei Zhang</dc:creator>
            <dc:creator>Vasilis Giotsas</dc:creator>
            <dc:creator>Celso Martinho</dc:creator>
        </item>
    </channel>
</rss>