
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
    <channel>
        <title><![CDATA[ The Cloudflare Blog ]]></title>
        <description><![CDATA[ Get the latest news on how products at Cloudflare are built, technologies used, and join the teams helping to build a better Internet. ]]></description>
        <link>https://blog.cloudflare.com</link>
        <atom:link href="https://blog.cloudflare.com/" rel="self" type="application/rss+xml"/>
        <language>en-us</language>
        <image>
            <url>https://blog.cloudflare.com/favicon.png</url>
            <title>The Cloudflare Blog</title>
            <link>https://blog.cloudflare.com</link>
        </image>
        <lastBuildDate>Fri, 10 Apr 2026 18:57:52 GMT</lastBuildDate>
        <item>
            <title><![CDATA[Launching Cloudflare’s Gen 13 servers: trading cache for cores for 2x edge compute performance]]></title>
            <link>https://blog.cloudflare.com/gen13-launch/</link>
            <pubDate>Mon, 23 Mar 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ Cloudflare’s Gen 13 servers double our compute throughput by rethinking the balance between cache and cores. Moving to high-core-count AMD EPYC™ Turin CPUs, we traded large L3 cache for raw compute density. By running our new Rust-based FL2 stack, we completely mitigated the latency penalty to unlock twice the performance. ]]></description>
            <content:encoded><![CDATA[ <p>Two years ago, Cloudflare deployed our <a href="https://blog.cloudflare.com/cloudflare-gen-12-server-bigger-better-cooler-in-a-2u1n-form-factor/"><u>12th Generation server fleet</u></a>, based on AMD EPYC™ Genoa-X processors with their massive 3D V-Cache. That cache-heavy architecture was a perfect match for FL1, our request handling layer at the time. But as we evaluated next-generation hardware, we faced a dilemma: the CPUs offering the biggest throughput gains came with a significant cache reduction. Our legacy software stack wasn't optimized for this, and the potential throughput benefits were being capped by increasing latency.</p><p>This blog describes how the <a href="https://blog.cloudflare.com/20-percent-internet-upgrade/"><u>FL2 transition</u></a>, our Rust-based rewrite of Cloudflare's core request handling layer, allowed us to prove Gen 13's full potential and unlock performance gains that would have been impossible on our previous stack. FL2 removes the dependency on the larger cache, allowing performance to scale with core count while maintaining our SLAs. Today, we are proud to announce the launch of Cloudflare's Gen 13 servers, built on 5th Gen AMD EPYC™ Turin processors and running FL2, capturing and scaling performance at the edge. </p>
    <div>
      <h2>What AMD EPYC™ Turin brings to the table</h2>
      <a href="#what-amd-epycturin-brings-to-the-table">
        
      </a>
    </div>
    <p><a href="https://www.amd.com/en/products/processors/server/epyc/9005-series.html"><u>AMD's EPYC™ 5th Generation Turin-based processors</u></a> deliver more than just a core count increase; the architecture improves multiple dimensions of what Cloudflare servers require.</p><ul><li><p><b>2x core count:</b> up to 192 cores versus Gen 12's 96 cores, with SMT providing 384 threads</p></li><li><p><b>Improved IPC:</b> Zen 5 delivers better instructions per cycle than Zen 4</p></li><li><p><b>Better power efficiency:</b> Despite the higher core count, Turin consumes up to 32% less power per core compared to Genoa-X</p></li><li><p><b>DDR5-6400 support:</b> Higher memory bandwidth to feed all those cores</p></li></ul><p>However, Turin's high-density OPNs make a deliberate tradeoff: prioritizing throughput over per-core cache. Our analysis across the Turin stack highlighted this shift. For example, comparing the highest-density Turin OPN to our Gen 12 Genoa-X processors reveals that Turin's 192 cores share 384MB of L3 cache. This leaves each core with access to just 2MB, one-sixth of Gen 12's allocation. For any workload that relies heavily on cache locality, which ours did, this reduction posed a serious challenge.</p><table><tr><td><p>Generation</p></td><td><p>Processor</p></td><td><p>Cores/Threads</p></td><td><p>L3 Cache/Core</p></td></tr><tr><td><p>Gen 12</p></td><td><p>AMD Genoa-X 9684X</p></td><td><p>96C/192T</p></td><td><p>12MB (3D V-Cache)</p></td></tr><tr><td><p>Gen 13 Option 1</p></td><td><p>AMD Turin 9755</p></td><td><p>128C/256T</p></td><td><p>4MB</p></td></tr><tr><td><p>Gen 13 Option 2</p></td><td><p>AMD Turin 9845</p></td><td><p>160C/320T</p></td><td><p>2MB</p></td></tr><tr><td><p>Gen 13 Option 3</p></td><td><p>AMD Turin 9965</p></td><td><p>192C/384T</p></td><td><p>2MB</p></td></tr></table>
    <div>
      <h2>Diagnosing the problem with performance counters</h2>
      <a href="#diagnosing-the-problem-with-performance-counters">
        
      </a>
    </div>
    <p>For FL1, our NGINX- and LuaJIT-based request handling layer, this cache reduction presented a significant challenge. But we didn't just assume it would be a problem; we measured it.</p><p>During the CPU evaluation phase for Gen 13, we collected CPU performance counters and profiling data using the <a href="https://docs.amd.com/r/en-US/68658-uProf-getting-started-guide/Identifying-Issues-Using-uProfPcm"><u>AMD uProf tool</u></a> to identify exactly what was happening under the hood. The data showed:</p><ul><li><p>L3 cache miss rates increased dramatically compared to Gen 12 servers equipped with 3D V-Cache processors</p></li><li><p>Memory fetch latency dominated request processing time as data that previously stayed in L3 now required trips to DRAM</p></li><li><p>The latency penalty scaled with utilization: as we pushed CPU usage higher, cache contention worsened</p></li></ul><p>L3 cache hits complete in roughly 50 cycles; L3 cache misses requiring DRAM access take 350+ cycles, close to an order of magnitude difference. With 6x less cache per core, FL1 on Gen 13 was hitting memory far more often, incurring latency penalties.</p>
    <div>
      <h2>The tradeoff: latency vs. throughput </h2>
      <a href="#the-tradeoff-latency-vs-throughput">
        
      </a>
    </div>
    <p>Our initial tests running FL1 on Gen 13 confirmed what the performance counters had already suggested. While the Turin processor could achieve higher throughput, it came at a steep latency cost.</p><table><tr><td><p>Metric</p></td><td><p>Gen 12 (FL1)</p></td><td><p>Gen 13 - AMD Turin 9755 (FL1)</p></td><td><p>Gen 13 - AMD Turin 9845 (FL1)</p></td><td><p>Gen 13 - AMD Turin 9965 (FL1)</p></td><td><p>Delta</p></td></tr><tr><td><p>Core count</p></td><td><p>baseline</p></td><td><p>+33%</p></td><td><p>+67%</p></td><td><p>+100%</p></td><td><p></p></td></tr><tr><td><p>FL throughput</p></td><td><p>baseline</p></td><td><p>+10%</p></td><td><p>+31%</p></td><td><p>+62%</p></td><td><p>Improvement</p></td></tr><tr><td><p>Latency at low to moderate CPU utilization</p></td><td><p>baseline</p></td><td><p>+10%</p></td><td><p>+30%</p></td><td><p>+30%</p></td><td><p>Regression</p></td></tr><tr><td><p>Latency at high CPU utilization</p></td><td><p>baseline</p></td><td><p>&gt; 20% </p></td><td><p>&gt; 50% </p></td><td><p>&gt; 50% </p></td><td><p>Unacceptable</p></td></tr></table><p>The Gen 13 evaluation server with the AMD Turin 9965, which generated a 62% throughput gain, was compelling, and the performance uplift provided the most improvement to Cloudflare’s total cost of ownership (TCO). </p><p>But a more than 50% latency penalty is not acceptable. The increase in request processing latency would directly impact customer experience. We faced a familiar infrastructure question: do we accept a solution with no TCO benefit, accept the increased latency tradeoff, or find a way to boost efficiency without adding latency?</p>
    <div>
      <h2>Incremental gains with performance tuning</h2>
      <a href="#incremental-gains-with-performance-tuning">
        
      </a>
    </div>
    <p>To find a path to an optimal outcome, we collaborated with AMD to analyze the Turin 9965 data and run targeted optimization experiments. We systematically tested multiple configurations:</p><ul><li><p><b>Hardware Tuning:</b> Adjusting hardware prefetchers and Data Fabric (DF) Probe Filters, which showed only marginal gains</p></li><li><p><b>Scaling Workers:</b> Launching more FL1 workers, which improved throughput but cannibalized resources from other production services</p></li><li><p><b>CPU Pinning &amp; Isolation:</b> Adjusting workload isolation configurations to find the optimal mix, with limited success</p></li></ul><p>The configuration that ultimately provided the most value was <b>AMD’s Platform Quality of Service (PQOS)</b>. PQOS extensions enable fine-grained regulation of shared resources like cache and memory bandwidth. Since Turin processors consist of one I/O Die and up to 12 Core Complex Dies (CCDs), each sharing an L3 cache across up to 16 cores, we put this to the test. Here is how the different experimental configurations performed.</p><p>First, we used PQOS to allocate a dedicated L3 cache share within a single CCD for FL1, but the gains were minimal. However, when we scaled the concept to the socket level, dedicating <i>entire</i> CCDs strictly to FL1, we saw meaningful throughput gains while keeping latency acceptable.</p><div>
<figure>
<table><colgroup><col></col><col></col><col></col><col></col></colgroup>
<tbody>
<tr>
<td>
<p><span><span>Configuration</span></span></p>
</td>
<td>
<p><span><span>Description</span></span></p>
</td>
<td>
<p><span><span>Illustration</span></span></p>
</td>
<td>
<p><span><span>Performance gain</span></span></p>
</td>
</tr>
<tr>
<td>
<p><span><span>NUMA-aware core affinity </span></span><br /><span><span>(equivalent to PQOS at socket level)</span></span></p>
</td>
<td>
<p><span><span>6 of the 12 CCDs (aligned with the NUMA domain) run FL.</span></span></p>
<p> </p>
<p><span><span>32MB L3 cache in each CCD shared among all cores. </span></span></p>
</td>
<td>
<p><span><span><img src="https://images.ctfassets.net/zkvhlag99gkb/4CBSHY02oIZOiENgFrzLSz/0c6c2ac8ef0096894ff4827e30d25851/image3.png" /></span></span></p>
</td>
<td>
<p><span><span>&gt;15% incremental </span></span></p>
<p><span><span>throughput gain</span></span></p>
</td>
</tr>
<tr>
<td>
<p><span><span>PQOS config 1</span></span></p>
</td>
<td>
<p><span><span>1 of the 2 vCPUs on each physical core in each CCD runs FL. </span></span></p>
<p> </p>
<p><span><span>FL gets 75% of the 32MB L3 cache of each CCD.</span></span></p>
</td>
<td>
<p><span><span><img src="https://images.ctfassets.net/zkvhlag99gkb/3iJo1BBRueQRy92R3aXbGx/596c3231fa0e66f20de70ea02615f9a7/image2.png" /></span></span></p>
</td>
<td>
<p><span><span>&lt; 5% incremental throughput gain</span></span></p>
<p> </p>
<p><span><span>Other services show minor signs of degradation</span></span></p>
</td>
</tr>
<tr>
<td>
<p><span><span>PQOS config 2</span></span></p>
</td>
<td>
<p><span><span>1 of the 2 vCPUs on each physical core in each CCD runs FL.</span></span></p>
<p> </p>
<p><span><span>FL gets 50% of the 32MB L3 cache of each CCD.</span></span></p>
</td>
<td>
<p><span><span><img src="https://images.ctfassets.net/zkvhlag99gkb/3iJo1BBRueQRy92R3aXbGx/596c3231fa0e66f20de70ea02615f9a7/image2.png" /></span></span></p>
</td>
<td>
<p><span><span>&lt; 5% incremental throughput gain</span></span></p>
</td>
</tr>
<tr>
<td>
<p><span><span>PQOS config 3</span></span></p>
</td>
<td>
<p><span><span>Both vCPUs on 50% of the physical cores in each CCD run FL. </span></span></p>
<p> </p>
<p><span><span>FL gets 50% of the 32MB L3 cache of each CCD.</span></span></p>
</td>
<td>
<p><span><span><img src="https://images.ctfassets.net/zkvhlag99gkb/7FKLfSxnSNUlXJCw8CJGzU/69c7b81b6cee5a2c7040ecc96748084b/image5.png" /></span></span></p>
</td>
<td>
<p><span><span>&lt; 5% incremental throughput gain</span></span></p>
</td>
</tr>
</tbody>
</table>
</figure>
</div>
    <div>
      <h2>The opportunity: FL2 was already in progress</h2>
      <a href="#the-opportunity-fl2-was-already-in-progress">
        
      </a>
    </div>
    <p>Hardware tuning and resource configuration provided modest gains, but to truly unlock the performance potential of the Gen 13 architecture, we knew we would have to rewrite our software stack to fundamentally change how it utilized system resources.</p><p>Fortunately, we weren't starting from scratch. As we <a href="https://blog.cloudflare.com/20-percent-internet-upgrade/"><u>announced during Birthday Week 2025</u></a>, we had already been rebuilding FL1 from the ground up. FL2 is a complete rewrite of our request handling layer in Rust, built on our <a href="https://blog.cloudflare.com/pingora-open-source/"><u>Pingora</u></a> and <a href="https://blog.cloudflare.com/introducing-oxy/"><u>Oxy</u></a> frameworks, replacing 15 years of NGINX and LuaJIT code.</p><p>The FL2 project wasn't initiated to solve the Gen 13 cache problem — it was driven by the need for better security (Rust's memory safety), faster development velocity (strict module system), and improved performance across the board (less CPU, less memory, modular execution).</p><p>FL2's cleaner architecture, with better memory access patterns and less dynamic allocation, might not depend on massive L3 caches the way FL1 did. This gave us an opportunity to use the FL2 transition to prove whether Gen 13's throughput gains could be realized without the latency penalty.</p>
    <div>
      <h2>Proving it out: FL2 on Gen 13</h2>
      <a href="#proving-it-out-fl2-on-gen-13">
        
      </a>
    </div>
    <p>As the FL2 rollout progressed, production metrics from our Gen 13 servers validated what we had hypothesized.</p><table><tr><td><p>Metric</p></td><td><p>Gen 13 AMD Turin 9965 (FL1)</p></td><td><p>Gen 13 AMD Turin 9965 (FL2)</p></td></tr><tr><td><p>FL requests per CPU%</p></td><td><p>baseline</p></td><td><p>50% higher</p></td></tr><tr><td><p>Latency penalty vs Gen 12</p></td><td><p>baseline</p></td><td><p>70% lower</p></td></tr><tr><td><p>Throughput vs Gen 12</p></td><td><p>62% higher</p></td><td><p>100% higher</p></td></tr></table><p>The out-of-the-box efficiency gains on our new FL2 stack were substantial, even before any system optimizations. FL2 slashed the latency penalty by 70%, allowing us to push Gen 13 to higher CPU utilization while strictly meeting our latency SLAs. Under FL1, this would have been impossible.</p><p>By effectively eliminating the cache bottleneck, FL2 enables our throughput to scale linearly with core count. The impact is undeniable on the high-density AMD Turin 9965: we achieved a 2x performance gain, unlocking the true potential of the hardware. With further system tuning, we expect to squeeze even more power out of our Gen 13 fleet.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1jV1q0n9PgmbbNzDl8E1J1/2ead24a20cc10836ba041f73a16f3883/image6.png" />
          </figure>
    <div>
      <h2>Generational improvement with Gen 13</h2>
      <a href="#generational-improvement-with-gen-13">
        
      </a>
    </div>
    <p>With FL2 unlocking the immense throughput of the high-core-count AMD Turin 9965, we have officially selected these processors for our Gen 13 deployment. Hardware qualification is complete, and Gen 13 servers are now shipping at scale to support our global rollout.</p>
    <div>
      <h3>Performance improvements</h3>
      <a href="#performance-improvements">
        
      </a>
    </div>
    <table><tr><td><p>
</p></td><td><p>Gen 12 </p></td><td><p>Gen 13 </p></td></tr><tr><td><p>Processor</p></td><td><p>AMD EPYC™ 4th Gen Genoa-X 9684X</p></td><td><p>AMD EPYC™ 5th Gen Turin 9965</p></td></tr><tr><td><p>Core count</p></td><td><p>96C/192T</p></td><td><p>192C/384T</p></td></tr><tr><td><p>FL throughput</p></td><td><p>baseline</p></td><td><p>Up to +100%</p></td></tr><tr><td><p>Performance per watt</p></td><td><p>baseline</p></td><td><p>Up to +50%</p></td></tr></table>
    <div>
      <h3>Gen 13 business impact</h3>
      <a href="#gen-13-business-impact">
        
      </a>
    </div>
    <p><b>Up to 2x throughput vs Gen 12 </b>for uncompromising customer experience: By doubling our throughput capacity while staying within our latency SLAs, we guarantee our applications remain fast and responsive, and able to absorb massive traffic spikes.</p><p><b>50% better performance/watt vs Gen 12 </b>for sustainable scaling: This gain in power efficiency not only reduces data center expansion costs, but allows us to process growing traffic with a vastly lower carbon footprint per request.</p><p><b>60% higher rack throughput vs Gen 12 </b>for global edge upgrades: Because we achieved this throughput density while keeping the rack power budget constant, we can seamlessly deploy this next generation compute anywhere in the world across our global edge network, delivering top tier performance exactly where our customers want it.</p>
    <div>
      <h2>Gen 13 + FL2: ready for the edge </h2>
      <a href="#gen-13-fl2-ready-for-the-edge">
        
      </a>
    </div>
    <p>Our legacy request handling layer, FL1, hit a cache contention wall on Gen 13, forcing an unacceptable tradeoff between throughput and latency. Instead of compromising, we built FL2. </p><p>Designed with a vastly leaner memory access pattern, FL2 removes our dependency on massive L3 caches and allows linear scaling with core count. Running on the Gen 13 AMD Turin platform, FL2 unlocks 2x the throughput and a 50% boost in power efficiency, all while keeping latency within our SLAs. This leap forward is a great reminder of the importance of hardware-software co-design. Unconstrained by cache limits, Gen 13 servers are now ready to be deployed to serve millions of requests across Cloudflare’s global network.</p><p>If you're excited about working on infrastructure at global scale, <a href="https://www.cloudflare.com/careers/jobs"><u>we're hiring</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[Hardware]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Infrastructure]]></category>
            <category><![CDATA[Rust]]></category>
            <category><![CDATA[AMD]]></category>
            <category><![CDATA[Engineering]]></category>
            <guid isPermaLink="false">4shbA7eyT2KredK7RJyizK</guid>
            <dc:creator>Syona Sarma</dc:creator>
            <dc:creator>JQ Lau</dc:creator>
            <dc:creator>Jesse Brandeburg</dc:creator>
        </item>
        <item>
            <title><![CDATA[We deserve a better streams API for JavaScript]]></title>
            <link>https://blog.cloudflare.com/a-better-web-streams-api/</link>
            <pubDate>Fri, 27 Feb 2026 06:00:00 GMT</pubDate>
            <description><![CDATA[ The Web streams API has become ubiquitous in JavaScript runtimes but was designed for a different era. Here's what a modern streaming API could (should?) look like. ]]></description>
            <content:encoded><![CDATA[ <p>Handling data in streams is fundamental to how we build applications. To make streaming work everywhere, the <a href="https://streams.spec.whatwg.org/"><u>WHATWG Streams Standard</u></a> (informally known as "Web streams") was designed to establish a common API to work across browsers and servers. It shipped in browsers, was adopted by Cloudflare Workers, Node.js, Deno, and Bun, and became the foundation for APIs like <a href="https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API"><u>fetch()</u></a>. It's a significant undertaking, and the people who designed it were solving hard problems with the constraints and tools they had at the time.</p><p>But after years of building on Web streams – implementing them in both Node.js and Cloudflare Workers, debugging production issues for customers and runtimes, and helping developers work through far too many common pitfalls – I've come to believe that the standard API has fundamental usability and performance issues that cannot be fixed easily with incremental improvements alone. The problems aren't bugs; they're consequences of design decisions that may have made sense a decade ago, but don't align with how JavaScript developers write code today.</p><p>This post explores some of the fundamental issues I see with Web streams and presents an alternative approach built around JavaScript language primitives that demonstrates something better is possible. </p><p>In benchmarks, this alternative can run anywhere from 2x to <i>120x</i> faster than Web streams in every runtime I've tested it on (including Cloudflare Workers, Node.js, Deno, Bun, and every major browser). The improvements are not due to clever optimizations, but fundamentally different design choices that more effectively leverage modern JavaScript language features. I'm not here to disparage the work that came before; I'm here to start a conversation about what can potentially come next.</p>
    <div>
      <h2>Where we're coming from</h2>
      <a href="#where-were-coming-from">
        
      </a>
    </div>
    <p>The Streams Standard was developed between 2014 and 2016 with an ambitious goal to provide "APIs for creating, composing, and consuming streams of data that map efficiently to low-level I/O primitives." Before Web streams, the web platform had no standard way to work with streaming data.</p><p>Node.js already had its own <a href="https://nodejs.org/api/stream.html"><u>streaming API</u></a> at the time that was ported to also work in browsers, but WHATWG chose not to use it as a starting point given that it is chartered to only consider the needs of Web browsers. Server-side runtimes only adopted Web streams later, after Cloudflare Workers and Deno each emerged with first-class Web streams support and cross-runtime compatibility became a priority.</p><p>The design of Web streams predates async iteration in JavaScript. The <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/for-await...of"><code><u>for await...of</u></code></a> syntax didn't land until <a href="https://262.ecma-international.org/9.0/"><u>ES2018</u></a>, two years after the Streams Standard was initially finalized. This timing meant the API couldn't initially leverage what would eventually become the idiomatic way to consume asynchronous sequences in JavaScript. Instead, the spec introduced its own reader/writer acquisition model, and that decision rippled through every aspect of the API.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3X0niHShBlgF4LlpWYB7eC/f0bbf35f12ecc98a3888e6e3835acf3a/1.png" />
          </figure>
    <div>
      <h4>Excessive ceremony for common operations</h4>
      <a href="#excessive-ceremony-for-common-operations">
        
      </a>
    </div>
    <p>The most common task with streams is reading them to completion. Here's what that looks like with Web streams:</p>
            <pre><code>// First, we acquire a reader that gives an exclusive lock
// on the stream...
const reader = stream.getReader();
const chunks = [];
try {
  // Second, we repeatedly call read and await on the returned
  // promise to either yield a chunk of data or indicate we're
  // done.
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    chunks.push(value);
  }
} finally {
  // Finally, we release the lock on the stream
  reader.releaseLock();
}</code></pre>
            <p>You might assume this pattern is inherent to streaming. It isn't. The reader acquisition, the lock management, and the <code>{ value, done }</code> protocol are all just design choices, not requirements. They are artifacts of how and when the Web streams spec was written: async iteration exists precisely to handle sequences that arrive over time, but it had not yet landed in the language when the spec was designed. The complexity here is pure API overhead, not fundamental necessity.</p><p>Consider the alternative approach now that Web streams do support <code>for await...of</code>:</p>
            <pre><code>const chunks = [];
for await (const chunk of stream) {
  chunks.push(chunk);
}</code></pre>
            <p>This is better in that there is far less boilerplate, but it doesn't solve everything. Async iteration was retrofitted onto an API that wasn't designed for it, and it shows. Features like <a href="https://developer.mozilla.org/en-US/docs/Web/API/ReadableStreamBYOBReader"><u>BYOB (bring your own buffer)</u></a> reads aren't accessible through iteration. The underlying complexity of readers, locks, and controllers is still there, just hidden. When something does go wrong, or when additional features of the API are needed, developers find themselves back in the weeds of the original API, trying to understand why their stream is "locked", why <code>releaseLock()</code> didn't do what they expected, or where the bottleneck is in code they don't control.</p>
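            <p>For the common "read it all over a byte stream" case, the iteration form also composes naturally into a small helper. Here is a sketch – the name <code>readAll</code> and the final concatenation step are my own, not part of any standard:</p>
            <pre><code>// Hypothetical helper: collect an entire byte stream into a
// single Uint8Array using async iteration.
async function readAll(stream) {
  const chunks = [];
  let total = 0;
  for await (const chunk of stream) {
    chunks.push(chunk);
    total += chunk.byteLength;
  }
  // Concatenate once at the end to avoid repeated copies.
  const result = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    result.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return result;
}</code></pre>
            <p>No readers, no locks, no <code>{ value, done }</code> protocol in sight – the iteration machinery manages all of that on the caller's behalf.</p>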
    <div>
      <h4>The locking problem</h4>
      <a href="#the-locking-problem">
        
      </a>
    </div>
    <p>Web streams use a locking model to prevent multiple consumers from interleaving reads. When you call <a href="https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream/getReader"><code><u>getReader()</u></code></a>, the stream becomes locked. While locked, nothing else can read from the stream directly, pipe it, or even cancel it – only the code that is actually holding the reader can.</p><p>This sounds reasonable until you see how easily it goes wrong:</p>
            <pre><code>async function peekFirstChunk(stream) {
  const reader = stream.getReader();
  const { value } = await reader.read();
  // Oops — forgot to call reader.releaseLock()
  // And the reader is no longer available when we return
  return value;
}

const first = await peekFirstChunk(stream);
// TypeError: Cannot obtain lock — stream is permanently locked
for await (const chunk of stream) { /* never runs */ }</code></pre>
            <p>Forgetting <a href="https://developer.mozilla.org/en-US/docs/Web/API/ReadableStreamDefaultReader/releaseLock"><code><u>releaseLock()</u></code></a> permanently breaks the stream. The <a href="https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream/locked"><code><u>locked</u></code></a> property tells you that a stream is locked, but not why, by whom, or whether the lock is even still usable. <a href="https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream/pipeTo"><u>Piping</u></a> internally acquires locks, making streams unusable during pipe operations in ways that aren't obvious.</p><p>The semantics around releasing locks with pending reads were also unclear for years. If you called <code>read()</code> but didn't await it, then called <code>releaseLock()</code>, what happened? The spec was recently clarified to cancel pending reads on lock release – but implementations varied, and code that relied on the previous unspecified behavior can break.</p><p>That said, it's important to recognize that locking in itself is not bad. It does, in fact, serve an important purpose: ensuring that applications consume and produce data in an orderly way. The key challenge is with the original manual implementation of it using APIs like <code>getReader()</code> and <code>releaseLock()</code>. With the arrival of async iteration and its automatic lock and reader management, dealing with locks became much easier from the user's point of view.</p><p>For implementers, the locking model adds a fair amount of non-trivial internal bookkeeping. Every operation must check lock state, readers must be tracked, and the interplay between locks, cancellation, and error states creates a matrix of edge cases that must all be handled correctly.</p>
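            <p>For completeness, the earlier peek helper becomes safe once the release is moved into a <code>finally</code> block. This is an illustrative sketch, not a standard API, and note that the peeked chunk is consumed rather than put back:</p>
            <pre><code>async function peekFirstChunk(stream) {
  const reader = stream.getReader();
  try {
    const { value } = await reader.read();
    return value;
  } finally {
    // Always release, even if read() throws, so the
    // stream is not left permanently locked.
    reader.releaseLock();
  }
}

// The stream remains usable afterwards (minus the peeked chunk).</code></pre>
            <p>It works, but the fact that a three-line helper needs <code>try/finally</code> discipline to avoid permanently wedging a stream is exactly the kind of footgun the locking model creates.</p>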
    <div>
      <h4>BYOB: complexity without payoff</h4>
      <a href="#byob-complexity-without-payoff">
        
      </a>
    </div>
    <p><a href="https://developer.mozilla.org/en-US/docs/Web/API/ReadableStreamBYOBReader"><u>BYOB (bring your own buffer)</u></a> reads were designed to let developers reuse memory buffers when reading from streams, an important optimization intended for high-throughput scenarios. The idea is sound: instead of allocating new buffers for each chunk, you provide your own buffer and the stream fills it.</p><p>In practice (and yes, there are always exceptions to be found), BYOB is rarely used to any measurable benefit. The API is substantially more complex than default reads, requiring a separate reader type (<code>ReadableStreamBYOBReader</code>) and other specialized classes (e.g. <code>ReadableStreamBYOBRequest</code>), careful buffer lifecycle management, and understanding of <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer#transferring_arraybuffers"><code><u>ArrayBuffer</u></code><u> detachment</u></a> semantics. When you pass a buffer to a BYOB read, the buffer becomes detached – transferred to the stream – and you get back a different view over potentially different memory. This transfer-based model is error-prone and confusing:</p>
            <pre><code>const reader = stream.getReader({ mode: 'byob' });
const buffer = new ArrayBuffer(1024);
let view = new Uint8Array(buffer);

const result = await reader.read(view);
// 'view' should now be detached and unusable
// (it isn't always in every impl)
// result.value is a NEW view, possibly over different memory
view = result.value; // Must reassign</code></pre>
            <p>BYOB also can't be used with async iteration or TransformStreams, so developers who want zero-copy reads are forced back into the manual reader loop.</p><p>For implementers, BYOB adds significant complexity. The stream must track pending BYOB requests, handle partial fills, manage buffer detachment correctly, and coordinate between the BYOB reader and the underlying source. The <a href="https://github.com/web-platform-tests/wpt/tree/master/streams/readable-byte-streams"><u>Web Platform Tests for readable byte streams</u></a> include dedicated test files just for BYOB edge cases: detached buffers, bad views, response-after-enqueue ordering, and more.</p><p>BYOB ends up being complex for both users and implementers, yet sees little adoption in practice. Most developers stick with default reads and accept the allocation overhead.</p><p>Most userland implementations of custom ReadableStream sources do not bother with all the ceremony required to correctly implement both default and BYOB read support in a single stream – and for good reason. It's difficult to get right, and most of the time, consuming code falls back on the default read path anyway. The example below shows what a "correct" implementation would need to do. It's big, complex, and error-prone – not a level of complexity that the typical developer wants to deal with:</p>
            <pre><code>// Assumes `offset` (current read position) and `totalBytes`
// are defined in the enclosing scope.
new ReadableStream({
    type: 'bytes',
    
    async pull(controller: ReadableByteStreamController) {      
      if (offset &gt;= totalBytes) {
        controller.close();
        return;
      }
      
      // Check for BYOB request FIRST
      const byobRequest = controller.byobRequest;
      
      if (byobRequest) {
        // === BYOB PATH ===
        // Consumer provided a buffer - we MUST fill it (or part of it)
        const view = byobRequest.view!;
        const bytesAvailable = totalBytes - offset;
        const bytesToWrite = Math.min(view.byteLength, bytesAvailable);
        
        // Create a view into the consumer's buffer and fill it
        // not critical but safer when bytesToWrite != view.byteLength
        const dest = new Uint8Array(
          view.buffer,
          view.byteOffset,
          bytesToWrite
        );
        
        // Fill with sequential bytes (our "data source")
        // Can be anything here that writes into the view
        for (let i = 0; i &lt; bytesToWrite; i++) {
          dest[i] = (offset + i) &amp; 0xFF;
        }
        
        offset += bytesToWrite;
        
        // Signal how many bytes we wrote
        byobRequest.respond(bytesToWrite);
        
      } else {
        // === DEFAULT READER PATH ===
        // No BYOB request - allocate and enqueue a chunk
        const bytesAvailable = totalBytes - offset;
        const chunkSize = Math.min(1024, bytesAvailable);
        
        const chunk = new Uint8Array(chunkSize);
        for (let i = 0; i &lt; chunkSize; i++) {
          chunk[i] = (offset + i) &amp; 0xFF;
        }
        
        offset += chunkSize;
        controller.enqueue(chunk);
      }
    },
    
    cancel(reason) {
      console.log('Stream canceled:', reason);
    }
  });</code></pre>
            <p>When a host runtime provides a byte-oriented ReadableStream itself – for instance, as the <code>body</code> of a fetch <code>Response</code> – it is often far easier for the runtime to supply an optimized implementation of BYOB reads. But even those implementations must be capable of handling both default and BYOB reading patterns, and that requirement brings with it a fair amount of complexity.</p>
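<p>To make the consumer side concrete, here is a sketch of the manual BYOB reader loop that zero-copy reads require. The <code>makeByteSource</code> helper is illustrative only, not a platform API; the interesting part is how the single buffer is handed back and forth across reads:</p>

```javascript
// Illustrative byte source: emits `totalBytes` of data in 512-byte chunks.
function makeByteSource(totalBytes) {
  let offset = 0;
  return new ReadableStream({
    type: 'bytes',
    pull(controller) {
      if (offset >= totalBytes) return controller.close();
      const chunk = new Uint8Array(Math.min(512, totalBytes - offset));
      offset += chunk.byteLength;
      controller.enqueue(chunk);
    }
  });
}

// The manual BYOB loop: one 1 KiB buffer, reused for every read.
async function readAllByob(stream) {
  const reader = stream.getReader({ mode: 'byob' });
  let buffer = new ArrayBuffer(1024);
  let total = 0;
  while (true) {
    // read() detaches the buffer we pass in; the returned view wraps
    // the transferred memory, so we recover it for the next iteration.
    const { value, done } = await reader.read(new Uint8Array(buffer));
    if (done) break;
    total += value.byteLength;
    buffer = value.buffer;
  }
  return total;
}
```

Forgetting to recover <code>value.buffer</code> is a common mistake: the reference you passed in is detached after every read.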
    <div>
      <h4>Backpressure: good in theory, broken in practice</h4>
      <a href="#backpressure-good-in-theory-broken-in-practice">
        
      </a>
    </div>
    <p>Backpressure – the ability for a slow consumer to signal a fast producer to slow down – is a first-class concept in Web streams. In theory. In practice, the model has some serious flaws.</p><p>The primary signal is <a href="https://developer.mozilla.org/en-US/docs/Web/API/ReadableStreamDefaultController/desiredSize"><code><u>desiredSize</u></code></a> on the controller. It can be positive (wants data), zero (at capacity), negative (over capacity), or null (closed). Producers are supposed to check this value and stop enqueueing when it's not positive. But there's nothing enforcing this: <a href="https://developer.mozilla.org/en-US/docs/Web/API/ReadableStreamDefaultController/enqueue"><code><u>controller.enqueue()</u></code></a> always succeeds, even when desiredSize is deeply negative.</p>
            <pre><code>new ReadableStream({
  start(controller) {
    // Nothing stops you from doing this
    while (true) {
      controller.enqueue(generateData()); // desiredSize: -999999
    }
  }
});</code></pre>
            <p>Stream implementations can and do ignore backpressure, and some spec-defined features explicitly break it. <a href="https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream/tee"><code><u>tee()</u></code></a>, for instance, creates two branches from a single stream. If one branch reads faster than the other, data accumulates in an internal buffer with no limit. A fast consumer can cause unbounded memory growth while the slow consumer catches up, and there's no way to configure this or opt out beyond canceling the slower branch.</p><p>Web streams do provide clear mechanisms for tuning backpressure behavior in the form of the <code>highWaterMark</code> option and customizable size calculations, but these are just as easy to ignore as <code>desiredSize</code>, and many applications simply fail to pay attention to them.</p><p>The same issues exist on the <code>WritableStream</code> side. A <code>WritableStream</code> has a <code>highWaterMark</code> and <code>desiredSize</code>. There is a <code>writer.ready</code> promise that producers of data are supposed to pay attention to but often don't.</p>
            <pre><code>const writable = getWritableStreamSomehow();
const writer = writable.getWriter();

// Producers are supposed to wait for writer.ready.
// It is a promise that, when it resolves, indicates that
// the writable's internal backpressure is cleared and
// it is ok to write more data.
await writer.ready;
await writer.write(...);</code></pre>
            <p>For implementers, backpressure adds complexity without providing guarantees. The machinery to track queue sizes, compute <code>desiredSize</code>, and invoke <code>pull()</code> at the right times must all be implemented correctly. However, since these signals are advisory, all that work doesn't actually prevent the problems backpressure is supposed to solve.</p>
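<p>For contrast, a purely pull-based source cooperates with backpressure automatically, because the runtime only invokes <code>pull()</code> while <code>desiredSize</code> is positive. A minimal sketch:</p>

```javascript
// A pull-based source never outruns its consumer: pull() is only
// invoked while desiredSize > 0.
function countingSource(limit) {
  let n = 0;
  return new ReadableStream({
    pull(controller) {
      if (n >= limit) return controller.close();
      controller.enqueue(n++);
    }
  }, { highWaterMark: 1 }); // buffer at most one chunk ahead
}

async function collect(stream) {
  const out = [];
  for await (const value of stream) out.push(value);
  return out;
}
```

With <code>highWaterMark: 1</code>, the queue never holds more than one unread chunk, no matter how slow the consumer is.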
    <div>
      <h4>The hidden cost of promises</h4>
      <a href="#the-hidden-cost-of-promises">
        
      </a>
    </div>
    <p>The Web streams spec requires promise creation at numerous points, often in hot paths and often invisible to users. Each <code>read()</code> call doesn't just return a promise; internally, the implementation creates additional promises for queue management, <code>pull()</code> coordination, and backpressure signaling.</p><p>This overhead is mandated by the spec's reliance on promises for buffer management, completion, and backpressure signals. While some of it is implementation-specific, much of it is unavoidable if you're following the spec as written. For high-frequency streaming – video frames, network packets, real-time data – this overhead is significant.</p><p>The problem compounds in pipelines. Each <code>TransformStream</code> adds another layer of promise machinery between source and sink. The spec doesn't define synchronous fast paths, so even when data is available immediately, the promise machinery still runs.</p><p>For implementers, this promise-heavy design constrains optimization opportunities. The spec mandates specific promise resolution ordering, making it difficult to batch operations or skip unnecessary async boundaries without risking subtle compliance failures. There are many hidden internal optimizations that implementers do make but these can be complicated and difficult to get right.</p><p>While I was writing this blog post, Vercel's Malte Ubl published their own <a href="https://vercel.com/blog/we-ralph-wiggumed-webstreams-to-make-them-10x-faster"><u>blog post</u></a> describing some research work Vercel has been doing around improving the performance of Node.js' Web streams implementation. In that post they discuss the same fundamental performance optimization problem that every implementation of Web streams face:</p><blockquote><p>"Or consider pipeTo(). Each chunk passes through a full Promise chain: read, write, check backpressure, repeat. An {value, done} result object is allocated per read. 
Error propagation creates additional Promise branches.</p><p>None of this is wrong. These guarantees matter in the browser where streams cross security boundaries, where cancellation semantics need to be airtight, where you do not control both ends of a pipe. But on the server, when you are piping React Server Components through three transforms at 1KB chunks, the cost adds up.</p><p>We benchmarked native WebStream pipeThrough at 630 MB/s for 1KB chunks. Node.js pipeline() with the same passthrough transform: ~7,900 MB/s. That is a 12x gap, and the difference is almost entirely Promise and object allocation overhead." 
- Malte Ubl, <a href="https://vercel.com/blog/we-ralph-wiggumed-webstreams-to-make-them-10x-faster"><u>https://vercel.com/blog/we-ralph-wiggumed-webstreams-to-make-them-10x-faster</u></a></p></blockquote><p>As part of their research, they have put together a set of proposed improvements for Node.js' Web streams implementation that eliminate promises in certain code paths, yielding speedups of up to 10x – which only proves the point: promises, while useful, add significant overhead. As one of the core maintainers of Node.js, I am looking forward to helping Malte and the folks at Vercel get their proposed improvements landed!</p><p>In a recent update to Cloudflare Workers, I made similar modifications to an internal data pipeline, reducing the number of JavaScript promises created in certain application scenarios by up to 200x. The result is a performance improvement of several orders of magnitude in those applications.</p>
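<p>One of the per-chunk costs mentioned in the quote above is easy to observe directly: every <code>read()</code> resolves with a freshly allocated <code>{ value, done }</code> result object, garbage the moment you've destructured it. A small sketch:</p>

```javascript
// Each read() call allocates a new { value, done } result object —
// one of the per-chunk costs the benchmarks above are measuring.
async function compareResults() {
  const stream = new ReadableStream({
    start(controller) {
      controller.enqueue('a');
      controller.enqueue('b');
      controller.close();
    }
  });
  const reader = stream.getReader();
  const first = await reader.read();
  const second = await reader.read();
  return first !== second; // distinct objects on every call
}
```

Multiply that allocation by every chunk in every pipeline stage and the GC pressure described below follows naturally.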
    <div>
      <h3>Real-world failures</h3>
      <a href="#real-world-failures">
        
      </a>
    </div>
    
    <div>
      <h4>Exhausting resources with unconsumed bodies</h4>
      <a href="#exhausting-resources-with-unconsumed-bodies">
        
      </a>
    </div>
    <p>When <code>fetch()</code> returns a response, the body is a <a href="https://developer.mozilla.org/en-US/docs/Web/API/Response/body"><code><u>ReadableStream</u></code></a>. If you only check the status and don't consume or cancel the body, what happens? The answer varies by implementation, but a common outcome is resource leakage.</p>
            <pre><code>async function checkEndpoint(url) {
  const response = await fetch(url);
  return response.ok; // Body is never consumed or cancelled
}

// In a loop, this can exhaust connection pools
for (const url of urls) {
  await checkEndpoint(url);
}</code></pre>
            <p>This pattern has caused connection pool exhaustion in Node.js applications using <a href="https://nodejs.org/api/globals.html#fetch"><u>undici</u></a> (the <code>fetch()</code> implementation built into Node.js), and similar issues have appeared in other runtimes. The stream holds a reference to the underlying connection, and without explicit consumption or cancellation, the connection may linger until garbage collection – which may not happen soon enough under load.</p><p>The problem is compounded by APIs that implicitly create stream branches. <a href="https://developer.mozilla.org/en-US/docs/Web/API/Request/clone"><code><u>Request.clone()</u></code></a> and <a href="https://developer.mozilla.org/en-US/docs/Web/API/Response/clone"><code><u>Response.clone()</u></code></a> perform implicit <code>tee()</code> operations on the body stream – a detail that's easy to miss. Code that clones a request for logging or retry logic may unknowingly create branched streams that need independent consumption, multiplying the resource management burden.</p><p>Now, to be clear, these types of issues <i>are</i> implementation bugs. The connection leak was definitely something that undici needed to fix in its own implementation, but the complexity of the specification does not make dealing with these types of issues easy.</p><blockquote><p>"Cloning streams in Node.js's fetch() implementation is harder than it looks. When you clone a request or response body, you're calling tee() - which splits a single stream into two branches that both need to be consumed. If one consumer reads faster than the other, data buffers unbounded in memory waiting for the slow branch. If you don't properly consume both branches, the underlying connection leaks. The coordination required between two readers sharing one source makes it easy to accidentally break the original request or exhaust connection pools. 
It's a simple API call with complex underlying mechanics that are difficult to get right." - Matteo Collina, Ph.D. - Platformatic Co-Founder &amp; CTO, Node.js Technical Steering Committee Chair</p></blockquote>
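<p>The mitigation for the health-check pattern is straightforward, though easy to forget: explicitly cancel bodies you never intend to read. A sketch (error handling omitted for brevity):</p>

```javascript
// Cancelling the unread body lets the runtime release (or reuse) the
// underlying connection immediately instead of waiting for GC.
async function checkEndpoint(url) {
  const response = await fetch(url);
  await response.body?.cancel(); // release the connection promptly
  return response.ok;
}
```

The optional chaining matters: some responses (e.g. <code>204 No Content</code>) have a <code>null</code> body.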
    <div>
      <h4>Falling headlong off the tee() memory cliff</h4>
      <a href="#falling-headlong-off-the-tee-memory-cliff">
        
      </a>
    </div>
    <p><a href="https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream/tee"><code><u>tee()</u></code></a> splits a stream into two branches. It seems straightforward, but the implementation requires buffering: if one branch is read faster than the other, the data must be held somewhere until the slower branch catches up.</p>
            <pre><code>const [forHash, forStorage] = response.body.tee();

// Hash computation is fast
const hash = await computeHash(forHash);

// Storage write is slow — meanwhile, the entire stream
// may be buffered in memory waiting for this branch
await writeToStorage(forStorage);</code></pre>
            <p>The spec does not mandate buffer limits for <code>tee()</code>. And to be fair, the spec allows implementations to realize the internal mechanics of <code>tee()</code> and other APIs in any way they see fit, so long as the observable normative requirements of the specification are met. But if an implementation chooses to implement <code>tee()</code> in the specific way described by the streams specification, then <code>tee()</code> will come with a built-in memory management issue that is difficult to work around.</p><p>Implementations have had to develop their own strategies for dealing with this. Firefox initially used a linked-list approach that led to <code>O(n)</code> memory growth proportional to the consumption rate difference. In Cloudflare Workers, we opted to implement a shared buffer model where backpressure is signaled by the slowest consumer rather than the fastest.</p>
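<p>The buffering behavior is easy to demonstrate. In this sketch (chunk counts are illustrative), one branch is drained completely while the other is never read, so every chunk the fast branch consumed must sit in the slow branch's internal queue until someone finally drains it:</p>

```javascript
async function teeDemo() {
  const source = new ReadableStream({
    start(controller) {
      for (let i = 0; i < 100; i++) controller.enqueue(new Uint8Array(1024));
      controller.close();
    }
  });
  const [fast, slow] = source.tee();

  // Drain the fast branch while the slow branch is untouched. tee()
  // must retain every chunk for the slow branch, so roughly
  // 100 * 1024 bytes are now sitting in its internal queue.
  for await (const _ of fast) { /* fast consumer */ }

  // Only now drain the slow branch and count what was buffered.
  let buffered = 0;
  for await (const chunk of slow) buffered += chunk.byteLength;
  return buffered;
}
```

Scale the chunk count up and this becomes the memory cliff: nothing in the API limits how much the slower branch accumulates.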
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5cl4vqYfaHaVXiHjLSXv0a/03a0b9fe4c9c0594e181ffee43b63998/2.png" />
          </figure>
    <div>
      <h4>Transform backpressure gaps</h4>
      <a href="#transform-backpressure-gaps">
        
      </a>
    </div>
    <p><code>TransformStream</code> creates a <code>readable/writable</code> pair with processing logic in between. The <code>transform()</code> function executes on <i>write</i>, not on read. Processing of the transform happens eagerly as data arrives, regardless of whether any consumer is ready. This causes unnecessary work when consumers are slow, and the backpressure signaling between the two sides has gaps that can cause unbounded buffering under load. The expectation in the spec is that the producer of the data being transformed is paying attention to the <code>writer.ready</code> signal on the writable side of the transform, but quite often producers simply ignore it.</p><p>If the transform's <code>transform()</code> operation is synchronous and always enqueues output immediately, it never signals backpressure back to the writable side even when the downstream consumer is slow. This is a consequence of the spec design that many developers completely overlook. In browsers, where there's only a single user and typically only a small number of stream pipelines active at any given time, this type of foot gun is often of no consequence, but it has a major impact on server-side or edge performance in runtimes that serve thousands of concurrent requests.</p>
            <pre><code>const fastTransform = new TransformStream({
  transform(chunk, controller) {
    // Synchronously enqueue — this never applies backpressure
    // Even if the readable side's buffer is full, this succeeds
    controller.enqueue(processChunk(chunk));
  }
});

// Pipe a fast source through the transform to a slow sink
fastSource
  .pipeThrough(fastTransform)
  .pipeTo(slowSink);  // Buffer grows without bound</code></pre>
            <p>What TransformStreams are supposed to do is check for backpressure on the controller and use promises to communicate that back to the writer:</p>
            <pre><code>const fastTransform = new TransformStream({
  async transform(chunk, controller) {
    if (controller.desiredSize &lt;= 0) {
      // Wait on the backpressure to clear somehow
    }

    controller.enqueue(processChunk(chunk));
  }
});</code></pre>
            <p>A difficulty here, however, is that the <code>TransformStreamDefaultController</code> does not have a ready promise mechanism like Writers do, so the <code>TransformStream</code> implementation would need to implement a polling mechanism to periodically check when <code>controller.desiredSize</code> becomes positive again.</p><p>The problem gets worse in pipelines. When you chain multiple transforms – say, parse, transform, then serialize – each <code>TransformStream</code> has its own internal readable and writable buffers. If implementers follow the spec strictly, data cascades through these buffers in a push-oriented fashion: the source pushes to transform A, which pushes to transform B, which pushes to transform C, each accumulating data in intermediate buffers before the final consumer has even started pulling. With three transforms, you can have six internal buffers filling up simultaneously.</p><p>Developers using the streams API are expected to remember to use options like <code>highWaterMark</code> when creating their sources, transforms, and writable destinations, but often they either forget or simply choose to ignore them.</p>
            <pre><code>source
  .pipeThrough(parse)      // buffers filling...
  .pipeThrough(transform)  // more buffers filling...
  .pipeThrough(serialize)  // even more buffers...
  .pipeTo(destination);    // consumer hasn't started yet</code></pre>
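<p>The polling workaround described above might look like the following sketch. Note the non-default <code>highWaterMark</code> on the readable side: with the spec's default of 0, <code>desiredSize</code> never goes positive and the poll loop would spin forever:</p>

```javascript
// Sketch: TransformStreamDefaultController has no `ready` promise, so
// the only spec-level option is to poll desiredSize until it recovers.
const backpressureAwareTransform = new TransformStream({
  async transform(chunk, controller) {
    // desiredSize is null once the readable side errors or closes.
    while (controller.desiredSize !== null && controller.desiredSize <= 0) {
      await new Promise(resolve => setTimeout(resolve, 5)); // poll
    }
    controller.enqueue(chunk);
  }
}, undefined, { highWaterMark: 1 }); // readable side: default is 0!
```

Polling with timers is exactly the kind of workaround a streaming API shouldn't require, which is rather the point.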
            <p>Implementations have found ways to optimize transform pipelines by collapsing identity transforms, short-circuiting non-observable paths, deferring buffer allocation, or falling back to native code that does not run JavaScript at all. Deno, Bun, and Cloudflare Workers have all successfully implemented "native path" optimizations that can help eliminate much of the overhead, and Vercel's recent <a href="https://vercel.com/blog/we-ralph-wiggumed-webstreams-to-make-them-10x-faster"><u>fast-webstreams</u></a> research is working on similar optimizations for Node.js. But the optimizations themselves add significant complexity and still can't fully escape the inherently push-oriented model that TransformStream uses.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/64FcAUPYrTvOSYOPoT2FkR/cc91e0d32dd47320e8ac9d6f431a2fda/3.png" />
          </figure>
    <div>
      <h4>GC thrashing in server-side rendering</h4>
      <a href="#gc-thrashing-in-server-side-rendering">
        
      </a>
    </div>
    <p>Streaming server-side rendering (SSR) is a particularly painful case. A typical SSR stream might render thousands of small HTML fragments, each passing through the streams machinery:</p>
            <pre><code>// Each component enqueues a small chunk
function renderComponent(controller, component) {
  controller.enqueue(encoder.encode(`&lt;div&gt;${component}&lt;/div&gt;`));
}

// Hundreds of components = hundreds of enqueue calls
// Each one triggers promise machinery internally
for (const component of components) {
  renderComponent(controller, component);  // Promises created, objects allocated
}</code></pre>
            <p>Every fragment means promises created for <code>read()</code> calls, promises for backpressure coordination, intermediate buffer allocations, and <code>{ value, done } </code>result objects – most of which become garbage almost immediately.</p><p>Under load, this creates GC pressure that can devastate throughput. The JavaScript engine spends significant time collecting short-lived objects instead of doing useful work. Latency becomes unpredictable as GC pauses interrupt request handling. I've seen SSR workloads where garbage collection accounts for a substantial portion (up to and beyond 50%) of total CPU time per request. That's time that could be spent actually rendering content.</p><p>The irony is that streaming SSR is supposed to improve performance by sending content incrementally. But the overhead of the streams machinery can negate those gains, especially for pages with many small components. Developers sometimes find that buffering the entire response is actually faster than streaming through Web streams, defeating the purpose entirely.</p>
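<p>A common mitigation – not a fix for the underlying machinery, just a way to amortize it – is to coalesce small fragments and enqueue fewer, larger chunks. A sketch (the helper and its threshold are illustrative):</p>

```javascript
// Buffer small HTML fragments and flush merged chunks, so the per-enqueue
// promise/object churn is paid once per ~8 KiB rather than once per <div>.
function createBatchingWriter(controller, threshold = 8 * 1024) {
  const encoder = new TextEncoder();
  let parts = [];
  let size = 0;
  return {
    push(html) {
      const bytes = encoder.encode(html);
      parts.push(bytes);
      size += bytes.byteLength;
      if (size >= threshold) this.flush();
    },
    flush() {
      if (size === 0) return;
      const merged = new Uint8Array(size);
      let offset = 0;
      for (const p of parts) { merged.set(p, offset); offset += p.byteLength; }
      controller.enqueue(merged);
      parts = [];
      size = 0;
    }
  };
}
```

Callers push fragments as they render and call <code>flush()</code> once at the end; fewer enqueues means fewer short-lived objects for the GC to chase.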
    <div>
      <h3>The optimization treadmill</h3>
      <a href="#the-optimization-treadmill">
        
      </a>
    </div>
    <p>To achieve usable performance, every major runtime has resorted to non-standard internal optimizations for Web streams. Node.js, Deno, Bun, and Cloudflare Workers have all developed their own workarounds. This is particularly true for streams wired up to system-level I/O, where much of the machinery is non-observable and can be short-circuited.</p><p>Finding these optimization opportunities can itself be a significant undertaking. It requires end-to-end understanding of the spec to identify which behaviors are observable and which can safely be elided. Even then, whether a given optimization is actually spec-compliant is often unclear. Implementers must make judgment calls about which semantics they can relax without breaking compatibility. This puts enormous pressure on runtime teams to become spec experts just to achieve acceptable performance.</p><p>These optimizations are difficult to implement, frequently error-prone, and lead to inconsistent behavior across runtimes. Bun's "<a href="https://bun.sh/docs/api/streams#direct-readablestream"><u>Direct Streams</u></a>" optimization takes a deliberately and observably non-standard approach, bypassing much of the spec's machinery entirely. Cloudflare Workers' <a href="https://developers.cloudflare.com/workers/runtime-apis/streams/transformstream/"><code><u>IdentityTransformStream</u></code></a> provides a fast-path for pass-through transforms but is Workers-specific and implements behaviors that are not standard for a <code>TransformStream</code>. Each runtime has its own set of tricks and the natural tendency is toward non-standard solutions, because that's often the only way to make things fast.</p><p>This fragmentation hurts portability. Code that performs well on one runtime may behave differently (or poorly) on another, even though it's using "standard" APIs. 
The complexity burden on runtime implementers is substantial, and the subtle behavioral differences create friction for developers trying to write cross-runtime code, particularly those maintaining frameworks that must be able to run efficiently across many runtime environments.</p><p>It is also necessary to emphasize that many optimizations are only possible in parts of the spec that are unobservable to user code. The alternative, as with Bun's "Direct Streams", is to intentionally diverge from the spec-defined observable behaviors. This means optimizations often feel "incomplete". They work in some scenarios but not in others, in some runtimes but not others, etc. Every such case adds to the overall unsustainable complexity of the Web streams approach, which is why most runtime implementers rarely put significant effort into further improvements to their streams implementations once the conformance tests are passing.</p><p>Implementers shouldn't need to jump through these hoops. When you find yourself needing to relax or bypass spec semantics just to achieve reasonable performance, that's a sign something is wrong with the spec itself. A well-designed streaming API should be efficient by default, not require each runtime to invent its own escape hatches.</p>
    <div>
      <h3>The compliance burden</h3>
      <a href="#the-compliance-burden">
        
      </a>
    </div>
    <p>A complex spec creates complex edge cases. The <a href="https://github.com/web-platform-tests/wpt/tree/master/streams"><u>Web Platform Tests for streams</u></a> span over 70 test files, and while comprehensive testing is a good thing, what's telling is what needs to be tested.</p><p>Consider some of the more obscure tests that implementations must pass:</p><ul><li><p>Prototype pollution defense: One test patches <code>Object.prototype.then</code> to intercept promise resolutions, then verifies that <code>pipeTo()</code> and <code>tee()</code> operations don't leak internal values through the prototype chain. This tests a security property that only exists because the spec's promise-heavy internals create an attack surface.</p></li><li><p>WebAssembly memory rejection: BYOB reads must explicitly reject ArrayBuffers backed by WebAssembly memory, which look like regular buffers but can't be transferred. This edge case exists because of the spec's buffer detachment model – a simpler API wouldn't need to handle it.</p></li><li><p>Crash regression for state machine conflicts: A test specifically checks that calling <code>byobRequest.respond()</code> after <code>enqueue()</code> doesn't crash the runtime. This sequence creates a conflict in the internal state machine – the <code>enqueue()</code> fulfills the pending read and should invalidate the <code>byobRequest</code>, but implementations must gracefully handle the subsequent <code>respond()</code> rather than corrupting memory, to cover the very likely possibility that developers are not using the complex API correctly.</p></li></ul><p>These aren't contrived scenarios invented by test authors in a vacuum. They're consequences of the spec's design and reflect real-world bugs.</p><p>For runtime implementers, passing the WPT suite means handling intricate corner cases that most application code will never encounter. 
The tests encode not just the happy path but the full matrix of interactions between readers, writers, controllers, queues, strategies, and the promise machinery that connects them all.</p><p>A simpler API would mean fewer concepts, fewer interactions between concepts, and fewer edge cases to get right resulting in more confidence that implementations actually behave consistently.</p>
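<p>To make the first of those bullets concrete, here is a sketch of the prototype-pollution hazard the tests guard against: patching <code>Object.prototype.then</code> turns every plain object into a thenable, so any value the spec's internals resolve a promise with can be observed by attacker code:</p>

```javascript
// Sketch of the Object.prototype.then pollution attack. Any plain object
// resolved through a promise becomes observable to the patched hook —
// which is why spec internals must avoid leaking values this way.
async function demo() {
  const intercepted = [];
  Object.prototype.then = function (resolve) {
    intercepted.push(this);          // attacker observes the value
    delete Object.prototype.then;    // un-patch so resolution proceeds
    resolve(this);
  };
  const value = await Promise.resolve({ secret: 42 });
  return { intercepted, value };
}
```

A streams implementation that routes internal objects through ordinary promise resolution would hand every one of them to this hook.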
    <div>
      <h3>The takeaway</h3>
      <a href="#the-takeaway">
        
      </a>
    </div>
    <p>Web streams are complex for users and implementers alike. The problems with the spec aren't bugs. They emerge from using the API exactly as designed. They aren't issues that can be fixed solely through incremental improvements. They're consequences of fundamental design choices. To improve things we need different foundations.</p>
    <div>
      <h2>A better streams API is possible</h2>
      <a href="#a-better-streams-api-is-possible">
        
      </a>
    </div>
    <p>After implementing the Web streams spec multiple times across different runtimes and seeing the pain points firsthand, I decided it was time to explore what a better, alternative streaming API could look like if designed from first principles today.</p><p>What follows is a proof of concept: it's not a finished standard, not a production-ready library, not even necessarily a concrete proposal for something new, but a starting point for discussion that demonstrates the problems with Web streams aren't inherent to streaming itself; they're consequences of specific design choices that could be made differently. Whether this exact API is the right answer is less important than whether it sparks a productive conversation about what we actually need from a streaming primitive.</p>
    <div>
      <h3>What is a stream?</h3>
      <a href="#what-is-a-stream">
        
      </a>
    </div>
    <p>Before diving into API design, it's worth asking: what is a stream?</p><p>At its core, a stream is just a sequence of data that arrives over time. You don't have all of it at once. You process it incrementally as it becomes available.</p><p>Unix pipes are perhaps the purest expression of this idea:</p>
            <pre><code>cat access.log | grep "error" | sort | uniq -c</code></pre>
            <p>
Data flows left to right. Each stage reads input, does its work, writes output. There's no pipe reader to acquire, no controller lock to manage. If a downstream stage is slow, upstream stages naturally slow down as well. Backpressure is implicit in the model, not a separate mechanism to learn (or ignore).</p><p>In JavaScript, the natural primitive for "a sequence of things that arrive over time" is already in the language: the async iterable. You consume it with <code>for await...of</code>. You stop consuming by stopping iteration.</p><p>This is the intuition the new API tries to preserve: streams should feel like iteration, because that's what they are. The complexity of Web streams – readers, writers, controllers, locks, queuing strategies – obscures this fundamental simplicity. A better API should make the simple case simple and only add complexity where it's genuinely needed.</p>
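<p>The Unix pipeline above maps almost directly onto composed async generators. This sketch (the stage names are illustrative) shows the same left-to-right flow, with backpressure implicit: each stage produces a value only when the next one pulls:</p>

```javascript
// A Unix-style pipeline as composed async generators. No readers,
// no locks, no controllers — each stage pulls from the previous one.
async function* log() {
  yield 'ok: started';
  yield 'error: disk full';
  yield 'error: timeout';
}

async function* grep(pattern, source) {
  for await (const line of source) {
    if (line.includes(pattern)) yield line;
  }
}

async function* toUpper(source) {
  for await (const line of source) yield line.toUpperCase();
}
```

Consuming <code>toUpper(grep('error', log()))</code> with <code>for await...of</code> drives the whole chain; stop iterating and every upstream stage stops too.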
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3AUAA4bitbTOVSQg7Pd7fv/0856b44d78899dcffc4493f4146fb64f/4.png" />
          </figure>
    <div>
      <h3>Design principles</h3>
      <a href="#design-principles">
        
      </a>
    </div>
    <p>I built the proof-of-concept alternative around a different set of principles.</p>
    <div>
      <h4>Streams are iterables.</h4>
      <a href="#streams-are-iterables">
        
      </a>
    </div>
    <p>No custom <code>ReadableStream</code> class with hidden internal state. A readable stream is just an <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Iteration_protocols#the_async_iterator_and_async_iterable_protocols"><code><u>AsyncIterable&lt;Uint8Array[]&gt;</u></code></a>. You consume it with <code>for await...of</code>. No readers to acquire, no locks to manage.</p>
    <div>
      <h4>Pull-through transforms</h4>
      <a href="#pull-through-transforms">
        
      </a>
    </div>
    <p>Transforms don't execute until the consumer pulls. There's no eager evaluation, no hidden buffering. Data flows on-demand from source, through transforms, to the consumer. If you stop iterating, processing stops.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4bEXBTEOHBMnCRKGA7odt5/cf51074cce3bb8b2ec1b5158c7560b68/5.png" />
          </figure>
    <div>
      <h4>Explicit backpressure</h4>
      <a href="#explicit-backpressure">
        
      </a>
    </div>
    <p>Backpressure is strict by default. When a buffer is full, writes reject rather than silently accumulating. You can configure alternative policies – block until space is available, drop oldest, drop newest – but you have to choose explicitly. No more silent memory growth.</p>
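<p>As a sketch of what "strict by default" means – the names and semantics here are illustrative, not the actual proposed API – consider a bounded buffer whose <code>write()</code> rejects when full unless a different policy is chosen explicitly:</p>

```javascript
// Illustrative bounded buffer: strict rejection by default, with an
// explicit opt-in 'drop-oldest' policy. Not the real proposed API.
class BoundedQueue {
  constructor(limit, policy = 'reject') {
    this.limit = limit;
    this.policy = policy;
    this.items = [];
  }
  write(chunk) {
    if (this.items.length >= this.limit) {
      if (this.policy === 'drop-oldest') {
        this.items.shift(); // make room by discarding the oldest chunk
      } else {
        return Promise.reject(new Error('buffer full')); // strict default
      }
    }
    this.items.push(chunk);
    return Promise.resolve();
  }
}
```

The key contrast with <code>controller.enqueue()</code> is that overflow is an error you must handle (or a policy you must name), never silent accumulation.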
    <div>
      <h4>Batched chunks</h4>
      <a href="#batched-chunks">
        
      </a>
    </div>
    <p>Instead of yielding one chunk per iteration, streams yield <code>Uint8Array[]</code>: arrays of chunks. This amortizes the async overhead across multiple chunks, reducing promise creation and microtask latency in hot paths.</p>
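<p>A sketch of the idea (the helper name is illustrative): one <code>await</code> per batch instead of one per chunk:</p>

```javascript
// Yield Uint8Array[] batches rather than single chunks, amortizing the
// promise/microtask cost of each iteration over `batchSize` chunks.
async function* batchedSource(chunks, batchSize = 16) {
  for (let i = 0; i < chunks.length; i += batchSize) {
    yield chunks.slice(i, i + batchSize);
  }
}
```

A consumer iterating 1,000 chunks with a batch size of 16 pays for ~63 awaits instead of 1,000.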
    <div>
      <h4>Bytes only</h4>
      <a href="#bytes-only">
        
      </a>
    </div>
    <p>The API deals exclusively with bytes (<a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Uint8Array"><code><u>Uint8Array</u></code></a>). Strings are UTF-8 encoded automatically. There's no "value stream" vs "byte stream" dichotomy. If you want to stream arbitrary JavaScript values, use async iterables directly. While the API uses <code>Uint8Array</code>, it treats chunks as opaque. There is no partial consumption, no BYOB patterns, no byte-level operations within the streaming machinery itself. Chunks go in, chunks come out, unchanged unless a transform explicitly modifies them.</p>
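<p>A minimal sketch of the byte-normalization boundary this implies (the helper is illustrative, not part of the proposed API): strings are UTF-8 encoded on the way in, bytes pass through untouched, and everything else is rejected:</p>

```javascript
// Normalize writes at the boundary: strings become UTF-8 bytes,
// Uint8Array passes through as-is, anything else is a TypeError.
const encoder = new TextEncoder();

function toBytes(chunk) {
  if (typeof chunk === 'string') return encoder.encode(chunk);
  if (chunk instanceof Uint8Array) return chunk;
  throw new TypeError('only strings and Uint8Array are accepted');
}
```

Pushing this decision to the edge of the API keeps the streaming machinery itself free of per-chunk type dispatch.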
    <div>
      <h4>Synchronous fast paths matter</h4>
      <a href="#synchronous-fast-paths-matter">
        
      </a>
    </div>
    <p>The API recognizes that synchronous data sources are both necessary and common. The application should not be forced to always accept the performance cost of asynchronous scheduling simply because that's the only option provided. At the same time, mixing sync and async processing can be dangerous. Synchronous paths should always be an option and should always be explicit.</p>
    <div>
      <h3>The new API in action</h3>
      <a href="#the-new-api-in-action">
        
      </a>
    </div>
    
    <div>
      <h4>Creating and consuming streams</h4>
      <a href="#creating-and-consuming-streams">
        
      </a>
    </div>
    <p>In Web streams, creating a simple producer/consumer pair requires <code>TransformStream</code>, manual encoding, and careful lock management:</p>
            <pre><code>const { readable, writable } = new TransformStream();
const enc = new TextEncoder();
const writer = writable.getWriter();
await writer.write(enc.encode("Hello, World!"));
await writer.close();
writer.releaseLock();

const dec = new TextDecoder();
let text = '';
for await (const chunk of readable) {
  text += dec.decode(chunk, { stream: true });
}
text += dec.decode();</code></pre>
            <p>Even this relatively clean version requires: a <code>TransformStream</code>, manual <code>TextEncoder</code> and <code>TextDecoder</code>, and explicit lock release.</p><p>Here's the equivalent with the new API:</p>
            <pre><code>import { Stream } from 'new-streams';

// Create a push stream
const { writer, readable } = Stream.push();

// Write data — backpressure is enforced
await writer.write("Hello, World!");
await writer.end();

// Consume as text
const text = await Stream.text(readable);</code></pre>
            <p>The readable is just an async iterable. You can pass it to any function that expects one, including <code>Stream.text()</code>, which collects and decodes the entire stream.</p><p>The writer has a simple interface: <code>write()</code>, <code>writev()</code> (for batched writes), <code>end()</code> to signal completion, and <code>abort()</code> for errors. That's essentially it.</p><p>The Writer is not a concrete class. Any object that implements <code>write()</code>, <code>end()</code>, and <code>abort()</code> can be a writer, making it easy to adapt existing APIs or create specialized implementations without subclassing. There's no complex <code>UnderlyingSink</code> protocol with <code>start()</code>, <code>write()</code>, <code>close()</code>, and <code>abort()</code> callbacks that must coordinate through a controller whose lifecycle and state are independent of the <code>WritableStream</code> it is bound to.</p><p>Here's a simple in-memory writer that collects all written data:</p>
            <pre><code>// A minimal writer implementation — just an object with methods
function createBufferWriter() {
  const chunks = [];
  let totalBytes = 0;
  let closed = false;

  const addChunk = (chunk) =&gt; {
    chunks.push(chunk);
    totalBytes += chunk.byteLength;
  };

  return {
    get desiredSize() { return closed ? null : 1; },

    // Async variants
    write(chunk) { addChunk(chunk); },
    writev(batch) { for (const c of batch) addChunk(c); },
    end() { closed = true; return totalBytes; },
    abort(reason) { closed = true; chunks.length = 0; },

    // Sync variants return boolean (true = accepted)
    writeSync(chunk) { addChunk(chunk); return true; },
    writevSync(batch) { for (const c of batch) addChunk(c); return true; },
    endSync() { closed = true; return totalBytes; },
    abortSync(reason) { closed = true; chunks.length = 0; return true; },

    getChunks() { return chunks; }
  };
}

// Use it
const writer = createBufferWriter();
await Stream.pipeTo(source, writer);
const allData = writer.getChunks();</code></pre>
            <p>No base class to extend, no abstract methods to implement, no controller to coordinate with. Just an object with the right shape.</p>
    <div>
      <h4>Pull-through transforms</h4>
      <a href="#pull-through-transforms">
        
      </a>
    </div>
    <p>Under the new API design, transforms should not perform any work until the data is being consumed. This is a fundamental principle.</p>
            <pre><code>// Nothing executes until iteration begins
const output = Stream.pull(source, compress, encrypt);

// Transforms execute as we iterate
for await (const chunks of output) {
  for (const chunk of chunks) {
    process(chunk);
  }
}</code></pre>
            <p><code>Stream.pull()</code> creates a lazy pipeline. The <code>compress</code> and <code>encrypt</code> transforms don't run until you start iterating output. Each iteration pulls data through the pipeline on demand.</p><p>This is fundamentally different from Web streams' <a href="https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream/pipeThrough"><code><u>pipeThrough()</u></code></a>, which starts actively pumping data from the source to the transform as soon as you set up the pipe. Pull semantics mean you control when processing happens, and stopping iteration stops processing.</p><p>Transforms can be stateless or stateful. A stateless transform is just a function that takes chunks and returns transformed chunks:</p>
            <pre><code>// Stateless transform — a pure function
// Receives chunks or null (flush signal)
const toUpperCase = (chunks) =&gt; {
  if (chunks === null) return null; // End of stream
  return chunks.map(chunk =&gt; {
    const str = new TextDecoder().decode(chunk);
    return new TextEncoder().encode(str.toUpperCase());
  });
};

// Use it directly
const output = Stream.pull(source, toUpperCase);</code></pre>
            <p>Stateful transforms are simple objects with member functions that maintain state across calls:</p>
            <pre><code>// Stateful transform — a generator that wraps the source
function createLineParser() {
  // Helper to concatenate Uint8Arrays
  const concat = (...arrays) =&gt; {
    const result = new Uint8Array(arrays.reduce((n, a) =&gt; n + a.length, 0));
    let offset = 0;
    for (const arr of arrays) { result.set(arr, offset); offset += arr.length; }
    return result;
  };

  return {
    async *transform(source) {
      let pending = new Uint8Array(0);
      
      for await (const chunks of source) {
        if (chunks === null) {
          // Flush: yield any remaining data
          if (pending.length &gt; 0) yield [pending];
          continue;
        }
        
        // Concatenate pending data with new chunks
        const combined = concat(pending, ...chunks);
        const lines = [];
        let start = 0;

        for (let i = 0; i &lt; combined.length; i++) {
          if (combined[i] === 0x0a) { // newline
            lines.push(combined.slice(start, i));
            start = i + 1;
          }
        }

        pending = combined.slice(start);
        if (lines.length &gt; 0) yield lines;
      }
    }
  };
}

const output = Stream.pull(source, createLineParser());</code></pre>
            <p>For transforms that need cleanup on abort, add an abort handler:</p>
            <pre><code>// Stateful transform with resource cleanup
function createGzipCompressor() {
  // Hypothetical compression API...
  const deflate = new Deflater({ gzip: true });

  return {
    async *transform(source) {
      for await (const chunks of source) {
        if (chunks === null) {
          // Flush: finalize compression
          deflate.push(new Uint8Array(0), true);
          if (deflate.result) yield [deflate.result];
        } else {
          for (const chunk of chunks) {
            deflate.push(chunk, false);
            if (deflate.result) yield [deflate.result];
          }
        }
      }
    },
    abort(reason) {
      // Clean up compressor resources on error/cancellation
    }
  };
}</code></pre>
            <p>For implementers, there's no Transformer protocol with <code>start()</code>, <code>transform()</code>, <code>flush()</code> methods and controller coordination passed into a <code>TransformStream</code> class that has its own hidden state machine and buffering mechanisms. Transforms are just functions or simple objects: far simpler to implement and test.</p>
    <div>
      <h4>Explicit backpressure policies</h4>
      <a href="#explicit-backpressure-policies">
        
      </a>
    </div>
    <p>When a bounded buffer fills up and a producer wants to write more, there are only a few things you can do:</p><ol><li><p>Reject the write: refuse to accept more data</p></li><li><p>Wait: block until space becomes available</p></li><li><p>Discard old data: evict what's already buffered to make room</p></li><li><p>Discard new data: drop what's incoming</p></li></ol><p>That's it. Any other response is either a variation of these (like "resize the buffer," which is really just deferring the choice) or domain-specific logic that doesn't belong in a general streaming primitive. Web streams always chooses Wait.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/68339c8QsvNmb7JcZ2lSDO/e52a86a9b8f52b52eb9328d5ee58f23a/6.png" />
          </figure><p>The new API makes you choose one of these four explicitly:</p><ul><li><p><code>strict</code> (default): Rejects writes when the buffer is full and too many writes are pending. Catches "fire-and-forget" patterns where producers ignore backpressure.</p></li><li><p><code>block</code>: Writes wait until buffer space is available. Use when you trust the producer to await writes properly.</p></li><li><p><code>drop-oldest</code>: Drops the oldest buffered data to make room. Useful for live feeds where stale data loses value.</p></li><li><p><code>drop-newest</code>: Discards incoming data when full. Useful when you want to process what you have without being overwhelmed.</p></li></ul>
            <pre><code>const { writer, readable } = Stream.push({
  highWaterMark: 10,
  backpressure: 'strict' // or 'block', 'drop-oldest', 'drop-newest'
});</code></pre>
            <p>No more hoping producers cooperate. The policy you choose determines what happens when the buffer fills.</p><p>Here's how each policy behaves when a producer writes faster than the consumer reads:</p>
            <pre><code>// strict: Catches fire-and-forget writes that ignore backpressure
const strict = Stream.push({ highWaterMark: 2, backpressure: 'strict' });
strict.writer.write(chunk1);  // ok (not awaited)
strict.writer.write(chunk2);  // ok (fills slots buffer)
strict.writer.write(chunk3);  // ok (queued in pending)
strict.writer.write(chunk4);  // ok (pending buffer fills)
strict.writer.write(chunk5);  // throws! too many pending writes

// block: Wait for space (unbounded pending queue)
const blocking = Stream.push({ highWaterMark: 2, backpressure: 'block' });
await blocking.writer.write(chunk1);  // ok
await blocking.writer.write(chunk2);  // ok
await blocking.writer.write(chunk3);  // waits until consumer reads
await blocking.writer.write(chunk4);  // waits until consumer reads
await blocking.writer.write(chunk5);  // waits until consumer reads

// drop-oldest: Discard old data to make room
const dropOld = Stream.push({ highWaterMark: 2, backpressure: 'drop-oldest' });
await dropOld.writer.write(chunk1);  // ok
await dropOld.writer.write(chunk2);  // ok
await dropOld.writer.write(chunk3);  // ok, chunk1 discarded

// drop-newest: Discard incoming data when full
const dropNew = Stream.push({ highWaterMark: 2, backpressure: 'drop-newest' });
await dropNew.writer.write(chunk1);  // ok
await dropNew.writer.write(chunk2);  // ok
await dropNew.writer.write(chunk3);  // silently dropped</code></pre>
            
    <div>
      <h4>Explicit multi-consumer patterns</h4>
      <a href="#explicit-multi-consumer-patterns">
        
      </a>
    </div>
    
            <pre><code>// Share with explicit buffer management
const shared = Stream.share(source, {
  highWaterMark: 100,
  backpressure: 'strict'
});

const consumer1 = shared.pull();
const consumer2 = shared.pull(decompress);</code></pre>
            <p>Instead of <code>tee()</code> with its hidden unbounded buffer, you get explicit multi-consumer primitives. <code>Stream.share()</code> is pull-based: consumers pull from a shared source, and you configure the buffer limits and backpressure policy upfront.</p><p>There's also <code>Stream.broadcast()</code> for push-based multi-consumer scenarios. Both require you to think about what happens when consumers run at different speeds, because that's a real concern that shouldn't be hidden.</p>
    <div>
      <h4>Sync/async separation</h4>
      <a href="#sync-async-separation">
        
      </a>
    </div>
    <p>Not all streaming workloads involve I/O. When your source is in-memory and your transforms are pure functions, async machinery adds overhead without benefit. You're paying to coordinate "waiting" that never actually happens.</p><p>The new API has complete parallel sync versions: <code>Stream.pullSync()</code>, <code>Stream.bytesSync()</code>, <code>Stream.textSync()</code>, and so on. If your source and transforms are all synchronous, you can process the entire pipeline without a single promise.</p>
            <pre><code>// Async — when source or transforms may be asynchronous
const textAsync = await Stream.text(source);

// Sync — when all components are synchronous
const textSync = Stream.textSync(source);</code></pre>
            <p>Here's a complete synchronous pipeline – compression, encryption, and consumption with zero async overhead:</p>
            <pre><code>// Synchronous source from in-memory data
const source = Stream.fromSync([inputBuffer]);

// Synchronous transforms
const compressed = Stream.pullSync(source, zlibCompressSync);
const encrypted = Stream.pullSync(compressed, aesEncryptSync);

// Synchronous consumption — no promises, no event loop trips
const result = Stream.bytesSync(encrypted);</code></pre>
            <p>The entire pipeline executes in a single call stack. No promises are created, no microtask queue scheduling occurs, and no GC pressure from short-lived async machinery. For CPU-bound workloads like parsing, compression, or transformation of in-memory data, this can be significantly faster than the equivalent Web streams code – which would force async boundaries even when every component is synchronous.</p><p>Web streams has no synchronous path. Even if your source has data ready and your transform is a pure function, you still pay for promise creation and microtask scheduling on every operation. Promises are fantastic for cases in which waiting is actually necessary, but they aren't always necessary. The new API lets you stay in sync-land when that's what you need.</p>
    <div>
      <h4>Bridging the gap between this and Web streams</h4>
      <a href="#bridging-the-gap-between-this-and-web-streams">
        
      </a>
    </div>
    <p>The async iterator based approach provides a natural bridge to Web streams. Coming from a <code>ReadableStream</code>, simply passing the readable in as input works as expected, provided the <code>ReadableStream</code> is set up to yield bytes:</p>
            <pre><code>const readable = getWebReadableStreamSomehow();
const input = Stream.pull(readable, transform1, transform2);
for await (const chunks of input) {
  // process chunks
}</code></pre>
            <p>Going the other way, adapting to a <code>ReadableStream</code> takes a bit more work since the alternative approach yields batches of chunks, but the adaptation layer is just as straightforward:</p>
            <pre><code>async function* adapt(input) {
  for await (const chunks of input) {
    for (const chunk of chunks) {
      yield chunk;
    }
  }
}

const input = Stream.pull(source, transform1, transform2);
const readable = ReadableStream.from(adapt(input));</code></pre>
            
    <div>
      <h4>How this addresses the real-world failures from earlier</h4>
      <a href="#how-this-addresses-the-real-world-failures-from-earlier">
        
      </a>
    </div>
    <ul><li><p>Unconsumed bodies: Pull semantics mean nothing happens until you iterate. No hidden resource retention. If you don't consume a stream, there's no background machinery holding connections open.</p></li><li><p>The <code>tee()</code> memory cliff: <code>Stream.share()</code> requires explicit buffer configuration. You choose the <code>highWaterMark</code> and backpressure policy upfront: no more silent unbounded growth when consumers run at different speeds.</p></li><li><p>Transform backpressure gaps: Pull-through transforms execute on-demand. Data doesn't cascade through intermediate buffers; it flows only when the consumer pulls. Stop iterating, stop processing.</p></li><li><p>GC thrashing in SSR: Batched chunks (<code>Uint8Array[]</code>) amortize async overhead. Sync pipelines via <code>Stream.pullSync()</code> eliminate promise allocation entirely for CPU-bound workloads.</p></li></ul>
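    <p>The first point is worth demonstrating: pull semantics fall naturally out of generator behavior, which this design builds on. Nothing below executes until the consumer asks for data (a plain-generator sketch, not the library itself):</p>
            <pre><code>let sourceRan = false;

async function* lazySource() {
  sourceRan = true; // runs only once iteration begins
  yield [new Uint8Array([1, 2, 3])];
}

const pipeline = lazySource(); // nothing has executed yet
// sourceRan stays false until pipeline.next() is awaited</code></pre>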
    <div>
      <h3>Performance</h3>
      <a href="#performance">
        
      </a>
    </div>
    <p>The design choices have performance implications. Here are benchmarks from the reference implementation of this possible alternative compared to Web streams (Node.js v24.x, Apple M1 Pro, averaged over 10 runs):</p><table><tr><td><p><b>Scenario</b></p></td><td><p><b>Alternative</b></p></td><td><p><b>Web streams</b></p></td><td><p><b>Difference</b></p></td></tr><tr><td><p>Small chunks (1KB × 5000)</p></td><td><p>~13 GB/s</p></td><td><p>~4 GB/s</p></td><td><p>~3× faster</p></td></tr><tr><td><p>Tiny chunks (100B × 10000)</p></td><td><p>~4 GB/s</p></td><td><p>~450 MB/s</p></td><td><p>~8× faster</p></td></tr><tr><td><p>Async iteration (8KB × 1000)</p></td><td><p>~530 GB/s</p></td><td><p>~35 GB/s</p></td><td><p>~15× faster</p></td></tr><tr><td><p>Chained 3× transforms (8KB × 500)</p></td><td><p>~275 GB/s</p></td><td><p>~3 GB/s</p></td><td><p><b>~80–90× faster</b></p></td></tr><tr><td><p>High-frequency (64B × 20000)</p></td><td><p>~7.5 GB/s</p></td><td><p>~280 MB/s</p></td><td><p>~25× faster</p></td></tr></table><p>The chained transform result is particularly striking: pull-through semantics eliminate the intermediate buffering that plagues Web streams pipelines. Instead of each <code>TransformStream</code> eagerly filling its internal buffers, data flows on-demand from consumer to source.</p><p>Now, to be fair, Node.js really has not yet put significant effort into fully optimizing the performance of its Web streams implementation. There's likely significant room for improvement in Node.js' performance results through a bit of applied effort to optimize the hot paths there. 
That said, running these benchmarks in Deno and Bun shows a similar advantage for this alternative iterator-based approach over either of their Web streams implementations.</p><p>Browser benchmarks (Chrome/Blink, averaged over 3 runs) show consistent gains as well:</p><table><tr><td><p><b>Scenario</b></p></td><td><p><b>Alternative</b></p></td><td><p><b>Web streams</b></p></td><td><p><b>Difference</b></p></td></tr><tr><td><p>Push 3KB chunks</p></td><td><p>~135k ops/s</p></td><td><p>~24k ops/s</p></td><td><p>~5–6× faster</p></td></tr><tr><td><p>Push 100KB chunks</p></td><td><p>~24k ops/s</p></td><td><p>~3k ops/s</p></td><td><p>~7–8× faster</p></td></tr><tr><td><p>3 transform chain</p></td><td><p>~4.6k ops/s</p></td><td><p>~880 ops/s</p></td><td><p>~5× faster</p></td></tr><tr><td><p>5 transform chain</p></td><td><p>~2.4k ops/s</p></td><td><p>~550 ops/s</p></td><td><p>~4× faster</p></td></tr><tr><td><p>bytes() consumption</p></td><td><p>~73k ops/s</p></td><td><p>~11k ops/s</p></td><td><p>~6–7× faster</p></td></tr><tr><td><p>Async iteration</p></td><td><p>~1.1M ops/s</p></td><td><p>~10k ops/s</p></td><td><p><b>~40–100× faster</b></p></td></tr></table><p>These benchmarks measure throughput in controlled scenarios; real-world performance depends on your specific use case. The difference between Node.js and browser gains reflects the distinct optimization paths each environment takes for Web streams.</p><p>It's worth noting that these benchmarks compare a pure TypeScript/JavaScript implementation of the new API against the native (JavaScript/C++/Rust) implementations of Web streams in each runtime. The new API's reference implementation has had no performance optimization work; the gains come entirely from the design. 
A native implementation would likely show further improvement.</p><p>The gains illustrate how fundamental design choices compound: batching amortizes async overhead, pull semantics eliminate intermediate buffering, and the freedom for implementations to use synchronous fast paths when data is available immediately all contribute.</p><blockquote><p>"We’ve done a lot to improve performance and consistency in Node streams, but there’s something uniquely powerful about starting from scratch. New streams’ approach embraces modern runtime realities without legacy baggage, and that opens the door to a simpler, performant and more coherent streams model." 
- Robert Nagy, Node.js TSC member and Node.js streams contributor</p></blockquote>
    <div>
      <h2>What's next</h2>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>I'm publishing this to start a conversation. What did I get right? What did I miss? Are there use cases that don't fit this model? What would a migration path for this approach look like? The goal is to gather feedback from developers who've felt the pain of Web streams and have opinions about what a better API should look like.</p>
    <div>
      <h3>Try it yourself</h3>
      <a href="#try-it-yourself">
        
      </a>
    </div>
    <p>A reference implementation for this alternative approach is available now and can be found at <a href="https://github.com/jasnell/new-streams"><u>https://github.com/jasnell/new-streams</u></a>.</p><ul><li><p>API Reference: See the <a href="https://github.com/jasnell/new-streams/blob/main/API.md"><u>API.md</u></a> for complete documentation</p></li><li><p>Examples: The <a href="https://github.com/jasnell/new-streams/tree/main/samples"><u>samples directory</u></a> has working code for common patterns</p></li></ul><p>I welcome issues, discussions, and pull requests. If you've run into Web streams problems I haven't covered, or if you see gaps in this approach, let me know. But again, the idea here is not to say "Let's all use this shiny new object!"; it is to kick off a discussion that looks beyond the current status quo of Web streams and returns to first principles.</p><p>Web streams was an ambitious project that brought streaming to the web platform when nothing else existed. The people who designed it made reasonable choices given the constraints of 2014 – before async iteration, before years of production experience revealed the edge cases.</p><p>But we've learned a lot since then. JavaScript has evolved. A streaming API designed today can be simpler, more aligned with the language, and more explicit about the things that matter, like backpressure and multi-consumer behavior.</p><p>We deserve a better stream API. So let's talk about what that could look like.</p>
            <category><![CDATA[Standards]]></category>
            <category><![CDATA[JavaScript]]></category>
            <category><![CDATA[TypeScript]]></category>
            <category><![CDATA[Open Source]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Node.js]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[API]]></category>
            <guid isPermaLink="false">37h1uszA2vuOfmXb3oAnZr</guid>
            <dc:creator>James M Snell</dc:creator>
        </item>
        <item>
            <title><![CDATA[How we rebuilt Next.js with AI in one week]]></title>
            <link>https://blog.cloudflare.com/vinext/</link>
            <pubDate>Tue, 24 Feb 2026 20:00:00 GMT</pubDate>
            <description><![CDATA[ One engineer used AI to rebuild Next.js on Vite in a week. vinext builds up to 4x faster, produces 57% smaller bundles, and deploys to Cloudflare Workers with a single command. ]]></description>
            <content:encoded><![CDATA[ <p><sub><i>*This post was updated at 12:35 pm PT to fix a typo in the build time benchmarks.</i></sub></p><p>Last week, one engineer and an AI model rebuilt the most popular front-end framework from scratch. The result, <a href="https://github.com/cloudflare/vinext"><u>vinext</u></a> (pronounced "vee-next"), is a drop-in replacement for Next.js, built on <a href="https://vite.dev/"><u>Vite</u></a>, that deploys to Cloudflare Workers with a single command. In early benchmarks, it builds production apps up to 4x faster and produces client bundles up to 57% smaller. And we already have customers running it in production. </p><p>The whole thing cost about $1,100 in tokens.</p>
    <div>
      <h2>The Next.js deployment problem</h2>
      <a href="#the-next-js-deployment-problem">
        
      </a>
    </div>
    <p><a href="https://nextjs.org/"><u>Next.js</u></a> is the most popular React framework. Millions of developers use it. It powers a huge chunk of the production web, and for good reason. The developer experience is top-notch.</p><p>But Next.js has a deployment problem when used in the broader serverless ecosystem. The tooling is entirely bespoke: Next.js has invested heavily in Turbopack, but if you want to deploy it to Cloudflare, Netlify, or AWS Lambda, you have to take that build output and reshape it into something the target platform can actually run.</p><p>If you’re thinking: “Isn’t that what OpenNext does?”, you are correct. </p><p>That is indeed the problem <a href="https://opennext.js.org/"><u>OpenNext</u></a> was built to solve. And a lot of engineering effort has gone into OpenNext from multiple providers, including us at Cloudflare. It works, but quickly runs into limitations and becomes a game of whack-a-mole. </p><p>Building on top of Next.js output as a foundation has proven to be a difficult and fragile approach. Because OpenNext has to reverse-engineer Next.js's build output, each new version can bring unpredictable changes that take a lot of work to correct. </p><p>Next.js has been working on a first-class adapters API, and we've been collaborating with them on it. It's still an early effort, but even with adapters, you're still building on the bespoke Turbopack toolchain. And adapters only cover build and deploy. During development, <code>next dev</code> runs exclusively in Node.js with no way to plug in a different runtime. If your application uses platform-specific APIs like Durable Objects, KV, or AI bindings, you can't test that code in dev without workarounds.</p>
    <div>
      <h2>Introducing vinext </h2>
      <a href="#introducing-vinext">
        
      </a>
    </div>
    
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7BCYnb6nCnc9oRBPQnuES5/d217b3582f4fe30597a3b4bf000d9bd7/BLOG-3194_2.png" />
          </figure><p>What if instead of adapting Next.js output, we reimplemented the Next.js API surface on <a href="https://vite.dev/"><u>Vite</u></a> directly? Vite is the build tool used by most of the front-end ecosystem outside of Next.js, powering frameworks like Astro, SvelteKit, Nuxt, and Remix. A clean reimplementation, not merely a wrapper or adapter. We honestly didn't think it would work. But it’s 2026, and the cost of building software has completely changed.</p><p>We got a lot further than we expected.</p>
            <pre><code>npm install vinext</code></pre>
            <p>Replace <code>next</code> with <code>vinext</code> in your scripts and everything else stays the same. Your existing <code>app/</code>, <code>pages/</code>, and <code>next.config.js</code> work as-is.</p>
            <pre><code>vinext dev          # Development server with HMR
vinext build        # Production build
vinext deploy       # Build and deploy to Cloudflare Workers</code></pre>
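            <p>In a typical <code>package.json</code>, the swap is a one-word change per script (a sketch; the script names here are common defaults and may differ in your project):</p>
            <pre><code>{
  "scripts": {
    "dev": "vinext dev",
    "build": "vinext build",
    "deploy": "vinext deploy"
  }
}</code></pre>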
            <p>This is not a wrapper around Next.js and Turbopack output. It's an alternative implementation of the API surface: routing, server rendering, React Server Components, server actions, caching, middleware. All of it built on top of Vite as a plugin. Most importantly, Vite output runs on any platform thanks to the <a href="https://vite.dev/guide/api-environment"><u>Vite Environment API</u></a>.</p>
    <div>
      <h2>The numbers</h2>
      <a href="#the-numbers">
        
      </a>
    </div>
    <p>Early benchmarks are promising. We compared vinext against Next.js 16 using a shared 33-route App Router application.

Both frameworks are doing the same work: compiling, bundling, and preparing server-rendered routes. We disabled TypeScript type checking and ESLint in Next.js's build (Vite doesn't run these during builds), and used force-dynamic so Next.js doesn't spend extra time pre-rendering static routes, which would unfairly slow down its numbers. The goal was to measure only bundler and compilation speed, nothing else. Benchmarks run on GitHub CI on every merge to main. </p><p><b>Production build time:</b></p>
<div><table><colgroup>
<col></col>
<col></col>
<col></col>
</colgroup>
<thead>
  <tr>
    <th><span>Framework</span></th>
    <th><span>Mean</span></th>
    <th><span>vs Next.js</span></th>
  </tr></thead>
<tbody>
  <tr>
    <td><span>Next.js 16.1.6 (Turbopack)</span></td>
    <td><span>7.38s</span></td>
    <td><span>baseline</span></td>
  </tr>
  <tr>
    <td><span>vinext (Vite 7 / Rollup)</span></td>
    <td>4.64s</td>
    <td>1.6x faster</td>
  </tr>
  <tr>
    <td><span>vinext (Vite 8 / Rolldown)</span></td>
    <td>1.67s</td>
    <td>4.4x faster</td>
  </tr>
</tbody></table></div><p><b>Client bundle size (gzipped):</b></p>
<div><table><colgroup>
<col></col>
<col></col>
<col></col>
</colgroup>
<thead>
  <tr>
    <th><span>Framework</span></th>
    <th><span>Gzipped</span></th>
    <th><span>vs Next.js</span></th>
  </tr></thead>
<tbody>
  <tr>
    <td><span>Next.js 16.1.6</span></td>
    <td><span>168.9 KB</span></td>
    <td><span>baseline</span></td>
  </tr>
  <tr>
    <td><span>vinext (Rollup)</span></td>
    <td><span>74.0 KB</span></td>
    <td><span>56% smaller</span></td>
  </tr>
  <tr>
    <td><span>vinext (Rolldown)</span></td>
    <td><span>72.9 KB</span></td>
    <td><span>57% smaller</span></td>
  </tr>
</tbody></table></div><p>These benchmarks measure compilation and bundling speed, not production serving performance. The test fixture is a single 33-route app, not a representative sample of all production applications. We expect these numbers to evolve as these projects continue to develop. The <a href="https://benchmarks.vinext.workers.dev"><u>full methodology and historical results</u></a> are public. Take them as directional, not definitive.</p><p>The direction is encouraging, though. Vite's architecture, and especially <a href="https://rolldown.rs/"><u>Rolldown</u></a> (the Rust-based bundler coming in Vite 8), has structural advantages for build performance that show up clearly here.</p>
    <div>
      <h2>Deploying to Cloudflare Workers</h2>
      <a href="#deploying-to-cloudflare-workers">
        
      </a>
    </div>
    <p>vinext is built with Cloudflare Workers as the first deployment target. A single command takes you from source code to a running Worker:</p>
            <pre><code>vinext deploy</code></pre>
            <p>This handles everything: building the application, auto-generating the Worker configuration, and deploying. Both the App Router and Pages Router work on Workers, with full client-side hydration, interactive components, client-side navigation, and React state.</p><p>For production caching, vinext includes a Cloudflare KV cache handler that gives you ISR (Incremental Static Regeneration) out of the box:</p>
            <pre><code>import { KVCacheHandler } from "vinext/cloudflare";
import { setCacheHandler } from "next/cache";

setCacheHandler(new KVCacheHandler(env.MY_KV_NAMESPACE));</code></pre>
            <p><a href="https://developers.cloudflare.com/kv/"><u>KV</u></a> is a good default for most applications, but the caching layer is designed to be pluggable. That <code>setCacheHandler</code> call means you can swap in whatever backend makes sense. <a href="https://developers.cloudflare.com/r2/"><u>R2</u></a> might be a better fit for apps with large cached payloads or different access patterns. We're also working on improvements to our Cache API that should provide a strong caching layer with less configuration. The goal is flexibility: pick the caching strategy that fits your app.</p><p>Live examples running right now:</p><ul><li><p><a href="https://app-router-playground.vinext.workers.dev"><u>App Router Playground</u></a></p></li><li><p><a href="https://hackernews.vinext.workers.dev"><u>Hacker News clone</u></a></p></li><li><p><a href="https://app-router-cloudflare.vinext.workers.dev"><u>App Router minimal</u></a></p></li><li><p><a href="https://pages-router-cloudflare.vinext.workers.dev"><u>Pages Router minimal</u></a></p></li></ul><p>We also have <a href="https://next-agents.threepointone.workers.dev/"><u>a live example</u></a> of Cloudflare Agents running in a Next.js app, without the need for workarounds like <a href="https://developers.cloudflare.com/workers/wrangler/api/#getplatformproxy"><u>getPlatformProxy</u></a>, since the entire app now runs in workerd, during both dev and deploy phases. This means being able to use Durable Objects, AI bindings, and every other Cloudflare-specific service without compromise. <a href="https://github.com/cloudflare/vinext-agents-example"><u>Have a look here.</u></a></p>
    <div>
      <h2>Frameworks are a team sport</h2>
      <a href="#frameworks-are-a-team-sport">
        
      </a>
    </div>
    <p>The current deployment target is Cloudflare Workers, but that's a small part of the picture. Something like 95% of vinext is pure Vite. The routing, the module shims, the SSR pipeline, the RSC integration: none of it is Cloudflare-specific.</p><p>Cloudflare is looking to work with other hosting providers on adopting this toolchain for their customers (the lift is minimal — we got a proof-of-concept working on <a href="https://vinext-on-vercel.vercel.app/"><u>Vercel</u></a> in less than 30 minutes!). This is an open-source project, and for its long-term success, we believe it’s important that we work with partners across the ecosystem to ensure ongoing investment. PRs from other platforms are welcome. If you're interested in adding a deployment target, <a href="https://github.com/cloudflare/vinext/issues"><u>open an issue</u></a> or reach out.</p>
    <div>
      <h2>Status: Experimental</h2>
      <a href="#status-experimental">
        
      </a>
    </div>
    <p>We want to be clear: vinext is experimental. It's not even one week old, and it has not yet been battle-tested with any meaningful traffic at scale. If you're evaluating it for a production application, proceed with appropriate caution.</p><p>That said, the test suite is extensive: over 1,700 Vitest tests and 380 Playwright E2E tests, including tests ported directly from the Next.js test suite and OpenNext's Cloudflare conformance suite. We’ve verified it against the Next.js App Router Playground. Coverage sits at 94% of the Next.js 16 API surface.

Early results from real-world customers are encouraging. We've been working with <a href="https://ndstudio.gov/"><u>National Design Studio</u></a>, a team that's aiming to modernize every government interface, on one of their beta sites, <a href="https://www.cio.gov/"><u>CIO.gov</u></a>. They're already running vinext in production, with meaningful improvements in build times and bundle sizes.</p><p>The README is honest about <a href="https://github.com/cloudflare/vinext#whats-not-supported-and-wont-be"><u>what's not supported and won't be</u></a>, and about <a href="https://github.com/cloudflare/vinext#known-limitations"><u>known limitations</u></a>. We want to be upfront rather than overpromise.</p>
    <div>
      <h2>What about pre-rendering?</h2>
      <a href="#what-about-pre-rendering">
        
      </a>
    </div>
    <p>vinext already supports Incremental Static Regeneration (ISR) out of the box. After the first request to any page, it's cached and revalidated in the background, just like Next.js. That part works today.</p><p>vinext does not yet support static pre-rendering at build time. In Next.js, pages without dynamic data get rendered during <code>next build</code> and served as static HTML. If you have dynamic routes, you use <code>generateStaticParams()</code> to enumerate which pages to build ahead of time. vinext doesn't do that… yet.</p><p>This was an intentional design decision for launch. It's <a href="https://github.com/cloudflare/vinext/issues/9"><u>on the roadmap</u></a>, but if your site is 100% prebuilt HTML with static content, you probably won't see much benefit from vinext today. That said, if one engineer can spend <span>$</span>1,100 in tokens and rebuild Next.js, you can probably spend $10 and migrate to a Vite-based framework designed specifically for static content, like <a href="https://astro.build/"><u>Astro</u></a> (which <a href="https://blog.cloudflare.com/astro-joins-cloudflare/"><u>also deploys to Cloudflare Workers</u></a>).</p><p>For sites that aren't purely static, though, we think we can do something better than pre-rendering everything at build time.</p>
    <div>
      <h2>Introducing Traffic-aware Pre-Rendering</h2>
      <a href="#introducing-traffic-aware-pre-rendering">
        
      </a>
    </div>
    <p>Next.js pre-renders every page listed in <code>generateStaticParams()</code> during the build. A site with 10,000 product pages means 10,000 renders at build time, even though 99% of those pages may never receive a request. Builds scale linearly with page count. This is why large Next.js sites end up with 30-minute builds.</p><p>So we built <b>Traffic-aware Pre-Rendering</b> (TPR). It's experimental today, and we plan to make it the default once we have more real-world testing behind it.</p><p>The idea is simple. Cloudflare is already the reverse proxy for your site. We have your traffic data. We know which pages actually get visited. So instead of pre-rendering everything or pre-rendering nothing, vinext queries Cloudflare's zone analytics at deploy time and pre-renders only the pages that matter.</p>
            <pre><code>vinext deploy --experimental-tpr

  Building...
  Build complete (4.2s)

  TPR (experimental): Analyzing traffic for my-store.com (last 24h)
  TPR: 12,847 unique paths — 184 pages cover 90% of traffic
  TPR: Pre-rendering 184 pages...
  TPR: Pre-rendered 184 pages in 8.3s → KV cache

  Deploying to Cloudflare Workers...
</code></pre>
            <p>For a site with 100,000 product pages, the power law means 90% of traffic usually goes to 50 to 200 pages. Those get pre-rendered in seconds. Everything else falls back to on-demand SSR and gets cached via ISR after the first request. Every new deploy refreshes the set based on current traffic patterns. Pages that go viral get picked up automatically. All of this works without <code>generateStaticParams()</code> and without coupling your build to your production database.</p>
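<p>The selection step described above can be sketched as a cumulative cutoff over per-path request counts. This is an illustrative sketch only; the real implementation reads Cloudflare zone analytics at deploy time:</p>

```typescript
interface PathStats {
  path: string;
  requests: number;
}

// Return the smallest set of paths that together cover `coverage`
// (e.g. 0.9 = 90%) of total traffic, most-visited first. Everything
// not selected falls back to on-demand SSR plus ISR caching.
function selectPagesToPrerender(stats: PathStats[], coverage = 0.9): string[] {
  const total = stats.reduce((sum, s) => sum + s.requests, 0);
  const sorted = [...stats].sort((a, b) => b.requests - a.requests);
  const selected: string[] = [];
  let covered = 0;
  for (const s of sorted) {
    if (covered >= coverage * total) break;
    selected.push(s.path);
    covered += s.requests;
  }
  return selected;
}
```

<p>Because traffic follows a power law, the selected set is typically tiny relative to the full page count, which is why the pre-render step finishes in seconds.</p>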
    <div>
      <h2>Taking on the Next.js challenge, but this time with AI</h2>
      <a href="#taking-on-the-next-js-challenge-but-this-time-with-ai">
        
      </a>
    </div>
    <p>A project like this would normally take a team of engineers months, if not years. Several teams at various companies have attempted it, and the scope is just enormous. We tried once at Cloudflare! Two routers, 33+ module shims, server rendering pipelines, RSC streaming, file-system routing, middleware, caching, static export. There's a reason nobody has pulled it off.</p><p>This time we did it in under a week. One engineer (technically engineering manager) directing AI.</p><p>The first commit landed on February 13. By the end of that same evening, both the Pages Router and App Router had basic SSR working, along with middleware, server actions, and streaming. By the next afternoon, <a href="https://app-router-playground.vinext.workers.dev"><u>App Router Playground</u></a> was rendering 10 of 11 routes. By day three, <code>vinext deploy</code> was shipping apps to Cloudflare Workers with full client hydration. The rest of the week was hardening: fixing edge cases, expanding the test suite, bringing API coverage to 94%.</p><p>What changed from those earlier attempts? AI got better. Way better.</p>
    <div>
      <h2>Why this problem is made for AI</h2>
      <a href="#why-this-problem-is-made-for-ai">
        
      </a>
    </div>
    <p>Not every project would go this way. This one did because a few things happened to line up at the right time.</p><p><b>Next.js is well-specified.</b> It has extensive documentation, a massive user base, and years of Stack Overflow answers and tutorials. The API surface is all over the training data. When you ask Claude to implement <code>getServerSideProps</code> or explain how <code>useRouter</code> works, it doesn't hallucinate. It knows how Next works.</p><p><b>Next.js has an elaborate test suite.</b> The <a href="https://github.com/vercel/next.js"><u>Next.js repo</u></a> contains thousands of E2E tests covering every feature and edge case. We ported tests directly from their suite (you can see the attribution in the code). This gave us a specification we could verify against mechanically.</p><p><b>Vite is an excellent foundation.</b> <a href="https://vite.dev/"><u>Vite</u></a> handles the hard parts of front-end tooling: fast HMR, native ESM, a clean plugin API, production bundling. We didn't have to build a bundler. We just had to teach it to speak Next.js. <a href="https://github.com/vitejs/vite-plugin-rsc"><code><u>@vitejs/plugin-rsc</u></code></a> is still early, but it gave us React Server Components support without having to build an RSC implementation from scratch.</p><p><b>The models caught up.</b> We don't think this would have been possible even a few months ago. Earlier models couldn't sustain coherence across a codebase this size. New models can hold the full architecture in context, reason about how modules interact, and produce correct code often enough to keep momentum going. At times, I saw it go into Next, Vite, and React internals to figure out a bug. The state-of-the-art models are impressive, and they seem to keep getting better.</p><p>All of those things had to be true at the same time. Well-documented target API, comprehensive test suite, solid build tool underneath, and a model that could actually handle the complexity. 
Take any one of them away and this doesn't work nearly as well.</p>
    <div>
      <h2>How we actually built it</h2>
      <a href="#how-we-actually-built-it">
        
      </a>
    </div>
    <p>Almost every line of code in vinext was written by AI. But here's the thing that matters more: every line passes the same quality gates you'd expect from human-written code. The project has 1,700+ Vitest tests, 380 Playwright E2E tests, full TypeScript type checking via tsgo, and linting via oxlint. Continuous integration runs all of it on every pull request. Establishing a set of good guardrails is critical to making AI productive in a codebase.</p><p>The process started with a plan. I spent a couple of hours going back and forth with Claude in <a href="https://opencode.ai"><u>OpenCode</u></a> to define the architecture: what to build, in what order, which abstractions to use. That plan became the north star. From there, the workflow was straightforward:</p><ol><li><p>Define a task ("implement the <code>next/navigation</code> shim with usePathname, <code>useSearchParams</code>, <code>useRouter</code>").</p></li><li><p>Let the AI write the implementation and tests.</p></li><li><p>Run the test suite.</p></li><li><p>If tests pass, merge. If not, give the AI the error output and let it iterate.</p></li><li><p>Repeat.</p></li></ol><p>We wired up AI agents for code review too. When a PR was opened, an agent reviewed it. When review comments came back, another agent addressed them. The feedback loop was mostly automated. </p><p>It didn't work perfectly every time. There were PRs that were just wrong. The AI would confidently implement something that seemed right but didn't match actual Next.js behavior. I had to course-correct regularly. Architecture decisions, prioritization, knowing when the AI was headed down a dead end: that was all me. When you give AI good direction, good context, and good guardrails, it can be very productive. 
But the human still has to steer.</p><p>For browser-level testing, I used <a href="https://github.com/vercel-labs/agent-browser"><u>agent-browser</u></a> to verify actual rendered output, client-side navigation, and hydration behavior. Unit tests miss a lot of subtle browser issues. This caught them.</p><p>Over the course of the project, we ran over 800 sessions in OpenCode. Total cost: roughly $1,100 in Claude API tokens.</p>
    <div>
      <h2>What this means for software</h2>
      <a href="#what-this-means-for-software">
        
      </a>
    </div>
    <p>Why do we have so many layers in the stack? This project forced me to think deeply about this question. And to consider how AI impacts the answer.</p><p>Most abstractions in software exist because humans need help. We couldn't hold the whole system in our heads, so we built layers to manage the complexity for us. Each layer made the next person's job easier. That's how you end up with frameworks on top of frameworks, wrapper libraries, thousands of lines of glue code.</p><p>AI doesn't have the same limitation. It can hold the whole system in context and just write the code. It doesn't need an intermediate framework to stay organized. It just needs a spec and a foundation to build on.</p><p>It's not clear yet which abstractions are truly foundational and which ones were just crutches for human cognition. That line is going to shift a lot over the next few years. But vinext is a data point. We took an API contract, a build tool, and an AI model, and the AI wrote everything in between. No intermediate framework needed. We think this pattern will repeat across a lot of software. The layers we've built up over the years aren't all going to make it.</p>
    <div>
      <h2>Acknowledgments</h2>
      <a href="#acknowledgments">
        
      </a>
    </div>
    <p>Thanks to the Vite team. <a href="https://vite.dev/"><u>Vite</u></a> is the foundation this whole thing stands on. <a href="https://github.com/vitejs/vite-plugin-rsc"><code><u>@vitejs/plugin-rsc</u></code></a> is still early days, but it gave me RSC support without having to build that from scratch, which would have been a dealbreaker. The Vite maintainers were responsive and helpful as I pushed the plugin into territory it hadn't been tested in before.</p><p>We also want to acknowledge the <a href="https://nextjs.org/"><u>Next.js</u></a> team. They've spent years building a framework that raised the bar for what React development could look like. The fact that their API surface is so well-documented and their test suite so comprehensive is a big part of what made this project possible. vinext wouldn't exist without the standard they set.</p>
    <div>
      <h2>Try it</h2>
      <a href="#try-it">
        
      </a>
    </div>
    <p>vinext includes an <a href="https://agentskills.io"><u>Agent Skill</u></a> that handles migration for you. It works with Claude Code, OpenCode, Cursor, Codex, and dozens of other AI coding tools. Install it, open your Next.js project, and tell the AI to migrate:</p>
            <pre><code>npx skills add cloudflare/vinext</code></pre>
            <p>Then open your Next.js project in any supported tool and say:</p>
            <pre><code>migrate this project to vinext</code></pre>
            <p>The skill handles compatibility checking, dependency installation, config generation, and dev server startup. It knows what vinext supports and will flag anything that needs manual attention.</p><p>Or if you prefer doing it by hand:</p>
            <pre><code>npx vinext init    # Migrate an existing Next.js project
npx vinext dev     # Start the dev server
npx vinext deploy  # Ship to Cloudflare Workers</code></pre>
            <p>The source is at <a href="https://github.com/cloudflare/vinext"><u>github.com/cloudflare/vinext</u></a>. Issues, PRs, and feedback are welcome.</p> ]]></content:encoded>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Workers AI]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[JavaScript]]></category>
            <category><![CDATA[Open Source]]></category>
            <category><![CDATA[Performance]]></category>
            <guid isPermaLink="false">2w61xT0J7H7ECzhiABytS</guid>
            <dc:creator>Steve Faulkner</dc:creator>
        </item>
        <item>
            <title><![CDATA[Improve global upload performance with R2 Local Uploads]]></title>
            <link>https://blog.cloudflare.com/r2-local-uploads/</link>
            <pubDate>Tue, 03 Feb 2026 14:00:00 GMT</pubDate>
            <description><![CDATA[ Local Uploads on R2 reduces request duration for uploads by up to 75%. It writes object data to a nearby location and asynchronously copies it to your bucket, all while data is available immediately.  ]]></description>
            <content:encoded><![CDATA[ <p>Today, we are launching<b> Local Uploads</b> for R2 in <b>open beta</b>. With <a href="https://developers.cloudflare.com/r2/buckets/local-uploads/"><u>Local Uploads</u></a> enabled, object data is automatically written to a storage location close to the client first, then asynchronously copied to where the bucket lives. The data is immediately accessible and stays <a href="https://developers.cloudflare.com/r2/reference/consistency/"><u>strongly consistent</u></a>. Uploads get faster, and data feels global.</p><p>For many applications, performance needs to be global. Users uploading media content from different regions, for example, or devices sending logs and telemetry from all around the world. But your data has to live somewhere, and that means uploads from far away have to travel the full distance to reach your bucket. </p><p><a href="https://www.cloudflare.com/developer-platform/products/r2/"><u>R2</u></a> is <a href="https://www.cloudflare.com/learning/cloud/what-is-object-storage/"><u>object storage</u></a> built on Cloudflare's global network. Out of the box, it automatically caches object data globally for fast reads anywhere — all while retaining strong consistency and zero <a href="https://www.cloudflare.com/learning/cloud/what-are-data-egress-fees/"><u>egress fees</u></a>. This happens behind the scenes whether you're using the <a href="https://www.cloudflare.com/developer-platform/solutions/s3-compatible-object-storage/">S3</a> API, Workers Bindings, or plain HTTP. And now with Local Uploads, both reads and writes can be fast from anywhere in the world.</p><p>Try it yourself <a href="https://local-uploads.r2-demo.workers.dev/"><u>in this demo</u></a> to see the benefits of Local Uploads.</p><p>Ready to try it? 
Enable Local Uploads in the <a href="https://dash.cloudflare.com/?to=/:account/r2/overview"><u>Cloudflare Dashboard</u></a> under your bucket's settings, or with a single Wrangler command on an existing bucket.</p>
            <pre><code>npx wrangler r2 bucket local-uploads enable [BUCKET]</code></pre>
            
    <div>
      <h2>75% lower total request duration for global uploads</h2>
      <a href="#75-lower-total-request-duration-for-global-uploads">
        
      </a>
    </div>
    <p><a href="https://developers.cloudflare.com/r2/buckets/local-uploads"><u>Local Uploads</u></a> makes upload requests (i.e. PutObject, UploadPart) faster. In both our private beta tests with customers and our synthetic benchmarks, we saw up to a 75% reduction in Time to Last Byte (TTLB) when upload requests are made from a different region than the bucket. In these results, TTLB is measured from when R2 receives the upload request to when R2 returns a 200 response.</p><p>In our synthetic tests, we simulated a cross-region upload workflow: we deployed a test client in Western North America and configured an R2 bucket with a <a href="https://developers.cloudflare.com/r2/reference/data-location/"><u>location hint</u></a> for Asia-Pacific. The client performed around 20 PutObject requests per second over 30 minutes, uploading 5 MB objects. </p><p>The following graph compares the p50 (median) TTLB for these requests, showing the difference in upload request duration — first without Local Uploads (TTLB around 2s), and then with Local Uploads enabled (TTLB around 500ms): </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4uvSdPwflyjHohwLQvOKsu/4b82637a5ac29ceee0fc37e04ab0107f/image1.png" />
          </figure>
    <div>
      <h2>How it works: The distance problem</h2>
      <a href="#how-it-works-the-distance-problem">
        
      </a>
    </div>
    <p>To understand how Local Uploads can improve upload requests, let’s first take a look at <a href="https://developers.cloudflare.com/r2/how-r2-works/"><u>how R2 works</u></a>. R2's architecture is composed of multiple components including:</p><ul><li><p><b>R2 Gateway Worker: </b>The entry point for all API requests that handles authentication and routing logic. It is deployed across Cloudflare's global network via <a href="https://developers.cloudflare.com/workers/"><u>Cloudflare Workers</u></a>.</p></li><li><p><b>Durable Object Metadata Service: </b>A distributed layer built on <a href="https://developers.cloudflare.com/durable-objects/"><u>Durable Objects</u></a> used to store and manage object metadata (e.g. object key, checksum).</p></li><li><p><b>Distributed Storage Infrastructure: </b>The underlying infrastructure that persistently stores encrypted object data.</p></li></ul><p>Without Local Uploads, here’s what happens when you upload objects to your bucket: The request is first received by the R2 Gateway, close to the user, where it is authenticated. Then, as the client streams bytes of the object data, the data is encrypted and written into the storage infrastructure in the region where the bucket is placed. When this is completed, the Gateway reaches out to the Metadata Service to publish the object metadata, and it returns a success response back to the client after it is committed.</p><p>If the client and the bucket are in separate regions, more variability can be introduced in the process of uploading bytes of the object data, due to the longer distance that the request must travel. This could result in slower or less reliable uploads. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6toAZ6JSHPv2jgntdyCOvr/704f6837d2705f18a0e5b8554994cb7a/image9.png" />
          </figure><p><sup>A client uploading from Eastern North America to a bucket in Eastern Europe without Local Uploads enabled. </sup></p><p>Now, when you make an upload request to a bucket with Local Uploads enabled, there are two cases: </p><ol><li><p>The client is in the <b>same</b> region as the bucket</p></li><li><p>The client is in a <b>different</b> region than the bucket</p></li></ol><p>In the first case, R2 follows the regular flow, where object data is written to the storage infrastructure for your bucket. In the second case, R2 writes to the storage infrastructure located in the client region while still publishing the object metadata to the region of the bucket.</p><p>Importantly, the object is immediately accessible after the initial write completes. It remains accessible throughout the entire replication process — there's <b>no waiting period</b> for background replication to finish before the object can be read.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/33oUAdlGF8cWOeQhha6Ocy/68537e503f1ec8d1dd080db363f97dc3/image3.png" />
          </figure><p><sup>A client uploading from Eastern North America to a bucket in Eastern Europe with Local Uploads enabled. </sup></p><p>Note that Local Uploads is not available for buckets with a jurisdiction restriction (e.g. EU, FedRAMP) enabled.</p>
    <div>
      <h2>When to use Local Uploads</h2>
      <a href="#when-to-use-local-uploads">
        
      </a>
    </div>
    <p>Local Uploads is built for workloads that receive many upload requests originating from geographic regions other than where your bucket is located. This feature is ideal when:</p><ul><li><p>Your users are globally distributed</p></li><li><p>Upload performance and reliability are critical to your application</p></li><li><p>You want to optimize write performance without changing your bucket's primary location</p></li></ul><p>To understand the geographic distribution of where your read and write requests are initiated, you can visit the <a href="https://dash.cloudflare.com/?to=/:account/r2/overview"><u>Cloudflare Dashboard</u></a>, go to your R2 bucket’s Metrics page, and view the Request Distribution by Region graph. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6SJ9UYY3RADryXmnT0J3Vq/9b26c948e925a705387a64c24a1dd7e3/image7.png" />
          </figure>
    <div>
      <h2>How we built Local Uploads</h2>
      <a href="#how-we-built-local-uploads">
        
      </a>
    </div>
    <p>With Local Uploads, object data is written close to the client and then copied to the bucket's region in the background. We call this copy job a replication task.</p><p>Given these replication tasks, we needed an asynchronous processing component for them, which tends to be a great use case for <a href="https://developers.cloudflare.com/queues/"><u>Cloudflare Queues</u></a>. Queues allow us to control the rate at which we process replication tasks, and it provides built-in failure handling capabilities like <a href="https://developers.cloudflare.com/queues/configuration/batching-retries/"><u>retries</u></a> and <a href="https://developers.cloudflare.com/queues/configuration/dead-letter-queues/"><u>dead letter queues</u></a>. In this case, R2 shards replication tasks across multiple queues per storage region.</p>
    <div>
      <h3>Publishing metadata and scheduling replication</h3>
      <a href="#publishing-metadata-and-scheduling-replication">
        
      </a>
    </div>
    <p>When publishing the metadata of an object with Local Uploads enabled, we perform three operations atomically:</p><ol><li><p>Store the object metadata</p></li><li><p>Create a pending replica key that tracks which replications still need to happen</p></li><li><p>Create a replication task marker keyed by timestamp, which controls when the task should be sent to the queue</p></li></ol><p>The pending replica key contains the full replication plan: the number of replication tasks, which source location to read from, which destination location to write to, the replication mode and priority, and whether the source should be deleted after successful replication.</p><p>This gives us flexibility in how we move an object's data. For example, moving data across long geographical distances is expensive. We could try to move all the replicas as fast as possible by processing them in parallel, but this would incur greater cost and pressure the network infrastructure. Instead, we minimize the number of cross-regional data movements by first creating one replica in the target bucket region, and then use this local copy to create additional replicas within the bucket region.</p>
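<p>The cross-region minimization described above can be sketched as a small planning function. The region names, task shape, and replica counts are illustrative, not R2 internals:</p>

```typescript
interface ReplicationTask {
  from: string; // source storage region
  to: string;   // destination storage region
}

// Build a replication plan that minimizes cross-region data movement:
// copy the object once from the upload region to the bucket region,
// then create any additional replicas from that local copy.
function planReplication(
  uploadRegion: string,
  bucketRegion: string,
  replicaCount: number
): ReplicationTask[] {
  const tasks: ReplicationTask[] = [
    { from: uploadRegion, to: bucketRegion }, // the only cross-region hop
  ];
  for (let i = 1; i < replicaCount; i++) {
    tasks.push({ from: bucketRegion, to: bucketRegion }); // local fan-out
  }
  return tasks;
}
```

<p>However many replicas the bucket region needs, only the first task crosses a long geographic distance.</p>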
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2rCuA2zXR4ltZJsiNDBHd7/ae388f13ea27922158b27f429080c69c/image6.png" />
          </figure><p>A background process periodically scans the replication task markers and sends them to one of the queues associated with the destination storage region. The markers guarantee at-least-once delivery to the queue — if enqueueing fails or the process crashes, the marker persists and the task will be retried on the next scan. This also allows us to process replications at different times and enqueue only valid tasks. Once a replication task reaches a queue, it is ready to be processed.</p>
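<p>The at-least-once behavior of the markers can be sketched as follows: a marker is deleted only after its task is successfully enqueued, so a failed enqueue or a crash leaves it in place for the next scan (the names here are illustrative):</p>

```typescript
type Marker = { taskId: string };

// Scan pending replication-task markers and enqueue each task.
// A marker is removed only after the enqueue succeeds, giving
// at-least-once delivery to the queue.
async function scanAndEnqueue(
  markers: Map<string, Marker>,
  enqueue: (m: Marker) => Promise<void>
): Promise<void> {
  for (const [id, marker] of [...markers]) {
    try {
      await enqueue(marker);
      markers.delete(id); // safe: the task is now in the queue
    } catch {
      // Leave the marker in place; the next scan retries it.
    }
  }
}
```
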
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5G4STZSp67TnKhehzCqFMv/445a0c74ba7f4bc5dd3de04eb7aa1257/image4.png" />
          </figure>
    <div>
      <h3>Asynchronous replication: Pull model</h3>
      <a href="#asynchronous-replication-pull-model">
        
      </a>
    </div>
    <p>For the queue consumer, we chose a pull model where a centralized polling service consumes tasks from the regional queues and dispatches them to the Gateway Worker for execution.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2p6SHkqO1tT7wxdhPJCFCr/86f219af85e332813ede2eb95a3810d8/image2.png" />
          </figure><p>Here's how it works:</p><ol><li><p>Polling service pulls from a regional queue: The consumer service polls the regional queue for replication tasks. It then batches the tasks to create uniform batch sizes based on the amount of data to be moved.</p></li><li><p>Polling service dispatches to Gateway Worker: The consumer service sends the replication job to the Gateway Worker.</p></li><li><p>Gateway Worker executes replication: The worker reads object data from the source location, writes it to the destination, and updates metadata in the Durable Object, optionally marking the source location to be garbage collected.</p></li><li><p>Gateway Worker reports result: On completion, the worker returns the result to the poller, which acknowledges the task to the queue as completed or failed.</p></li></ol><p>By using this pull model approach, we ensure that the replication process remains stable and efficient. The service can dynamically adjust its pace based on real-time system health, guaranteeing that data is safely replicated across regions.</p>
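<p>The batching in step 1 can be sketched as a greedy grouping of tasks into roughly uniform total sizes. The task shape and size cap are illustrative only:</p>

```typescript
interface Task {
  id: string;
  bytes: number; // amount of object data this task will move
}

// Greedily group tasks into batches of roughly uniform total size,
// closing a batch once adding the next task would exceed the cap.
function batchBySize(tasks: Task[], maxBatchBytes: number): Task[][] {
  const batches: Task[][] = [];
  let current: Task[] = [];
  let currentBytes = 0;
  for (const task of tasks) {
    if (current.length > 0 && currentBytes + task.bytes > maxBatchBytes) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(task);
    currentBytes += task.bytes;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```

<p>Uniform batch sizes keep each dispatch to the Gateway Worker doing a predictable amount of work, which makes it easier for the poller to pace itself.</p>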
    <div>
      <h2>Try it out</h2>
      <a href="#try-it-out">
        
      </a>
    </div>
    <p>Local Uploads is available now in open beta. There is <b>no additional cost</b> to enable Local Uploads. Upload requests made with this feature enabled incur the standard <a href="https://developers.cloudflare.com/r2/pricing/"><u>Class A operation costs</u></a>, same as upload requests made without Local Uploads.</p><p>To get started, visit the <a href="https://dash.cloudflare.com/?to=/:account/r2/overview"><u>Cloudflare Dashboard</u></a> under your bucket's settings and look for the Local Uploads card to enable, or simply run the following command using Wrangler to enable Local Uploads on a bucket.</p>
            <pre><code>npx wrangler r2 bucket local-uploads enable [BUCKET]</code></pre>
            <p>Enabling Local Uploads on a bucket is seamless: existing uploads will complete as expected and there’s no interruption to traffic.</p><p>For more information, refer to the <a href="https://developers.cloudflare.com/r2/buckets/local-uploads/"><u>Local Uploads documentation</u></a>. If you have questions or want to share feedback, join the discussion on our <a href="https://discord.gg/cloudflaredev"><u>Developer Discord</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[R2]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Storage]]></category>
            <guid isPermaLink="false">453lZMuYluqGqfRKADhf9K</guid>
            <dc:creator>Frank Chen</dc:creator>
            <dc:creator>Rahul Suresh</dc:creator>
            <dc:creator>Anni Wang</dc:creator>
        </item>
        <item>
            <title><![CDATA[A deep dive into BPF LPM trie performance and optimization]]></title>
            <link>https://blog.cloudflare.com/a-deep-dive-into-bpf-lpm-trie-performance-and-optimization/</link>
            <pubDate>Tue, 21 Oct 2025 13:00:00 GMT</pubDate>
            <description><![CDATA[ This post explores the performance of BPF LPM tries, a critical data structure used for IP matching.  ]]></description>
            <content:encoded><![CDATA[ <p>It started with a mysterious soft lockup message in production. A single, cryptic line that led us down a rabbit hole into the performance of one of the most fundamental data structures we use: the BPF LPM trie.</p><p>BPF trie maps (<a href="https://docs.ebpf.io/linux/map-type/BPF_MAP_TYPE_LPM_TRIE/">BPF_MAP_TYPE_LPM_TRIE</a>) are heavily used for things like IP and IP+Port matching when routing network packets, ensuring your request passes through the right services before returning a result. The performance of this data structure is critical for serving our customers, but the speed of the current implementation leaves a lot to be desired. We’ve run into several bottlenecks when storing millions of entries in BPF LPM trie maps, such as entry lookup times taking hundreds of milliseconds to complete and freeing maps locking up a CPU for over 10 seconds. For instance, BPF maps are used when evaluating Cloudflare’s <a href="https://www.cloudflare.com/network-services/products/magic-firewall/"><u>Magic Firewall</u></a> rules and these bottlenecks have even led to traffic packet loss for some customers.</p><p>This post gives a refresher of how tries and prefix matching work, benchmark results, and a list of the shortcomings of the current BPF LPM trie implementation.</p>
    <div>
      <h2>A brief recap of tries</h2>
      <a href="#a-brief-recap-of-tries">
        
      </a>
    </div>
<p>If it’s been a while since you last looked at the trie data structure (or if you’ve never seen it before), a trie is a tree data structure (similar to a binary tree) that stores and searches for data by key, where each node stores some number of the key’s bits.</p><p>Searches are performed by traversing a path from the root; the path itself reconstructs the key, meaning nodes do not need to store their full key. This differs from a traditional binary search tree (BST), where the primary invariant is that the left child’s key is less than the current node’s and the right child’s key is greater. BSTs require that each node store the full key so that a comparison can be made at each search step.</p><p>Here’s an example that shows how a BST might store values for the keys:</p><ul><li><p>ABC</p></li><li><p>ABCD</p></li><li><p>ABCDEFGH</p></li><li><p>DEF</p></li></ul>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1uXt5qwpyq7VzrqxXlHFLj/99677afd73a98b9ce04d30209065499f/image4.png" />
          </figure><p>In comparison, a trie for storing the same set of keys might look like this.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3TfFZmwekNAF18yWlOIVWh/58396a19e053bd1c02734a6a54eea18e/image8.png" />
          </figure><p>This way of splitting out bits is really memory-efficient when you have redundancy in your data, e.g. prefixes are common in your keys, because that shared data only requires a single set of nodes. It’s for this reason that tries are often used to efficiently store strings, e.g. dictionaries of words – storing the strings “ABC” and “ABCD” doesn’t require 3 bytes + 4 bytes (assuming ASCII), it only requires 3 bytes + 1 byte because “ABC” is shared by both (the exact number of bits required in the trie is implementation dependent).</p><p>Tries also allow more efficient searching. For instance, if you wanted to know whether the key “CAR” existed in the BST you are required to go to the right child of the root (the node with key “DEF”) and check its left child because this is where it would live if it existed. A trie is more efficient because it searches in prefix order. In this particular example, a trie knows at the root whether that key is in the trie or not.</p><p>This design makes tries perfectly suited for performing longest prefix matches and for working with IP routing using CIDR. CIDR was introduced to make more efficient use of the IP address space (no longer requiring that classes fall into 4 buckets of 8 bits) but comes with added complexity because now the network portion of an IP address can fall anywhere. Handling the CIDR scheme in IP routing tables requires matching on the longest (most specific) prefix in the table rather than performing a search for an exact match.</p><p>If searching a trie does a single-bit comparison at each node, that’s a binary trie. If searching compares more bits we call that a <b><i>multibit trie</i></b>. 
You can store anything you like in a trie, including IP and subnet addresses – it’s all just ones and zeroes.</p><p>Nodes in multibit tries use more memory than in binary tries, but since computers operate on multibit words anyhow, it’s more efficient from a microarchitecture perspective to use multibit tries because you can traverse through the bits faster, reducing the number of comparisons you need to make to search for your data. It’s a classic space vs time tradeoff.</p><p>There are other optimisations we can use with tries. The distribution of data that you store in a trie might not be uniform, and there could be sparsely populated areas. For example, if you store the strings “A” and “BCDEFGHI” in a multibit trie, how many nodes do you expect to use? If you’re using ASCII, you could construct the binary trie with a root node and branch left for “A” or right for “B”. With 8-bit nodes, you’d need another 7 nodes to store “C”, “D”, “E”, “F”, “G”, “H”, “I”.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/LO6izFC5e06dRf9ra2roC/167ba5c4128fcebacc7b7a8eab199ea5/image5.png" />
          </figure><p>Since there are no other strings in the trie, that’s pretty suboptimal. Once you hit the first level after matching on “B” you know there’s only one string in the trie with that prefix, and you can avoid creating all the other nodes by using <b><i>path compression</i></b>. Path compression replaces nodes “C”, “D”, “E” etc. with a single one such as “I”.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1ADY3lNtF7NIgfUX7bX9vY/828a14e155d6530a4dc8cf3286ce8cc3/image13.png" />
</figure><p>If you traverse the trie and hit “I”, you still need to compare the search key with the bits you skipped (“CDEFGH”) to make sure your search key matches the string. Exactly how and where you store the skipped bits is implementation dependent – BPF LPM tries simply store the entire key in the leaf node. As your data becomes denser, path compression is less effective.</p><p>What if your data distribution is dense and, say, the first 3 levels in a trie are fully populated? In that case you can use <b><i>level compression</i></b> and replace all the nodes in those levels with a single node that has 2^3 = 8 children. This is how Level-Compressed tries work; they are used for <a href="https://vincent.bernat.ch/en/blog/2017-ipv4-route-lookup-linux">IP route lookup</a> in the Linux kernel (see <a href="https://elixir.bootlin.com/linux/v6.12.43/source/net/ipv4/fib_trie.c"><u>net/ipv4/fib_trie.c</u></a>).</p><p>There are other optimisations too, but this brief detour is sufficient for this post because the BPF LPM trie implementation in the kernel doesn’t fully use the three we just discussed.</p>
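To make the detour concrete, here is a minimal C sketch (illustrative only, not kernel code) of the multibit idea: a node that consumes some `stride` of key bits per step holds 1 &lt;&lt; stride child pointers, and with stride = 1 it degenerates into the one-bit-per-node decision a binary trie makes at every level.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch, not kernel code: a multibit (or level-compressed)
 * node consumes `stride` bits of the key per step and holds 1 << stride
 * child pointers. With stride == 1 this is exactly the one-bit-per-node
 * decision a binary trie makes at every level. */
static uint32_t next_index(const uint8_t *key, uint32_t bit_off, uint32_t stride)
{
    uint32_t idx = 0;

    for (uint32_t i = 0; i < stride; i++, bit_off++)
        idx = (idx << 1) | ((key[bit_off / 8] >> (7 - bit_off % 8)) & 1);
    return idx;
}
```

For the key 192.168.0.1, a node with stride = 8 makes one 256-way decision where a binary trie would spend eight nodes (and eight potential cache misses) to reach the same child.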
    <div>
      <h2>How fast are BPF LPM trie maps?</h2>
      <a href="#how-fast-are-bpf-lpm-trie-maps">
        
      </a>
    </div>
    <p>Here are some numbers from running <a href="https://lore.kernel.org/bpf/20250827140149.1001557-1-matt@readmodwrite.com/"><u>BPF selftests benchmark</u></a> on AMD EPYC 9684X 96-Core machines. Here the trie has 10K entries, a 32-bit prefix length, and an entry for every key in the range [0, 10K).</p><table><tr><td><p>Operation</p></td><td><p>Throughput</p></td><td><p>Stddev</p></td><td><p>Latency</p></td></tr><tr><td><p>lookup</p></td><td><p>7.423M ops/s</p></td><td><p>0.023M ops/s</p></td><td><p>134.710 ns/op</p></td></tr><tr><td><p>update</p></td><td><p>2.643M ops/s</p></td><td><p>0.015M ops/s</p></td><td><p>378.310 ns/op</p></td></tr><tr><td><p>delete</p></td><td><p>0.712M ops/s</p></td><td><p>0.008M ops/s</p></td><td><p>1405.152 ns/op</p></td></tr><tr><td><p>free</p></td><td><p>0.573K ops/s</p></td><td><p>0.574K ops/s</p></td><td><p>1.743 ms/op</p></td></tr></table><p>The time to free a BPF LPM trie with 10K entries is noticeably large. We recently ran into an issue where this took so long that it caused <a href="https://lore.kernel.org/lkml/20250616095532.47020-1-matt@readmodwrite.com/"><u>soft lockup messages</u></a> to spew in production.</p><p>This benchmark gives some idea of worst case behaviour. Since the keys are so densely populated, path compression is completely ineffective. In the next section, we explore the lookup operation to understand the bottlenecks involved.</p>
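As a side note on what the benchmark keys look like: lookups into a BPF_MAP_TYPE_LPM_TRIE take a key that is a prefix length followed by the key bytes. The sketch below is a self-contained illustration that mirrors the layout of struct bpf_lpm_trie_key from the kernel UAPI headers, with the flexible data array fixed to 4 bytes for IPv4; it is not the kernel's definition.

```c
#include <assert.h>
#include <stdint.h>

/* Keys for BPF_MAP_TYPE_LPM_TRIE are a prefix length followed by the key
 * bytes. This struct mirrors the layout of struct bpf_lpm_trie_key from
 * the kernel UAPI headers, redefined here (with the flexible array fixed
 * to 4 bytes for IPv4) so the sketch is self-contained. */
struct lpm_key_v4 {
    uint32_t prefixlen;   /* number of significant leading bits, up to 32 */
    uint8_t  data[4];     /* address bytes, most significant first        */
};

/* Build a lookup key for an IPv4 prefix such as 192.168.0.0/24. */
static struct lpm_key_v4 make_key(uint8_t a, uint8_t b, uint8_t c, uint8_t d,
                                  uint32_t prefixlen)
{
    struct lpm_key_v4 k = { .prefixlen = prefixlen, .data = { a, b, c, d } };
    return k;
}
```

The benchmark above uses a 32-bit prefix length for every entry, i.e. every key is a full /32; with keys that dense, no two entries share a compressible path, which is what makes it a worst case for path compression.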
    <div>
      <h2>Why are BPF LPM tries slow?</h2>
      <a href="#why-are-bpf-lpm-tries-slow">
        
      </a>
    </div>
<p>The LPM trie implementation in <a href="https://elixir.bootlin.com/linux/v6.12.43/source/kernel/bpf/lpm_trie.c"><u>kernel/bpf/lpm_trie.c</u></a> has a couple of the optimisations we discussed above. It is capable of multibit comparisons at leaf nodes, but since there are only two child pointers in each internal node, if your trie is densely populated with a lot of data that differs by only one bit, these multibit comparisons degrade into single-bit comparisons.</p><p>Here’s an example. Suppose you store the numbers 0, 1, and 3 in a BPF LPM trie. You might hope that since these values fit in a single 32 or 64-bit machine word, you could use a single comparison to decide which node to visit next in the trie. But that’s only possible if your trie implementation has 3 child pointers in the current node (which, to be fair, most trie implementations do). In other words, you want to make a 3-way branching decision, but since BPF LPM trie nodes only have two children, you’re limited to a 2-way branch.</p><p>A diagram for this 2-child trie is given below.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1ciL2t6aMyJHR2FfX41rNk/365abe47cf384729408cf9b98c65c0be/image9.png" />
          </figure><p>The leaf nodes are shown in green with the key, as a binary string, in the center. Even though a single 8-bit comparison is more than capable of figuring out which node has that key, the BPF LPM trie implementation resorts to inserting intermediate nodes (blue) to inject 2-way branching decisions into your path traversal because its parent (the orange root node in this case) only has 2 children. Once you reach a leaf node, BPF LPM tries can perform a multibit comparison to check the key. If a node supported pointers to more children, the above trie could instead look like this, allowing a 3-way branch and reducing the lookup time.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/17VoWl8OY6tzcARKDKuSjS/b9200dbeddf13f101b7085a549742f95/image3.png" />
</figure><p>This 2-child design impacts the height of the trie. In the worst case, a completely full trie essentially becomes a binary search tree with height log2(nr_entries), and the height of the trie determines how many comparisons are required to search for a key.</p><p>The above trie also shows how BPF LPM tries implement a form of path compression – you only need to insert an intermediate node where you have two nodes whose keys differ by a single bit. If instead of 3, you insert a key of 15 (0b1111), this won’t change the layout of the trie; you still only need a single node at the right child of the root.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2ecfKSeoqN3bfBXmC9KHw5/3be952edea34d6b2cc867ba31ce14805/image12.png" />
</figure><p>And finally, BPF LPM tries do not implement level compression. Again, this stems from the fact that nodes in the trie can only have 2 children. IP route tables tend to have many prefixes in common, so you typically see densely packed tries at the upper levels, which makes level compression very effective for tries containing IP routes.</p><p>Here’s a graph showing how the lookup throughput for LPM tries (measured in million ops/sec) degrades as the number of entries increases, from 1 entry up to 100K entries.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/33I92exrEZTcUWOjxaBOqY/fb1de551b06e3272c8670d0117d738fa/image2.png" />
          </figure><p>Once you reach 1 million entries, throughput is around 1.5 million ops/sec, and continues to fall as the number of entries increases.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4OhaAaI5Y2XJCofI9V39z/567a01b3335f29ef3b46ccdd74dc27e5/image1.png" />
</figure><p>Why is this? Initially, the culprit is the L1 dcache miss rate: every node traversed in the trie is a potential cache miss.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5Gx4fOLKmhUKHegybQU7sl/4936239213f0061d5cbc2f5d6b63fde6/image11.png" />
</figure><p>As you can see from the graph, the L1 dcache miss rate remains relatively steady, and yet the throughput continues to decline. At around 80K entries, the dTLB miss rate becomes the bottleneck.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4Jy7aTN3Nyo2EsbSzw313n/d26871fa417ffe293adb47fe7f7dc56b/image7.png" />
</figure><p>Because BPF LPM tries dynamically allocate individual nodes from a freelist of kernel memory, these nodes can live at arbitrary addresses. This means that traversing a path through a trie will almost certainly incur cache misses, and potentially dTLB misses too. This gets worse as the number of entries, and the height of the trie, increases.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6CB3MvSvSgH1T2eY7Xlei8/81ebe572592ca71529d79564a88993f0/image10.png" />
          </figure>
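The behaviour described in this section, one key bit consumed per node and one pointer dereference (a potential cache or dTLB miss) per level, can be sketched in a few lines of C. This is a simplified illustration with 8-bit keys and no path compression, not the kernel's implementation:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified sketch of a 2-child LPM trie over 8-bit keys, without path
 * compression: each level consumes exactly one key bit, and every step of
 * a lookup is a pointer dereference to a node at an arbitrary address. */
struct tnode {
    struct tnode *child[2];
    int has_value;
    int value;
};

static int bit_at(uint8_t key, uint8_t pos)  /* pos 0 = most significant */
{
    return (key >> (7 - pos)) & 1;
}

/* Insert `value` for the prefix given by the top `prefixlen` bits of `key`,
 * creating one node per bit along the way. Returns the (possibly new) root. */
static struct tnode *lpm_insert(struct tnode *root, uint8_t key,
                                uint8_t prefixlen, int value)
{
    if (!root)
        root = calloc(1, sizeof(*root));
    struct tnode *n = root;
    for (uint8_t depth = 0; depth < prefixlen; depth++) {
        int b = bit_at(key, depth);
        if (!n->child[b])
            n->child[b] = calloc(1, sizeof(*n));
        n = n->child[b];
    }
    n->has_value = 1;
    n->value = value;
    return root;
}

/* Walk one bit per level, remembering the longest (deepest) match seen. */
static int lpm_lookup(const struct tnode *root, uint8_t key)
{
    int best = -1;
    const struct tnode *n = root;
    for (uint8_t depth = 0; n; depth++) {
        if (n->has_value)
            best = n->value;
        if (depth == 8)
            break;
        n = n->child[bit_at(key, depth)];
    }
    return best;
}
```

Insert two nested prefixes and the walk records the deepest real prefix it passes through; with n dense entries the walk is roughly log2(n) dereferences long, which is exactly where the cache and dTLB misses shown above come from.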
    <div>
      <h2>Where do we go from here?</h2>
      <a href="#where-do-we-go-from-here">
        
      </a>
    </div>
<p>By understanding the current limitations of the BPF LPM trie, we can now work towards building a more performant and efficient solution for the future of the Internet.</p><p>We’ve already contributed these benchmarks to the upstream Linux kernel — but that’s only the start. We have plans to improve the performance of BPF LPM tries, particularly the lookup function, which is heavily used in our workloads. This post covered a number of optimisations that are already used by the <a href="https://elixir.bootlin.com/linux/v6.12.43/source/net/ipv4/fib_trie.c"><u>net/ipv4/fib_trie.c</u></a> code, so a natural first step is to refactor that code so that a common Level-Compressed trie implementation can be shared. Expect future blog posts to explore this work in depth.</p><p>If you’re interested in looking at more performance numbers, Jesper Brouer has recorded some here: <a href="https://github.com/xdp-project/xdp-project/blob/main/areas/bench/bench02_lpm-trie-lookup.org">https://github.com/xdp-project/xdp-project/blob/main/areas/bench/bench02_lpm-trie-lookup.org</a>.</p><h6><i>If the Linux kernel, performance, or optimising data structures excites you, </i><a href="https://www.cloudflare.com/en-gb/careers/jobs/?department=Engineering&amp;location=default"><i>our engineering teams are hiring</i></a><i>.</i></h6><p></p> ]]></content:encoded>
            <category><![CDATA[Deep Dive]]></category>
            <category><![CDATA[eBPF]]></category>
            <category><![CDATA[IPv4]]></category>
            <category><![CDATA[IPv6]]></category>
            <category><![CDATA[Linux]]></category>
            <category><![CDATA[Performance]]></category>
            <guid isPermaLink="false">2A4WHjTqyxprwUMPaZ6tfj</guid>
            <dc:creator>Matt Fleming</dc:creator>
            <dc:creator>Jesper Brouer</dc:creator>
        </item>
        <item>
            <title><![CDATA[15 years of helping build a better Internet: a look back at Birthday Week 2025]]></title>
            <link>https://blog.cloudflare.com/birthday-week-2025-wrap-up/</link>
            <pubDate>Mon, 29 Sep 2025 14:00:00 GMT</pubDate>
            <description><![CDATA[ Rust-powered core systems, post-quantum upgrades, developer access for students, PlanetScale integration, open-source partnerships, and our biggest internship program ever — 1,111 interns in 2026. ]]></description>
            <content:encoded><![CDATA[ <p>Cloudflare launched fifteen years ago with a mission to help build a better Internet. Over that time the Internet has changed and so has what it needs from teams like ours.  In this year’s <a href="https://blog.cloudflare.com/cloudflare-2025-annual-founders-letter/"><u>Founder’s Letter</u></a>, Matthew and Michelle discussed the role we have played in the evolution of the Internet, from helping encryption grow from 10% to 95% of Internet traffic to more recent challenges like how people consume content. </p><p>We spend Birthday Week every year releasing the products and capabilities we believe the Internet needs at this moment and around the corner. Previous <a href="https://blog.cloudflare.com/tag/birthday-week/"><u>Birthday Weeks</u></a> saw the launch of <a href="https://blog.cloudflare.com/introducing-cloudflares-automatic-ipv6-gatewa/"><u>IPv6 gateway</u></a> in 2011,  <a href="https://blog.cloudflare.com/introducing-universal-ssl/"><u>Universal SSL</u></a> in 2014, <a href="https://blog.cloudflare.com/introducing-cloudflare-workers/"><u>Cloudflare Workers</u></a> and <a href="https://blog.cloudflare.com/unmetered-mitigation/"><u>unmetered DDoS protection</u></a> in 2017, <a href="https://blog.cloudflare.com/introducing-cloudflare-radar/"><u>Cloudflare Radar</u></a> in 2020, <a href="https://www.cloudflare.com/developer-platform/products/r2/"><u>R2 Object Storage</u></a> with zero egress fees in 2021,  <a href="https://blog.cloudflare.com/post-quantum-tunnel/"><u>post-quantum upgrades for Cloudflare Tunnel</u></a> in 2022, <a href="https://blog.cloudflare.com/best-place-region-earth-inference/"><u>Workers AI</u></a> and <a href="https://blog.cloudflare.com/announcing-encrypted-client-hello/"><u>Encrypted Client Hello</u></a> in 2023. 
And those are just a sample of the launches.</p><p>This year’s themes focused on helping prepare the Internet for a new model of monetization that encourages great content to be published, fostering more opportunities to build community both inside and outside of Cloudflare, and evergreen missions like making more features available to everyone and constantly improving the speed and security of what we offer.</p><p>We shipped a lot of new things this year. In case you missed the dozens of blog posts, here is a breakdown of everything we announced during Birthday Week 2025. </p><p><b>Monday, September 22</b></p>
<div><table><thead>
  <tr>
    <th><span>What</span></th>
    <th><span>In a sentence …</span></th>
  </tr></thead>
<tbody>
  <tr>
    <td><a href="https://blog.cloudflare.com/cloudflare-1111-intern-program/"><span>Help build the future: announcing Cloudflare’s goal to hire 1,111 interns in 2026</span></a></td>
    <td><span>To invest in the next generation of builders, we announced our most ambitious intern program yet with a goal to hire 1,111 interns in 2026.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/supporting-the-future-of-the-open-web/"><span>Supporting the future of the open web: Cloudflare is sponsoring Ladybird and Omarchy</span></a></td>
    <td><span>To support a diverse and open Internet, we are now sponsoring Ladybird (an independent browser) and Omarchy (an open-source Linux distribution and developer environment).</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/new-hubs-for-startups/"><span>Come build with us: Cloudflare’s new hubs for startups</span></a></td>
    <td><span>We are opening our office doors in four major cities (San Francisco, Austin, London, and Lisbon) as free hubs for startups to collaborate and connect with the builder community.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/ai-crawl-control-for-project-galileo/"><span>Free access to Cloudflare developer services for non-profit and civil society organizations</span></a></td>
    <td><span>We extended our Cloudflare for Startups program to non-profits and public-interest organizations, offering free credits for our developer tools.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/workers-for-students/"><span>Introducing free access to Cloudflare developer features for students</span></a></td>
    <td><span>We are removing cost as a barrier for the next generation by giving students with .edu emails 12 months of free access to our paid developer platform features.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/capnweb-javascript-rpc-library/"><span>Cap’n Web: a new RPC system for browsers and web servers</span></a></td>
    <td><span>We open-sourced Cap'n Web, a new JavaScript-native RPC protocol that simplifies powerful, schema-free communication for web applications.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/workers-launchpad-006/"><span>A lookback at Workers Launchpad and a warm welcome to Cohort #6</span></a></td>
    <td><span>We announced Cohort #6 of the Workers Launchpad, our accelerator program for startups building on Cloudflare.</span></td>
  </tr>
</tbody></table></div><p><b>Tuesday, September 23</b></p>
<div><table><thead>
  <tr>
    <th><span>What</span></th>
    <th><span>In a sentence …</span></th>
  </tr></thead>
<tbody>
  <tr>
    <td><a href="https://blog.cloudflare.com/per-customer-bot-defenses/"><span>Building unique, per-customer defenses against advanced bot threats in the AI era</span></a></td>
    <td><span>New anomaly detection system that uses machine learning trained on each zone to build defenses against AI-driven bot attacks. </span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/cloudflare-astro-tanstack/"><span>Why Cloudflare, Netlify, and Webflow are collaborating to support Open Source tools</span></a></td>
    <td><span>To support the open web, we joined forces with Webflow to sponsor Astro, and with Netlify to sponsor TanStack.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/x402/"><span>Launching the x402 Foundation with Coinbase, and support for x402 transactions</span></a></td>
    <td><span>We are partnering with Coinbase to create the x402 Foundation, encouraging the adoption of the </span><a href="https://github.com/coinbase/x402"><span>x402 protocol</span></a><span> to allow clients and services to exchange value on the web using a common language.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/ai-crawl-control-for-project-galileo/"><span>Helping protect journalists and local news from AI crawlers with Project Galileo</span></a></td>
    <td><span>We are extending our free Bot Management and AI Crawl Control services to journalists and news organizations through Project Galileo.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/confidence-score-rubric/"><span>Cloudflare Confidence Scorecards - making AI safer for the Internet</span></a></td>
    <td><span>Automated evaluation of AI and SaaS tools, helping organizations to embrace AI without compromising security.</span></td>
  </tr>
</tbody></table></div><p><b>Wednesday, September 24</b></p>
<div><table><thead>
  <tr>
    <th><span>What</span></th>
    <th><span>In a sentence …</span></th>
  </tr></thead>
<tbody>
  <tr>
    <td><a href="https://blog.cloudflare.com/automatically-secure/"><span>Automatically Secure: how we upgraded 6,000,000 domains by default</span></a></td>
    <td><span>Our Automatic SSL/TLS system has upgraded over 6 million domains to more secure encryption modes by default and will soon automatically enable post-quantum connections.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/content-signals-policy/"><span>Giving users choice with Cloudflare’s new Content Signals Policy</span></a></td>
    <td><span>The Content Signals Policy is a new standard for robots.txt that lets creators express clear preferences for how AI can use their content.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/building-a-better-internet-with-responsible-ai-bot-principles/"><span>To build a better Internet in the age of AI, we need responsible AI bot principles</span></a></td>
    <td><span>A proposed set of responsible AI bot principles to start a conversation around transparency and respect for content creators' preferences.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/saas-to-saas-security/"><span>Securing data in SaaS to SaaS applications</span></a></td>
    <td><span>New security tools to give companies visibility and control over data flowing between SaaS applications.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/post-quantum-warp/"><span>Securing today for the quantum future: WARP client now supports post-quantum cryptography (PQC)</span></a></td>
    <td><span>Cloudflare’s WARP client now supports post-quantum cryptography, providing quantum-resistant encryption for traffic. </span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/a-simpler-path-to-a-safer-internet-an-update-to-our-csam-scanning-tool/"><span>A simpler path to a safer Internet: an update to our CSAM scanning tool</span></a></td>
    <td><span>We made our CSAM Scanning Tool easier to adopt by removing the need to create and provide unique credentials, helping more site owners protect their platforms.</span></td>
  </tr>
</tbody></table></div><p>
<b>Thursday, September 25</b></p>
<div><table><thead>
  <tr>
    <th><span>What</span></th>
    <th><span>In a sentence …</span></th>
  </tr></thead>
<tbody>
  <tr>
    <td><a href="https://blog.cloudflare.com/enterprise-grade-features-for-all/"><span>Every Cloudflare feature, available to everyone</span></a></td>
    <td><span>We are making every Cloudflare feature, starting with Single Sign On (SSO), available for anyone to purchase on any plan. </span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/cloudflare-developer-platform-keeps-getting-better-faster-and-more-powerful/"><span>Cloudflare's developer platform keeps getting better, faster, and more powerful</span></a></td>
    <td><span>Updates across Workers and beyond for a more powerful developer platform – such as support for larger and more concurrent Container images, support for external models from OpenAI and Anthropic in AI Search (previously AutoRAG), and more. </span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/planetscale-postgres-workers/"><span>Partnering to make full-stack fast: deploy PlanetScale databases directly from Workers</span></a></td>
    <td><span>You can now connect Cloudflare Workers to PlanetScale databases directly, with connections automatically optimized by Hyperdrive.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/cloudflare-data-platform/"><span>Announcing the Cloudflare Data Platform</span></a></td>
    <td><span>A complete solution for ingesting, storing, and querying analytical data tables using open standards like Apache Iceberg. </span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/r2-sql-deep-dive/"><span>R2 SQL: a deep dive into our new distributed query engine</span></a></td>
    <td><span>A technical deep dive on R2 SQL, a serverless query engine for petabyte-scale datasets in R2.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/safe-in-the-sandbox-security-hardening-for-cloudflare-workers/"><span>Safe in the sandbox: security hardening for Cloudflare Workers</span></a></td>
    <td><span>A deep-dive into how we’ve hardened the Workers runtime with new defense-in-depth security measures, including V8 sandboxes and hardware-assisted memory protection keys.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/sovereign-ai-and-choice/"><span>Choice: the path to AI sovereignty</span></a></td>
    <td><span>To champion AI sovereignty, we've added locally-developed open-source models from India, Japan, and Southeast Asia to our Workers AI platform.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/email-service/"><span>Announcing Cloudflare Email Service’s private beta</span></a></td>
    <td><span>We announced the Cloudflare Email Service private beta, allowing developers to reliably send and receive transactional emails directly from Cloudflare Workers.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/nodejs-workers-2025/"><span>A year of improving Node.js compatibility in Cloudflare Workers</span></a></td>
    <td><span>There are hundreds of new Node.js APIs now available that make it easier to run existing Node.js code on our platform. </span></td>
  </tr>
</tbody></table></div><p><b>Friday, September 26</b></p>
<table><thead>
  <tr>
    <th><span>What</span></th>
    <th><span>In a sentence …</span></th>
  </tr></thead>
<tbody>
  <tr>
    <td><a href="https://blog.cloudflare.com/20-percent-internet-upgrade"><span>Cloudflare just got faster and more secure, powered by Rust</span></a></td>
    <td><span>We have re-engineered our core proxy with a new modular, Rust-based architecture, cutting median response time by 10ms for millions. </span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/introducing-observatory-and-smart-shield/"><span>Introducing Observatory and Smart Shield</span></a></td>
    <td><span>New monitoring tools in the Cloudflare dashboard that provide actionable recommendations and one-click fixes for performance issues.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/monitoring-as-sets-and-why-they-matter/"><span>Monitoring AS-SETs and why they matter</span></a></td>
    <td><span>Cloudflare Radar now includes Internet Routing Registry (IRR) data, allowing network operators to monitor AS-SETs to help prevent route leaks.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/an-ai-index-for-all-our-customers"><span>An AI Index for all our customers</span></a></td>
    <td><span>We announced the private beta of AI Index, a new service that creates an AI-optimized search index for your domain that you control and can monetize.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/new-regional-internet-traffic-and-certificate-transparency-insights-on-radar/"><span>Introducing new regional Internet traffic and Certificate Transparency insights on Cloudflare Radar</span></a></td>
    <td><span>Sub-national traffic insights and Certificate Transparency dashboards for TLS monitoring.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/eliminating-cold-starts-2-shard-and-conquer/"><span>Eliminating Cold Starts 2: shard and conquer</span></a></td>
    <td><span>We have reduced Workers cold starts by 10x by implementing a new "worker sharding" system that routes requests to already-loaded Workers.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/network-performance-update-birthday-week-2025/"><span>Network performance update: Birthday Week 2025</span></a></td>
    <td><span>The TCP Connection Time (Trimean) graph shows that we deliver the fastest TCP connection time in 40% of measured ISPs – and the fastest across the top networks.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/how-cloudflare-uses-the-worlds-greatest-collection-of-performance-data/"><span>How Cloudflare uses performance data to make the world’s fastest global network even faster</span></a></td>
    <td><span>We are using our network's vast performance data to tune congestion control algorithms, improving speeds by an average of 10% for QUIC traffic.</span></td>
  </tr>
  <tr>
    <td><a href="https://blog.cloudflare.com/code-mode/"><span>Code Mode: the better way to use MCP</span></a></td>
    <td><span>It turns out we've all been using MCP wrong. Most agents today use MCP by exposing the "tools" directly to the LLM. We tried something different: Convert the MCP tools into a TypeScript API, and then ask an LLM to write code that calls that API. The results are striking.</span></td>
  </tr>
</tbody></table>
    <div>
      <h3>Come build with us!</h3>
      <a href="#come-build-with-us">
        
      </a>
    </div>
    <p>Helping build a better Internet has always been about more than just technology. From hiring interns to working together in our offices, the community of people helping build a better Internet matters just as much to its future. This week, we rolled out our most ambitious set of initiatives ever to support the builders, founders, and students who are creating the future.</p><p>For founders and startups, we are thrilled to welcome <b>Cohort #6</b> to the <b>Workers Launchpad</b>, our accelerator program that gives early-stage companies the resources they need to scale. But we’re not stopping there. We’re opening our doors, literally, by launching <b>new physical hubs for startups</b> in our San Francisco, Austin, London, and Lisbon offices. These spaces will provide access to mentorship, resources, and a community of fellow builders.</p><p>We’re also investing in the next generation of talent. We announced <b>free access to the Cloudflare developer platform for all students</b>, giving them the tools to learn and experiment without limits. To provide a path from the classroom to the industry, we also announced our goal to hire <b>1,111 interns in 2026</b> — our biggest commitment yet to fostering future tech leaders.</p><p>And because a better Internet is for everyone, we’re extending our support to <b>non-profits and public-interest organizations</b>, offering them free access to our production-grade developer tools, so they can focus on their missions.</p><p>Whether you're a founder with a big idea, a student just getting started, or a team working for a cause you believe in, we want to help you succeed.</p>
    <div>
      <h3>Until next year</h3>
      <a href="#until-next-year">
        
      </a>
    </div>
    <p>Thank you to our customers, our community, and the millions of developers who trust us to help them build, secure, and accelerate the Internet. Your curiosity and feedback drive our innovation.</p><p>It’s been an incredible 15 years. And as always, we’re just getting started!</p><p><i>(Watch the full conversation on our show </i><a href="https://thisweekinnet.com"><i>ThisWeekinNET.com</i></a><i> about what we launched during Birthday Week 2025 </i><a href="https://youtu.be/Z2uHFc9ua9s?feature=shared"><i><b><u>here</u></b></i></a><i>.) </i></p> ]]></content:encoded>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Partners]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Workers Launchpad]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Security]]></category>
            <category><![CDATA[Cache]]></category>
            <category><![CDATA[Speed]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[1.1.1.1]]></category>
            <category><![CDATA[Application Security]]></category>
            <category><![CDATA[Application Services]]></category>
            <category><![CDATA[Bots]]></category>
            <category><![CDATA[CDN]]></category>
            <category><![CDATA[Cloudflare for Startups]]></category>
            <category><![CDATA[Cloudflare One]]></category>
            <category><![CDATA[Cloudflare Zero Trust]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <guid isPermaLink="false">4k1NhJtljIsH7GOkpHg1Ei</guid>
            <dc:creator>Nikita Cano</dc:creator>
            <dc:creator>Korinne Alpers</dc:creator>
        </item>
        <item>
            <title><![CDATA[Introducing Observatory and Smart Shield — see how the world sees your website, and make it faster in one click]]></title>
            <link>https://blog.cloudflare.com/introducing-observatory-and-smart-shield/</link>
            <pubDate>Fri, 26 Sep 2025 14:00:00 GMT</pubDate>
            <description><![CDATA[ We're announcing two enhancements to our Application Performance suite that'll show you how the world sees your website, and make it faster with one click - available now in the Cloudflare Dashboard! ]]></description>
            <content:encoded><![CDATA[ <p>Modern users expect instant, reliable web experiences. When your application is slow, they don’t just complain — they leave. Even delays as small as 100 ms have been <a href="https://wpostats.com/"><u>shown to have a measurable impact on revenue, conversions, bounce rate, engagement and more</u></a>. </p><p>If you’re responsible for delivering on these expectations to the users of your product, you know there are many monitoring tools that show you how visitors experience your website, and can let you know when things are slow or causing issues. This is essential, but we believe understanding the problem is only half the story. The real value comes from integrating monitoring and remedies in the same view, giving customers the ability to quickly identify and resolve issues.</p><p>That's why today, we're excited to launch the new and improved <b>Observatory</b>, now in open beta. This monitoring and <a href="https://www.cloudflare.com/learning/performance/what-is-observability/">observability</a> tool goes beyond charts and graphs by also telling you exactly how to improve your application's performance and resilience, and immediately showing you the impact of those changes. And we’re releasing it to all subscription tiers (including Free!), available today. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/T6HhZL51aLEhzD3lQPxCq/f9b03f05cf4db0b2e61c8e861df4ecdf/1.png" />
          </figure><p>But wait, there’s more! To make your users’ experience in Cloudflare even faster, we’re launching Smart Shield, available today for all subscription tiers. Using Observatory, you can pinpoint performance bottlenecks, and for many of the most common issues, you can now apply the fix in just a few clicks with our <b>Smart Shield</b> product. Double the fun!</p>
    <div>
      <h2>Our unique perspective: leveraging data from 20% of the web</h2>
      <a href="#our-unique-perspective-leveraging-data-from-20-of-the-web">
        
      </a>
    </div>
    <p>Every day, Cloudflare handles traffic for over 20% of the web, giving us a unique vantage point into what makes websites faster and more resilient. We built Observatory to take advantage of this position, uniting data that is normally scattered across different tools — including real-user data, synthetic testing, error rates, and backend telemetry — into a single platform. This gives you a complete, cohesive picture of your application's health end-to-end, in one spot, and enables you to easily identify and resolve performance issues.</p><p>For this launch, we're bringing together:</p><ul><li><p><b>Real-user data:</b> See how your application performs for real people, in the real world.</p></li><li><p><b>Back-end telemetry:</b> Break down the lifecycle of a request to pinpoint areas for improvement.</p></li><li><p><b>Error rates:</b> Understand the stability of your application at both the edge and origin.</p></li><li><p><b>Cache hit ratios:</b> Ensure you're maximizing the performance of your configuration.</p></li><li><p><b>Synthetic testing:</b> Proactively test and monitor key endpoints with powerful, accurate simulations.</p></li></ul><p>Let's take a quick look at each data set to see how we use them in Observatory.</p>
    <div>
      <h2>Real-user data</h2>
      <a href="#real-user-data">
        
      </a>
    </div>
    <p>There are two primary forms of data collection: real-user data and synthetic data. Real-user data is made up of performance metrics collected from real visitors' traffic to your application. It’s how users are <i>actually</i> seeing your application perform in the real world. It’s unpredictable, and covers every scenario.</p><p>Synthetic data is data collected using some sort of simulated test (loading a site in a headless browser, making network requests from a testing system to an endpoint, etc.). Tests are run under a predefined set of characteristics — location, network speed, etc. — to provide a consistent baseline.</p><p>Both forms of data have their uses, and companies with a strongly established culture of operational excellence tend to use both.</p><p>The first data you’ll see when you visit Observatory is real-user data collected with <a href="https://www.cloudflare.com/web-analytics/"><u>Real User Monitoring (RUM)</u></a>, with a particular focus on the <a href="https://www.cloudflare.com/learning/performance/what-are-core-web-vitals/"><u>Core Web Vital</u></a> metrics.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/400NHp7OBcSXNmLi5AxXb8/641f6436574e040bfbc14b56c7bfcd70/1.5.png" />
          </figure><p>This is very intentional.</p><p>Real-user data should be the source of truth when it comes to measuring performance and resiliency of your application. Even the best of synthetic data sources are always going to be an approximation. They cannot cover every possible scenario, and because they are being run from a lab environment, they will not always reveal issues that may be more sporadic and unpredictable.</p><p>Real-user metrics are also the best representation of what your users are experiencing when they access your site and, at the end of the day, that’s why we focus on improving performance, resiliency, and security for our users.</p><p>We believe so strongly in the importance of every company having access to accurate, detailed RUM data that we are providing it for free, to all accounts. In fact, we’re about to make our <a href="https://www.cloudflare.com/web-analytics/#:~:text=Privacy%20First"><u>privacy-first analytics</u></a> — which doesn’t track individual users for analytics — <a href="https://blog.cloudflare.com/the-rum-diaries-enabling-web-analytics-by-default/"><u>available by default for all free zones</u></a> (<b>excluding data from EU or UK visitors</b>), no setup necessary. We believe the right thing is arming everyone with detailed, actionable, real-user data, and we want to make it easy.</p>
    <div>
      <h2>Backend telemetry</h2>
      <a href="#backend-telemetry">
        
      </a>
    </div>
    <p>Front-end performance metrics are our best proxy for understanding the actual user experience of an application, and as a result they work great as key performance indicators (KPIs).</p><p>But they’re not enough. Every primary metric should have some level of supporting diagnostic metrics that help us understand <i>why</i> our user metrics are performing the way they are — so that we can quickly identify issues, bottlenecks, and areas of improvement.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2Un8yQdUf9DZw05gfS5WVs/187901b7e636cec35655ff954b1f38c4/2.png" />
          </figure><p>While the industry has largely, and rightfully, moved on from Time to First Byte (TTFB) as a primary metric of focus, it still has value as a diagnostic metric. In fact, we analyzed our RUM data and found a very strong connection between <a href="https://developers.cloudflare.com/speed/observatory/test-results/#synthetic-tests-and-real-user-monitoring-metrics"><u>Time to First Byte and Largest Contentful Paint</u></a>.</p><p>Google’s recommended thresholds for Time to First Byte are:</p><ul><li><p>Good: &lt;= 800ms</p></li><li><p>Needs Improvement: &gt; 800ms and &lt;= 1800ms</p></li><li><p>Poor: &gt; 1800ms</p></li></ul><p>Similarly, their official thresholds for Largest Contentful Paint are:</p><ul><li><p>Good: &lt;= 2500ms</p></li><li><p>Needs Improvement: &gt; 2500ms and &lt;= 4000ms</p></li><li><p>Poor: &gt; 4000ms</p></li></ul><p>Looking across over 9 billion events, we found that when compared to the average site, sites with a “poor” (&gt;1800ms) TTFB are:</p><ul><li><p>70.1 percentage points less likely to have a “good” LCP</p></li><li><p>21.9 percentage points more likely to have a “needs improvement” LCP</p></li><li><p>48.2 percentage points more likely to have a “poor” LCP</p></li></ul><p>TTFB is an ill-defined black box, so we’re making a point to break it down into its various subparts so you can quickly pinpoint whether the issue is connection establishment, server response time, the network itself, or something else. We’ll be working to break this down even further in the coming months as we expose the complete lifecycle of a request so you’re able to pinpoint <i>exactly</i> where the bottlenecks lie.</p>
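<p>The thresholds above translate directly into a tiny classifier. The sketch below is illustrative — the function and constant names are ours, not a Cloudflare or web-vitals API — using the TTFB and LCP cutoffs quoted above:</p>

```javascript
// Rate a metric value (in ms) against the published thresholds quoted above.
// THRESHOLDS and rate() are illustrative names, not part of any real API.
const THRESHOLDS = {
  ttfb: { good: 800, needsImprovement: 1800 },
  lcp: { good: 2500, needsImprovement: 4000 },
};

function rate(metric, valueMs) {
  const t = THRESHOLDS[metric];
  if (!t) throw new Error(`unknown metric: ${metric}`);
  if (valueMs <= t.good) return "good";
  if (valueMs <= t.needsImprovement) return "needs improvement";
  return "poor";
}

console.log(rate("ttfb", 650));  // "good"
console.log(rate("lcp", 3000));  // "needs improvement"
console.log(rate("ttfb", 1900)); // "poor"
```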
    <div>
      <h2>Errors &amp; cache ratios</h2>
      <a href="#errors-cache-ratios">
        
      </a>
    </div>
    <p>Degradation in stability and performance are frequently directly connected to configuration changes or an increase in errors. Clear visibility into these characteristics can often cut right to the heart of the issue at hand, as well as point to opportunities for improvement of the overall efficiency and effectiveness of your application.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6j89m6ONeXh9v6XL35YJjn/1d65ac83476971fc42fccc2980bc79ff/3.png" />
          </figure><p>Observatory prominently surfaces cache hit ratio and error rates for <i>both</i> the edge and origin. This complements the backend telemetry nicely, and helps to further break down the backend metrics you are seeing to help pinpoint areas of improvement.</p><p>Take cache hit ratio, for example. Intuitively, we know that when content is served from cache on an edge server, it should be faster than when the request has to go all the way back to the origin server. Based on our data, again, that’s exactly what we see.</p><p>If we consider our Time to First Byte thresholds again (good is &lt;= 800ms; needs improvement is &gt; 800ms and &lt;= 1800ms; poor is anything over 1800ms), when looking across 9 billion data points as collected by our RUM solution, we see that a whopping <b>91.7% of all pages served from Cloudflare’s cache have a “good” TTFB compared to 79.7% when the request has to be served from the origin server</b>.</p><p>In other words, optimizing origin performance (more on that in a bit) and moving more content to the edge are sure-fire ways to give you a much stronger performance baseline.</p>
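<p>To make the relationship between these two numbers concrete, here is a minimal sketch of how an edge cache hit ratio and an origin error rate fall out of per-request records. The record shape (<code>cacheStatus</code>, <code>status</code>) is hypothetical, not Observatory's actual data model:</p>

```javascript
// Sketch: derive cache hit ratio and origin error rate from per-request
// records. Field names are made up for illustration.
const requests = [
  { cacheStatus: "HIT", status: 200 },
  { cacheStatus: "HIT", status: 200 },
  { cacheStatus: "MISS", status: 200 },
  { cacheStatus: "MISS", status: 503 },
];

function cacheHitRatio(reqs) {
  const hits = reqs.filter((r) => r.cacheStatus === "HIT").length;
  return hits / reqs.length;
}

function originErrorRate(reqs) {
  // Only cache misses reach the origin; count 5xx responses among them.
  const origin = reqs.filter((r) => r.cacheStatus === "MISS");
  const errors = origin.filter((r) => r.status >= 500).length;
  return origin.length ? errors / origin.length : 0;
}

console.log(cacheHitRatio(requests));   // 0.5
console.log(originErrorRate(requests)); // 0.5
```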
    <div>
      <h2>Accurate and detailed synthetic testing</h2>
      <a href="#accurate-and-detailed-synthetic-testing">
        
      </a>
    </div>
    <p>While real-user data is our source of truth, synthetic testing and monitoring is important as well. Because tests are run in a more controlled environment (test from this location, at this time, with this criteria, etc.), the resulting data is a lot less noisy and variable. In addition, because there is not a user involved and we don’t have to worry about any observer effect, synthetic tests are able to grab a lot more information about the request and page lifecycle.</p><p>As a result, synthetic data tends to work very well for arming engineers with debugging information, as well as providing a cleaner set of data for comparing and contrasting results across different platforms, releases, and other situations.</p><p>Observatory provides two different types of synthetic tests.</p><p>The first synthetic test is a browser test. A browser test will load the requested page in a headless browser, run <a href="https://developer.chrome.com/docs/lighthouse"><u>Google’s Lighthouse</u></a> on it to report on key performance metrics, and provide some light suggestions for improvement. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3cvDSWqtBTMibgYysDgEoI/43cd0c684d3705fe021f588674a91cf6/4.png" />
          </figure><p>The second type of synthetic test Observatory provides is a network test. This is a brand new test type in Cloudflare, and is focused on giving you a better breakdown of the network and back-end performance of an endpoint.</p><p>Each network test will hit the provided endpoint for the test and record the wait time, server response time, connect time, SSL negotiation time, and total load time for the endpoint response. Because these tests are much more targeted, a single test in itself is not as valuable and can be prone to variation. That variation isn’t necessarily a bad thing—in fact, variability in these results can actually give you a better understanding of the breadth of results when real users hit that same endpoint.</p><p>For that reason, network tests trigger a series of individual runs against the provided endpoint spread out over a short period of time. The data for each response is recorded, and then presented as a histogram on the test results page, letting you see not just a single datapoint, but the long and short-tail of each metric. This gives you a much more accurate representation of reality than what a single test run can provide.</p>
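<p>A sketch of that summarization step, assuming the repeated runs produce a simple list of millisecond samples (the sample values and bucket width here are made up, not real test output):</p>

```javascript
// Summarize repeated network-test runs: bucket samples into a histogram and
// report tail percentiles. Values and the 50 ms bucket width are illustrative.
const samplesMs = [42, 45, 44, 47, 43, 120, 46, 44, 48, 250];

function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const idx = Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[idx];
}

function histogram(values, bucketMs) {
  const buckets = {};
  for (const v of values) {
    const lo = Math.floor(v / bucketMs) * bucketMs;
    const key = `${lo}-${lo + bucketMs}ms`;
    buckets[key] = (buckets[key] || 0) + 1;
  }
  return buckets;
}

console.log(percentile(samplesMs, 50)); // 45 — the typical run
console.log(percentile(samplesMs, 95)); // 250 — the long tail
console.log(histogram(samplesMs, 50));
```

Looking at the spread between the median and the 95th percentile is exactly what the histogram view makes visible at a glance.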
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3gCWSp0HCTd4iJ0rTKpEpk/a610e47596eedd6b8cedf73dfcde09ca/5.png" />
          </figure><p>You are also able to compare network tests in Observatory, by selecting two network tests that have been completed. Again, all the data points for each test will be provided in a histogram, where you can easily compare the results of the two.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6mG2bRanAGzltvkucJImue/11f56a4d3c3af4cd2a65dab834a0f0af/6.png" />
          </figure><p>We are working on improving both synthetic test types in Q4 2025, focusing on making them more powerful and diagnostic.</p><p>As we mentioned before, even at its best, synthetic data is an approximation of what is actually happening. Accuracy is critical. Inaccurate data can distract teams with variability and faulty measurements.</p><p>It’s important that these tools are as accurate and true to the real world as possible. It’s also important to us that we give back to the community, both because it’s the right thing to do, and because we believe the best way to have the highest level of confidence in the measurement tools and frameworks we’re using is the rigor and scrutiny that open-source provides.</p><p>For those reasons, we’ll be working on open-sourcing many of the testing agents we’re using to power Observatory. We’ll share more on that soon, as well as more details about how we’ve built each different testing tool, and why.</p>
    <div>
      <h2>Doing something about it: Smart Suggestions</h2>
      <a href="#doing-something-about-it-smart-suggestions">
        
      </a>
    </div>
    <p>People don’t measure for the sake of having data and pretty charts. They measure because they want to be able to stay on top of the health of their application and find ways to improve it. Data is easy. Understanding what to do about the data you’re presented is both the hardest, and most important, part.</p><p>Monitoring without action is useless.</p><p>We’re building Observatory to have a <i>relentless</i> focus on actionability. Before any new metric is presented, we take some time to explore why that metric matters, when it’s something worth addressing, and what actions you should take if those metrics need improvement.</p><p>All of that leads us to our new Smart Suggestions. Wherever possible, we want to pair each metric with a set of opinionated, data-driven suggestions for how to make things better. We want to avoid vague hand-wavy advice and instead be prescriptive and specific and precise.</p><p>For example, let’s look at one particular recommendation we provide around improving Largest Contentful Paint.</p><p>Largest Contentful Paint is a core web vital metric that measures when the largest piece of content is displayed on the screen. That piece of content could be an image, video or text.</p><p>Much like TTFB, Largest Contentful Paint is a bit of a black box by itself. While it tells us how long it takes for that content to get on screen, there are a large number of potential bottlenecks that could be causing the delay. Perhaps the server response time was very slow. Or maybe there was something blocking the content from being displayed on the page. If the object was an image or video, perhaps the filesize was large and the resulting download was slow. LCP by itself doesn’t give us that level of granularity, so it’s hard to give more than hand wavy guidance on how to address it.</p><p>Thankfully, just like we can break TTFB into subparts, we can break LCP into its subparts as well. 
Specifically, we can look at:</p><ul><li><p>Time to First Byte: How quickly the server responds to the request for HTML</p></li><li><p>Resource Load Delay: How long it takes after TTFB for the browser to discover the LCP resource</p></li><li><p>Resource Load Duration: How long it takes for the browser to download the LCP resource</p></li><li><p>Render Delay: How long it takes the browser to render the content once it has the resource in hand</p></li></ul><p>By breaking LCP down into these subparts, we can be much more diagnostic about what to do.</p>
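As a concrete (if simplified) illustration, this attribution amounts to a bit of arithmetic over four timestamps. The function below is a hypothetical sketch, not Observatory's actual code; the field names and millisecond values are invented:

```python
# Split a (hypothetical) LCP measurement into the four subparts described
# above. All inputs are milliseconds relative to navigation start.

def lcp_subparts(ttfb, resource_start, resource_end, lcp_render):
    parts = {
        "ttfb": ttfb,                                             # server answers the HTML request
        "resource_load_delay": resource_start - ttfb,             # time until the LCP resource is discovered
        "resource_load_duration": resource_end - resource_start,  # time spent downloading it
        "render_delay": lcp_render - resource_end,                # time to paint once it's in hand
    }
    # Each subpart's share of total LCP points at the likely bottleneck.
    shares = {name: round(100 * ms / lcp_render, 1) for name, ms in parts.items()}
    return parts, shares

parts, shares = lcp_subparts(ttfb=200, resource_start=450, resource_end=1200, lcp_render=1350)
print(shares)
```

With these example numbers, Resource Load Duration accounts for over half of total LCP, which would point at the resource download itself (size, compression, priority) rather than the server or the render path.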
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7qfKPLaTGTjJjhawTVoWAi/10ce739e376cabd7c468adfa280246dd/7.png" />
          </figure><p>In the example above, our recommendation engine analyzes the site's real-user data and notices that Resource Load Duration accounts for over 10% of total LCP time. That makes it likely that the resource triggering LCP is large and could be compressed to reduce its file size, so we recommend enabling compression using <a href="https://developers.cloudflare.com/images/polish/"><u>Polish</u></a>.</p><p>We’re very excited about the impact these suggestions will have on helping everyone quickly zero in on meaningful solutions for improving performance and resiliency, without having to wade through mountains of data to get there. As we analyze data, we’ll find more and more patterns of problems and the solutions they map to. Expanding our Smart Suggestions will be an ongoing focus as we move forward, and we are working on adding much more content about those patterns and what we find in Q4.</p>
    <div>
      <h2>Fixing the biggest pain point: Smart Shield</h2>
      <a href="#fixing-the-biggest-pain-point-smart-shield">
        
      </a>
    </div>
    <p>Observatory gives you unprecedented insight into your application's health, but insights are only half the battle. The next challenge is acting on them, which brings us to another layer of complexity: protecting your origin. For many of our customers, proper management of origin routes and connections is one of the largest drivers of overall performance. As we mentioned before, we see a clear negative impact on user-facing performance metrics when we have to go back to the origin, and we want to make it as easy as possible for our customers to improve those experiences. Achieving this requires protecting against unnecessary load while ensuring only trusted traffic reaches your servers.</p><p>Today's customers have powerful tools to protect their origins, but achieving basic use cases remains frustratingly complex:</p><ul><li><p>Making applications faster</p></li><li><p>Reducing origin load</p></li><li><p>Understanding origin health issues</p></li><li><p>Restricting IP address access to origin servers</p></li></ul><p>These fundamental needs currently require navigating multiple APIs and dashboard settings. You shouldn't need to become an expert in each feature — we should analyze your traffic patterns and provide clear, actionable solutions.</p>
    <div>
      <h2>Smart Shield: the future of origin shielding</h2>
      <a href="#smart-shield-the-future-of-origin-shielding">
        
      </a>
    </div>
    <p>Smart Shield transforms origin protection from a complex, multi-tool challenge into a streamlined, intelligent solution that works on your behalf. Our unified API and UI combine all origin protection essentials — dynamic traffic acceleration, intelligent caching, health monitoring, and dedicated egress IPs — into one place that enables single-click configuration.</p><p>But we didn't stop at simplification. Smart Shield integrates with <b>Observatory</b> to provide both the <b>“what” </b>— identifying performance bottlenecks and health issues — and the <b>“how” </b>— delivering capabilities that increase performance, availability, and security.</p><p>This creates a continuous feedback loop: Observatory identifies problems, Smart Shield provides solutions, and real-time analytics verify the impact. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2OI8AZzHo5kW4mesYsqM7Z/e08a5961deda6246a8d4fb906f2f5483/8.png" />
          </figure>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6blpvetS2fS0CNAvu1lnp2/c16e1a330c2c260df4920f85b1650917/9.png" />
          </figure><p>But what does this mean for you? </p><ul><li><p>Reduce total cost of ownership (TCO)</p></li><li><p>Reduce the time-to-value (TTV) for performance, availability, and security issues pertaining to customer origins</p></li><li><p>Enable new features without guesswork and validate their effectiveness in the data</p></li></ul><p>Your time stays focused on building incredible user experiences, not becoming a configuration expert. We are excited to give time back to you and your engineers, while paving the way for origin infrastructure that is easily optimized to delight your customers. </p>
    <div>
      <h2>Protecting and accelerating origins with smart Connection Reuse</h2>
      <a href="#protecting-and-accelerating-origins-with-smart-connection-reuse">
        
      </a>
    </div>
    <p>Keeping your origins fast and stable is a big part of what we do at Cloudflare. When you experience a traffic surge, the last thing you want is for a flood of <a href="https://www.cloudflare.com/learning/ssl/what-happens-in-a-tls-handshake/"><u>TLS handshakes</u></a> to knock your origin down, or for those new connections to stall your requests, leaving your users to wait for slow pages to load.</p><p>This is why we’ve made significant changes to how Cloudflare’s network talks to your origins to dramatically improve the performance of our origin connections. </p><p>When Cloudflare makes requests to your origins, we make them from a subset of the available machines in every Cloudflare data center so that we can improve your connection reuse. Until now, this pool would be sized the same by default for every application within a data center, and changes to the sizing of the pool for a particular customer would need to be made manually. This often led to suboptimal connection reuse for our customers, as we might be making requests from far more machines than were actually needed, resulting in fewer warm connection pools than we otherwise could have had. This also caused issues at our data centers from time to time, as larger applications might have more traffic than the default pool size was capable of serving, resulting in production incidents where engineers were paged and had to manually increase the fanout factor for specific customers.</p><p>Now, these pool sizes are determined automatically and dynamically. By tracking domain-level traffic volume within a data center, we can automatically scale up and scale down the number of machines that serve traffic destined for customer origin servers for any particular customer, improving both the performance of customer websites and the reliability of our network. 
A massive, high-volume website with a considerable amount of API traffic will no longer be processed by the same number of machines as a smaller, more typical website. Our systems can respond to changes in customer traffic patterns within seconds, allowing us to quickly ramp up and respond to surges in origin traffic.</p><p>Thanks to these improvements, Cloudflare now uses over 30% fewer connections across the board to talk to origins. To put this into a more understandable perspective, this translates to saving approximately 402 years of handshake time every day across our global traffic, or 12,060 years of handshake time saved per month! This means that just by proxying your traffic through Cloudflare, you’ll see an average 30% reduction in the number of connections to your origin, keeping it more available while serving the same traffic volume and in turn lowering your egress fees. But in many cases, the results observed can be far greater than 30%. For example, in one data center that is particularly heavy in API traffic, we saw a reduction in origin connections of ~60%! </p><p>Many don’t realize that making more connections to an origin requires more compute and time for systems to complete TCP and TLS handshakes. This takes time away from serving content requested by your end users and acts as a hidden tax on your performance and your application overall.<b> We are proud to reduce the Internet's hidden tax </b>by finding intelligent, innovative ways to reduce the number of connections needed while supporting the same traffic volume.</p><p>Watch out for more updates to Smart Shield at the start of 2026 — we’re working on adding self-serve support for dedicated CDN egress IP addresses, along with significant performance, reliability, and resilience improvements!</p>
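The scaling idea can be pictured with a toy calculation: size the fanout to a domain's recent traffic, within fixed bounds. This is an invented sketch for intuition; the capacity figure, bounds, and function are hypothetical, not Cloudflare's production algorithm:

```python
# Toy demand-driven fanout sizing: choose how many machines in a data
# center should carry origin connections for a domain, based on its
# recent request rate. All numbers are illustrative only.

def fanout_size(requests_per_second, per_machine_capacity=500,
                min_machines=2, max_machines=256):
    needed = -(-requests_per_second // per_machine_capacity)  # ceiling division
    return max(min_machines, min(max_machines, needed))

print(fanout_size(100))     # small site: few machines, so connection pools stay warm
print(fanout_size(50_000))  # high-volume API: fans out before machines saturate
```

The key property is that a low-traffic domain concentrates its requests on a handful of machines (maximizing connection reuse), while a surge automatically spreads load across more machines instead of paging an engineer.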
    <div>
      <h2>Charting the course: next steps for Observatory &amp; Smart Shield</h2>
      <a href="#charting-the-course-next-steps-for-observatory-smart-shield">
        
      </a>
    </div>
    <p>We’re really excited to share these two products with everyone today. Smart Shield and Observatory combine to provide a powerful one-two punch of insight and easy remediation.</p><p>As we navigate the beta launch of Observatory, we know this is just the start.</p><p>Our vision for Observatory is to be the single source of truth for your application’s health. We know that making the right decisions requires robust, accurate data, and we want to arm our customers with the most comprehensive picture available.</p><p>In the coming months, we plan to continue driving forward with our goal of providing comprehensive data, backed by a clear path to action.</p><ul><li><p><b>Deeper, more diagnostic data. </b>We’ll continue to break down data silos, bringing in more metrics to make sure you have a truly comprehensive view of your application’s health. We’ll be focused on going deeper and being more diagnostic, breaking down every aspect of both the request and page lifecycle to give you more granular data.</p></li><li><p><b>More paths to solutions. </b>People don’t measure for the sake of looking at data; they measure to solve problems. We’re going to continue to expand our suggestions, arming you with more precise, data-driven solutions to a wider range of issues, letting you fix problems with a single click through Smart Shield and bringing a tighter feedback loop to validate the impact of your configuration updates.</p></li><li><p><b>Benchmarking against other products.</b> Some of our customers split traffic between different CDNs due to regulatory or compliance requirements. Naturally, this brings up a whole series of questions about comparing the performance of the split traffic. 
In Observatory, you can compare these today, but we have a lot of things planned to make this even easier.</p></li></ul><p>Try out <a href="https://dash.cloudflare.com/?to=/:account/:zone/speed/overview"><u>Observatory</u></a> and <a href="https://www.cloudflare.com/application-services/products/smart-shield/"><u>Smart Shield</u></a> yourself today. And if you have ideas or suggestions for making Observatory and Smart Shield better, <a href="https://docs.google.com/forms/d/e/1FAIpQLScRMJVR7SmkiloMjPciaTdLzvHzKE9v3L0c418l02a1sMRj_g/viewform?usp=sharing&amp;ouid=115763007691250405767"><u>we’re all ears and would love to talk</u></a>!</p><div>
  
</div><p></p> ]]></content:encoded>
            <category><![CDATA[Speed]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Aegis]]></category>
            <guid isPermaLink="false">tfg3NnmVPl0IoCJgQYuao</guid>
            <dc:creator>Tim Kadlec</dc:creator>
            <dc:creator>Brian Batraski</dc:creator>
            <dc:creator>Noah Maxwell Kennedy</dc:creator>
        </item>
        <item>
            <title><![CDATA[Network performance update: Birthday Week 2025]]></title>
            <link>https://blog.cloudflare.com/network-performance-update-birthday-week-2025/</link>
            <pubDate>Fri, 26 Sep 2025 13:00:00 GMT</pubDate>
            <description><![CDATA[ On the Internet, being fast is what matters and at Cloudflare, we are committed to being the fastest network in the world. ]]></description>
            <content:encoded><![CDATA[ <p>We are committed to being the fastest network in the world because improvements in our performance translate to improvements for the end users of your application. We are excited to share that Cloudflare continues to be the fastest network across the most last-mile networks in the world.</p><p>We relentlessly measure our own performance and our performance against peers. We publish those results routinely, starting with our first update in <a href="https://blog.cloudflare.com/benchmarking-edge-network-performance/"><u>June 2021</u></a> and most recently in <a href="https://blog.cloudflare.com/tr-tr/network-performance-update-birthday-week-2024/"><u>September 2024</u></a>.</p><p>Today’s update breaks down where we have improved since last year’s update and what our priorities are going into the next year. While we are excited to be the fastest in the greatest number of last-mile ISPs, we are never done improving and have more work to do.</p>
    <div>
      <h3>How do we measure this metric, and what are the results?</h3>
      <a href="#how-do-we-measure-this-metric-and-what-are-the-results">
        
      </a>
    </div>
    <p>We measure network performance by attempting to capture what the experience is like for Internet users across the globe. To do that, we need to simulate what their connection is like from their last-mile ISP to our networks.</p><p>We start by taking the 1,000 largest networks in the world based on estimated population. That gives us a representation of real users in nearly every geography.</p><p>We then measure performance itself with TCP connection time: the time it takes for an end user to connect to the website or endpoint they are trying to reach. We chose this metric because we believe it most closely approximates what users perceive to be Internet speed, as opposed to other metrics which are either too scientific (ignoring real-world challenges like congestion or distance) or too broad.</p><p>We take the trimean of TCP connection times to calculate our metric. The trimean is a weighted average of three statistical values: the first quartile, the median, and the third quartile. This approach allows us to reduce some of the noise and outliers and get a comprehensive picture of quality.</p><p>For this year’s update, we examined the trimean of TCP connection times measured from August 6 to September 4. Cloudflare is the #1 provider in 40% of the top 1,000 networks. In our September 2024 update, we shared that we were the #1 provider in 44% of the top 1,000 networks.</p>
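For reference, the trimean is straightforward to compute. A short sketch, using Python's standard library and made-up connect-time samples:

```python
# Trimean = (Q1 + 2*median + Q3) / 4: a weighted average of the first
# quartile, the median, and the third quartile, as described above.
import statistics

def trimean(samples):
    q1, median, q3 = statistics.quantiles(samples, n=4)
    return (q1 + 2 * median + q3) / 4

# Hypothetical TCP connect times in ms; note the single 900 ms outlier
# barely moves the trimean, unlike a plain mean.
times = [42, 45, 47, 48, 50, 53, 55, 60, 900]
print(round(trimean(times), 1), "vs mean", round(statistics.mean(times), 1))
```

Because the quartiles and median are rank-based, one pathological sample cannot drag the trimean far, which is exactly the noise-resistance the metric is chosen for.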
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6aMHnKc3pQMa8oHds3N1uZ/fce6e2eecf2e7e8c257d6a2409befcdc/image2.png" />
          </figure><p>The TCP Connection Time (Trimean) graph shows that we have the fastest TCP connection time in 383 networks, which would make us the fastest in only 38% of the top 1,000. However, we exclude networks that aren’t last-mile ISPs, such as transit networks, since they don’t reflect the end user experience. That brings the number of measured networks to 964 and makes Cloudflare the fastest in 40% of measured ISPs, more than any other provider.</p>
    <div>
      <h3>How do we capture this data? </h3>
      <a href="#how-do-we-capture-this-data">
        
      </a>
    </div>
    <p>A Cloudflare-branded error page does more than just display an error; it kicks off a real-world speed test. Behind the scenes, on a selection of our error pages, we use Real User Measurements (RUM), which involves a browser retrieving a small file from multiple networks, including Cloudflare, Amazon CloudFront, Google, Fastly, and Akamai.</p><p>Running these tests lets us gather performance data directly from the user's perspective, providing a genuine comparison of different network speeds. We do this to understand where our network is fastest and, more importantly, where we can make further improvements. For a deeper dive into the technical details, the <a href="https://blog.cloudflare.com/introducing-radar-internet-quality-page/"><u>Speed Week blog post</u></a> covers the full methodology.</p><p>Using RUM data, we track key metrics like TCP Connection Time, Time to First Byte (TTFB), and Time to Last Byte (TTLB). These are widely recognized, industry-standard metrics that allow us to measure how quickly and efficiently a website loads for actual users. By monitoring these benchmarks, we can objectively compare our performance against other networks.</p><p>We specifically chose the top 1,000 networks by estimated population from APNIC, excluding those that aren’t last-mile ISPs. Consistency is key: by analyzing the same group of networks in every cycle, we ensure our measurements and reporting remain reliable and directly comparable over time.</p>
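The measurement itself reduces to differences between timestamps that browsers expose on resource timing entries. A minimal sketch of that arithmetic, with a plain dict standing in for a PerformanceResourceTiming entry and invented values:

```python
# Derive the three metrics above from (browser-style) resource timing
# fields. A real RUM client would read these off a PerformanceResourceTiming
# entry in the browser; here a dict with hypothetical values stands in.

def rum_metrics(entry):
    return {
        "tcp_connect": entry["connectEnd"] - entry["connectStart"],  # TCP connection time
        "ttfb": entry["responseStart"] - entry["requestStart"],      # time to first byte
        "ttlb": entry["responseEnd"] - entry["requestStart"],        # time to last byte
    }

entry = {"connectStart": 10.0, "connectEnd": 58.0,
         "requestStart": 60.0, "responseStart": 140.0, "responseEnd": 185.0}
print(rum_metrics(entry))
```

Repeating this against a small file hosted on each provider gives directly comparable per-provider samples from the same browser on the same connection.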
    <div>
      <h3>How do the results compare across countries?</h3>
      <a href="#how-do-the-results-compare-across-countries">
        
      </a>
    </div>
    <p>The map below shows the fastest provider per country, and Cloudflare is fastest in dozens of countries. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/b6jJn6IQTCWQDhjHdtb9P/9a324658130c08caf865a81f604b2000/image5.png" />
          </figure><p>The color coding is generated by grouping all the measurements we generate by the country the measurement originates from. Then we look at the trimean measurements for each provider to identify which is fastest. Akamai was measured as well, but providers only appear on the map if they ranked first in at least one country, which Akamai does not anywhere in the world.</p><p>Margins are slim: the fastest provider in a country is often separated from the rest by <a href="https://www.cloudflare.com/learning/performance/glossary/what-is-latency/"><u>latency</u></a> differences of less than 5%. As an example, let’s look at India, a country where we are currently the second-fastest provider.</p><table><tr><td><p><b>India (IN)</b></p></td><td><p></p></td><td><p></p></td><td><p></p></td></tr><tr><td><p><b>Rank</b></p></td><td><p><b>Entity </b></p></td><td><p><b>Connect Time (Trimean)</b></p></td><td><p><b>#1 Diff</b></p></td></tr><tr><td><p>#1</p></td><td><p>CloudFront</p></td><td><p>107 ms</p></td><td><p>-</p></td></tr><tr><td><p>#2</p></td><td><p>Cloudflare</p></td><td><p>113 ms</p></td><td><p>+4.81% (+5.16 ms)</p></td></tr><tr><td><p>#3</p></td><td><p>Google</p></td><td><p>117 ms</p></td><td><p>+8.74% (+9.39 ms)</p></td></tr><tr><td><p>#4 </p></td><td><p>Fastly</p></td><td><p>133 ms</p></td><td><p>+24% (+26 ms)</p></td></tr><tr><td><p>#5</p></td><td><p>Akamai</p></td><td><p>144 ms</p></td><td><p>+34% (+37 ms)</p></td></tr></table><p>In India, Cloudflare is about 5 ms behind CloudFront, the #1 provider. (To put milliseconds into perspective, the average human eye blink lasts between 100 ms and 400 ms.) The competition for the number one spot in many countries is fierce and often shifts day by day. For example, in Mexico on Tuesday, August 5th, Cloudflare was the second-fastest provider by 0.73 ms, but then on Tuesday, August 12th, Cloudflare was the fastest provider by 3.72 ms. 
</p><table><tr><td><p><b>Mexico (MX)</b></p></td><td><p></p></td><td><p></p></td><td><p></p></td><td><p></p></td></tr><tr><td><p><b>Date</b></p></td><td><p><b>Rank</b></p></td><td><p><b>Entity </b></p></td><td><p><b>Connect Time (Trimean)</b></p></td><td><p><b>#1 Diff</b></p></td></tr><tr><td><p>August 5, 2025</p></td><td><p>#1</p></td><td><p>CloudFront</p></td><td><p>116 ms</p></td><td><p>-</p></td></tr><tr><td><p></p></td><td><p>#2</p></td><td><p>Cloudflare</p></td><td><p>116 ms</p></td><td><p>+0.63% (+0.73 ms)</p></td></tr><tr><td><p>August 12, 2025</p></td><td><p>#1</p></td><td><p>Cloudflare</p></td><td><p>106 ms</p></td><td><p>-</p></td></tr><tr><td><p></p></td><td><p>#2</p></td><td><p>CloudFront</p></td><td><p>109 ms</p></td><td><p>+3.52% (+3.72 ms)</p></td></tr></table><p>Because ranking reorderings are common, we also review country and network level rankings to evaluate and benchmark our performance. </p>
    <div>
      <h3>Focusing on where we are not the fastest yet</h3>
      <a href="#focusing-on-where-we-are-not-the-fastest-yet">
        
      </a>
    </div>
    <p>As mentioned above, in September 2024, Cloudflare was fastest in 44% of measured ISPs. These values can shift as providers constantly make improvements to their networks. One way we prioritize improvements is to look not just at where we are not the fastest, but at how far we are from the leader.</p><p>In these locations we tend to pace extremely close to the fastest provider, giving us an opportunity to capture the top spot as we <a href="https://blog.cloudflare.com/20-percent-internet-upgrade/">relentlessly improve</a>. In over 50% of the networks where Cloudflare ranks 2nd, the gap to the top provider is less than 5% (10 ms or less).</p><table><tr><td><p><b>Country</b></p></td><td><p><b>ASN</b></p></td><td><p><b>#1</b></p></td><td><p><b>Cloudflare Rank</b></p></td><td><p><b>#1 Diff (ms)</b></p></td><td><p><b>#1 Diff (%)</b></p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS36352</b></p></td><td><p>Google</p></td><td><p><b>2</b></p></td><td><p>25 ms</p></td><td><p>32%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS46475</b></p></td><td><p>Google</p></td><td><p><b>2</b></p></td><td><p>35 ms</p></td><td><p>29%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS29802</b></p></td><td><p>Google</p></td><td><p><b>2</b></p></td><td><p>8.03 ms</p></td><td><p>21%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS20473</b></p></td><td><p>Google</p></td><td><p><b>2</b></p></td><td><p>15 ms</p></td><td><p>13%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS7018</b></p></td><td><p>CloudFront</p></td><td><p><b>2</b></p></td><td><p>23 ms</p></td><td><p>13%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS4181</b></p></td><td><p>CloudFront</p></td><td><p><b>2</b></p></td><td><p>8.19 ms</p></td><td><p>11%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS62240</b></p></td><td><p>Google</p></td><td><p><b>2</b></p></td><td><p>18 
ms</p></td><td><p>9.77%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS22773</b></p></td><td><p>CloudFront</p></td><td><p><b>2</b></p></td><td><p>12 ms</p></td><td><p>9.48%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS6167</b></p></td><td><p>CloudFront</p></td><td><p><b>2</b></p></td><td><p>13 ms</p></td><td><p>7.55%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS11427</b></p></td><td><p>Google</p></td><td><p><b>2</b></p></td><td><p>9.33 ms</p></td><td><p>5.27%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS6614</b></p></td><td><p>CloudFront</p></td><td><p><b>2</b></p></td><td><p>6.68 ms</p></td><td><p>4.12%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS4922</b></p></td><td><p>Google</p></td><td><p><b>2</b></p></td><td><p>3.38 ms</p></td><td><p>3.86%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS11492</b></p></td><td><p>Fastly</p></td><td><p><b>2</b></p></td><td><p>3.73 ms</p></td><td><p>3.33%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS11351</b></p></td><td><p>Google</p></td><td><p><b>2</b></p></td><td><p>5.14 ms</p></td><td><p>3.04%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS396356</b></p></td><td><p>Google</p></td><td><p><b>2</b></p></td><td><p>4.12 ms</p></td><td><p>2.23%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS212238</b></p></td><td><p>Google</p></td><td><p><b>2</b></p></td><td><p>3.42 ms</p></td><td><p>1.35%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS20055</b></p></td><td><p>Fastly</p></td><td><p><b>2</b></p></td><td><p>1.22 ms</p></td><td><p>1.33%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS40021</b></p></td><td><p>CloudFront</p></td><td><p><b>2</b></p></td><td><p>2.06 ms</p></td><td><p>0.91%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS12271</b></p></td><td><p>Fastly</p></td><td><p><b>2</b></p></td><td><p>1.26 
ms</p></td><td><p>0.89%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS141039</b></p></td><td><p>CloudFront</p></td><td><p><b>2</b></p></td><td><p>1.26 ms</p></td><td><p>0.88%</p></td></tr></table><p>In 50% of the networks where Cloudflare ranks 3rd, the gap to the top provider is less than 10% (10 ms or less). These small margins suggest that where Cloudflare isn’t number one, we’re extremely close to our competitors, and the top spot changes day over day. </p><table><tr><td><p><b>Country</b></p></td><td><p><b>ASN</b></p></td><td><p><b>#1</b></p></td><td><p><b>Cloudflare Rank</b></p></td><td><p><b>#1 Diff (ms)</b></p></td><td><p><b>#1 Diff (%)</b></p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS6461</b></p></td><td><p>Google</p></td><td><p><b>3</b></p></td><td><p>33 ms</p></td><td><p>39%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS81</b></p></td><td><p>Fastly</p></td><td><p><b>3</b></p></td><td><p>43 ms</p></td><td><p>35%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS14615</b></p></td><td><p>Google</p></td><td><p><b>3</b></p></td><td><p>24 ms</p></td><td><p>24%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS13977</b></p></td><td><p>CloudFront</p></td><td><p><b>3</b></p></td><td><p>21 ms</p></td><td><p>19%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS33363</b></p></td><td><p>Google</p></td><td><p><b>3</b></p></td><td><p>29 ms</p></td><td><p>18%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS63949</b></p></td><td><p>Google</p></td><td><p><b>3</b></p></td><td><p>9.56 ms</p></td><td><p>14%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS14593</b></p></td><td><p>Fastly</p></td><td><p><b>3</b></p></td><td><p>17 ms</p></td><td><p>13%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS23089</b></p></td><td><p>CloudFront</p></td><td><p><b>3</b></p></td><td><p>7.4 
ms</p></td><td><p>11%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS16509</b></p></td><td><p>Fastly</p></td><td><p><b>3</b></p></td><td><p>10 ms</p></td><td><p>9.48%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS209</b></p></td><td><p>CloudFront</p></td><td><p><b>3</b></p></td><td><p>9.69 ms</p></td><td><p>6.87%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS27364</b></p></td><td><p>CloudFront</p></td><td><p><b>3</b></p></td><td><p>8.76 ms</p></td><td><p>6.61%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS11404</b></p></td><td><p>CloudFront</p></td><td><p><b>3</b></p></td><td><p>6.11 ms</p></td><td><p>6.16%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS46690</b></p></td><td><p>CloudFront</p></td><td><p><b>3</b></p></td><td><p>5.91 ms</p></td><td><p>5.43%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS136787</b></p></td><td><p>CloudFront</p></td><td><p><b>3</b></p></td><td><p>8.23 ms</p></td><td><p>5.18%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS6079</b></p></td><td><p>Fastly</p></td><td><p><b>3</b></p></td><td><p>5.45 ms</p></td><td><p>4.49%</p></td></tr><tr><td><p><b>US</b></p></td><td><p><b>AS5650</b></p></td><td><p>Google</p></td><td><p><b>3</b></p></td><td><p>3.91 ms</p></td><td><p>3.35%</p></td></tr></table><p>Countries with an abundance of networks, like the United States, have a lot of noise we need to calibrate against. For example, the graph below represents the performance of all providers for a major ISP like AS701 (Verizon Business).</p><p><sub>AS701 (Verizon Business) Connect Time (P95) between 2025-08-09 and 2025-09-09</sub></p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7kiADi8Ld1teDWjMgnE4Qq/d6d3b1ca387ac12de1aac86b415129d1/image6.png" />
          </figure><p>In this chart, the “P95” value, or 95th percentile, refers to one point of the percentile distribution. The P95 shows the value below which 95% of the data points fall, and it is especially good at identifying the slowest or worst-case user experiences, such as those on poor networks or older devices. Additionally, we review lower percentiles in the table below, which tell us how performance varies across the full range of data. When we do so, the picture becomes more nuanced.</p><table><tr><td><p><b>AS701 (Verizon Business) Provider Rankings for Connect Time at P95, P75 and P50</b></p></td><td><p></p></td><td><p></p></td><td><p></p></td><td><p></p></td></tr><tr><td><p><b>Rank</b></p></td><td><p><b>Entity </b></p></td><td><p><b>Connect Time (P95)</b></p></td><td><p><b>Connect Time (P75)</b></p></td><td><p><b>Connect Time (P50)</b></p></td></tr><tr><td><p>#1</p></td><td><p>Fastly</p></td><td><p>128 ms</p></td><td><p>66 ms</p></td><td><p>48 ms</p></td></tr><tr><td><p>#2</p></td><td><p>Google</p></td><td><p>134 ms</p></td><td><p>72 ms</p></td><td><p>54 ms</p></td></tr><tr><td><p>#3</p></td><td><p>CloudFront</p></td><td><p>139 ms</p></td><td><p>67 ms</p></td><td><p>47 ms</p></td></tr><tr><td><p>#4 </p></td><td><p>Cloudflare</p></td><td><p>141 ms</p></td><td><p>68 ms</p></td><td><p>49 ms</p></td></tr><tr><td><p>#5</p></td><td><p>Akamai</p></td><td><p>160 ms</p></td><td><p>84 ms</p></td><td><p>61 ms</p></td></tr></table><p>At the 95th percentile for AS701, Cloudflare ranks 4th, but at the 75th and 50th, Cloudflare is only 2 milliseconds slower than the fastest provider. In other words, when reviewing more than one point along the distribution at the network level, Cloudflare is keeping up with the top providers for the less extreme samples. 
To capture these details, it’s important to look at the range of outcomes, not just one percentile.</p><p>To better reflect the full spectrum of user experiences, we started using the trimean in July 2025 to rank providers. This metric combines values from across the distribution of the data (specifically the 25th, 50th, and 75th percentiles), which gives a more balanced representation of overall performance, rather than only focusing on the extremes. Summarizing user experience with a single number is always challenging, but the trimean helps us compare providers in a way that better reflects how users actually experience the Internet.</p><p>Cloudflare is the fastest provider in 40% of networks under the majority of real-world conditions, not just in worst-case scenarios. Still, the 95th percentile remains key to understanding how performance holds up in challenging conditions and where other providers might fall behind. When we review the 95th percentile across the same date range for all the networks, not just AS701, Cloudflare is fastest in roughly the same number of networks, leading the next-fastest provider by 103 networks. Leading by such a wide margin tells us that Cloudflare is particularly strong in the challenging, long-tail cases that other providers struggle with.</p>
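Why a provider's rank can flip between percentiles is easy to reproduce with synthetic data: pit a provider that is usually fast but has a slow tail against one that is merely consistent. A sketch with invented numbers, not real measurements:

```python
# Rank two synthetic "providers" at different percentiles. Provider A is
# fast for 90% of samples but has a slow tail; provider B is consistent.
import statistics

def percentile(samples, p):
    # statistics.quantiles(n=100) yields the 1st..99th percentile cut points.
    return statistics.quantiles(samples, n=100)[p - 1]

providers = {
    "A": [40] * 90 + [400] * 10,  # usually 40 ms, with a 400 ms long tail
    "B": [55] * 100,              # always 55 ms
}
for p in (50, 95):
    ranking = sorted(providers, key=lambda name: percentile(providers[name], p))
    print(f"P{p} ranking: {ranking}")
```

A leads at P50 while B's consistency puts it ahead at P95, which is why a single percentile can tell a misleading story and why a composite like the trimean, plus a separate look at P95, is more informative.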
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3jOvjHJBG1fefaz25yi8sk/4e649bdeaf743b3bbeb8cd696fe9669d/image4.png" />
          </figure><p>Our performance data shows that even when we are not the top-ranked provider, we remain exceptionally competitive, often trailing the leader by a mere handful of percentage points. Our strength at the 95th percentile also highlights our superior performance in the most challenging scenarios. Cloudflare’s ability to outperform other providers in the worst case is a testament to the resilience and efficiency of our network.</p><p>Moving forward, we'll continue to share multiple metrics and make improvements to our network, and we’ll use this data to do it! Let’s talk about how. </p>
    <div>
      <h3>How does Cloudflare use this data to improve?</h3>
      <a href="#how-does-cloudflare-use-this-data-to-improve">
        
      </a>
    </div>
    <p>Cloudflare applies this data to identify regions and networks that need prioritization. If we are consistently slower than other providers in a network, we want to know why, so we can fix it.</p><p>For example, the graph below shows the 95th percentile of Connect Time for AS8966. Prior to June 13, 2025, our performance was suffering, and we were the slowest provider for the network. By referencing our own measurement data, we prioritized partner data centers in the region, and performance improved almost immediately for users connecting through AS8966.</p><p>Cloudflare’s partner data centers consist of collaborations with local service providers who host Cloudflare's equipment within their own facilities. This allows us to expand our network to new locations and get closer to users more quickly. In the case of AS8966, adding a new partner data center took us from ranked last to ranked first and improved latency by roughly 150ms in one day. By using a data-driven approach, we made our network faster and, most importantly, improved the end user experience.</p><p><sub>TCP Connect Time (P95) for AS8966</sub></p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1kX76JFqDOZq0FF798XLRM/4dc346e0a33dd564f7d42db24f91cae1/image3.png" />
          </figure>
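<p>The prioritization loop described above can be sketched as: compute each provider's P95 per network, then flag the networks where we rank last. The provider names, sample data, and nearest-rank percentile convention below are illustrative assumptions, not our production methodology:</p>

```javascript
// 95th percentile via nearest rank on sorted samples (one common convention).
function p95(samples) {
  const s = [...samples].sort((a, b) => a - b);
  return s[Math.min(s.length - 1, Math.ceil(0.95 * s.length) - 1)];
}

// Rank providers within one network by P95 connect time (ascending = faster).
function rankProviders(measurements) {
  return Object.entries(measurements)
    .map(([provider, samples]) => ({ provider, p95: p95(samples) }))
    .sort((a, b) => a.p95 - b.p95);
}

// Flag networks where a given provider ranks last: candidates to prioritize.
function networksToPrioritize(byNetwork, provider) {
  return Object.entries(byNetwork)
    .filter(([, m]) => {
      const ranked = rankProviders(m);
      return ranked[ranked.length - 1].provider === provider;
    })
    .map(([network]) => network);
}
```

<p>Running this over per-network connect-time samples surfaces the long-tail networks (like AS8966 above) where intervention, such as a new partner data center, will pay off most.</p>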
    <div>
      <h3>What’s next?</h3>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>We are always working to build a faster network and will continue sharing our process as we go. Our approach is straightforward: identify performance bottlenecks, implement fixes, and report the results. We believe in being transparent about our methods and are committed to a continuous cycle of improvement to achieve the best possible performance. Follow our blog for the latest performance updates as we continue to optimize our network and share our progress.</p> ]]></content:encoded>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Network Performance Update]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Network Services]]></category>
            <guid isPermaLink="false">6pNJpwIyHtXuRYh4AhebeR</guid>
            <dc:creator>Lai Yi Ohlsen</dc:creator>
        </item>
        <item>
            <title><![CDATA[The RUM Diaries: enabling Web Analytics by default]]></title>
            <link>https://blog.cloudflare.com/the-rum-diaries-enabling-web-analytics-by-default/</link>
            <pubDate>Wed, 17 Sep 2025 19:21:27 GMT</pubDate>
            <description><![CDATA[ On October 15th 2025, Cloudflare is enabling Web Analytics for all free domains by default—helping you see how your site performs around the world in real time, without ever collecting personal data. ]]></description>
<content:encoded><![CDATA[ <p>Measuring and improving performance on the Internet can be a daunting task because it spans multiple layers: from the user’s device and browser, to DNS lookups and the network routes, to edge configurations and origin server location. Each layer introduces its own variability, such as last-mile bandwidth constraints, third-party scripts, or limited CPU resources, that is often invisible unless you have robust <a href="https://www.cloudflare.com/learning/performance/what-is-observability/">observability tooling</a> in place. Even if you gather data from most of these Internet hops, performance engineers still need to correlate different metrics like front-end events, network processing times, and server-side logs in order to pinpoint where and why elusive “latency” occurs, and how to fix it.</p><p>We want to solve this problem by providing a powerful, in-depth monitoring solution that helps you debug and optimize applications, so you can understand and trace performance issues across the Internet, end to end.</p><p>That’s why we’re excited to announce the <b><i>start</i></b> of a major upgrade to Cloudflare’s performance analytics suite: Web Analytics as part of our real user monitoring (RUM) tools will soon be combined with network-level insights to help you pinpoint performance issues anywhere on a packet’s journey — from a visitor’s browser, through Cloudflare’s network, to your origin.</p><p>Some popular web performance monitoring tools have sacrificed user privacy in order to achieve depth of visibility. We’re going to remove that tradeoff, too. 
By correlating client-side metrics (like <a href="https://web.dev/articles/vitals#core_web_vitals"><u>Core Web Vitals</u></a>) with detailed network and origin data, developers can see where slowdowns occur — and why — all while preserving end user privacy (by dropping client-specific information and aggregating data by visits as explained in greater detail below).</p><p>Over the next several months we’ll share:</p><ul><li><p>How Web Analytics works</p></li><li><p>Real-world debugging examples from across the Internet</p></li><li><p>Tips to get the most value from Cloudflare’s analytics tools</p></li></ul><p>The journey starts on <b>October 15, 2025</b>, when Cloudflare will enable <a href="https://www.cloudflare.com/web-analytics/"><u>Web Analytics</u></a> <b>for all free domains by default</b> — helping you see how your site actually performs for visitors around the world in real time, without ever collecting any personal data (not applicable to traffic originating from the EU or UK, <a href="#what-does-privacy-first-mean">see below</a>). By the middle of 2026, we’ll deliver something nobody has ever had before: a comprehensive, <a href="https://blog.cloudflare.com/privacy-first-web-analytics/"><u>privacy-first platform</u></a> for performance monitoring and debugging. Unlike many other tools, this platform won’t just show you where latency lives; it will help you fix it, all in one place. From untangling the trickiest bottlenecks, to getting a crystal-clear view of global performance, this new tool will change how you see your web application and experiment with new performance features. And we’re not building it behind closed doors; we want to bring you along as we launch it in public. Follow along in this series, <i>The RUM Diaries</i>, as we share the journey.</p>
    <div>
      <h2>Why this matters</h2>
      <a href="#why-this-matters">
        
      </a>
    </div>
    <p>Performance monitoring is only as good as the detail you can see — and the trust your users have that while you’re watching traffic performance, you aren’t watching <i>them</i>. As we explain below, by combining <b>real user metrics</b> with <b>deep, in-network instrumentation</b>, we’ll give developers the visibility to debug any layer of the stack while maintaining Cloudflare’s zero-compromise stance on privacy.</p>
    <div>
      <h2>What problem are we solving? </h2>
      <a href="#what-problem-are-we-solving">
        
      </a>
    </div>
    <p>Many performance monitoring solutions provide only a narrow slice of the performance layer cake, focusing on either the client or the origin while lumping everything in between under a vague “processing time” due to lack of visibility. But as web applications get more complex and user expectations continue to rise, traditional analytics alone don’t cut it. Knowing <i>what</i> happened is just the tip of the iceberg; modern teams need to understand <i>why</i> a bottleneck occurred and <i>how</i> network conditions, code changes, or even a single external script can degrade load times. Moreover, the available tools can often only <i>observe</i> performance rather than help optimize it, leaving teams unsure what to try in order to move the needle on latency.</p><p>We want to pull back the curtain so you can understand the performance implications of the services you use on our platform and how you can make sure you’re getting the best performance possible.</p><p>Consider Shannon in Detroit, Michigan. She operates an e-commerce site selling hard-to-find watches to horology enthusiasts around the globe. Shannon knows that her customers are impatient (she pictures them frequently checking their wrists). If her site loads slowly, she loses sales, her SEO drops, and her customers go to a different store where they have a better online shopping experience.</p><p>As a result, Shannon continually monitors her site performance, but she frequently runs into problems trying to understand how her site is experienced by customers in different parts of the world. After updating her site, she spot-checks its performance using her browser on her office wifi in Detroit, but she continually hears complaints about slow load times from her customers in Germany. So Shannon shops around for a solution that monitors performance around the globe. 
</p><p>This off-the-shelf performance monitoring solution offers her the ability to run similar tests from virtual machines situated around the world across various desktops, mobile devices, and even ISPs, close to her customers. Shannon receives data from these tests, ranging from how fast these synthetic clients’ DNS resolved, to how quickly they connected to a particular server, to when a response was on its way back to a client. Thankfully for Shannon, the off-the-shelf performance monitoring solution identified “server processing time” as the latency culprit in Germany. However, she can’t help but wonder: is it my server that is slow, or the transit connection of my users in Germany? Can I make my site faster by adding another server in Germany, or updating my CDN configuration? It’s a three-option head-scratcher: is it a networking problem, a server problem, or something else?</p><p>Cloudflare can help Shannon (and others!) because we sit in a unique place to provide richer performance analytics. As a reverse proxy positioned between the client and the origin, we are often the first web server a user connects to when requesting content. In addition to moving what’s important closer to your customers, our product suite can generate responses at our edge (e.g. <a href="https://developers.cloudflare.com/learning-paths/workers/get-started/first-worker/"><u>Workers</u></a>), steer traffic through our <a href="https://blog.cloudflare.com/backbone2024/"><u>dedicated backbone</u></a> (e.g. cloudflared and more), and route around Internet traffic jams (e.g. <a href="https://blog.cloudflare.com/argo-v2/"><u>Argo</u></a>). 
By tailoring a solution that brings together: </p><ul><li><p>client performance data, </p></li><li><p>real-time network metrics,</p></li><li><p>customer configuration settings, and</p></li><li><p>origin performance measurements</p></li></ul><p>we can provide more insightful information about what’s happening in the vague “processing time.” This will allow developers like Shannon to understand what to tweak to make their sites more performant, grow their businesses, and make their customers happier.</p>
    <div>
      <h2>What is Web Analytics? </h2>
      <a href="#what-is-web-analytics">
        
      </a>
    </div>
    <p>Turning back to what’s happening on <b>October 15, 2025</b>: We’re enabling Web Analytics so teams can track down performance bottlenecks. Web Analytics works by adding a lightweight JavaScript snippet to your website, which helps monitor performance metrics from visitors to your site. In the Web Analytics dashboard you can see aggregate performance data related to: how a browser has painted the page (via <a href="https://web.dev/articles/lcp"><u>LCP</u></a>, <a href="https://web.dev/articles/inp"><u>INP</u></a>, and <a href="https://web.dev/articles/cls"><u>CLS</u></a>), general load time metrics associated with server processing, as well as aggregate counts of visitors.</p><p>If you’ve ever popped open DevTools in your browser and stared at the waterfall chart of a slow-loading page, you’ve had a taste of what Web Analytics is doing, except instead of measuring <i>your</i> load times from <i>your</i> laptop, it measures load times directly from the browsers of real visitors.</p><p>Here’s the high-level architecture:</p><p><b>A lightweight beacon in the browser
</b>Every page that you track with Cloudflare’s Web Analytics includes a tiny JavaScript snippet, optimized to load asynchronously so it won’t block rendering.</p><ul><li><p>This snippet hooks into modern browser APIs like the <a href="https://developer.mozilla.org/en-US/docs/Web/API/Performance"><u>Performance API</u></a>, Resource Timing, and more.</p></li><li><p>This is how Cloudflare collects Core Web Vitals metrics like <b>Largest Contentful Paint</b> and <b>Interaction to Next Paint</b>, plus data about resource load times and TLS handshake duration from the perspective of the client.</p></li></ul><p><b>Aggregation at the edge
</b>When the browser sends performance data, it goes to the nearest Cloudflare data center. Instead of pushing raw events straight to a database, we pre-process at the edge. This reduces storage needs, minimizes latency, and removes personal information like IP addresses. After this pre-processing, the data is sent to a core data center, where it is processed and made available for customers to query.</p>
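<p>The edge pre-processing step can be sketched as follows: identifying fields are dropped from each beacon event, and only compact per-page aggregates leave the data center. The field names here (<code>page</code>, <code>lcp</code>, <code>ip</code>, <code>userAgent</code>) are illustrative assumptions, not the actual beacon schema:</p>

```javascript
// Fold raw beacon events into per-page aggregates, discarding anything
// that could identify an individual client (IP, User-Agent, etc.).
function aggregateAtEdge(events) {
  const byPage = new Map();
  for (const { page, lcp } of events) { // destructure only non-identifying fields
    const agg = byPage.get(page) ?? { page, views: 0, lcpSum: 0 };
    agg.views += 1;
    agg.lcpSum += lcp;
    byPage.set(page, agg);
  }
  // Ship compact aggregates (e.g. mean LCP per page), never raw events.
  return [...byPage.values()].map(({ page, views, lcpSum }) => ({
    page,
    views,
    meanLcpMs: lcpSum / views,
  }));
}
```

<p>Because the raw events are folded down before leaving the edge, the core data center only ever sees counts and summary statistics, never who generated them.</p>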
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6QLjAwnkmYM5tXv9hbVv79/98684d34b3555532b3c2bc94039aacc2/BLOG-2675_2.png" />
          </figure><p><b>Web Analytics </b>sits under the <b>Analytics &amp; Logs</b> section of the dashboard (at both the account and domain level of the dashboard). Starting on October 15, 2025, free domains will begin to see Web Analytics enabled by default and will be able to view the performance of their visitors in their dashboard. Pro, Biz and ENT accounts can enable Web Analytics by selecting the hostname of the website to add the snippet to and selecting <b>Automatic Setup</b>. Alternatively, you can manually paste the JavaScript beacon before the closing <code>&lt;/body&gt;</code> tag on any HTML page you’d like to track from your origin. Just select “manage site” from the Web Analytics tab in the dashboard. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5ucGMd53CtM2Y5pGVPpaSa/8444898164ee7c45afa7755960000d38/BLOG-2675_3.png" />
          </figure><p>Once enabled, the JS snippet works with visitors’ browsers to measure how users experienced page load times and reports on critical client-side metrics. Below these metrics are resource attribution tables that show which assets take the most time to load for each metric, so that users can better optimize their site performance.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/RrhjEuT91lp4OfEKi9dxm/490f270eebebd5cbd648c315d222d3d6/BLOG-2675_4.png" />
          </figure>
    <div>
      <h2>What does privacy-first mean?</h2>
      <a href="#what-does-privacy-first-mean">
        
      </a>
    </div>
    <p>From the beginning, our Web Analytics tools have centered on providing insights without compromising privacy. Being privacy-first means we don’t track individual users for analytics. We don’t use any client-side state (like cookies or localStorage) for analytics purposes, and we don’t track users over time by IP address, User Agent, or any other fingerprinting technique.</p><p>Moreover, when enabling Web Analytics, you can choose to drop requests from European and UK visitors if you so desire (listed <a href="https://developers.cloudflare.com/speed/speed-test/rum-beacon/#rum-excluding-eeaeu"><u>here</u></a> specifically), meaning we will not collect any RUM metrics from traffic that passes through our European and UK data centers. <b>The version of Web Analytics that will be enabled by default excludes data from EU visitors (this can be changed in the dashboard if you want). </b></p><p>The concept of a <i>visit</i> is key to our privacy approach. Rather than count unique IP addresses (which requires storing state about each visitor), we simply count page views that originate from a distinct referral or navigation event, avoiding the need to store information that might be considered personal data. We believe the same concept we’ve used for years in our privacy-first Web Analytics can be logically extended to network and origin metrics. This will allow customers to gain the insights they need to debug and solve performance issues while ensuring they are not collecting unneeded data on visitors.</p>
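<p>The visit-counting approach can be sketched as: a page view counts as a visit when it arrives with no referrer or with a referrer on a different hostname, so no per-visitor state is ever stored. This is an illustrative simplification, not the production implementation:</p>

```javascript
// A page view counts as a visit when it was reached from outside the site:
// either no referrer at all, or a referrer on a different hostname.
// No cookies, IP addresses, or per-user state are needed.
function isVisit(referrer, siteHostname) {
  if (!referrer) return true; // direct navigation (bookmark, typed URL)
  try {
    return new URL(referrer).hostname !== siteHostname;
  } catch {
    return true; // malformed referrer: treat as external
  }
}

function countVisits(pageViews, siteHostname) {
  return pageViews.filter(v => isVisit(v.referrer, siteHostname)).length;
}
```

<p>Note that two views from the same person via an external search each count as a visit, while clicking between pages on the same site does not; the trade-off buys privacy at the cost of not deduplicating individuals.</p>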
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4UdLc8qugqv29lZUYyB41d/c4def741c23a6cbf2937d3b05a804c03/BLOG-2675_5.png" />
          </figure>
    <div>
      <h2>Opting out</h2>
      <a href="#opting-out">
        
      </a>
    </div>
    <p>We built our Web Analytics service to give you the insights you need to run your website, all while maintaining a privacy-first approach. However, if you do want to opt out, here are the steps to do so.</p>
    <div>
      <h3>Via Dashboard</h3>
      <a href="#via-dashboard">
        
      </a>
    </div>
    <p>If you have a free domain and do not want Web Analytics automatically enabled for your zone, you should do the following before October 15, 2025: </p><ol><li><p>Navigate to the zone in the Cloudflare dashboard</p></li><li><p>In the list on the left of the screen, navigate to Web Analytics
</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/lWwBak29Cmv1UijeKGhH6/14c3980ddcf9845cd4e97571b362a8e4/Screenshot_2025-09-17_at_11.48.13%C3%A2__AM.png" />
          </figure><p></p></li><li><p>On the next page, select either `Enable Globally` or `Exclude EU` to activate the feature
</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4M8Gb1cqDkCmC1u45Xn1iG/bda1ffe64212b3a2e10befd7a01c9eb3/BLOG-2675_7.png" />
          </figure><p></p></li><li><p>Once Web Analytics has been activated, navigate to `Manage RUM Settings` in the Web Analytics dashboard
</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5LXl9FnYS2JRnfl4fsMXle/a5e74ed39dfd888514ed6e489db911f0/Screenshot_2025-09-17_at_11.47.46%C3%A2__AM.png" />
          </figure><p></p></li><li><p>Then, on the next page, select `Disable` to disable Web Analytics for the zone
</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6JCslLOmHqnqw7BXR4JHZf/fa9a391f399e70c525c2b947a8ed16a0/BLOG-2675_9.png" />
          </figure><p></p></li><li><p>OR, to remove Web Analytics from the zone entirely, delete the configuration by clicking <code>Advanced Options</code> and then <code>Delete
</code></p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/GYyPsNL6mXt1SIVWsrm5M/ecd627e14ab398db1e1cc87edbb66030/BLOG-2675_10.png" />
          </figure><p>Once you have disabled the product, we will not re-enable it. You can, however, re-enable it yourself whenever you want.</p></li></ol>
    <div>
      <h3>Via API</h3>
      <a href="#via-api">
        
      </a>
    </div>
    <ol><li><p>Create a Web Analytics configuration with the following API call:
</p>
            <pre><code>curl https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/rum/site_info \
    -H 'Content-Type: application/json' \
    -H "X-Auth-Email: $CLOUDFLARE_EMAIL" \
    -H "X-Auth-Key: $CLOUDFLARE_API_KEY" \
    -d '{
          "auto_install": false,
          "host": "example.com",
          "zone_tag": "023e105f4ecef8ad9ca31a8372d0c353"
        }'
</code></pre>
            <p><sub><i>Note: This will not cause your zone to collect RUM data because auto_install is set to `false`</i></sub></p></li><li><p>Collect the <code>site_tag</code> and <code>zone_tag</code> fields from the response to this call</p><ol><li><p><code>site_tag</code> in this response will correspond to <code>$SITE_ID</code> in the following calls</p></li></ol></li><li><p>EITHER Disable the Web Analytics configuration with the following API call:
</p>
            <pre><code>curl https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/rum/site_info/$SITE_ID \
    -X PUT \
    -H 'Content-Type: application/json' \
    -H "X-Auth-Email: $CLOUDFLARE_EMAIL" \
    -H "X-Auth-Key: $CLOUDFLARE_API_KEY" \
    -d '{
          "auto_install": true,
          "enabled": false,
          "host": "example.com",
          "zone_tag": "023e105f4ecef8ad9ca31a8372d0c353"
        }'

</code></pre>
            <p></p></li><li><p>OR Delete the Web Analytics configuration with the following API call:
</p>
            <pre><code>curl https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/rum/site_info/$SITE_ID \
    -X DELETE \
    -H "X-Auth-Email: $CLOUDFLARE_EMAIL" \
    -H "X-Auth-Key: $CLOUDFLARE_API_KEY"</code></pre>
            <p></p></li></ol>
    <div>
      <h2>Where we’re going next</h2>
      <a href="#where-were-going-next">
        
      </a>
    </div>
    <p>Today, Web Analytics gives you visibility into how <i>people</i> experience your site in the browser. Next, we’re expanding that lens to show <i>what’s happening across the entire request path</i>, from the click in a user’s browser, through Cloudflare’s global network, to your origin servers, and back.</p><p>Here’s what’s coming:</p><ol><li><p><b>Correlating Across Layers
</b>We’ll match RUM data from the client with network timing, Cloudflare edge processing, and origin response latency, allowing you to pinpoint whether a spike in TTFB comes from a slow script, a cache miss, or an origin bottleneck.</p></li><li><p><b>Proactive Alerting
</b>Configurable alerts will tell you when performance regresses in specific geographies, when a data center underperforms, or when origin latency spikes.</p></li><li><p><b>Actionable Insights
</b>We’ll go beyond “processing time” as a single number, breaking it into the real-world steps that make up the journey: proxy routing, security checks, cache lookups, origin fetches, and more.</p></li><li><p><b>Unified View
</b>All of this will live in one place (your Cloudflare dashboard) alongside your analytics, logs, firewall events, and configuration settings, so you can see cause and effect in one workflow.</p></li></ol>
    <div>
      <h2>Conclusion</h2>
      <a href="#conclusion">
        
      </a>
    </div>
    <p>Stay tuned as we work alongside you, in public, to build the most comprehensive, privacy-focused performance analytics platform. Together, we will illuminate every corner of the request journey so you can optimize, innovate, and deliver the best experiences to your users, every time.</p><p>The next chapters of this journey will unlock proactive alerts, cross-layer correlation, and actionable insights you can’t get anywhere else. Follow along as the RUM Diaries are just getting started.</p> ]]></content:encoded>
            <category><![CDATA[Analytics]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Privacy]]></category>
            <category><![CDATA[Application Services]]></category>
            <guid isPermaLink="false">6R0B3dMIIePvBoBb8TzKNG</guid>
            <dc:creator>Alex Krivit</dc:creator>
            <dc:creator>Tim Kadlec</dc:creator>
        </item>
        <item>
            <title><![CDATA[Troubleshooting network connectivity and performance with Cloudflare AI]]></title>
            <link>https://blog.cloudflare.com/AI-troubleshoot-warp-and-network-connectivity-issues/</link>
            <pubDate>Fri, 29 Aug 2025 14:00:00 GMT</pubDate>
            <description><![CDATA[ Troubleshoot network connectivity issues by using Cloudflare’s AI-powered tools to quickly self-diagnose and resolve WARP client and network issues. ]]></description>
            <content:encoded><![CDATA[ <p>Monitoring a corporate network and troubleshooting any performance issues across that network is a hard problem, and it has become increasingly complex over time. Imagine that you’re maintaining a corporate network, and you get the dreaded IT ticket. An executive is having a performance issue with an application, and they want you to look into it. The ticket doesn’t have a lot of details. It simply says: “Our internal documentation is taking forever to load. PLS FIX NOW”.</p><p>In the early days of IT, a corporate network was built on-premises. It provided network connectivity between employees that worked in person and a variety of corporate applications that were hosted locally.</p><p>The shift to cloud environments, the rise of SaaS applications, and a “work from anywhere” model has made IT environments significantly more complex in the past few years. Today, it’s hard to know if a performance issue is the result of:</p><ul><li><p>An employee’s device</p></li><li><p>Their home or corporate wifi</p></li><li><p>The corporate network</p></li><li><p>A cloud network hosting a SaaS app</p></li><li><p>An intermediary ISP</p></li></ul><p>A performance ticket submitted by an employee might even be a combination of multiple performance issues all wrapped together into one nasty problem.</p><p>Cloudflare built <a href="https://developers.cloudflare.com/cloudflare-one/"><u>Cloudflare One</u></a>, our <a href="https://www.cloudflare.com/learning/access-management/what-is-sase/">Secure Access Service Edge (SASE) </a>platform, to protect enterprise applications, users, devices, and networks. 
In particular, this platform relies on two capabilities to simplify troubleshooting performance issues:</p><ul><li><p>Cloudflare’s Zero Trust client, also known as <a href="https://developers.cloudflare.com/cloudflare-one/connections/connect-devices/warp/"><u>WARP</u></a>, forwards and encrypts traffic from devices to Cloudflare’s edge.</p></li><li><p>Digital Experience Monitoring (<a href="https://developers.cloudflare.com/cloudflare-one/insights/dex/"><u>DEX</u></a>) works alongside WARP to monitor device, network, and application performance.</p></li></ul><p>We’re excited to announce two new AI-powered tools that will make it easier to troubleshoot WARP client connectivity and performance issues. We’re releasing a new WARP diagnostic analyzer in the <a href="https://www.cloudflare.com/learning/security/glossary/what-is-zero-trust/">Zero Trust</a> dashboard and an <a href="https://www.cloudflare.com/learning/ai/what-is-model-context-protocol-mcp/"><u>MCP (Model Context Protocol)</u></a> server for DEX. Today, every Cloudflare One customer has free access to both of these new features by default.</p>
    <div>
      <h2>WARP diagnostic analyzer</h2>
      <a href="#warp-diagnostic-analyzer">
        
      </a>
    </div>
    <p>The WARP client provides diagnostic logs that can be used to troubleshoot connectivity issues on a device. For desktop clients, the most common issues can be investigated with the information captured in logs called <a href="https://developers.cloudflare.com/learning-paths/warp-overview-course/series/warp-basics-2/"><u>WARP diagnostic</u></a>. Each WARP diagnostic log contains an extensive amount of information spanning days of captured events occurring on the client. It takes expertise to manually go through all of this information and understand the full picture of what is occurring on a client that is having issues. In the past, we’ve advised customers having issues to send their WARP diagnostic log straight to us so that our trained support experts can do a root cause analysis for them. While this is effective, we want to give our customers the tools to diagnose common issues themselves, for even quicker resolution.</p><p>Enter the WARP diagnostic analyzer, a new AI-powered tool available for free in the Cloudflare One dashboard as of today! It demystifies the information in a WARP diagnostic log so you can better understand events impacting the performance of your clients and network connectivity. Now, when you run a <a href="https://developers.cloudflare.com/cloudflare-one/insights/dex/remote-captures/"><u>remote capture for WARP diagnostics</u></a> in the Cloudflare One dashboard, you can generate an <a href="https://developers.cloudflare.com/cloudflare-one/connections/connect-devices/warp/troubleshooting/warp-logs/#view-warp-diagnostics-summary-beta"><u>AI analysis of the WARP diagnostic file</u></a>. Simply go to your organization’s Zero Trust dashboard and select DEX &gt; Remote Captures from the side navigation bar. After you successfully run diagnostics and produce a WARP diagnostic file, you can open the status details and select View WARP Diag to generate your AI analysis.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/50lz9CFKKJJjL5GpppLu8V/4b404a2ec700713579b3ec9a616ee4c4/image4.png" />
          </figure><p>In the WARP Diag analysis, you will find a Cloudy-generated summary of the events we recommend diving deeper into.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6rV0XPL9aayuljbw9X46bQ/6fd046dfcf6d882948d1a98912cf7cab/image1.png" />
          </figure><p>Below this summary is an events section, where the analyzer highlights events that commonly occur when there are client and connectivity issues.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4OxLtM2CQ4SSs8NTGUdcpn/b7e4f0e3eb519838d50759e6d1decf75/image7.png" />
          </figure><p>Expanding on any of the events detected will reveal a detailed page explaining the event, recommended resources to help troubleshoot, and a list of time-stamped recent occurrences of the event on the device.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4ceezR6L1MybxhMtJGuL5U/31f24b0a057871a1f4330ea87f050873/Screenshot_2025-09-03_at_4.20.27%C3%A2__PM.png" />
          </figure><p>To further help with troubleshooting, we’ve added a Device and WARP details section at the bottom of this page with a quick view of device specifications and WARP configuration, such as operating system, WARP version, and device profile ID.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/41N2iTeHQ9JfrOOsqG8MY5/550fa7573a6d4ed61479679cb4e954d3/image6.png" />
          </figure><p>Finally, we’ve made it easy to take all the information created in your AI summary with you by navigating to the JSON file tab and copying the contents. Your WARP Diag file is also available to download from this screen for any further analysis.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1Sha8rpC7XwSkCvBWt6lv2/2702873ce14fe80904d4f0886e6f3528/image2.png" />
          </figure>
    <div>
      <h2>MCP server for DEX</h2>
      <a href="#mcp-server-for-dex">
        
      </a>
    </div>
    <p>Alongside the new WARP diagnostic analyzer, we’re excited to announce that all Cloudflare One customers have access to an MCP (Model Context Protocol) server for our Digital Experience Monitoring (DEX) product. Let’s dive into how this will save our customers time and money.</p><p>Cloudflare One customers use Digital Experience Monitoring (DEX) to monitor devices across their employee network and troubleshoot any connectivity or performance issues. As with many Cloudflare products, every data point generated by DEX is available to customers via Cloudflare’s API or log ingestion. DEX API and log data is valuable because it enables enterprises to create custom analytics for their devices’ connectivity and performance in a SIEM (<a href="https://www.cloudflare.com/learning/security/what-is-siem/"><u>Security Information and Event Management</u></a>) system. Building new data pipelines and dashboards can take a lot of time and can be expensive, and some Cloudflare One customers aren’t able to dedicate the engineering time to build custom analytics, whether due to budget, resource constraints, time constraints, or other factors.</p><p>Model Context Protocol (MCP) is an open standard that allows AI assistants to connect to other applications. In this instance, using MCP together with DEX gives users a custom analytics experience — without all the work of integrating log data into a SIEM. Network security practitioners can type out a question (like “Show me the performance data for alice@acme.com’s device in a few graphs”), and an MCP server will answer that question with a customized graph built from Alice’s device data in the DEX API.</p>
    <div>
      <h2>Troubleshooting DNS query performance with the DEX MCP server</h2>
      <a href="#troubleshooting-dns-query-performance-with-the-dex-mcp-server">
        
      </a>
    </div>
    <p>Any IT admin, security engineer, or network engineer can ask the MCP server a question like: “I received an IT ticket from bob@acme.com, who is an enterprise employee. He is complaining about the performance of his device. Can you investigate this for me?”</p><p>After you submit this question, the DEX MCP server will start by reviewing the device’s overall health.</p>
            <pre><code>Overall Device Health:
✅ Connection Status: Consistently connected for the past 24 hours with 100% availability
✅ WARP Mode: Running in optimal "warp+doh" (WARP + DNS over HTTPS) mode
✅ Client Version: Up-to-date with version 2025.1.7.2</code></pre>
            <p>So far, the device’s health looks great. Next, the DEX MCP server will begin analyzing the results of the performance tests that are configured on Bob’s device.</p>
            <pre><code>📊 Performance Metrics Analysis

HTTP Response Times:
Google.com test: Average 247ms (Range: 81-649ms)
  • Shows significant improvement throughout the day: Started at ~317ms, dropped to ~211ms by evening
Wiki.internal.acme.com: Average 1,374ms (Range: 877-2,431ms)
  • This is notably high and could be the source of performance complaints
  • DNS resolution particularly slow: Average 616ms with spikes up to 1,600ms</code></pre>
            <p>Now we’re making progress in our troubleshooting effort. The DEX MCP server identified that the HTTP GET test for wiki.internal.acme.com has a high average resource fetch time when compared to other websites (like google.com). In particular, the high resource fetch time is driven by slow DNS resolution. Next, the DEX MCP server will summarize its findings.</p>
            <pre><code>🚨 Performance Issues Identified
Extremely Slow DNS Resolution:
  • Average DNS response time of 616ms is significantly high
  • Normal DNS should be &lt;50ms for good performance
High Resource Fetch Times on wiki.internal.acme.com:
   • 1.3+ second response times indicate potential application-level issues
   • Could be related to DNS performance or server-side delays
HTTP Status Code Issues:
        • wiki.internal.acme.com is showing consistent HTTP 400 errors
        • This suggests potential configuration or routing issues</code></pre>
            <p>Now, we’ve successfully identified that Bob’s slow wiki performance is the result of a high average DNS response time. Slow DNS resolution increases the average resource fetch time, and also generates HTTP 400 errors for enterprise employees who are attempting to access wiki.internal.acme.com. Notably, if the wiki is seeing slow DNS response times, it’s likely that other internal applications are experiencing performance issues as well.</p><p>We can ask the DEX MCP server to suggest potential solutions that would fix this wider DNS performance issue.</p>
            <pre><code>🔧 DNS Optimization Strategies

Check WARP DNS configuration and settings
  • Since the device is using WARP+DoH mode, you can optimize the DNS settings
Implement or improve local DNS caching
  • Implement or improve local DNS caching to reduce repeated lookups
Network stack tuning
  • Optimize TCP / UDP settings for DNS</code></pre>
            
    <div>
      <h2>Try out the DEX MCP server today</h2>
      <a href="#try-out-the-dex-mcp-server-today">
        
      </a>
    </div>
    
    <div>
      <h3>Fast and easy option for testing an MCP server</h3>
      <a href="#fast-and-easy-option-for-testing-an-mcp-server">
        
      </a>
    </div>
    <p>Any Cloudflare One customer with a Free, PayGo, or ENT plan can start using the DEX MCP server in less than one minute. The fastest and easiest way to try out the DEX MCP server is to visit <a href="http://playground.ai.cloudflare.com"><u>playground.ai.cloudflare.com</u></a>. There are five steps to get started:</p><ol><li><p>Copy the URL for the DEX MCP server: https://dex.mcp.cloudflare.com/sse</p></li><li><p>Open <a href="http://playground.ai.cloudflare.com"><u>playground.ai.cloudflare.com</u></a> in a browser</p></li><li><p>Find the section in the left sidebar titled <b>MCP Servers</b></p></li><li><p>Paste the URL for the DEX MCP server into the URL input box and click <b>Connect</b></p></li><li><p>Authenticate your Cloudflare account, and then start asking questions to the DEX MCP server</p></li></ol><p>It’s worth noting that end users will need to ask specific and explicit questions to the DEX MCP server to get a response. For example, you may need to say, “Set my production account as the active account”, and then give the separate command, “Fetch the DEX test results for the user bob@acme.com over the past 24 hours”.</p>
    <div>
      <h3>A better MCP server experience that requires additional steps</h3>
      <a href="#better-experience-for-mcp-servers-that-requires-additional-steps">
        
      </a>
    </div>
    <p>Customers will get a more flexible prompt experience by configuring the DEX MCP server with their preferred AI assistant (Claude, Gemini, ChatGPT, etc.) that has MCP server support. MCP server support may require a subscription for some AI assistants. You can read the <a href="https://developers.cloudflare.com/cloudflare-one/insights/dex/dex-mcp-server"><u>Digital Experience Monitoring - MCP server documentation</u></a> for step by step instructions on how to get set up with each of the major AI assistants that are available today.</p><p>As an example, you can configure the DEX MCP server in Claude by downloading the Claude Desktop client, then selecting Claude Code &gt; Developer &gt; Edit Config. You will be prompted to open “claude_desktop_config.json” in a code editor of your choice. Simply add the following JSON configuration, and you’re ready to use Claude to call the DEX MCP server.</p>
            <pre><code>{
  "globalShortcut": "",
  "mcpServers": {
    "cloudflare-dex-analysis": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://dex.mcp.cloudflare.com/sse"
      ]
    }
  }
}</code></pre>
            
    <div>
      <h2>Get started with Cloudflare One today</h2>
      <a href="#get-started-with-cloudflare-one-today">
        
      </a>
    </div>
    <p>Are you ready to secure your Internet traffic, employee devices, and private resources without compromising speed? You can get started with our new AI-powered Cloudflare One tools today.</p><p>The WARP diagnostic analyzer and the DEX MCP server are generally available to all customers. Head to the Zero Trust dashboard to run a WARP diagnostic and learn more about your client’s connectivity with the WARP diagnostic analyzer. You can test out the new DEX MCP server (https://dex.mcp.cloudflare.com/sse) in less than one minute at <a href="http://playground.ai.cloudflare.com"><u>playground.ai.cloudflare.com</u></a>, and you can also configure an AI assistant like Claude to use the new <a href="https://developers.cloudflare.com/cloudflare-one/insights/dex/dex-mcp-server"><u>DEX MCP server</u></a>.</p><p>If you don’t have a Cloudflare account and want to try these new features, you can create a free account for up to 50 users. If you’re an Enterprise customer and you’d like a demo of these new Cloudflare One AI features, you can reach out to your account team anytime. </p><p>You can stay up to date on the latest feature releases across the Cloudflare One platform by following the <a href="https://developers.cloudflare.com/cloudflare-one/changelog/"><u>Cloudflare One changelogs</u></a> and joining the conversation in the <a href="https://community.cloudflare.com/"><u>Cloudflare community hub</u></a> or on our <a href="https://discord.cloudflare.com/"><u>Discord Server</u></a>.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/CvbpyPLYM62H7B0GhGqcZ/79317635029a9d09d31dacbec6793887/image5.png" />
          </figure><div>
  
</div><p></p> ]]></content:encoded>
            <category><![CDATA[AI Week]]></category>
            <category><![CDATA[Monitoring]]></category>
            <category><![CDATA[Analytics]]></category>
            <category><![CDATA[WARP]]></category>
            <category><![CDATA[Device Security]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Dashboard]]></category>
            <category><![CDATA[Zero Trust]]></category>
            <category><![CDATA[Cloudflare One]]></category>
            <category><![CDATA[AI]]></category>
            <guid isPermaLink="false">7vSTlKJvMibVnsLp1YLWKe</guid>
            <dc:creator>Chris Draper</dc:creator>
            <dc:creator>Koko Uko</dc:creator>
        </item>
        <item>
            <title><![CDATA[Reducing double spend latency from 40 ms to < 1 ms on privacy proxy]]></title>
            <link>https://blog.cloudflare.com/reducing-double-spend-latency-from-40-ms-to-less-than-1-ms-on-privacy-proxy/</link>
            <pubDate>Tue, 05 Aug 2025 13:00:00 GMT</pubDate>
            <description><![CDATA[ We significantly sped up our privacy proxy service by fixing a 40ms delay in "double-spend" checks. ]]></description>
            <content:encoded><![CDATA[ <p>One of Cloudflare’s big focus areas is making the Internet faster for end users. Part of the way we do that is by looking at the "big rocks" or bottlenecks that might be slowing things down — particularly processes on the critical path. When we recently turned our attention to our privacy proxy product, we found a big opportunity for improvement.</p><p>What is our privacy proxy product? These proxies let users browse the web without exposing their personal information to the websites they’re visiting. Cloudflare runs infrastructure for privacy proxies like <a href="https://blog.cloudflare.com/icloud-private-relay/"><u>Apple’s Private Relay</u></a> and <a href="https://blog.cloudflare.com/cloudflare-now-powering-microsoft-edge-secure-network/"><u>Microsoft’s Edge Secure Network</u></a>.</p><p>Like any secure infrastructure, we make sure that users authenticate to these privacy proxies before we open up a connection to the website they’re visiting. In order to do this in a privacy-preserving way (so that Cloudflare collects the least possible information about end-users) we use an open Internet standard – <a href="https://www.rfc-editor.org/rfc/rfc9578.html">Privacy Pass </a>– to issue tokens that authenticate to our proxy service.</p><p>Every time a user visits a website via our Privacy Proxy, we check the validity of the Privacy Pass token which is included in the Proxy-Authorization header in their request. Before we cryptographically validate a user's token, we check if this token has already been spent. If the token is unspent, we let the user request through. Otherwise, it’s a "double-spend". From an access control perspective, double-spends are indicative of a problem. From a privacy perspective, double-spends can reduce the anonymity set and privacy characteristics. 
From a performance perspective, our privacy proxies see millions of requests per second – and any time spent authenticating delays people from accessing sites – so the check needs to be fast. Let’s see how we reduced the latency of these double-spend checks from ~40 ms to &lt;1 ms.</p>
    <div>
      <h2>How did we discover the issue?</h2>
      <a href="#how-did-we-discover-the-issue">
        
      </a>
    </div>
    <p>We use a tracing platform, <a href="https://www.jaegertracing.io/"><u>Jaeger</u></a>. It lets us see which paths our code took and how long functions took to run. When we looked into these traces, we saw latencies of ~40 ms. It was a good lead, but it alone was not enough to conclude it was an issue. The reason was that we only sample a small percentage of our traces, so what we saw was not the whole picture. We needed to look at more data. We could’ve increased how many traces we sampled, but traces are large and heavy for our systems to process. Metrics are a lighter-weight solution, so we added metrics to get data on all double-spend checks.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/67v4incoE8gXu22EBSLnN0/3c5fbd6b44ccc25398c905889b61c05e/image4.png" />
          </figure><p>The lines in this graph are the median latencies we saw for the slowest privacy proxies around the world. The metrics data gave us confidence that it was a problem affecting a large portion of requests… assuming that ~45 ms was longer than expected. But was it? What numbers did we expect?</p>
    <div>
      <h2>The expected latency</h2>
      <a href="#the-expected-latency">
        
      </a>
    </div>
    <p>To understand what times are reasonable to expect, let’s go into detail on what makes up a “double-spend check”. When we do a double-spend check, we ask a backing data store if a Privacy Pass token exists. The data store we use is <a href="https://memcached.org/"><code><u>memcached</u></code></a>. We have many <code>memcached</code> instances running on servers around the world, so which server do we ask? For this, we use <a href="https://github.com/facebook/mcrouter"><code><u>mcrouter</u></code></a>. Instead of figuring out which <code>memcached</code> server to ask, we give our request to <code>mcrouter</code>, and it will handle choosing a good <code>memcached</code> server to use. We looked at the median time it took for <code>mcrouter</code> to process our request. This graph shows the average latencies per server over time. There are spikes, but most of the time the latency is &lt; 1 ms. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7LHxvtd813oeu1DyFh7MOA/0126ceb6212b50e8deeeffabba57e3e5/image1.png" />
          </figure><p>By this point, we were confident that double-spend check latencies were longer than expected everywhere, and we started looking for the root cause.</p>
    <div>
      <h2>How did we investigate the issue?</h2>
      <a href="#how-did-we-investigate-the-issue">
        
      </a>
    </div>
    <p>We took inspiration from the scientific method. We analyzed our code, formed theories about which sections of code caused the latency, and used data to reject those theories. For any remaining theories, we implemented fixes and tested whether they worked.</p><p>Let’s look at the code. At a high level, the double-spend checking logic is:</p><ol><li><p>Get a connection, which can be broken down into:</p><ol><li><p>Send a <code>memcached version</code> command. This serves as a health check for whether the connection is still good to send data on.</p></li><li><p>If the connection is still good, acquire it. Otherwise, establish a new connection.</p></li></ol></li><li><p>Send a <code>memcached get</code> command on the connection.</p></li></ol><p>Let’s go through the theories we had for each step listed above.</p>
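    <p>As a reference point before we dig into the theories, here is a minimal sketch of that flow using the blocking standard library API, with a stand-in server in place of <code>mcrouter</code>. The production code is async and pooled; the framing follows the memcached text protocol:</p>

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

// Simplified, blocking sketch of the two steps above (the production code is
// async and pooled). The framing follows the memcached text protocol.
fn token_already_spent(stream: &mut TcpStream, token: &str) -> std::io::Result<bool> {
    let mut reader = BufReader::new(stream.try_clone()?);
    let mut line = String::new();

    // Step 1a: health-check the connection; a live peer answers "VERSION ...".
    stream.write_all(b"version\r\n")?;
    reader.read_line(&mut line)?;

    // Step 2: ask whether the token key exists; a hit starts with "VALUE",
    // a miss is just "END".
    stream.write_all(format!("get {token}\r\n").as_bytes())?;
    line.clear();
    reader.read_line(&mut line)?;
    Ok(line.starts_with("VALUE"))
}

fn main() -> std::io::Result<()> {
    // Stand-in for mcrouter/memcached that answers the two commands above.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    thread::spawn(move || {
        let (mut conn, _) = listener.accept().unwrap();
        let mut reader = BufReader::new(conn.try_clone().unwrap());
        let mut line = String::new();
        reader.read_line(&mut line).unwrap(); // "version"
        conn.write_all(b"VERSION 1.6.0\r\n").unwrap();
        line.clear();
        reader.read_line(&mut line).unwrap(); // "get <token>"
        conn.write_all(b"END\r\n").unwrap(); // miss: token has not been spent
    });

    let mut stream = TcpStream::connect(addr)?;
    let spent = token_already_spent(&mut stream, "token-123")?;
    println!("double spend: {spent}"); // prints "double spend: false"
    Ok(())
}
```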
    <div>
      <h2>Theory 1: the health check takes too long</h2>
      <a href="#theory-1-health-check-takes-long">
        
      </a>
    </div>
    <p>We measured the health check primarily as a sanity check. The version command is simple and fast to process, so it should not take long. And we remained sane. The median latency was &lt; 1 ms.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6amdAWUKl3IvmGlvgwJhMP/57b6895aacf960b08ffc7d36d4569d25/image5.png" />
          </figure>
    <div>
      <h2>Theory 2: waiting to get a connection</h2>
      <a href="#theory-2-waiting-to-get-a-connection">
        
      </a>
    </div>
    <p>To understand why we may need to wait to get a connection, let’s go into more detail on how we get a connection. In our code, we use a connection pool. The pool is a set of ready-to-go connections to <code>mcrouter</code>. The benefit of having a pool is that we do not have to pay the overhead of establishing a connection every time we want to make a request. Pools have a size limit, though. Our limit was 20 per server, and this is where a potential problem lies. Imagine we have a server that processes 5,000 requests every second, and each request stays in the system for 45 ms. We can use something called <a href="https://en.wikipedia.org/wiki/Little%27s_law"><u>Little’s Law</u></a> to estimate the average number of requests in our system: <code>5000 x 0.045 = 225</code>. Due to our pool size limits, we can only have 20 connections at a time, so we can only process 20 requests at any point in time. That means 205 requests are just waiting! When we do a double-spend check, maybe we’re waiting ~40 ms to get a connection?</p><p>We looked at the metrics of many different servers. No matter what the request rate was, the latency was consistently ~40 ms, disproving the theory. For example, this graph shows data from a server that saw a maximum of 20 requests per second. It shows a histogram over time, and the large majority of requests fall in the 40-50 ms bucket.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1EJ7SlTzqMVLTIOTvqH1HL/7d64c441e606ecbe1823585f4ff19086/image7.png" />
          </figure>
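<p>The Little’s Law estimate above can be reproduced in a couple of lines; this is a standalone sketch, not production code:</p>

```rust
// Little's Law (L = λW): the average number of requests in the system equals
// the arrival rate times the time each request spends in the system.
fn in_flight(requests_per_second: f64, latency_seconds: f64) -> f64 {
    requests_per_second * latency_seconds
}

fn main() {
    // 5,000 req/s at 45 ms each: ~225 requests in flight on average,
    // far above a 20-connection pool.
    assert!((in_flight(5000.0, 0.045) - 225.0).abs() < 1e-9);
    // At the expected <1 ms latency, the same load fits comfortably.
    assert!(in_flight(5000.0, 0.001) < 20.0);
}
```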
    <div>
      <h2>Theory 3: delays in Nagle’s algorithm and delayed acks</h2>
      <a href="#theory-3-delays-in-nagles-algorithm-and-delayed-acks">
        
      </a>
    </div>
    <p>We decided to chat with Gemini, giving it the observations we had so far. It suggested many things, but the most interesting was to check if <code>TCP_NODELAY</code> was set. If we had set this option in our code, it would’ve disabled something called <a href="https://en.wikipedia.org/wiki/Nagle%27s_algorithm"><u>Nagle’s algorithm</u></a>. Nagle’s algorithm itself was not a problem, but when enabled alongside another feature, <a href="https://en.wikipedia.org/wiki/TCP_delayed_acknowledgment"><u>delayed ACKs</u></a>, latencies could creep in. To explain why, let’s go through an analogy.</p><p>Suppose we run a group chat app. Normally, people type a full thought and send it in one message. But we have a friend who sends one word at a time: "Hi". Send. "how". Send. "are". Send. “you”. Send. That’s a lot of notifications. Nagle’s algorithm aims to prevent this. Nagle says that if the friend wants to send one short message, that’s fine, but it only lets them do it once per turn. When they try to send more single words right after, Nagle will save the words in a draft message. Once the draft message hits a certain length, Nagle sends. But what if the draft message never hits that length? To manage this, the delayed ACKs mechanism starts a 40 ms timer whenever the friend sends a message. If the app gets no further input before the timer ends, the draft message is sent to the group.</p><p>We took a closer look at the code, both Cloudflare-authored code and code from dependencies we rely on. We depended on the <a href="https://crates.io/crates/memcache-async"><code>memcache-async</code></a> crate for implementing the code that lets us send <code>memcache</code> commands. Here is the code for sending a <code>memcached version</code> command:</p>
            <pre><code>self.io.write_all(b"version\r\n").await?;
self.io.flush().await?;</code></pre>
            <p>Nothing out of the ordinary. Then, we looked inside the get function.</p>
            <pre><code>let writer = self.io.get_mut();
writer.write_all(b"get ").await?;
writer.write_all(key.as_ref()).await?;
writer.write_all(b"\r\n").await?;
writer.flush().await?;</code></pre>
            <p>In our code, we set <code>io</code> as a <code>TcpStream</code>, meaning that each <code>write_all</code> call resulted in sending a message. With Nagle’s algorithm enabled, the data flow looked like this:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6Vj6xfkbnIg2gmPeLy9g9I/2b003d8a4d81782148697fc83e793c6f/Screenshot_2025-07-24_at_13.16.05.png" />
          </figure><p>Oof. We tried to send all three small messages, but after we sent the “get “, the kernel put the token and <code>\r\n</code> in a buffer and started waiting. When <code>mcrouter</code> got the “get “, it could not do anything because it did not have the full command. So, it waited 40 ms. Then, it sent an ACK in response. We got the ACK, and sent the rest of the command in the buffer. <code>mcrouter</code> got the rest of the command, processed it, and returned a response telling us if the token exists. What would the data flow look like with Nagle’s algorithm disabled?</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4O0s8hb64olT2PDDc5wTFL/3cfe500a1f235276502db9e608cef966/Screenshot_2025-07-24_at_13.17.11.png" />
          </figure><p>We would send all three small messages. <code>mcrouter</code> would have the full command, and return a response immediately. No waiting, whatsoever.</p>
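<p>For completeness, Nagle’s algorithm can be disabled per socket with <code>TCP_NODELAY</code>, which Rust’s standard library exposes as <code>TcpStream::set_nodelay</code>. A minimal sketch (as the rest of the post explains, we ultimately fixed the write pattern instead):</p>

```rust
use std::net::{TcpListener, TcpStream};

fn main() -> std::io::Result<()> {
    // A loopback listener just so connect() has a peer to reach.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let stream = TcpStream::connect(listener.local_addr()?)?;

    // With TCP_NODELAY set, small writes are sent immediately instead of
    // being coalesced while the kernel waits for the peer's ACK.
    stream.set_nodelay(true)?;
    assert!(stream.nodelay()?);
    Ok(())
}
```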
    <div>
      <h2>Why 40 ms?</h2>
      <a href="#why-40-ms">
        
      </a>
    </div>
    <p>Our Linux servers have minimum bounds for the delay. Here is a snippet of Linux source code that defines those bounds.</p>
            <pre><code>#if HZ &gt;= 100
#define TCP_DELACK_MIN	((unsigned)(HZ/25))	/* minimal time to delay before sending an ACK */
#define TCP_ATO_MIN	((unsigned)(HZ/25))
#else
#define TCP_DELACK_MIN	4U
#define TCP_ATO_MIN	4U
#endif</code></pre>
            <p>The comment tells us that <code>TCP_DELACK_MIN</code> is the minimum time delayed ACKs will wait before sending an ACK. We spent some time digging through Cloudflare’s custom kernel settings and found this:</p>
            <pre><code>CONFIG_HZ=1000</code></pre>
            <p><code>CONFIG_HZ</code> eventually propagates to <code>HZ</code>, so <code>TCP_DELACK_MIN</code> works out to <code>1000/25 = 40</code> jiffies; with a 1000 Hz tick, each jiffy is 1 ms, resulting in a 40 ms delay. That's where the number comes from!</p>
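<p>The arithmetic can be double-checked in a few lines (a standalone sketch):</p>

```rust
fn main() {
    // With CONFIG_HZ=1000 the kernel tick (a "jiffy") is 1/HZ seconds, and
    // the snippet above defines TCP_DELACK_MIN as HZ/25 jiffies.
    const HZ: u32 = 1000;
    let delack_min_jiffies = HZ / 25; // 40 jiffies
    let delay_ms = delack_min_jiffies * 1000 / HZ; // one jiffy = 1 ms at HZ=1000
    assert_eq!(delay_ms, 40);
}
```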
    <div>
      <h2>The fix</h2>
      <a href="#the-fix">
        
      </a>
    </div>
    <p>We were sending three separate messages for a single command when we only needed to send one. To verify this, we captured a <code>get</code> command in Wireshark. (We captured this locally on macOS. Interestingly, we got an ACK for every message.)</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4B2qC70Dpeu25dTOP4V2hj/3720d8012f7d452696ca6cbe265d366e/image9.png" />
          </figure><p>The fix was to use <code>BufWriter&lt;TcpStream&gt;</code> so that <code>write_all</code> would buffer the small messages in a user-space memory buffer, and <code>flush</code> would send the entire <code>memcached</code> command in one message. The Wireshark capture looked much cleaner.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5S6R7qAIad9pjKIfQYTWbA/c7bfe663b707ba4653977319a02e5e07/image3.png" />
          </figure>
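<p>The same idea can be sketched with the blocking standard library API: wrapping the stream in a <code>BufWriter</code> coalesces the small writes in user space, and <code>flush</code> emits the full command in one message. This is a simplified sketch; the production code uses the async equivalent:</p>

```rust
use std::io::{BufWriter, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let server = thread::spawn(move || {
        let (mut conn, _) = listener.accept().unwrap();
        let mut received = Vec::new();
        conn.read_to_end(&mut received).unwrap(); // read until the peer closes
        received
    });

    // The three small write_all calls land in a user-space buffer; flush()
    // hands the complete command to the kernel in a single write.
    let stream = TcpStream::connect(addr)?;
    let mut writer = BufWriter::new(stream);
    writer.write_all(b"get ")?;
    writer.write_all(b"token-123")?;
    writer.write_all(b"\r\n")?;
    writer.flush()?;
    drop(writer); // close the connection so the reader sees EOF

    assert_eq!(server.join().unwrap(), b"get token-123\r\n");
    Ok(())
}
```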
    <div>
      <h2>Conclusion</h2>
      <a href="#conclusion">
        
      </a>
    </div>
    <p>After deploying the fix to production, we saw the median double-spend check latency drop to expected values everywhere.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4kKCQFTw5wp0jEwdPcALb4/8425bcfe526c2eeb9570c7a98fc62c62/image8.png" />
          </figure><p>Our investigation followed a systematic, data-driven approach. We began by using observability tools to confirm the problem's scale. From there, we formed testable hypotheses and used data to disprove them one by one. This process ultimately led us to a subtle interaction between Nagle’s algorithm and delayed ACKs, caused by how we made use of a third-party dependency.</p><p>Ultimately, our mission is to help build a better Internet. Every millisecond saved contributes to a faster, more seamless, and more private browsing experience for end users. We're excited to have this rolled out, and to keep chasing further performance improvements!</p> ]]></content:encoded>
            <category><![CDATA[Privacy]]></category>
            <category><![CDATA[Privacy Pass]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[TCP]]></category>
            <guid isPermaLink="false">29xmM9UQ1WEQlV0SiAuM2l</guid>
            <dc:creator>Ben Yang</dc:creator>
        </item>
        <item>
            <title><![CDATA[Building Jetflow: a framework for flexible, performant data pipelines at Cloudflare]]></title>
            <link>https://blog.cloudflare.com/building-jetflow-a-framework-for-flexible-performant-data-pipelines-at-cloudflare/</link>
            <pubDate>Wed, 23 Jul 2025 14:00:00 GMT</pubDate>
            <description><![CDATA[ Faced with a data-ingestion challenge at a massive scale, Cloudflare's Business Intelligence team built a new framework called Jetflow. ]]></description>
            <content:encoded><![CDATA[ <p>The Cloudflare Business Intelligence team manages a <a href="https://simple.wikipedia.org/wiki/Petabyte"><u>petabyte</u></a>-scale <a href="https://www.cloudflare.com/learning/cloud/what-is-a-data-lake/"><u>data lake</u></a> and ingests thousands of tables every day from many different sources. These include internal databases such as Postgres and ClickHouse, as well as external SaaS applications such as Salesforce. These tasks are often complex, and tables may have hundreds of millions or billions of rows of new data each day. They are also business-critical for product decisions, growth planning, and internal monitoring. In total, about <b>141 billion rows</b> are ingested every day.</p><p>As Cloudflare has grown, the data has become ever larger and more complex. Our existing <a href="https://www.ibm.com/think/topics/elt"><u>Extract Load Transform (ELT)</u></a> solution could no longer meet our technical and business requirements. After evaluating other common ELT solutions, we concluded that their performance generally did not surpass our current system, either.</p><p>It became clear that we needed to build our own framework to cope with our unique requirements — and so <b>Jetflow</b> was born. </p>
    <div>
      <h2>What we achieved</h2>
      <a href="#what-we-achieved">
        
      </a>
    </div>
    <p><b>Over 100x efficiency improvement in GB-s</b>:</p><ul><li><p>Our longest-running job, with 19 billion rows, was taking <b>48 hours</b> using <b>300 GB of memory</b>, and now completes in <b>5.5 hours</b> using <b>4 GB of memory</b></p></li><li><p>We estimate that ingestion of 50 TB from Postgres via <b>Jetflow</b> could cost under $100, based on rates published by commercial cloud providers</p></li></ul><p><b>&gt;10x performance improvement:</b></p><ul><li><p>Our largest dataset was ingesting <b>60-80,000</b> rows per second; this is now <b>2-5 million</b> rows per second per database connection.</p></li><li><p>In addition, these numbers scale well with multiple database connections for some databases.</p></li></ul><p><b>Extensibility:</b></p><ul><li><p>The modular design makes it easy to extend and test. Today, <b>Jetflow</b> works with ClickHouse, Postgres, Kafka, many different SaaS APIs, Google BigQuery, and many others. It has continued to work well and remain flexible with the addition of new use cases.</p></li></ul>
    <div>
      <h2>How did we do this?</h2>
      <a href="#how-did-we-do-this">
        
      </a>
    </div>
    
    <div>
      <h3>Requirements</h3>
      <a href="#requirements">
        
      </a>
    </div>
    <p>The first step to designing our new framework had to be a clear understanding of the problems we were aiming to solve, with well-defined requirements to stop us from creating new ones.</p>
    <div>
      <h5>Performant &amp; efficient</h5>
      <a href="#performant-efficient">
        
      </a>
    </div>
    <p>We needed to be able to move more data in less time as some ingestion jobs were taking ~24 hours, and our data will only grow. The data should be ingested in a streaming fashion and use less memory and compute resources than our existing solution.</p>
    <div>
      <h5>Backwards compatible </h5>
      <a href="#backwards-compatible">
        
      </a>
    </div>
    <p>Given the daily ingestion of thousands of tables, the chosen solution needed to allow for the migration of individual tables as needed. Due to our usage of <a href="https://spark.apache.org/"><u>Spark</u></a> downstream and Spark's limitations in merging disparate <a href="https://parquet.apache.org/"><u>Parquet</u></a> schemas, it also had to offer the flexibility to generate the precise schemas needed for each case to match the legacy output.</p><p>We also required seamless integration with our custom metadata system, used for dependency checks and job status information.</p>
    <div>
      <h5>Ease of use</h5>
      <a href="#ease-of-use">
        
      </a>
    </div>
    <p>We want a configuration file that can be version-controlled, without introducing bottlenecks on repositories with many concurrent changes.</p><p>To increase accessibility for different roles within the team, another requirement was no-code (or configuration as code) in the vast majority of cases. Users should not have to worry about availability or translation of data types between source and target systems, or writing new code for each new ingestion. The configuration needed should also be minimal — for example, data schema should be inferred from the source system and not need to be supplied by the user.</p>
    <div>
      <h5>Customizable</h5>
      <a href="#customizable">
        
      </a>
    </div>
    <p>Striking a balance with the no-code requirement above, although we want a low barrier to entry, we also want the option to tune and override settings if desired, via a flexible and optional configuration layer. For example, writing Parquet files is often more expensive than reading from the database, so we want to be able to allocate more resources and concurrency as needed. </p><p>Additionally, we wanted to allow for control over where the work is executed, with the ability to spin up concurrent workers in different threads, different containers, or on different machines. The execution of workers and communication of data were abstracted away behind an interface, and different implementations can be written and injected, controlled via the job configuration. </p>
    <div>
      <h5>Testable</h5>
      <a href="#testable">
        
      </a>
    </div>
    <p>We wanted a solution capable of running locally in a containerized environment, which would allow us to write tests for every stage of the pipeline. With “black box” solutions, testing often means validating the output after making a change, which is a slow feedback loop, risks not testing all edge cases as there isn’t good visibility of all code paths internally, and makes debugging issues painful.</p>
    <div>
      <h3>Designing a flexible framework </h3>
      <a href="#designing-a-flexible-framework">
        
      </a>
    </div>
    <p>To build a truly flexible framework, we broke the pipeline down into distinct stages, and then created a config layer to define the composition of the pipeline from these stages, along with any configuration overrides. Every pipeline configuration that makes sense logically should execute correctly, and users should not be able to create pipeline configs that do not work. </p>
    <div>
      <h5>Pipeline configuration</h5>
      <a href="#pipeline-configuration">
        
      </a>
    </div>
    <p>This led us to a design with stages classified into three meaningfully different categories:</p><ul><li><p>Consumers</p></li><li><p>Transformers</p></li><li><p>Loaders</p></li></ul>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1ALi9AXyo5v1Y7cIjK619V/3c0735e2d5b36d5660072f92fe551ed3/image3.png" />
          </figure><p>The pipeline was constructed via a <a href="https://yaml.org/"><u>YAML</u></a> file that required a consumer, zero or more transformers, and at least one loader. Consumers create a data stream (via reading from the source system), Transformers (e.g. data transformations, validations) take a data stream input and output a data stream conforming to the same API so that they can be chained, and Loaders have the same data streaming interface, but are the stages with persistent effects — i.e. stages where data is saved to an external system. </p><p>This modular design means that each stage is independently testable, with shared behaviour (such as error handling and concurrency) inherited from shared base stages, significantly decreasing development time for new use cases and increasing confidence in code correctness.</p>
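<p>A pipeline configuration in this style might look like the sketch below. The stage names and options are illustrative assumptions, not Jetflow’s actual schema:</p>

```yaml
# Hypothetical Jetflow-style pipeline: one consumer, a chain of transformers,
# and at least one loader. All field names here are illustrative only.
consumer:
  type: postgres
  table: accounts
transformers:
  - type: validate_schema
loaders:
  - type: parquet
    path: s3://data-lake/accounts/
```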
    <div>
      <h5>Data divisions</h5>
      <a href="#data-divisions">
        
      </a>
    </div>
    <p>Next, we designed a breakdown for the data that would allow the pipeline to be idempotent both on whole pipeline re-run and also on internal retry of any data partition due to transient error. We decided on a design that let us parallelize processing, while maintaining meaningful data divisions that allowed the pipeline to perform cleanups of data where required for a retry.</p><ul><li><p><b>RunInstance</b>: the least granular division, corresponding to a business unit for a single run of the pipeline (e.g. one month/day/hour of data). </p></li><li><p><b>Partition</b>: a division of the RunInstance that allows each row to be allocated to a partition in a way that is deterministic and self-evident from the row data without external state, and is therefore idempotent on retry. (e.g. an accountId range, a 10-minute interval)</p></li><li><p><b>Batch</b>: a division of the partition data that is non-deterministic and used only to break the data down into smaller chunks for streaming/parallel processing for faster processing with fewer resources. (e.g. 10k rows, 50 MB)</p></li></ul><p>The options that the user configures in the consumer stage YAML both construct the query that is used to retrieve the data from the source system, and also encode the semantic meaning of this data division in a system agnostic way, so that later stages understand what this data represents — e.g. this partition contains the data for all accounts IDs 0-500. This means that we can do targeted data cleanup and avoid, for example, duplicate data entries if a single data partition is retried due to error.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1NQsitJmwRwSpiLkj2Hoig/81db1750523268bb427d51e1d2746a46/image2.png" />
          </figure>
    <div>
      <h3>Framework implementation</h3>
      <a href="#framework-implementation">
        
      </a>
    </div>
    
    <div>
      <h5>Standard internal state for stage compatibility </h5>
      <a href="#standard-internal-state-for-stage-compatibility">
        
      </a>
    </div>
    <p>Our most common use case is something like: read from a database, convert to Parquet format, and then save to <a href="https://www.cloudflare.com/learning/cloud/what-is-object-storage/">object storage</a>, with each of these steps being a separate stage. As more use cases were onboarded to <b>Jetflow</b>, we had to make sure that any newly written stage would be compatible with the existing stages. We didn’t want a situation where new code must be written for every output format and target system, or where every use case ends up with its own custom pipeline.</p><p>We solved this problem by having our extractor stages output data in a single format. As long as a downstream stage supports this format as both its input and output format, it is compatible with the rest of the pipeline. This seems obvious in retrospect, but internally it was a painful learning experience: we originally created a custom type system and struggled with stage interoperability. </p><p>For this internal format, we chose <a href="https://arrow.apache.org/"><u>Arrow</u></a>, an in-memory columnar data format. Its key benefits for us are:</p><ul><li><p><b>Arrow ecosystem</b>: Many data projects now support Arrow as an output format, so when we write extractor stages for new data sources, it is often trivial to produce Arrow output.</p></li><li><p><b>No serialisation overhead</b>: Arrow data can be moved between machines, and even between programming languages, with minimal overhead. <b>Jetflow</b> was designed from the start to run in a wide range of systems via a job controller interface, so this efficiency in data transmission means there is minimal compromise on performance when creating distributed implementations.</p></li><li><p><b>Memory reserved in large fixed-size batches, avoiding per-row allocations</b>: Go is a garbage collected (GC) language, and GC cycle times depend mostly on the number of heap objects rather than their sizes. Because the number of objects to scan, and possibly collect, during a GC cycle grows with the number of allocations, fewer allocations mean significantly less CPU time spent on garbage collection, even when the total size is the same. For 8192 rows with 10 columns each, Arrow requires only 10 allocations, versus the 8192 allocations of most drivers that allocate row by row.</p></li></ul>
    <div>
      <h5>Converting rows to columns</h5>
      <a href="#converting-rows-to-columns">
        
      </a>
    </div>
    <p>Another important performance optimization was reducing the number of conversion steps when reading and processing data. Most data ingestion frameworks internally represent data as rows, but in our case we are mostly writing data in Parquet format, which is column based. When reading from column-based sources (e.g. ClickHouse, where most drivers receive the RowBinary format), data is first converted into a row-based in-memory representation for the specific language implementation, and then converted again from rows back to columns to write Parquet files. These round trips carry a significant performance cost.</p><p><b>Jetflow</b> instead reads data from column-based sources in columnar formats (e.g. ClickHouse’s native Block format) and copies this data directly into Arrow column format. Parquet files are then written directly from Arrow columns. Eliminating the intermediate row representation improves performance.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5HEuO0Cn6Wob7tuR9hjnSP/852cdc44244f107b4289fb3b3553d213/image1.png" />
          </figure>
    <div>
      <h3>Writing each pipeline stage</h3>
      <a href="#writing-each-pipelines-stage">
        
      </a>
    </div>
    
    <div>
      <h5>Case study: ClickHouse</h5>
      <a href="#case-study-clickhouse">
        
      </a>
    </div>
    <p>When testing an initial version of <b>Jetflow</b>, we discovered that, due to the architecture of ClickHouse, using additional connections was of no benefit: ClickHouse was reading data faster than we were receiving it. It should therefore be possible, with a more optimized database driver, to read a much larger number of rows per second over that single connection, without needing additional connections.</p><p>Initially, a custom database driver was written for ClickHouse, but we ended up switching to the excellent <a href="https://github.com/ClickHouse/ch-go"><u>ch-go low level library</u></a>, which directly reads <a href="https://clickhouse.com/docs/development/architecture#block"><u>Blocks</u></a> from ClickHouse in a columnar format. This had a dramatic effect on performance compared to the standard Go driver. Combined with the framework optimisations above, we now <b>ingest millions of rows per second</b> with a single ClickHouse connection.</p><p>A valuable lesson learned: as with any software, database drivers make tradeoffs for the sake of convenience or a common use case that may not match your own. Most drivers are not optimized for reading large batches of rows and have high per-row overhead.</p>
    <div>
      <h5>Case study: Postgres</h5>
      <a href="#case-study-postgres">
        
      </a>
    </div>
    <p>For Postgres, we use the excellent <a href="https://github.com/jackc/pgx"><u>jackc/pgx</u></a> driver, but instead of using the database/sql Scan interface, we directly receive the raw bytes for each row and use the jackc/pgx internal scan functions for each Postgres OID (Object Identifier) type.</p><p>The database/sql Scan interface in Go uses reflection to understand the type passed to the function and then also uses reflection to set each field with the column value received from Postgres. In typical scenarios, this is fast enough and easy to use, but falls short for our use cases in terms of performance. The <a href="https://github.com/jackc/pgx"><u>jackc/pgx</u></a> driver reuses the row bytes produced each time the next Postgres row is requested, resulting in zero allocations per row. This allows us to write high-performance, low-allocation code within Jetflow. With this design, we are able to achieve nearly <b>600,000 rows per second</b> per Postgres connection for most tables, with very low memory usage.</p>
    <div>
      <h2>Conclusion</h2>
      <a href="#conclusion">
        
      </a>
    </div>
    <p>As of early July 2025, the team ingests <b>77 billion</b> records per day via <b>Jetflow</b>. The remaining jobs are in the process of being migrated to <b>Jetflow</b>, which will bring the total daily ingestion to 141 billion records. The framework has allowed us to ingest tables in cases that would not otherwise have been possible, and has provided significant cost savings, because ingestions run in less time and with fewer resources. </p><p>In the future, we plan to open source the project, and if you are interested in joining our team to help develop tools like this, then open roles can be found at <a href="https://www.cloudflare.com/en-gb/careers/jobs/"><u>https://www.cloudflare.com/careers/jobs/</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[Data]]></category>
            <category><![CDATA[Go]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Design]]></category>
            <category><![CDATA[Engineering]]></category>
            <guid isPermaLink="false">4wAX6JGDuRNIJwVqwdgrP8</guid>
            <dc:creator>Harry Hough</dc:creator>
            <dc:creator>Rebecca Walton-Jones </dc:creator>
            <dc:creator>Andy Fan</dc:creator>
            <dc:creator>Ricardo Margalhau</dc:creator>
            <dc:creator>Uday Sharma</dc:creator>
        </item>
        <item>
            <title><![CDATA[Network performance update: Developer Week 2025]]></title>
            <link>https://blog.cloudflare.com/network-performance-update-developer-week-2025/</link>
            <pubDate>Wed, 09 Apr 2025 14:00:00 GMT</pubDate>
            <description><![CDATA[ Cloudflare has been tracking and comparing our speed with other top networks since 2021. Let’s take a look at how things have changed since our last update. ]]></description>
            <content:encoded><![CDATA[ <p>As the Internet has become enmeshed in our everyday lives, so has our need for speed. No one wants to wait when adding shoes to their shopping carts, or accessing corporate assets from across the globe. And as the Internet supports more and more of our critical infrastructure, speed becomes more than just a measure of how quickly we can place a takeout order. It becomes the connective tissue between the systems that keep us safe, healthy, and organized. Governments, financial institutions, healthcare ecosystems, transit — they increasingly rely on the Internet. This is why at Cloudflare, building the fastest network is our north star. </p><p>We’re happy to announce that we are the fastest network in 48% of the top 1000 networks by 95th percentile TCP connection time between November 2024 and March 2025, up from 44% in September 2024.</p><p>In this post, we’re going to share with you how our network performance has changed since our <a href="https://blog.cloudflare.com/network-performance-update-birthday-week-2024/"><u>last post in September 2024</u></a>, and talk about what makes us faster than other networks.  But first, let’s talk a little bit about how we get this data.</p>
    <div>
      <h2>How does Cloudflare get this data?</h2>
      <a href="#how-does-cloudflare-get-this-data">
        
      </a>
    </div>
    <p>It’s happened to all of us — you casually click on a site, and suddenly you’ve reached a Cloudflare-branded error page. While you are shaking your fist at the sky, something interesting is happening on the back end. Cloudflare is using <a href="https://www.w3.org/TR/user-timing/"><u>Real User Monitoring (RUM)</u></a> to collect the data used to compare our performance against other networks. The monitoring we do is slightly different from the <a href="https://www.cloudflare.com/application-services/solutions/app-performance-monitoring/"><u>RUM Cloudflare offers</u></a> to customers. When the error page loads, a 100 KB file is fetched and loaded. This file is hosted on networks like Cloudflare, Akamai, Amazon CloudFront, Fastly, and Google Cloud CDN. Your browser processes the performance data and sends it to Cloudflare, where we use it to get a clear view of how these different networks stack up in terms of speed. </p><p>We’ve been collecting and refining this data since June 2021.  You can read more about how we collect that data <a href="https://blog.cloudflare.com/benchmarking-edge-network-performance/"><u>here</u></a>, and we regularly <a href="https://blog.cloudflare.com/tag/network-performance-update/"><u>track our performance</u></a> during Innovation Weeks to hold ourselves accountable to you in our pursuit of being the fastest network in the world.</p>
    <div>
      <h2>How are we doing?</h2>
      <a href="#how-are-we-doing">
        
      </a>
    </div>
    <p>In order to evaluate Cloudflare’s speed relative to others, we measure performance across the top 1000 “eyeball” networks using the list provided by the <a href="https://stats.labs.apnic.net/cgi-bin/aspop?c=IN"><u>Asia Pacific Network Information Centre (APNIC)</u></a>. So-called “eyeball” networks are those with a large concentration of subscribers/end users.  This information is important, because it gives us signals for where we can expand our presence or peering, or optimize our traffic engineering. When benchmarking, we assess the 95th percentile TCP connection time. This is the time it takes a user to establish a TCP connection to the server they are trying to reach. This metric helps us illustrate how Cloudflare’s network makes your traffic faster by serving your customers as locally as possible. </p><p>When we look at Cloudflare’s performance across the top 1000 networks, we can see that we’re fastest in 487, or over 48%, of these networks, between November 2024 and March 2025:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2vkfABpKwZtd7FJf5BU4lz/c2a778435be9b2c47656753cdb39e8f0/1.png" />
          </figure><p>In <a href="https://blog.cloudflare.com/network-performance-update-birthday-week-2024/"><u>September 2024</u></a>, we ranked #1 in 44% of these networks:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/105vHx9riLNO4Fgm5XvxnL/4b7d106b84d90bcc674c3fb54043593c/2.png" />
          </figure><p>So why the jump? To find out, let’s take a look at the countries where we improved, which will give us a better sense of where to dive in. This is what our network map looked like in <a href="https://blog.cloudflare.com/network-performance-update-birthday-week-2024/"><u>September 2024</u></a> (grey countries mean we do not have enough data or users to derive insights):</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5IfSvKcdYDsTE2Rl2WPLpE/1814ef571b8622c83ff6817b41102cf5/3.png" />
          </figure><p>(September 2024)</p><p>Today, using those same 95th percentile TCP connect times, we rank #1 in 48% of networks and the network map looks like this:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/xYWPvT0dQH7eCxbqNSrqv/e758b2961faad0cd5e1d1d6a72351131/4.png" />
          </figure><p>(March 2025)</p><p>We made most of our gains in Africa, where countries that previously didn’t have enough samples saw an increase in samples, and Cloudflare pulled ahead. This could mean that there was either an increase in Cloudflare users or an increase in error pages shown. These countries got faster almost exclusively due to the presence of our <a href="https://blog.cloudflare.com/how-cloudflare-helps-next-generation-markets/"><u>Edge Partner deployments</u></a>, which are Cloudflare locations embedded in last mile networks.  In next-generation markets like many African countries, these locations are crucial to being faster, as connectivity to end users tends to fall back to places like South Africa or London if in-country peering does not exist.</p><p>But let’s take a look at a couple of other places and see why we got faster.</p><p>In Canada, we were not the fastest in September 2024, but we are today: we are now the fastest in 40% of networks, the most of any of our competitors:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6bWdN0wG9g1LhujV4lY5Ne/5cdaa76a27cacc487622c45ab0ea38cd/5.png" />
          </figure><p>But when you look at the overall country numbers, we see that the race for the fastest network is quite close:</p><div>
    <figure>
        <table>
            <colgroup>
                <col></col>
                <col></col>
                <col></col>
                <col></col>
            </colgroup>
            <tbody>
                <tr>
                    <td>
                        <p><span><span><strong>Canada 95th Percentile TCP Connect Time by Provider</strong></span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><span><span><strong>Rank</strong></span></span></p>
                    </td>
                    <td>
                        <p><span><span><strong>Entity</strong></span></span></p>
                    </td>
                    <td>
                        <p><span><span><strong>Connect Time (P95)</strong></span></span></p>
                    </td>
                    <td>
                        <p><span><span><strong>#1 Diff</strong></span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><span><span>1</span></span></p>
                    </td>
                    <td>
                        <p><span><span>Cloudflare</span></span></p>
                    </td>
                    <td>
                        <p><span><span>179 ms</span></span></p>
                    </td>
                    <td>
                        <p><span><span>-</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><span><span>2</span></span></p>
                    </td>
                    <td>
                        <p><span><span>Fastly</span></span></p>
                    </td>
                    <td>
                        <p><span><span>180 ms</span></span></p>
                    </td>
                    <td>
                        <p><span><span>+0.48% (+0.87 ms)</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><span><span>3</span></span></p>
                    </td>
                    <td>
                        <p><span><span>Google</span></span></p>
                    </td>
                    <td>
                        <p><span><span>180 ms</span></span></p>
                    </td>
                    <td>
                        <p><span><span>+0.74% (+1.32 ms)</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><span><span>4</span></span></p>
                    </td>
                    <td>
                        <p><span><span>CloudFront</span></span></p>
                    </td>
                    <td>
                        <p><span><span>182 ms</span></span></p>
                    </td>
                    <td>
                        <p><span><span>+1.74% (+3.11 ms)</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><span><span>5</span></span></p>
                    </td>
                    <td>
                        <p><span><span>Akamai</span></span></p>
                    </td>
                    <td>
                        <p><span><span>215 ms </span></span></p>
                    </td>
                    <td>
                        <p><span><span>+20% (+36 ms)</span></span></p>
                    </td>
                </tr>
            </tbody>
        </table>
    </figure>
</div><p>The difference between Cloudflare and the third-fastest network is a little over a millisecond!  As we’ve <a href="https://blog.cloudflare.com/network-performance-update-birthday-week-2024/"><u>pointed out previously</u></a>, such fluctuations are quite common, especially at higher percentiles.  But there is still a significant difference between us and the slowest network; we’re around 20% faster.</p><p>However, looking at a place like Japan, where we were not the fastest in September 2024 but are now the fastest, there is a significant difference between Cloudflare and the number two network:</p><div>
    <figure>
        <table>
            <colgroup>
                <col></col>
                <col></col>
                <col></col>
                <col></col>
            </colgroup>
            <tbody>
                <tr>
                    <td>
                        <p><span><span><strong>Japan 95th Percentile TCP Connect Time by Provider</strong></span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><span><span><strong>Rank</strong></span></span></p>
                    </td>
                    <td>
                        <p><span><span><strong>Entity</strong></span></span></p>
                    </td>
                    <td>
                        <p><span><span><strong>Connect Time (P95)</strong></span></span></p>
                    </td>
                    <td>
                        <p><span><span><strong>#1 Diff</strong></span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><span><span>1</span></span></p>
                    </td>
                    <td>
                        <p><span><span>Cloudflare</span></span></p>
                    </td>
                    <td>
                        <p><span><span>116 ms</span></span></p>
                    </td>
                    <td>
                        <p><span><span>-</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><span><span>2</span></span></p>
                    </td>
                    <td>
                        <p><span><span>Fastly</span></span></p>
                    </td>
                    <td>
                        <p><span><span>122 ms</span></span></p>
                    </td>
                    <td>
                        <p><span><span>+5.23% (+6.08 ms)</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><span><span>3</span></span></p>
                    </td>
                    <td>
                        <p><span><span>Google</span></span></p>
                    </td>
                    <td>
                        <p><span><span>124 ms</span></span></p>
                    </td>
                    <td>
                        <p><span><span>+6.21% (+7.22 ms)</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><span><span>4</span></span></p>
                    </td>
                    <td>
                        <p><span><span>CloudFront</span></span></p>
                    </td>
                    <td>
                        <p><span><span>127 ms</span></span></p>
                    </td>
                    <td>
                        <p><span><span>+8.91% (+10 ms)</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><span><span>5</span></span></p>
                    </td>
                    <td>
                        <p><span><span>Akamai</span></span></p>
                    </td>
                    <td>
                        <p><span><span>153 ms </span></span></p>
                    </td>
                    <td>
                        <p><span><span>+32% (+37 ms)</span></span></p>
                    </td>
                </tr>
            </tbody>
        </table>
    </figure>
</div><p>Why is this? We are in more locations in Japan than our competitors and added more Edge Partner deployments in these locations, bringing us even closer to end-users. Edge Partner deployments are collaborations with ISPs, where we take space in their data centers, and peer with them directly. </p>
    <div>
      <h2>Why?</h2>
      <a href="#why">
        
      </a>
    </div>
    <p>Why do we track our network performance like this? The answer is simple: to improve user experience. This data allows us to track a key performance metric for Cloudflare and the other networks. When we see that we’re lagging in a region, it serves as a signal to dig deeper into our network. </p><p>This data is a gold mine for the teams tasked with improving Cloudflare’s network. When there are countries where Cloudflare is behind, it gives us signals for where we should expand or investigate. If we’re slow, we may need to invest in additional peering. If a region we have invested in heavily is slower, we may need to investigate our hardware.  The example from Japan shows exactly how this pays off: we took a location where we were previously on par with our competitors, added peering in new locations, and pulled ahead. </p><p>On top of this map, we have <a href="https://www.cloudflare.com/learning/network-layer/what-is-an-autonomous-system/"><u>autonomous system (ASN)</u></a> level granularity on how we are performing on each of the top 1000 eyeball networks, and we continuously optimize our traffic flow with each of them.  This allows us to track individual networks that may lag and improve the customer experience in those networks by turning up peering, or even adding new deployments in those regions. </p>
    <div>
      <h2>What’s next?</h2>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>We’re sharing our updates on our journey to become #1 everywhere so that you can see what goes into running the fastest network in the world. From here, our plan is the same as always: identify where we’re slower, fix it, and then tell you how we’ve gotten faster.</p> ]]></content:encoded>
            <category><![CDATA[Developer Week]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Network]]></category>
            <category><![CDATA[Network Performance Update]]></category>
            <guid isPermaLink="false">2O9xvScPSeNZVBqldw8qgs</guid>
            <dc:creator>Emily Music</dc:creator>
            <dc:creator>Onur Karaagaoglu</dc:creator>
        </item>
        <item>
            <title><![CDATA[“You get Instant Purge, and you get Instant Purge!” — all purge methods now available to all customers]]></title>
            <link>https://blog.cloudflare.com/instant-purge-for-all/</link>
            <pubDate>Tue, 01 Apr 2025 14:00:00 GMT</pubDate>
            <description><![CDATA[ Following up on having the fastest purge in the industry, we have now increased Instant Purge quotas across all Cloudflare plans.  ]]></description>
            <content:encoded><![CDATA[ <p>There's a tradition at Cloudflare of launching real products on April 1, instead of the usual joke product announcements circulating online today. In previous years, we've introduced impactful products like <a href="https://blog.cloudflare.com/announcing-1111/"><u>1.1.1.1</u></a> and <a href="https://blog.cloudflare.com/introducing-1-1-1-1-for-families/"><u>1.1.1.1 for Families</u></a>. Today, we're excited to continue this tradition by <b>making every purge method available to all customers, regardless of plan type.</b></p><p>During Birthday Week 2024, we <a href="https://blog.cloudflare.com/instant-purge/"><u>announced our intention</u></a> to bring the full suite of purge methods — including purge by URL, purge by hostname, purge by tag, purge by prefix, and purge everything — to all Cloudflare plans. Historically, methods other than "purge by URL" and "purge everything" were exclusive to Enterprise customers. However, we've been openly rebuilding our purge pipeline over the past few years (hopefully you’ve read <a href="https://blog.cloudflare.com/part1-coreless-purge/"><u>some of our</u></a> <a href="https://blog.cloudflare.com/rethinking-cache-purge-architecture/"><u>blog</u></a> <a href="https://blog.cloudflare.com/instant-purge/"><u>series</u></a>), and we're thrilled to share the results more broadly. We've spent recent months ensuring the new Instant Purge pipeline performs consistently under 150 ms, even during increased load scenarios, making it ready for every customer.  </p><p>But that's not all — we're also significantly raising the default purge rate limits for Enterprise customers, allowing even greater purge throughput thanks to the efficiency of our newly developed <a href="https://blog.cloudflare.com/instant-purge/"><u>Instant Purge</u></a> system.</p>
    <div>
      <h2>Building a better purge: a two-year journey</h2>
      <a href="#building-a-better-purge-a-two-year-journey">
        
      </a>
    </div>
    <p>Stepping back, today's announcement represents roughly two years of focused engineering. Near the end of 2022, our team went heads down rebuilding Cloudflare’s purge pipeline with a clear yet challenging goal: dramatically increase our throughput while maintaining near-instant invalidation across our global network.</p><p>Cloudflare operates <a href="https://www.cloudflare.com/network"><u>data centers in over 335 cities worldwide</u></a>. Popular cached assets can reside across all of our data centers, meaning each purge request must quickly propagate to every location caching that content. Upon receiving a purge command, each data center must efficiently locate and invalidate cached content, preventing stale responses from being served. The amount of content that must be invalidated can vary drastically, from a single file, to all cached assets associated with a particular hostname. After the content has been purged, any subsequent requests will trigger retrieval of a fresh copy from the origin server, which will be stored in Cloudflare’s cache during the response. </p><p>Ensuring consistent, rapid propagation of purge requests across a vast network introduces substantial technical challenges, especially when accounting for occasional data center outages, maintenance, or network interruptions. Maintaining consistency under these conditions requires robust distributed systems engineering.</p>
    <div>
      <h2>How did we scale purge?</h2>
      <a href="#how-did-we-scale-purge">
        
      </a>
    </div>
    <p>We've <a href="https://blog.cloudflare.com/instant-purge/"><u>previously discussed</u></a> how our new Instant Purge system was architected to achieve sub-150 ms purge times. It’s worth noting that the performance improvements were only part of what our new architecture achieved, as it also helped us solve significant scaling challenges around storage and throughput that allowed us to bring Instant Purge to all users. </p><p>Initially, our purge system scaled well, but with rapid customer growth, the storage consumption from millions of daily purge keys that needed to be stored reduced available caching space. Early attempts to manage this storage and throughput demand involved <a href="https://www.boltic.io/blog/kafka-queue"><u>queues</u></a> and batching for smoothing traffic spikes, but this introduced latency and underscored the tight coupling between increased usage and rising storage costs.</p><p>We needed to revisit our thinking on how to better store purge keys and when to remove purged content so we could reclaim space. Historically, when a customer would purge by tag, prefix or hostname, Cloudflare would mark the content as expired and allow it to be evicted later. This is known as lazy-purge because nothing is actively removed from disk. Lazy-purge is fast, but not necessarily efficient, because it consumes storage for expired but not-yet-evicted content. After examining global or data center-level indexing for purge keys, we decided that wasn't viable due to increases in system complexity and the latency those indices could bring due to our network size. So instead, we opted for per-machine indexing, integrating indices directly alongside our cache proxies. 
This minimized network complexity, improved reliability, and provided predictable scaling.</p><p>After careful analysis and benchmarking, we selected <a href="https://rocksdb.org/"><u>RocksDB</u></a>, an embedded key-value store that we could optimize for our needs, which formed the basis of <a href="https://blog.cloudflare.com/instant-purge/#putting-it-all-together"><u>CacheDB</u></a>, our Rust-based service running alongside each cache proxy. CacheDB manages indexing and immediate purge execution (active purge), significantly reducing storage needs and freeing space for caching.</p>
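<p>Conceptually, the per-machine index maps each purge key (a tag, hostname, or prefix) to the cached assets it covers, so a purge can actively delete matching content instead of lazily expiring it. The sketch below is a deliberately simplified, in-memory illustration of that idea, not CacheDB’s actual RocksDB-backed design:</p>

```typescript
// Illustrative in-memory sketch of a per-machine purge index. The real CacheDB
// persists this mapping in RocksDB alongside each cache proxy.
class PurgeIndex {
  // purge key (tag, hostname, or prefix) -> paths of cached assets it covers
  private byKey = new Map<string, Set<string>>();

  // Record an asset and its purge keys when it is written to the cache.
  index(assetPath: string, purgeKeys: string[]): void {
    for (const key of purgeKeys) {
      if (!this.byKey.has(key)) this.byKey.set(key, new Set());
      this.byKey.get(key)!.add(assetPath);
    }
  }

  // Active purge: return every covered asset so the caller can delete it from
  // disk immediately, then drop the index entry to reclaim space.
  purge(purgeKey: string): string[] {
    const assets = [...(this.byKey.get(purgeKey) ?? [])];
    this.byKey.delete(purgeKey);
    return assets;
  }
}
```

<p>Because the index lives next to each cache proxy, a purge broadcast only needs to reach every machine; no cross-data-center index lookups are involved.</p>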
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4FZ0bQSx5MUhx3x3hwlRuk/91a27af7db5e629cd6d5fbe692397eaf/image2.png" />
          </figure><p>Local queues within CacheDB buffer purge operations to ensure consistent throughput without latency spikes, while the cache proxies consult CacheDB to guarantee rapid, active purges. Our updated distribution pipeline broadcasts purges directly to CacheDB instances across machines, dramatically improving throughput and purge speed.</p><p>Using CacheDB, we've reduced storage requirements 10x by eliminating lazy purge storage accumulation, instantly freeing valuable disk space. The freed storage enhances cache retention, boosting cache HIT ratios and minimizing origin egress. These savings in storage and increased throughput allowed us to scale to the point where we can offer Instant Purge to more customers.</p><p>For more information on how we designed the new Instant Purge system, please see the previous <a href="https://blog.cloudflare.com/instant-purge/"><u>installment</u></a> of our Purge series blog posts. </p>
    <div>
      <h2>Striking the right balance: what to purge and when</h2>
      <a href="#striking-the-right-balance-what-to-purge-and-when">
        
      </a>
    </div>
    <p>Moving on to practical considerations of using these new purge methods, it’s important to use the right method for what you want to invalidate. Purging too aggressively can overwhelm origin servers with unnecessary requests, driving up egress costs and potentially causing downtime. Conversely, insufficient purging leaves visitors with outdated content. Balancing precision and speed is vital.</p><p>Cloudflare supports multiple targeted purge methods to help customers achieve this balance.</p><ul><li><p><a href="https://developers.cloudflare.com/cache/how-to/purge-cache/purge-everything/"><b><u>Purge Everything</u></b></a>: Clears all cached content associated with a website.</p></li><li><p><a href="https://developers.cloudflare.com/cache/how-to/purge-cache/purge_by_prefix/"><b><u>Purge by Prefix</u></b></a>: Targets URLs sharing a common prefix.</p></li><li><p><a href="https://developers.cloudflare.com/cache/how-to/purge-cache/purge-by-hostname/"><b><u>Purge by Hostname</u></b></a>: Invalidates content by specific hostnames.</p></li><li><p><a href="https://developers.cloudflare.com/cache/how-to/purge-cache/purge-by-single-file/"><b><u>Purge by URL (single-file purge</u></b></a><b>)</b>: Precisely targets individual URLs.</p></li><li><p><a href="https://developers.cloudflare.com/cache/how-to/purge-cache/purge-by-tags/"><b><u>Purge by Tag</u></b></a>: Uses <a href="https://developers.cloudflare.com/cache/how-to/purge-cache/purge-by-tags/#add-cache-tag-http-response-headers"><u>Cache-Tag</u></a> headers to invalidate grouped assets, offering flexibility for complex cache management scenarios.</p></li></ul><p>Starting today, all of these methods are available to every Cloudflare customer.    </p>
    <div>
      <h2>How to purge </h2>
      <a href="#how-to-purge">
        
      </a>
    </div>
    <p>Users can select their purge method directly in the Cloudflare dashboard, located under the Cache tab in the <a href="https://dash.cloudflare.com/?to=/:account/:zone/caching/configuration"><u>configurations section</u></a>, or via the <a href="https://developers.cloudflare.com/api/resources/cache/"><u>Cloudflare API</u></a>. Each purge request should clearly specify the targeted URLs, hostnames, prefixes, or cache tags relevant to the selected purge type (known as purge keys). For instance, a prefix purge request might specify a directory such as example.com/foo/bar. To maximize efficiency and throughput, batching multiple purge keys in a single request is recommended over sending individual purge requests each with a single key.</p>
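<p>As an illustration, a batched purge-by-tag request to the API could be assembled as follows (the zone ID and token are placeholders; only the request shape is shown):</p>

```typescript
// Batch several purge keys into one API call rather than sending one call per key.
// "<ZONE_ID>" and "<API_TOKEN>" are placeholders for your own values.
type PurgeBody =
  | { files: string[] }
  | { tags: string[] }
  | { hosts: string[] }
  | { prefixes: string[] };

function buildPurgeRequest(zoneId: string, body: PurgeBody) {
  return {
    url: `https://api.cloudflare.com/client/v4/zones/${zoneId}/purge_cache`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: "Bearer <API_TOKEN>",
      },
      body: JSON.stringify(body),
    },
  };
}

// One request invalidates all three tags at once:
const req = buildPurgeRequest("<ZONE_ID>", { tags: ["product-123", "homepage", "nav"] });
// fetch(req.url, req.init);
```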
    <div>
      <h2>How much can you purge?</h2>
      <a href="#how-much-can-you-purge">
        
      </a>
    </div>
    <p>The new rate limits for Cloudflare's purge by tag, prefix, hostname, and purge everything are different for each plan type. We use a <a href="https://en.wikipedia.org/wiki/Token_bucket"><u>token bucket</u></a> rate limit system, so each account has a token bucket with a maximum size based on plan type. When we receive a purge request, we first add tokens to the account’s bucket based on the time elapsed since the account’s last purge request multiplied by the refill rate for its plan type (which can yield a fraction of a token). Then we check if there’s at least one whole token in the bucket, and if so we remove it and process the purge request. If not, the purge request will be rate limited. An easy way to think about this rate limit is that the refill rate represents the sustained rate of requests a user can send over a given period, while the bucket size represents the maximum burst of requests available.</p><p>For example, a free user starts with a bucket size of 25 requests and a refill rate of 5 requests per minute (one request per 12 seconds). If the user were to send 26 requests all at once, the first 25 would be processed, but the last request would be rate limited. They would need to wait 12 seconds and retry their last request for it to succeed. 
</p><p>The current limits are applied per <b>account</b>: </p><table><tr><td><p><b>Plan</b></p></td><td><p><b>Bucket size</b></p></td><td><p><b>Request refill rate</b></p></td><td><p><b>Max keys per request</b></p></td><td><p><b>Total keys</b></p></td></tr><tr><td><p><b>Free</b></p></td><td><p>25 requests</p></td><td><p>5 per minute</p></td><td><p>100</p></td><td><p>500 per minute</p></td></tr><tr><td><p><b>Pro</b></p></td><td><p>25 requests</p></td><td><p>5 per second</p></td><td><p>100</p></td><td><p>500 per second</p></td></tr><tr><td><p><b>Biz</b></p></td><td><p>50 requests</p></td><td><p>10 per second</p></td><td><p>100</p></td><td><p>1,000 per second</p></td></tr><tr><td><p><b>Enterprise</b></p></td><td><p>500 requests</p></td><td><p>50 per second</p></td><td><p>100</p></td><td><p>5,000 per second</p></td></tr></table><p>More detailed documentation on all purge rate limits can be found in our <a href="https://developers.cloudflare.com/cache/how-to/purge-cache/"><u>documentation</u></a>.</p>
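<p>The bucket mechanics described above can be sketched in a few lines. This is an illustrative model only (rates normalized to tokens per second), not Cloudflare’s implementation:</p>

```typescript
// Token bucket: the refill rate is the sustained request rate, the capacity is
// the maximum burst. Fractional tokens accumulate between requests.
class TokenBucket {
  private tokens: number;
  private lastRequest: number;

  constructor(private capacity: number, private refillPerSec: number, now = 0) {
    this.tokens = capacity; // the bucket starts full
    this.lastRequest = now;
  }

  // Returns true if the purge request is allowed, false if it is rate limited.
  tryRequest(now: number): boolean {
    const elapsed = now - this.lastRequest;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSec);
    this.lastRequest = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Free plan: a bucket of 25 requests, refilling at 5 tokens per minute.
const free = new TokenBucket(25, 5 / 60);
```

<p>Sending 26 requests at once through this model lets the first 25 through and rejects the last one until roughly 12 seconds have passed, matching the example above.</p>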
    <div>
      <h2>What’s next?</h2>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>We’ve spent a lot of time optimizing our purge platform. But we’re not done yet. Looking forward, we will continue to enhance the performance of Cloudflare’s single-file purge. The current P50 performance is around 250 ms, and we suspect that we can optimize it further to bring it under 200 ms. We will also increase purge throughput across all of our systems, and continue to refine our filtering techniques so that we can keep scaling effectively and customers can purge whatever they choose, whenever they choose. </p><p>We invite you to try out our new purge system today and deliver an instant, seamless experience to your visitors.</p> ]]></content:encoded>
            <category><![CDATA[Cache]]></category>
            <category><![CDATA[Speed & Reliability]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Cache Purge]]></category>
            <guid isPermaLink="false">4LTq8Utw6K58W4ojKxsqw8</guid>
            <dc:creator>Alex Krivit</dc:creator>
            <dc:creator> Connor Harwood</dc:creator>
            <dc:creator>Zaidoon Abd Al Hadi</dc:creator>
        </item>
        <item>
            <title><![CDATA[Dynamically optimize, clip, and resize video from any origin with Media Transformations]]></title>
            <link>https://blog.cloudflare.com/media-transformations-for-video-open-beta/</link>
            <pubDate>Fri, 07 Mar 2025 14:00:00 GMT</pubDate>
            <description><![CDATA[ With Cloudflare Stream’s new Media Transformations, content owners can resize, crop, clip, and optimize short-form video, all without migrating storage.  ]]></description>
            <content:encoded><![CDATA[ <p>Today, we are thrilled to announce Media Transformations, a new service that brings the magic of <a href="https://developers.cloudflare.com/images/transform-images/"><u>Image Transformations</u></a> to short-form video files wherever they are stored.</p><p>Since 2018, Cloudflare Stream has offered a managed video pipeline that empowers customers to serve rich video experiences at global scale easily, in multiple formats and quality levels. Sometimes, the greatest friction to getting started isn't even about video, but rather the thought of migrating all those files. Customers want a simpler solution that retains their current storage strategy to deliver small, optimized MP4 files. Now you can do that with Media Transformations.</p>
    <div>
      <h3>Short videos, big volume</h3>
      <a href="#short-videos-big-volume">
        
      </a>
    </div>
    <p>For customers with a huge volume of short video, such as generative AI output, e-commerce product videos, social media clips, or short marketing content, uploading those assets to Stream is not always practical. Furthermore, Stream’s key features like adaptive bitrate encoding and HLS packaging offer diminishing returns on short content or small files.</p><p>Instead, content like this should be fetched from our customers' existing storage like R2 or S3 directly, optimized by Cloudflare quickly, and delivered efficiently as small MP4 files. Cloudflare Images customers reading this will note that this sounds just like their existing Image Transformation workflows. Starting today, the same workflow can be applied to your short-form videos.</p>
    <div>
      <h3>What’s in a video?</h3>
      <a href="#whats-in-a-video">
        
      </a>
    </div>
    <p>The distinction between video and images online can sometimes be blurry. Consider an animated GIF: is that an image or a video? (They're usually smaller as MP4s anyway!) As a practical example, consider a selection of product images for a new jacket on an e-commerce site. You want a consumer to know how it looks, but also how it flows. So perhaps the first "image" in that carousel is actually a video of a model simply putting the jacket on. Media Transformations empowers customers to optimize the product video and images with similar tools and identical infrastructure.</p>
    <div>
      <h3>How to get started</h3>
      <a href="#how-to-get-started">
        
      </a>
    </div>
    <p>Any website that is already enabled for Image Transformations is now enabled for Media Transformations. To enable a new zone, navigate to “Transformations” under Stream (or Images), locate your zone in the list, and click Enable. Enabling and disabling a zone for transformations affects both Images and Media transformations.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5hltjlyKF43oV8gTvjr9vF/d904229983fbe9484b08763e22dcac8b/image3.png" />
          </figure><p>After enabling Media Transformations on a website, it is simple to construct a URL that transforms a video. The pattern is similar to Image Transformations, but uses the <code>media</code> endpoint instead of the <code>image</code> endpoint:</p>
            <pre><code>https://example.com/cdn-cgi/media/&lt;OPTIONS&gt;/&lt;SOURCE-VIDEO&gt;</code></pre>
            <p>The <code>&lt;OPTIONS&gt;</code> portion of the URL is a comma-separated <a href="https://developers.cloudflare.com/stream/transform-videos/"><u>list of flags</u></a> written as <code>key=value</code>. A few noteworthy flags:</p><ul><li><p><code>mode</code> can be <code>video</code> (the default) to output a video, <code>frame</code> to pull a still image of a single frame, or even <code>spritesheet</code> to generate an image with multiple frames, which is useful for seek previews or storyboarding.</p></li><li><p><code>time</code> specifies the exact start time in the input video from which to extract a frame or start making a clip.</p></li><li><p><code>duration</code> specifies the length of the output video to make a clip shorter than the original.</p></li><li><p><code>fit</code>, together with <code>height</code> and <code>width</code>, allows resizing and cropping the output video or frame.</p></li><li><p>Setting <code>audio</code> to <code>false</code> removes the sound in the output video.</p></li></ul><p>The <code>&lt;SOURCE-VIDEO&gt;</code> is a full URL to a source file, or a root-relative path if the origin is on the same zone as the transformation request.</p><p>A full list of supported options, examples, and troubleshooting information is <a href="https://developers.cloudflare.com/stream/transform-videos/"><u>available in DevDocs</u></a>.</p>
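<p>For illustration, the URL pattern can also be assembled programmatically. This small helper is hypothetical, not part of an official SDK:</p>

```typescript
// Build a /cdn-cgi/media/ URL from an options object and a source video.
function mediaUrl(
  zone: string,
  options: Record<string, string | number | boolean>,
  source: string
): string {
  // Options are serialized as comma-separated key=value pairs.
  const opts = Object.entries(options)
    .map(([key, value]) => `${key}=${value}`)
    .join(",");
  return `https://${zone}/cdn-cgi/media/${opts}/${source}`;
}

mediaUrl(
  "example.com",
  { mode: "video", duration: "10s", width: 480, audio: false },
  "video.mp4"
);
// -> "https://example.com/cdn-cgi/media/mode=video,duration=10s,width=480,audio=false/video.mp4"
```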
    <div>
      <h3>A few examples</h3>
      <a href="#a-few-examples">
        
      </a>
    </div>
    <p>I used my phone to take this video of the <a href="https://blog.cloudflare.com/harnessing-office-chaos/"><u>randomness mobile</u></a> in Cloudflare’s Austin Office and put it in an R2 bucket. Of course, it is possible to embed the original video file from R2 directly:</p>  


<p>That video file is almost 30 MB. Let’s optimize it together — a more efficient choice would be to resize the video to the width of this blog post template. Let’s apply a width adjustment in the options portion of the URL:</p>
            <pre><code>https://example.com/cdn-cgi/media/width=760/https://pub-d9fcbc1abcd244c1821f38b99017347f.r2.dev/aus-mobile.mp4</code></pre>
            <p>That will deliver the same video, resized and optimized:</p>


<p>Not only is this video the right size for its container, now it’s less than 4 MB. That’s a big bandwidth savings for visitors.</p><p>As I recorded the video, the lobby was pretty quiet, but there was someone talking in the distance. If we wanted to use this video as a background, we should remove the audio, shorten it, and perhaps crop it vertically. All of these options can be combined, comma-separated, in the options portion of the URL:</p>
            <pre><code>https://example.com/cdn-cgi/media/mode=video,duration=10s,width=480,height=720,fit=cover,audio=false/https://pub-d9fcbc1abcd244c1821f38b99017347f.r2.dev/aus-mobile.mp4</code></pre>
            <p>The result:</p>


<p>If this were a product video, we might want a small thumbnail to add to the carousel of images so shoppers can click to zoom in and see it move. Use the “frame” mode and a “time” to generate a static image from a single point in the video. The same size and fit options apply:</p>
            <pre><code>https://example.com/cdn-cgi/media/mode=frame,time=3s,width=120,height=120,fit=cover/https://pub-d9fcbc1abcd244c1821f38b99017347f.r2.dev/aus-mobile.mp4</code></pre>
            <p>Which generates this optimized image:</p> 
<img src="https://blog.cloudflare.com/cdn-cgi/media/mode=frame,time=3s,width=120,height=120,fit=cover/https://pub-d9fcbc1abcd244c1821f38b99017347f.r2.dev/aus-mobile.mp4" /><p>Try it out yourself using our video or one of your own: </p><ul><li><p>Enable transformations on your website/zone and use the endpoint: <code>https://[your-site]/cdn-cgi/media/</code></p></li><li><p>Mobile video: <a href="https://pub-d9fcbc1abcd244c1821f38b99017347f.r2.dev/aus-mobile.mp4"><u>https://pub-d9fcbc1abcd244c1821f38b99017347f.r2.dev/aus-mobile.mp4</u></a> </p></li><li><p>Check out the <a href="https://stream-video-transformer.kristianfreeman.com/"><u>Media Transformation URL Generator</u></a> from Kristian Freeman on our Developer Relations team, which he built using the <a href="https://streamlit.io/"><u>Streamlit</u></a> Python framework on Workers.</p></li></ul>
    <div>
      <h3>Input Limits</h3>
      <a href="#input-limits">
        
      </a>
    </div>
    <p>We are eager to start supporting real customer content, and we will right-size our input limitations with our early adopters. To start:</p><ul><li><p>Video files must be smaller than 40 megabytes.</p></li><li><p>Files must be MP4s and should be H.264 encoded.</p></li><li><p>Videos and images generated with Media Transformations will be cached. However, in our initial beta, the original content will not be cached, which means regenerating a variant will result in a request to the origin.</p></li></ul>
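<p>If you want to pre-check assets against these beta limits before requesting a transformation, the gate is simple. The limits come from this post; the function itself is just a sketch (byte size approximated as 40 × 2<sup>20</sup> here):</p>

```typescript
// Beta input limits sketch: MP4 only, under 40 MB (approximated as 40 * 2^20 bytes).
const MAX_BYTES = 40 * 1024 * 1024;

// Returns null when the source is acceptable, or a reason string otherwise.
function validateSource(contentType: string, contentLength: number): string | null {
  if (contentLength >= MAX_BYTES) return "video must be smaller than 40 MB";
  if (contentType !== "video/mp4") return "only MP4 (ideally H.264) is supported";
  return null;
}
```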
    <div>
      <h3>How it works</h3>
      <a href="#how-it-works">
        
      </a>
    </div>
    <p>Unlike Stream, Media Transformations receives requests on a customer’s own website. Internally, however, these requests are passed to the same <a href="https://blog.cloudflare.com/behind-the-scenes-with-stream-live-cloudflares-live-streaming-service/"><u>On-the-Fly Encoder (“OTFE”) platform that Stream Live uses</u></a>. To achieve this, the Stream team built modules that run on our servers to act as entry points for these requests.</p><p>These entry points perform some initial validation on the URL formatting and flags before building a request to Stream’s own Delivery Worker, which in turn calls OTFE’s set of transformation handlers. The original asset is fetched from the <i>customer’s</i> origin, validated for size and type, and passed to the same OTFE methods responsible for manipulating and optimizing <a href="https://developers.cloudflare.com/stream/viewing-videos/displaying-thumbnails/"><u>video or still frame thumbnails</u></a> for videos uploaded to Stream. These tools do a final inspection of the media type and encoding for compatibility, then generate the requested variant. If any errors were raised along the way, an HTTP error response will be generated using <a href="https://developers.cloudflare.com/images/reference/troubleshooting/#error-responses-from-resizing"><u>similar error codes</u></a> to Image Transformations. When successful, the result is cached for future use and delivered to the requestor as a single file. Even for new or uncached requests, all of this operates much faster than the video’s play time.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7wfYn8FLcgzgIdLT6NFeq3/f6c51134363231ffed964300cb9992b0/flowchart.png" />
          </figure>
    <div>
      <h3>What it costs</h3>
      <a href="#what-it-costs">
        
      </a>
    </div>
    <p>Media Transformations will be free for all customers while in beta. We expect the beta period to extend into Q3 2025, and after that, Media Transformations will use the same subscriptions and billing mechanics as Image Transformations — including a free allocation for all websites/zones. Generating a still frame (single image) from a video counts as 1 transformation. Generating an optimized video is billed as 1 transformation <i>per second of the output video.</i> Each unique transformation is only billed once per month. All Media and Image Transformations cost $0.50 per 1,000 monthly unique transformation operations, with a free monthly allocation of 5,000.</p><p>Using this post as an example, recall the two transformed videos and one transformed image above — the big original doesn’t count because it wasn’t transformed. The first video (showing blog post width) was 15 seconds of output. The second video (silent vertical clip) was 10 seconds of output. The preview square is a still frame. These three operations would count as 26 transformations — and they would only bill once per month, regardless of how many visitors this page receives.</p>
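<p>The arithmetic from that example, using the billing rules above (one operation per still frame, one per second of output video, each unique transformation counted once per month):</p>

```typescript
// Count billable operations for a set of unique transformation outputs.
function monthlyOperations(
  outputs: { kind: "frame" | "video"; seconds?: number }[]
): number {
  // A frame costs 1 operation; a video costs 1 operation per second of output.
  return outputs.reduce(
    (total, o) => total + (o.kind === "frame" ? 1 : (o.seconds ?? 0)),
    0
  );
}

// The three unique variants in this post:
monthlyOperations([
  { kind: "video", seconds: 15 }, // resized full-width clip
  { kind: "video", seconds: 10 }, // silent vertical clip
  { kind: "frame" },              // still-frame thumbnail
]);
// -> 26 operations, well within the free monthly allocation of 5,000
```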
    <div>
      <h3>Looking ahead</h3>
      <a href="#looking-ahead">
        
      </a>
    </div>
    <p>Our short-term focus will be on right-sizing input limits based on real customer usage as well as adding a caching layer for origin fetches to reduce any egress fees our customers may be facing from other storage providers. Looking further, we intend to streamline Images and Media Transformations to further simplify the developer experience, unify the features, and streamline enablement: Cloudflare’s Media Transformations will optimize your images and video, quickly and easily, wherever you need them.</p><p>Try it for yourself today using our sample asset above, or get started by enabling Transformations on a zone in your account and uploading a short file to R2, both of which offer a free tier to get you going.</p> ]]></content:encoded>
            <category><![CDATA[Cloudflare Media Platform]]></category>
            <category><![CDATA[Cloudflare Stream]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Video]]></category>
            <guid isPermaLink="false">2KCsgqrpHOVpCClqBBPnYM</guid>
            <dc:creator>Taylor Smith</dc:creator>
            <dc:creator>Mickie Betz</dc:creator>
            <dc:creator>Ben Krebsbach</dc:creator>
        </item>
        <item>
            <title><![CDATA[Moving Baselime from AWS to Cloudflare: simpler architecture, improved performance, over 80% lower cloud costs]]></title>
            <link>https://blog.cloudflare.com/80-percent-lower-cloud-cost-how-baselime-moved-from-aws-to-cloudflare/</link>
            <pubDate>Thu, 31 Oct 2024 13:00:00 GMT</pubDate>
            <description><![CDATA[ Post-acquisition, we migrated Baselime from AWS to the Cloudflare Developer Platform and in the process, we improved query times, simplified data ingestion, and now handle far more events, all while cutting costs. Here’s how we built a modern, high-performing observability platform on Cloudflare’s network.  ]]></description>
            <content:encoded><![CDATA[ 
    <div>
      <h2>Introduction</h2>
      <a href="#introduction">
        
      </a>
    </div>
    <p>When <a href="https://blog.cloudflare.com/cloudflare-acquires-baselime-expands-observability-capabilities/"><u>Baselime joined Cloudflare</u></a> in April 2024, our architecture had evolved to hundreds of AWS Lambda functions, dozens of databases, and just as many queues. We were drowning in complexity and our cloud costs were growing fast. We are now building <a href="https://baselime.io/"><u>Baselime</u></a> and <a href="https://developers.cloudflare.com/workers/observability/logs/workers-logs/"><u>Workers Observability</u></a> on Cloudflare and will save over 80% on our cloud compute bill. The estimated potential Cloudflare costs are for Baselime, which remains a stand-alone offering, and the estimate is based on the <a href="https://developers.cloudflare.com/workers/platform/pricing/"><u>Workers Paid plan</u></a>. Not only did we achieve huge cost savings, we also simplified our architecture and improved overall latency, scalability, and reliability.</p><table><tr><td><p><b>Cost (daily)</b></p></td><td><p><b>Before (AWS)</b></p></td><td><p><b>After (Cloudflare)</b></p></td></tr><tr><td><p>Compute</p></td><td><p>$650 - AWS Lambda</p></td><td><p>$25 - Cloudflare Workers</p></td></tr><tr><td><p>CDN</p></td><td><p>$140 - Cloudfront</p></td><td><p>$0 - Free</p></td></tr><tr><td><p>Data Stream + Analytics database</p></td><td><p>$1,150 - Kinesis Data Stream + EC2</p></td><td><p>$300 - Workers Analytics Engine</p></td></tr><tr><td><p>Total (daily)</p></td><td><p>$1,940</p></td><td><p>$325</p></td></tr><tr><td><p><b>Total (annual)</b></p></td><td><p><b>$708,100</b></p></td><td><p><b>$118,625</b> (83% cost reduction)</p></td></tr></table><p><sub><i>Table 1: AWS vs. 
Workers Costs Comparison ($USD)</i></sub></p><p>When we joined Cloudflare, we immediately saw a surge in usage, and within the first week following the announcement, we were processing over a billion events daily and our weekly active users tripled.</p><p>As the platform grew, so did the challenges of managing real-time observability with new scalability, reliability, and cost considerations. This drove us to rebuild Baselime on the Cloudflare Developer Platform, where we could innovate quickly while reducing operational overhead.</p>
    <div>
      <h2>Initial architecture — all on AWS</h2>
      <a href="#initial-architecture-all-on-aws">
        
      </a>
    </div>
    <p>Our initial architecture was all on Amazon Web Services (AWS). We’ll focus here on the data pipeline, which covers ingestion, processing, and storage of tens of billions of events daily.</p><p>This pipeline was built on top of AWS Lambda, Cloudfront, Kinesis, EC2, DynamoDB, ECS, and ElastiCache.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1ruQuVe7kcd31FOS91nstS/9d422e751e0d2d3a90d603e6adcd5c48/image1.png" />
          </figure><p><sup><i>Figure1: Initial data pipeline architecture</i></sup></p><p>The key elements are:</p><ul><li><p><b>Data receptors</b>: Responsible for receiving telemetry data from multiple sources, including OpenTelemetry, Cloudflare Logpush, CloudWatch, Vercel, etc. They cover validation, authentication, and transforming data from each source into a common internal format. The data receptors were deployed either on AWS Lambda (using function URLs and Cloudfront) or ECS Fargate depending on the data source.</p></li><li><p><b>Kinesis Data Stream</b>: Responsible for transporting the data from the receptors to the next step: data processing.</p></li><li><p><b>Processor</b>: A single AWS Lambda function responsible for enriching and transforming the data for storage. It also performed real-time error tracking and detecting patterns in logs.</p></li><li><p><b>ClickHouse cluster</b>: All the telemetry data was ultimately indexed and stored in a self-hosted ClickHouse cluster on EC2.</p></li></ul><p>In addition to these key elements, the existing stack also included orchestration with Firehose, S3 buckets, SQS, DynamoDB and RDS for error handling, retries, and storing metadata.</p><p>While this architecture served us well in the early days, it started to show major cracks as we scaled our solution to more and larger customers.</p><p>Handling retries at the interface between the data receptors and the Kinesis Data Stream was complex, requiring introducing and orchestrating Firehose, S3 buckets, SQS, and another Lambda function.</p><p>Self-hosting ClickHouse also introduced major challenges at scale, as we continuously had to plan our capacity and update our setup to keep pace with our growing user base whilst attempting to maintain control over costs.</p><p>Costs began scaling unpredictably with our growing workloads, especially in AWS Lambda, Kinesis, and EC2, but also in less obvious ways, such as in Cloudfront (required for a custom domain in front of 
Lambda function URLs) and DynamoDB. Specifically, the time spent on I/O operations in AWS Lambda was a particularly costly piece. At every step, from the data receptors to the ClickHouse cluster, moving data to the next stage required waiting for a network request to complete, accounting for over 70% of wall time in the Lambda function.</p><p>In a nutshell, we were continuously paged by our alerts, innovating at a slower pace, and our costs were out of control.</p><p>Additionally, the entire solution was deployed in a single AWS region: eu-west-1. As a result, all developers located outside continental Europe were experiencing high latency when emitting logs and traces to Baselime. </p>
    <div>
      <h2>Modern architecture — transitioning to Cloudflare</h2>
      <a href="#modern-architecture-transitioning-to-cloudflare">
        
      </a>
    </div>
    <p>The shift to the <a href="https://www.cloudflare.com/en-gb/developer-platform/products/"><u>Cloudflare Developer Platform</u></a> enabled us to rethink our architecture to be exceptionally fast, globally distributed, and highly scalable, without compromising on cost, complexity, or agility. This new architecture is built on top of Cloudflare primitives.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/16ndcTUS2vAg6TM4djTGUH/0e187d50ae466c275839c6aac91e5249/image5.png" />
          </figure><p><sup><i>Figure 2: Modern data pipeline architecture</i></sup></p>
    <div>
      <h3>Cloudflare Workers: the core of Baselime</h3>
      <a href="#cloudflare-workers-the-core-of-baselime">
        
      </a>
    </div>
    <p><a href="https://www.cloudflare.com/developer-platform/workers/"><u>Cloudflare Workers</u></a> are now at the core of everything we do. All the data receptors and the processor run in Workers. Workers minimize cold-start times and are deployed globally by default. As such, developers always experience lower latency when emitting events to Baselime.</p><p>Additionally, we heavily use <a href="https://blog.cloudflare.com/javascript-native-rpc/"><u>JavaScript-native RPC</u></a> for data transfer between steps of the pipeline. It’s low-latency, lightweight, and simplifies communication between components. This further simplifies our architecture, as separate components behave more as functions within the same process, rather than completely separate applications.</p>
            <pre><code>export default {
  async fetch(request: Request, env: Bindings, ctx: ExecutionContext): Promise&lt;Response&gt; {
      try {
        const { err, apiKey } = auth(request);
        if (err) return err;

        const data = {
          workspaceId: apiKey.workspaceId,
          environmentId: apiKey.environmentId,
          events: request.body
        };
        await env.PROCESSOR.ingest(data);

        return success({ message: "Request Accepted" }, 202);
      } catch (error) {
        return failure({ message: "Internal Error" });
      }
  },
};</code></pre>
            <p><sup><i>Code Block 1: Simplified data receptor using JavaScript-native RPC to execute the processor.</i></sup></p><p>Workers also expose a <a href="https://developers.cloudflare.com/workers/runtime-apis/bindings/rate-limit/"><u>Rate Limiting binding</u></a> that enables us to automatically add rate limiting to our services, which we previously had to build ourselves using a combination of DynamoDB and ElastiCache.</p><p>Moreover, we heavily use <code>ctx.waitUntil</code> within our Worker invocations to offload data transformation outside the request/response path. This further reduces the latency of calls developers make to our data receptors.</p>
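<p>As a rough sketch of the pattern (the processing step and names here are placeholders, not our actual pipeline code), <code>ctx.waitUntil</code> lets the handler respond before the background work finishes:</p>

```typescript
// Minimal model of ctx.waitUntil: the runtime keeps the invocation alive until
// every promise passed to waitUntil settles, but the response is not delayed.
type Ctx = { waitUntil(promise: Promise<unknown>): void };

// Placeholder for the real enrichment/storage step.
async function enrichAndStore(events: unknown[]): Promise<void> {
  // ...transform events and write them to storage...
}

function handleIngest(events: unknown[], ctx: Ctx): { status: number } {
  // Kick off processing in the background and acknowledge immediately,
  // keeping the data receptor's response latency low.
  ctx.waitUntil(enrichAndStore(events));
  return { status: 202 };
}
```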
    <div>
      <h3>Durable Objects: stateful data processing</h3>
      <a href="#durable-objects-stateful-data-processing">
        
      </a>
    </div>
    <p><a href="https://www.cloudflare.com/en-gb/developer-platform/durable-objects/"><u>Durable Objects</u></a> is a unique service within the Cloudflare Developer Platform, as it enables building stateful applications in a serverless environment. We use Durable Objects in the data pipelines for both real-time error tracking and detecting log patterns.</p><p>For instance, to track errors in real-time, we create a durable object for each new type of error, and this durable object is responsible for keeping track of the frequency of the error, when to notify customers, and the notification channels for the error. <b>This implementation with a single building block removes the need for ElastiCache, Kinesis, and multiple Lambda functions to coordinate protecting the RDS database from being overwhelmed by a high frequency error.</b></p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/14SC0ackxCGiRxAr8DY1Vs/c5729735296552013765c298af802b38/image4.png" />
          </figure><p><sup><i>Figure 3: Real-time error detection architecture comparison</i></sup></p><p>Durable Objects gives us precise control over consistency and concurrency of managing state in the data pipeline.</p><p>In addition to the data pipeline, we use Durable Objects for alerting. Our previous architecture required orchestrating EventBridge Scheduler, SQS, DynamoDB and multiple AWS Lambda functions, whereas with Durable Objects, everything is handled within the <code>alarm</code> handler. </p>
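<p>A heavily simplified sketch of that per-error-type state machine (in-memory here for illustration; a real Durable Object would persist counts in <code>state.storage</code> and schedule its <code>alarm</code> handler):</p>

```typescript
// One instance of this logic exists per error type, mirroring the
// one-Durable-Object-per-error design described above.
class ErrorTracker {
  private count = 0;
  private notified = false;

  // Record one occurrence; returns true when customers should be notified.
  record(notifyThreshold: number): boolean {
    this.count += 1;
    if (!this.notified && this.count >= notifyThreshold) {
      this.notified = true; // notify once, then keep aggregating
      return true;
    }
    return false;
  }

  // Analogue of the Durable Object `alarm` handler: flush a summary and reset,
  // so the database only sees one aggregated write per interval.
  alarm(): { count: number } {
    const summary = { count: this.count };
    this.count = 0;
    this.notified = false;
    return summary;
  }
}
```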
    <div>
      <h3>Workers Analytics Engine: high-cardinality analytics at scale</h3>
      <a href="#workers-analytics-engine-high-cardinality-analytics-at-scale">
        
      </a>
    </div>
    <p>Though managing our own ClickHouse cluster was technically interesting and challenging, it took us away from building the best observability developer experience. With this migration, more of our time is spent enhancing our product and none is spent managing server instances.</p><p><a href="https://developers.cloudflare.com/analytics/analytics-engine/"><u>Workers Analytics Engine</u></a> lets us synchronously write events to a scalable high-cardinality analytics database. We built on top of the same technology that powers Workers Analytics Engine, and we made internal changes to it to natively support high dimensionality in addition to high cardinality.</p><p>Moreover, Workers Analytics Engine and our solution leverage <a href="https://blog.cloudflare.com/explaining-cloudflares-abr-analytics/"><u>Cloudflare’s ABR analytics</u></a>. ABR stands for Adaptive Bit Rate, and it enables us to store telemetry data in multiple tables with varying resolutions, from 100% down to 0.0001% of the data. Querying the table with 0.0001% of the data is several orders of magnitude faster than querying the table with all the data, with a corresponding trade-off in accuracy. When a query is sent to our systems, Workers Analytics Engine dynamically selects the most appropriate table to run it against, balancing query time and accuracy. Users always get the most accurate result achievable with optimal query time, regardless of the size of their dataset or the timeframe of the query. Compared to our previous system, which always ran queries on the full dataset, the new system delivers faster queries across our entire user base and use cases.</p><p>In addition to these core services (Workers, Durable Objects, Workers Analytics Engine), the new architecture leverages other building blocks from the Cloudflare Developer Platform: <a href="https://www.cloudflare.com/en-gb/developer-platform/products/cloudflare-queues/"><u>Queues</u></a> for asynchronous messaging, decoupling services and enabling an event-driven architecture; <a href="https://www.cloudflare.com/en-gb/developer-platform/d1/"><u>D1</u></a> as our main database for transactional data (queries, alerts, dashboards, configurations, etc.); <a href="https://www.cloudflare.com/en-gb/developer-platform/workers-kv/"><u>Workers KV</u></a> for fast distributed storage; and <a href="https://hono.dev/"><u>Hono</u></a> for all our APIs.</p>
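<p>For illustration, writing an event into Workers Analytics Engine from a Worker is a single non-blocking call to <code>writeDataPoint</code> (the <code>TELEMETRY</code> binding name and the event shape here are assumptions, not Baselime’s schema):</p>

```typescript
// Illustrative sketch: record one telemetry event in Workers Analytics Engine.

// Pure helper (hypothetical): map an incoming event to a data point.
export function toDataPoint(event) {
  return {
    indexes: [event.workspaceId],          // sampling key
    blobs: [event.service, event.level],   // string dimensions
    doubles: [event.durationMs],           // numeric values
  };
}

export default {
  async fetch(request, env) {
    const event = await request.json();
    // writeDataPoint does not block the request; no batching code is needed.
    env.TELEMETRY.writeDataPoint(toDataPoint(event));
    return new Response("ok");
  },
};
```

<p>The single element of <code>indexes</code> is used as the sampling key, which is what makes the ABR-style multi-resolution tables described above possible.</p>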
    <div>
      <h2>How did we migrate?</h2>
      <a href="#how-did-we-migrate">
        
      </a>
    </div>
    <p>Baselime is built on an event-driven architecture: every user action, whether it’s creating a user, editing a dashboard, or performing any other action, is recorded as an event and emitted to the rest of the system. Migrating to Cloudflare meant transitioning this event-driven architecture without compromising uptime or data consistency. Previously, it was powered by AWS EventBridge and SQS, and we moved entirely to Cloudflare Queues.</p><p>We followed the <a href="https://martinfowler.com/bliki/StranglerFigApplication.html"><u>strangler fig pattern</u></a> to incrementally migrate the solution from AWS to Cloudflare: gradually replacing specific parts of the system with newer services, with minimal disruption. Early in the process, we created a central Cloudflare Queue which acted as the backbone for all transactional event processing during the migration. Every event, whether a new user signup or a dashboard edit, was funneled into this Queue. From there, each event was dynamically routed to the relevant part of the application. User actions were synced into D1 and KV, ensuring that all user actions were mirrored across both AWS and Cloudflare during the transition.</p><p>This syncing mechanism enabled us to maintain consistency and ensure that no data was lost as users continued to interact with Baselime.</p><p>Here's an example of how events are processed:</p>
            <pre><code>export default {
  async queue(batch, env) {
    for (const message of batch.messages) {
      try {
        const event = message.body;
        switch (event.type) {
          case "WORKSPACE_CREATED":
            await workspaceHandler.create(env, event.data);
            break;
          case "QUERY_CREATED":
            await queryHandler.create(env, event.data);
            break;
          case "QUERY_DELETED":
            await queryHandler.remove(env, event.data);
            break;
          case "DASHBOARD_CREATED":
            await dashboardHandler.create(env, event.data);
            break;
          //
          // Many more events...
          //
          default:
            logger.info("Matched no events", { type: event.type });
        }
        message.ack();
      } catch (e) {
        if (message.attempts &lt; 3) {
          message.retry({ delaySeconds: Math.ceil(30 ** message.attempts / 10), });
        } else {
          logger.error("Failed handling event - No more retries", { event: message.body, attempts: message.attempts }, e);
        }
      }
    }
  },
} satisfies ExportedHandler&lt;Env, InternalEvent&gt;;</code></pre>
            <p><sup><i>Code Block 2: Simplified internal events processing during migration.</i></sup></p><p>We migrated the data pipeline from AWS to Cloudflare with an outside-in method: we started with the data receptors and incrementally moved the data processor and the ClickHouse cluster to the new architecture. We wrote telemetry data (logs, metrics, traces, wide-events, etc.) to both ClickHouse (in AWS) and Workers Analytics Engine simultaneously for the duration of the retention period (30 days).</p><p>The final step was rewriting all of our endpoints, previously hosted on AWS Lambda and ECS containers, as Cloudflare Workers. Once those Workers were ready, we simply switched the DNS records to point to the Workers instead of the existing Lambda functions.</p><p>Despite the complexity, the entire migration process, from the data pipeline to rewriting all API endpoints, took our then-team of three engineers less than three months.</p>
    <div>
      <h2>We ended up saving over 80% on our cloud bill</h2>
      <a href="#we-ended-up-saving-over-80-on-our-cloud-bill">
        
      </a>
    </div>
    
    <div>
      <h3>Savings on the data receptors</h3>
      <a href="#savings-on-the-data-receptors">
        
      </a>
    </div>
    <p>After switching the data receptors from AWS to Cloudflare in early June 2024, our AWS Lambda cost fell by over 85%. That cost was primarily driven by the I/O time the receptors spent sending data to a Kinesis Data Stream in the same region.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6P34XkcxEhqR6cjWAGnZaL/54025f08f4642649b7ae53fbaa3775b4/image3.png" />
          </figure><p><sup><i>Figure 4: Baselime daily AWS Lambda cost [note: the gap in data is the result of AWS Cost Explorer losing data when the parent organization of the cloud accounts was changed.]</i></sup></p><p>Moreover, we used CloudFront to enable custom domains pointing to the data receptors. When we migrated the data receptors to Cloudflare, there was no need for CloudFront anymore. As such, our CloudFront cost was reduced to $0.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1vj3JUrE580W749VX8JMBj/c82477d00fe33fba1ffd8658ef0a1229/image2.png" />
          </figure><p><sup><i>Figure 5: Baselime daily CloudFront cost [note: the gap in data is the result of AWS Cost Explorer losing data when the parent organization of the cloud accounts was changed.]</i></sup></p><p>If we were a regular Cloudflare customer, we estimate that our Cloudflare Workers bill would be around $25/day after the switch, against $790/day on AWS: an over 95% cost reduction. These savings are primarily driven by the Workers pricing model: Workers charges for CPU time rather than wall-clock time, and the receptors mostly move data, so they are I/O bound rather than CPU bound.</p>
    <div>
      <h3>Savings on the ClickHouse cluster</h3>
      <a href="#savings-on-the-clickhouse-cluster">
        
      </a>
    </div>
    <p>To evaluate the cost impact of switching from self-hosting ClickHouse to using Workers Analytics Engine, we need to take into account not only the EC2 instances, but also disk space, networking, and the Kinesis Data Stream cost.</p><p>We completed this switch in late August, achieving a cost reduction of over 95% in both the Kinesis Data Stream and all EC2-related costs.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3hpbfmwC5vEjDXeSIC2XIr/d103e813848926f224efe6e22e0717ce/image9.png" />
          </figure><p><sup><i>Figure 6: Baselime daily Kinesis Data Stream cost [note: the gap in data is the result of AWS Cost Explorer losing data when the parent organization of the cloud accounts was changed.]</i></sup></p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7aBU0TVUoRTC5GAo3HkS0N/eebb0559ad8d32b56e20f1e5c6121c6f/image6.png" />
          </figure><p><sup><i>Figure 7: Baselime daily EC2 cost [note: the gap in data is the result of AWS Cost Explorer losing data when the parent organization of the cloud accounts was changed.]</i></sup></p><p>If we were a regular Cloudflare customer, we estimate that our Workers Analytics Engine cost would be around $300/day after the switch, compared to $1,150/day on AWS, a cost reduction of over 70%.</p><p>Not only did we significantly reduce costs by migrating to Cloudflare, but we also improved performance across the board. Responses to users are now faster, with real-time event ingestion happening across Cloudflare’s network, closer to our users. Responses to users querying their data are also much faster, thanks to Cloudflare’s deep expertise in operating ClickHouse at scale.</p><p>Most importantly, we’re no longer bound by limitations in throughput or scale. We launched <a href="https://developers.cloudflare.com/workers/observability/logs/workers-logs"><u>Workers Logs</u></a> on September 26, 2024, and our system now handles a much higher volume of events than before, with no sacrifice in speed or reliability.</p><p>These cost savings are outstanding as they are, and they do not even include the total cost of ownership of those systems. We have significantly simplified our systems and our codebase, as the platform takes care of more for us. We’re paged less, we spend less time monitoring infrastructure, and we can focus on delivering product improvements.</p>
    <div>
      <h2>Conclusion</h2>
      <a href="#conclusion">
        
      </a>
    </div>
    <p>Migrating Baselime to Cloudflare has transformed how we build and scale our platform. With Workers, Durable Objects, Workers Analytics Engine, and other services, we now run a fully serverless, globally distributed system that’s more cost-efficient and agile. This shift has significantly reduced our operational overhead and enabled us to iterate faster, delivering better observability tooling to our users.</p><p>You can start observing your Cloudflare Workers today with <a href="https://developers.cloudflare.com/workers/observability/logs/workers-logs/"><u>Workers Logs</u></a>. Looking ahead, we’re excited about the features we will deliver directly in the Cloudflare Dashboard, including real-time error tracking, alerting, and a query builder for high-cardinality and high-dimensionality events, all coming by early 2025.</p> ]]></content:encoded>
            <category><![CDATA[Observability]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Performance]]></category>
            <guid isPermaLink="false">6heSTMT0I5jl0vpeR9TucD</guid>
            <dc:creator>Boris Tane</dc:creator>
        </item>
        <item>
            <title><![CDATA[Wrapping up another Birthday Week celebration]]></title>
            <link>https://blog.cloudflare.com/birthday-week-2024-wrap-up/</link>
            <pubDate>Mon, 30 Sep 2024 13:00:00 GMT</pubDate>
            <description><![CDATA[ Recapping all the big announcements made during 2024’s Birthday Week. ]]></description>
            <content:encoded><![CDATA[ <p>2024 marks Cloudflare’s 14th birthday. Birthday Week each year is packed with major announcements and the release of innovative new offerings, all focused on giving back to our customers and the broader Internet community. Birthday Week has become a proud tradition at Cloudflare and part of our culture, helping us not just stay true to our mission, but always stay close to our customers. We begin planning for this week of celebration earlier in the year and invite everyone at Cloudflare to participate.</p><p>Months before Birthday Week, we invited teams to submit ideas for what to announce. We were flooded with submissions, from proposals for implementing new standards to creating new products for developers. Our biggest challenge was finding space for it all in just one week — there is still so much to build. Good thing we have a birthday to celebrate each year, but we might need an extra day in Birthday Week next year!</p><p>In case you missed it, here’s everything we announced during 2024’s Birthday Week:</p>
    <div>
      <h3>Monday</h3>
      <a href="#monday">
        
      </a>
    </div>
    <div>
    <figure>
        <table>
            <colgroup>
                <col></col>
                <col></col>
            </colgroup>
            <tbody>
                <tr>
                    <td>
                        <p><span><span>What</span></span></p>
                    </td>
                    <td>
                        <p><span><span>In a sentence…</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/cloudflare-ai-audit-control-ai-content-crawlers"><span><span><u>Start auditing and controlling the AI models accessing your content</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Understand which AI-related bots and crawlers can access your website, and which content you choose to allow them to consume.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/batched-dns-changes/"><span><span><u>Making zone management more efficient with batch DNS record updates</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Customers using Cloudflare to manage DNS can create a whole batch of records, enable </span></span><a href="https://developers.cloudflare.com/dns/manage-dns-records/reference/proxied-dns-records/"><span><span>proxying</span></span></a><span><span> on many records, update many records to point to a new target at the same time, or even delete all of their records.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/turnstile-ephemeral-ids-for-fraud-detection"><span><span><u>Introducing Ephemeral IDs: a new tool for fraud detection</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Taking the next step in advancing security with Ephemeral IDs, a new feature that generates a unique short-lived ID, without relying on any network-level information.</span></span></p>
                        <p> </p>
                    </td>
                </tr>
            </tbody>
        </table>
    </figure>
</div>
    <div>
      <h3>Tuesday</h3>
      <a href="#tuesday">
        
      </a>
    </div>
    <div>
    <figure>
        <table>
            <colgroup>
                <col></col>
                <col></col>
            </colgroup>
            <tbody>
                <tr>
                    <td>
                        <p><span><span>What</span></span></p>
                    </td>
                    <td>
                        <p><span><span>In a sentence…</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/safer-resolver/"><span><span><u>Cloudflare partners to deliver safer browsing experience to homes</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Internet service, network, and hardware equipment providers can </span></span><a href="https://docs.google.com/spreadsheets/d/1ZIBbVz2gqPBsldhszk_Wo2eZeNwAZ5Mf9xSssxRrTuc/edit?resourcekey=&amp;gid=386353769#gid=386353769"><span><span><u>sign up</u></span></span></a><span><span> and partner with Cloudflare to deliver a safer browsing experience to homes.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/a-safer-internet-with-cloudflare/"><span><span><u>A safer Internet with Cloudflare: free threat intelligence, analytics, and new threat detections</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Free threat intelligence, analytics, new threat detections, and more.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/automatically-generating-cloudflares-terraform-provider/"><span><span><u>Automatically generating Cloudflare’s Terraform provider</u></span></span></a></p>
                        <p> </p>
                    </td>
                    <td>
                        <p><span><span>The last pieces of the OpenAPI schemas ecosystem to now be automatically generated — the Terraform provider and API reference documentation.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/key-transparency/"><span><span><u>Cloudflare helps verify the security of end-to-end encrypted messages by auditing key transparency for WhatsApp</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Cloudflare helps verify the security of end-to-end encrypted messages by auditing key transparency for WhatsApp.</span></span></p>
                    </td>
                </tr>
            </tbody>
        </table>
    </figure>
</div>
    <div>
      <h3>Wednesday</h3>
      <a href="#wednesday">
        
      </a>
    </div>
    <div>
    <figure>
        <table>
            <colgroup>
                <col></col>
                <col></col>
            </colgroup>
            <tbody>
                <tr>
                    <td>
                        <p><span><span>What</span></span></p>
                    </td>
                    <td>
                        <p><span><span>In a sentence…</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/introducing-speed-brain/"><span><span><u>Introducing Speed Brain: helping web pages load 45% faster</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Speed Brain, our latest leap forward in speed, uses the Speculation Rules API to prefetch content for users' likely next navigations — downloading web pages before they navigate to them and making pages load 45% faster.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/instant-purge/"><span><span><u>Instant Purge: invalidating cached content in under 150ms</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Instant Purge invalidates cached content in under 150ms, offering the industry's fastest cache purge with global latency for purges by tags, hostnames, and prefixes.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/new-standards/"><span><span><u>New standards for a faster and more private Internet</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Zstandard compression, Encrypted Client Hello, and more speed and privacy announcements all released for free.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/webrtc-turn-using-anycast/"><span><span><u>TURN and anycast: making peer connections work globally</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Starting today, </span></span><a href="https://developers.cloudflare.com/calls/turn/"><span><span>Cloudflare Calls’ TURN service</span></span></a><span><span> is now generally available to all Cloudflare accounts.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/gen-12-servers"><span><span><u>Cloudflare’s 12th Generation servers — 145% more performant and 63% more efficient</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Next generation servers focused on exceptional performance and security, enhanced support for AI/ML workloads, and significant strides in power efficiency.</span></span></p>
                        <p> </p>
                        <p><span><span> </span></span></p>
                    </td>
                </tr>
            </tbody>
        </table>
    </figure>
</div>
    <div>
      <h3>Thursday</h3>
      <a href="#thursday">
        
      </a>
    </div>
    <div>
    <figure>
        <table>
            <colgroup>
                <col></col>
                <col></col>
            </colgroup>
            <tbody>
                <tr>
                    <td>
                        <p><span><span>What</span></span></p>
                    </td>
                    <td>
                        <p><span><span>In a sentence…</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/startup-program-250k-credits"><span><span><u>Startup Program revamped: build and grow on Cloudflare with up to $250,000 in credits</u></span></span></a></p>
                        <p> </p>
                    </td>
                    <td>
                        <p><span><span>Eligible startups can now apply to receive up to $250,000 in credits to build using Cloudflare's Developer Platform.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/workers-ai-bigger-better-faster"><span><span><u>Cloudflare’s bigger, better, faster AI platform </u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>More powerful GPUs, expanded model support, enhanced logging and evaluations in AI Gateway, and Vectorize GA with larger index sizes and faster queries.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/builder-day-2024-announcements"><span><span><u>Builder Day 2024: 18 big updates to the Workers platform</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Persistent and queryable Workers logs, Node.js compatibility GA, improved Next.js support via OpenNext, built-in CI/CD for Workers, Gradual Deployments, Queues, and R2 Event Notifications GA, and more — making building on Cloudflare easier, faster, and more affordable.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/faster-workers-kv"><span><span><u>Faster Workers KV</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>A deep dive into how we made Workers KV up to 3x faster.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/sqlite-in-durable-objects"><span><span><u>Zero-latency SQLite storage in every Durable Object</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Putting your application code into the storage layer, so your code runs where the data is stored.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/making-workers-ai-faster/"><span><span><u>Making Workers AI faster and more efficient: Performance optimization with KV cache compression and speculative decoding</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Using new optimization techniques such as KV cache compression and speculative decoding, we’ve made large language model (LLM) inference lightning-fast on the Cloudflare Workers AI platform.</span></span></p>
                    </td>
                </tr>
            </tbody>
        </table>
    </figure>
</div>
    <div>
      <h3>Friday</h3>
      <a href="#friday">
        
      </a>
    </div>
    <div>
    <figure>
        <table>
            <colgroup>
                <col></col>
                <col></col>
            </colgroup>
            <tbody>
                <tr>
                    <td>
                        <p><span><span>What</span></span></p>
                    </td>
                    <td>
                        <p><span><span>In a sentence…</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/container-platform-preview"><span><span><u>Our container platform is in production. It has GPUs. Here’s an early look.</u></span></span></a></p>
                        <p> </p>
                    </td>
                    <td>
                        <p><span><span>We’ve been working on something new — a platform for running containers across Cloudflare’s network. We already use it in production, for AI inference and more.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/cisa-pledge-commitment-bug-bounty-vip"><span><span><u>Advancing cybersecurity: Cloudflare implements a new bug bounty VIP program as part of CISA Pledge commitment</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>We implemented a new bug bounty VIP program this year as part of our CISA Pledge commitment.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/launchpad-cohort4-dev-starter-pack/"><span><span><u>Empowering builders: introducing the Dev Alliance and Workers Launchpad Cohort #4</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Get free and discounted access to essential developer tools and meet the latest set of incredible startups building on Cloudflare.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/expanding-our-support-for-oss-projects-with-project-alexandria"><span><span><u>Expanding our support for open source projects with Project Alexandria</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Expanding our open source program and helping projects have a sustainable and scalable future, providing tools and protection needed to thrive.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/radar-data-explorer-ai-assistant"><span><span><u>Network trends and natural language: Cloudflare Radar’s new Data Explorer &amp; AI Assistant</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>A simple Web-based interface to build more complex API queries, including comparisons and filters, and visualize the results.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/bringing-ai-to-cloudflare"><span><span><u>AI Everywhere with the WAF Rule Builder Assistant, Cloudflare Radar AI Insights, and updated AI bot protection</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Extending our AI Assistant capabilities to help you build new WAF rules, added new AI bot and crawler traffic insights to Radar, and new AI bot blocking capabilities.</span></span></p>
                    </td>
                </tr>
                <tr>
                    <td>
                        <p><a href="https://blog.cloudflare.com/cloudflares-commitment-to-free"><span><span><u>Reaffirming our commitment to Free</u></span></span></a></p>
                    </td>
                    <td>
                        <p><span><span>Our free plan is here to stay, and we reaffirm that commitment this week with 15 releases that make the Free plan even better.</span></span></p>
                        <p> </p>
                    </td>
                </tr>
            </tbody>
        </table>
    </figure>
</div>
    <div>
      <h2>One more thing…</h2>
      <a href="#one-more-thing">
        
      </a>
    </div>
    
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5FReOqd5AHo8vTgSmY6qe6/1ae02d93ec9d9af2f60c0b6024017f58/image3.png" />
          </figure><p>Cloudflare serves millions of customers and their millions of domains across nearly every country on Earth. However, as a global company, we face a complex payment landscape — especially in regions outside of North America. While credit cards are very popular for online purchases in the US, the global picture is quite different. <a href="https://www.fisglobal.com/-/media/fisglobal/files/campaigns/global-payments%20report/FIS_TheGlobalPaymentsReport_2023.pdf"><u>60% of consumers across EMEA, APAC and LATAM choose alternative payment methods</u></a>. For instance, European consumers often opt for SEPA Direct Debit, a bank transfer mechanism, while Chinese consumers frequently use Alipay, a digital wallet.</p><p>At Cloudflare, we saw this as an opportunity to meet customers where they are. Today, we're thrilled to announce that we are expanding our payment system and launching a closed beta for a new payment method called <a href="https://www.cloudflare.com/lp/cloudflare-introduces-stripe-link/"><u>Stripe Link</u></a>. The checkout experience will be faster and more seamless, allowing our self-serve customers to pay using saved bank accounts or cards with Link. Customers who have saved their payment details at any business using Link can quickly check out without having to reenter their payment information.</p><p>These are the first steps in our efforts to expand our payment system to support global payment methods used by customers around the world. We'll be rolling out new payment methods gradually, ensuring a smooth integration and gathering feedback from our customers every step of the way.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/v0v7QBRWeGSfArq6jE5eg/7d8d79cbfe3f63386db52469c4727d21/image2.png" />
          </figure>
    <div>
      <h2>Until next year</h2>
      <a href="#until-next-year">
        
      </a>
    </div>
    <p>That’s all for Birthday Week 2024. However, the innovation never stops at Cloudflare. Continue to follow the <a href="https://blog.cloudflare.com/"><u>Cloudflare Blog</u></a> all year long as we launch more products and features that help build a better Internet.</p> ]]></content:encoded>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Partners]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Workers Launchpad]]></category>
            <category><![CDATA[Workers AI]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Turnstile]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Security]]></category>
            <category><![CDATA[Cache]]></category>
            <category><![CDATA[Speed]]></category>
            <category><![CDATA[Speed Brain]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[AI]]></category>
            <guid isPermaLink="false">65JnLP0MYKVzwTyOsItRJk</guid>
            <dc:creator>Kelly May Johnston</dc:creator>
            <dc:creator>Brendan Irvine-Broque</dc:creator>
        </item>
        <item>
            <title><![CDATA[We made Workers KV up to 3x faster — here’s the data]]></title>
            <link>https://blog.cloudflare.com/faster-workers-kv/</link>
            <pubDate>Thu, 26 Sep 2024 13:00:00 GMT</pubDate>
            <description><![CDATA[ Speed is a critical factor that dictates Internet behavior. Every additional millisecond a user spends waiting for your web page to load increases the chance that they abandon your website.  ]]></description>
            <content:encoded><![CDATA[ <p>Speed is a critical factor that dictates Internet behavior. Every additional millisecond a user spends waiting for your web page to load increases the chance that they abandon your website. The old adage remains as true as ever: <a href="https://www.cloudflare.com/learning/performance/more/website-performance-conversion-rates/"><u>faster websites result in higher conversion rates</u></a>. And with such outcomes tied to Internet speed, we believe a faster Internet is a better Internet.</p><p>Customers often use <a href="https://developers.cloudflare.com/kv/"><u>Workers KV</u></a> to provide <a href="https://developers.cloudflare.com/workers/"><u>Workers</u></a> with key-value data for configuration, routing, personalization, experimentation, or serving assets. Many of Cloudflare’s own products rely on KV for just this purpose: <a href="https://developers.cloudflare.com/pages"><u>Pages</u></a> stores static assets, <a href="https://developers.cloudflare.com/cloudflare-one/policies/access/"><u>Access</u></a> stores authentication credentials, <a href="https://developers.cloudflare.com/ai-gateway/"><u>AI Gateway</u></a> stores routing configuration, and <a href="https://developers.cloudflare.com/images/"><u>Images</u></a> stores configuration and assets, among others. So KV’s speed affects the latency of every request to an application, throughout the entire lifecycle of a user session. </p><p>Today, we’re announcing up to 3x faster KV hot reads, with all KV operations faster by up to 20ms. And we want to pull back the curtain and show you how we did it. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/67VzWOTRpMd9Dbc6ZM7M4M/cefb1d856344d9c963d4adfbe1b32fba/BLOG-2518_2.png" />
          </figure><p><sup><i>Workers KV read latency (ms) by percentile measured from Pages</i></sup></p>
    <div>
      <h2>Optimizing Workers KV’s architecture to minimize latency</h2>
      <a href="#optimizing-workers-kvs-architecture-to-minimize-latency">
        
      </a>
    </div>
    <p>At a high level, Workers KV is itself a Worker that makes requests to central storage backends, with many layers in between to properly cache and route requests across Cloudflare’s network. You can rely on Workers KV to support operations made by your Workers at any scale, and KV’s architecture will seamlessly handle your required throughput. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3pcw5pO2eeGJ1RriESJFSB/651fe26718f981eb741ad2ffb2f288c9/BLOG-2518_3.png" />
          </figure><p><sup><i>Sequence diagram of a Workers KV operation</i></sup></p><p>When your Worker makes a read operation to Workers KV, your Worker establishes a network connection within its Cloudflare region to KV’s Worker. The KV Worker then accesses the <a href="https://developers.cloudflare.com/workers/runtime-apis/cache/"><u>Cache API</u></a>, and in the event of a cache miss, retrieves the value from the storage backends. </p><p>Let’s look one level deeper at a simplified trace: </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7mCpF8NRgSg70p8VNTFXu8/4cabdae18285575891f49a5cd34c9ab8/BLOG-2518_4.png" />
          </figure><p><sup><i>Simplified trace of a Workers KV operation</i></sup></p><p>From the top, here are the operations completed for a KV read operation from your Worker:</p><ol><li><p>Your Worker makes a connection to Cloudflare’s network in the same data center. This incurs ~5 ms of network latency.</p></li><li><p>Upon entering Cloudflare’s network, a service called <a href="https://blog.cloudflare.com/upgrading-one-of-the-oldest-components-in-cloudflare-software-stack/"><u>Front Line (FL)</u></a> is used to process the request. This incurs ~10 ms of operational latency.</p></li><li><p>FL proxies the request to the KV Worker. The KV Worker does a cache lookup for the key being accessed. This, once again, passes through the Front Line layer, incurring an additional ~10 ms of operational latency.</p></li><li><p>Cache is stored in various backends within each region of Cloudflare’s network. A service built upon <a href="https://blog.cloudflare.com/pingora-open-source/"><u>Pingora</u></a>, our open-sourced Rust framework for proxying HTTP requests, routes the cache lookup to the proper cache backend.</p></li><li><p>Finally, if the cache lookup is successful, the KV read operation is resolved. Otherwise, the request reaches our storage backends, where it gets its value.</p></li></ol><p>Looking at these flame graphs, it became apparent that a major opportunity presented itself to us: reducing the FL overhead (or eliminating it altogether) and reducing the cache misses across the Cloudflare network would reduce the latency for KV operations.</p>
    <div>
      <h3>Bypassing FL layers between Workers and services to save ~20ms</h3>
      <a href="#bypassing-fl-layers-between-workers-and-services-to-save-20ms">
        
      </a>
    </div>
    <p>A request from your Worker to KV doesn’t need to go through FL. Much of FL’s responsibility is to process and route requests from outside of Cloudflare — that’s more than is needed to handle a request from the KV binding to the KV Worker. So we skipped the Front Line altogether in both layers.</p><div>
  
</div>
<p><sup><i>Reducing latency in a Workers KV operation by removing FL layers</i></sup></p><p>To bypass the FL layer from the KV binding in your Worker, we modified the KV binding to connect directly to the KV Worker within the same Cloudflare location. Within the Workers host, we configured a C++ subpipeline to allow code from bindings to establish a direct connection with the proper routing configuration and authorization loaded. </p><p>The KV Worker also passes through the FL layer on its way to our internal <a href="https://blog.cloudflare.com/pingora-open-source/"><u>Pingora</u></a> service. In this case, we were able to use an internal Worker binding that allows Workers for Cloudflare services to bind directly to non-Worker services within Cloudflare’s network. With this fix, the KV Worker sets the proper cache control headers and establishes its connection to Pingora without leaving the network. </p><p>Together, both of these changes reduced latency by ~20 ms for every KV operation. </p>
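<p>Plugging the post's rough per-hop numbers into a toy model makes the savings concrete. This is purely illustrative arithmetic — the hop names and values come from the traces above, not from real instrumentation:</p>

```typescript
// Illustrative latency budget for a KV read, using the approximate per-hop
// numbers from the traces above (all values are rough estimates, in ms).
type Hop = { name: string; ms: number };

const oldPath: Hop[] = [
  { name: "Worker -> Cloudflare network", ms: 5 },
  { name: "FL (Worker -> KV Worker)", ms: 10 },
  { name: "FL (KV Worker -> cache/Pingora)", ms: 10 },
];

const newPath: Hop[] = [
  { name: "Worker -> Cloudflare network", ms: 5 },
  { name: "direct binding (Worker -> KV Worker)", ms: 0 },
  { name: "internal binding (KV Worker -> Pingora)", ms: 0 },
];

const totalMs = (hops: Hop[]) => hops.reduce((sum, h) => sum + h.ms, 0);

console.log(totalMs(oldPath) - totalMs(newPath)); // ~20 ms saved per operation
```

<p>Since the ~5 ms network hop from your Worker remains in both paths, the ~20 ms saved comes entirely from eliminating the two FL traversals.</p>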
    <div>
      <h3>Implementing tiered cache to minimize requests to storage backends</h3>
      <a href="#implementing-tiered-cache-to-minimize-requests-to-storage-backends">
        
      </a>
    </div>
    <p>We also optimized KV’s architecture to reduce the number of requests that need to reach our centralized storage backends. These storage backends are further away and incur network latency, so improving the cache hit rate in regions close to your Workers significantly improves read latency.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1791aSxXPH1lgOIr3RQrLz/1685f6a57d627194e76cb657cd22ddd8/BLOG-2518_5.png" />
          </figure><p><sup><i>Workers KV uses Tiered Cache to resolve operations closer to your users</i></sup></p><p>To accomplish this, we used <a href="https://developers.cloudflare.com/cache/how-to/tiered-cache/#custom-tiered-cache"><u>Tiered Cache</u></a>, and implemented a cache topology that is fine-tuned to the usage patterns of KV. With a tiered cache, requests to KV’s storage backends are cached in regional tiers in addition to local (lower) tiers. With this architecture, KV operations that may be cache misses locally may be resolved regionally, which is especially significant if you have traffic across an entire region spanning multiple Cloudflare data centers. </p><p>This significantly reduced the number of requests that needed to hit the storage backends, with ~30% of requests resolved in tiered cache rather than at the storage backends.</p>
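<p>The lookup order can be sketched as a small two-tier cache. Everything here is illustrative — the class and method names are invented, and the real tiers are distributed services, not in-process maps — but it shows why a regional tier absorbs local misses:</p>

```typescript
// Minimal sketch of a two-tier cache lookup: a local-tier miss is retried in
// a regional (upper) tier before falling back to the central storage backend.
class TieredCache {
  constructor(
    private local = new Map<string, string>(),
    private regional = new Map<string, string>(),
  ) {}

  read(
    key: string,
    fetchFromStorage: (k: string) => string,
  ): { value: string; resolvedAt: "local" | "regional" | "storage" } {
    const localHit = this.local.get(key);
    if (localHit !== undefined) return { value: localHit, resolvedAt: "local" };

    const regionalHit = this.regional.get(key);
    if (regionalHit !== undefined) {
      this.local.set(key, regionalHit); // fill the lower tier on the way back
      return { value: regionalHit, resolvedAt: "regional" };
    }

    const value = fetchFromStorage(key); // slowest path: central storage
    this.regional.set(key, value);
    this.local.set(key, value);
    return { value, resolvedAt: "storage" };
  }
}
```

<p>The key property is that a miss only pays the full round trip to storage once per region, not once per data center.</p>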
    <div>
      <h2>KV’s new architecture</h2>
      <a href="#kvs-new-architecture">
        
      </a>
    </div>
    <p>As a result of these optimizations, KV operations are now simplified:</p><ol><li><p>When you read from KV in your Worker, the <a href="https://developers.cloudflare.com/kv/concepts/kv-bindings/"><u>KV binding</u></a> binds directly to KV’s Worker, saving 10 ms. </p></li><li><p>The KV Worker binds directly to the Tiered Cache service, saving another 10 ms. </p></li><li><p>Tiered Cache is used in front of storage backends, to resolve local cache misses regionally, closer to your users.</p></li></ol>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2cW0MsOH120GKUAlIUvDR4/7f574632ee81d3b955ed99bf87d86fa2/BLOG-2518_6.png" />
          </figure><p><sup><i>Sequence diagram of KV operations with new architecture</i></sup></p><p>In aggregate, these changes significantly reduced KV’s latency. 

The impact of the direct binding to cache is clearly seen in the wall time of the KV Worker, given this value measures the duration of a retrieval of a key-value pair from cache. The 90th percentile of all KV Worker invocations now resolve in less than 12 ms — before the direct binding to cache, that was 22 ms. That’s a 10 ms decrease in latency. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1UmcRB0Afk8mHig2DrThsh/d913cd33a28c1c2b093379238a90551c/BLOG-2518_7.png" />
          </figure><p><sup><i>Wall time (ms) within the KV Worker by percentile</i></sup></p><p>These KV read operations resolve quickly because the data is cached locally in the same Cloudflare location. But what about reads that aren’t resolved locally? ~30% of these resolve regionally within the tiered cache. Reads from tiered cache are up to 100 ms faster than when resolved at central storage backends, once again contributing to making KV reads faster in aggregate.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1Gz2IxlcNuDDRp3MhC4m40/ee364b710cec4332a5c307a784f34fa4/BLOG-2518_8.png" />
          </figure><p><sup><i>Wall time (ms) within the KV Worker for tiered cache vs. storage backends reads</i></sup></p><p>These graphs demonstrate the impact of direct binding from the KV binding to cache, and tiered cache. To see the impact of the direct binding from a Worker to the KV Worker, we need to look at the latencies reported by Cloudflare products that use KV.</p><p><a href="https://developers.cloudflare.com/pages/"><u>Cloudflare Pages</u></a>, which serves static assets like HTML, CSS, and scripts from KV, saw load times for fetching assets improve by up to 68%. Workers asset hosting, which we also announced as part of today’s <a href="http://blog.cloudflare.com/builder-day-2024-announcements"><u>Builder Day announcements</u></a>, gets this improved performance from day 1.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/67VzWOTRpMd9Dbc6ZM7M4M/cefb1d856344d9c963d4adfbe1b32fba/BLOG-2518_2.png" />
          </figure><p><sup><i>Workers KV read operation latency measured within Cloudflare Pages by percentile</i></sup></p><p><a href="https://developers.cloudflare.com/queues/"><u>Queues</u></a> and <a href="https://developers.cloudflare.com/cloudflare-one/applications/"><u>Access</u></a> also saw their latencies for KV operations drop, with their KV read operations now 2-5x faster. These services rely on Workers KV data for configuration and routing data, so KV’s performance improvement directly contributes to making them faster on each request. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1Gz2IxlcNuDDRp3MhC4m40/ee364b710cec4332a5c307a784f34fa4/BLOG-2518_8.png" />
          </figure><p><sup><i>Workers KV read operation latency measured within Cloudflare Queues by percentile</i></sup></p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1HFapaO1Gv09g9VlODrLAu/56d39207fb3dfefe258fa5e8cb8bd67b/BLOG-2518_10.png" />
          </figure><p><sup><i>Workers KV read operation latency measured within Cloudflare Access by percentile</i></sup></p><p>These are just some of the direct effects that a faster KV has had on other services. Across the board, requests are resolving faster thanks to KV’s faster response times. </p><p>And we have one more thing to make KV lightning fast. </p>
    <div>
      <h3>Optimizing KV’s hottest keys with an in-memory cache </h3>
      <a href="#optimizing-kvs-hottest-keys-with-an-in-memory-cache">
        
      </a>
    </div>
    <p>Less than 0.03% of keys account for nearly half of requests to the Workers KV service across all namespaces. These keys are read thousands of times per second, so making these faster has a disproportionate impact. Could these keys be resolved within the KV Worker without needing additional network hops?</p><p>Almost all of these keys are under 100 KB. At this size, it becomes possible to use the in-memory cache of the KV Worker — a limited amount of memory within the <a href="https://developers.cloudflare.com/workers/reference/how-workers-works/#isolates"><u>main runtime process</u></a> of a Worker sandbox. And that’s exactly what we did. For the highest throughput keys across Workers KV, reads resolve without even needing to leave the Worker runtime process.</p><div>
  
</div>
<p><sup><i>Sequence diagram of KV operations with the hottest keys resolved within an in-memory cache</i></sup></p><p>As a result of these changes, KV reads for these keys, which represent over 40% of Workers KV requests globally, resolve in under a millisecond. We’re actively testing these changes internally and expect to roll this out during October to speed up the hottest key-value pairs on Workers KV.</p>
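<p>A minimal sketch of such a hot-key tier, with invented names and limits (the post only says values are kept under ~100 KB in a limited amount of runtime memory), might look like this:</p>

```typescript
// Sketch of a size-capped in-memory cache for hot keys. The limits below are
// illustrative assumptions, not Workers KV's actual configuration. Oldest
// entries are evicted first, relying on Map's insertion-order iteration.
const MAX_VALUE_BYTES = 100 * 1024;
const MAX_ENTRIES = 1024;

class HotKeyCache {
  private entries = new Map<string, string>();

  get(key: string): string | undefined {
    return this.entries.get(key);
  }

  put(key: string, value: string): boolean {
    if (new TextEncoder().encode(value).length > MAX_VALUE_BYTES) {
      return false; // too large for the in-memory tier; fall back to cache/storage
    }
    if (this.entries.size >= MAX_ENTRIES && !this.entries.has(key)) {
      // evict the oldest entry (first key in insertion order)
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, value);
    return true;
  }
}
```

<p>A hit here returns without any network hop at all, which is what lets reads for the hottest keys resolve in under a millisecond.</p>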
    <div>
      <h2>A faster KV for all</h2>
      <a href="#a-faster-kv-for-all">
        
      </a>
    </div>
    <p>Most of these speed gains are already enabled with no additional action needed from customers. Your websites that are using KV are already responding to requests faster for your users, as are the other Cloudflare services using KV under the hood and the countless websites that depend upon them. </p><p>And we’re not done: we’ll continue to chase performance throughout our stack to make your websites faster. That’s how we’re going to move the needle towards a faster Internet. </p><p>To see Workers KV’s recent speed gains for your own KV namespaces, head over to your dashboard and check out the <a href="https://developers.cloudflare.com/kv/observability/metrics-analytics/"><u>new KV analytics</u></a>, with latency and cache status detailed per namespace.</p> ]]></content:encoded>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Cloudflare Workers KV]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Performance]]></category>
            <guid isPermaLink="false">76i5gcdv0wbMNnwa7E17MR</guid>
            <dc:creator>Thomas Gauvin</dc:creator>
            <dc:creator>Rob Sutter</dc:creator>
            <dc:creator>Andrew Plunk</dc:creator>
        </item>
        <item>
            <title><![CDATA[Instant Purge: invalidating cached content in under 150ms]]></title>
            <link>https://blog.cloudflare.com/instant-purge/</link>
            <pubDate>Tue, 24 Sep 2024 23:00:00 GMT</pubDate>
            <description><![CDATA[ We’ve built the fastest cache purge in the industry by offering a global purge latency for purge by tags, hostnames, and prefixes of less than 150ms on average (P50), representing a 90% improvement.  ]]></description>
            <content:encoded><![CDATA[ <p><sup>(part 3 of the Coreless Purge </sup><a href="https://blog.cloudflare.com/rethinking-cache-purge-architecture/"><sup>series</sup></a><sup>)</sup></p><p>Over the past 14 years, Cloudflare has evolved far beyond a Content Delivery Network (CDN), expanding its offerings to include a comprehensive <a href="https://developers.cloudflare.com/cloudflare-one/"><u>Zero Trust</u></a> security portfolio, network security &amp; performance <a href="https://www.cloudflare.com/network-services/products/"><u>services</u></a>, application security &amp; performance <a href="https://www.cloudflare.com/application-services/products/"><u>optimizations</u></a>, and a powerful <a href="https://www.cloudflare.com/developer-platform/products/"><u>developer platform</u></a>. But customers also continue to rely on Cloudflare for caching and delivering static website content. CDNs are often judged on their ability to return content to visitors as quickly as possible. However, the speed at which content is removed from a CDN's global cache is just as crucial.</p><p>When customers frequently update content such as news, scores, or other data, it is essential they <a href="https://www.cloudflare.com/learning/cdn/common-cdn-issues/">avoid serving stale, out-of-date information</a> from cache to visitors. This can lead to a <a href="https://www.cloudflare.com/learning/cdn/common-cdn-issues/">subpar experience</a> where users might see invalid prices or incorrect news. The goal is to remove the stale content and cache the new version of the file on the CDN as quickly as possible. And that starts by issuing a “purge.”</p><p>In May 2022, we released the <a href="https://blog.cloudflare.com/part1-coreless-purge/"><u>first part</u></a> of the series detailing our efforts to rebuild and publicly document the steps taken to improve the system our customers use to purge their cached content. 
Our goal was to increase scalability and, importantly, the speed of our customers’ purges. In that initial post, we explained how our purge system worked and the design constraints we found when scaling. We outlined how, after more than a decade, we had outgrown our purge system and started building an entirely new purge system, and provided purge performance benchmarking that users experienced at the time. We set ourselves a lofty goal: to be the fastest.</p><p><b>Today, we’re excited to share that we’ve built the fastest cache purge in the industry.</b> We now offer a global purge latency for purge by tags, hostnames, and prefixes of less than 150ms on average (P50), representing a 90% improvement since May 2022. Users can now purge from anywhere, (almost) <i>instantly</i>. By the time you hit enter on a purge request and your eyes blink, the file is now removed from our global network — including data centers in <a href="https://www.cloudflare.com/network/"><u>330 cities</u></a> and <a href="https://blog.cloudflare.com/backbone2024/"><u>120+ countries</u></a>.</p><p>But that’s not all. It wouldn’t be Birthday Week if we stopped at just being the fastest purge. We are <b><i>also</i></b> announcing that we’re opening up more purge options to Free, Pro, and Business plans. Historically, only Enterprise customers had access to the full arsenal of <a href="https://developers.cloudflare.com/cache/how-to/purge-cache/"><u>cache purge methods</u></a> supported by Cloudflare, such as purge by cache-tags, hostnames, and URL prefixes. As part of rebuilding our purge infrastructure, we’re not only fast, but also able to scale well beyond our current capacity. This enables more customers to use different types of purge. We are excited to offer these new capabilities to all plan types once we finish rolling out our new purge infrastructure, and expect to begin offering additional purge capabilities in early 2025. </p>
    <div>
      <h3>Why cache and purge? </h3>
      <a href="#why-cache-and-purge">
        
      </a>
    </div>
    <p>Caching content is like pulling off a spectacular magic trick. It makes loading website content lightning-fast for visitors, slashes the load on origin servers and the cost to operate them, and enables global scalability with a single button press. But here's the catch: for the magic to work, caching requires predicting the future. The right content needs to be cached in the right data center, at the right moment when requests arrive, and in the ideal format. This guarantees astonishing performance for visitors and game-changing scalability for web properties.</p><p>Cloudflare helps make this caching magic trick easy. But regular users of our cache know that getting content into cache is only part of what makes it useful. When content is updated on an origin, it must also be updated in the cache. The beauty of caching is that it holds content until it expires or is evicted. To update the content, it must be actively removed and updated across the globe quickly and completely. If data centers are not uniformly updated or are updated at drastically different times, visitors risk getting different data depending on where they are located. This is where cache “purging” (also known as “cache invalidation”) comes in.</p>
    <div>
      <h3>One-to-many purges on Cloudflare</h3>
      <a href="#one-to-many-purges-on-cloudflare">
        
      </a>
    </div>
    <p>Back in <a href="https://blog.cloudflare.com/rethinking-cache-purge-architecture/"><u>part 2 of the blog series</u></a>, we touched on how there are multiple ways of purging cache: by URL, cache-tag, hostname, URL prefix, and “purge everything”, and discussed a necessary distinction between purging by URL and the other four kinds of purge — referred to as flexible purges — based on the scope of their impact.</p><blockquote><p><i>The reason flexible purge isn’t also fully coreless yet is because it’s a more complex task than “purge this object”; flexible purge requests can end up purging multiple objects – or even entire zones – from cache. They do this through an entirely different process that isn’t coreless compatible, so to make flexible purge fully coreless we would have needed to come up with an entirely new multi-purge mechanism on top of redesigning distribution. We chose instead to start with just purge by URL, so we could focus purely on the most impactful improvements, revamping distribution, without reworking the logic a data center uses to actually remove an object from cache.</i></p></blockquote><p>We said our next steps included a redesign of flexible purges at Cloudflare, and today we’d like to walk you through the resulting system. But first, a brief history of flexible cache purges at Cloudflare and elaboration on why the old flexible purge system wasn’t “coreless compatible”.</p>
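<p>For reference, each purge type maps to a different body on the same API endpoint. The sketch below follows the shape of Cloudflare's documented purge API, but treat the exact field names as something to confirm against the API reference:</p>

```typescript
// Sketch of building a cache purge API call. The endpoint path and field
// names follow Cloudflare's documented purge API; verify against the current
// API reference before depending on them.
type PurgeBody =
  | { purge_everything: true }
  | { files: string[] }     // purge by URL
  | { tags: string[] }      // purge by cache-tag (flexible)
  | { hosts: string[] }     // purge by hostname (flexible)
  | { prefixes: string[] }; // purge by URL prefix (flexible)

function buildPurgeRequest(
  zoneId: string,
  body: PurgeBody,
): { url: string; init: { method: string; headers: Record<string, string>; body: string } } {
  return {
    url: `https://api.cloudflare.com/client/v4/zones/${zoneId}/purge_cache`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}
```

<p>In practice the request also needs an Authorization header carrying an API token, omitted here.</p>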
    <div>
      <h3>Just in time</h3>
      <a href="#just-in-time">
        
      </a>
    </div>
    <p>“Cache” within a given data center is made up of many machines, all contributing disk space to store customer content. When a request comes in for an asset, the URL and headers are used to calculate a <a href="https://developers.cloudflare.com/cache/how-to/cache-keys/"><u>cache key</u></a>, which is the filename for that content on disk and also determines which machine in the datacenter that file lives on. The filename is the same for every data center, and every data center knows how to use it to find the right machine to cache the content. A <a href="https://developers.cloudflare.com/cache/how-to/purge-cache/"><u>purge request</u></a> for a URL (plus headers) therefore contains everything needed to generate the cache key — the pointer to the response object on disk — and getting that key to every data center is the hardest part of carrying out the purge.</p><p>Purging content based on response properties has a different hardest part. If a customer wants to purge all content with the cache-tag “foo”, for example, there’s no way for us to generate all the cache keys that will point to the files with that cache-tag at request time. Cache-tags are response headers, and the decision of where to store a file is based on request attributes only. To find all files with matching cache-tags, we would need to look at every file in every cache disk on every machine in every data center. That’s thousands upon thousands of machines we would be scanning for each purge-by-tag request. There are ways to avoid actually continuously scanning all disks worldwide (foreshadowing!) but for our first implementation of our flexible purge system, we hoped to avoid the problem space altogether.</p>
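<p>This asymmetry is easy to see in a small sketch. The hash (FNV-1a) and the modulo placement below are illustrative stand-ins, not Cloudflare's actual scheme; the point is that the cache key is computed from request attributes alone, so a purge-by-URL request can recompute it in any data center:</p>

```typescript
// Illustrative cache-key computation and machine placement. Because the key
// depends only on request attributes (URL + headers), every data center that
// receives the purge can derive the same key and find the same machine.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

function cacheKey(url: string, headers: Record<string, string>): string {
  // Sort header parts so the key is deterministic regardless of header order.
  const parts = Object.entries(headers)
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([k, v]) => `${k}=${v}`);
  return `${url}|${parts.join("|")}`;
}

function machineFor(key: string, machineCount: number): number {
  return fnv1a(key) % machineCount;
}
```

<p>Response properties like cache-tags never enter this computation, which is exactly why purge-by-tag can't simply recompute a key and has to take a different path.</p>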
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7n56ZDJwdBbaTNPJII6s2S/db998973efdca121536a932bc50dd842/image5.png" />
            
            </figure><p>An alternative approach to going to every machine and looking for all files that match some criteria to actively delete from disk was something we affectionately referred to as “lazy purge”. Instead of deleting all matching files as soon as we process a purge request, we wait to do so when we get an end user request for one of those files. Whenever a request comes in, and we have the file in cache, we can compare the timestamp of any recent purge requests from the file owner to the insertion timestamp of the file we have on disk. If the purge timestamp is fresher than the insertion timestamp, we pretend we didn’t find the file on disk. For this to work, we needed to keep track of purge requests going back further than a data center’s maximum cache eviction age to be sure that any file a customer sends a matching flex purge to clear from cache will either be <a href="https://developers.cloudflare.com/cache/concepts/retention-vs-freshness/#retention"><u>naturally evicted</u></a>, or forced to cache MISS and get refreshed from the origin. With this approach, we just needed a distribution and storage system for keeping track of flexible purges.</p>
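<p>The lazy-purge check itself reduces to a timestamp comparison. This sketch uses invented types and cache-tags as the matching criterion; the same idea applies to hostnames and prefixes:</p>

```typescript
// Sketch of the "lazy purge" check: on each request, a cached file is treated
// as a miss if any matching purge is newer than the file's insertion time.
// Types and field names are illustrative, not Cloudflare's internal schema.
interface CachedFile {
  insertedAt: number; // ms since epoch when the file entered cache
  cacheTags: string[];
}

interface FlexPurge {
  tag: string;
  issuedAt: number; // ms since epoch when the purge was requested
}

function isServable(file: CachedFile, purgeHistory: FlexPurge[]): boolean {
  // Serve from cache only if no matching purge postdates the cached copy.
  return !purgeHistory.some(
    (p) => file.cacheTags.includes(p.tag) && p.issuedAt > file.insertedAt,
  );
}
```

<p>As the text notes, this only works if the purge history retained per data center reaches back at least as far as the maximum cache eviction age.</p>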
    <div>
      <h3>Purge looks a lot like a nail</h3>
      <a href="#purge-looks-a-lot-like-a-nail">
        
      </a>
    </div>
    <p>At Cloudflare there is a lot of configuration data that needs to go “everywhere”: cache configuration, load balancer settings, firewall rules, host metadata — countless products, features, and services that depend on configuration data that’s managed through Cloudflare’s control plane APIs. This data needs to be accessible by every machine in every datacenter in our network. The vast majority of that data is distributed via <a href="https://blog.cloudflare.com/introducing-quicksilver-configuration-distribution-at-internet-scale/"><u>a system introduced several years ago called Quicksilver</u></a>. The system works <i>very, very well</i> (sub-second p99 replication lag, globally). It’s extremely flexible and reliable, and reads are lightning fast. The team responsible for the system has done such a good job that Quicksilver has become a hammer that when wielded, makes everything look like a nail… like flexible purges.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/BFntOlvapsYYjYTcxMs7o/d73ad47b63b9d4893b46aeddc28a8698/image2.png" />
            
            </figure><p><sup><i>Core-based purge request entering a data center and getting backhauled to a core data center where Quicksilver distributes the request to all network data centers (hub and spoke).</i></sup><sup> </sup></p><p>Our first version of the flexible purge system used Quicksilver’s spoke-hub distribution to send purges from a core data center to every other data center in our network. It took less than a second for flexible purges to propagate, and once in a given data center, the purge key lookups in the hot path to force cache misses were in the low hundreds of microseconds. We were quite happy with this system at the time, especially because of the simplicity. Using well-supported internal infrastructure meant we weren’t having to manage database clusters or worry about transport between data centers ourselves, since we got that “for free”. Flexible purge was a new feature set and the performance seemed pretty good, especially since we had no predecessor to compare against.</p>
    <div>
      <h3>Victims of our own success</h3>
      <a href="#victims-of-our-own-success">
        
      </a>
    </div>
    <p>Our first version of flexible purge didn’t start showing cracks for years, but eventually both our network and our customer base grew large enough that our system was reaching the limits of what it could scale to. As mentioned above, we needed to store purge requests beyond our maximum eviction age. Purge requests are relatively small, and compress well, but thousands of customers using the API millions of times a day adds up to quite a bit of storage that Quicksilver needed on each machine to maintain purge history, and all of that storage cut into disk space we could otherwise be using to cache customer content. We also found the limits of Quicksilver in terms of how many writes per second it could handle without replication slowing down. We bought ourselves more runway by putting <a href="https://www.boltic.io/blog/kafka-queue#:~:text=Apache%20Kafka%20Queues%3F-,Apache%20Kafka%20queues,-are%20a%20powerful"><u>Kafka queues</u></a> in front of Quicksilver to buffer and throttle ourselves to even out traffic spikes, and increased batching, but all of those protections introduced latency. We knew we needed to come up with a solution without such a strong correlation between usage and operational costs.</p><p>Another pain point exposed by our growing user base that we mentioned in <a href="https://blog.cloudflare.com/rethinking-cache-purge-architecture/"><u>Part 2</u></a> was the excessive round trip times experienced by customers furthest away from our core data centers. A purge request sent by a customer in Australia would have to cross the Pacific Ocean and back before local customers would see the new content.</p><p>To summarize, three issues were plaguing us:</p><ol><li><p>Latency corresponding to how far a customer was from the centralized ingest point.</p></li><li><p>Latency due to the bottleneck for writes at the centralized ingest point.</p></li><li><p>Storage needs in all data centers correlating strongly with throughput demand.</p></li></ol>
    <div>
      <h3>Coreless purge proves useful</h3>
      <a href="#coreless-purge-proves-useful">
        
      </a>
    </div>
    <p>The first two issues affected all types of purge. The spoke-hub distribution model was just as problematic for purge-by-URL as it was for flexible purges. So we embarked on the path to peer-to-peer distribution for purge-by-URL to address the latency and throughput issues, and the results of that project were good enough that we wanted to propagate flexible purges through the same system. But doing so meant we’d have to replace our use of Quicksilver; it was so good at what it did (fast, reliable replication network-wide and extremely high read throughput) in large part because of the core assumption of spoke-hub distribution it could optimize for. That meant there was no way to write to Quicksilver from “spoke” data centers, and we would need to find another storage system for our purges.</p>
    <div>
      <h3>Flipping purge on its head</h3>
      <a href="#flipping-purge-on-its-head">
        
      </a>
    </div>
    <p>We decided that if we were going to replace our storage system, we should dig into exactly what our needs are and find the best fit. It was time to revisit some of our oldest conclusions to see if they still held true, and one of the earlier ones was that proactively purging content from disk would be difficult to do efficiently given our storage layout.</p><p>But was that true? Or could we make active cache purge fast and efficient (enough)? What would it take to quickly find files on disk based on their metadata? “Indexes!” you’re probably screaming, and for good reason. Indexing files’ hostnames, cache-tags, and URLs would undoubtedly make querying for relevant files trivial, but a few aspects of our network make it less straightforward.</p><p>Cloudflare has hundreds of data centers that see trillions of unique files, so any kind of global index — even ignoring the networking hurdles of aggregation — would suffer the same type of bottlenecking issues as our previous spoke-hub system. Scoping the indices to the data center level would be better, but data centers vary in size up to several hundred machines. Managing a database cluster in each data center, scaled to the aggregate traffic of all of its machines, was a daunting proposition; it could easily end up being enough work on its own for a separate team, not something we should take on as a side hustle.</p><p>The next step down in scope was an index per machine. Indexing on the same machine as the cache proxy had some compelling upsides: </p><ul><li><p>The proxy could talk to the index over <a href="https://en.wikipedia.org/wiki/Unix_domain_socket"><u>UDS</u></a> (Unix domain sockets), avoiding networking complexities in the hottest paths.</p></li><li><p>As a sidecar service, the index just had to be running anytime the machine was accepting traffic. If a machine died, so would the index, but so would the cache it described, so there wasn’t any need to deal with the complexities of distributed databases.</p></li><li><p>While data centers were frequently adding and removing machines, machines weren’t frequently adding and removing disks. An index could reasonably count on its maximum size being predictable and constant based on overall disk size.</p></li></ul><p>But we wanted to make sure it was feasible on our machines. We analyzed representative cache disks from across our fleet, gathering data like the number of cached assets per terabyte and the average number of cache-tags per asset. We looked at cache MISS, REVALIDATED, and EXPIRED rates to estimate the required write throughput.</p><p>That analysis convinced us the design would work. With a clearer understanding of the anticipated read/write throughput, we started looking at databases that could meet our needs. After benchmarking several relational and non-relational databases, we ultimately chose <a href="https://github.com/facebook/rocksdb"><u>RocksDB</u></a>, a high-performance embedded key-value store. We found that with proper tuning, it could be extremely good at the types of queries we needed.</p>
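    <p>One way to picture such an index is as a flat, ordered key space queried with prefix scans, the access pattern an ordered store like RocksDB handles well. A minimal sketch in Python, where the key layout (<code>tag:&lt;tag&gt;:&lt;file-id&gt;</code>) and the sorted list standing in for the embedded store are purely illustrative assumptions, not CacheDB's actual schema:</p>

```python
import bisect

class TagIndex:
    """Toy stand-in for an ordered embedded KV store (RocksDB-style):
    a flat sorted key space queried with prefix scans."""
    def __init__(self):
        self._keys = []  # kept sorted, mirroring an LSM tree's key order

    def _key(self, tag, file_id):
        return f"tag:{tag}:{file_id}"

    def put(self, tag, file_id):
        key = self._key(tag, file_id)
        i = bisect.bisect_left(self._keys, key)
        if i == len(self._keys) or self._keys[i] != key:
            self._keys.insert(i, key)

    def delete(self, tag, file_id):
        key = self._key(tag, file_id)
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._keys.pop(i)

    def files_with_tag(self, tag):
        """Prefix scan: walk every key starting with 'tag:<tag>:'."""
        prefix = f"tag:{tag}:"
        i = bisect.bisect_left(self._keys, prefix)
        out = []
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            out.append(self._keys[i][len(prefix):])
            i += 1
        return out

idx = TagIndex()
idx.put("sale-2024", "file-a")
idx.put("sale-2024", "file-b")
idx.put("homepage", "file-a")
print(idx.files_with_tag("sale-2024"))  # ['file-a', 'file-b']
```

    <p>Because all entries for one tag are adjacent in key order, answering “which files carry this tag?” is a single contiguous scan rather than a table-wide search, which is the kind of query pattern the post says RocksDB could be tuned for.</p>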
    <div>
      <h3>Putting it all together</h3>
      <a href="#putting-it-all-together">
        
      </a>
    </div>
    <p>And so CacheDB was born — a service written in Rust and built on RocksDB, which operates on each machine alongside the cache proxy to manage the indexing and purging of cached files. We integrated the cache proxy with CacheDB to ensure that indices are stored whenever a file is cached or updated, and deleted when a file is removed due to eviction or purging. In addition to indexing data, CacheDB maintains a local queue for buffering incoming purge operations. A background process reads purge operations from the queue, looks up all matching files using the indices, and deletes the matched files from disk. Once all matched files for an operation have been deleted, the process clears the indices and removes the purge operation from the queue.</p><p>To further optimize the speed of purges taking effect, the cache proxy was updated to check with CacheDB — similar to the previous lazy purge approach — when a cache HIT occurs before returning the asset. CacheDB does a quick scan of its local queue for any pending purge operations that match the asset in question, dictating whether the cache proxy should respond with the cached file or fetch a new copy. This means that as soon as a purge reaches the machine, the cache proxy stops returning matching cached files, even if millions of files correspond to a purge key and deleting them all from disk takes a while.</p>
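    <p>That HIT-path check only needs the pending queue, not the full index. A rough sketch of the idea in Python, where the three operation kinds and the timestamp comparison are our simplification of whatever matching CacheDB actually performs:</p>

```python
from dataclasses import dataclass

@dataclass
class PurgeOp:
    kind: str        # "tag" | "hostname" | "prefix" (illustrative set)
    value: str
    issued_at: float

@dataclass
class CachedAsset:
    url: str
    hostname: str
    tags: set
    stored_at: float

class PurgeQueue:
    """Pending purge operations whose matching files may not all be deleted yet."""
    def __init__(self):
        self.pending = []

    def matches(self, op, asset):
        if op.kind == "tag":
            return op.value in asset.tags
        if op.kind == "hostname":
            return op.value == asset.hostname
        if op.kind == "prefix":
            return asset.url.startswith(op.value)
        return False

    def should_serve(self, asset):
        """Serve a HIT only if no pending purge newer than the asset matches it."""
        return not any(
            self.matches(op, asset) and op.issued_at > asset.stored_at
            for op in self.pending
        )

q = PurgeQueue()
asset = CachedAsset("https://example.com/shop/x", "example.com", {"sale"}, stored_at=100.0)
q.pending.append(PurgeOp("tag", "sale", issued_at=200.0))
print(q.should_serve(asset))  # False: the purge takes effect before deletion finishes
```

    <p>The timestamp comparison is what lets a freshly fetched copy, stored after the purge was issued, be served again while older matches are still being deleted in the background.</p>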
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5W0ZIGBBbG5Cnc3DSbCPGT/a1572b0b67d844d4e5b7cc7899d320b1/image3.png" />
            
            </figure><p><sup><i>Coreless purge using CacheDB and Durable Objects to distribute purges without needing to first stop at a core data center.</i></sup></p><p>The last piece to change was the distribution pipeline, updated to broadcast flexible purges not just to every data center, but to the CacheDB service running on every machine. We opted for CacheDB to handle the last-mile fan-out from machine to machine within a data center, using <a href="https://www.consul.io/"><u>Consul</u></a> to keep each machine informed of the health of its peers. This choice let us keep the Workers largely the same for purge-by-URL (more <a href="https://blog.cloudflare.com/rethinking-cache-purge-architecture/"><u>here</u></a>) and flexible purge handling, despite the difference in termination points.</p>
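    <p>The last-mile fan-out reduces to forwarding each operation to the local peers a health service reports as alive. A toy sketch in Python, where the peer names, the health set, and the transport callback are all hypothetical:</p>

```python
def fan_out(op, peers, healthy, send):
    """Forward a purge operation to every healthy peer in the local data center.

    peers   -- all machines in this data center
    healthy -- peers currently passing health checks (as a Consul-style view)
    send    -- transport callback; returns True on success
    """
    delivered, retry = [], []
    for peer in peers:
        if peer not in healthy:
            continue  # skip machines the health service reports as down
        (delivered if send(peer, op) else retry).append(peer)
    return delivered, retry

sent = []
ok = lambda peer, op: (sent.append(peer), True)[1]  # always-successful transport
delivered, retry = fan_out({"kind": "tag", "value": "sale"},
                           ["m1", "m2", "m3"], {"m1", "m3"}, ok)
print(delivered, retry)  # ['m1', 'm3'] []
```

    <p>In practice the failed sends in <code>retry</code> would be re-attempted, and a machine returning to the healthy set would need to catch up on operations it missed; the sketch only shows the happy-path split.</p>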
    <div>
      <h3>The payoff</h3>
      <a href="#the-payoff">
        
      </a>
    </div>
    <p>Our new approach eliminates the long tail of the lazy purge, cutting the associated storage by 10x. Better yet, we can now delete purged content immediately instead of waiting for it to be lazily purged or to expire. The reclaimed storage improves cache retention on disk for all users, leading to higher cache HIT ratios and reduced egress from your origin.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6B0kVX9Q6qA2JshmcTZSrt/80a845904adf8ba69c121bb54923959e/image1.png" />
            
            </figure><p><sup><i>The shift from lazy content purging (</i></sup><sup><i><u>left</u></i></sup><sup><i>) to the new Coreless Purge architecture allows us to actively delete content (</i></sup><sup><i><u>right</u></i></sup><sup><i>). This helps reduce storage needs and increase cache retention times across our service.</i></sup></p><p>With the new coreless cache purge, we can now get a purge request into any data center, distribute the keys to purge, and instantly purge the content from the cache database. This all occurs in less than 150 milliseconds at P50 for tags, hostnames, and URL prefixes, covering all <a href="https://www.cloudflare.com/network/"><u>330 cities</u></a> in <a href="https://blog.cloudflare.com/backbone2024/"><u>120+ countries</u></a>.</p>
    <div>
      <h3>Benchmarks</h3>
      <a href="#benchmarks">
        
      </a>
    </div>
    <p>To measure Instant Purge, we wanted to make sure that we were looking at real user metrics — that these were purges customers were actually issuing, and performance representative of what we see under real conditions, rather than marketing numbers.</p><p>The clock starts when a purge request enters the local data center and stops when the purge has been executed in every data center. When the local data center receives the request, one of the first things we do is add a timestamp to the purge request. When all data centers have completed the purge action, another timestamp is added to “stop the clock.” Each purge request generates this performance data, which is then sent to a database so we can measure the appropriate quantiles and understand how we can improve further.</p><p>In August 2024, we took purge performance data and segmented it by region, based on where the local data center receiving the request was located.</p><table><tr><td><p><b>Region</b></p></td><td><p><b>P50 Aug 2024 (Coreless)</b></p></td><td><p><b>P50 May 2022 (Core-based)</b></p></td><td><p><b>Improvement</b></p></td></tr><tr><td><p>Africa</p></td><td><p>303ms</p></td><td><p>1,420ms</p></td><td><p>78.66%</p></td></tr><tr><td><p>Asia Pacific Region (APAC)</p></td><td><p>199ms</p></td><td><p>1,300ms</p></td><td><p>84.69%</p></td></tr><tr><td><p>Eastern Europe (EEUR)</p></td><td><p>140ms</p></td><td><p>1,240ms</p></td><td><p>88.70%</p></td></tr><tr><td><p>Eastern North America (ENAM)</p></td><td><p>119ms</p></td><td><p>1,080ms</p></td><td><p>88.98%</p></td></tr><tr><td><p>Oceania</p></td><td><p>191ms</p></td><td><p>1,160ms</p></td><td><p>83.53%</p></td></tr><tr><td><p>South America (SA)</p></td><td><p>196ms</p></td><td><p>1,250ms</p></td><td><p>84.32%</p></td></tr><tr><td><p>Western Europe (WEUR)</p></td><td><p>131ms</p></td><td><p>1,190ms</p></td><td><p>88.99%</p></td></tr><tr><td><p>Western North America (WNAM)</p></td><td><p>115ms</p></td><td><p>1,000ms</p></td><td><p>88.5%</p></td></tr><tr><td><p><b>Global</b></p></td><td><p><b>149ms</b></p></td><td><p><b>1,570ms</b></p></td><td><p><b>90.5%</b></p></td></tr></table><p><sup>Note: Global latency numbers on the core-based measurements (May 2022) may be larger than the regional numbers because they represent all of our data centers instead of only a regional portion, so outliers and retries might have an outsized effect.</sup></p>
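    <p>The arithmetic behind the table can be reproduced with two small helpers; a sketch, where the nearest-rank quantile is one common convention and not necessarily the one used internally:</p>

```python
def quantile(samples_ms, q):
    """Nearest-rank quantile over end-to-end purge timings
    (request enters local DC -> purge executed in every DC)."""
    s = sorted(samples_ms)
    idx = max(0, round(q * len(s)) - 1)
    return s[idx]

def improvement(old_ms, new_ms):
    """Percent reduction of the new P50 relative to the old one."""
    return (old_ms - new_ms) / old_ms * 100

# The table's global row: 1,570 ms core-based P50 vs 149 ms coreless.
print(round(improvement(1570, 149), 1))  # 90.5
```

    <p>Run against the global row, the helper reproduces the 90.5% improvement figure above.</p>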
    <div>
      <h3>What’s next?</h3>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>We are currently wrapping up the roll-out of the last throughput changes that allow us to efficiently scale purge requests. As that happens, we will revise our rate limits and open up purge by tag, hostname, and prefix to all plan types! We expect to roll out the additional purge types to all plans and users beginning in early <b>2025</b>.</p><p>While implementing this new approach, we also identified improvements that will shave a few more milliseconds off our single-file purge. Currently, single-file purges have a P50 of 234ms. However, we want to, and can, bring that number down to below 200ms.</p><p>If you want to come work on the world's fastest purge system, check out <a href="http://www.cloudflare.com/careers">our open positions</a>.</p>
 ]]></content:encoded>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Cache]]></category>
            <category><![CDATA[Speed & Reliability]]></category>
            <guid isPermaLink="false">11EWaw0wCUNwPTM30w7oUN</guid>
            <dc:creator>Alex Krivit</dc:creator>
            <dc:creator>Tim Kornhammar</dc:creator>
            <dc:creator>Connor Harwood</dc:creator>
        </item>
        <item>
            <title><![CDATA[Network performance update: Birthday Week 2024]]></title>
            <link>https://blog.cloudflare.com/network-performance-update-birthday-week-2024/</link>
            <pubDate>Mon, 23 Sep 2024 13:00:00 GMT</pubDate>
            <description><![CDATA[ Since June 2021, we’ve been measuring and ranking our network performance against the top global networks in the world. We use this data to improve our performance, and to share the results of those initiatives. 

In this post, we’re going to share with you how network performance has changed since our last post in March 2024, and discuss the tools and processes we are using to assess network performance. 
 ]]></description>
            <content:encoded><![CDATA[ <p>When it comes to the Internet, everyone wants to be the fastest. At Cloudflare, we’re no different. We want to be the fastest network in the world, and are constantly working towards that goal. Since <a href="https://blog.cloudflare.com/benchmarking-edge-network-performance/"><u>June 2021</u></a>, we’ve been measuring and ranking our network performance against the top global networks. We use this data to improve our performance, and to share the results of those initiatives. </p><p>In this post, we’re going to share with you how our network performance has changed since our last <a href="https://blog.cloudflare.com/network-performance-update-security-week-2024/"><u>post in March 2024</u></a>, and discuss the tools and processes we are using to assess network performance. </p>
    <div>
      <h3>Digging into the data</h3>
      <a href="#digging-into-the-data">
        
      </a>
    </div>
    <p>Cloudflare has been measuring network performance across the top 1,000 ISPs by estimated population (according to the <a href="https://stats.labs.apnic.net/cgi-bin/aspop?c=IN"><u>Asia Pacific Network Information Centre (APNIC)</u></a>), and optimizing our network for ISPs and countries where we see opportunities to improve. For performance benchmarking, we look at TCP connection time: the time it takes an end user to connect to the website or endpoint they are trying to reach. We chose this metric to show how our network helps make your websites faster by serving your traffic where your customers are. Back in June 2021, Cloudflare was ranked #1 in 33% of the networks.</p><p>As of September 2024, examining 95th percentile (p95) TCP connect times measured from September 4 to September 19, Cloudflare is the #1 provider in 44% of the top 1,000 networks:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6zN4WFnD4yijPB5fCX2wOY/db6d517f4054327beb433b5189e626d4/BLOG-2569_2.png" />
          </figure><p>This graph shows that we are fastest in 410 networks, which on its own would only make us fastest in 41% of the top 1,000. But to make sure we’re looking at the networks that eyeballs connect from, we exclude transit networks and others that aren’t last-mile ISPs. That brings the number of measured networks to 932, which makes us fastest in 44% of ISPs.</p><p>Now let’s take a look at the fastest provider by country. The map below shows the top network by 95th percentile TCP connection time, and Cloudflare is fastest in many countries. Where we weren’t, we were within a few milliseconds of our closest competitor. </p>
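    <p>The 410-of-932 figure is a counting exercise over per-network p95s. A sketch with made-up network names and values (real rankings come from the RUM data described below, not from static tables like this):</p>

```python
def fastest_share(p95_by_network, provider):
    """p95_by_network: {network: {provider: p95_ms}}.
    Count the networks where `provider` has the lowest p95 and its share."""
    wins = sum(
        1 for providers in p95_by_network.values()
        if min(providers, key=providers.get) == provider
    )
    return wins, wins / len(p95_by_network)

# Hypothetical per-network p95 TCP connect times, in milliseconds.
data = {
    "AS100": {"cloudflare": 12.0, "other": 15.0},
    "AS200": {"cloudflare": 30.0, "other": 22.0},
    "AS300": {"cloudflare": 9.0,  "other": 9.5},
}
wins, share = fastest_share(data, "cloudflare")
print(wins, round(share * 100))  # 2 67
```

    <p>Swapping the denominator from all 1,000 networks to the 932 last-mile ISPs is what moves the share from 41% to 44% in the paragraph above.</p>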
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7pEP1CiCQKL2lzSg3vH3C2/0f909c05ae4aa926a9ac2e7d39d117ab/BLOG-2569_3.png" />
          </figure><p>This color coding is generated by grouping all of our measurements by the country they originate from, then comparing each provider’s 95th percentile to see who is fastest. This contrasts with how we rank individual networks, where we group the measurements by ISP instead. Cloudflare is still the fastest provider in 44% of the measured networks, consistent with our March report. Cloudflare is also the fastest in many countries, but the map is less orange than it was when we published our measurements in March 2024:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/MTyot2PbJD2K6eFJ41o5V/00c3957c2e3174ba4e5a00d2f027fed2/BLOG-2569_4.png" />
            </figure><p>This can be explained by the fact that the fastest provider in a country is often determined by latency differences so small that the leader is less than 5% faster than the second-fastest provider. As an example, let’s take a look at India, a country where we are now the fastest:</p><p><b>India performance by provider</b></p><table><tr><th><p>Rank</p></th><th><p>Entity</p></th><th><p>95th percentile TCP Connect (ms)</p></th><th><p>Difference from #1</p></th></tr><tr><td><p>1</p></td><td><p><b>Cloudflare</b></p></td><td><p>290 ms</p></td><td><p>-</p></td></tr><tr><td><p>2</p></td><td><p>Google</p></td><td><p>291 ms</p></td><td><p>+0.28% (+0.81 ms)</p></td></tr><tr><td><p>3</p></td><td><p>CloudFront</p></td><td><p>304 ms</p></td><td><p>+4.61% (+13 ms)</p></td></tr><tr><td><p>4</p></td><td><p>Fastly</p></td><td><p>325 ms</p></td><td><p>+12% (+35 ms)</p></td></tr><tr><td><p>5</p></td><td><p>Akamai</p></td><td><p>373 ms</p></td><td><p>+29% (+83 ms)</p></td></tr></table><p>In India, we are the fastest network, but we are beating the runner-up by less than a millisecond, which shakes out to less than a 1% difference between us and the number two! The competition for the number one spot in many countries is fierce, and the outcome can depend on which days you’re looking at, because variance in connectivity or last-mile outages can materially move the numbers.</p><p>For example, on September 17, there was <a href="https://economictimes.indiatimes.com/news/india/jio-network-down-several-users-complaint-network-issue-downdetector-confirms-outage/articleshow/113417252.cms?from=mdr"><u>an outage on a major network in India</u></a>, which impacted many users’ ability to access the Internet. People on this network had a significantly degraded ability to connect to Cloudflare, and you can see that in the data. Here’s what the data looked like on the day of the outage, when we dropped from the number one spot:</p><p><b>India performance by provider</b></p><table><tr><th><p>Rank</p></th><th><p>Entity</p></th><th><p>95th percentile TCP Connect (ms)</p></th><th><p>Difference from #1</p></th></tr><tr><td><p>1</p></td><td><p>Google</p></td><td><p>219 ms</p></td><td><p>-</p></td></tr><tr><td><p>2</p></td><td><p>CloudFront</p></td><td><p>230 ms</p></td><td><p>+5% (+11 ms)</p></td></tr><tr><td><p>3</p></td><td><p><b>Cloudflare</b></p></td><td><p>236 ms</p></td><td><p>+7.47% (+16 ms)</p></td></tr><tr><td><p>4</p></td><td><p>Fastly</p></td><td><p>261 ms</p></td><td><p>+19% (+41 ms)</p></td></tr><tr><td><p>5</p></td><td><p>Akamai</p></td><td><p>286 ms</p></td><td><p>+30% (+67 ms)</p></td></tr></table><p>We were impacted more than other providers because our automated traffic management systems detected the high packet loss resulting from the outage and aggressively moved all of our traffic away from the impacted provider. After an internal review, we have identified opportunities to make our traffic management more fine-grained in its approach to outages of this type, which would help us stay fast despite changes in the surrounding ecosystem. These unplanned occurrences can happen to any network, but they also provide us opportunities to improve and mitigate the randomness we see on the Internet.</p><p>The phenomenon of providers having fluctuating latencies can also work against us. Consider Poland, a country where we were the fastest provider in March, but are no longer today. Digging into the data a bit more, we can see that even though we are no longer the fastest, we’re slower by less than a millisecond, giving us confidence that our architecture is optimized for performance in the region:</p><p><b>Poland performance by provider</b></p><table><tr><th><p>Rank</p></th><th><p>Entity</p></th><th><p>95th percentile TCP Connect (ms)</p></th><th><p>Difference from #1</p></th></tr><tr><td><p>1</p></td><td><p>Google</p></td><td><p>246 ms</p></td><td><p>-</p></td></tr><tr><td><p>2</p></td><td><p><b>Cloudflare</b></p></td><td><p>246 ms</p></td><td><p>+0.15% (+0.36 ms)</p></td></tr><tr><td><p>3</p></td><td><p>CloudFront</p></td><td><p>250 ms</p></td><td><p>+1.7% (+4.17 ms)</p></td></tr><tr><td><p>4</p></td><td><p>Akamai</p></td><td><p>272 ms</p></td><td><p>+11% (+26 ms)</p></td></tr><tr><td><p>5</p></td><td><p>Fastly</p></td><td><p>295 ms</p></td><td><p>+20% (+50 ms)</p></td></tr></table><p>These nuances in the data can make us look slower in more countries than we actually are. From a numbers perspective, we’re neck-and-neck with our competitors and still fastest in the most networks around the world. Going forward, we’re going to take a longer look at how we visualize our network performance to paint a clearer picture of it for you. But first, let’s get into how we actually gather the underlying data we use to measure ourselves.</p>
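    <p>The “Difference from #1” columns in these tables are simple ratios against the leader. A sketch of that formatting, using hypothetical p95 values with decimals, which is also why a published gap can round to a sub-millisecond figure:</p>

```python
def rank_table(p95_ms):
    """Sort providers by p95 TCP connect time and express each gap to #1
    in the '+X% (+Y ms)' notation used in the tables above."""
    ranked = sorted(p95_ms.items(), key=lambda kv: kv[1])
    best = ranked[0][1]
    rows = []
    for rank, (name, ms) in enumerate(ranked, start=1):
        gap = ms - best
        diff = "-" if rank == 1 else f"+{gap / best * 100:.1f}% (+{gap:.0f} ms)"
        rows.append((rank, name, ms, diff))
    return rows

# Hypothetical providers and p95s; not the real measurements.
rows = rank_table({"Provider A": 246.4, "Provider B": 246.0, "Provider C": 250.2})
for row in rows:
    print(row)
```

    <p>Note how the two leaders differ by 0.4 ms yet both display as 246 ms once rounded, exactly the kind of neck-and-neck result the Poland table shows.</p>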
    <div>
      <h3>How we measure performance</h3>
      <a href="#how-we-measure-performance">
        
      </a>
    </div>
    <p>When you see a Cloudflare-branded error page, something interesting happens behind the scenes. Every time one of these error pages is displayed, Cloudflare gathers Real User Measurements (RUM) by fetching a tiny file from various networks, including Cloudflare, Akamai, Amazon CloudFront, Fastly, and Google Cloud CDN. Your browser sends back performance data from the end-user’s perspective, helping us get a clear view of how these different networks stack up in terms of speed. The main goal? Figure out where we’re fast, and more importantly, where we can make Cloudflare even faster. If you're curious about the details, the original <a href="https://blog.cloudflare.com/introducing-radar-internet-quality-page/"><u>Speed Week blog post</u></a> dives deeper into the methodology.</p><p>Using this RUM data, we track key performance metrics such as TCP Connection Time, Time to First Byte (TTFB), and Time to Last Byte (TTLB) for Cloudflare and the other networks. </p><p>Starting from March, we fixed the list of networks we look at to be the top 1000 networks by estimated population as determined by <a href="https://stats.labs.apnic.net/cgi-bin/aspop?c=IN"><u>APNIC</u></a>, and we removed networks that weren’t last-mile ISPs. This change makes our measurements and reporting more consistent because we look at the same set of networks for every reporting cycle.</p>
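    <p>A heavily simplified version of such a measurement can be sketched with raw sockets, timed here against a local stand-in server rather than a real CDN endpoint (the actual RUM beacon runs inside the browser and reports via its timing APIs, not like this):</p>

```python
import socket
import threading
import time

def time_fetch(host, port, request):
    """Time three RUM-style metrics for one small fetch:
    TCP connect, time to first byte (TTFB), and time to last byte (TTLB)."""
    t0 = time.perf_counter()
    sock = socket.create_connection((host, port), timeout=5)
    connect_ms = (time.perf_counter() - t0) * 1000
    try:
        sock.sendall(request)
        sock.recv(4096)                      # first chunk arrives -> TTFB
        ttfb_ms = (time.perf_counter() - t0) * 1000
        while sock.recv(4096):               # drain until the peer closes -> TTLB
            pass
        ttlb_ms = (time.perf_counter() - t0) * 1000
    finally:
        sock.close()
    return connect_ms, ttfb_ms, ttlb_ms

# Demo against a local one-shot server standing in for a provider endpoint.
def _serve(listener):
    conn, _ = listener.accept()
    conn.recv(1024)
    conn.sendall(b"x" * 65536)               # the "tiny file"
    conn.close()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
threading.Thread(target=_serve, args=(listener,), daemon=True).start()
connect, ttfb, ttlb = time_fetch("127.0.0.1", listener.getsockname()[1], b"GET /")
listener.close()
print(f"connect={connect:.2f}ms ttfb={ttfb:.2f}ms ttlb={ttlb:.2f}ms")
```

    <p>The three timestamps are cumulative from the same start, so connect ≤ TTFB ≤ TTLB always holds, matching the ordering of the metrics reported by the real measurement.</p>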
    <div>
      <h3>How does Cloudflare use this data?</h3>
      <a href="#how-does-cloudflare-use-this-data">
        
      </a>
    </div>
    <p>Cloudflare uses this data to improve our network performance in lagging regions. For example, in 2022 we recognized that performance on a network in Finland was not as fast as some comparable regions. Users were taking 300+ ms to connect to Cloudflare at the 95th percentile:</p><p><b>Performance for Finland network</b></p><table><tr><th><p>Rank</p></th><th><p>Entity</p></th><th><p>95th percentile TCP Connect (ms)</p></th><th><p>Difference from #1</p></th></tr><tr><td><p>1</p></td><td><p>Fastly</p></td><td><p>15 ms</p></td><td><p>-</p></td></tr><tr><td><p>2</p></td><td><p>CloudFront</p></td><td><p>19 ms</p></td><td><p>+19% (+3 ms)</p></td></tr><tr><td><p>3</p></td><td><p>Akamai</p></td><td><p>20 ms</p></td><td><p>+28% (+4.3 ms)</p></td></tr><tr><td><p>4</p></td><td><p>Google</p></td><td><p>72 ms</p></td><td><p>+363% (+56 ms)</p></td></tr><tr><td><p>5</p></td><td><p><b>Cloudflare</b></p></td><td><p>368 ms</p></td><td><p>+2378% (+353 ms)</p></td></tr></table><p>After investigating, we recognized that one major network in Finland was seeing high latency due to issues resulting from congestion. Simply put, we were using all the capacity we had. We immediately planned an expansion, and within two weeks of that expansion completion, our latency decreased, and we became the fastest provider in the region, as you can see in the map above.</p><p>We are constantly improving our network and infrastructure to better serve our customers. Data like this helps us identify where we can be most impactful, and improve service for our customers. </p>
    <div>
      <h3>What’s next </h3>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>We’re sharing our updates on our journey to become as fast as we can be everywhere so that you can see what goes into running the fastest network in the world. From here, our plan is the same as always: identify where we’re slower, fix it, and then tell you how we’ve gotten faster.</p> ]]></content:encoded>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Network Performance Update]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Radar]]></category>
            <category><![CDATA[Network Services]]></category>
            <guid isPermaLink="false">1CRWV43VAHSo5XHLkwPw2R</guid>
            <dc:creator>Emily Music</dc:creator>
        </item>
    </channel>
</rss>