
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
    <channel>
        <title><![CDATA[ The Cloudflare Blog ]]></title>
        <description><![CDATA[ Get the latest news on how products at Cloudflare are built, technologies used, and join the teams helping to build a better Internet. ]]></description>
        <link>https://blog.cloudflare.com</link>
        <atom:link href="https://blog.cloudflare.com/" rel="self" type="application/rss+xml"/>
        <language>en-us</language>
        <image>
            <url>https://blog.cloudflare.com/favicon.png</url>
            <title>The Cloudflare Blog</title>
            <link>https://blog.cloudflare.com</link>
        </image>
        <lastBuildDate>Fri, 03 Apr 2026 17:15:31 GMT</lastBuildDate>
        <item>
            <title><![CDATA[Fixing request smuggling vulnerabilities in Pingora OSS deployments]]></title>
            <link>https://blog.cloudflare.com/pingora-oss-smuggling-vulnerabilities/</link>
            <pubDate>Mon, 09 Mar 2026 14:00:00 GMT</pubDate>
            <description><![CDATA[ Today we’re disclosing request smuggling vulnerabilities that arise when the open source Pingora framework is deployed as an ingress proxy, and how we’ve fixed them in Pingora 0.8.0. ]]></description>
            <content:encoded><![CDATA[ <p>In December 2025, Cloudflare received reports of HTTP/1.x request smuggling vulnerabilities in the <a href="https://github.com/cloudflare/pingora"><u>Pingora open source</u></a> framework when Pingora is used to build an ingress proxy. Today we are discussing how these vulnerabilities work and how we patched them in <a href="https://github.com/cloudflare/pingora/releases/tag/0.8.0"><u>Pingora 0.8.0</u></a>.</p><p>The vulnerabilities are <a href="https://www.cve.org/CVERecord?id=CVE-2026-2833"><u>CVE-2026-2833</u></a>, <a href="https://www.cve.org/CVERecord?id=CVE-2026-2835"><u>CVE-2026-2835</u></a>, and <a href="https://www.cve.org/CVERecord?id=CVE-2026-2836"><u>CVE-2026-2836</u></a>. These issues were responsibly reported to us by Rajat Raghav (xclow3n) through our <a href="https://www.cloudflare.com/disclosure/"><u>Bug Bounty Program</u></a>.</p><p><b>Cloudflare’s CDN and customer traffic were not affected</b>, our investigation found. <b>No action is needed for Cloudflare customers, and no impact was detected.</b></p><p>Due to the architecture of Cloudflare’s network, these vulnerabilities could not be exploited: Pingora is not used as an ingress proxy in Cloudflare’s CDN.</p><p>However, these issues impact standalone Pingora deployments exposed to the Internet, and may enable an attacker to:</p><ul><li><p>Bypass Pingora proxy-layer security controls</p></li><li><p>Desync HTTP requests and responses with backends for cross-user hijacking attacks (session or credential theft)</p></li><li><p>Poison Pingora proxy-layer caches retrieving content from shared backends</p></li></ul><p>We have released <a href="https://github.com/cloudflare/pingora/releases/tag/0.8.0"><u>Pingora 0.8.0</u></a> with fixes and hardening. While Cloudflare customers were not affected, we strongly recommend that users of the Pingora framework <b>upgrade as soon as possible.</b></p>
    <div>
      <h2>What was the vulnerability?</h2>
      <a href="#what-was-the-vulnerability">
        
      </a>
    </div>
    <p>The reports described a few different HTTP/1 attack payloads that could cause desync attacks. Such requests could cause the proxy and backend to disagree about where the request body ends, allowing a second request to be “smuggled” past proxy-layer checks. The researcher provided a proof-of-concept to validate how a basic Pingora reverse proxy misinterpreted request body lengths and forwarded those requests to server backends such as Node/Express or uvicorn.</p><p>Upon receiving the reports, our engineering team immediately investigated and validated that, as the reporter also confirmed, the Cloudflare CDN itself was not vulnerable. However, the team also validated that the vulnerabilities exist when Pingora acts as the ingress proxy to shared backends.</p><p>By design, the Pingora framework <a href="https://blog.cloudflare.com/how-we-built-pingora-the-proxy-that-connects-cloudflare-to-the-internet/#design-decisions"><u>does allow</u></a> edge case HTTP requests or responses that are not strictly RFC compliant, because we must accept this sort of traffic for customers with legacy HTTP stacks. But this leniency has limits to avoid exposing Cloudflare itself to vulnerabilities.</p><p>In this case, Pingora had non-RFC-compliant interpretations of request bodies within its HTTP/1 stack that allowed these desync attacks to exist. Pingora deployments within Cloudflare are not directly exposed to ingress traffic, and we found that production traffic that arrived at Pingora services was not subject to these misinterpretations. Thus, the attacks were not exploitable on Cloudflare traffic itself, unlike a <a href="https://blog.cloudflare.com/resolving-a-request-smuggling-vulnerability-in-pingora/"><u>previous Pingora smuggling vulnerability</u></a> disclosed in May 2025.</p><p>We’ll explain, case-by-case, how these attack payloads worked.</p>
    <div>
      <h3>1. Premature upgrade without 101 handshake</h3>
      <a href="#1-premature-upgrade-without-101-handshake">
        
      </a>
    </div>
    <p>The first report showed that a request with an <code>Upgrade</code> header value would cause Pingora to pass through subsequent bytes on the HTTP connection immediately, before the backend had accepted an upgrade (by returning <code>101 Switching Protocols</code>). The attacker could thus pipeline a second HTTP request after the upgrade request on the same connection:</p>
            <pre><code>GET / HTTP/1.1
Host: example.com
Upgrade: foo


GET /admin HTTP/1.1
Host: example.com</code></pre>
            <p>Pingora would parse only the initial request, then treat the remaining buffered bytes as the “upgraded” stream and forward them directly to the backend in a “passthrough” mode <a href="https://github.com/cloudflare/pingora/blob/ef017ceb01962063addbacdab2a4fd2700039db5/pingora-core/src/protocols/http/v1/server.rs#L797"><u>due to the Upgrade header</u></a> (until the response <a href="https://github.com/cloudflare/pingora/blob/ef017ceb01962063addbacdab2a4fd2700039db5/pingora-core/src/protocols/http/v1/server.rs#L523"><u>was received</u></a>).</p><p>This is not at all how the HTTP/1.1 Upgrade process per <a href="https://datatracker.ietf.org/doc/html/rfc9110#field.upgrade"><u>RFC 9110</u></a> is intended to work. The subsequent bytes should <i>only</i> be interpreted as part of an upgraded stream if a <code>101 Switching Protocols</code> response is received. If a <code>200 OK</code> response is received instead, the subsequent bytes should continue to be interpreted as HTTP.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2IYHyGkABpNA0e09wiiGpY/4f51ea330c2d266260f6361dd9d64d79/image4.png" />
          </figure><p><sup><i>An attacker that sends an Upgrade request, then pipelines a partial HTTP request may cause a desync attack. Pingora will incorrectly interpret both as the same upgraded request, even if the backend server declines the upgrade with a 200.</i></sup></p><p>Via the improper pass-through, a Pingora deployment that received a non-101 response could still forward the second partial HTTP request to the upstream as-is, bypassing any Pingora user‑defined ACL-handling or WAF logic, and poison the connection to the upstream so that a subsequent request from a different user could improperly receive the <code>/admin</code> response.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/oIwatu6gaMoJHCCs95sFN/8ea94ee8f04be6f7f00474168b382180/image3.png" />
          </figure><p><sup><i>After the attack payload, Pingora and the backend server are now “desynced.” The backend server will wait until it thinks the rest of the partial /attack request header that Pingora forwarded is complete. When Pingora forwards a different user’s request, the two headers are combined from the backend server’s perspective, and the attacker has now poisoned the other user’s response.</i></sup></p><p>We’ve since <a href="https://github.com/cloudflare/pingora/commit/824bdeefc61e121cc8861de1b35e8e8f39026ecd"><u>patched</u></a> Pingora to switch the interpretation of subsequent bytes only once the upstream responds with <code>101 Switching Protocols</code>.</p><p>We verified Cloudflare was <b>not affected</b> for two reasons:</p><ol><li><p>The ingress CDN proxies do not have this improper behavior.</p></li><li><p>The clients to our internal Pingora services do not attempt to <a href="https://en.wikipedia.org/wiki/HTTP_pipelining"><u>pipeline</u></a> HTTP/1 requests. Furthermore, the Pingora service these clients talk directly with disables keep-alive on these <code>Upgrade</code> requests by injecting a <code>Connection: close</code> header; this prevents additional requests that would be sent — and subsequently smuggled — over the same connection.</p></li></ol>
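    <p>The corrected rule can be sketched in a few lines of Rust. This is only an illustration of the decision described above, not Pingora’s actual code: the <code>BodyMode</code> type and the <code>mode_after_response</code> function are hypothetical names introduced for this example.</p>

```rust
/// How a proxy treats the bytes that follow a request header.
/// Hypothetical type for illustration; not part of the Pingora API.
#[derive(Debug, PartialEq)]
pub enum BodyMode {
    /// Keep parsing the connection as a sequence of HTTP/1.x messages.
    Http,
    /// Forward bytes opaquely as an upgraded stream.
    Passthrough,
}

/// Decide the mode once the upstream response status is known.
/// Only a 101 response to a request that asked for an upgrade switches
/// the connection to passthrough; any other status keeps HTTP parsing.
pub fn mode_after_response(request_has_upgrade: bool, upstream_status: u16) -> BodyMode {
    if request_has_upgrade && upstream_status == 101 {
        BodyMode::Passthrough
    } else {
        BodyMode::Http
    }
}
```

    <p>Under this rule, the pipelined <code>GET /admin</code> bytes in the example are parsed as a separate HTTP request whenever the backend answers the upgrade request with a <code>200</code>, so proxy-layer checks still apply to them.</p>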
    <div>
      <h3>2. HTTP/1.0, close-delimiting, and transfer-encoding</h3>
      <a href="#2-http-1-0-close-delimiting-and-transfer-encoding">
        
      </a>
    </div>
    <p>The reporter also demonstrated what <i>appeared</i> to be a more classic “CL.TE” desync-type attack, where the Pingora proxy would use Content-Length as framing while the backend would use Transfer-Encoding as framing:</p>
            <pre><code>GET / HTTP/1.0
Host: example.com
Connection: keep-alive
Transfer-Encoding: identity, chunked
Content-Length: 29

0

GET /admin HTTP/1.1
X:
</code></pre>
            <p>In the reporter’s example, Pingora would treat all subsequent bytes after the first <code>GET /</code> request header as part of that request’s body, but the Node.js backend server would interpret the body as chunked and ending at the zero-length chunk. There are actually a few things going on here:</p><ol><li><p>Pingora’s chunked encoding recognition was quite barebones (only checking whether <code>Transfer-Encoding</code> was “<a href="https://github.com/cloudflare/pingora/blob/9ac75d0356f449d26097e08bf49af14de6271727/pingora-core/src/protocols/http/v1/common.rs#L146"><u>chunked</u></a>”) and assumed that there could only be one encoding or <code>Transfer-Encoding</code> header. But the RFC only <a href="https://datatracker.ietf.org/doc/html/rfc9112#section-6.3-2.4.1"><u>mandates</u></a> that the <i>final</i> encoding must be <code>chunked</code> to apply chunked framing. So per RFC, this request should have a chunked message body (if it were not HTTP/1.0; more on that below).</p></li><li><p>Pingora was <i>also</i> not actually using the <code>Content-Length</code> (because the Transfer-Encoding overrode the Content-Length <a href="https://datatracker.ietf.org/doc/html/rfc9112#section-6.3-2.3"><u>per RFC</u></a>). Because of the unrecognized Transfer-Encoding and the HTTP/1.0 version, the request body was <a href="https://github.com/cloudflare/pingora/blob/ef017ceb01962063addbacdab2a4fd2700039db5/pingora-core/src/protocols/http/v1/server.rs#L817"><u>instead treated as close-delimited</u></a> (which means that the end of the body is marked by closure of the underlying transport connection). An absence of framing headers would also trigger the same misinterpretation on HTTP/1.0. Although response bodies are allowed to be close-delimited, request bodies are <i>never</i> close-delimited. In fact, this clarification is now explicitly called out as a separate note in <a href="https://datatracker.ietf.org/doc/html/rfc9112#section-6.3-4.1"><u>RFC 9112</u></a>.</p></li><li><p>This is an HTTP/1.0 request, and HTTP/1.0 <a href="https://datatracker.ietf.org/doc/html/rfc9112#appendix-C.2.3-1"><u>did not define</u></a> Transfer-Encoding. The RFC <a href="https://datatracker.ietf.org/doc/html/rfc9112#section-6.1-16">mandates</a> that a recipient of an HTTP/1.0 message containing Transfer-Encoding must “treat the message as if the framing is faulty” and close the connection. Parsers such as the ones in nginx and hyper just reject these requests to avoid ambiguous framing.</p></li></ol>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1jLbMNafmF96toxAPxj2Cm/8561b96a56dc0fc654476e33d0f34888/image2.png" />
          </figure><p><sup><i>When an attacker pipelines a partial HTTP request header after the HTTP/1.0 + Transfer-Encoding request, Pingora would incorrectly interpret that partial header as part of the same request, rather than as a distinct request. This enables the same kind of desync attack as described in the premature Upgrade example.</i></sup></p><p>This spoke to a more fundamental misreading of the RFC particularly in terms of response vs. request message framing. We’ve since fixed the improper <a href="https://github.com/cloudflare/pingora/commit/7f7166d62fa916b9f11b2eb8f9e3c4999e8b9023"><u>multiple Transfer-Encoding parsing</u></a>, adhere strictly to the request length guidelines such that HTTP request bodies can <a href="https://github.com/cloudflare/pingora/commit/40c3c1e9a43a86b38adeab8da7a2f6eba68b83ad"><u>never be considered close-delimited</u></a>, and reject <a href="https://github.com/cloudflare/pingora/commit/fc904c0d2c679be522de84729ec73f0bd344963d"><u>invalid Content-Length</u></a> and <a href="https://github.com/cloudflare/pingora/commit/87e2e2fb37edf9be33e3b1d04726293ae6bf2052"><u>HTTP/1.0 + Transfer-Encoding</u></a> request messages. Further protections we’ve added include <a href="https://github.com/cloudflare/pingora/commit/d3d2cf5ef4eca1e5d327fe282ec4b4ee474350c6"><u>rejecting</u></a> <a href="https://datatracker.ietf.org/doc/html/rfc9110#name-connect"><u>CONNECT</u></a> requests by default because the HTTP proxy logic doesn’t currently treat CONNECT as special for the purposes of CONNECT upgrade proxying, and these requests have special <a href="https://datatracker.ietf.org/doc/html/rfc9112#section-6.3-2.2"><u>message framing rules</u></a>. 
(Note that incoming CONNECT requests are <a href="https://developers.cloudflare.com/fundamentals/concepts/traffic-flow-cloudflare/#cloudflares-network"><u>rejected</u></a> by the Cloudflare CDN.)</p><p>When we investigated and instrumented our services internally, we found no requests arriving at our Pingora services that would have been misinterpreted. We found that downstream proxy layers in the CDN would forward as HTTP/1.1 only, reject ambiguous framing such as invalid Content-Length, and only forward a single <code>Transfer-Encoding: chunked</code> header for chunked requests.</p>
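    <p>To make the framing rules discussed in this section concrete, here is a minimal Rust sketch of how a request body’s framing could be chosen. It is a simplified illustration under stated assumptions, not the code in the linked commits: <code>Framing</code>, <code>FramingError</code>, and <code>request_framing</code> are hypothetical names, the header values are assumed to be already extracted as strings, and any coding list that ends in <code>chunked</code> is accepted without validating the earlier codings.</p>

```rust
/// How the length of an HTTP/1.x request body is determined.
/// Note there is deliberately no close-delimited variant for requests.
#[derive(Debug, PartialEq)]
pub enum Framing {
    Chunked,
    Length(u64),
    NoBody,
}

/// Reasons to reject a request instead of guessing its framing.
#[derive(Debug, PartialEq)]
pub enum FramingError {
    TransferEncodingOnHttp10,
    FinalCodingNotChunked,
    InvalidContentLength,
}

/// `transfer_encodings` and `content_lengths` hold every value of the
/// respective header, in the order received (hypothetical inputs).
pub fn request_framing(
    is_http10: bool,
    transfer_encodings: &[&str],
    content_lengths: &[&str],
) -> Result<Framing, FramingError> {
    // Flatten repeated headers and comma-separated lists such as
    // "identity, chunked" into a single ordered list of codings.
    let codings: Vec<String> = transfer_encodings
        .iter()
        .flat_map(|value| value.split(','))
        .map(|coding| coding.trim().to_ascii_lowercase())
        .filter(|coding| !coding.is_empty())
        .collect();

    if !codings.is_empty() {
        // HTTP/1.0 did not define Transfer-Encoding: treat framing as faulty.
        if is_http10 {
            return Err(FramingError::TransferEncodingOnHttp10);
        }
        // Transfer-Encoding overrides Content-Length, and only the final
        // coding decides whether chunked framing applies.
        return match codings.last().map(String::as_str) {
            Some("chunked") => Ok(Framing::Chunked),
            _ => Err(FramingError::FinalCodingNotChunked),
        };
    }

    match content_lengths {
        // No framing headers on a request means an empty body.
        [] => Ok(Framing::NoBody),
        [first, rest @ ..] => {
            let first = first.trim();
            let all_digits = !first.is_empty() && first.bytes().all(|b| b.is_ascii_digit());
            let all_equal = rest.iter().all(|other| other.trim() == first);
            if !all_digits || !all_equal {
                return Err(FramingError::InvalidContentLength);
            }
            first
                .parse::<u64>()
                .map(Framing::Length)
                .map_err(|_| FramingError::InvalidContentLength)
        }
    }
}
```

    <p>With these rules, the reporter’s HTTP/1.0 payload is rejected outright, and the same headers on an HTTP/1.1 request select chunked framing, so the proxy and a backend such as Node.js would agree that the body ends at the zero-length chunk.</p>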
    <div>
      <h3>3. Cache key construction</h3>
      <a href="#3-cache-key-construction">
        
      </a>
    </div>
    <p>The researcher also reported one other cache poisoning vulnerability regarding default <code>CacheKey</code> construction. The <a href="https://github.com/cloudflare/pingora/blob/ef017ceb01962063addbacdab2a4fd2700039db5/pingora-cache/src/key.rs#L218"><u>naive default implementation</u></a> factored in only the URI path (without other factors such as host header or upstream server HTTP scheme), which meant different hosts using the same HTTP path could collide and poison each other’s cache.</p><p>This would affect users of the alpha proxy caching feature who chose to use the default <code>CacheKey</code> implementation. We have since <a href="https://github.com/cloudflare/pingora/commit/257b59ada28ed6cac039f67d0b71f414efa0ab6e"><u>removed that default</u></a>, because while using something like HTTP scheme + host + URI makes sense for many applications, we want users to be careful when constructing their cache keys for themselves. If their proxy logic will conditionally adjust the URI or method on the upstream request, for example, that logic likely also must be factored into the cache key scheme to avoid poisoning.</p><p>Internally, Cloudflare’s <a href="https://developers.cloudflare.com/cache/how-to/cache-keys/"><u>default cache key</u></a> uses a number of factors to prevent cache key poisoning, and never made use of the previously provided default.</p>
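    <p>As a rough illustration of the “HTTP scheme + host + URI” idea mentioned above, a cache key could be modeled as follows. This is a sketch using only the Rust standard library, not the <code>pingora-cache</code> <code>CacheKey</code> API: <code>ExampleCacheKey</code> and its methods are hypothetical, and a real deployment may need further inputs (for example, anything the proxy logic rewrites on the upstream request).</p>

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical cache key that separates entries by scheme and host,
/// not only by path. Not the pingora-cache `CacheKey` type.
#[derive(Debug, Hash, PartialEq, Eq)]
pub struct ExampleCacheKey {
    pub scheme: String,
    pub host: String,
    pub path_and_query: String,
}

impl ExampleCacheKey {
    pub fn new(scheme: &str, host: &str, path_and_query: &str) -> Self {
        Self {
            // Scheme and host are case-insensitive, so normalize them.
            scheme: scheme.to_ascii_lowercase(),
            host: host.to_ascii_lowercase(),
            // The path and query are kept as received.
            path_and_query: path_and_query.to_string(),
        }
    }

    /// In-process digest of all three components. `DefaultHasher` is used
    /// only to keep the sketch dependency-free; it is not a stable or
    /// collision-resistant choice for a persistent cache.
    pub fn digest(&self) -> u64 {
        let mut hasher = DefaultHasher::new();
        self.hash(&mut hasher);
        hasher.finish()
    }
}
```

    <p>Because the host is part of the key, two hosts that serve the same path produce distinct keys and can no longer overwrite each other’s cached responses.</p>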
    <div>
      <h2>Recommendation</h2>
      <a href="#recommendation">
        
      </a>
    </div>
    <p>If you use Pingora as a proxy, upgrade to <a href="https://github.com/cloudflare/pingora/releases/tag/0.8.0"><u>Pingora 0.8.0</u></a> at your earliest convenience.</p><p>We apologize for the impact this vulnerability may have had on Pingora users. As Pingora earns its place as critical Internet infrastructure beyond Cloudflare, we believe it’s important for the framework to promote use of strict RFC compliance by default and will continue this effort. Very few users of the framework should have to deal with the same “wild Internet” that Cloudflare does. Our intention is that stricter adherence to the latest RFC standards by default will harden security for Pingora users and move the Internet as a whole toward best practices.</p>
    <div>
      <h2>Disclosure and response timeline</h2>
      <a href="#disclosure-and-response-timeline">
        
      </a>
    </div>
    <ul><li><p>2025-12-02: Upgrade-based smuggling reported via bug bounty.</p></li><li><p>2026-01-13: Transfer-Encoding / HTTP/1.0 parsing issues reported.</p></li><li><p>2026-01-18: Default cache key construction issue reported.</p></li><li><p>2026-01-29 to 2026-02-13: Fixes validated with the reporter. Work on more RFC-compliance checks continues.</p></li><li><p>2026-02-25: Cache key default removal and additional RFC checks validated with researcher.</p></li><li><p>2026-03-02: Pingora 0.8.0 released.</p></li><li><p>2026-03-04: CVE advisories published.</p></li></ul>
    <div>
      <h2>Acknowledgements</h2>
      <a href="#acknowledgements">
        
      </a>
    </div>
    <p>We thank Rajat Raghav (xclow3n) for the report, detailed reproductions, and verification of the fixes through our bug bounty program. Please see the researcher's <a href="https://xclow3n.github.io/post/6">corresponding blog post</a> for more information.</p><p>We would also like to extend a heartfelt thank you to the Pingora open source community for their active engagement, issue reports, and contributions to the framework. You truly help us build a better Internet.</p> ]]></content:encoded>
            <category><![CDATA[Pingora]]></category>
            <category><![CDATA[Application Security]]></category>
            <category><![CDATA[Open Source]]></category>
            <category><![CDATA[Security]]></category>
            <guid isPermaLink="false">1b0iJgL57wbfiLHXhEjuwR</guid>
            <dc:creator>Edward Wang</dc:creator>
            <dc:creator>Fei Deng</dc:creator>
            <dc:creator>Andrew Hauck</dc:creator>
        </item>
        <item>
            <title><![CDATA[Resolving a request smuggling vulnerability in Pingora]]></title>
            <link>https://blog.cloudflare.com/resolving-a-request-smuggling-vulnerability-in-pingora/</link>
            <pubDate>Thu, 22 May 2025 13:00:00 GMT</pubDate>
            <description><![CDATA[ Cloudflare patched a vulnerability (CVE-2025-4366) in the Pingora OSS framework, which exposed users of the framework and Cloudflare CDN’s free tier to potential request smuggling attacks. ]]></description>
            <content:encoded><![CDATA[ <p>On April 11, 2025 09:20 UTC, Cloudflare was notified via its <a href="https://www.cloudflare.com/disclosure/"><u>Bug Bounty Program</u></a> of a request smuggling vulnerability (<a href="https://www.cve.org/cverecord?id=CVE-2025-4366"><u>CVE-2025-4366</u></a>) in the <a href="https://github.com/cloudflare/pingora/tree/main"><u>Pingora OSS framework</u></a>. The vulnerability was discovered by a security researcher experimenting with exploits against Cloudflare’s Content Delivery Network (CDN) free tier, which serves some cached assets via Pingora.</p><p>Customers using the free tier of Cloudflare’s CDN or users of the caching functionality provided in the open source <a href="https://github.com/cloudflare/pingora/tree/main/pingora-proxy"><u>pingora-proxy</u></a> and <a href="https://github.com/cloudflare/pingora/tree/main/pingora-cache"><u>pingora-cache</u></a> crates could have been exposed. Cloudflare’s investigation revealed no evidence that the vulnerability was being exploited, and Cloudflare mitigated the vulnerability by April 12, 2025 06:44 UTC, within 22 hours of being notified.</p>
    <div>
      <h2>What was the vulnerability?</h2>
      <a href="#what-was-the-vulnerability">
        
      </a>
    </div>
    <p>The bug bounty report detailed that an attacker could potentially exploit an HTTP/1.1 request smuggling vulnerability on Cloudflare’s CDN service. The reporter noted that via this exploit, they were able to cause visitors to Cloudflare sites to make subsequent requests to their own server and observe which URLs the visitor was originally attempting to access.</p><p>We treat any potential request smuggling or caching issue with extreme urgency. After our security team escalated the vulnerability, we began investigating immediately, took steps to disable traffic to vulnerable components, and deployed a patch.</p><p>We are sharing the details of the vulnerability, how we resolved it, and what we can learn from it. No action is needed from Cloudflare customers, but if you are using the Pingora OSS framework, we strongly urge you to upgrade to Pingora <a href="https://github.com/cloudflare/pingora/releases/tag/0.5.0"><u>0.5.0</u></a> or later.</p>
    <div>
      <h2>What is request smuggling?</h2>
      <a href="#what-is-request-smuggling">
        
      </a>
    </div>
    <p>Request smuggling is a type of attack where an attacker can exploit inconsistencies in the way different systems parse HTTP requests. For example, when a client sends an HTTP request to an application server, it typically passes through multiple components such as load balancers, reverse proxies, etc., each of which has to parse the HTTP request independently. If two of the components the request passes through interpret the HTTP request differently, an attacker can craft a request that one component sees as complete, but the other continues to parse into a second, malicious request made on the same connection.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4Zo8gcyLwmR2liZIUetcGe/d0647a83dc2bc1e676ee2b61f14c3964/image2.png" />
          </figure>
    <div>
      <h2>Request smuggling vulnerability in Pingora</h2>
      <a href="#request-smuggling-vulnerability-in-pingora">
        
      </a>
    </div>
    <p>In the case of Pingora, the reported request smuggling vulnerability was made possible by an HTTP/1.1 parsing bug when caching was enabled.</p><p>The pingora-cache crate adds an HTTP caching layer to a Pingora proxy, allowing content to be cached on a configured storage backend to help improve response times, and reduce bandwidth and load on backend servers.</p><p>HTTP/1.1 supports “<a href="https://www.rfc-editor.org/rfc/rfc9112.html#section-9.3"><u>persistent connections</u></a>”, such that one TCP connection can be reused for multiple HTTP requests, instead of needing to establish a connection for each request. However, only one request can be processed on a connection at a time (with rare exceptions such as <a href="https://www.rfc-editor.org/rfc/rfc9112.html#section-9.3.2"><u>HTTP/1.1 pipelining</u></a>). The RFC notes that each request must have a “<a href="https://www.rfc-editor.org/rfc/rfc9112.html#section-9.3-7"><u>self-defined message length</u></a>” for its body, as indicated by headers such as <code>Content-Length</code> or <code>Transfer-Encoding</code>, to determine where one request ends and another begins.</p><p>Pingora generally handles requests on HTTP/1.1 connections in an RFC-compliant manner, either ensuring the downstream request body is properly consumed or declining to reuse the connection if it encounters an error. After the bug was filed, we discovered that when caching was enabled, this logic was skipped on cache hits (i.e. when the service’s cache backend can serve the response without making an additional upstream request).</p><p>This meant that on a cache hit, after the response was sent downstream, any unread request body left in the HTTP/1.1 connection could act as a vector for request smuggling. When formed into a valid (but incomplete) header, the request body could “poison” the subsequent request. The following example is a spec-compliant HTTP/1.1 request which exhibits this behavior:</p>
            <pre><code>GET /attack/foo.jpg HTTP/1.1
Host: example.com
&lt;other headers…&gt;
content-length: 79

GET / HTTP/1.1
Host: attacker.example.com
Bogus: foo</code></pre>
            <p>Let’s say there is a different request to <code>victim.example.com</code> that will be sent after this one on the reused HTTP/1.1 connection to a Pingora reverse proxy. The bug means that a Pingora service may not respect the <code>Content-Length</code> header and instead misinterpret the smuggled request as the beginning of the next request:</p>
            <pre><code>GET /attack/foo.jpg HTTP/1.1
Host: example.com
&lt;other headers…&gt;
content-length: 79

GET / HTTP/1.1 // &lt;- “smuggled” body start, interpreted as next request
Host: attacker.example.com
Bogus: fooGET /victim/main.css HTTP/1.1 // &lt;- actual next valid req start
Host: victim.example.com
&lt;other headers…&gt;</code></pre>
            <p>Thus, the smuggled request could inject headers and its URL into a subsequent valid request sent on the same connection to a Pingora reverse proxy service.</p>
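    <p>The invariant that was skipped on cache hits (consume the rest of the request body, or do not reuse the connection) can be sketched as a small decision function. This is an illustrative Rust sketch rather than the actual patch: <code>AfterResponse</code>, <code>after_response</code>, and the <code>max_drain</code> limit are assumptions made for this example, and <code>None</code> stands for a body whose remaining length is not known.</p>

```rust
/// What to do with a downstream HTTP/1.1 connection after a response
/// has been sent. Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
pub enum AfterResponse {
    /// The request body was fully read; the next request can be parsed.
    Reuse,
    /// Read and discard this many remaining body bytes, then reuse.
    DrainThenReuse(u64),
    /// The body cannot be safely skipped; close the connection.
    Close,
}

/// Decide how to treat the connection, regardless of whether the
/// response came from cache or from the upstream.
pub fn after_response(
    declared_len: Option<u64>,
    body_bytes_read: u64,
    max_drain: u64,
) -> AfterResponse {
    match declared_len {
        // Unknown remaining length: never guess where the next request starts.
        None => AfterResponse::Close,
        Some(len) if body_bytes_read >= len => AfterResponse::Reuse,
        Some(len) => {
            let remaining = len - body_bytes_read;
            if remaining <= max_drain {
                AfterResponse::DrainThenReuse(remaining)
            } else {
                AfterResponse::Close
            }
        }
    }
}
```

    <p>Applied to the example above, the 79 unread body bytes are either drained or the connection is closed, so they are never parsed as the start of the next request.</p>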
    <div>
      <h2>CDN request smuggling and hijacking</h2>
      <a href="#cdn-request-smuggling-and-hijacking">
        
      </a>
    </div>
    <p>On April 11, 2025, Cloudflare was in the process of rolling out a Pingora proxy component with caching support enabled to a subset of CDN free plan traffic. This component was vulnerable to this request smuggling attack, which could enable modifying request headers and/or URL sent to customer origins.</p><p>As previously noted, the security researcher reported that they were also able to cause visitors to Cloudflare sites to make subsequent requests to their own malicious origin and observe which site URLs the visitor was originally attempting to access. During our investigation, Cloudflare found that certain origin servers would be susceptible to this secondary attack effect. The smuggled request in the example above would be sent to the correct origin IP address per customer configuration, but some origin servers would respond to the rewritten attacker <code>Host</code> header with a 301 redirect. Continuing from the prior example:</p>
            <pre><code>GET / HTTP/1.1 // &lt;- “smuggled” body start, interpreted as next request
Host: attacker.example.com
Bogus: fooGET /victim/main.css HTTP/1.1 // &lt;- actual next valid req start
Host: victim.example.com
&lt;other headers…&gt;

HTTP/1.1 301 Moved Permanently // &lt;- susceptible victim origin response
Location: https://attacker.example.com/
&lt;other headers…&gt;</code></pre>
            <p>When the client browser followed the redirect, it would send a request to the attacker hostname, along with a <code>Referer</code> header indicating which URL was originally visited. This made it possible for the attacker to serve a malicious asset and observe which URLs a visitor was trying to access.</p>
            <pre><code>GET / HTTP/1.1 // &lt;- redirect-following request
Host: attacker.example.com
Referer: https://victim.example.com/victim/main.css
&lt;other headers…&gt;</code></pre>
            <p>Upon verifying the Pingora proxy component was susceptible, the team immediately disabled CDN traffic to the vulnerable component on 2025-04-12 06:44 UTC to stop possible exploitation. By 2025-04-19 01:56 UTC and prior to re-enablement of any traffic to the vulnerable component, a patch for the component was released, and any assets cached on the component’s backend were invalidated in case of possible cache poisoning as a result of the injected headers.</p>
    <div>
      <h2>Remediation and next steps</h2>
      <a href="#remediation-and-next-steps">
        
      </a>
    </div>
    <p>If you are using the caching functionality in the Pingora framework, you should update to Pingora <a href="https://github.com/cloudflare/pingora/releases/tag/0.5.0"><u>0.5.0</u></a> or later. If you are a Cloudflare customer with a free plan, you do not need to do anything, as we have already applied the patch for this vulnerability.</p>
    <div>
      <h2>Timeline</h2>
      <a href="#timeline">
        
      </a>
    </div>
    <p><i>All timestamps are in UTC.</i></p><ul><li><p>2025-04-11 09:20 – Cloudflare is notified of a CDN request smuggling vulnerability via the Bug Bounty Program.</p></li><li><p>2025-04-11 17:16 to 2025-04-12 03:28 – Cloudflare confirms the vulnerability is reproducible and investigates which components require changes to mitigate it.</p></li><li><p>2025-04-12 04:25 – Cloudflare isolates the issue to the rollout of a Pingora proxy component with caching enabled and prepares a release to disable traffic to this component.</p></li><li><p>2025-04-12 06:44 – Rollout to disable traffic complete, vulnerability mitigated.</p></li></ul>
    <div>
      <h2>Conclusion</h2>
      <a href="#conclusion">
        
      </a>
    </div>
    <p>We would like to sincerely thank <a href="https://www.linkedin.com/in/james-kettle-albinowax/"><u>James Kettle</u></a> &amp; <a href="https://www.linkedin.com/in/wannes-verwimp/"><u>Wannes Verwimp</u></a>, who responsibly disclosed this issue via our <a href="https://www.cloudflare.com/en-gb/disclosure/"><u>Cloudflare Bug Bounty Program</u></a>, allowing us to identify and mitigate the vulnerability. We welcome further submissions from our community of researchers to continually improve the security of all of our products and open source projects.</p><p>Whether you are a customer of Cloudflare or just a user of our Pingora framework, or both, we know that the trust you place in us is critical to how you connect your properties to the rest of the Internet. Security is a core part of that trust and for that reason we treat these kinds of reports and the actions that follow with serious urgency. We are confident about this patch and the additional safeguards that have been implemented, but we know that these kinds of issues can be concerning. Thank you for your continued trust in our platform. We remain committed to building with security as our top priority and responding swiftly and transparently whenever issues arise.</p> ]]></content:encoded>
            <category><![CDATA[Pingora]]></category>
            <category><![CDATA[CDN]]></category>
            <category><![CDATA[Security]]></category>
            <category><![CDATA[CVE]]></category>
            <category><![CDATA[Bug Bounty]]></category>
            <guid isPermaLink="false">W02DuD98fCm1sYwa3gNH8</guid>
            <dc:creator>Edward Wang</dc:creator>
            <dc:creator>Andrew Hauck</dc:creator>
            <dc:creator>Aki Shugaeva</dc:creator>
        </item>
        <item>
            <title><![CDATA[A good day to trie-hard: saving compute 1% at a time]]></title>
            <link>https://blog.cloudflare.com/pingora-saving-compute-1-percent-at-a-time/</link>
            <pubDate>Tue, 10 Sep 2024 14:00:00 GMT</pubDate>
            <description><![CDATA[ Pingora handles 35M+ requests per second, so saving a few microseconds per request can translate to thousands of dollars saved on computing costs. In this post, we share how we freed up over 500 CPU  ]]></description>
            <content:encoded><![CDATA[ 
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5uwVNobeSBws457ad5SoNY/080de413142fc98caffc3c0108912fe4/2442-1-hero.png" />
          </figure><p>Cloudflare’s global network handles <i>a lot</i> of HTTP requests – over 60 million per second on average. That in and of itself is not news, but it is the starting point to an adventure that started a few months ago and ends with the announcement of a new <a href="https://github.com/cloudflare/trie-hard"><u>open-source Rust crate</u></a> that we are using to reduce our CPU utilization, enabling our CDN to handle even more of the world’s ever-increasing Web traffic. </p>
    <div>
      <h2>Motivation</h2>
      <a href="#motivation">
        
      </a>
    </div>
    <p>Let’s start at the beginning. You may recall a few months ago we released <a href="https://blog.cloudflare.com/pingora-open-source/"><u>Pingora</u></a> (the heart of our Rust-based proxy services) as an <a href="https://github.com/cloudflare/pingora"><u>open-source project on GitHub</u></a>. I work on the team that maintains the Pingora framework, as well as Cloudflare’s production services built upon it. One of those services is responsible for the final step in transmitting users’ (non-cached) requests to their true destination. Internally, we call the request’s destination server its “origin”, so our service has the (unimaginative) name of “pingora-origin”.</p><p>One of the many responsibilities of pingora-origin is to ensure that when a request leaves our infrastructure, it has been cleaned to remove the internal information we use to route, measure, and optimize traffic for our customers. This has to be done for every request that leaves Cloudflare, and as I mentioned above, it’s <i>a lot</i> of requests. At the time of writing, the rate of requests leaving pingora-origin (globally) is 35 million requests per second. Any code that has to be run per-request is in the hottest of hot paths, and it’s in this path that we find this code and comment:</p>
            <pre><code>// PERF: heavy function: 1.7% CPU time
pub fn clear_internal_headers(request_header: &amp;mut RequestHeader) {
    INTERNAL_HEADERS.iter().for_each(|h| {
        request_header.remove_header(h);
    });
}</code></pre>
            <p></p><p>This small and pleasantly-readable function consumes more than 1.7% of pingora-origin’s total CPU time. To put that in perspective, the total CPU time consumed by pingora-origin is 40,000 compute-seconds per second. You can think of this as 40,000 saturated CPU cores fully dedicated to running pingora-origin. Of those 40,000, 1.7% (680) are dedicated solely to evaluating <code>clear_internal_headers</code>. The function’s heavy usage and simplicity make it seem like a great place to start optimizing.</p>
    <div>
      <h2>Benchmarking</h2>
      <a href="#benchmarking">
        
      </a>
    </div>
    <p>Benchmarking the function shown above is straightforward because we can use the wonderful <a href="https://crates.io/crates/criterion"><u>criterion</u></a> Rust crate. Criterion provides an API for timing Rust code down to the nanosecond by aggregating multiple isolated executions. It also provides feedback on how the performance improves or regresses over time. The input for the benchmark is a large set of synthesized requests with a random number of headers with a uniform distribution of internal vs. non-internal headers. With our tooling and test data we find that our original <code>clear_internal_headers</code> function runs in an average of <b>3.65µs</b>. Now for each new method of clearing headers, we can measure against the same set of requests and get a relative performance difference. </p>
    <div>
      <h2>Reducing Reads</h2>
      <a href="#reducing-reads">
        
      </a>
    </div>
    <p>One potentially quick win is to invert how we find the headers that need to be removed from requests. If you look at the original code, you can see that we are evaluating <code>request_header.remove_header(h)</code> for each header in our list of internal headers, so 100+ times. Diagrammatically, it looks like this:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7y2qHbNfBQeoGRc8PqjBcp/9e8fccb6951a475a26def66695e47635/2442-2.png" />
          </figure><p></p><p>Since an average request has significantly fewer than 100 headers (10-30), flipping the lookup direction should reduce the number of reads while yielding the same intersection. Because we are working in Rust (and because <code>retain</code> does not exist for <code>http::HeaderMap</code> <a href="https://github.com/hyperium/http/issues/541"><u>yet</u></a>), we have to collect the identified internal headers in a separate step before removing them from the request. Conceptually, it looks like this:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6hgLavu1hZbwkw91Tee8e1/4d43b538274ae2c680236ca66791d73b/2442-3.png" />
          </figure><p></p><p>Using our benchmarking tool, we can measure the impact of this small change, and surprisingly this is already a substantial improvement. The runtime improves from <b>3.65µs</b> to <b>1.53µs</b>. That’s a 2.39x speed improvement for our function. We can calculate the theoretical CPU percentage by multiplying the starting utilization by the ratio of the new and old times: 1.71% * 1.53 / 3.65 = 0.717%. Unfortunately, if we subtract that from the original 1.71% that only equates to saving 1.71% - 0.717% = <i>0.993%</i> of the total CPU time. We should be able to do better. </p>
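<p>The arithmetic above can be sketched as a pair of small helpers. This is only an illustration: the function names are hypothetical, and it assumes (as the estimate in the text does) that a function's share of CPU scales linearly with its average runtime.</p>

```rust
/// Predicted share of total CPU (in percent) after an optimization, assuming
/// the function's CPU share scales linearly with its average runtime.
/// (Hypothetical helper, not part of Pingora.)
fn predicted_cpu_pct(baseline_pct: f64, old_runtime_us: f64, new_runtime_us: f64) -> f64 {
    baseline_pct * new_runtime_us / old_runtime_us
}

/// Percentage points of total CPU saved by the optimization.
fn cpu_pct_saved(baseline_pct: f64, old_runtime_us: f64, new_runtime_us: f64) -> f64 {
    baseline_pct - predicted_cpu_pct(baseline_pct, old_runtime_us, new_runtime_us)
}
```

<p>Calling <code>predicted_cpu_pct(1.71, 3.65, 1.53)</code> gives roughly 0.717, and <code>cpu_pct_saved</code> with the same arguments gives roughly 0.993, matching the figures in the text.</p>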
    <div>
      <h2>Searching Data Structures</h2>
      <a href="#searching-data-structures">
        
      </a>
    </div>
    <p>Now that we have reorganized our function to search a static set of internal headers instead of the actual request, we have the freedom to choose what data structure we store our header name in simply by changing the type of <code>INTERNAL_HEADER_SET</code>.</p>
            <pre><code>pub fn clear_internal_headers(request_header: &amp;mut RequestHeader) {
    // Step 1: collect the internal headers that are present in this request.
    let to_remove = request_header
        .headers
        .keys()
        .filter_map(|name| INTERNAL_HEADER_SET.get(name))
        .collect::&lt;Vec&lt;_&gt;&gt;();

    // Step 2: remove them from the request.
    to_remove.into_iter().for_each(|k| {
        request_header.remove_header(k);
    });
}</code></pre>
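<p>For readers who want to experiment with the inverted lookup without Pingora, here is a minimal, self-contained sketch. It uses a standard-library <code>HashMap</code> as a stand-in for the request's header map and a <code>HashSet</code> for the internal set. The header names are made up for illustration, and the two-step collect-then-remove shape is kept on purpose to mirror the function above, even though the standard <code>HashMap</code> does offer <code>retain</code>.</p>

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical internal header names, for illustration only.
const INTERNAL_HEADERS: [&str; 3] = ["x-internal-route", "x-internal-trace", "x-internal-cost"];

/// Removes internal headers by scanning the request's headers (few) and
/// probing the internal set, rather than probing the request once per
/// internal header (many).
fn clear_internal_headers(headers: &mut HashMap<String, String>, internal: &HashSet<&str>) {
    // Step 1: collect matching names; we cannot remove while iterating.
    let to_remove: Vec<String> = headers
        .keys()
        .filter(|name| internal.contains(name.as_str()))
        .cloned()
        .collect();

    // Step 2: remove the collected names from the request.
    for name in to_remove {
        headers.remove(&name);
    }
}
```

<p>The internal set can be built once with <code>INTERNAL_HEADERS.iter().copied().collect()</code> and reused for every request.</p>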
            <p></p><p>Our first attempt used <code>std::collections::HashMap</code>, but there may be other data structures that better suit our needs. All computer science students were taught at some point that hash tables are great because they have constant-time asymptotic behavior, or O(1), for reading. (If you are not familiar with <a href="https://www.khanacademy.org/computing/computer-science/algorithms/asymptotic-notation/a/big-o-notation"><u>big O notation</u></a>, it is a way to express how an algorithm consumes a resource, in this case time, as the input size changes.) This means no matter how large the map gets, reads always take the same amount of time. Too bad this is only partially true. In order to read from a hash table, you have to compute the hash. Computing a hash for strings requires reading every byte, so while read time for a hashmap is constant over the table’s size, it’s linear over key length. So, our goal is to find a data structure that is better than O(L) where L is the length of the key.</p><p>There are a few common data structures whose read behavior meets our criteria. Sorted sets like <code>BTreeSet</code> use comparisons for searching, and that makes them logarithmic over key length <b>O(log(L))</b>, but they are also logarithmic in the size of the set. The net effect is that even very fast sorted sets like <a href="https://crates.io/crates/fst"><u>FST</u></a> work out to be a little (50 ns) slower in our benchmarks than the standard hashmap.</p><p>State machines like parsers and regex are another common tool for searching for strings, though it’s hard to consider them data structures. These systems work by accepting input one unit at a time and determining on each step whether or not to keep evaluating. Being able to make these determinations at every step means state machines are very fast to identify negative cases (i.e. when a string is not valid or not a match). 
This is perfect for us because only one or two headers per request on average will be internal. In fact, benchmarking an implementation of <code>clear_internal_headers</code> using regular expressions clocks in as taking about twice as long as the hashmap-based solution. This is impressively fast given that regexes, while powerful, aren't known for their raw speed. This approach feels promising – we just need something in between a data structure and a state machine. </p><p>That’s where the trie comes in.</p>
    <div>
      <h2>Don’t Just Trie</h2>
      <a href="#dont-just-trie">
        
      </a>
    </div>
    <p>A <a href="https://en.wikipedia.org/wiki/Trie"><u>trie</u></a> (pronounced like “try” or “tree”) is a type of <a href="https://en.wikipedia.org/wiki/Tree_(data_structure)"><u>tree data structure</u></a> normally used for prefix searches or auto-complete systems over a known set of strings. The structure of the trie lends itself to this because each node in the trie represents a substring of characters found in the initial set. The connections between the nodes represent the characters that can follow a prefix. Here is a small example of a trie built from the words: “and”, “ant”, “dad”, “do”, &amp; “dot”. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5wy48j3XNs9awxRNvjLljC/4e2a05b4e1802eba26f9e10e95bd843f/2442-4.png" />
          </figure><p>The root node represents an empty string prefix, so the two lettered edges directed out of it are the only letters that can appear as the first letter in the list of strings, “a” and “d”. Subsequent nodes have increasingly longer prefixes until the final valid words are reached. This layout should make it easy to see how a trie could be useful for quickly identifying strings that are not contained. Even at the root node, we can eliminate any strings that are presented that do not start with “a” or “d”. This paring down of the search space on every step gives reading from a trie the <b>O(log(L))</b> we were looking for … but only for misses. Hits within a trie are still <b>O(L)</b>, but that’s okay, because we are getting misses over 90% of the time.</p><p>Benchmarking a few trie implementations from <a href="https://crates.io/search?q=trie"><u>crates.io</u></a> was disheartening. Remember, most tries are used in response to keyboard events, so optimizing them to run in the hot path of tens of millions of requests per second is not a priority. The fastest existing implementation we found was <a href="https://crates.io/crates/radix_trie"><u>radix_trie</u></a>, but it still clocked in at a full microsecond slower than hashmap. The only thing left to do was write our own implementation of a trie that was optimized for our use case.</p>
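<p>To make the early-exit behavior concrete, here is a deliberately naive trie sketch built from the example words above. It is not how trie-hard or radix_trie are implemented (it allocates one <code>HashMap</code> per node, which is exactly the kind of overhead a hot path wants to avoid), and the type and method names are our own for illustration.</p>

```rust
use std::collections::HashMap;

/// A deliberately naive trie node: one heap-allocated map per node.
#[derive(Default)]
struct TrieNode {
    children: HashMap<u8, TrieNode>,
    is_word: bool,
}

#[derive(Default)]
struct Trie {
    root: TrieNode,
}

impl Trie {
    /// Builds a trie containing every word in `words`.
    fn from_words(words: &[&str]) -> Self {
        let mut trie = Trie::default();
        for word in words {
            let mut node = &mut trie.root;
            for &byte in word.as_bytes() {
                // Follow the edge for this byte, creating it if needed.
                node = node.children.entry(byte).or_default();
            }
            node.is_word = true;
        }
        trie
    }

    /// Returns true if `key` was inserted. A miss returns as soon as a byte
    /// has no matching edge, often after reading only the first byte.
    fn contains(&self, key: &str) -> bool {
        let mut node = &self.root;
        for byte in key.as_bytes() {
            match node.children.get(byte) {
                Some(child) => node = child,
                None => return false, // early exit on a miss
            }
        }
        // All bytes matched; it is a hit only if a word ends here.
        node.is_word
    }
}
```

<p>With <code>Trie::from_words(&amp;["and", "ant", "dad", "do", "dot"])</code>, a lookup for “cat” is rejected at the root after a single byte, while a hit such as “dot” has to read every byte of the key.</p>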
    <div>
      <h2>Trie Hard</h2>
      <a href="#trie-hard">
        
      </a>
    </div>
    <p>And we did! Today we are announcing <a href="https://github.com/cloudflare/trie-hard"><u>trie-hard</u></a>. The repository gives a full description of how it works, but the big takeaway is that it gets its speed from storing node relationships in the bits of unsigned integers and keeping the entire tree in a contiguous chunk of memory. In our benchmarks, we found that trie-hard reduced the average runtime for <code>clear_internal_headers</code> to under a microsecond (0.93µs). We can reuse the same formula from above to calculate the expected CPU utilization for trie-hard to be 1.71% * 0.93 / 3.65 = 0.43%. That means we have finally achieved and surpassed our goal by reducing the compute utilization of pingora-origin by 1.71% - 0.43% = <b>1.28%</b>! </p><p>Up until now we have been working only in theory and local benchmarking. What really matters is whether our benchmarking reflects real-life behavior. Trie-hard has been running in production since July 2024, and over the course of this project we have been collecting performance metrics from pingora-origin running in production using a statistical sampling of its stack trace over time. Using this technique, the CPU utilization percentage of a function is estimated by the percent of samples in which the function appears. 
If we compare the sampled performance of the different versions of <code>clear_internal_headers</code>, we can see that the results from the performance sampling closely match what our benchmarks predicted.</p><table><tr><th><p>Implementation</p></th><th><p>Stack trace samples containing <code>clear_internal_headers</code></p></th><th><p>Actual CPU Usage (%)</p></th><th><p>Predicted CPU Usage (%)</p></th></tr><tr><td><p>Original </p></td><td><p>19 / 1111</p></td><td><p>1.71</p></td><td><p>n/a</p></td></tr><tr><td><p>Hashmap</p></td><td><p>9 / 1103</p></td><td><p>0.82</p></td><td><p>0.72</p></td></tr><tr><td><p>trie-hard</p></td><td><p>4 / 1171</p></td><td><p>0.34</p></td><td><p>0.43</p></td></tr></table>
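<p>The “Actual CPU Usage” column can be reproduced from the sample counts with a one-line estimate. The helper below is a hypothetical illustration of that calculation, not the tooling we use in production.</p>

```rust
/// Estimates a function's CPU usage (in percent) as the fraction of
/// stack-trace samples in which the function appears.
/// (Hypothetical helper for illustration.)
fn sampled_cpu_pct(samples_with_fn: u32, total_samples: u32) -> f64 {
    100.0 * f64::from(samples_with_fn) / f64::from(total_samples)
}
```

<p>For example, <code>sampled_cpu_pct(19, 1111)</code> is about 1.71 and <code>sampled_cpu_pct(4, 1171)</code> is about 0.34, in line with the table.</p>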
    <div>
      <h2>Conclusion</h2>
      <a href="#conclusion">
        
      </a>
    </div>
    <p>Optimizing functions and writing new data structures is cool, but the real conclusion for this post is that knowing where your code is slow and by how much is more important than how you go about optimizing it. Take a moment to thank your observability team (if you're lucky enough to have one), and make use of flame graphs or any other profiling and benchmarking tool you can. Optimizing operations that are already measured in microseconds may seem a little silly, but these small improvements add up.</p> ]]></content:encoded>
            <category><![CDATA[Internet Performance]]></category>
            <category><![CDATA[Rust]]></category>
            <category><![CDATA[Open Source]]></category>
            <category><![CDATA[Optimization]]></category>
            <category><![CDATA[Pingora]]></category>
            <guid isPermaLink="false">2CqKLNS1jaf5H2j99sDONe</guid>
            <dc:creator>Kevin Guthrie</dc:creator>
        </item>
        <item>
            <title><![CDATA[Open sourcing Pingora: our Rust framework for building programmable network services]]></title>
            <link>https://blog.cloudflare.com/pingora-open-source/</link>
            <pubDate>Wed, 28 Feb 2024 15:00:11 GMT</pubDate>
            <description><![CDATA[ Pingora, our framework for building programmable and memory-safe network services, is now open source. Get started using Pingora today ]]></description>
            <content:encoded><![CDATA[ <p></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7aQBSJRQlM3b1ZRdvJycqY/72f9bd908abc139faba41716074d69d5/Rock-crab-pingora-open-source-mascot.png" />
            
            </figure><p>Today, we are proud to open source Pingora, the Rust framework we have been using to build services that power a significant portion of the traffic on Cloudflare. Pingora is <a href="https://github.com/cloudflare/pingora">released</a> under the <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache License version 2.0</a>.</p><p>As mentioned in our previous blog post, <a href="/how-we-built-pingora-the-proxy-that-connects-cloudflare-to-the-internet">Pingora</a> is a Rust async multithreaded framework that assists us in constructing HTTP proxy services. Since our last blog post, Pingora has handled nearly a quadrillion Internet requests across our global network.</p><p>We are open sourcing Pingora to help build a better and more secure Internet beyond our own infrastructure. We want to provide tools, ideas, and inspiration to our customers, users, and others to build their own Internet infrastructure using a memory safe framework. Having such a framework is especially crucial given the increasing awareness of the importance of memory safety across the <a href="https://www.theregister.com/2022/09/20/rust_microsoft_c/">industry</a> and the <a href="https://www.whitehouse.gov/oncd/briefing-room/2024/02/26/press-release-technical-report/">US government</a>. Under this common goal, we are collaborating with the <a href="https://www.abetterinternet.org/">Internet Security Research Group</a> (ISRG) <a href="https://www.memorysafety.org/blog/introducing-river">Prossimo project</a> to help advance the adoption of Pingora in the Internet’s most critical infrastructure.</p><p>In our <a href="/how-we-built-pingora-the-proxy-that-connects-cloudflare-to-the-internet">previous blog post</a>, we discussed why and how we built Pingora. In this one, we will talk about why and how you might use Pingora.</p><p>Pingora provides building blocks for not only proxies but also clients and servers. 
Along with these components, we also provide a few utility libraries that implement common logic such as <a href="/how-pingora-keeps-count/">event counting</a>, error handling, and caching.</p>
    <div>
      <h3>What’s in the box</h3>
      <a href="#whats-in-the-box">
        
      </a>
    </div>
    <p>Pingora provides libraries and APIs to build services on top of HTTP/1 and HTTP/2, TLS, or just TCP/UDS. As a proxy, it supports HTTP/1 and HTTP/2 end-to-end, gRPC, and websocket proxying. (HTTP/3 support is on the roadmap.) It also comes with customizable load balancing and failover strategies. For compliance and security, it supports both the commonly used OpenSSL and BoringSSL libraries, which come with FIPS compliance and <a href="https://pq.cloudflareresearch.com/">post-quantum crypto</a>.</p><p>Besides providing these features, Pingora provides filters and callbacks to allow its users to fully customize how the service should process, transform and forward the requests. These APIs will be especially familiar to OpenResty and NGINX users, as many map intuitively onto OpenResty's "*_by_lua" callbacks.</p><p>Operationally, Pingora provides zero downtime graceful restarts to upgrade itself without dropping a single incoming request. Syslog, Prometheus, Sentry, OpenTelemetry and other must-have observability tools are also easily integrated with Pingora.</p>
    <div>
      <h3>Who can benefit from Pingora</h3>
      <a href="#who-can-benefit-from-pingora">
        
      </a>
    </div>
    <p>You should consider Pingora if:</p><p><b>Security is your top priority:</b> Pingora is a more memory safe alternative for services that are written in C/C++. While some might argue about memory safety among programming languages, from our practical experience, we find ourselves way less likely to make coding mistakes that lead to memory safety issues. Besides, as we spend less time struggling with these issues, we are more productive implementing new features.</p><p><b>Your service is performance-sensitive:</b> Pingora is fast and efficient. As explained in our previous blog post, we saved a lot of CPU and memory resources thanks to Pingora’s multi-threaded architecture. The saving in time and resources could be compelling for workloads that are sensitive to the cost and/or the speed of the system.</p><p><b>Your service requires extensive customization:</b> The APIs that the Pingora proxy framework provides are highly programmable. For users who wish to build a customized and advanced gateway or load balancer, Pingora provides powerful yet simple ways to implement it. We provide examples in the next section.</p>
    <div>
      <h2>Let’s build a load balancer</h2>
      <a href="#lets-build-a-load-balancer">
        
      </a>
    </div>
    <p>Let's explore Pingora's programmable API by building a simple load balancer. The load balancer will select between <a href="https://1.1.1.1/">https://1.1.1.1/</a> and <a href="https://1.0.0.1/">https://1.0.0.1/</a> to be the upstream in a round-robin fashion.</p><p>First let’s create a blank HTTP proxy.</p>
            <pre><code>pub struct LB();

#[async_trait]
impl ProxyHttp for LB {
    async fn upstream_peer(...) -&gt; Result&lt;Box&lt;HttpPeer&gt;&gt; {
        todo!()
    }
}</code></pre>
            <p>Any object that implements the <code>ProxyHttp</code> trait (similar to the concept of an interface in C++ or Java) is an HTTP proxy. The only required method there is <code>upstream_peer()</code>, which is called for every request. This function should return an <code>HttpPeer</code> which contains the origin IP to connect to and how to connect to it.</p><p>Next let’s implement the round-robin selection. The Pingora framework already provides the <code>LoadBalancer</code> with common selection algorithms such as round robin and hashing, so let’s just use it. If the use case requires more sophisticated or customized server selection logic, users can simply implement it themselves in this function.</p>
            <pre><code>pub struct LB(Arc&lt;LoadBalancer&lt;RoundRobin&gt;&gt;);

#[async_trait]
impl ProxyHttp for LB {
    async fn upstream_peer(...) -&gt; Result&lt;Box&lt;HttpPeer&gt;&gt; {
        let upstream = self.0
            .select(b"", 256) // hash doesn't matter for round robin
            .unwrap();

        // Set SNI to one.one.one.one
        let peer = Box::new(HttpPeer::new(upstream, true, "one.one.one.one".to_string()));
        Ok(peer)
    }
}</code></pre>
            <p>Since we are connecting to an HTTPS server, the SNI also needs to be set. Certificates, timeouts, and other connection options can also be set here in the HttpPeer object if needed.</p><p>Finally, let's put the service in action. In this example we hardcode the origin server IPs. In real-life workloads, the origin server IPs can also be discovered dynamically when <code>upstream_peer()</code> is called or in the background. We first create a Pingora server, which is the process that runs the load balancing service. We then create the LB service from the server's configuration, tell it to listen on 127.0.0.1:6188, and add it to the server.</p>
            <pre><code>fn main() {
    let upstreams = LoadBalancer::try_from_iter(["1.1.1.1:443", "1.0.0.1:443"]).unwrap();

    // Create the server first: the proxy service below needs its configuration.
    let mut my_server = Server::new(None).unwrap();
    my_server.bootstrap();

    // LB wraps the load balancer in an Arc, matching the struct definition above.
    let mut lb = pingora_proxy::http_proxy_service(&amp;my_server.configuration, LB(Arc::new(upstreams)));
    lb.add_tcp("127.0.0.1:6188");

    my_server.add_service(lb);
    my_server.run_forever();
}</code></pre>
            <p>Let’s try it out:</p>
            <pre><code>curl 127.0.0.1:6188 -svo /dev/null
&gt; GET / HTTP/1.1
&gt; Host: 127.0.0.1:6188
&gt; User-Agent: curl/7.88.1
&gt; Accept: */*
&gt; 
&lt; HTTP/1.1 403 Forbidden</code></pre>
            <p>We can see that the proxy is working, but the origin server rejects us with a 403. This is because our service simply proxies the Host header, 127.0.0.1:6188, set by curl, which upsets the origin server. How do we make the proxy correct that? This can be done by adding another filter called <code>upstream_request_filter</code>. This filter runs on every request after the connection to the origin server is established and before any HTTP request is sent. We can add, remove or change HTTP request headers in this filter.</p>
            <pre><code>async fn upstream_request_filter(…, upstream_request: &amp;mut RequestHeader, …) -&gt; Result&lt;()&gt; {
    upstream_request.insert_header("Host", "one.one.one.one")
}</code></pre>
            <p>Let’s try again:</p>
            <pre><code>curl 127.0.0.1:6188 -svo /dev/null
&lt; HTTP/1.1 200 OK</code></pre>
            <p>This time it works! The complete example can be found <a href="https://github.com/cloudflare/pingora/blob/main/pingora-proxy/examples/load_balancer.rs">here</a>.</p><p>Below is a very simple diagram of how this request flows through the callback and filter we used in this example. The Pingora proxy framework currently provides <a href="https://github.com/cloudflare/pingora/blob/main/docs/user_guide/phase.md">more filters</a> and callbacks at different stages of a request to allow users to modify, reject, route and/or log the request (and response).</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/WvHfhONcYEEL4kPRwGzKp/f2456d177e727063a49265eea831b8af/Flow_Diagram.png" />
            
            </figure><p>Behind the scenes, the Pingora proxy framework takes care of connection pooling, TLS handshakes, reading, writing, parsing requests and any other common proxy tasks so that users can focus on logic that matters to them.</p>
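<p>To give some intuition for the round-robin selection used in the example above, here is a minimal standalone sketch built on an atomic counter. It is not Pingora's <code>LoadBalancer</code> implementation, and the <code>SimpleRoundRobin</code> type is a hypothetical name introduced only for this illustration.</p>

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// A toy round-robin selector (hypothetical; not Pingora's LoadBalancer).
struct SimpleRoundRobin {
    backends: Vec<String>,
    next: AtomicUsize,
}

impl SimpleRoundRobin {
    fn new(backends: Vec<String>) -> Self {
        Self { backends, next: AtomicUsize::new(0) }
    }

    /// Returns the next backend in rotation, or None if the list is empty.
    /// The atomic counter lets many threads call this through a shared reference.
    fn select(&self) -> Option<&str> {
        if self.backends.is_empty() {
            return None;
        }
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.backends.len();
        Some(self.backends[i].as_str())
    }
}
```

<p>With the two upstreams from the example, successive calls to <code>select()</code> alternate between <code>1.1.1.1:443</code> and <code>1.0.0.1:443</code>.</p>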
    <div>
      <h2>Open source, present and future</h2>
      <a href="#open-source-present-and-future">
        
      </a>
    </div>
    <p>Pingora is a library and toolset, not an executable binary. In other words, Pingora is the engine that powers a car, not the car itself. Although Pingora is production-ready for industry use, we understand a lot of folks want a batteries-included, ready-to-go web service with low or no-code config options. Building that application on top of Pingora will be the focus of our collaboration with the ISRG to expand Pingora's reach. Stay tuned for future announcements on that project.</p><p>Other caveats to keep in mind:</p><ul><li><p><b>Today, API stability is not guaranteed.</b> Although we will try to minimize how often we make breaking changes, we still reserve the right to add, remove, or change components such as request and response filters as the library evolves, especially during this pre-1.0 period.</p></li><li><p><b>Support for non-Unix based operating systems is not currently on the roadmap.</b> We have no immediate plans to support these systems, though this could change in the future.</p></li></ul>
    <div>
      <h2>How to contribute</h2>
      <a href="#how-to-contribute">
        
      </a>
    </div>
    <p>Feel free to raise bug reports, documentation issues, or feature requests in our GitHub <a href="https://github.com/cloudflare/pingora/issues">issue tracker</a>. Before opening a pull request, we strongly suggest you take a look at our <a href="https://github.com/cloudflare/pingora/blob/main/.github/CONTRIBUTING.md">contribution guide</a>.</p>
    <div>
      <h2>Conclusion</h2>
      <a href="#conclusion">
        
      </a>
    </div>
    <p>In this blog we announced the open source of our Pingora framework. We showed that Internet entities and infrastructure can benefit from Pingora’s security, performance and customizability. We also demonstrated how easy it is to use Pingora and how customizable it is.</p><p>Whether you're building production web services or experimenting with network technologies we hope you find value in Pingora. It's been a long journey, but sharing this project with the open source community has been a goal from the start. We'd like to thank the Rust community as Pingora is built with many great open-sourced Rust crates. Moving to a memory safe Internet may feel like an impossible journey, but it's one we hope you join us on.</p> ]]></content:encoded>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Rust]]></category>
            <category><![CDATA[Open Source]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Pingora]]></category>
            <guid isPermaLink="false">38GzyqaZUYTF8fJFAiwpfx</guid>
            <dc:creator>Yuchen Wu</dc:creator>
            <dc:creator>Edward Wang</dc:creator>
            <dc:creator>Andrew Hauck</dc:creator>
        </item>
        <item>
            <title><![CDATA[How Pingora keeps count]]></title>
            <link>https://blog.cloudflare.com/how-pingora-keeps-count/</link>
            <pubDate>Fri, 12 May 2023 13:00:56 GMT</pubDate>
            <description><![CDATA[ In this blog post, we explain and open source the counting algorithm that powers Pingora. This will be the first of a series of blog posts that share both the Pingora libraries and the ideas behind  ]]></description>
            <content:encoded><![CDATA[ <p></p><p>A while ago we shared how we replaced NGINX with our in-house proxy, <a href="/how-we-built-pingora-the-proxy-that-connects-cloudflare-to-the-internet/">Pingora</a>. We promised to share more technical details as well as our open sourcing plan. This blog post will be the first of a series that shares both the code libraries that power Pingora and the ideas behind them.</p><p>Today, we take a look at one of Pingora’s libraries: pingora-limits.</p><p>pingora-limits provides the functionality to count inflight events and estimate the rate of events over time. These functions are commonly used to protect infrastructure and services from being overwhelmed by certain types of malicious or misbehaving requests.</p><p>For example, when an origin server becomes slow or unresponsive, requests will accumulate on our servers, which adds pressure on both our servers and our customers’ servers. With this library, we are able to identify which origins have issues, so that action can be taken without affecting other traffic.</p><p>The problem can be abstracted in a very simple way. The input is a (never ending) stream of different types of events. At any point, the system should be able to tell the number of appearances (or the rate) of a certain type of event.</p><p>In a simple example, colors are used as the type of event. The following is one possible example of a sequence of events:</p><p><code>red, blue, red, orange, green, brown, red, blue,...</code></p><p>In this example, the system should report that “red” appears three times.</p><p>The corresponding algorithms are straightforward to design. One obvious answer is to use a hash table, where the keys are the colors and the values are their corresponding appearances. Whenever a new event appears, the algorithm looks up the hash table and increases the appearance counter. 
It is not hard to tell that this algorithm’s time complexity is O(1) (per event) and the space complexity O(n) where n is the number of the types of events.</p>
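<p>The hash table approach described above can be sketched in a few lines. This is a plain, single-threaded illustration with a function name of our own choosing, not code from pingora-limits.</p>

```rust
use std::collections::HashMap;

/// Counts how many times each event type appears in the stream.
/// O(1) expected time per event, O(n) space for n distinct event types.
fn count_events<'a>(events: &[&'a str]) -> HashMap<&'a str, u64> {
    let mut counts = HashMap::new();
    for &event in events {
        // Look up the event's counter (creating it at 0) and increase it.
        *counts.entry(event).or_insert(0) += 1;
    }
    counts
}
```

<p>Feeding it the sequence <code>red, blue, red, orange, green, brown, red, blue</code> yields a count of 3 for “red”, as in the example.</p>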
    <div>
      <h2>How Pingora does it</h2>
      <a href="#how-pingora-does-it">
        
      </a>
    </div>
    <p>The hash table solution is fine in common scenarios, but we believe there are a few things that can be improved.</p><ul><li><p>We observe traffic to millions of different servers when the misbehaving ones are only a few at a given time. It seems a bit wasteful to require space (memory) that holds the counter for all the keys.</p></li><li><p>Concurrently updating the hash table (especially when adding new keys) requires a lock. This behavior potentially forces all concurrent event processing to go through our system serialized. In other words, when lock contention is severe, the lock slows down the system.</p></li></ul><p>The motivation to improve the above algorithm is even stronger considering such algorithms need to be deployed at scale. This algorithm operates on tens of thousands of machines. It handles more than twenty million requests per second. The benefits of efficiency improvement can be significant.</p><p>pingora-limits adopts a different approach: <a href="https://en.wikipedia.org/wiki/Count%E2%80%93min_sketch">count–min sketch</a> (CM sketch) estimation. CM sketch estimates the counts of events in O(1) (per event) but only using O(log(n)) of space (polylogarithmic, to be precise, more details <a href="https://dsf.berkeley.edu/cs286/papers/countmin-latin2004.pdf">here</a>). Because of the simplicity of this algorithm, which we will discuss in a bit, it can be implemented without locks. Therefore, pingora-limits runs much faster and more efficiently compared to the hash table approach discussed earlier.</p>
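<p>As a rough illustration of why this can be done without locks, here is a minimal count–min sketch built on atomic counters. It is a sketch under stated assumptions, not the pingora-limits implementation: the type and method names are hypothetical, and the per-row hash functions are approximated by hashing the row index together with the key using the standard library's <code>DefaultHasher</code>, which does not give the independence guarantees the analysis assumes.</p>

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};

/// A toy count-min sketch (hypothetical; not the pingora-limits code).
struct CountMinSketch {
    rows: usize,               // H: number of hash functions
    cols: usize,               // N: counters per row
    counters: Vec<AtomicU64>,  // rows * cols counters, row-major
}

impl CountMinSketch {
    fn new(rows: usize, cols: usize) -> Self {
        let counters = (0..rows * cols).map(|_| AtomicU64::new(0)).collect();
        Self { rows, cols, counters }
    }

    /// Index of the counter that `key` maps to in `row`.
    /// Assumption: mixing the row index into the hash stands in for
    /// having one independent hash function per row.
    fn slot(&self, row: usize, key: &str) -> usize {
        let mut hasher = DefaultHasher::new();
        (row, key).hash(&mut hasher);
        row * self.cols + (hasher.finish() as usize) % self.cols
    }

    /// Records one event: a single atomic add per row, no locks.
    fn incr(&self, key: &str) {
        for row in 0..self.rows {
            self.counters[self.slot(row, key)].fetch_add(1, Ordering::Relaxed);
        }
    }

    /// Estimated count: the minimum across rows. Collisions can only
    /// inflate a counter, so the estimate never undercounts.
    fn estimate(&self, key: &str) -> u64 {
        (0..self.rows)
            .map(|row| self.counters[self.slot(row, key)].load(Ordering::Relaxed))
            .min()
            .unwrap_or(0)
    }
}
```

<p>Because <code>incr</code> takes <code>&amp;self</code>, the sketch can be shared across threads (for example behind an <code>Arc</code>) and updated concurrently. The price is that estimates may be too high when keys collide in every row.</p>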
    <div>
      <h3>CM sketch</h3>
      <a href="#cm-sketch">
        
      </a>
    </div>
    <p>The idea of a CM sketch is similar to a Bloom filter. The mathematical details of the CM sketch can be found in <a href="http://dimacs.rutgers.edu/~graham/pubs/papers/cmencyc.pdf">this paper</a>. In this section, we will just illustrate how it works.</p><p>A CM sketch data structure takes two parameters, H: the number of hashes (rows), and N: the number of counters (columns) per hash (row). The rows and columns form a matrix. The space they take is H*N. Each row has its own <a href="https://en.wikipedia.org/wiki/K-independent_hashing">independent</a> hash function (hash_i()).</p><p>For this example, we use H=3 and N=4:</p>
<table>
<thead>
  <tr>
    <th><span>0</span></th>
    <th><span>0</span></th>
    <th><span>0</span></th>
    <th><span>0</span></th>
  </tr>
</thead>
<tbody>
  <tr>
    <td><span>0</span></td>
    <td><span>0</span></td>
    <td><span>0</span></td>
    <td><span>0</span></td>
  </tr>
  <tr>
    <td><span>0</span></td>
    <td><span>0</span></td>
    <td><span>0</span></td>
    <td><span>0</span></td>
  </tr>
</tbody>
</table><p>When an event, "red", arrives, it is counted by every row independently. Each row will use its own hashing function ( hash_i(“red”) ) to choose a column. The counter of the column is increased without <b>worrying about collisions</b> (see the end of this section).</p><p>The table below illustrates a possible state of the matrix after a single “red” event:</p>
<table>
<thead>
  <tr>
    <th><span>0</span></th>
    <th><span>1</span></th>
    <th><span>0</span></th>
    <th><span>0</span></th>
  </tr>
</thead>
<tbody>
  <tr>
    <td><span>0</span></td>
    <td><span>0</span></td>
    <td><span>1</span></td>
    <td><span>0</span></td>
  </tr>
  <tr>
    <td><span>1</span></td>
    <td><span>0</span></td>
    <td><span>0</span></td>
    <td><span>0</span></td>
  </tr>
</tbody>
</table><p>Then, let’s assume the event "blue" arrives, and we assume it collides with "red" at row 2: both hash to the third slot:</p>
<table>
<thead>
  <tr>
    <th><span>1</span></th>
    <th><span>1</span></th>
    <th><span>0</span></th>
    <th><span>0</span></th>
  </tr>
</thead>
<tbody>
  <tr>
    <td><span>0</span></td>
    <td><span>0</span></td>
    <td><span>2</span></td>
    <td><span>0</span></td>
  </tr>
  <tr>
    <td><span>1</span></td>
    <td><span>0</span></td>
    <td><span>0</span></td>
    <td><span>1</span></td>
  </tr>
</tbody>
</table><p>Now suppose another series of events arrives: “blue, red, red, red, blue, red”. So far the algorithm has observed 5 “red”s and 3 “blue”s in total. Following the algorithm, the estimator becomes:</p>
<table>
<thead>
  <tr>
    <th><span>3</span></th>
    <th><span>5</span></th>
    <th><span>0</span></th>
    <th><span>0</span></th>
  </tr>
</thead>
<tbody>
  <tr>
    <td><span>0</span></td>
    <td><span>0</span></td>
    <td><span>8</span></td>
    <td><span>0</span></td>
  </tr>
  <tr>
    <td><span>5</span></td>
    <td><span>0</span></td>
    <td><span>0</span></td>
    <td><span>3</span></td>
  </tr>
</tbody>
</table><p>Now, let’s see how the matrix reports the occurrence of each event. To retrieve the count of a key, the estimator returns the minimum of the counters that key maps to, one per row. So the count of red is min(5, 8, 5) = 5 and blue is min(3, 8, 3) = 3.</p><p>This algorithm chooses the cells with the fewest collisions (via the min() operation). Therefore, collisions between events in single cells are acceptable: as long as at least one cell for a given type of event is collision free, the count for that event is accurate.</p><p>The estimator can overestimate when two (or more) keys collide on all slots. Assuming there are only two keys and the hash functions are uniform and independent, the probability of their total collision is 1/N^H (1/64 in this example). On the other hand, it never underestimates because it never loses count of any events.</p>
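    <p>As a rough illustration of the walkthrough above, the sketch below replays the same event stream on a 3x4 matrix. The columns() lookup is a hypothetical stand-in for hash_i(): it hard-codes the slots shown in the tables so the outcome is reproducible, whereas a real CM sketch would hash the key once per row. None of these names come from pingora-limits.</p>

```rust
/// A tiny CM sketch with H = 3 rows and N = 4 columns, as in the example.
const H: usize = 3;
const N: usize = 4;

pub struct Sketch {
    rows: [[u64; N]; H],
}

/// Stand-in for hash_i(key) % N. The columns are hard-coded to match the
/// tables in the text; this is an assumption made for reproducibility.
fn columns(key: &str) -> [usize; H] {
    match key {
        "red" => [1, 2, 0],
        "blue" => [0, 2, 3], // collides with "red" in row 2 (third slot)
        _ => [0, 0, 0],
    }
}

impl Sketch {
    pub fn new() -> Self {
        Sketch { rows: [[0; N]; H] }
    }

    /// Count one occurrence of `key`: bump one counter in every row.
    pub fn incr(&mut self, key: &str) {
        for (row, col) in self.rows.iter_mut().zip(columns(key)) {
            row[col] += 1;
        }
    }

    /// Estimate the count of `key`: the minimum over its counters.
    pub fn estimate(&self, key: &str) -> u64 {
        self.rows
            .iter()
            .zip(columns(key))
            .map(|(row, col)| row[col])
            .min()
            .unwrap_or(0)
    }
}

/// Replays the full event stream from the walkthrough: "red", "blue",
/// then "blue, red, red, red, blue, red".
pub fn replay() -> Sketch {
    let mut sketch = Sketch::new();
    for event in ["red", "blue", "blue", "red", "red", "red", "blue", "red"] {
        sketch.incr(event);
    }
    sketch
}
```

    <p>After replay(), the shared cell in row 2 holds 8, but the minimum over the three rows still recovers 5 for red and 3 for blue, because each key has at least one collision free cell.</p>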
    <div>
      <h3>Practical implementation</h3>
      <a href="#practical-implementation">
        
      </a>
    </div>
    <p>Because the algorithm only requires hashing, array indexing and counter increments, it can be implemented lock-free in a few lines of code.</p><p>The following is a code snippet of how it is implemented in Rust.</p>
            <pre><code>pub struct Estimator {
    estimator: Box&lt;[(Box&lt;[AtomicIsize]&gt;, RandomState)]&gt;,
}
 
impl Estimator {
    /// Increment `key` by the value given. Return the new estimated value as a result.
    pub fn incr&lt;T: Hash&gt;(&amp;self, key: T, value: isize) -&gt; isize {
        let mut min = isize::MAX;
        for (slot, hasher) in self.estimator.iter() {
            let hash = hash(&amp;key, hasher) as usize;
            let counter = &amp;slot[hash % slot.len()];
            let current = counter.fetch_add(value, Ordering::Relaxed);
            min = std::cmp::min(min, current + value);
        }
        min
    }
}</code></pre>
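    <p>The snippet above leaves out the constructor and the hash() helper. The following is a self-contained sketch that compiles with the standard library only. It assumes std's RandomState as the per-row hasher state, and the new() and get() methods are written here for illustration; they may not match the crate's actual signatures.</p>

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hash, Hasher};
use std::sync::atomic::{AtomicIsize, Ordering};

/// Hash `key` with the hasher state that belongs to one row.
fn hash<T: Hash>(key: &T, state: &RandomState) -> u64 {
    let mut hasher = state.build_hasher();
    key.hash(&mut hasher);
    hasher.finish()
}

pub struct Estimator {
    estimator: Box<[(Box<[AtomicIsize]>, RandomState)]>,
}

impl Estimator {
    /// Create `hashes` rows of `slots` counters each (both must be > 0).
    /// Every row gets its own randomly keyed hasher state.
    pub fn new(hashes: usize, slots: usize) -> Self {
        let estimator = (0..hashes)
            .map(|_| {
                let row: Box<[AtomicIsize]> =
                    (0..slots).map(|_| AtomicIsize::new(0)).collect();
                (row, RandomState::new())
            })
            .collect();
        Estimator { estimator }
    }

    /// Increment `key` by `value` and return the new estimated value.
    pub fn incr<T: Hash>(&self, key: T, value: isize) -> isize {
        let mut min = isize::MAX;
        for (slot, state) in self.estimator.iter() {
            let idx = hash(&key, state) as usize % slot.len();
            let current = slot[idx].fetch_add(value, Ordering::Relaxed);
            min = std::cmp::min(min, current + value);
        }
        min
    }

    /// Return the current estimate for `key` without changing it.
    pub fn get<T: Hash>(&self, key: T) -> isize {
        let mut min = isize::MAX;
        for (slot, state) in self.estimator.iter() {
            let idx = hash(&key, state) as usize % slot.len();
            min = std::cmp::min(min, slot[idx].load(Ordering::Relaxed));
        }
        min
    }
}
```

    <p>Both methods take &amp;self rather than &amp;mut self, so one Estimator can be shared across threads (for example behind an Arc) with no lock around it.</p>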
            
    <div>
      <h3>Performance</h3>
      <a href="#performance">
        
      </a>
    </div>
    <p>We compare the design above with two hash table based approaches.</p><ol><li><p>naive: Mutex&lt;HashMap&lt;u32, usize&gt;&gt;. This is the simple hash table approach mentioned above. This design requires a lock on every operation.</p></li><li><p>optimized: DashMap&lt;u32, AtomicUsize&gt;. <a href="https://docs.rs/dashmap/latest/dashmap/">DashMap</a> leverages multiple hash tables in order to shard the keys and reduce contention across different keys. We also use atomic counters here so that counting existing keys won't need a write lock.</p></li></ol><p>We have two test cases, one that is single threaded and another that is multi-threaded. In both cases, we have one million keys. We generate 100 million events from the keys. The keys are uniformly distributed among the events.</p><p>The results below were measured on a Debian VM running on an M1 MacBook Pro.</p><p><b>Speed</b></p><p>Per event (the incr() function above) timing, lower is better:</p>
<table>
<thead>
  <tr>
    <th></th>
    <th><span>pingora-limits</span></th>
    <th><span>naive</span></th>
    <th><span>optimized</span></th>
  </tr>
</thead>
<tbody>
  <tr>
    <td><span>Single thread</span></td>
    <td><span>10ns</span></td>
    <td><span>51ns</span></td>
    <td><span>43ns</span></td>
  </tr>
  <tr>
    <td><span>Eight threads</span></td>
    <td><span>212ns</span></td>
    <td><span>1505ns</span></td>
    <td><span>212ns</span></td>
  </tr>
</tbody>
</table><p>In the single thread case, where there is no lock contention, our approach is 5x faster than the naive one and 4x faster than the optimized one. With multiple threads, there is a high amount of contention. Our approach is on par with the optimized version. Both are 7x faster than the naive one. The performance of pingora-limits and the optimized hash table is similar because in both approaches the hot path is just updating an atomic counter.</p><p><b>Memory consumption</b></p><p>Lower is better. The numbers are collected only from the single threaded test cases for simplicity.</p>
<table>
<thead>
  <tr>
    <th></th>
    <th><span>peak memory bytes</span></th>
    <th><span>total allocations</span></th>
    <th><span>total allocated bytes</span></th>
  </tr>
</thead>
<tbody>
  <tr>
    <td><span>pingora-limits</span></td>
    <td><span>26,184</span></td>
    <td><span>9</span></td>
    <td><span>26,184</span></td>
  </tr>
  <tr>
    <td><span>naive</span></td>
    <td><span>53,477,392</span></td>
    <td><span>20</span></td>
    <td><span>71,303,260</span></td>
  </tr>
  <tr>
    <td><span>optimized</span></td>
    <td><span>36,211,208</span></td>
    <td><span>491</span></td>
    <td><span>71,307,722</span></td>
  </tr>
</tbody>
</table><p>At peak, pingora-limits requires roughly 1/2000 of the memory of the naive approach and roughly 1/1400 of the memory of the optimized one.</p><p>From the data above, pingora-limits is both CPU and memory efficient.</p><p>The estimator provided by pingora-limits is a biased estimator because it is possible for it to overestimate the appearance of events.</p><p>Even when accurate counting is required and false positives are absolutely unacceptable, pingora-limits can still be very useful. It can work as a first stage filter where only the events beyond a certain threshold are fed to a hash table to perform accurate counting. In this case, the majority of low frequency event types are filtered out by the first stage, so the hash table also consumes little memory without losing any accuracy.</p>
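    <p>The two-stage idea can be sketched roughly as follows. This is not part of pingora-limits: TwoStageCounter, observe() and tracked() are hypothetical names, the first stage is a plain single-threaded CM sketch, and seeding std's DefaultHasher with the row index is a simplification standing in for independent hash functions.</p>

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// First stage: CM sketch. Second stage: exact hash table that only sees
/// events whose estimate has passed `threshold`.
pub struct TwoStageCounter {
    rows: Vec<Vec<u64>>,
    threshold: u64,
    exact: HashMap<String, u64>,
}

impl TwoStageCounter {
    /// `hashes` and `slots` must both be greater than zero.
    pub fn new(hashes: usize, slots: usize, threshold: u64) -> Self {
        TwoStageCounter {
            rows: vec![vec![0; slots]; hashes],
            threshold,
            exact: HashMap::new(),
        }
    }

    /// Column of `key` in row `row`; the row index seeds the hash.
    fn slot(row: usize, key: &str, slots: usize) -> usize {
        let mut h = DefaultHasher::new();
        row.hash(&mut h);
        key.hash(&mut h);
        h.finish() as usize % slots
    }

    /// Record one event. Returns None while the key is still filtered out,
    /// or Some(n) where n is the exact number of events seen for this key
    /// since its estimate went beyond the threshold.
    pub fn observe(&mut self, key: &str) -> Option<u64> {
        let mut estimate = u64::MAX;
        for (i, row) in self.rows.iter_mut().enumerate() {
            let col = Self::slot(i, key, row.len());
            row[col] += 1;
            estimate = estimate.min(row[col]);
        }
        if estimate <= self.threshold {
            return None; // low frequency so far: never touches the table
        }
        let n = self.exact.entry(key.to_string()).or_insert(0);
        *n += 1;
        Some(*n)
    }

    /// Number of keys that made it into the exact table.
    pub fn tracked(&self) -> usize {
        self.exact.len()
    }
}
```

    <p>Because the sketch never underestimates, a key whose true count exceeds the threshold always reaches the hash table. A collision can only cause a low frequency key to be tracked early, which costs a little memory but not correctness.</p>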
    <div>
      <h2>How it is used in production</h2>
      <a href="#how-it-is-used-in-production">
        
      </a>
    </div>
    <p>In production, Pingora uses this library in a few places. The most common one is the connection limit feature. When our servers try to establish too many connections to a single origin server, this feature starts rejecting new requests with <a href="https://developers.cloudflare.com/support/troubleshooting/cloudflare-errors/troubleshooting-cloudflare-5xx-errors/#error-503-service-temporarily-unavailable">503 errors</a> in order to protect the server and our infrastructure from becoming overloaded.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/WEm9dkrBXLof8baG7ph9e/8ac9205ac8c47ec1a89fabbab848f2ae/image2-4.png" />
            
            </figure><p>In this feature every incoming request increases a counter, shared by all other requests with the same customer ID, server IP and the server hostname. When the request finishes, the counter decreases accordingly. If the value of the counter is beyond a certain threshold, the request is rejected with a 503 error response. In our production environment we choose the parameters of the library so that a theoretical collision chance between two unrelated customers is about 1 / 2 ^ 52. Additionally, the rejection threshold is significantly higher than what a healthy customer’s traffic would reach. Therefore, even if multiple customers’ counters collide, it is not likely that the overestimated value would reach the threshold. So a false positive on the connection limit is not likely to happen.</p>
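    <p>A minimal sketch of this counting scheme, under stated assumptions: ConnectionLimiter, OriginKey and Permit are hypothetical names, the sketch dimensions and threshold are caller-supplied, and undoing the increment for a rejected request is our guess at reasonable behavior rather than something described above. It is not the production code.</p>

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hash, Hasher};
use std::net::IpAddr;
use std::sync::atomic::{AtomicIsize, Ordering};

/// Requests with the same customer ID, server IP and hostname share a counter.
#[derive(Hash, Clone)]
pub struct OriginKey {
    pub customer_id: u64,
    pub server_ip: IpAddr,
    pub hostname: String,
}

/// Lock-free CM sketch of in-flight requests plus a rejection threshold.
pub struct ConnectionLimiter {
    rows: Vec<(Vec<AtomicIsize>, RandomState)>,
    threshold: isize,
}

/// Held while a request is in flight; dropping it decrements the counter.
pub struct Permit<'a> {
    limiter: &'a ConnectionLimiter,
    key: OriginKey,
}

impl ConnectionLimiter {
    /// `hashes` and `slots` must both be greater than zero.
    pub fn new(hashes: usize, slots: usize, threshold: isize) -> Self {
        let rows = (0..hashes)
            .map(|_| {
                let row: Vec<AtomicIsize> =
                    (0..slots).map(|_| AtomicIsize::new(0)).collect();
                (row, RandomState::new())
            })
            .collect();
        ConnectionLimiter { rows, threshold }
    }

    /// Add `delta` to every row's counter for `key`; return the new estimate.
    fn add(&self, key: &OriginKey, delta: isize) -> isize {
        let mut lowest = isize::MAX;
        for (row, state) in &self.rows {
            let mut h = state.build_hasher();
            key.hash(&mut h);
            let counter = &row[h.finish() as usize % row.len()];
            lowest = lowest.min(counter.fetch_add(delta, Ordering::Relaxed) + delta);
        }
        lowest
    }

    /// Admit a request, or return Err(503) if the estimate is beyond the
    /// threshold. A rejected request gives its increment back (assumption).
    pub fn admit(&self, key: OriginKey) -> Result<Permit<'_>, u16> {
        if self.add(&key, 1) > self.threshold {
            self.add(&key, -1);
            return Err(503);
        }
        Ok(Permit { limiter: self, key })
    }
}

impl Drop for Permit<'_> {
    fn drop(&mut self) {
        // The request finished: decrease the counter accordingly.
        self.limiter.add(&self.key, -1);
    }
}
```

    <p>Tying the decrement to Drop means the counter goes down on every exit path, including early returns and errors, which matters because a leaked increment would slowly push a healthy origin towards the threshold.</p>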
    <div>
      <h2>Conclusion</h2>
      <a href="#conclusion">
        
      </a>
    </div>
    <p>The pingora-limits crate is available now on <a href="https://github.com/cloudflare/pingora/tree/main/pingora-limits">GitHub</a>. Both the core functionality and the performance benchmark shown above can be found there.</p><p>In this blog post, we introduced pingora-limits, a library that counts events efficiently. We explained the core idea, which is based on a probabilistic data structure. We also showed through a performance benchmark that the pingora-limits implementation is fast and very memory efficient.</p><p>We will continue introducing and open sourcing Pingora components and libraries because we believe that sharing the idea behind the code is as important as sharing the code itself.</p><p>Interested in joining us to help build a better Internet? Our engineering teams are <a href="https://www.cloudflare.com/careers/jobs/?department=Engineering">hiring</a>.</p> ]]></content:encoded>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Open Source]]></category>
            <category><![CDATA[Rust]]></category>
            <category><![CDATA[Pingora]]></category>
            <guid isPermaLink="false">3HZMEE4RsNyQcQ4Zw5dWzl</guid>
            <dc:creator>Yuchen Wu</dc:creator>
        </item>
        <item>
            <title><![CDATA[How we built Pingora, the proxy that connects Cloudflare to the Internet]]></title>
            <link>https://blog.cloudflare.com/how-we-built-pingora-the-proxy-that-connects-cloudflare-to-the-internet/</link>
            <pubDate>Wed, 14 Sep 2022 13:00:00 GMT</pubDate>
            <description><![CDATA[ Today we are excited to talk about Pingora, a new HTTP proxy we’ve built in-house using Rust that serves over 1 trillion requests a day ]]></description>
            <content:encoded><![CDATA[ <p></p>
    <div>
      <h2>Introduction</h2>
      <a href="#introduction">
        
      </a>
    </div>
    <p>Today we are excited to talk about Pingora, a new HTTP proxy we’ve built in-house using <a href="https://www.rust-lang.org/">Rust</a> that serves over 1 trillion requests a day, boosts our performance, and enables many new features for Cloudflare customers, all while requiring only a third of the CPU and memory resources of our previous proxy infrastructure.</p><p>As Cloudflare has scaled we’ve outgrown NGINX. It was great for many years, but over time its limitations at our scale meant building something new made sense. We could no longer get the performance we needed nor did NGINX have the features we needed for our very complex environment.</p><p>Many Cloudflare customers and users use the Cloudflare global network as a proxy between HTTP clients (such as web browsers, apps, IoT devices and more) and servers. In the past, we’ve talked a lot about how browsers and other user agents connect to our network, and we’ve developed a lot of technology and implemented new protocols (see <a href="/the-road-to-quic/">QUIC</a> and <a href="/delivering-http-2-upload-speed-improvements/">optimizations for http2</a>) to make this leg of the connection more efficient.</p><p>Today, we’re focusing on a different part of the equation: the service that proxies traffic between our network and servers on the Internet. This proxy service powers our CDN, Workers fetch, Tunnel, Stream, R2 and many, many other features and products.</p><p>Let’s dig in on why we chose to replace our legacy service and how we developed Pingora, our new system designed specifically for Cloudflare’s customer use cases and scale.</p>
    <div>
      <h2>Why build yet another proxy</h2>
      <a href="#why-build-yet-another-proxy">
        
      </a>
    </div>
    <p>Over the years, our usage of NGINX has run up against limitations. For some limitations, we optimized or worked around them. But others were much harder to overcome.</p>
    <div>
      <h3>Architecture limitations hurt performance</h3>
      <a href="#architecture-limitations-hurt-performance">
        
      </a>
    </div>
    <p>The NGINX <a href="https://www.nginx.com/blog/inside-nginx-how-we-designed-for-performance-scale/">worker (process) architecture</a> has operational drawbacks for our use cases that hurt our performance and efficiency.</p><p>First, in NGINX each request can only be served by a single worker. This results in <a href="/the-sad-state-of-linux-socket-balancing/">unbalanced load across all CPU cores</a>, which <a href="/keepalives-considered-harmful/">leads to slowness</a>.</p><p>Because of this request-process pinning effect, requests that do <a href="/the-problem-with-event-loops/">CPU heavy</a> or <a href="/how-we-scaled-nginx-and-saved-the-world-54-years-every-day/">blocking IO tasks</a> can slow down other requests. As those blog posts attest we’ve spent a lot of time working around these problems.</p><p>The most critical problem for our use cases is poor connection reuse. Our machines establish TCP connections to origin servers to proxy HTTP requests. Connection reuse speeds up TTFB (time-to-first-byte) of requests by reusing previously established connections from a connection pool, skipping TCP and TLS handshakes required on a new connection.</p><p>However, the <a href="https://www.nginx.com/blog/load-balancing-with-nginx-plus-part-2/">NGINX connection pool</a> is per worker. When a request lands on a certain worker, it can only reuse the connections within that worker. When we add more NGINX workers to scale up, our connection reuse ratio gets worse because the connections are scattered across more isolated pools of all the processes. This results in slower TTFB and more connections to maintain, which consumes resources (and money) for both us and our customers.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3a7AeWb20XQpmYKyQpv5fb/351efd47c3e67ab6e2d814007c10fad5/image2-4.png" />
            
            </figure><p>As mentioned in past blog posts, we have workarounds for some of these issues. But if we can address the fundamental issue: the worker/process model, we will resolve all these problems naturally.</p>
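    <p>The effect of per-worker pools on connection reuse can be shown with a toy model. Everything here is hypothetical: Pool is not an NGINX or Pingora type, a u32 stands in for a real socket, and requests are served one at a time and spread round-robin over the pools.</p>

```rust
use std::collections::HashMap;

/// Idle connections per origin. A `u32` id stands in for a real socket.
#[derive(Default)]
pub struct Pool {
    idle: HashMap<String, Vec<u32>>,
}

impl Pool {
    /// Take an idle connection to `origin`, if this pool holds one.
    pub fn get(&mut self, origin: &str) -> Option<u32> {
        self.idle.get_mut(origin).and_then(|conns| conns.pop())
    }

    /// Return a connection to the pool once the request is done.
    pub fn put(&mut self, origin: &str, conn: u32) {
        self.idle.entry(origin.to_string()).or_default().push(conn);
    }
}

/// Serve `requests` sequential requests to a single origin, spread
/// round-robin over `pools` (which must not be empty), and return how many
/// new connections had to be opened.
pub fn new_connections(pools: &mut [Pool], requests: usize) -> usize {
    let mut opened = 0;
    for i in 0..requests {
        let pool = &mut pools[i % pools.len()];
        let conn = match pool.get("origin.example") {
            Some(conn) => conn, // reuse: no TCP or TLS handshake
            None => {
                opened += 1; // miss: a new connection is needed
                opened as u32
            }
        };
        pool.put("origin.example", conn);
    }
    opened
}
```

    <p>With eight isolated pools the model opens eight connections for 100 requests, while a single shared pool opens one. Real traffic is concurrent and messier, but the direction is the same: the more isolated pools there are, the lower the reuse ratio.</p>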
    <div>
      <h3>Some types of functionality are difficult to add</h3>
      <a href="#some-types-of-functionality-are-difficult-to-add">
        
      </a>
    </div>
    <p>NGINX is a very good web server, load balancer or simple gateway. But Cloudflare does way more than that. We used to build all the functionality we needed around NGINX, which is not easy to do while trying not to diverge too much from the upstream NGINX codebase.</p><p>For example, when <a href="/new-tools-to-monitor-your-server-and-avoid-downtime/">retrying/failing over</a> a request, sometimes we want to send a request to a different origin server with a different set of request headers. But that is not something NGINX allows us to do. In cases like this, we spend time and effort on working around the NGINX constraints.</p><p>Meanwhile, the programming languages we had to work with did little to alleviate these difficulties. NGINX is written purely in C, which is not memory safe by design. It is very error-prone to work with such a 3rd party code base. It is quite easy to get into <a href="/incident-report-on-memory-leak-caused-by-cloudflare-parser-bug/">memory safety issues</a>, even for experienced engineers, and we wanted to avoid these as much as possible.</p><p>The other language we used to complement C is Lua. It is less risky but also less performant. In addition, we often found ourselves missing <a href="https://en.wikipedia.org/wiki/Type_system#Static_type_checking">static typing</a> when working with complicated Lua code and business logic.</p><p>Finally, the NGINX community is not very active, and development tends to be <a href="https://dropbox.tech/infrastructure/how-we-migrated-dropbox-from-nginx-to-envoy">“behind closed doors”</a>.</p>
    <div>
      <h3>Choosing to build our own</h3>
      <a href="#choosing-to-build-our-own">
        
      </a>
    </div>
    <p>Over the past few years, as we’ve continued to grow our customer base and feature set, we continually evaluated three choices:</p><ol><li><p>Continue to invest in NGINX and possibly fork it to tailor it 100% to our needs. We had the expertise needed, but given the architecture limitations mentioned above, significant effort would be required to rebuild it in a way that fully supported our needs.</p></li><li><p>Migrate to another 3rd party proxy codebase. There are definitely good projects, like <a href="https://dropbox.tech/infrastructure/how-we-migrated-dropbox-from-nginx-to-envoy">envoy</a> and <a href="https://linkerd.io/2020/12/03/why-linkerd-doesnt-use-envoy/">others</a>. But this path means the same cycle may repeat in a few years.</p></li><li><p>Start with a clean slate, building an in-house platform and framework. This choice requires the most upfront investment in terms of engineering effort.</p></li></ol><p>We evaluated each of these options every quarter for the past few years. There is no obvious formula to tell which choice is the best. For several years, we continued with the path of least resistance, continuing to augment NGINX. However, at some point, the return on investment of building our own proxy seemed worth it. We made a call to build a proxy from scratch, and began designing the proxy application of our dreams.</p>
    <div>
      <h2>The Pingora Project</h2>
      <a href="#the-pingora-project">
        
      </a>
    </div>
    
    <div>
      <h3>Design decisions</h3>
      <a href="#design-decisions">
        
      </a>
    </div>
    <p>To make a proxy that serves millions of requests per second fast, efficient and secure, we have to make a few important design decisions first.</p><p>We chose <a href="https://www.rust-lang.org/">Rust</a> as the language of the project because it can do what C can do in a memory safe way without compromising performance.</p><p>Although there are some great off-the-shelf 3rd party HTTP libraries, such as <a href="https://github.com/hyperium/hyper">hyper</a>, we chose to build our own because we want to maximize the flexibility in how we handle HTTP traffic and to make sure we can innovate at our own pace.</p><p>At Cloudflare, we handle traffic across the entire Internet. We have many cases of bizarre and non-RFC compliant HTTP traffic that we have to support. This is a common dilemma across the HTTP community and web, where there is tension between strictly following HTTP specifications and accommodating the nuances of a wide ecosystem of potentially legacy clients or servers. Picking one side can be a tough job.</p><p>HTTP status codes are defined in <a href="https://www.rfc-editor.org/rfc/rfc9110.html#name-status-codes">RFC 9110 as a three digit integer</a>, and generally expected to be in the range 100 through 599. Hyper was one implementation that expected this range. However, many servers support the use of status codes between 599 and 999. <a href="https://github.com/hyperium/http/issues/144">An issue</a> had been created for this feature, which explored various sides of the debate. While the hyper team did ultimately accept that change, there would have been valid reasons for them to reject such an ask, and this was only one of many cases of noncompliant behavior we needed to support.</p><p>In order to satisfy the requirements of Cloudflare's position in the HTTP ecosystem, we needed a robust, permissive, customizable HTTP library that can survive the wilds of the Internet and support a variety of noncompliant use cases. 
The best way to guarantee that is to implement our own.</p><p>The next design decision was around our workload scheduling system. We chose multithreading over <a href="https://www.nginx.com/blog/inside-nginx-how-we-designed-for-performance-scale/#Inside-the-NGINX-Worker-Process">multiprocessing</a> in order to share resources, especially connection pools, easily. We also decided that <a href="https://en.wikipedia.org/wiki/Work_stealing">work stealing</a> was required to avoid some classes of performance problems mentioned above. The Tokio async runtime turned out to be <a href="https://tokio.rs/blog/2019-10-scheduler">a great fit</a> for our needs.</p><p>Finally, we wanted our project to be intuitive and developer friendly. What we build is not the final product, and should be extensible as a platform as more features are built on top of it. We decided to implement a “life of a request” event based programmable interface <a href="https://openresty-reference.readthedocs.io/en/latest/Directives/">similar to NGINX/OpenResty</a>. For example, the “request filter” phase allows developers to run code to modify or reject the request when a request header is received. With this design, we can separate our business logic and generic proxy logic cleanly. Developers who previously worked on NGINX can easily switch to Pingora and quickly become productive.</p>
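    <p>To make the “life of a request” idea concrete, here is a deliberately simplified, synchronous sketch. The trait ProxyPhases, its two hooks and the Decision type are hypothetical names invented for this illustration; they are not Pingora's interface, which this post does not show.</p>

```rust
/// A parsed request header, reduced to what this sketch needs.
pub struct Request {
    pub path: String,
    pub headers: Vec<(String, String)>,
}

/// What a filter phase tells the generic proxy logic to do next.
pub enum Decision {
    Continue,
    Reject(u16),
}

/// Hypothetical phase hooks. Business logic implements the trait; the
/// generic proxy logic calls the hooks at fixed points of a request's life.
pub trait ProxyPhases {
    /// Runs once the request header is received; may modify or reject it.
    fn request_filter(&self, _req: &mut Request) -> Decision {
        Decision::Continue
    }

    /// Chooses the origin server to proxy the request to.
    fn upstream_peer(&self, req: &Request) -> String;
}

/// Example business logic: block an admin area and tag everything else.
pub struct BlockAdmin;

impl ProxyPhases for BlockAdmin {
    fn request_filter(&self, req: &mut Request) -> Decision {
        if req.path.starts_with("/admin") {
            return Decision::Reject(403);
        }
        req.headers.push(("x-proxied-by".into(), "sketch".into()));
        Decision::Continue
    }

    fn upstream_peer(&self, _req: &Request) -> String {
        "origin.example:443".to_string()
    }
}

/// Generic proxy logic: run the phases in order. A real proxy would go on
/// to connect to the peer and stream the request; here we just return it.
pub fn handle(app: &impl ProxyPhases, mut req: Request) -> Result<String, u16> {
    match app.request_filter(&mut req) {
        Decision::Reject(status) => Err(status),
        Decision::Continue => Ok(app.upstream_peer(&req)),
    }
}
```

    <p>The point of the split is that handle() knows nothing about admin paths or custom headers, and BlockAdmin knows nothing about connections, which is the separation of business logic and generic proxy logic described above.</p>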
    <div>
      <h2>Pingora is faster in production</h2>
      <a href="#pingora-is-faster-in-production">
        
      </a>
    </div>
    <p>Let’s fast-forward to the present. Pingora handles almost every HTTP request that needs to interact with an origin server (for a cache miss, for example), and we’ve collected a lot of performance data in the process.</p><p>First, let’s see how Pingora speeds up our customers’ traffic. Overall traffic on Pingora shows a 5ms reduction in median TTFB and an 80ms reduction at the 95th percentile. This is not because we run code faster. Even our old service could handle requests in the sub-millisecond range.</p><p>The savings come from our new architecture, which can share connections across all threads. This means a better connection reuse ratio, so less time is spent on TCP and TLS handshakes.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1pLeU6tqq4NDdGqPRWBm7r/3e05c76314b83f174beebbc6b3263ae8/image3-3.png" />
            
            </figure><p>Across all customers, Pingora makes only a third as many new connections per second compared to the old service. For one major customer, it increased the connection reuse ratio from 87.1% to 99.92%, which reduced new connections to their origins by 160x. To present the number more intuitively, by switching to Pingora, we save our customers and users 434 years of handshake time every day.</p>
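    <p>The 160x figure follows from the two reuse ratios, assuming the request volume is the same before and after: the share of requests that need a new connection is one minus the reuse ratio. The helper below is only a worked check of that arithmetic, with a function name of our choosing.</p>

```rust
/// Factor by which new connections drop when the reuse ratio improves,
/// assuming the same number of requests before and after.
pub fn new_connection_reduction(reuse_before: f64, reuse_after: f64) -> f64 {
    // Fraction of requests that must open a new connection.
    let miss_before = 1.0 - reuse_before;
    let miss_after = 1.0 - reuse_after;
    miss_before / miss_after
}
```

    <p>For the customer above, new_connection_reduction(0.871, 0.9992) is 12.9% divided by 0.08%, or about 161, in line with the quoted 160x.</p>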
    <div>
      <h3>More features</h3>
      <a href="#more-features">
        
      </a>
    </div>
    <p>Having a developer friendly interface that engineers are familiar with, while eliminating the previous constraints, allows us to develop more features, more quickly. Core functionality like new protocols acts as building blocks for more products we can offer to customers.</p><p>As an example, we were able to add HTTP/2 upstream support to Pingora without major hurdles. This allowed us to offer <a href="/road-to-grpc/">gRPC</a> to our customers shortly afterwards. Adding this same functionality to NGINX would have required <a href="https://mailman.nginx.org/pipermail/nginx-devel/2017-July/010357.html">significantly more engineering effort and might not have materialized</a>.</p><p>More recently we've announced <a href="/introducing-cache-reserve/">Cache Reserve</a> where Pingora uses R2 storage as a caching layer. As we add more functionality to Pingora, we’re able to offer new products that weren’t feasible before.</p>
    <div>
      <h3>More efficient</h3>
      <a href="#more-efficient">
        
      </a>
    </div>
    <p>In production, Pingora consumes about 70% less CPU and 67% less memory compared to our old service with the same traffic load. The savings come from a few factors.</p><p>Our Rust code runs <a href="https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/rust.html">more efficiently</a> compared to our old <a href="https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/lua.html">Lua code</a>. On top of that, there are also efficiency differences from their architectures. For example, in NGINX/OpenResty, when the Lua code wants to access an HTTP header, it has to read it from the NGINX C struct, allocate a Lua string and then copy it to the Lua string. Afterwards, Lua has to garbage-collect its new string as well. In Pingora, it would just be a direct string access.</p><p>The multithreading model also makes sharing data across requests more efficient. NGINX also has shared memory but due to implementation limitations, every shared memory access has to use a mutex lock and only strings and numbers can be put into shared memory. In Pingora, most shared items can be accessed directly via shared references behind <a href="https://doc.rust-lang.org/std/sync/struct.Arc.html">atomic reference counters</a>.</p><p>Another significant portion of CPU saving, as mentioned above, is from making fewer new connections. TLS handshakes are expensive compared to just sending and receiving data via established connections.</p>
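    <p>As a small illustration of sharing data behind an atomic reference counter, the sketch below hands the same read-only map to several threads. The map contents and the function name are made up for the example; only Arc, HashMap and thread are real standard library items.</p>

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

/// Look up the same shared, read-only map from `threads` worker threads.
pub fn lookup_from_threads(threads: usize) -> Vec<Option<String>> {
    // One allocation, shared by reference count. Readers need no mutex
    // because nobody mutates the map after it is built.
    let config: Arc<HashMap<&'static str, String>> = Arc::new(HashMap::from([(
        "origin",
        "origin.example".to_string(),
    )]));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Cloning the Arc bumps a counter; the map itself is not copied.
            let config = Arc::clone(&config);
            thread::spawn(move || config.get("origin").cloned())
        })
        .collect();

    handles
        .into_iter()
        .map(|handle| handle.join().unwrap())
        .collect()
}
```

    <p>Unlike the shared memory case described above, the shared value here can be any Send + Sync type, not just strings and numbers, and reads take no lock.</p>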
    <div>
      <h3>Safer</h3>
      <a href="#safer">
        
      </a>
    </div>
    <p>Shipping features quickly and safely is difficult, especially at our scale. It's hard to predict every edge case that can occur in a distributed environment processing millions of requests a second. Fuzzing and static analysis can only mitigate so much. Rust's memory-safe semantics guard us from undefined behavior and give us confidence our service will run correctly.</p><p>With those assurances we can focus more on how a change to our service will interact with other services or a customer's origin. We can develop features at a higher cadence and not be burdened by memory safety issues and hard to diagnose crashes.</p><p>When crashes do occur, an engineer needs to spend time diagnosing how they happened and what caused them. Since Pingora's inception we’ve served a few hundred trillion requests and have yet to crash due to our service code.</p><p>In fact, Pingora crashes are so rare we usually find unrelated issues when we do encounter one. Recently we discovered <a href="https://lkml.org/lkml/2022/3/15/6">a kernel bug</a> soon after our service started crashing. We've also discovered hardware issues on a few machines. In the past, ruling out rare memory bugs caused by our software was nearly impossible, even after significant debugging.</p>
    <div>
      <h2>Conclusion</h2>
      <a href="#conclusion">
        
      </a>
    </div>
    <p>To summarize, we have built an in-house proxy that is faster, more efficient and more versatile as the platform for our current and future products.</p><p>We will be back with more technical details regarding the problems we faced, the optimizations we applied and the lessons we learned from building Pingora and rolling it out to power a significant portion of the Internet. We will also be back with our plan to open source it.</p><p>Pingora is our latest attempt at rewriting our system, but it won’t be our last. It is also only one of the building blocks in the re-architecting of our systems.</p><p>Interested in joining us to help build a better Internet? <a href="https://www.cloudflare.com/careers/jobs/?department=Engineering">Our engineering teams are hiring</a>.</p> ]]></content:encoded>
            <category><![CDATA[Rust]]></category>
            <category><![CDATA[NGINX]]></category>
            <category><![CDATA[Performance]]></category>
            <category><![CDATA[Pingora]]></category>
            <guid isPermaLink="false">5nZ4Ad1lcQreuoYkJfV36b</guid>
            <dc:creator>Yuchen Wu</dc:creator>
            <dc:creator>Andrew Hauck</dc:creator>
        </item>
    </channel>
</rss>