
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
    <channel>
        <title><![CDATA[ The Cloudflare Blog ]]></title>
        <description><![CDATA[ Get the latest news on how products at Cloudflare are built, technologies used, and join the teams helping to build a better Internet. ]]></description>
        <link>https://blog.cloudflare.com</link>
        <atom:link href="https://blog.cloudflare.com/" rel="self" type="application/rss+xml"/>
        <language>en-us</language>
        <image>
            <url>https://blog.cloudflare.com/favicon.png</url>
            <title>The Cloudflare Blog</title>
            <link>https://blog.cloudflare.com</link>
        </image>
        <lastBuildDate>Thu, 09 Apr 2026 10:36:25 GMT</lastBuildDate>
        <item>
            <title><![CDATA[Bringing more transparency to post-quantum usage, encrypted messaging, and routing security]]></title>
            <link>https://blog.cloudflare.com/radar-origin-pq-key-transparency-aspa/</link>
            <pubDate>Fri, 27 Feb 2026 06:00:00 GMT</pubDate>
            <description><![CDATA[ Cloudflare Radar has added new tools for monitoring PQ adoption, KT logs for messaging, and ASPA routing records to track the Internet's migration toward more secure encryption and routing standards.  ]]></description>
            <content:encoded><![CDATA[ <p></p><p>Cloudflare Radar already offers a wide array of <a href="https://radar.cloudflare.com/security/"><u>security insights</u></a> — from application and network layer attacks, to malicious email messages, to digital certificates and Internet routing.</p><p>And today we’re introducing even more. We are launching several new security-related data sets and tools on Radar: </p><ul><li><p>We are extending our post-quantum (PQ) monitoring beyond the client side to now include origin-facing connections. We have also released a new tool to help you check any website's post-quantum encryption compatibility. </p></li><li><p>A new Key Transparency section on Radar provides a public dashboard showing the real-time verification status of Key Transparency Logs for end-to-end encrypted messaging services like WhatsApp, showing when each log was last signed and verified by Cloudflare's Auditor. The page serves as a transparent interface where anyone can monitor the integrity of public key distribution and access the API to independently validate our Auditor’s proofs. </p></li><li><p>Routing Security insights continue to expand with the addition of global, country, and network-level information about the deployment of ASPA, an emerging standard that can help detect and prevent BGP route leaks. </p></li></ul>
    <div>
      <h2>Measuring origin post-quantum support</h2>
      <a href="#measuring-origin-post-quantum-support">
        
      </a>
    </div>
    
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2gs0x3zMZTxios168jT9xW/179d8959b5e0939835cf6facef797457/1.png" />
          </figure><p>Since <a href="https://x.com/CloudflareRadar/status/1788277817362329983"><u>April 2024</u></a>, we have tracked the aggregate growth of client support for post-quantum encryption on Cloudflare Radar, chronicling its global growth from <a href="https://radar.cloudflare.com/adoption-and-usage?dateStart=2024-01-01&amp;dateEnd=2024-01-31#post-quantum-encryption-adoption"><u>under 3% at the start of 2024</u></a>, to <a href="https://radar.cloudflare.com/adoption-and-usage?dateStart=2026-02-01&amp;dateEnd=2026-02-28#post-quantum-encryption-adoption"><u>over 60% in February 2026</u></a>. And in October 2025, <a href="https://blog.cloudflare.com/pq-2025/#what-you-can-do-today-to-stay-safe-against-quantum-attacks"><u>we added the ability</u></a> for users to <a href="https://radar.cloudflare.com/adoption-and-usage#browser-support"><u>check</u></a> whether their browser supports <a href="https://developers.cloudflare.com/ssl/post-quantum-cryptography/pqc-support/#x25519mlkem768"><code><u>X25519MLKEM768</u></code></a> — a hybrid key exchange algorithm combining classical <a href="https://www.rfc-editor.org/rfc/rfc8410"><code><u>X25519</u></code></a> with <a href="https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.203.pdf"><u>ML-KEM</u></a>, a lattice-based post-quantum scheme standardized by NIST. This provides security against both classical and quantum attacks. </p><p>However, post-quantum encryption support on user-to-Cloudflare connections is only part of the story.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/67cvSmOaISIHjrKKRHKPzg/e0ccf032658904fd6beaa7de7340b561/2.png" />
          </figure><p>For content not in our CDN cache, or for uncacheable content, Cloudflare’s edge servers establish a separate connection with a customer’s origin servers to retrieve it. To accelerate the transition to quantum-resistant security for these origin-facing fetches, we <a href="https://blog.cloudflare.com/post-quantum-to-origins/"><u>previously introduced an API</u></a> allowing customers to opt in to preferring post-quantum connections. Today, we’re making post-quantum compatibility of origin servers visible on Radar.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6KvV2meYLEPbNIQyHP6yji/9477a134c8f5f6a7aaecd6257cd59981/3.png" />
          </figure><p>The new origin post-quantum support graph on Radar illustrates the share of customer origins supporting <code>X25519MLKEM768</code>. This data is derived from <a href="https://blog.cloudflare.com/automatically-secure/"><u>our automated TLS scanner,</u></a> which probes TLS 1.3-compatible origins and aggregates the results daily. It is important to note that our scanner tests for support rather than the origin server's specific preference. While an origin may support a post-quantum key exchange algorithm, its local TLS key exchange preference can ultimately dictate the encryption outcome.</p><p>While the headline graph focuses on post-quantum readiness, the scanner also evaluates support for classical key exchange algorithms. Within the Radar <a href="https://radar.cloudflare.com/explorer?dataSet=post_quantum.origin&amp;groupBy=key_agreement#result"><u>Data Explorer view</u></a>, you can also see the full distribution of these supported TLS key exchange methods.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5PBOoQSCcIAQrYezKp1pJU/d4218aba59deef6c21df53856a93040a/4.png" />
          </figure><p>As shown in the graphs above, approximately 10% of origins could benefit from a post-quantum-preferred key agreement today. This represents a significant jump from less than 1% at the start of 2025 — <a href="https://radar.cloudflare.com/explorer?dataSet=post_quantum.origin&amp;groupBy=key_agreement&amp;dt=2025-01-01_2025-12-31"><u>a 10x increase in just over a year</u></a>. We expect this number to grow steadily as the industry continues its migration. This upward trend likely accelerated in 2025 as many server-side TLS libraries, such as <a href="https://openssl-library.org/post/2025-04-08-openssl-35-final-release/"><u>OpenSSL 3.5.0+</u></a>,<a href="https://www.gnutls.org/"><u> GnuTLS 3.8.9+</u></a>, and <a href="https://go.dev/doc/go1.24#cryptotlspkgcryptotls"><u>Go 1.24+</u></a>, enabled hybrid post-quantum key exchange by default, allowing platforms and services to support post-quantum connections simply by upgrading their cryptographic library dependencies.</p><p>In addition to the Radar and Data Explorer graphs, the <a href="https://developers.cloudflare.com/api/resources/radar/subresources/post_quantum/subresources/origin/"><u>origin readiness data is available through the Radar API</u></a> as well.</p><p>As an additional part of our efforts to help the Internet transition to post-quantum cryptography, we are also launching <a href="https://radar.cloudflare.com/post-quantum#website-support"><u>a tool to test whether a specific hostname supports post-quantum encryption</u></a>. These tests can be run against any publicly accessible website, as long as they allow connections from Cloudflare’s <a href="https://www.cloudflare.com/ips/"><u>egress IP address ranges</u></a>. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5dgwK3i7IeLLSUt5xnk4lf/276e25dda3389f6e0ad83a26acd08fec/5.png" />
          </figure><p><sub><i>A screenshot of the tool in Radar to test whether a hostname supports post-quantum encryption.</i></sub></p><p>The tool presents a simple form where users can enter a hostname (such as <a href="https://radar.cloudflare.com/post-quantum?host=cloudflare.com%3A443"><code><u>cloudflare.com</u></code></a> or <a href="https://radar.cloudflare.com/post-quantum?host=www.wikipedia.org%3A443"><code><u>www.wikipedia.org</u></code></a>) and optionally specify a custom port (the default is <a href="https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml?search=443"><u>443, the standard HTTPS port</u></a>). After clicking "Test", the result displays a tag indicating PQ support status alongside the negotiated TLS key exchange algorithm. If the server negotiates a post-quantum key exchange, a green "PQ" tag appears with a message confirming the connection is "post-quantum secure." Otherwise, a red tag indicates the connection is "not post-quantum secure", showing the classical algorithm that was negotiated.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3rfEG4dMlwR4FJkaKXTRWF/8cab135242057ce57f3b0e4a92be4cec/6.png" />
          </figure>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/PXu3kjzwhVkb29kIFREOn/41785c06297e0667ff9e2b261ae9b819/7.png" />
          </figure><p>Under the hood, this tool uses <a href="https://developers.cloudflare.com/containers/"><u>Cloudflare Containers</u></a> — a new capability that allows running container workloads alongside Workers. Since the Workers runtime does not expose details of the underlying TLS handshake, Workers cannot perform TLS scans directly. Therefore, we created a Go container that leverages the <a href="https://pkg.go.dev/crypto/tls"><code><u>crypto/tls</u></code></a> package's support for post-quantum compatibility checks. The container runs on-demand and performs the actual handshake to determine the negotiated TLS key exchange algorithm, returning results through the <a href="https://developers.cloudflare.com/api/resources/radar/subresources/post_quantum/subresources/tls/methods/support/"><u>Radar API</u></a>.</p><p>With the addition of these origin-facing insights, complementing the existing client-facing insights, we have moved all the post-quantum content to <a href="https://radar.cloudflare.com/post-quantum"><u>its own section on Radar</u></a>. </p>
    <div>
      <h2>Securing E2EE messaging systems with Key Transparency</h2>
      <a href="#securing-e2ee-messaging-systems-with-key-transparency">
        
      </a>
    </div>
    
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/71b8HJK1iT0udJscvkqqI4/778efb329047fca017ff2cf4153330ad/8.png" />
          </figure><p><a href="https://www.cloudflare.com/learning/privacy/what-is-end-to-end-encryption/"><u>End-to-end encrypted (E2EE)</u></a> messaging apps like WhatsApp and Signal have become essential tools for private communication, relied upon by billions of people worldwide. These apps use <a href="https://www.cloudflare.com/learning/ssl/how-does-public-key-encryption-work/"><u>public-key cryptography</u></a> to ensure that only the sender and recipient can read the contents of their messages — not even the messaging service itself. However, there's an often-overlooked vulnerability in this model: users must trust that the messaging app is distributing the correct public keys for each contact.</p><p>If an attacker were able to substitute an incorrect public key in the messaging app's database, they could intercept messages intended for someone else — all without the sender knowing.</p><p>Key Transparency addresses this challenge by creating an auditable, append-only log of public keys — similar in concept to <a href="https://radar.cloudflare.com/certificate-transparency"><u>Certificate Transparency</u></a> for TLS certificates. Messaging apps publish their users' public keys to a transparency log, and independent third parties can verify and vouch that the log has been constructed correctly and consistently over time. In September 2024, Cloudflare <a href="https://blog.cloudflare.com/key-transparency/"><u>announced</u></a> such a Key Transparency auditor for WhatsApp, providing an independent verification layer that helps ensure the integrity of public key distribution for the messaging app's billions of users.</p><p>Today, we're publishing Key Transparency audit data in a new <a href="https://radar.cloudflare.com/key-transparency"><u>Key Transparency section</u></a> on Cloudflare Radar. 
This section showcases the Key Transparency logs that Cloudflare audits, giving researchers, security professionals, and curious users a window into the health and activity of these critical systems.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1LZ1DUzv0SCgBa0XqDURKP/26ccd8b0741073895cbb52aa7f1d5643/image11.png" />
          </figure><p>The new page launches with two monitored logs: WhatsApp and Facebook Messenger Transport. Each monitored log is displayed as a card containing the following information:</p><ul><li><p><b>Status:</b> Indicates whether the log is online, in initialization, or disabled. An "online" status means the log is actively publishing key updates into epochs that Cloudflare audits. (An epoch represents a set of updates applied to the key directory at a specific time.)</p></li><li><p><b>Last signed epoch:</b> The most recent epoch that has been published by the messaging service's log and acknowledged by Cloudflare. By clicking on the eye icon, users can view the full epoch data in JSON format, including the epoch number, timestamp, cryptographic digest, and signature.</p></li><li><p><b>Last verified epoch:</b> The most recent epoch that Cloudflare has verified. Verification involves checking that the transition of the transparency log data structure from the previous epoch to the current one represents a valid tree transformation — ensuring the log has been constructed correctly. The verification timestamp indicates when Cloudflare completed its audit.</p></li><li><p><b>Root:</b> The current root hash of the <a href="https://github.com/facebook/akd"><u>Auditable Key Directory (AKD)</u></a> tree. This hash cryptographically represents the entire state of the key directory at the current epoch. 
As with the epoch fields, users can click to view the complete JSON response from the auditor.</p></li></ul><p>The data shown on the page is also available via the Key Transparency Auditor API, with endpoints for <a href="https://developers.cloudflare.com/key-transparency/api/auditor-information/"><u>auditor information</u></a> and <a href="https://developers.cloudflare.com/key-transparency/api/namespaces/"><u>namespaces</u></a>.</p><p>If you would like to perform audit proof verification yourself, you can follow the instructions in our <a href="https://blog.cloudflare.com/key-transparency/"><u>Auditing Key Transparency blog post</u></a>. We hope that these use cases are the first of many that we publish in this Key Transparency section in Radar — if your company or organization is interested in auditing for your public key or related infrastructure, you can <a href="https://www.cloudflare.com/lp/privacy-edge/"><u>reach out to us here</u></a>.</p>
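<p>To make the epoch and root concepts concrete, the toy Go sketch below computes a binary Merkle root over a small key directory. This is purely illustrative, not the actual AKD construction: it only shows how a single root hash commits to every entry, so any substituted key changes the root an auditor would verify:</p>

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// merkleRoot computes a binary Merkle tree root over the given leaves.
// A toy illustration of how one root hash commits to an entire key
// directory; the real AKD behind the audited logs is a more
// sophisticated authenticated data structure.
func merkleRoot(leaves [][]byte) [32]byte {
	if len(leaves) == 0 {
		return sha256.Sum256(nil)
	}
	level := make([][32]byte, len(leaves))
	for i, l := range leaves {
		level[i] = sha256.Sum256(append([]byte{0x00}, l...)) // leaf domain separation
	}
	for len(level) > 1 {
		var next [][32]byte
		for i := 0; i < len(level); i += 2 {
			if i+1 == len(level) {
				next = append(next, level[i]) // odd node promoted unchanged
				continue
			}
			buf := append([]byte{0x01}, level[i][:]...) // interior domain separation
			buf = append(buf, level[i+1][:]...)
			next = append(next, sha256.Sum256(buf))
		}
		level = next
	}
	return level[0]
}

func main() {
	dir := [][]byte{[]byte("alice:pk1"), []byte("bob:pk2"), []byte("carol:pk3")}
	before := merkleRoot(dir)
	dir[1] = []byte("bob:pk-swapped") // a single substituted public key...
	after := merkleRoot(dir)
	fmt.Println("root changed after tampering:", before != after) // → true
}
```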
    <div>
      <h2>Tracking RPKI ASPA adoption</h2>
      <a href="#tracking-rpki-aspa-adoption">
        
      </a>
    </div>
    
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2LAbrwY9ziVbe1BzfUyl7K/821a40f86c62dd9b44f7bcaee018dd28/10.png" />
          </figure><p>While the <a href="https://www.cloudflare.com/learning/security/glossary/what-is-bgp/"><u>Border Gateway Protocol (BGP)</u></a> is the backbone of Internet routing, it was designed without built-in mechanisms to verify the validity of the paths it propagates. This inherent trust has long left the global network vulnerable to route leaks and hijacks, where traffic is accidentally or maliciously detoured through unauthorized networks.</p><p>Although <a href="https://en.wikipedia.org/wiki/Resource_Public_Key_Infrastructure"><u>RPKI</u></a> and <a href="https://www.arin.net/resources/manage/rpki/roas/"><u>Route Origin Authorizations (ROAs)</u></a> have successfully hardened the origin of routes, they cannot verify the path traffic takes between networks. This is where <a href="https://datatracker.ietf.org/doc/draft-ietf-sidrops-aspa-verification/"><u>ASPA (Autonomous System Provider Authorization)</u></a><b> </b>comes in. ASPA extends RPKI protection by allowing an <a href="https://www.cloudflare.com/learning/network-layer/what-is-an-autonomous-system/"><u>Autonomous System (AS)</u></a> to cryptographically sign a record listing the networks authorized to propagate its routes upstream. By validating these Customer-to-Provider relationships, ASPA allows systems to detect invalid path announcements with confidence and react accordingly.</p><p>While the specific IETF standard remains <a href="https://datatracker.ietf.org/doc/draft-ietf-sidrops-aspa-verification/"><u>in draft</u></a>, the operational community is moving fast. 
Support for creating ASPA objects has already landed in the portals of Regional Internet Registries (RIRs) like <a href="https://www.arin.net/announcements/20260120/"><u>ARIN</u></a> and <a href="https://labs.ripe.net/author/tim_bruijnzeels/aspa-in-the-rpki-dashboard-a-new-layer-of-routing-security/"><u>RIPE NCC</u></a>, and validation logic is available in major software routing stacks like <a href="https://www.undeadly.org/cgi?action=article;sid=20231002135058"><u>OpenBGPD</u></a> and <a href="https://bird.network.cz/?get_doc&amp;v=20&amp;f=bird-5.html"><u>BIRD</u></a>.</p><p>To provide better visibility into the adoption of this emerging standard, we have added comprehensive RPKI ASPA support to the <a href="https://radar.cloudflare.com/routing"><u>Routing section</u></a> of Cloudflare Radar. Tracking these records globally allows us to understand how quickly the industry is moving toward better path validation.</p>
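<p>Conceptually, an ASPA-aware validator asks, for a customer-to-provider hop in an AS path, whether the customer has published a record listing that provider. The toy Go sketch below uses illustrative ASNs and classifies a single hop; the full draft algorithm evaluates entire paths, but the three possible outcomes are the same:</p>

```go
package main

import "fmt"

// ASPASet maps a customer ASN to the set of provider ASNs it has
// authorized to propagate its routes upstream.
type ASPASet map[uint32]map[uint32]bool

type Outcome string

const (
	Valid   Outcome = "valid"   // record exists and lists the provider
	Invalid Outcome = "invalid" // record exists but provider is not listed
	Unknown Outcome = "unknown" // customer has published no ASPA record
)

// checkHop classifies one customer->provider relationship. This is a
// toy sketch: the draft verification algorithm walks whole AS paths
// and combines per-hop results.
func checkHop(aspas ASPASet, customer, provider uint32) Outcome {
	providers, ok := aspas[customer]
	if !ok {
		return Unknown
	}
	if providers[provider] {
		return Valid
	}
	return Invalid
}

func main() {
	aspas := ASPASet{
		64512: {64496: true, 64497: true}, // AS64512 authorizes two upstreams
	}
	fmt.Println(checkHop(aspas, 64512, 64496)) // authorized upstream: valid
	fmt.Println(checkHop(aspas, 64512, 65551)) // route-leak candidate: invalid
	fmt.Println(checkHop(aspas, 64513, 64496)) // no ASPA published: unknown
}
```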
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6SI6A5vd2bAp3QnBAsJFmZ/24e11445eb0309252d759e88dbf2ba62/11.png" />
          </figure><p>Our new ASPA deployment view allows users to examine the growth of ASPA adoption over time, with the ability to visualize trends across the five <a href="https://en.wikipedia.org/wiki/Regional_Internet_registry"><u>Regional Internet Registries</u></a> (RIRs) based on AS registration. You can view the entire history of ASPA entries, dating back to October 1, 2023, or zoom into specific date ranges to correlate spikes in adoption with industry events, such as the introduction of ASPA features on ARIN and RIPE NCC online dashboards.</p><p>Beyond aggregate trends, we have also introduced a granular, searchable explorer for real-time ASPA content. This table view allows you to inspect the current state of ASPA records, searchable by AS number, AS name, or by filtering for only providers or customer ASNs. This allows network operators to verify that their records are published correctly and to view other networks’ configurations.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/K97G5TC7O1MYwkvFbrdTl/85b27f807401f85d2bbe140f1611a034/12.png" />
          </figure><p>We have also integrated ASPA data directly into the country/region routing pages. Users can now track how different locations are progressing in securing their infrastructure, based on the associated ASPA records from the customer ASNs registered locally.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6mhZyfrHexdo1GDAoKZEd7/44b63675595a01939fa4748210d8c482/13.png" />
          </figure><p>On individual AS pages, we have updated the Connectivity section. Now, when viewing the connections of a network, you may see a visual indicator for "ASPA Verified Provider." This annotation confirms that an ASPA record exists authorizing that specific upstream connection, providing an immediate signal of routing hygiene and trust.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3lVJY4fZWv3KaFdKwLHfAV/aeb2bc27bdccb6a9025345dbaed5b762/14.png" />
          </figure><p>For ASes that have deployed ASPA, we now display a complete list of authorized provider ASNs along with their details. Beyond the current state, Radar also provides a detailed timeline of ASPA activity involving the AS. This history distinguishes between changes initiated by the AS itself ("As customer") and records created by others designating it as a provider ("As provider"), allowing users to immediately identify when specific routing authorizations were established or modified.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/ZIlAn2l0sDTLCyEMMcBI9/871b8d7abffe17b3aee060502eaa4c1c/15.png" />
          </figure><p>Visibility is an essential first step toward broader adoption of emerging routing security protocols like ASPA. By surfacing this data, we aim to help operators deploy protections and assist researchers in tracking the Internet's progress toward a more secure routing path. For those who need to integrate this data into their own workflows or perform deeper analysis, we are also exposing these metrics programmatically. Users can now access ASPA content snapshots, historical timeseries, and detailed changes data using the newly introduced endpoints in the<a href="https://developers.cloudflare.com/api/resources/radar/subresources/bgp/subresources/rpki/subresources/aspa/"> <u>Cloudflare Radar API</u></a>.</p>
    <div>
      <h2>As security evolves, so does our data</h2>
      <a href="#as-security-evolves-so-does-our-data">
        
      </a>
    </div>
    <p>Internet security continues to evolve, with new approaches, protocols, and standards being developed to ensure that information, applications, and networks remain secure. The security data and insights available on Cloudflare Radar will continue to evolve as well. The new sections highlighted above serve to expand existing routing security, transparency, and post-quantum insights already available on Cloudflare Radar. </p><p>If you share any of these new charts and graphs on social media, be sure to tag us: <a href="https://x.com/CloudflareRadar"><u>@CloudflareRadar</u></a> (X), <a href="https://noc.social/@cloudflareradar"><u>noc.social/@cloudflareradar</u></a> (Mastodon), and <a href="https://bsky.app/profile/radar.cloudflare.com"><u>radar.cloudflare.com</u></a> (Bluesky). If you have questions or comments, or suggestions for data that you’d like to see us add to Radar, you can reach out to us on social media, or contact us via <a><u>email</u></a>.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5jAzDXss7PvszWkwGC0q2g/df14de40bf268052fac11239952fc1ed/16.png" />
          </figure><p></p> ]]></content:encoded>
            <category><![CDATA[Radar]]></category>
            <category><![CDATA[Security]]></category>
            <category><![CDATA[Privacy]]></category>
            <category><![CDATA[Post-Quantum]]></category>
            <category><![CDATA[Routing]]></category>
            <category><![CDATA[Research]]></category>
            <guid isPermaLink="false">1Iy1Qvw9TsOhRwgjUYqFxO</guid>
            <dc:creator>David Belson</dc:creator>
            <dc:creator>Mingwei Zhang</dc:creator>
            <dc:creator>André Jesus</dc:creator>
            <dc:creator>Suleman Ahmad</dc:creator>
            <dc:creator>Sabina Zejnilovic</dc:creator>
            <dc:creator>Thibault Meunier</dc:creator>
            <dc:creator>Mari Galicer</dc:creator>
        </item>
        <item>
            <title><![CDATA[Beyond IP lists: a registry format for bots and agents]]></title>
            <link>https://blog.cloudflare.com/agent-registry/</link>
            <pubDate>Thu, 30 Oct 2025 22:00:00 GMT</pubDate>
            <description><![CDATA[ We propose an open registry format for Web Bot Auth to move beyond IP-based identity. This allows any origin to discover and verify cryptographic keys for bots, fostering a decentralized and more trustworthy ecosystem. ]]></description>
            <content:encoded><![CDATA[ <p></p><p>As bots and agents start <a href="https://blog.cloudflare.com/web-bot-auth/"><u>cryptographically signing their requests</u></a>, there is a growing need for website operators to learn public keys as they are setting up their service. I might be able to find the public key material for well-known fetchers and crawlers, but what about the next 1,000 or next 1,000,000? And how do I find their public key material in order to verify that they are who they say they are? This problem is called <i>discovery.</i></p><p>We share this problem with <a href="https://aws.amazon.com/bedrock/agentcore/"><u>Amazon Bedrock AgentCore</u></a>, a comprehensive agentic platform to build, deploy and operate highly capable agents at scale, and their <a href="https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/browser-tool.html"><u>AgentCore Browser</u></a>, a fast, secure, cloud-based browser runtime to enable AI agents to interact with websites at scale. The AgentCore team wants to make it easy for each of their customers to sign <i>their own requests</i>, so that Cloudflare and other operators of CDN infrastructure see agent signatures from individual agents rather than AgentCore as a monolith. (Note: this method does not identify individual users.) In order to do this, Cloudflare needed a way to ingest and register the public keys of AgentCore’s customers at scale. </p><p>In this blog post, we propose a registry of bots and agents as a way to easily discover them on the Internet. We also outline how <a href="https://blog.cloudflare.com/web-bot-auth/"><u>Web Bot Auth</u></a> can be expanded with a registry format. 
Similar to IP lists, which anyone can author and easily import, the <a href="https://datatracker.ietf.org/doc/draft-meunier-webbotauth-registry/"><u>registry format</u></a> is a list of URLs at which to retrieve agent keys, and it is just as easy to author and import.</p><p>We believe such registries should foster and strengthen an open ecosystem of curators that website operators can trust.</p>
    <div>
      <h2>A need for more trustworthy authentication</h2>
      <a href="#a-need-for-more-trustworthy-authentication">
        
      </a>
    </div>
    <p>In May, we introduced a protocol proposal called <a href="https://blog.cloudflare.com/web-bot-auth/"><u>Web Bot Auth</u></a>, which describes how bot and agent developers can cryptographically sign requests coming from their infrastructure.</p><p>There have now been multiple implementations of the proposed protocol, from <a href="https://vercel.com/changelog/vercels-bot-verification-now-supports-web-bot-auth"><u>Vercel</u></a> to <a href="https://changelog.shopify.com/posts/authorize-custom-crawlers-and-tools-with-new-crawler-access-keys"><u>Shopify</u></a> to <a href="https://www.cloudflare.com/press/press-releases/2025/cloudflare-collaborates-with-leading-payments-companies-to-secure-and-enable-agentic-commerce/"><u>Visa</u></a>. It has been actively <a href="https://mailarchive.ietf.org/arch/browse/web-bot-auth/"><u>discussed</u></a> at the IETF, and contributions have been made. Web Bot Auth marks a first step towards moving from brittle identification, like IPs and user agents, to more trustworthy cryptographic authentication. However, like IP addresses, cryptographic keys are a pseudonymous form of identity. If you operate a website without the scale and reach of large CDNs, how do you discover the public keys of known crawlers?</p><p>The first protocol proposal suggested one approach: bot operators would provide a newly defined HTTP header, <code>Signature-Agent</code>, that refers to an HTTP endpoint hosting their keys. Similar to IP addresses, the default is to allow all, but if a particular operator is making too many requests, you can start taking action: rate limit them, contact the operator, etc.</p><p>Here’s an example from <a href="https://help.shopify.com/en/manual/promoting-marketing/seo/crawling-your-store"><u>Shopify's online store</u></a>:</p>
            <pre><code>Signature-Agent: "https://shopify.com"</code></pre>
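<p>A verifier that receives this header can derive where to fetch the keys: the value is a quoted string, and key directories live under the well-known path used by the registry entries shown later in this post. A minimal Go sketch (real parsing should use an HTTP structured-field library and validate the URL):</p>

```go
package main

import (
	"fmt"
	"strings"
)

// directoryURL derives the key-directory location from a
// Signature-Agent header value. The header carries a quoted string,
// and keys are served from the well-known
// http-message-signatures-directory path.
func directoryURL(signatureAgent string) string {
	origin := strings.Trim(signatureAgent, `"`)
	return strings.TrimSuffix(origin, "/") + "/.well-known/http-message-signatures-directory"
}

func main() {
	fmt.Println(directoryURL(`"https://shopify.com"`))
	// → https://shopify.com/.well-known/http-message-signatures-directory
}
```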
            
    <div>
      <h2>A registry format</h2>
      <a href="#a-registry-format">
        
      </a>
    </div>
    <p>With all that in mind, we come to the following problem. How can Cloudflare ensure customers have control over the traffic they want to allow, with sensible defaults, while fostering an open curation ecosystem that doesn’t lock in customers or small origins?</p><p>Such an ecosystem exists for lists of IP addresses (e.g. <a href="https://github.com/antoinevastel/avastel-bot-ips-lists/blob/master/avastel-proxy-bot-ips-1day.txt"><u>avastel-bot-ips-lists</u></a>) and robots.txt (e.g. <a href="https://github.com/ai-robots-txt/ai.robots.txt"><u>ai-robots-txt</u></a>). For both, you can find canonical lists on the Internet to easily configure your website to allow or disallow traffic from those sources. They provide direct configuration for your nginx or haproxy, and you can use them to configure your Cloudflare account. For instance, I could import the robots.txt below:</p>
            <pre><code>User-agent: MyBadBot
Disallow: /</code></pre>
            <p>This is where the registry format comes in, providing a list of URLs pointing to Signature Agent keys:</p>
            <pre><code># AI Crawler
https://chatgpt.com/.well-known/http-message-signatures-directory
https://autorag.ai.cloudflare.com/.well-known/http-message-signatures-directory
 
# Test signature agent card
https://http-message-signatures-example.research.cloudflare.com/.well-known/http-message-signatures-directory</code></pre>
            <p>And that's it. A registry could contain a list of all known signature agents, a curated list for academic research agents, another for search agents, etc.</p><p>Anyone can maintain and host these lists. Similar to IP or robots.txt lists, you can host such a registry on any public file system. This means you can have a repository on GitHub, put the file on Cloudflare R2, or send it as an email attachment. Cloudflare intends to provide one of the first instances of this registry, so that others can contribute to it or reference it when building their own.</p>
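<p>Consuming such a registry is deliberately simple. Here is a minimal Go sketch of a parser for the line-oriented format (<code>#</code> lines are comments, blank lines are ignored, and every other line is a key-directory URL); a production consumer would also validate each URL and fetch the keys behind it:</p>

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseRegistry reads the line-oriented registry format: '#' comment
// lines and blank lines are skipped, and every remaining line is a URL
// pointing at a signature-agent key directory.
func parseRegistry(registry string) []string {
	var urls []string
	sc := bufio.NewScanner(strings.NewReader(registry))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		urls = append(urls, line)
	}
	return urls
}

func main() {
	sample := `# AI Crawler
https://chatgpt.com/.well-known/http-message-signatures-directory

# Test signature agent card
https://http-message-signatures-example.research.cloudflare.com/.well-known/http-message-signatures-directory`
	for _, u := range parseRegistry(sample) {
		fmt.Println(u)
	}
}
```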
    <div>
      <h2>Learn more about an incoming request</h2>
      <a href="#learn-more-about-an-incoming-request">
        
      </a>
    </div>
    <p>Knowing the <code>Signature-Agent</code> is great, but not sufficient. For instance, to be a verified bot, Cloudflare requires a contact method, in case requests from that infrastructure suddenly fail or change format in a way that causes unexpected errors upstream. In fact, there is a lot of information an origin might want to know: a name for the operator, a contact method, a logo, the expected crawl rate, etc.</p><p>Therefore, to complement the registry format, we have proposed a <a href="https://thibmeu.github.io/http-message-signatures-directory/draft-meunier-webbotauth-registry.html#name-signature-agent-card"><u>signature-agent card format</u></a> that extends the JWKS directory (<a href="https://www.rfc-editor.org/rfc/rfc7517"><u>RFC 7517</u></a>) with additional metadata. Similar to an old-fashioned contact card, it includes all the important information someone might want to know about your agent or crawler.</p><p>We provide an example below for illustration. Note that the fields may still change: a <code>jwks-uri</code> field may be introduced, the logo field made more descriptive, and so on.</p>
            <pre><code>{
  "client_name": "Example Bot",
  "client_uri": "https://example.com/bot/about.html",
  "logo_uri": "https://example.com/",
  "contacts": ["mailto:bot-support@example.com"],
  "expected-user-agent": "Mozilla/5.0 ExampleBot",
  "rfc9309-product-token": "ExampleBot",
  "rfc9309-compliance": ["User-Agent", "Allow", "Disallow", "Content-Usage"],
  "trigger": "fetcher",
  "purpose": "tdm",
  "targeted-content": "Cat pictures",
  "rate-control": "429",
  "rate-expectation": "avg=10rps;max=100rps",
  "known-urls": ["/", "/robots.txt", "*.png"],
  "keys": [{
    "kty": "OKP",
    "crv": "Ed25519",
    "kid": "NFcWBst6DXG-N35nHdzMrioWntdzNZghQSkjHNMMSjw",
    "x": "JrQLj5P_89iXES9-vFgrIy29clF9CC_oPPsw3c5D0bs",
    "use": "sig",
    "nbf": 1712793600,
    "exp": 1715385600
  }]
}</code></pre>
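For illustration, a consumer of such a card might check whether any of its keys is currently within its validity window, using the per-key `nbf` (not before) and `exp` (expiry) Unix timestamps shown above. This is a sketch under our own helper names, not part of the proposed format:

```typescript
// A single key's validity window: absent "nbf" means valid from the start of
// time, absent "exp" means no expiry.
function keyIsValid(key: { nbf?: number; exp?: number }, now: number): boolean {
  const notBefore = key.nbf ?? 0;
  const notAfter = key.exp ?? Number.MAX_SAFE_INTEGER;
  if (now >= notBefore) {
    return notAfter >= now;
  }
  return false;
}

// A card is usable if at least one of its keys is currently valid.
function hasValidKey(card: { keys: { nbf?: number; exp?: number }[] }, now: number): boolean {
  return card.keys.some((key: { nbf?: number; exp?: number }) => keyIsValid(key, now));
}
```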
            
    <div>
      <h2>Operating a registry</h2>
      <a href="#operating-a-registry">
        
      </a>
    </div>
    <p>Amazon Bedrock AgentCore, an agentic platform for building and deploying AI agents at scale, adopted Web Bot Auth for its AgentCore Browser service (learn more in <a href="https://aws.amazon.com/blogs/machine-learning/reduce-captchas-for-ai-agents-browsing-the-web-with-web-bot-auth-preview-in-amazon-bedrock-agentcore-browser/">their post</a>). AgentCore Browser intends to transition from the service signing key currently available in its public preview to customer-specific keys once the protocol matures. Cloudflare and other operators of origin protection services will then be able to see and validate signatures from individual AgentCore customers rather than from AgentCore as a whole.</p><p>Cloudflare also offers a registry for bots and agents it trusts, provided through Radar. It uses the <a href="https://assets.radar.cloudflare.com/bots/signature-agent-registry.txt"><u>registry format</u></a> to allow for the consumption of bots trusted by Cloudflare on your server.</p><p>You can use these registries today – we’ve provided a demo in Go for <a href="https://caddyserver.com/"><u>Caddy server</u></a> that imports keys from multiple registries. It’s on <a href="https://github.com/cloudflare/web-bot-auth/pull/52"><u>cloudflare/web-bot-auth</u></a>. The configuration looks like this:</p>
            <pre><code>:8080 {
    route {
        # httpsig middleware is used here
        httpsig {
            registry "http://localhost:8787/test-registry.txt"
        # You can specify multiple registries. All tags will be checked independently
            registry "http://example.test/another-registry.txt"
        }

        # Responds if signature is valid
        handle {
            respond "Signature verification succeeded!" 200
        }
    }
}</code></pre>
            <p>There are several reasons why you might want to operate and curate a registry leveraging the <a href="https://www.ietf.org/archive/id/draft-meunier-webbotauth-registry-01.html#name-signature-agent-card"><u>Signature Agent Card format</u></a>:</p><ol><li><p><b>Monitor incoming </b><code><b>Signature-Agent</b></code><b>s.</b> This should allow you to collect signature-agent cards of agents reaching out to your domain.</p></li><li><p><b>Import them from existing registries, and categorize them yourself.</b> There could be a general registry constructed from the monitoring step above, but registries might be more useful with more categories.</p></li><li><p><b>Establish direct relationships with agents.</b> Cloudflare does this for its<a href="https://radar.cloudflare.com/bots#verified-bots"> <u>bot registry</u></a> for instance, or you might use a public GitHub repository where people can open issues.</p></li><li><p><b>Learn from your users.</b> If you offer a security service, allowing your customers to specify the registries/signature-agents they want to let through allows you to gain valuable insight.</p></li></ol>
    <div>
      <h2>Moving forward</h2>
      <a href="#moving-forward">
        
      </a>
    </div>
    <p>As cryptographic authentication for bots and agents grows, the need for discovery increases.</p><p>With the introduction of a lightweight format and specification to attach metadata to Signature-Agents, and to curate them in the form of registries, we begin to address this need. The HTTP Message Signature directory format is being expanded to include some self-certified metadata, and registries provide a curation ecosystem on top of it.</p><p>Down the line, we predict that clients and origins will choose the signature-agents they trust, use a common format to migrate their configuration between CDN providers, and rely on third-party registries for curation. We are working towards integrating these capabilities into our bot management and rule engines.</p><p>If you’d like to experiment, our demo is on <a href="https://github.com/cloudflare/web-bot-auth/pull/52"><u>GitHub</u></a>. If you’d like to help us, <a href="https://blog.cloudflare.com/cloudflare-1111-intern-program/"><u>we’re hiring 1,111 interns</u></a> over the course of next year, and have <a href="https://www.cloudflare.com/careers/"><u>open positions</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[Research]]></category>
            <category><![CDATA[Bots]]></category>
            <guid isPermaLink="false">3VeTsp2f9v3B1QZF0oglUV</guid>
            <dc:creator>Thibault Meunier</dc:creator>
            <dc:creator>Maxime Guerreiro</dc:creator>
        </item>
        <item>
            <title><![CDATA[Anonymous credentials: rate-limiting bots and agents without compromising privacy]]></title>
            <link>https://blog.cloudflare.com/private-rate-limiting/</link>
            <pubDate>Thu, 30 Oct 2025 13:00:00 GMT</pubDate>
            <description><![CDATA[ As AI agents change how the Internet is used, they create a challenge for security. We explore how Anonymous Credentials can rate limit agent traffic and block abuse without tracking users or compromising their privacy. ]]></description>
            <content:encoded><![CDATA[ <p></p><p>The way we interact with the Internet is changing. Not long ago, ordering a pizza meant visiting a website, clicking through menus, and entering your payment details. Soon, you might just <a href="https://www.cnet.com/tech/services-and-software/i-had-chatgpt-order-me-a-pizza-this-could-change-everything/"><u>ask your phone</u></a> to order a pizza that matches your preferences. A program on your device or on a remote server, which we call an <a href="https://developers.cloudflare.com/agents/concepts/what-are-agents/"><u>AI agent</u></a>, would visit the website and orchestrate the necessary steps on your behalf.</p><p>Of course, agents can do much more than order pizza. Soon we might use them to buy concert tickets, plan vacations, or even write, review, and merge pull requests. While some of these tasks will eventually run locally, for now, most are powered by massive AI models running in the biggest datacenters in the world. As agentic AI increases in popularity, we expect to see a large increase in traffic from these AI platforms and a corresponding drop in traffic from more conventional sources (like your phone).</p><p>This shift in traffic patterns has prompted us to assess how to keep our customers online and secure in the AI era. On one hand, the nature of requests is changing: Websites optimized for human visitors will have to cope with faster, and potentially greedier, agents. On the other hand, AI platforms may soon become a significant source of attacks, originating from malicious users of the platforms themselves.</p><p>Unfortunately, existing tools for managing such (mis)behavior are likely too coarse-grained to manage this transition. For example, <a href="https://blog.cloudflare.com/per-customer-bot-defenses/"><u>when Cloudflare detects that a request is part of a known attack pattern</u></a>, the best course of action often is to block all subsequent requests from the same source. 
When the source is an AI agent platform, this could mean inadvertently blocking all users of the same platform, even honest ones who just want to order pizza. We started addressing this problem <a href="https://blog.cloudflare.com/web-bot-auth/"><u>earlier this year</u></a>. But as agentic AI grows in popularity, we think the Internet will need more fine-grained mechanisms of managing agents without impacting honest users.</p><p>At the same time, we firmly believe that any such security mechanism must be designed with user privacy at its core. In this post, we'll describe how to use <b>anonymous credentials (AC)</b> to build these tools. Anonymous credentials help website operators to enforce a wide range of security policies, like rate-limiting users or blocking a specific malicious user, without ever having to identify any user or track them across requests.</p><p>Anonymous credentials are <a href="https://mailarchive.ietf.org/arch/msg/privacy-pass/--JXbGvkHnLq1iHQKJAnfn5eH9A/"><u>under development at IETF</u></a> in order to provide a standard that can work across websites, browsers, platforms. It's still in its early stages, but we believe this work will play a critical role in keeping the Internet secure and private in the AI era. We will be contributing to this process as we work towards real-world deployment. This is still early days. If you work in this space, we hope you will follow along and contribute as well.</p>
    <div>
      <h2>Let’s build a small agent</h2>
      <a href="#lets-build-a-small-agent">
        
      </a>
    </div>
    <p>To help us discuss how AI agents are affecting web servers, let’s build an agent ourselves. Our goal is to have an agent that can order a pizza from a nearby pizzeria. Without an agent, you would open your browser, figure out which pizzeria is nearby, view the menu and make selections, add any extras (double pepperoni), and proceed to checkout with your credit card. With an agent, it’s the same flow, except that the agent is opening and orchestrating the browser on your behalf.</p><p>In the traditional flow, there’s a human all along the way, and each step has a clear intent: list all pizzerias within 3 km of my current location; pick a pizza from the menu; enter my credit card; and so on. An agent, on the other hand, has to infer each of these actions from the prompt "order me a pizza."</p><p>In this section, we’ll build a simple program that takes a prompt and can make outgoing requests. Here’s an example of a simple <a href="https://workers.cloudflare.com/"><u>Worker</u></a> that takes a specific prompt and generates an answer accordingly. You can find the code on <a href="https://github.com/cloudflareresearch/mini-ai-agent-demo"><u>GitHub</u></a>:</p>
            <pre><code>export default {
   async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise&lt;Response&gt; {
       const out = await env.AI.run("@cf/meta/llama-3.1-8b-instruct-fp8", {
           prompt: `I'd like to order a pepperoni pizza with extra cheese.
                    Please deliver it to Cloudflare Austin office.
                    Price should not be more than $20.`,
       });


       return new Response(out.response);
   },
} satisfies ExportedHandler&lt;Env&gt;;</code></pre>
            <p>In this context, the LLM provides its best answer. It gives us a plan and instructions, but does not perform the action on our behalf. You and I are able to take a list of instructions and act upon them because we have agency and can affect the world. To allow our agent to interact with more of the world, we’re going to give it control over a web browser.</p><p>Cloudflare offers a <a href="https://developers.cloudflare.com/browser-rendering"><u>Browser Rendering</u></a> service that can bind directly into our Worker. Let’s do that. The following code uses <a href="https://www.stagehand.dev/"><u>Stagehand</u></a>, an automation framework that makes it simple to control the browser. We pass it an instance of Cloudflare’s remote browser, as well as a client for <a href="https://developers.cloudflare.com/workers-ai/"><u>Workers AI</u></a>.</p>
            <pre><code>import { Stagehand } from "@browserbasehq/stagehand";
import { endpointURLString } from "@cloudflare/playwright";
import { WorkersAIClient } from "./workersAIClient"; // wrapper to convert cloudflare AI


export default {
   async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise&lt;Response&gt; {
       const stagehand = new Stagehand({
           env: "LOCAL",
           localBrowserLaunchOptions: { cdpUrl: endpointURLString(env.BROWSER) },
           llmClient: new WorkersAIClient(env.AI),
           verbose: 1,
       });
       await stagehand.init();


       const page = stagehand.page;
       await page.goto("https://mini-ai-agent.cloudflareresearch.com/llm");


       const { extraction } = await page.extract("what are the pizza available on the menu?");
       return new Response(extraction);
   },
} satisfies ExportedHandler&lt;Env&gt;;</code></pre>
            <p>You can access that code for yourself on <a href="https://mini-ai-agent.cloudflareresearch.com/llm"><i><u>https://mini-ai-agent.cloudflareresearch.com/llm</u></i></a>. Here’s the response we got on October 10, 2025:</p>
            <pre><code>Margherita Classic: $12.99
Pepperoni Supreme: $14.99
Veggie Garden: $13.99
Meat Lovers: $16.99
Hawaiian Paradise: $15.49</code></pre>
            <p>Using the screenshot API of browser rendering, we can also inspect what the agent is doing. Here's how the browser renders the page in the example above:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6lXTePCTUORCyyOWNNcwZ8/5978abd1878f78107a2c9606c3a1ef51/image4.png" />
          </figure><p>Stagehand allows us to identify components on the page, such as <code>page.act("Click on pepperoni pizza")</code> and <code>page.act("Click on Pay now")</code>. This eases interaction between the developer and the browser.</p><p>To go further, and instruct the agent to perform the whole flow autonomously, we have to use the appropriately named <a href="https://docs.stagehand.dev/basics/agent"><u>agent</u></a> mode of Stagehand. This feature is not yet supported by Cloudflare Workers, but is provided below for completeness.</p>
            <pre><code>import { Stagehand } from "@browserbasehq/stagehand";
import { endpointURLString } from "@cloudflare/playwright";
import { WorkersAIClient } from "./workersAIClient";


export default {
   async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise&lt;Response&gt; {
       const stagehand = new Stagehand({
           env: "LOCAL",
           localBrowserLaunchOptions: { cdpUrl: endpointURLString(env.BROWSER) },
           llmClient: new WorkersAIClient(env.AI),
           verbose: 1,
       });
       await stagehand.init();
       
       const agent = stagehand.agent();
       const result = await agent.execute(`I'd like to order a pepperoni pizza with extra cheese.
                                           Please deliver it to Cloudflare Austin office.
                                           Price should not be more than $20.`);


       return new Response(result.message);
   },
} satisfies ExportedHandler&lt;Env&gt;;</code></pre>
            <p>We can see that instead of adding step-by-step instructions, the agent is given control. To actually pay, it would need access to a payment method such as a <a href="https://en.wikipedia.org/wiki/Controlled_payment_number"><u>virtual credit card</u></a>.</p><p>The prompt had some subtlety in that we’ve scoped the location to Cloudflare’s Austin office. This is because while the agent responds to us, it needs to understand our context. In this case, the agent operates out of Cloudflare’s edge, a location remote to us. This implies we are unlikely to pick up a pizza from this <a href="https://www.cloudflare.com/learning/cdn/glossary/data-center/"><u>data center</u></a> if it were ever delivered.</p><p>The more capabilities we provide to the agent, the more disruption it is able to create. Instead of someone having to make 5 clicks at a slow rate of 1 request per 10 seconds, they’d have a program running in a data center possibly making all 5 requests in a second.</p><p>This agent is simple, but now imagine many thousands of these — some benign, some not — running at datacenter speeds. This is the challenge origins will face.</p>
    <div>
      <h2>Protecting origins</h2>
      <a href="#protecting-origins">
        
      </a>
    </div>
    <p>For humans to interact with the online world, they need a web browser and some peripherals with which to direct the behavior of that browser. Agents are another way of directing a browser, so it may be tempting to think that not much is actually changing from the origin's point of view. Indeed, the most obvious change is merely where traffic comes from:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/304j2MNDUNwAaipqmH2Jbt/35beb792bda327a6cf0db3b642bbc4d6/unnamed-1.png" />
          </figure><p>The reason this change is significant has to do with the tools the server has to manage traffic. Websites generally try to be as permissive as possible, but they also need to manage finite resources (bandwidth, CPU, memory, storage, and so on). There are a few basic ways to do this:</p><ol><li><p><b>Global security policy</b>: A server may opt to slow down, CAPTCHA, or even temporarily block requests from all users. This policy may be applied to an entire site, a specific resource, or to requests classified as being part of a known or likely attack pattern. Such mechanisms may be deployed in reaction to an observed spike in traffic, as in a DDoS attack, or in anticipation of a spike in legitimate traffic, as in <a href="https://developers.cloudflare.com/waiting-room/"><u>Waiting Room</u></a>.</p></li><li><p><b>Incentives</b>: Servers sometimes try to incentivize users to use the site when more resources are available. For instance, prices may be lower depending on the location or the time of the request. This could be implemented with a <a href="https://developers.cloudflare.com/rules/snippets/when-to-use/"><u>Cloudflare Snippet</u></a>.</p></li></ol><p>While both tools can be effective, they also sometimes cause significant collateral damage. For example, while rate limiting a website's login endpoint <a href="https://developers.cloudflare.com/waf/rate-limiting-rules/best-practices/#protecting-against-credential-stuffing"><u>can help prevent credential stuffing attacks</u></a>, it also degrades the user experience for non-attackers. Before resorting to such measures, servers will first try to apply the security policy (whether a rate limit, a CAPTCHA, or an outright block) to individual users or groups of users.</p><p>However, in order to apply a security policy to individuals, the server needs some way of identifying them. 
Historically, this has been done via some combination of IP addresses, <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent"><u>User-Agent</u></a>, an account tied to the user identity (if available), and other fingerprints. Like most cloud service providers, Cloudflare has a <a href="https://developers.cloudflare.com/waf/rate-limiting-rules/best-practices/"><u>dedicated offering</u></a> for per-user rate limits based on such heuristics.</p><p>Fingerprinting works for the most part. However, its burden is inequitably distributed. On mobile, users <a href="https://blog.cloudflare.com/eliminating-captchas-on-iphones-and-macs-using-new-standard/#captchas-dont-work-in-mobile-environments-pats-remove-the-need-for-them"><u>have an especially difficult time solving CAPTCHA</u>s</a>, when using a VPN they’re <a href="https://arstechnica.com/tech-policy/2023/07/meta-blocking-vpn-access-to-threads-in-eu/"><u>more</u></a> <a href="https://help.netflix.com/en/node/277"><u>likely</u></a> <a href="https://www.theregister.com/2015/10/19/bbc_cuts_off_vpn_to_iplayer/"><u>to</u></a> <a href="https://torrentfreak.com/hulu-blocks-vpn-users-over-piracy-concerns-140425/"><u>be</u></a> <a href="https://developers.cloudflare.com/cloudflare-one/connections/connect-devices/warp/troubleshooting/common-issues/"><u>blocked</u></a>, and when using <a href="https://www.peteresnyder.com/static/papers/speedreader-www2019.pdf"><u>reading mode</u></a> their fingerprint can be skewed, preventing the page from rendering.</p><p>Likewise, agentic AI only exacerbates the limitations of fingerprinting. 
Not only will more traffic be concentrated in a smaller source IP range, but the agents themselves will also run on the same software and hardware platforms, making it harder to distinguish honest from malicious users.</p><p>Something that could help is <a href="https://blog.cloudflare.com/web-bot-auth/"><u>Web Bot Auth</u></a>, which would allow agents to identify to the origin which platform they're operated by. However, we wouldn't want to extend this mechanism — intended for identifying the platform itself — to identifying individual users of the platforms, as this would create an unacceptable privacy risk for these users.</p><p>We need some way of implementing security controls for individual users without identifying them. But how? The Privacy Pass protocol provides <a href="https://blog.cloudflare.com/eliminating-captchas-on-iphones-and-macs-using-new-standard/#captchas-dont-work-in-mobile-environments-pats-remove-the-need-for-them"><u>a partial solution</u></a>.</p>
    <div>
      <h2>Privacy Pass and its limitations</h2>
      <a href="#privacy-pass-and-its-limitations">
        
      </a>
    </div>
    <p>Today, one of the most prominent use cases for Privacy Pass is to <a href="https://www.cloudflare.com/learning/bots/what-is-rate-limiting/"><u>rate limit</u></a> requests from a user to an origin, as we have <a href="https://blog.cloudflare.com/privacy-pass-standard/#privacy-pass-protocol"><u>discussed before</u></a>. The protocol works roughly as follows. The client is <b>issued</b> a number of <b>tokens</b>. Each time it wants to make a request, it <b>redeems</b> one of its tokens to the origin; the origin allows the request through only if the token is <b>fresh</b>, i.e., has never been observed before by the origin.</p><p>In order to use Privacy Pass for per-user rate-limiting, it's necessary to limit the number of tokens issued to each user (e.g., 100 tokens per user per hour). To rate limit an AI agent, this role would be fulfilled by the AI platform. To obtain tokens, the user would log in to the platform, and the platform would allow the user to get tokens from the issuer. The AI platform fulfills the <b>attester</b> role in Privacy Pass <a href="https://datatracker.ietf.org/doc/html/rfc9576"><u>parlance</u></a>: the attester is the party guaranteeing the per-user property of the rate limit. The AI platform, as an attester, is incentivized to enforce this token distribution because it stakes its reputation: should it allow too many tokens to be issued, the issuer could stop trusting it.</p>
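As a minimal sketch of the attester's side of this arrangement, a per-user issuance budget might look like the following. The names are ours, a real attester would also authenticate the user, and the hourly window reset is omitted for brevity:

```typescript
// Hypothetical attester-side budget: at most LIMIT tokens per user per window
// (e.g., 100 tokens per user per hour, as in the text above).
const LIMIT = 100;
const issuedThisHour = new Map();

function mayIssue(userId: string, count: number): boolean {
  const used = issuedThisHour.get(userId) ?? 0;
  if (used + count > LIMIT) {
    return false; // over budget: refuse to attest this issuance request
  }
  issuedThisHour.set(userId, used + count);
  return true;
}
```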
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/62Rz5eS1UMm2pKorpowEGg/4949220bdf2fa3c39ccfa17d4df70fff/token__1_.png" />
          </figure><p>The issuance and redemption protocols are designed to have two properties:</p><ul><li><p>Tokens are <b>unforgeable</b>: only the issuer can issue valid tokens.</p></li><li><p>Tokens are <b>unlinkable: </b>no party, including the issuer, attester, or origin, can tell which user a token was issued to. </p></li></ul><p>These properties can be achieved using a <a href="https://csrc.nist.gov/glossary/term/cryptographic_primitive"><u>cryptographic primitive</u></a> called a <a href="https://blog.cloudflare.com/privacy-pass-the-math/"><u>blind signature</u></a><b> </b>scheme. In a conventional signature scheme, the signer uses its <b>private key</b> to produce a signature for a message. Later on, a verifier can use the signer’s <b>public key</b> to verify the signature. Blind signature schemes work in the same way, except that the message to be signed is blinded such that the signer doesn't know the message it's signing. The client “blinds” the message to be signed and sends it to the server, which then computes a blinded signature over the blinded message. The client obtains the final signature by unblinding the signature.  </p><p>This is exactly how the standardised Privacy Pass issuance protocols are defined by <a href="https://www.rfc-editor.org/rfc/rfc9578"><u>RFC 9578</u></a>:</p><ul>
  <li>
    <strong>Issuance:</strong> The user generates a random message 
    <strong>$k$</strong> 
    which we call the 
    <strong>nullifier</strong>. Concretely, this is just a random, 32-byte string. It then blinds the nullifier and sends it to the issuer. The issuer replies with a blind signature. Finally, the user unblinds the signature to get 
    <strong>$\sigma$</strong>, 
    a signature for the nullifier 
    <strong>$k$</strong>. The token is the pair 
    <strong>$(k, \sigma)$</strong>.
  </li>
  <li>
    <strong>Redemption:</strong> When the user presents 
    <strong>$(k, \sigma)$</strong>, 
    the origin checks that 
    <strong>$\sigma$</strong> 
    is a valid signature for the nullifier 
    <strong>$k$</strong> 
    and that 
    <strong>$k$</strong> 
    is fresh. If both conditions hold, then it accepts and lets the request through.
  </li>
</ul><p>Blind signatures are simple, cheap, and perfectly suited for many applications. However, they have some limitations that make them unsuitable for our use case.</p><p>First, the communication cost of the issuance protocol is too high. For each token issued, the user sends a 256-byte, blinded nullifier and the issuer replies with a 256-byte blind signature (assuming RSA-2048 is used). That's 0.5 KB of additional communication per request, or 500 KB for every 1,000 requests. This is manageable as we’ve seen in a <a href="https://eprint.iacr.org/2023/414.pdf"><u>previous experiment</u></a> for Privacy Pass, but not ideal. Ideally, the bandwidth would be sublinear in the rate limit we want to enforce. Oblivious Pseudorandom Functions (<a href="https://datatracker.ietf.org/doc/rfc9497/"><u>VOPRF</u></a>s) are an alternative to blind signatures with lower compute time, but their bandwidth is still asymptotically linear. We’ve <a href="https://blog.cloudflare.com/privacy-pass-the-math/"><u>discussed them in the past</u></a>, as they served as the basis for early deployments of Privacy Pass.</p><p>Second, blind signatures can't be used to rate-limit on a per-origin basis. Ideally, when issuing $N$ tokens to the client, the client would be able to redeem at most $N$ tokens at any origin server that can verify the token's validity. However, the client can't safely redeem the same token at more than one server because it would be possible for the servers to link those redemptions to the same client. What's needed is some mechanism for what we'll call <b>late origin-binding</b>: transforming a token for redemption at a particular origin in a way that's unlinkable to other redemptions of the same token.</p><p>Third, once a token is issued, it can't be revoked: it remains valid as long as the issuer's public key is valid. This makes it impossible for an origin to block a specific user if it detects an attack, or if its tokens are compromised. 
The origin can block the offending request, but the user can continue to make requests using its remaining token budget.</p>
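To summarize the single-use flow before moving on, the origin's redemption check can be sketched as follows. Here `verify` is a stand-in for real blind signature verification against the issuer's public key, and a `Set` models the origin's store of spent nullifiers:

```typescript
// The origin tracks which nullifiers it has already seen.
const spentNullifiers = new Set();

function redeemToken(
  nullifier: string,
  signature: string,
  verify: (nullifier: string, signature: string) => boolean
): boolean {
  if (!verify(nullifier, signature)) {
    return false; // forged token: the signature does not check out
  }
  if (spentNullifiers.has(nullifier)) {
    return false; // replayed token: the nullifier is not fresh
  }
  spentNullifiers.add(nullifier); // mark as spent, then let the request through
  return true;
}
```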
    <div>
      <h2>Anonymous credentials and the future of Privacy Pass</h2>
      <a href="#anonymous-credentials-and-the-future-of-privacy-pass">
        
      </a>
    </div>
    <p>As noted by <a href="https://dl.acm.org/doi/pdf/10.1145/4372.4373"><u>Chaum</u></a> in 1985, an <b>anonymous credential</b> system allows users to obtain a credential from an issuer, and later prove possession of this credential, in an unlinkable way, without revealing any additional information. It is also possible to demonstrate that certain attributes are attached to the credential.</p><p>One way to think of an anonymous credential is as a kind of blind signature with some additional capabilities: late-binding (link a token to an origin after issuance), multi-show (generate multiple tokens from a single issuer response), and expiration distinct from key rotation (token validity decoupled from the validity of the issuer's cryptographic key). In the redemption flow for Privacy Pass, the client presents the unblinded message and signature to the server. To accept the redemption, the server needs to verify the signature. In an AC system, the client only presents a <b>part of the message</b>. In order for the server to accept the request, the client needs to prove to the server that it knows a valid signature for the entire message without revealing the whole thing.</p><p>The flow we described above would therefore include this additional <b>presentation</b> step. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7pb3ZDoAHbDLEt0mxtf67T/b6be11710a7df7a4df7d1c89788285a7/credentials__2_.png" />
          </figure><p>Note that the tokens generated through blind signatures or VOPRFs can only be used once, so they can be regarded as <i>single-use tokens</i>. However, there exists a type of anonymous credentials that allows tokens to be used multiple times. For this to work, the issuer grants a <i>credential</i> to the user, who can later derive at most <i>N</i> many single-use tokens for redemption. Therefore, the user can send multiple requests, at the expense of a single issuance session.  </p><p>The table below describes how blind signatures and anonymous credentials provide features of interest to rate limiting.</p><table><tr><td><p><b>Feature</b></p></td><td><p><b>Blind Signature</b></p></td><td><p><b>Anonymous Credential</b></p></td></tr><tr><td><p><b>Issuing Cost</b></p></td><td><p>Linear complexity: issuing 10 signatures is 10x as expensive as issuing one signature</p></td><td><p>Sublinear complexity: signing 10 attributes is cheaper than 10 individual signatures</p></td></tr><tr><td><p><b>Proof Capability</b></p></td><td><p>Only prove that a message has been signed</p></td><td><p>Allow efficient proving of partial statements (i.e., attributes)</p></td></tr><tr><td><p><b>State Management</b></p></td><td><p>Stateless</p></td><td><p>Stateful</p></td></tr><tr><td><p><b>Attributes</b></p></td><td><p>No attributes</p></td><td><p>Public (e.g. expiry time) and private state</p></td></tr></table><p>
  Let's see how a simple anonymous credential scheme works. The client's message consists of the pair 
  <strong>$(k, C)$</strong>, 
  where 
  <strong>$k$</strong> 
  is a 
  <strong>nullifier</strong> and 
  <strong>$C$</strong> 
  is a 
    <strong>counter</strong> representing the remaining number of times the client can access a resource. The value of the counter is controlled by the server: when the client redeems its credential, it presents both the nullifier and the counter. In response, the server checks that the signature of the message is valid and that the nullifier is fresh, as before. Additionally, the server also
</p><ol><li><p>checks that the counter is greater than zero; and</p></li><li><p>decrements the counter, issuing a new credential for the updated counter and a fresh nullifier.</p></li></ol><p>A blind signature could be used to provide this functionality. However, whereas the nullifier can be blinded as before, it would be necessary to handle the counter in plaintext so that the server can check that the counter is valid (Step 1) and update it (Step 2). This creates an obvious privacy risk since the server, which is in control of the counter, can use it to link multiple presentations by the same client. For example, when you reach out to buy a pepperoni pizza, the origin could assign you a special counter value, which eases fingerprinting when you present it a second time. Fortunately, there exist anonymous credentials designed to close this kind of privacy gap.</p><p>The scheme above is a simplified version of Anonymous Credit Tokens (<a href="https://datatracker.ietf.org/doc/draft-schlesinger-cfrg-act/"><u>ACT</u></a>), one of the anonymous credential schemes being considered for adoption by the <a href="https://datatracker.ietf.org/wg/privacypass/about/"><u>Privacy Pass working group</u></a> at IETF. The key feature of ACT is its <b>statefulness</b>: upon successful redemption, the server re-issues a new credential with updated nullifier and counter values. This creates a feedback loop between the client and server that can be used to express a variety of security policies.</p><p>By design, it's not possible to present ACT credentials multiple times simultaneously: the first presentation must be completed so that the re-issued credential can be presented in the next request. <b>Parallelism </b>is the key feature of Anonymous Rate-limited Credential (<a href="https://datatracker.ietf.org/doc/html/draft-yun-cfrg-arc-00"><u>ARC</u></a>), another scheme under discussion at the Privacy Pass working group. 
ARCs can be presented across multiple requests in parallel up to the presentation limit determined during issuance.</p><p>Another important feature of ARC is its support for late origin-binding: when a client is issued an ARC with presentation limit $N$, it can safely use its credential to present up to $N$ times to any origin that can verify the credential.</p><p>These are just examples of relevant features of some anonymous credentials. Some applications may benefit from a subset of them; others may need additional features. Fortunately, both ACT and ARC can be constructed from a small set of cryptographic primitives that can be easily adapted for other purposes.</p>
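<p>Stripped of all cryptography, the stateful redeem-and-reissue loop at the heart of ACT looks like the following sketch in Go. The <code>credential</code> type and names here are illustrative only: in the real scheme, the nullifier and counter are blinded and checked with an algebraic MAC and zero-knowledge proofs rather than read by the server in plaintext.</p>

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// credential is a plaintext stand-in for an ACT credential; in the real
// scheme both fields are hidden from the server behind blinding and ZKPs.
type credential struct {
	Nullifier string
	Counter   int
}

type server struct {
	seen map[string]bool // nullifiers already redeemed
}

func freshNullifier() string {
	b := make([]byte, 16)
	rand.Read(b)
	return hex.EncodeToString(b)
}

// redeem checks freshness and the counter, then re-issues an updated credential.
func (s *server) redeem(c credential) (credential, error) {
	if s.seen[c.Nullifier] {
		return credential{}, fmt.Errorf("double spend: nullifier already used")
	}
	if c.Counter <= 0 { // Step 1: the counter must be greater than zero.
		return credential{}, fmt.Errorf("no credits left")
	}
	s.seen[c.Nullifier] = true
	// Step 2: decrement the counter and issue a fresh nullifier.
	return credential{Nullifier: freshNullifier(), Counter: c.Counter - 1}, nil
}

func main() {
	srv := &server{seen: map[string]bool{}}
	cred := credential{Nullifier: freshNullifier(), Counter: 3}
	for i := 0; i < 3; i++ {
		var err error
		cred, err = srv.redeem(cred)
		fmt.Println(cred.Counter, err)
	}
	_, err := srv.redeem(cred) // counter is now 0, so this fails
	fmt.Println(err)
}
```

Note the feedback loop: each successful redemption consumes the presented credential and hands back its replacement, which is exactly why presentations cannot run in parallel.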
    <div>
      <h2>Building blocks for anonymous credentials</h2>
      <a href="#building-blocks-for-anonymous-credentials">
        
      </a>
    </div>
    <p>ARC and ACT share two primitives in common: <a href="https://eprint.iacr.org/2013/516.pdf"><b><u>algebraic MACs</u></b></a>, which provide for limited computations on the blinded message; and <a href="https://en.wikipedia.org/wiki/Zero-knowledge_proof"><b><u>zero-knowledge proofs (ZKP)</u></b></a> for proving validity of the part of the message not revealed to the server. Let's take a closer look at each.</p>
    <div>
      <h3>Algebraic MACs</h3>
      <a href="#algebraic-macs">
        
      </a>
    </div>
    <p>A Message Authentication Code (MAC) is a cryptographic tag used to verify a message's authenticity (that it comes from the claimed sender) and integrity (that it has not been altered). Algebraic MACs are built from mathematical structures like <a href="https://en.wikipedia.org/wiki/Group_action"><u>group actions</u></a>. The algebraic structure gives them some additional functionality, one piece of which is a <i>homomorphism</i> that makes blinding easy: adding a random value to an algebraic MAC conceals its actual value.</p><p>Unlike blind signatures, both ACT and ARC are only <i>privately</i> verifiable, meaning that both the issuer and the origin must hold the issuer's private key. Taking Cloudflare as an example, this means that a credential issued by Cloudflare can only be redeemed by an origin behind Cloudflare. Publicly verifiable variants of both are possible, but at an additional cost.</p>
    <div>
      <h3>Zero-Knowledge Proofs for linear relations</h3>
      <a href="#zero-knowledge-proofs-for-linear-relations">
        
      </a>
    </div>
    <p>Zero-knowledge proofs (ZKP) allow us to prove a statement is true without revealing the exact value that makes the statement true. The ZKP is constructed by a prover in such a way that it can only be generated by someone who actually possesses the secret. The verifier can then run a quick mathematical check on this proof. If the check passes, the verifier is convinced that the prover's initial statement is valid. The crucial property is that the proof itself is just data that confirms the statement; it contains no other information that could be used to reconstruct the original secret.</p><p>For ARC and ACT, we want to prove <i>linear relations</i> over secrets. In ARC, a user needs to prove that different tokens are linked to the same original secret credential. For example, a user can generate a proof showing that a <i>request token</i> was derived from a valid <i>issued credential</i>. The system can verify this proof to confirm the tokens are legitimately connected, all without ever learning the underlying secret credential that ties them together. This allows the system to validate user actions while guaranteeing their privacy.</p><p>Proving simple linear relations can be extended to prove a number of powerful statements, for example that a number lies in a given range. This is useful, for instance, to prove that you have a non-negative balance on your account: you prove that your balance can be encoded in binary. Let’s say you can at most have 1024 credits in your account. To prove your balance is in range when it is, say, 12, you prove two things simultaneously: first, that you have a set of binary bits, in this case 12=(1100)<sub>2</sub>, and second, that a linear equation using these bits (8*1 + 4*1 + 2*0 + 1*0) correctly adds up to your total committed balance. This convinces the verifier that the number is validly constructed without them learning its exact value. 
This is how it works for powers of two, but it can <a href="https://github.com/chris-wood/draft-arc/pull/38"><u>easily be extended to arbitrary ranges</u></a>.</p><p>The mathematical structure of algebraic MACs allows easy blinding and evaluation. The structure also allows for an easy proof that a MAC has been evaluated with the private key, without revealing the MAC. In addition, ARC could use ZKPs to prove that a nonce has not been spent before. In contrast, ACT uses ZKPs to prove that we have enough balance left on our token. The balance is subtracted homomorphically using more group structure.</p>
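<p>The linear relation underlying this bit-decomposition argument can be checked with ordinary arithmetic. The sketch below (function names are ours) only verifies the relation in the clear; a real range proof establishes the same two facts over commitments, in zero knowledge.</p>

```go
package main

import "fmt"

// decompose returns the n bits of v (most significant first), or nil if v
// falls outside the range [0, 2^n) that n bits can encode.
func decompose(v, n int) []int {
	if v < 0 || v >= 1<<n {
		return nil
	}
	bits := make([]int, n)
	for i := 0; i < n; i++ {
		bits[n-1-i] = (v >> i) & 1
	}
	return bits
}

// recompose evaluates the linear relation: the sum of bit_i * 2^i.
func recompose(bits []int) int {
	v := 0
	for _, b := range bits {
		v = v*2 + b
	}
	return v
}

func main() {
	bits := decompose(12, 4)
	fmt.Println(bits, recompose(bits)) // the bits of 12 = (1100)₂ sum back to 12
	// Exhibiting a valid 10-bit decomposition proves 0 <= v < 1024;
	// a value outside the range has no such decomposition:
	fmt.Println(decompose(2000, 10) == nil)
}
```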
    <div>
      <h2>How much does this all cost?</h2>
      <a href="#how-much-does-this-all-cost">
        
      </a>
    </div>
    <p>Anonymous credentials allow for more flexibility and, in certain applications, have the potential to reduce communication cost compared to blind signatures. To identify such applications, we need to measure the concrete communication cost of these new protocols. In addition, we need to understand how their CPU usage compares to blind signatures and oblivious pseudorandom functions.</p><p>We measure the time that each participant spends at each stage of these AC schemes, and report the size of the messages transmitted across the network. For ARC, ACT, and VOPRF, we'll use <a href="https://doi.org/10.17487/RFC9496"><u>ristretto255</u></a> as the prime-order group and SHAKE128 for hashing. For Blind RSA, we'll use a 2048-bit modulus and SHA-384 for hashing.</p><p>Each algorithm was implemented in Go, on top of the <a href="https://github.com/cloudflare/circl"><u>CIRCL</u></a> library. We plan to open source the code once the specifications of ARC and ACT begin to stabilize.</p><p>Let’s take a look at the most widely used deployment in Privacy Pass: Blind RSA. Redemption time is low, and most of the cost lies with the server at issuance time. Communication cost is mostly constant, on the order of 256 bytes.</p>
<div><table><thead>
  <tr>
    <th colspan="2"><span>Blind RSA</span><br /><a href="https://doi.org/10.17487/RFC9474"><span>RFC 9474</span></a><span> (RSA-2048+SHA-384)</span></th>
    <th colspan="2"><span>1 Token</span></th>
  </tr>
  <tr>
    <th colspan="2"></th>
    <th><span>Time</span></th>
    <th><span>Message Size</span></th>
  </tr></thead>
<tbody>
  <tr>
    <td rowspan="3"><span>Issuance</span></td>
    <td><span>Client (Blind)</span></td>
    <td><span>63 µs</span></td>
    <td><span>256 B</span></td>
  </tr>
  <tr>
    <td><span>Server (Evaluate)</span></td>
    <td><span>2.69 ms</span></td>
    <td><span>256 B</span></td>
  </tr>
  <tr>
    <td><span>Client (Finalize)</span></td>
    <td><span>37 µs</span></td>
    <td><span>256 B</span></td>
  </tr>
  <tr>
    <td rowspan="2"><span>Redemption</span></td>
    <td><span>Client</span></td>
    <td><span>–</span></td>
    <td><span>300 B</span></td>
  </tr>
  <tr>
    <td><span>Server</span></td>
    <td><span>37 µs</span></td>
    <td><span>–</span></td>
  </tr>
</tbody></table></div><p>When looking at VOPRF, verification time on the server is slightly higher than for Blind RSA, but communication cost and issuance are much faster. Evaluation time on the server is 10x faster for 1 token, and more than 25x faster when using <a href="https://datatracker.ietf.org/doc/draft-ietf-privacypass-batched-tokens/"><u>amortized token issuance</u></a>. Communication cost per token is also more appealing, with a message size at least 3x lower.</p>
<div><table><thead>
  <tr>
    <th colspan="2" rowspan="2"><span>VOPRF</span><br /><a href="https://doi.org/10.17487/RFC9497"><span>RFC 9497</span></a><span> (Ristretto255+SHA-512)</span></th>
    <th colspan="2"><span>1 Token</span></th>
    <th colspan="2"><span>1000 Amortized issuances</span></th>
  </tr>
  <tr>
    <th><span>Time</span></th>
    <th><span>Message Size</span></th>
    <th><span>Time</span><br /><span>(per token)</span></th>
    <th><span>Message Size</span><br /><span>(per token)</span></th>
  </tr></thead>
<tbody>
  <tr>
    <td rowspan="3"><span>Issuance</span></td>
    <td><span>Client (Blind)</span></td>
    <td><span>54 µs</span></td>
    <td><span>32 B</span></td>
    <td><span>54 µs</span></td>
    <td><span>32 B</span></td>
  </tr>
  <tr>
    <td><span>Server (Evaluate)</span></td>
    <td><span>260 µs</span></td>
    <td><span>96 B</span></td>
    <td><span>99 µs</span></td>
    <td><span>32.064 B</span></td>
  </tr>
  <tr>
    <td><span>Client (Finalize)</span></td>
    <td><span>376 µs</span></td>
    <td><span>64 B</span></td>
    <td><span>173 µs</span></td>
    <td><span>64 B</span></td>
  </tr>
  <tr>
    <td rowspan="2"><span>Redemption</span></td>
    <td><span>Client</span></td>
    <td><span>–</span></td>
    <td><span>96 B</span></td>
    <td><span>–</span></td>
    <td></td>
  </tr>
  <tr>
    <td><span>Server</span></td>
    <td><span>57 µs</span></td>
    <td><span>–</span></td>
    <td></td>
    <td></td>
  </tr>
</tbody></table></div><p>This makes VOPRF tokens appealing for applications requiring a lot of tokens that can accept a slightly higher redemption cost, and that don’t need public verifiability.</p><p>Now, let’s take a look at the figures for ARC and ACT anonymous credential schemes. For both schemes we measure the time to issue a credential that can be presented at most $N=1000$ times.</p>
<div><table><thead>
  <tr>
    <th rowspan="2"><span>Issuance</span><br /><span>Credential Generation</span></th>
    <th colspan="2"><span>ARC</span></th>
    <th colspan="2"><span>ACT</span></th>
  </tr>
  <tr>
    <th><span>Time</span></th>
    <th><span>Message Size</span></th>
    <th><span>Time</span></th>
    <th><span>Message Size</span></th>
  </tr></thead>
<tbody>
  <tr>
    <td><span>Client (Request)</span></td>
    <td><span>323 µs</span></td>
    <td><span>224 B</span></td>
    <td><span>64 µs</span></td>
    <td><span>141 B</span></td>
  </tr>
  <tr>
    <td><span>Server (Response)</span></td>
    <td><span>1349 µs</span></td>
    <td><span>448 B</span></td>
    <td><span>251 µs</span></td>
    <td><span>176 B</span></td>
  </tr>
  <tr>
    <td><span>Client (Finalize)</span></td>
    <td><span>1293 µs</span></td>
    <td><span>128 B</span></td>
    <td><span>204 µs</span></td>
    <td><span>176 B</span></td>
  </tr>
  <tr>
    <td rowspan="2"><span>Redemption</span><br /><span>Credential Presentation</span></td>
    <td colspan="2"><span>ARC</span></td>
    <td colspan="2"><span>ACT</span></td>
  </tr>
  <tr>
    <td><span>Time</span></td>
    <td><span>Message Size</span></td>
    <td><span>Time</span></td>
    <td><span>Message Size</span></td>
  </tr>
  <tr>
    <td><span>Client (Present)</span></td>
    <td><span>735 µs</span></td>
    <td><span>288 B</span></td>
    <td><span>1740 µs</span></td>
    <td><span>1867 B</span></td>
  </tr>
  <tr>
    <td><span>Server (Verify/Refund)</span></td>
    <td><span>740 µs</span></td>
    <td><span>–</span></td>
    <td><span>1785 µs</span></td>
    <td><span>141 B</span></td>
  </tr>
  <tr>
    <td><span>Client (Update)</span></td>
    <td><span>–</span></td>
    <td><span>–</span></td>
    <td><span>508 µs</span></td>
    <td><span>176 B</span></td>
  </tr>
</tbody></table></div><p>As we would hope, the communication cost and the server’s runtime are much lower than a batched issuance with either Blind RSA or VOPRF. For example, a VOPRF issuance of 1000 tokens takes 99 ms (99 µs per token) <i>vs</i> 1.35 ms for issuing one ARC credential that allows for 1000 presentations. This is about 70x faster. The trade-off is that presentation is more expensive, both for the client and the server.</p><p>How about ACT? Like ARC, we would expect the communication cost of issuance to grow much more slowly with respect to the number of credits issued. Our implementation bears this out. However, there are some interesting performance differences between ARC and ACT: issuance is much cheaper for ACT than it is for ARC, but redemption is the opposite.</p><p>What's going on? The answer largely has to do with what each party needs to prove with ZKPs at each step. For example, during ACT redemption, the client proves to the server (in zero-knowledge) that its counter $C$ is in the desired range, i.e., $0 \leq C \leq N$. The proof size is on the order of $\log_{2} N$, which accounts for the larger message size. In the current version, ARC redemption does not involve range proofs, but a range proof may be added in a <a href="http://mailarchive.ietf.org/arch/msg/privacy-pass/A3VHUdHqhslwBzYEQjcaXzYQAxQ/"><u>future version</u></a>. Meanwhile, the statements the client and server need to prove during ARC issuance are a bit more complicated than for ARC presentation, which accounts for the difference in runtime there.</p><p>The advantage of anonymous credentials, as discussed in the previous sections, is that issuance only has to be performed once. When a server evaluates its cost, it has to take into account the cost of all issuances and the cost of all verifications. At present, accounting only for credential costs, it’s cheaper for a server to issue and verify single-use tokens than to verify an anonymous credential presentation.</p><p>The advantage of multiple-use anonymous credentials is that, instead of the issuer generating $N$ tokens, the bulk of the computation is offloaded to the clients. They also offer features that single-use tokens lack: late origin-binding lets one credential serve multiple origins or namespaces, range proofs decorrelate expiration from key rotation, and refunds provide a dynamic rate limit. Their current applications are motivated more by the limitations of single-use token schemes than by the added efficiency they provide. This is an exciting area to explore, to see whether the remaining performance gap can be closed.</p>
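<p>A back-of-envelope estimate using the server-side numbers from the tables above makes this concrete for 1000 requests. This is a rough model, not a benchmark, and the helper names are ours:</p>

```go
package main

import "fmt"

// Server-side CPU estimates, in microseconds, taken from the tables above.

// voprfServerMicros: amortized issuance (99 µs) plus redemption (57 µs),
// both paid per token.
func voprfServerMicros(n float64) float64 { return 99*n + 57*n }

// arcServerMicros: one issuance (1349 µs) covering n presentations, plus
// one verification (740 µs) per presentation.
func arcServerMicros(n float64) float64 { return 1349 + 740*n }

func main() {
	const n = 1000
	fmt.Printf("VOPRF: %.0f µs, ARC: %.0f µs\n",
		voprfServerMicros(n), arcServerMicros(n))
	// For 1000 requests, single-use VOPRF tokens still cost the server less
	// in total, even though ARC's issuance cost is paid only once.
}
```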
    <div>
      <h2>Managing agents with anonymous credentials</h2>
      <a href="#managing-agents-with-anonymous-credentials">
        
      </a>
    </div>
    <p>Managing agents will likely require features from both ARC and ACT.</p><p>ARC already has much of the functionality we need: it supports rate limiting, is communication-efficient, and it supports late origin-binding. Its main downside is that, once an ARC credential is issued, it can't be revoked. A malicious user can always make up to <i>N</i> requests to any origin it wants.</p><p>We can allow for a limited form of revocation by pairing ARC with blind signatures (or VOPRF). Each presentation of the ARC credential is accompanied by a Privacy Pass token: upon successful presentation, the client is issued another Privacy Pass token it can use during the next presentation. To revoke a credential, the server would simply not re-issue the token:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6EiHmkbLef6kXsQU473fcX/d1d4018eaf2abd42b9690ae5d01494dc/image1.png" />
          </figure><p>This scheme is already quite useful. However, it has some important limitations:</p><ul><li><p>Parallel presentation across origins is not possible: the client must wait for the request to one origin to succeed before it can initiate a request to a second origin.</p></li><li><p>Revocation is <i>global</i> rather than per-origin, meaning the credential is not only revoked for the origin to which it was presented, but for every origin it can be presented to. We suspect this will be undesirable in some cases. For example, an origin may want to revoke if a request violates its <code>robots.txt</code> policy; but the same request may have been accepted by other origins.  </p></li></ul><p>A more fundamental limitation of this design is that the decision to revoke can only be made on the basis of a single request — the one in which the credential was presented. It may be risky to block a user on the basis of a single request; in practice, attack patterns may only emerge across many requests. ACT's statefulness enables at least a rudimentary form of this kind of defense. Consider the following scheme:</p><ul><li><p><b>Issuance: </b>The client is issued an ARC with presentation limit $N=1$.</p></li><li><p><b>Presentation:</b></p><ul><li><p>When the client presents its ARC credential to an origin for the first time, the server issues an ACT credential with a valid initial state.</p></li><li><p>When the client presents an ACT with valid state (e.g., credit counter greater than 0), the origin either:</p><ul><li><p>refuses to issue a new ACT, thereby revoking the credential. 
It would only do so if it had high confidence that the request was part of an attack; or</p></li><li><p>issues a new ACT with state updated to reduce the ACT credit by the amount of resources consumed while processing the request.</p></li></ul></li></ul></li></ul><p>Benign requests wouldn't change the state by much (if at all), but suspicious requests might impact the state in a way that gets the user closer to their rate limit much faster.</p>
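<p>The origin's side of this scheme amounts to a small decision procedure. In the sketch below, the scoring thresholds and names are entirely hypothetical, and the revoke/re-issue step would really happen inside the ACT protocol rather than on plaintext state:</p>

```go
package main

import (
	"errors"
	"fmt"
)

// actState is a plaintext view of an ACT credential's state; the real origin
// updates it homomorphically, without ever seeing the value.
type actState struct{ Credits int }

// requestCost charges per request: benign requests barely move the state,
// while suspicious ones burn credits much faster. Thresholds are made up.
func requestCost(suspicionScore float64) int {
	switch {
	case suspicionScore < 0.3:
		return 1
	case suspicionScore < 0.8:
		return 5
	default:
		return 25
	}
}

// handle either revokes (refuses re-issuance) or re-issues with reduced credit.
func handle(s actState, suspicionScore float64) (actState, error) {
	if s.Credits <= 0 {
		return actState{}, errors.New("rate limit reached")
	}
	if suspicionScore > 0.99 { // high confidence this is part of an attack
		return actState{}, errors.New("credential revoked")
	}
	return actState{Credits: s.Credits - requestCost(suspicionScore)}, nil
}

func main() {
	s := actState{Credits: 100}
	for _, score := range []float64{0.1, 0.5, 0.9} {
		s, _ = handle(s, score)
		fmt.Println(s.Credits) // credits shrink faster as suspicion grows
	}
}
```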
    <div>
      <h2>Demo</h2>
      <a href="#demo">
        
      </a>
    </div>
    <p>To see how this idea works in practice, let's look at a working example that uses the <a href="https://developers.cloudflare.com/agents/model-context-protocol/"><u>Model Context Protocol</u></a>. The demo below is built using <a href="https://developers.cloudflare.com/agents/model-context-protocol/tools/"><u>MCP Tools</u></a>. <a href="https://modelcontextprotocol.info/tools/"><u>Tools</u></a> are extensions the AI agent can call to extend its capabilities. They don't need to be integrated into the MCP client at release time, which makes them a nice and easy prototyping avenue for anonymous credentials.</p><p>Tools are offered by the server via an MCP-compatible interface. You can see details on how to build such MCP servers in a <a href="https://blog.cloudflare.com/remote-model-context-protocol-servers-mcp/"><u>previous blog post</u></a>.</p><p>In our pizza context, this could look like a pizzeria that offers you a voucher, where each voucher gets you 3 pizza slices. As a mock-up, an integration within a chat application could look as follows:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5WD5MYoSMYGyRW2biwe6j4/bde101967276a72d48d9e494a23db5fa/image5.png" />
          </figure>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5SEqaVpwFxS1D21oyjjbN8/80dde2484f43c15e206ecfda991c286a/image9.png" />
          </figure><p>The first panel presents all tools exposed by the MCP server. The second one showcases an interaction performed by the agent calling these tools.</p><p>To look into how such a flow would be implemented, let’s write the MCP tools, offer them in an MCP server, and manually orchestrate the calls with the <a href="https://modelcontextprotocol.io/docs/tools/inspector"><u>MCP Inspector</u></a>.</p><p>The MCP server should provide two tools:</p><ul><li><p><code>act-issue</code>, which issues an ACT credential valid for 3 requests. The code used here implements an earlier version of the IETF draft and has some limitations.</p></li><li><p><code>act-redeem</code>, which presents the local credential and fetches our pizza menu.</p></li></ul><p>First, we run <code>act-issue</code>. At this stage, we could ask the agent to run an <a href="https://modelcontextprotocol.info/specification/draft/basic/authorization/"><u>OAuth flow</u></a>, fetch an internal authentication endpoint, or compute a proof of work.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6sLS7jMfTHPjVW5vMvsTWX/2d2b10fdb12c64f0e33fee89e09eab85/image10.png" />
          </figure><p>This gives us 3 credits to spend against an origin. Then, we run <code>act-redeem</code>:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1YTc0Wohrsqw3hizAOmJjU/4534cccbc490ad0aa09522a3875693af/image8.png" />
          </figure><p>Et voilà. If we run <code>act-redeem</code> once more, we see we have one fewer credit.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/a0zmBfl46hX33hWoXyGyX/86649d9f435562c95a85ec72fbf33022/image3.png" />
          </figure><p>You can test it yourself: the <a href="https://github.com/cloudflareresearch/anonymous-credentials-agent-demo"><u>source code</u></a> is available. The MCP server is written in <a href="https://github.com/modelcontextprotocol/rust-sdk/"><u>Rust</u></a> to integrate with the <a href="https://github.com/SamuelSchlesinger/anonymous-credit-tokens/stargazers"><u>ACT Rust</u></a> library. The <a href="https://act-client-demo.cloudflareresearch.com/"><u>browser-based client</u></a> works similarly; check it out.</p>
    <div>
      <h2>Moving further</h2>
      <a href="#moving-further">
        
      </a>
    </div>
    <p>In this post, we’ve presented a concrete approach to rate limiting agent traffic. It keeps the client in full control, and is built to protect the user's privacy. It uses emerging standards for anonymous credentials, integrates with MCP, and can be readily deployed on Cloudflare Workers.</p><p>We're on the right track, but some questions remain. As we touched on before, a notable limitation of both ARC and ACT is that they are only <i>privately verifiable</i>. This means that the issuer and the origin need to share a private key, used for issuing and verifying the credential, respectively. There are likely to be deployment scenarios for which this isn't possible. Fortunately, there may be a path forward for these cases using <i>pairing</i>-based cryptography, as in the <a href="https://datatracker.ietf.org/doc/draft-irtf-cfrg-bbs-signatures/"><u>BBS signature specification</u></a> making its way through the IETF. We’re also exploring post-quantum implications in a <a href="https://blog.cloudflare.com/pq-anonymous-credentials/"><u>concurrent post</u></a>.</p><p>If you are an agent platform, an agent developer, or a browser vendor, all our code is available on <a href="https://github.com/cloudflareresearch/anonymous-credentials-agent-demo"><u>GitHub</u></a> for you to experiment with. Cloudflare is actively working on vetting this approach for real-world use cases.</p><p>The specification and discussion are happening within the IETF and the W3C. This ensures the protocols are built in the open, with participation from experts. Work remains to clarify the right performance-to-privacy trade-off, and to settle the story for deploying on the open web.</p><p>If you’d like to help us, <a href="https://blog.cloudflare.com/cloudflare-1111-intern-program/"><u>we’re hiring 1,111 interns</u></a> over the course of next year, and have <a href="https://www.cloudflare.com/careers/early-talent/"><u>open positions</u></a>.</p>
            <category><![CDATA[Research]]></category>
            <category><![CDATA[Rate Limiting]]></category>
            <guid isPermaLink="false">1znqOjDHsm8kxWujPMhsgA</guid>
            <dc:creator>Thibault Meunier</dc:creator>
            <dc:creator>Christopher Patton</dc:creator>
            <dc:creator>Lena Heimberger</dc:creator>
            <dc:creator>Armando Faz-Hernández</dc:creator>
        </item>
        <item>
            <title><![CDATA[Addressing the unauthorized issuance of multiple TLS certificates for 1.1.1.1]]></title>
            <link>https://blog.cloudflare.com/unauthorized-issuance-of-certificates-for-1-1-1-1/</link>
            <pubDate>Thu, 04 Sep 2025 17:30:00 GMT</pubDate>
            <description><![CDATA[ Unauthorized TLS certificates were issued for 1.1.1.1 by a Certification Authority without permission from Cloudflare. These rogue certificates have now been revoked. ]]></description>
<content:encoded><![CDATA[ <p></p><p>Over the past few days, Cloudflare has been notified through our vulnerability disclosure program and the <a href="https://groups.google.com/g/certificate-transparency/c/we_8SNGqA3w/m/ILXqa0hzAgAJ?utm_medium=email&amp;utm_source=footer"><u>certificate transparency mailing list</u></a> that unauthorized certificates were issued by <a href="https://www.fina.hr/"><u>Fina CA</u></a> for 1.1.1.1, one of the IP addresses used by our <a href="https://one.one.one.one/"><u>public DNS resolver service</u></a>. From February 2024 to August 2025, Fina CA <a href="https://crt.sh/?iPAddress=1.1.1.1&amp;match=="><u>issued</u></a> twelve certificates for 1.1.1.1 without our permission. We did not observe unauthorized issuance for any properties managed by Cloudflare other than 1.1.1.1.</p><p>We have no evidence that bad actors took advantage of this error. To impersonate Cloudflare's public DNS resolver 1.1.1.1, an attacker would not only require an unauthorized certificate and its corresponding private key, but targeted users would also need to trust the Fina CA. Furthermore, traffic between the client and 1.1.1.1 would have to be intercepted.</p><p>While this unauthorized issuance is an unacceptable lapse in security by Fina CA, we should have caught and responded to it earlier. After speaking with Fina CA, we understand that they issued these certificates for the purposes of internal testing. However, no CA should be issuing certificates for domains and IP addresses without verifying control of them. At present, all certificates have been <a href="http://rdc.fina.hr/RDC2020/FinaRDCCA2020partc1.crl"><u>revoked</u></a>. We are awaiting a full post-mortem from Fina.</p><p>While we regret this situation, we believe it is a useful opportunity to walk through how trust works on the Internet between networks like ourselves, destinations like 1.1.1.1, CAs like Fina, and devices like the one you are using to read this. 
To learn more about the mechanics, please keep reading.</p>
    <div>
      <h3>Background</h3>
      <a href="#background">
        
      </a>
    </div>
    <p>Cloudflare operates a <a href="https://one.one.one.one/"><u>public DNS resolver 1.1.1.1 service</u></a> that millions of devices use to resolve domain names from a human-readable format such as example.com to an IP address like 192.0.2.42 or 2001:db8::2a.</p><p>The 1.1.1.1 service is accessible using various methods, across multiple domain names, such as <a href="https://cloudflare-dns.com"><u>cloudflare-dns.com</u></a> and <a href="https://one.one.one.one"><u>one.one.one.one</u></a>, and also using various IP addresses, such as 1.1.1.1, 1.0.0.1, 2606:4700:4700::1111, and 2606:4700:4700::1001. <a href="https://one.one.one.one/family/"><u>1.1.1.1 for Families</u></a> also provides public DNS resolver services and is hosted on different IP addresses — 1.1.1.2, 1.1.1.3, 1.0.0.2, 1.0.0.3, 2606:4700:4700::1112, 2606:4700:4700::1113, 2606:4700:4700::1002, 2606:4700:4700::1003.</p><p>As originally specified in <a href="https://datatracker.ietf.org/doc/html/rfc1034"><u>RFC 1034</u></a> and <a href="https://datatracker.ietf.org/doc/html/rfc1035"><u>RFC 1035</u></a>, the DNS protocol includes no privacy or authenticity protections. DNS queries and responses are exchanged between client and server in plain text over UDP or TCP. These represent around 60% of queries received by the Cloudflare 1.1.1.1 service. The lack of privacy or authenticity protection means that any intermediary can potentially read the DNS query and response and modify them without the client or the server being aware.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6zbEvgOCwZomZTbgSGFuEo/d638f36eebdbf2577ea0b8172dff843e/image1.png" />
          </figure><p>To address these shortcomings, we have helped develop and deploy multiple solutions at the IETF. The two of interest to this post are DNS over TLS (DoT, <a href="https://datatracker.ietf.org/doc/html/rfc7858"><u>RFC 7858</u></a>) and DNS over HTTPS (DoH, <a href="https://datatracker.ietf.org/doc/html/rfc8484"><u>RFC 8484</u></a>). In both cases the DNS protocol itself is largely unchanged, and the desirable security properties are implemented in a lower layer, replacing the simple use of plain text over UDP and TCP in the original specification. Both DoH and DoT use TLS to establish an authenticated, private, and encrypted channel over which DNS messages can be exchanged. To learn more you can read <a href="https://blog.cloudflare.com/dns-encryption-explained/"><u>DNS Encryption Explained</u></a>.</p><p>During the <a href="https://www.cloudflare.com/learning/ssl/transport-layer-security-tls/"><u>TLS handshake</u></a>, the server proves its identity to the client by presenting a certificate. The client validates this certificate by verifying that it is signed by a Certification Authority that it already trusts. Only then does it establish a connection with the server. Once connected, TLS provides encryption and integrity for the DNS messages exchanged between client and server. This protects DoH and DoT against eavesdropping and tampering between the client and server.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/21YeKS2nYIFDI9uC3uClXE/6115e00945458d42627d973aef75076c/image2.png" />
          </figure><p>The TLS certificates used in DoT and DoH are the same kinds of certificates HTTPS websites serve. Most website certificates are issued for domain names like <a href="http://example.com"><u>example.com</u></a>. When a client connects to that website, it resolves the name <a href="http://example.com"><u>example.com</u></a> to an IP like 192.0.2.42, then connects to the domain on that IP address. The server responds with a TLS certificate containing <a href="http://example.com"><u>example.com</u></a>, which the device validates.</p><p>However, DNS server certificates tend to be used slightly differently. Certificates used for DoT and DoH have to contain the service IP addresses, not just domain names. This is because a client cannot resolve a domain name, like <a href="https://cloudflare-dns.com"><u>cloudflare-dns.com</u></a>, before it has a working resolver to contact. Instead, devices are first set up by connecting to their resolver via a known IP address, such as 1.1.1.1 in the case of the Cloudflare public DNS resolver. When this connection uses DoT or DoH, the resolver responds with a TLS certificate issued for that IP address, which the client validates. If the certificate is valid, the client believes that it is talking to the owner of 1.1.1.1 and starts sending DNS queries.</p><p>You can see that the IP addresses are included in the certificate Cloudflare’s public resolver uses for DoT/DoH:</p>
            <pre><code>Certificate:
  Data:
      Version: 3 (0x2)
      Serial Number:
          02:7d:c8:c5:e1:72:94:ae:c9:ed:3f:67:72:8e:8a:08
      Signature Algorithm: sha256WithRSAEncryption
      Issuer: C=US, O=DigiCert Inc, CN=DigiCert Global G2 TLS RSA SHA256 2020 CA1
      Validity
          Not Before: Jan  2 00:00:00 2025 GMT
          Not After : Jan 21 23:59:59 2026 GMT
      Subject: C=US, ST=California, L=San Francisco, O=Cloudflare, Inc., CN=cloudflare-dns.com
      X509v3 extensions:
          X509v3 Subject Alternative Name:
              DNS:cloudflare-dns.com, DNS:*.cloudflare-dns.com, DNS:one.one.one.one, IP Address:1.0.0.1, IP Address:1.1.1.1, IP Address:162.159.36.1, IP Address:162.159.46.1, IP Address:2606:4700:4700:0:0:0:0:1001, IP Address:2606:4700:4700:0:0:0:0:1111, IP Address:2606:4700:4700:0:0:0:0:64, IP Address:2606:4700:4700:0:0:0:0:6400</code></pre>
            
    <div>
      <h3>Rogue certificate issuance</h3>
      <a href="#rogue-certificate-issuance">
        
      </a>
    </div>
    <p>The section above describes normal, expected use of the Cloudflare public DNS resolver 1.1.1.1 service, using certificates managed by Cloudflare. However, Cloudflare has been made aware of other, unauthorized certificates being issued for 1.1.1.1. Since certificate validation is the mechanism by which DoH and DoT clients establish the authenticity of a DNS resolver, this is a concern. Let’s now dive a little further into the security model provided by DoH and DoT.</p><p>Consider a client that is preconfigured to use the 1.1.1.1 resolver service over DoT. The client must establish a TLS session with the configured server before it can send any DNS queries. To be trusted, the server needs to present a certificate issued by a CA that the client trusts. The collection of CA certificates trusted by the client is also called the root store.</p>
            <pre><code>Certificate:
  Data:
      Version: 3 (0x2)
      Serial Number:
          02:7d:c8:c5:e1:72:94:ae:c9:ed:3f:67:72:8e:8a:08
      Signature Algorithm: sha256WithRSAEncryption
      Issuer: C=US, O=DigiCert Inc, CN=DigiCert Global G2 TLS RSA SHA256 2020 CA1</code></pre>
            <p>A Certification Authority (CA) is an organisation, such as DigiCert in the section above, whose role is to receive requests to sign certificates and verify that the requester has control of the domain. In this incident, Fina CA issued certificates for 1.1.1.1 without Cloudflare's involvement. This means that Fina CA did not properly check whether the requester had legitimate control over 1.1.1.1. According to Fina CA:</p><blockquote><p>“They were issued for the purpose of internal testing of certificate issuance in the production environment. An error occurred during the issuance of the test certificates when entering the IP addresses and as such they were published on Certificate Transparency log servers.”</p></blockquote><p>Although it’s not clear whether Fina CA sees it as an error, we emphasize that it is not an error to publish test certificates on Certificate Transparency (more about what that is later on). Instead, the error at hand is Fina CA using their production keys to sign a certificate for an IP address without the permission of its controller. We have <a href="https://ripe84.ripe.net/archives/video/747/"><u>talked about</u></a> misuse of 1.1.1.1 in documentation, lab, and testing environments at length. Instead of the Cloudflare public DNS resolver 1.1.1.1 IP address, Fina should have used an IP address it controls itself.</p><p>Unauthorized certificates are unfortunately not uncommon, whether due to negligence — such as <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1930029"><u>IdenTrust</u></a> in November 2024 — or compromise. Famously, in 2011, the Dutch CA DigiNotar was <a href="https://threatpost.com/final-report-diginotar-hack-shows-total-compromise-ca-servers-103112/77170/"><u>hacked</u></a>, and its keys were used to issue hundreds of certificates. 
This hack was a wake-up call and motivated the introduction of Certificate Transparency (CT), later formalised in <a href="https://datatracker.ietf.org/doc/html/rfc6962"><u>RFC 6962</u></a>. The goal of Certificate Transparency is not to directly prevent misissuance, but to make it possible to detect any misissuance once it has happened, by ensuring every certificate issued by a CA is publicly available for inspection.</p><p>In Certificate Transparency, several independent parties, including <a href="https://blog.cloudflare.com/introducing-certificate-transparency-monitoring/"><u>Cloudflare</u></a>, operate public logs of issued certificates. Many modern browsers do not accept certificates unless they provide proof, in the form of signed certificate timestamps (<a href="https://certificate.transparency.dev/howctworks/"><u>SCT</u></a>s), that the certificate has been logged in at least two logs. Domain owners can therefore monitor all public CT logs for any certificate containing domains they care about. If they see a certificate for their domains that they did not authorize, they can raise the alarm. CT is also the data source for public services such as <a href="https://crt.sh"><u>crt.sh</u></a> and Cloudflare Radar’s <a href="https://radar.cloudflare.com/certificate-transparency"><u>certificate transparency page</u></a>.</p><p>Not all clients require proof of inclusion in certificate transparency. Browsers do, but most DNS clients don’t. We were fortunate that Fina CA did submit the unauthorized certificates to the CT logs, which allowed them to be discovered.</p>
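<p>The monitoring workflow described above can be sketched as a small filter over CT records. The record fields mirror crt.sh’s JSON output (<code>issuer_name</code>, <code>common_name</code>), but the allowlist and helper below are hypothetical illustrations, not Cloudflare’s actual monitor.</p>

```javascript
// Hypothetical CT-monitoring sketch: flag any logged certificate whose
// issuer is not on the domain owner's expected-CA allowlist.
const EXPECTED_ISSUERS = [/DigiCert/, /Google Trust Services/];

function unexpectedIssuances(ctRecords) {
  return ctRecords.filter(
    (rec) => !EXPECTED_ISSUERS.some((ca) => ca.test(rec.issuer_name))
  );
}

// In practice these records would come from a CT monitor or a service
// such as https://crt.sh/?q=1.1.1.1&output=json (fields abridged here).
const records = [
  {
    issuer_name: "C=US, O=DigiCert Inc, CN=DigiCert Global G2 TLS RSA SHA256 2020 CA1",
    common_name: "cloudflare-dns.com",
  },
  {
    issuer_name: "C=HR, O=Financijska agencija, CN=Fina RDC 2015",
    common_name: "testssl.finatest.hr",
  },
];

console.log(unexpectedIssuances(records).map((r) => r.common_name));
// [ 'testssl.finatest.hr' ]
```

<p>Any record surfaced by a filter like this warrants a human look: it is either a CA you forgot to allowlist, or an unauthorized issuance.</p>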
    <div>
      <h3>Investigation into potential malicious use</h3>
      <a href="#investigation-into-potential-malicious-use">
        
      </a>
    </div>
    <p>Our immediate concern was that someone had maliciously used the certificates to impersonate the 1.1.1.1 service. Such an attack would require all of the following:</p><ol><li><p>An attacker would require a rogue certificate and its corresponding private key.</p></li><li><p>Attacked clients would need to trust the Fina CA.</p></li><li><p>Traffic between the client and 1.1.1.1 would have to be intercepted.</p></li></ol><p>In light of this incident, we have reviewed these requirements one by one:</p><p>1. We know that a certificate was issued without Cloudflare's involvement. We must assume that a corresponding private key exists, which is not under Cloudflare's control. This could be used by an attacker. Fina CA wrote to us that the private keys were exclusively in Fina’s controlled environment and were immediately destroyed even before the certificates were revoked. As we have no way to verify this, we have taken, and continue to take, steps to detect malicious use as described in point 3.</p><p>2. For these certificates to be used nefariously, the client’s root store must include the Certification Authority (CA) that issued them, and some clients do trust Fina CA. It is included by default in <a href="https://learn.microsoft.com/en-us/security/trusted-root/participants-list"><u>Microsoft’s root store</u></a> and in an <a href="https://eidas.ec.europa.eu/efda/trust-services/browse/eidas/tls"><u>EU Trust Service provider</u></a> list. We can exclude some clients, as the CA certificate is not included by default in the root stores of <a href="https://android.googlesource.com/platform/system/ca-certificates/+/master/files/"><u>Android</u></a>, <a href="https://support.apple.com/en-us/HT209143"><u>Apple</u></a>, <a href="https://wiki.mozilla.org/CA/Included_Certificates"><u>Mozilla</u></a>, or <a href="https://g.co/chrome/root-policy"><u>Chrome</u></a>. Users with these default settings cannot have been affected. 
Upon discovering the problem, we immediately reached out to Fina CA, Microsoft, and the <a href="https://eidas.ec.europa.eu/efda/trust-services/browse/eidas/tls/tl/HR"><u>EU Trust Service provider</u></a>. Microsoft responded quickly, and started rolling out an update to their <i>disallowed list</i>, which should cause clients that use it to stop trusting the certificate.</p><p>3. Finally, we have launched an investigation into possible interception between users and 1.1.1.1. The first way this could happen is when the attacker is on the path of the client request. Such man-in-the-middle attacks are likely to be invisible to us: clients will get responses from their on-path middlebox, and we have no reliable way of telling that this is happening. On-path interference has been a persistent problem for 1.1.1.1, which we’ve been <a href="https://blog.cloudflare.com/fixing-reachability-to-1-1-1-1-globally/"><u>working on</u></a> ever since we announced 1.1.1.1.</p><p>A second scenario can occur when a malicious actor is off-path, but is able to hijack 1.1.1.1 routing via BGP. These are scenarios we have discussed in a <a href="https://blog.cloudflare.com/bgp-hijack-detection/"><u>previous blog post</u></a>, and <a href="https://manrs.org/2024/05/rpki-rov-deployment-reaches-major-milestone/"><u>increasing adoption of RPKI route origin validation (ROV)</u></a> makes high-penetration BGP hijacks harder. We looked at the historical BGP announcements involving 1.1.1.1, and have found no evidence that such routing hijacks took place.</p><p>Although we cannot be certain, so far we have seen no evidence that these certificates have been used to impersonate Cloudflare public DNS resolver 1.1.1.1 traffic. In later sections we discuss the steps we have taken to prevent such impersonation in the future, as well as concrete actions you can take to protect your own systems and users.</p>
    <div>
      <h3>A closer look at the unauthorized certificates’ attributes</h3>
      <a href="#a-closer-look-at-the-unauthorized-certificates-attributes">
        
      </a>
    </div>
    <p>All unauthorized certificates for 1.1.1.1 were valid for exactly one year and included other domain names. Most of these domain names are not registered, which indicates that the certificates were issued without proper domain control validation. This violates sections 3.2.2.4 and 3.2.2.5 of the CA/Browser Forum’s <a href="https://cabforum.org/working-groups/server/baseline-requirements/requirements/#3224-validation-of-domain-authorization-or-control"><u>Baseline Requirements</u></a>, and sections 3.2.2.3 and 3.2.2.4 of the <a href="https://rdc.fina.hr/RDC2015/CPWSA1-12-en.pdf"><u>Fina CA Certificate Policy</u></a>.</p><p>The full list of domain names we identified on the unauthorized certificates is as follows:</p>
            <pre><code>fina.hr
ssltest5
test.fina.hr
test.hr
test1.hr
test11.hr
test12.hr
test5.hr
test6
test6.hr
testssl.fina.hr
testssl.finatest.hr
testssl.hr
testssl1.finatest.hr
testssl2.finatest.hr</code></pre>
            <p>It’s also worth noting that the Subject attribute points to a fictional organisation <b>TEST D.D.</b>, as can be seen on this unauthorized certificate:</p>
            <pre><code>        Serial Number:
            a5:30:a2:9c:c1:a5:da:40:00:00:00:00:56:71:f2:4c
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: C=HR, O=Financijska agencija, CN=Fina RDC 2015
        Validity
            Not Before: Nov  2 23:45:15 2024 GMT
            Not After : Nov  2 23:45:15 2025 GMT
        Subject: C=HR, O=TEST D.D., L=ZAGREB, CN=testssl.finatest.hr, serialNumber=VATHR-32343828408.306
        X509v3 extensions:
            X509v3 Subject Alternative Name:
                DNS:testssl.finatest.hr, DNS:testssl2.finatest.hr, IP Address:1.1.1.1</code></pre>
            
    <div>
      <h3>Incident timeline and impact</h3>
      <a href="#incident-timeline-and-impact">
        
      </a>
    </div>
    <p><i>All timestamps are UTC. All certificates are identified by their date of validity.</i></p><p>The <a href="https://crt.sh/?id=12116084225"><u>first certificate</u></a> was issued to be valid starting February 2024, and revoked 33 minutes later. Eleven more certificates with common name 1.1.1.1 were issued between February 2024 and August 2025. Public reports were made on <a href="https://news.ycombinator.com/item?id=45089708"><u>Hacker News</u></a> and on the <a href="https://groups.google.com/g/certificate-transparency/c/we_8SNGqA3w/m/ILXqa0hzAgAJ"><u>certificate-transparency mailing list</u></a> in early September 2025, and Cloudflare responded to both.</p><p>While responding to the incident, we identified the full list of misissued certificates, their revocation status, and which clients trust them.</p><p>The full timeline for the incident is as follows.</p><table><tr><td><p><b>Date &amp; Time (UTC)</b></p></td><td><p><b>Event Description</b></p></td></tr><tr><td><p>2024-02-18 11:07:33</p></td><td><p><a href="https://crt.sh/?id=12116084225"><u>First certificate issuance</u></a> revoked on 2024-02-18 11:40:00</p></td></tr><tr><td><p>2024-09-25 08:04:03</p></td><td><p><a href="https://crt.sh/?id=14681939427"><u>Issuance</u></a> revoked on 2024-11-06 07:36:05</p></td></tr><tr><td><p>2024-10-04 07:55:38</p></td><td><p><a href="https://crt.sh/?id=14793030836"><u>Issuance</u></a> revoked on 2024-10-04 07:56:56</p></td></tr><tr><td><p>2024-10-04 08:05:48</p></td><td><p><a href="https://crt.sh/?id=14793121895"><u>Issuance</u></a> revoked on 2024-11-06 07:39:55</p></td></tr><tr><td><p>2024-10-15 06:28:48</p></td><td><p><a href="https://crt.sh/?id=14939369380"><u>Issuance</u></a> revoked on 2024-11-06 07:35:36</p></td></tr><tr><td><p>2024-11-02 23:45:15</p></td><td><p><a href="https://crt.sh/?id=15190039061"><u>Issuance</u></a> revoked on 2024-11-02 23:48:42</p></td></tr><tr><td><p>2025-03-05 09:12:23</p></td><td><p><a 
href="https://crt.sh/?id=16939550348"><u>Issuance</u></a> revoked on 2025-03-05 09:13:22</p></td></tr><tr><td><p>2025-05-24 22:56:21</p></td><td><p><a href="https://crt.sh/?id=18603461241"><u>Issuance</u></a> revoked on 2025-09-04 06:13:27</p></td></tr><tr><td><p>2025-06-28 23:05:32</p></td><td><p><a href="https://crt.sh/?id=19318694206"><u>Issuance</u></a> revoked on 2025-07-18 07:01:27</p></td></tr><tr><td><p>2025-07-18 07:05:23</p></td><td><p><a href="https://crt.sh/?id=19749594221"><u>Issuance</u></a> revoked on 2025-07-18 07:09:45</p></td></tr><tr><td><p>2025-07-18 07:13:14</p></td><td><p><a href="https://crt.sh/?id=19749721864"><u>Issuance</u></a> revoked on 2025-09-04 06:30:36</p></td></tr><tr><td><p>2025-08-26 07:49:00</p></td><td><p><a href="https://crt.sh/?id=20582951233"><u>Last certificate issuance</u></a> revoked on 2025-09-04 06:33:20</p></td></tr><tr><td><p>2025-09-01 05:23:00</p></td><td><p><a href="https://news.ycombinator.com/item?id=45089708"><u>Hacker News submission</u></a> about a possible unauthorized issuance</p></td></tr><tr><td><p>2025-09-02 04:50:00</p></td><td><p>Report shared with us on HackerOne, but mistriaged</p></td></tr><tr><td><p>2025-09-03 02:35:00</p></td><td><p>Second report shared with us on HackerOne, but also mistriaged</p></td></tr><tr><td><p>2025-09-03 10:59:00</p></td><td><p><a href="https://groups.google.com/g/certificate-transparency/c/we_8SNGqA3w/m/ILXqa0hzAgAJ?utm_medium=email&amp;utm_source=footer"><u>Report sent</u></a> on the public <a><u>certificate-transparency@googlegroups.com</u></a> mailing list, picked up by the team</p></td></tr><tr><td><p>2025-09-03 11:33:00</p></td><td><p>First response by Cloudflare on the mailing list about starting the investigation</p></td></tr><tr><td><p>2025-09-03 12:08:00</p></td><td><p>Incident declared</p></td></tr><tr><td><p>2025-09-03 12:16:00</p></td><td><p>Notification of an unauthorized issuance sent to Fina CA, Microsoft Root Store, and EU Trust service 
provider</p></td></tr><tr><td><p>2025-09-03 12:23:00</p></td><td><p>Cloudflare identifies an initial list of nine rogue certificates</p></td></tr><tr><td><p>2025-09-03 12:24:00</p></td><td><p>Outreach to Fina CA to inform them about the unauthorized issuance, requesting revocation</p></td></tr><tr><td><p>2025-09-03 12:26:00</p></td><td><p>Identify the number of requests served on 1.1.1.1 IP address, and associated names/services</p></td></tr><tr><td><p>2025-09-03 12:42:00</p></td><td><p>As a precautionary measure, began investigation to rule out the possibility of a BGP hijack for 1.1.1.1</p></td></tr><tr><td><p>2025-09-03 18:48:00</p></td><td><p>Second notification of the incident to Fina CA</p></td></tr><tr><td><p>2025-09-03 21:27:00</p></td><td><p>Microsoft Root Store notifies us that they are preventing further use of the identified unauthorized certificates by using their quick-revocation mechanism.</p></td></tr><tr><td><p>2025-09-04 06:13:27</p></td><td><p>Fina revoked all certificates.</p></td></tr><tr><td><p>2025-09-04 12:44:00</p></td><td><p>Cloudflare receives a response from Fina indicating “an error occurred during the issuance of the test certificates when entering the IP addresses and as such they were published on Certificate Transparency log servers. [...] Fina will eliminate the possibility of such an error recurring.”</p></td></tr></table>
    <div>
      <h3>Remediation and follow-up steps</h3>
      <a href="#remediation-and-follow-up-steps">
        
      </a>
    </div>
    <p>Cloudflare has invested from the very start in the Certificate Transparency ecosystem. Not only do we operate CT logs ourselves, we also run a CT monitor that we use to <a href="https://developers.cloudflare.com/ssl/edge-certificates/additional-options/certificate-transparency-monitoring/"><u>alert customers when certificates are mis-issued for their domains</u></a>.</p><p>It is therefore disappointing that we failed to properly monitor certificates for our own domain. We failed three times. First, because 1.1.1.1 is an IP certificate, and our system failed to alert on these. Second, because even if we were to receive certificate issuance alerts, as any of our customers can, we did not implement sufficient filtering; with the sheer number of names and issuances we manage, it has not been possible for us to keep up with manual reviews. Finally, because of this noisy monitoring, we did not enable alerting for all of our domains. We are addressing all three shortcomings.</p><p>We double-checked all certificates issued for our names, including but not limited to 1.1.1.1, using Certificate Transparency, and confirmed that as of 3 September, the Fina CA-issued certificates are the only unauthorized issuances. We contacted Fina, and the root programs we know trust them, to ask for revocation and investigation. The certificates have been revoked.</p><p>Despite no indication that these certificates have been used so far, we take this incident extremely seriously. We have identified several steps we can take to address the risk of these sorts of problems occurring in the future, and we plan to start working on them immediately:</p><p><b>Alerting</b>: Cloudflare will improve alerting and escalation for certificates issued for Cloudflare-owned domains and IP addresses, including 1.1.1.1.</p><p><b>Transparency</b>: The issuance of these unauthorized 1.1.1.1 certificates was detected because Fina CA used Certificate Transparency. 
Transparency inclusion is not enforced by most DNS clients, which means this detection was fortunate. We are working on bringing transparency to non-browser clients, in particular DNS clients that rely on TLS.</p><p><b>Bug Bounty</b>: Our procedure for triaging reports made through our vulnerability disclosure program caused a delayed response. We are revising our triaging process to ensure such reports get the right visibility.</p><p><b>Monitoring</b>: During this incident, our team relied on <a href="https://crt.sh"><u>crt.sh</u></a> to provide us a convenient UI to explore CA-issued certificates. We’d like to give a shout-out to the <a href="https://www.sectigo.com/"><u>Sectigo team</u></a> for maintaining this tool. Given that Cloudflare is an active CT Monitor, we have started to build a dedicated UI to explore our data <a href="https://radar.cloudflare.com/certificate-transparency"><u>in Radar</u></a>. We are also looking to enable exploration of certificates with IP addresses as common names on Radar.</p>
    <div>
      <h3>What steps should you take?</h3>
      <a href="#what-steps-should-you-take">
        
      </a>
    </div>
    <p>This incident demonstrates the disproportionate impact that the current root store model can have: a single certification authority going rogue is enough to put everyone at risk.</p><p>If you are an IT manager with a fleet of managed devices, you should consider whether you need to take direct action to revoke these unauthorized certificates. We provide the list in the timeline section above. As the certificates have since been revoked, no direct intervention may be required; however, system-wide revocation is neither instantaneous nor automatic, so we recommend checking.</p><p>If you are tasked with reviewing the policy of a root store that includes Fina CA, you should immediately review their inclusion in your program. The issue identified through the course of this investigation raises concerns, and requires a clear report and follow-up from the CA. In addition, to make it possible to detect future incidents like this one, you should consider requiring all CAs in your root store to participate in Certificate Transparency. Without CT logs, problems such as the one we describe here are impossible to address before they result in impact to end users.</p><p>We are not suggesting that you should stop using DoH or DoT. DNS over UDP and TCP is unencrypted, which puts every single query and response at risk of tampering and unauthorized surveillance. However, we believe that DoH and DoT client security could be improved if clients required that server certificates be included in a Certificate Transparency log.</p>
    <div>
      <h3>Conclusion</h3>
      <a href="#conclusion">
        
      </a>
    </div>
    <p>This event is the first time we have observed a rogue issuance of a certificate for our public DNS resolver 1.1.1.1 service. While we have no evidence this issuance was malicious, we know that future attempts might be.</p><p>We plan to accelerate how quickly we discover and alert on these types of issues ourselves. We know that we can catch these earlier, and we plan to do so.</p><p>The identification of these kinds of issues relies on an ecosystem of partners working together to support Certificate Transparency. We are grateful for the monitors who noticed and reported this issue.</p> ]]></content:encoded>
            <category><![CDATA[1.1.1.1]]></category>
            <category><![CDATA[DNS]]></category>
            <category><![CDATA[Resolver]]></category>
            <category><![CDATA[DoH]]></category>
            <category><![CDATA[TLS]]></category>
            <category><![CDATA[Certificate Authority]]></category>
            <category><![CDATA[Security]]></category>
            <category><![CDATA[Certificate Transparency]]></category>
            <guid isPermaLink="false">6dgQ2aH6eirkXOANX0QikR</guid>
            <dc:creator>Joe Abley</dc:creator>
            <dc:creator>Thibault Meunier</dc:creator>
            <dc:creator>Vicky Shrestha</dc:creator>
            <dc:creator>Bas Westerbaan</dc:creator>
        </item>
        <item>
            <title><![CDATA[Forget IPs: using cryptography to verify bot and agent traffic]]></title>
            <link>https://blog.cloudflare.com/web-bot-auth/</link>
            <pubDate>Thu, 15 May 2025 13:00:00 GMT</pubDate>
            <description><![CDATA[ Bots now browse like humans. We're proposing bots use cryptographic signatures so that website owners can verify their identity. Explanations and demonstration code can be found within the post. ]]></description>
            <content:encoded><![CDATA[ <p></p><p>With the rise of traffic from <a href="https://www.cloudflare.com/learning/ai/what-is-agentic-ai/">AI agents</a>, what’s considered a bot is no longer clear-cut. There are some clearly malicious bots, like ones that DoS your site or do <a href="https://www.cloudflare.com/learning/bots/what-is-credential-stuffing/">credential stuffing</a>, and ones that most site owners do want to interact with their site, like the bot that indexes your site for a search engine, or ones that fetch RSS feeds.</p><p>Historically, Cloudflare has relied on two main signals to distinguish legitimate web crawlers from other types of automated traffic: user agent headers and IP addresses. The <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/User-Agent"><code><u>User-Agent</u></code><u> header</u></a> allows bot developers to identify themselves, e.g. <code>MyBotCrawler/1.1</code>. However, user agent headers alone are easily spoofed and are therefore insufficient for reliable identification. To address this, user agent checks are often supplemented with <a href="https://developers.cloudflare.com/bots/concepts/bot/verified-bots/policy/#ip-validation"><u>IP address validation</u></a>, the inspection of published IP address ranges to confirm a crawler's authenticity. However, the logic around IP address ranges representing a product or group of users is brittle – connections from the crawling service might be shared by multiple users, such as in the case of <a href="https://blog.cloudflare.com/icloud-private-relay/"><u>privacy proxies</u></a> and VPNs, and these ranges, often maintained by cloud providers, change over time.</p><p>Cloudflare will always try to block malicious bots, but we think our role here is to also provide an affirmative mechanism to authenticate desirable bot traffic. 
By using well-established cryptography techniques, we’re proposing a better mechanism for legitimate agents and bots to declare who they are, and provide a clearer signal for site owners to decide what traffic to permit. </p><p><b>Today, we’re introducing two proposals – HTTP message signatures and request mTLS – for </b><a href="https://blog.cloudflare.com/friendly-bots/"><b><u>friendly bots</u></b></a><b> to authenticate themselves, and for customer origins to identify them. </b>In this blog post, we’ll share how these authentication mechanisms work, how we implemented them, and how you can participate in our closed beta.</p>
    <div>
      <h2>Existing bot verification mechanisms are broken </h2>
      <a href="#existing-bot-verification-mechanisms-are-broken">
        
      </a>
    </div>
    <p>Historically, if you’ve worked on ChatGPT, Claude, Gemini, or any other agent, you’ve had several options to identify your HTTP traffic to other services: </p><ol><li><p>You define a <a href="https://www.rfc-editor.org/rfc/rfc9110#name-user-agent"><u>user agent</u></a>, an HTTP header described in <a href="https://www.rfc-editor.org/rfc/rfc9110.html#name-user-agent"><u>RFC 9110</u></a>. The problem here is that this header is easily spoofable and there’s not a clear way for agents to identify themselves as semi-automated browsers — agents often use the Chrome user agent for this very reason, which is discouraged. The RFC <a href="https://www.rfc-editor.org/rfc/rfc9110.html#section-10.1.5-9"><u>states</u></a>: 
<i>“If a user agent masquerades as a different user agent, recipients can assume that the user intentionally desires to see responses tailored for that identified user agent, even if they might not work as well for the actual user agent being used.” </i> </p></li><li><p>You publish your IP address range(s). This has limitations because the same IP address might be shared by multiple users or multiple services within the same company, or even by multiple companies when hosting infrastructure is shared (like <a href="https://www.cloudflare.com/developer-platform/products/workers/">Cloudflare Workers</a>, for example). In addition, IP addresses are prone to change as underlying infrastructure changes, leading services to use ad-hoc sharing mechanisms like <a href="https://www.cloudflare.com/ips-v4"><u>CIDR lists</u></a>. </p></li><li><p>You go to every website and share a secret, like a <a href="https://www.rfc-editor.org/rfc/rfc6750"><u>Bearer</u></a> token. This is impractical at scale because it requires developers to maintain separate tokens for each website their bot will visit.</p></li></ol><p>We can do better! Instead of these arduous methods, we’re proposing that developers of bots and agents cryptographically sign requests originating from their service. When protecting origins, <a href="https://www.cloudflare.com/learning/cdn/glossary/reverse-proxy/">reverse proxies</a> such as Cloudflare can then validate those signatures to confidently identify the request source on behalf of site owners, allowing them to take action as they see fit. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3yB6h6XcSWNQO5McRWIpL8/edf32f7938b01a4c8f5eedefee2b9328/image2.png" />
          </figure><p>A typical system has three actors:</p><ul><li><p>User: the entity that wants to perform some actions on the web. This may be a human, an automated program, or anything taking action to retrieve information from the web.</p></li><li><p>Agent: an orchestrated browser or software program. For example, Chrome on your computer, or OpenAI’s <a href="https://operator.chatgpt.com/"><u>Operator</u></a> with ChatGPT. Agents can interact with the web according to web standards (HTML rendering, JavaScript, subrequests, etc.).</p></li><li><p>Origin: the website hosting a resource. The user wants to access it through the browser. This is Cloudflare when your website is using our services, and it’s your own server(s) when exposed directly to the Internet.</p></li></ul><p>In the next section, we’ll dive into HTTP Message Signatures and request mTLS, two mechanisms a browser agent may implement to sign outgoing requests, with different levels of ease for an origin to adopt. </p>
    <div>
      <h2>Introducing HTTP Message Signatures</h2>
      <a href="#introducing-http-message-signatures">
        
      </a>
    </div>
    <p><a href="https://www.rfc-editor.org/rfc/rfc9421.html"><u>HTTP Message Signatures</u></a> is a standard that defines the cryptographic authentication of a request sender. It’s essentially a cryptographically sound way to say, “hey, it’s me!”. It’s not the only way that developers can sign requests from their infrastructure — for example, AWS has used <a href="https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html"><u>Signature v4</u></a>, and Stripe has a framework for <a href="https://docs.stripe.com/webhooks#verify-webhook-signatures-with-official-libraries"><u>authenticating webhooks</u></a> — but Message Signatures is a published standard, and the cleanest, most developer-friendly way to sign requests.</p><p>We’re working closely with the wider industry to support these standards-based approaches. For example, OpenAI has started to sign their requests. In their own words:</p><blockquote><p><i>"Ensuring the authenticity of Operator traffic is paramount. With HTTP Message Signatures (</i><a href="https://www.rfc-editor.org/rfc/rfc9421.html"><i><u>RFC 9421</u></i></a><i>), OpenAI signs all Operator requests so site owners can verify they genuinely originate from Operator and haven’t been tampered with” </i>– Eugenio, Engineer, OpenAI</p></blockquote><p>Without further delay, let’s dive into how HTTP Message Signatures work to identify bot traffic.</p>
    <div>
      <h3>Scoping standards to bot authentication</h3>
      <a href="#scoping-standards-to-bot-authentication">
        
      </a>
    </div>
    <p>Generating a message signature works like this: before sending a request, the agent signs the target origin with its private key. When fetching <code>https://example.com/path/to/resource</code>, it signs <code>example.com</code>. The corresponding public key is known to the origin, either because the agent is well known, because it has previously registered, or through any other method. Then, the agent writes a <b>Signature-Input</b> header with the following parameters:</p><ol><li><p>A validity window (<code>created</code> and <code>expires</code> timestamps)</p></li><li><p>A Key ID that uniquely identifies the key used in the signature. This is a <a href="https://www.rfc-editor.org/rfc/rfc7638.html"><u>JSON Web Key Thumbprint</u></a>.</p></li><li><p>A tag that shows websites the signature’s purpose and validation method, i.e. <code>web-bot-auth</code> for bot authentication.</p></li></ol><p>In addition, the <code>Signature-Agent</code> header indicates where the origin can find the public keys the agent used when signing the request, such as in a directory hosted by <code>signer.example.com</code>. This header is part of the signed content as well.</p><p>Here’s an example:</p>
            <pre><code>GET /path/to/resource HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 Chrome/113.0.0 MyBotCrawler/1.1
Signature-Agent: signer.example.com
Signature-Input: sig=("@authority" "signature-agent");\
                 created=1700000000;\
                 expires=1700011111;\
                 keyid="ba3e64==";\
                 tag="web-bot-auth"
Signature: sig=abc==</code></pre>
            <p>For those building bots, <a href="https://datatracker.ietf.org/doc/draft-meunier-web-bot-auth-architecture/"><u>we propose</u></a> signing the authority of the target URI, i.e. www.example.com, and a way to retrieve the bot public key in the form of <a href="https://datatracker.ietf.org/doc/draft-meunier-http-message-signatures-directory/"><u>signature-agent</u></a>, if present, e.g. <a href="http://crawler.search.google.com"><u>crawler.search.google.com</u></a> for Google Search, <a href="http://operator.openai.com"><u>operator.openai.com</u></a> for OpenAI Operator, or workers.dev for Cloudflare Workers.</p><p>The <code>User-Agent</code> from the example above indicates that the software making the request is Chrome, because it is an agent that uses an orchestrated Chrome to browse the web. You should note that <code>MyBotCrawler/1.1</code> is still present. The <code>User-Agent</code> header can actually contain multiple products, in decreasing order of importance. If our agent is making requests via Chrome, that’s the most important product and therefore comes first.</p><p>At Internet-level scale, these signatures may add a notable amount of overhead to request processing. However, with the right cryptographic suite, and compared to the cost of existing bot mitigation, both technical and social, this seems to be a straightforward tradeoff. This is a metric we will monitor closely, and report on as adoption grows.</p>
    <div>
      <h3>Generating request signatures</h3>
      <a href="#generating-request-signatures">
        
      </a>
    </div>
    <p>We’re making several examples for generating Message Signatures for bots and agents <a href="https://github.com/cloudflareresearch/web-bot-auth/"><u>available on GitHub</u></a> (though we encourage other implementations!), all of which are standards-compliant to maximize interoperability.</p><p>Imagine you’re building an agent using a managed Chromium browser, and want to sign all outgoing requests. To achieve this, the <a href="https://github.com/w3c/webextensions"><u>webextensions standard</u></a> provides <a href="https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/onBeforeSendHeaders"><u>chrome.webRequest.onBeforeSendHeaders</u></a>, where you can modify HTTP headers before they are sent by the browser. The event is <a href="https://developer.chrome.com/docs/extensions/reference/api/webRequest#life_cycle_of_requests"><u>triggered</u></a> before sending any HTTP data, and when headers are available.</p><p>Here’s what that code would look like:</p>
            <pre><code>chrome.webRequest.onBeforeSendHeaders.addListener(
  function (details) {
    // Signature and header assignment logic goes here
    // &lt;CODE&gt;
  },
  { urls: ["&lt;all_urls&gt;"] },
  ["blocking", "requestHeaders"] // requires "installation_mode": "force_installed"
);</code></pre>
            <p>Cloudflare provides a <a href="https://www.npmjs.com/package/web-bot-auth"><u>web-bot-auth</u></a> helper package on npm that helps you generate request signatures with the correct parameters. <code>onBeforeSendHeaders</code> is a Chrome extension hook that needs to be implemented synchronously. To do so, we <code>import { signatureHeadersSync } from "web-bot-auth"</code>. Once the signature completes, both <code>Signature</code> and <code>Signature-Input</code> headers are assigned, and the request flow can continue.</p>
            <pre><code>const request = new URL(details.url);
const created = new Date();
const expires = new Date(created.getTime() + 300_000); // valid for 5 minutes

// Perform request signature with the signing key `jwk` (a JSON Web Key)
const headers = signatureHeadersSync(
  request,
  new Ed25519Signer(jwk),
  { created, expires }
);
// `headers` object now contains `Signature` and `Signature-Input` headers that can be used</code></pre>
            <p>This extension code is available on <a href="https://github.com/cloudflareresearch/web-bot-auth/"><u>GitHub</u></a>, alongside a debugging server, deployed at <a href="https://http-message-signatures-example.research.cloudflare.com"><u>https://http-message-signatures-example.research.cloudflare.com</u></a>.</p>
    <div>
      <h3>Validating request signatures </h3>
      <a href="#validating-request-signatures">
        
      </a>
    </div>
    <p>Using our <a href="https://http-message-signatures-example.research.cloudflare.com"><u>debug server</u></a>, we can now inspect and validate our request signatures from the perspective of the website we’d be visiting. We should now see the Signature and Signature-Input headers:  </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/18P5OyGxu2fU0Dpyv70Gjz/d82d62355524ad1914deb41b601bcad2/image3.png" />
          </figure><p><sup><i>In this example, the homepage of the debugging server validates the signature from the RFC 9421 Ed25519 verifying key, which the extension uses for signing.</i></sup></p><p>The above demo and code walkthrough have been fully written in TypeScript: the verification website is on Cloudflare Workers, and the client is a Chrome browser extension. We are cognisant that this does not suit all clients and servers on the web. To demonstrate the proposal works in more environments, we have also implemented bot signature validation in Go with a <a href="https://github.com/cloudflareresearch/web-bot-auth/tree/main/examples/caddy-plugin"><u>plugin</u></a> for <a href="https://caddyserver.com/"><u>Caddy server</u></a>.</p>
    <div>
      <h2>Experimentation with request mTLS</h2>
      <a href="#experimentation-with-request-mtls">
        
      </a>
    </div>
    <p>HTTP is not the only way to convey signatures. For instance, one mechanism that has been used in the past to authenticate automated traffic against secured endpoints is <a href="https://www.cloudflare.com/learning/access-management/what-is-mutual-tls/"><u>mTLS</u></a>, the “mutual” presentation of <a href="https://www.cloudflare.com/application-services/products/ssl/">TLS certificates</a>. As described in our <a href="https://www.cloudflare.com/learning/access-management/what-is-mutual-tls/"><u>knowledge base</u></a>:</p><blockquote><p><i>Mutual TLS, or mTLS for short, is a method for</i><a href="https://www.cloudflare.com/learning/access-management/what-is-mutual-authentication/"><i> </i><i><u>mutual authentication</u></i></a><i>. mTLS ensures that the parties at each end of a network connection are who they claim to be by verifying that they both have the correct private</i><a href="https://www.cloudflare.com/learning/ssl/what-is-a-cryptographic-key/"><i> </i><i><u>key</u></i></a><i>. The information within their respective</i><a href="https://www.cloudflare.com/learning/ssl/what-is-an-ssl-certificate/"><i> </i><i><u>TLS certificates</u></i></a><i> provides additional verification.</i></p></blockquote><p>While mTLS seems like a good fit for bot authentication on the web, it has limitations. If a user is asked for authentication via the mTLS protocol but does not have a certificate to provide, they would get an inscrutable and unskippable error. Origin sites need a way to conditionally signal to clients that they accept or require mTLS authentication, so that only mTLS-enabled clients use it.</p>
    <div>
      <h3>A TLS flag for bot authentication</h3>
      <a href="#a-tls-flag-for-bot-authentication">
        
      </a>
    </div>
    <p>TLS flags are an efficient way to describe whether a feature, like mTLS, is supported by origin sites. Within the IETF, we have proposed a new TLS flag called <a href="https://datatracker.ietf.org/doc/draft-jhoyla-req-mtls-flag/"><code><u>req mTLS</u></code></a>, sent by the client during the establishment of a connection, that signals support for authentication via a client certificate.</p><p>This proposal leverages the <a href="https://www.ietf.org/archive/id/draft-ietf-tls-tlsflags-14.html"><u>tls-flags</u></a> proposal under discussion in the IETF. The TLS Flags draft allows clients and servers to send an array of one-bit flags to each other, rather than creating a new extension (with its associated overhead) for each piece of information they want to share. This is one of the first uses of this extension, and we hope that by using it here we can help drive adoption.</p><p>When a client sends the <a href="https://datatracker.ietf.org/doc/draft-jhoyla-req-mtls-flag/"><code><u>req mTLS</u></code></a> flag to the server, it signals that it is able to respond with a certificate if requested. The server can then safely request a certificate without risk of blocking ordinary user traffic, because ordinary users will never set this flag.</p><p>Let’s take a look at an example of such a <code>req mTLS</code> flag in <a href="https://www.wireshark.org/"><u>Wireshark</u></a>, a network protocol analyser. You can follow along in the packet capture <a href="https://github.com/cloudflareresearch/req-mtls/tree/main/assets/demonstration-capture.pcapng"><u>here</u></a>.</p>
            <pre><code>Extension: req mTLS (len=12)
	Type: req mTLS (65025)
	Length: 12
	Data: 0b0000000000000000000001</code></pre>
            <p>The extension number is 65025, or 0xfe01. This corresponds to an unassigned block of <a href="https://www.iana.org/assignments/tls-extensiontype-values/tls-extensiontype-values.xhtml#tls-extensiontype-values-1"><u>TLS extensions</u></a> that can be used to experiment with TLS Flags. Once the standard is adopted and published by the IETF, the number would be fixed. To use the <code>req mTLS</code> flag the client needs to set the 80<sup>th</sup> bit to true, so with our block length of 12 bytes, it should contain the data 0b0000000000000000000001, which is the case here. The server then responds with a certificate request, and the request follows its course.</p>
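The flag encoding can be reproduced in a few lines. This sketch assumes the least-significant-bit-first packing described in the tls-flags draft, and treats the leading <code>0x0b</code> byte as the one-byte length prefix of the flags vector:

```javascript
// Encode a set of TLS flag numbers into the extension payload: a one-byte
// length prefix followed by a bit array where flag n lands in octet n >> 3,
// bit n & 7, least-significant bit first (per the tls-flags draft).
function encodeTlsFlags(flagNumbers) {
  const highest = Math.max(...flagNumbers)
  const bits = new Uint8Array((highest >> 3) + 1)
  for (const n of flagNumbers) bits[n >> 3] |= 1 << (n & 7)
  return Buffer.from([bits.length, ...bits]).toString('hex')
}

// Setting bit 80 (the experimental req mTLS flag) reproduces the
// 12-byte payload from the capture above.
const hex = encodeTlsFlags([80])
// → 0b0000000000000000000001
```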
    <div>
      <h3>Request mTLS in action</h3>
      <a href="#request-mtls-in-action">
        
      </a>
    </div>
    <p><i>Code for this section is available in GitHub under </i><a href="https://github.com/cloudflareresearch/req-mtls"><i><u>cloudflareresearch/req-mtls</u></i></a></p><p>Because mutual TLS is widely supported in TLS libraries already, the parts we need to introduce to the client and server are:</p><ol><li><p>Sending/parsing of TLS-flags</p></li><li><p>Specific support for the <code>req mTLS</code> flag</p></li></ol><p>To the best of our knowledge, there is no complete public implementation of either scheme. Using it for bot authentication may provide a motivation to do so.</p><p>Using <a href="https://github.com/cloudflare/go"><u>our experimental fork of Go</u></a>, a TLS client could support req mTLS as follows:</p>
            <pre><code>config := &amp;tls.Config{
    TLSFlagsSupported: []tls.TLSFlag{0x50},
    RootCAs:           rootPool,
    Certificates:      certs,
    NextProtos:        []string{"h2"},
}
trans := http.Transport{TLSClientConfig: config, ForceAttemptHTTP2: true}</code></pre>
            <p>This example library allows you to configure Go to send flag <code>0x50</code> (80, the <code>req mTLS</code> flag) in the <code>TLS Flags</code> extension. If you’d like to test your implementation out, you can prompt your client for certificates against <a href="http://req-mtls.research.cloudflare.com"><u>req-mtls.research.cloudflare.com</u></a> using the Cloudflare Research client <a href="https://github.com/cloudflareresearch/req-mtls"><u>cloudflareresearch/req-mtls</u></a>. For clients, once they set the TLS flag associated with <code>req mTLS</code>, they are done. The code path handling normal mTLS takes over from there, with no need to implement anything new.</p>
    <div>
      <h2>Two approaches, one goal</h2>
      <a href="#two-approaches-one-goal">
        
      </a>
    </div>
    <p>We believe that developers of agents and bots should have a public, standard way to authenticate themselves to CDNs and website hosting platforms, regardless of the technology they use or provider they choose. At a high level, both HTTP Message Signatures and request mTLS achieve a similar goal: they allow the owner of a service to authentically identify themselves to a website. That’s why we’re participating in the standardization effort for both of these protocols at the IETF, where many other authentication mechanisms we’ve discussed here, from TLS to OAuth Bearer tokens, have been developed by diverse sets of stakeholders and standardized as RFCs.</p><p>Evaluating both proposals against each other, we’re prioritizing <a href="https://datatracker.ietf.org/doc/html/draft-meunier-web-bot-auth-architecture"><u>HTTP Message Signatures for Bots</u></a> because it relies on the previously adopted <a href="https://datatracker.ietf.org/doc/html/rfc9421"><u>RFC 9421</u></a> with several <a href="https://httpsig.org/"><u>reference implementations</u></a>, and works at the HTTP layer, making adoption simpler. <a href="https://datatracker.ietf.org/doc/draft-jhoyla-req-mtls-flag/"><u>Request mTLS</u></a> may be a better fit for site owners with concerns about the additional bandwidth, but <a href="https://datatracker.ietf.org/doc/html/draft-ietf-tls-tlsflags"><u>TLS Flags</u></a> has fewer implementations, is still waiting for IETF adoption, and upgrading the TLS stack has proven to be more challenging than upgrading HTTP. Both approaches share similar discovery and key management concerns, as highlighted in a <a href="https://datatracker.ietf.org/doc/draft-meunier-web-bot-auth-glossary/"><u>glossary</u></a> draft at the IETF. We’re actively exploring both options, and would love to <a href="https://www.cloudflare.com/lp/verified-bots/"><u>hear</u></a> from both site owners and bot developers about how you’re evaluating their respective tradeoffs.</p>
    <div>
      <h2>The bigger picture </h2>
      <a href="#the-bigger-picture">
        
      </a>
    </div>
    <p>In conclusion, we think request signatures and mTLS are promising mechanisms for bot owners and developers of AI agents to authenticate themselves in a tamper-proof manner, forging a path forward that doesn’t rely on ever-changing IP address ranges or spoofable headers such as <code>User-Agent</code>. This authentication can be consumed by Cloudflare when acting as a reverse proxy, or directly by site owners on their own infrastructure. This means that as a bot owner, you can now go to content creators and discuss crawling agreements, at a granularity as fine as each individual bot you operate. You can start implementing these solutions today and test them against the research websites we’ve provided in this post.</p><p>Bot authentication also gives site owners small and large more control over the traffic they allow, empowering them to continue serving content on the public Internet while monitoring automated requests. Longer term, we will integrate these authentication mechanisms into our <a href="https://blog.cloudflare.com/cloudflare-ai-audit-control-ai-content-crawlers/"><u>AI Audit</u></a> and <a href="https://developers.cloudflare.com/bots/get-started/bot-management/"><u>Bot Management</u></a> products, to provide better visibility into the bots and agents that are willing to identify themselves.</p><p>Being able to solve problems for both origins and clients is key to helping build a better Internet, and we think identification of automated traffic is a step towards that. If you want us to start verifying your message signatures or client certificates, have a compelling use case you’d like us to consider, or any questions, please <a href="https://www.cloudflare.com/lp/verified-bots/"><u>reach out</u></a>.</p>
            <category><![CDATA[Research]]></category>
            <category><![CDATA[Bots]]></category>
            <category><![CDATA[Bot Management]]></category>
            <category><![CDATA[AI Bots]]></category>
            <category><![CDATA[Cryptography]]></category>
            <guid isPermaLink="false">2hUP3FdePgIYVDwhgJVLeV</guid>
            <dc:creator>Thibault Meunier</dc:creator>
            <dc:creator>Mari Galicer</dc:creator>
        </item>
        <item>
            <title><![CDATA[Sometimes I cache: implementing lock-free probabilistic caching]]></title>
            <link>https://blog.cloudflare.com/sometimes-i-cache/</link>
            <pubDate>Thu, 26 Dec 2024 14:00:00 GMT</pubDate>
            <description><![CDATA[ If you want to know what cache revalidation is, how it works, and why it can involve rolling a die, read on. This blog post presents a lock-free probabilistic approach to cache revalidation, along 
 ]]></description>
            <content:encoded><![CDATA[ <p>HTTP caching is conceptually simple: if the response to a request is in the cache, serve it, and if not, pull it from your origin, put it in the cache, and return it. When the response is old, you repeat the process. This is called cache revalidation. If you are worried about too many requests going to your origin at once, you protect it with a <a href="https://developers.cloudflare.com/cache/concepts/revalidation/"><u>cache lock</u></a>: a small program, possibly distinct from your cache, that indicates if a request is already going to your origin. This is called request collapsing.</p><p>In this blog post, we dive into how cache revalidation works, and present a new approach based on probability. For every request going to the origin, we simulate a die roll. If it’s a 6, the request can go to the origin. Otherwise, the stale response is served, protecting our origin from being overloaded. To see how this is built and optimised, read on.</p>
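Before diving in, the die roll itself is nothing more than a biased boolean. A minimal sketch (function name is ours):

```javascript
// Simulate the die: only a "6" (roughly 1 request in 6) is allowed through
// to the origin; every other roll serves the stale cached response.
function rollAllowsOrigin() {
  return Math.floor(Math.random() * 6) === 5
}

// Over many requests, about 1 in 6 rolls lets a revalidation through.
let allowed = 0
const trials = 60_000
for (let i = 0; i < trials; i++) {
  if (rollAllowsOrigin()) allowed++
}
const rate = allowed / trials // ≈ 0.167
```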
    <div>
      <h2>Background</h2>
      <a href="#background">
        
      </a>
    </div>
    <p>Let's take the example of an online image library. When a client requests an image, the service first checks its cache to see if the resource is present. If it is, it returns it. If it is not, the image server processes the request, places the response into the cache for a day, and returns it. When the cache expires, the process is repeated.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6zdgMhKkU3cVKRk9T2CO2I/5302e6ef68cabd04b8bd65a7e416f033/BLOG-2639_2.png" />
          </figure><p><i>Figure 1: Uncached request goes to the origin</i></p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4J5KIW2CocyvlLXLuVBTWH/1f6174983e1fc1de72b48a9801da160e/BLOG-2639_3.png" />
          </figure><p><i>Figure 2: Cached request stops at the cache</i></p><p>And this is where things get complex. The image of a cat might be quite popular. Let's say it's requested 10 times per second. Let’s also assume the image server cannot handle more than 1 request per second. After a day, the cache expires. 10 requests hit the service. Given there are no up-to-date items in cache, these 10 requests are going to go directly to the image server. This problem is known as <a href="https://en.wikipedia.org/wiki/Cache_stampede"><u>cache stampede</u></a>. When the image server sees these 10 requests all happening at the same time, it gets overloaded.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3xDMJ6iloIjNHWpAbkfXWv/bee11bc4e1bb83a5b330529d8d98fbaa/BLOG-2639_4.png" />
          </figure><p><i>Figure 3: Image server overloaded upon cache expiration. This can happen to one or multiple users, across locations.</i></p><p>This all stops if the cache gets populated, as it can handle a lot more requests than the origin.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7LVi2SItiAUYWdEm3ll3ZH/368c6ff8e5d9017d2d3a8cc96246b73c/BLOG-2639_5.png" />
          </figure><p><i>Figure 4: Cache is populated and can handle the load. The image server is healthy again.</i></p><p>In the following sections, we build this image service, see how it can prevent cache stampede with a cache lock, then dive into probabilistic cache revalidation, and its optimisation.</p>
    <div>
      <h2>Setup</h2>
      <a href="#setup">
        
      </a>
    </div>
    <p>Let's write this image service. We need an image, a server, and a cache. For the image we're going to use a picture of <a href="https://files.research.cloudflare.com/images/cat.jpg"><u>my cat</u></a>, Cloudflare Workers for the server, and the Cloudflare Cache API for caching.</p><p>Note to the reader: On purpose, we aren’t using <a href="https://developers.cloudflare.com/kv/"><u>Cloudflare KV</u></a> or <a href="https://developers.cloudflare.com/cache/"><u>Cloudflare CDN Cache</u></a>, because they already solve our cache revalidation problem by using a cache lock.</p>
            <pre><code>let cache = caches.default
const CACHE_KEY = new Request('https://cache.local/')
const CACHE_AGE_IN_S = 86_400 // 1 day

function cacheExpirationDate() {
  return new Date(Date.now() + 1000*CACHE_AGE_IN_S)
}

async function fetchAndCache(ctx) {
  let response = await fetch('https://files.research.cloudflare.com/images/cat.jpg')
  response = new Response(
    await response.arrayBuffer(),
    {
      headers: {
        'Content-Type': response.headers.get('Content-Type'),
        'Expires': cacheExpirationDate().toUTCString(),
      },
    },
  )
  ctx.waitUntil(cache.put(CACHE_KEY, response.clone()))
  return response
}

export default {
  async fetch(request, env, ctx) {
    let cachedResponse = await cache.match(CACHE_KEY)
    if (cachedResponse) {
      return cachedResponse
    }
    return fetchAndCache(ctx)
  }
}</code></pre>
            <p><i>Codeblock 1: Image server with a non-collapsing cache</i></p>
    <div>
      <h2>Expectation about cache revalidation</h2>
      <a href="#expectation-about-cache-revalidation">
        
      </a>
    </div>
    <p>The image service is receiving 10 requests per second, and it caches images for a day. It's reasonable to assume we would like to start revalidating the cache 5 minutes before it expires. The code evolves as follows:</p>
            <pre><code>let cache = caches.default
const CACHE_KEY = new Request('https://cache.local/')
const CACHE_AGE_IN_S = 86_400 // 1 day
const CACHE_REVALIDATION_INTERVAL_IN_S = 300

function cacheExpirationDate() {
  // Date constructor in workers takes Unix time in milliseconds
  // Date.now() returns time in milliseconds as well
  return new Date(Date.now() + 1000*CACHE_AGE_IN_S)
}

async function fetchAndCache(ctx) {
  let response = await fetch('https://files.research.cloudflare.com/images/cat.jpg')
  response = new Response(
    await response.arrayBuffer(),
    {
      headers: {
        'Content-Type': response.headers.get('Content-Type'),
        'Expires': cacheExpirationDate().toUTCString(),
      },
    },
  )
  ctx.waitUntil(cache.put(CACHE_KEY, response.clone()))
  return response
}

// Revalidation function added here
// This is where we are going to focus our effort: should the request be revalidated?
function shouldRevalidate(expirationDate) {
  let remainingCacheTimeInS = (expirationDate.getTime() - Date.now()) / 1000

  return remainingCacheTimeInS &lt;= CACHE_REVALIDATION_INTERVAL_IN_S
}

export default {
  async fetch(request, env, ctx) {
    let cachedResponse = await cache.match(CACHE_KEY)
    if (cachedResponse) {
      // revalidation happens only if the request was cached. Otherwise, the resource is fetched anyway
      if (shouldRevalidate(new Date(cachedResponse.headers.get('Expires')))) {
        ctx.waitUntil(fetchAndCache(ctx))
      }
      return cachedResponse
    }
    return fetchAndCache(ctx)
  }
}</code></pre>
            <p><i>Codeblock 2: Image server with early-revalidation and a non-collapsing cache</i></p><p>That code works, and we can now revalidate 5 minutes in advance of cache expiration. However, instead of fetching the image from the origin server at expiration time, all requests are going to be made 5 minutes in advance, and that does not solve our cache stampede problem. This happens whether requests arrive at a single location or many, given the code above does not collapse requests.</p><p>To solve our cache stampede problem, we need the revalidation process to not send too many requests at the same time. Ideally, we would like only one request to be sent between <code>expiration - 5min</code> and <code>expiration</code>.</p>
    <div>
      <h2>The usual solution: a cache lock</h2>
      <a href="#the-usual-solution-a-cache-lock">
        
      </a>
    </div>
    <p>To make sure there is only one request at a time going to the origin server, the solution that's usually deployed is a cache lock. The idea is that for a specific item, a cat picture in our case, requests to the origin try to obtain a lock. The request obtaining the lock can go to the origin; the others will serve stale content.</p><p>The lock has two methods: <code>try_lock</code> and <code>unlock</code>.</p><ul><li><p><code>try_lock</code>: if the lock is free, takes it and returns <code>true</code>. If not, returns <code>false</code>.</p></li><li><p><code>unlock</code>: releases the lock.</p></li></ul><p>Such a lock can be implemented as a <a href="https://developers.cloudflare.com/workers/runtime-apis/rpc/"><u>Cloudflare RPC service</u></a>:</p>
            <pre><code>import { DurableObject } from 'cloudflare:workers'

export class Lock extends DurableObject {
  async try_lock(key) {
    let value = await this.ctx.storage.get(key)
    if (!value) {
      await this.ctx.storage.put(key, true)
      return true
    }
    return false
  }

  unlock(key) {
    return this.ctx.storage.delete(key)
  }
}</code></pre>
            <p><i>Codeblock 3: Lock service implemented with a Durable Object</i></p><p>That service can then be used as a cache lock.</p>
            <pre><code>// CACHE_LOCK is an RPC stub for the Lock object above
// Assuming the above is deployed with a Durable Object binding in wrangler.toml:
// [[durable_objects.bindings]]
// name = "LOCK"
// class_name = "Lock"
// the stub is obtained with env.LOCK.get(env.LOCK.idFromName(LOCK_KEY))

const LOCK_KEY = "cat_image_service"

async function fetchAndCache(env, ctx) {
  let response = await fetch('...')
  ctx.waitUntil(env.CACHE_LOCK.unlock(LOCK_KEY))
  ...
}

async function shouldRevalidate(env, expirationDate) {
  let remainingCacheTimeInS = (expirationDate.getTime() - Date.now()) / 1000

  // check if the expiry window is now, and then if the revalidation lock is available. If it is, take it
  return remainingCacheTimeInS &lt;= CACHE_REVALIDATION_INTERVAL_IN_S &amp;&amp; await env.CACHE_LOCK.try_lock(LOCK_KEY)
}</code></pre>
            <p><i>Codeblock 4: Image server with early-revalidation and a cache using a cache-lock</i></p><p>Now you might say "Et voilà. No need for probabilities and mathematics. Peak engineering has triumphed." And you might be right, in most cases. That's why cache locks are so <a href="https://developers.cloudflare.com/cache/concepts/revalidation/"><u>predominant</u></a>: they are conceptually simple, deterministic for the same key, and scale well with predictable resource usage.</p><p>On the other hand, cache locks add latency and fallibility. To take ownership of a lock, cache revalidation has to contact the lock service. This service is shared across different processes, possibly different machines in different locations. Requests therefore take time. In addition, this service might be unavailable. Probabilistic cache revalidation does not suffer from these drawbacks, given it does not reach out to an external service but rolls a die with the local randomness generator. It does so at the cost of not guaranteeing the number of requests going to the origin server: maybe zero for an extended period, maybe more than one. On average, this is going to be fine. But there can be edge cases, similar to how one can roll a die 10 times and get 10 sixes. It’s unlikely, but not impossible, and certain services need that certainty. In the following sections, we dissect this approach.</p>
    <div>
      <h2>First dive into probabilities given a stable request rate</h2>
      <a href="#first-dive-into-probabilities-given-a-stable-request-rate">
        
      </a>
    </div>
    <p>A first approach is to reduce the number of requests going to the origin server. Instead of always sending a request to revalidate, we are going to send 1 out of 10. This means that instead of sending 10 requests per second when the cache expires, we send 1 per second.</p><p>Because we don't have a lock, we do that with probabilities. We set the probability of sending a request to the origin to be $p=\frac{1}{10}$. With a rate $r$ of 10 requests per second, after 1 second, the probability that at least one request has been sent to the origin is $1-(1-p)^{10} \approx 65\%$. We draw the evolution of the function $E(r, t)=1-(1-p)^{r \times t}$ representing the probability that at least one request has been sent to the server by time $t$.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/78UvqHlQ6Se6cifbla3ozw/111decceed55ba1387b78a75e28f04cc/BLOG-2639_6.jpg" />
          </figure><p><i>Figure 5: Revalidation time $E(t)$ with $r=10$ and $p=\frac{1}{10}$. At time $t$, $E(t)$ is the probability that an early revalidation occurred.</i></p><p>The graph moves very quickly towards $1$. This means we might still have space to reduce the number of requests going to our origin server. We can set a lower probability, such as $p_2=\frac{1}{500}$ (1 request every 5 seconds on average). The graph looks as follows:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/TtLqU9nyKqtYxIz5oJefX/fba679de0599854ba344f19bd0b97939/BLOG-2639_7.jpg" />
          </figure><p><i>Figure 6: Revalidation time $E(t)$ with $r=10$ and $p=\frac{1}{500}$.</i></p><p>This looks great. Let's implement it.</p>
            <pre><code>const CACHE_REVALIDATION_INTERVAL_IN_S = 300
const CACHE_REVALIDATION_PROBABILITY = 1/500

function shouldRevalidate(expirationDate) {
  let remainingCacheTimeInS = (expirationDate.getTime() - Date.now()) / 1000

  if (remainingCacheTimeInS &gt; CACHE_REVALIDATION_INTERVAL_IN_S) {
    return false
  }
  if (remainingCacheTimeInS &lt;= 0) {
    return true
  }
  return Math.random() &lt; CACHE_REVALIDATION_PROBABILITY
}</code></pre>
            <p><i>Codeblock 5: Image server with early-revalidation and a probabilistic cache using uniform distribution</i></p><p>That's it. If the cache is not close to expiration, we don't revalidate. If the cache is expired, we revalidate. Otherwise, we revalidate based on a probability.</p>
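The expectancy values above are easy to check numerically. A small sketch of $E(r, t)=1-(1-p)^{r \times t}$ (the second value, for a rate of 1 request per second, comes up again in the next section):

```javascript
// E(r, t): probability that at least one of the r*t die rolls within
// t seconds sent a revalidation request to the origin.
function expectancy(rate, seconds, p) {
  return 1 - Math.pow(1 - p, rate * seconds)
}

const e1 = expectancy(10, 1, 1 / 10)   // 10 rps, p = 1/10, after 1 s: ≈ 65%
const e2 = expectancy(1, 300, 1 / 500) // 1 rps, p = 1/500, after 300 s: ≈ 45%
```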
    <div>
      <h2>Adaptive cache revalidation</h2>
      <a href="#adaptive-cache-revalidation">
        
      </a>
    </div>
    <p>Until now, we assumed the picture of the cat received a stable request rate. However, for a real service, this does not necessarily hold. For instance, imagine that instead of 10 requests per second, the service receives only 1. The expectancy function does not look as good: after 5 minutes (300s), $E(r=1, t=300)=45\%$. On the other hand, if the image service is receiving 10,000 requests per second, $E(r=10000, t = 300) \approx 100\%$, but our server receives on average $10000 \times \frac{1}{500} = 20$ requests per second. It would be ideal to design a probability function that adapts to the request rate.</p><p>That function would return a low probability when the expiration time is far in the future, and increase over time such that the cache is revalidated before it expires. It would cap the request rate going to the origin server.</p><p>Let’s design the variation of probability $p$ over 5 minutes. When far from the expiration, the probability to revalidate should be low. This should help match the high request rate. For example, with a request rate of 10k requests per second, we would like the revalidation probability $p$ to be $\frac{1}{100000}$. This ensures the request rate seen by our server is going to be low on average, at about 1 request every 10 seconds. As time passes, we increase this probability to allow for revalidation even at a lower request rate.</p><table><tr><td><p><b>Time to expiration $t$ (in s)</b></p></td><td><p><b>Revalidation probability $p$</b></p></td><td><p><b>Target request rate $r$ (in rps)</b></p></td></tr><tr><td><p>300</p></td><td><p>1/100000</p></td><td><p>10000</p></td></tr><tr><td><p>240</p></td><td><p>1/10000</p></td><td><p>1000</p></td></tr><tr><td><p>180</p></td><td><p>1/1000</p></td><td><p>100</p></td></tr><tr><td><p>120</p></td><td><p>1/100</p></td><td><p>10</p></td></tr><tr><td><p>60</p></td><td><p>1/10</p></td><td><p>1</p></td></tr><tr><td><p>0</p></td><td><p>1</p></td><td><p>-</p></td></tr></table><p><i>Table 1: Variation of revalidation probability over time</i></p><p>For each of these intervals, there is a high likelihood that a request rate of $r$ will trigger a cache revalidation, and a low likelihood that a lower request rate will trigger it. If a lower rate does trigger one, that’s ok.</p><p>We can update our revalidation function as follows:</p>
            <pre><code>const CACHE_REVALIDATION_INTERVAL_IN_S = 300
// Index 0 covers the last minute before expiration, index 4 the first
const CACHE_REVALIDATION_PROBABILITY_PER_MIN = [1/10, 1/100, 1/1000, 1/10_000, 1/100_000]

function shouldRevalidate(expirationDate) {
  let remainingCacheTimeInS = (expirationDate.getTime() - Date.now()) / 1000

  if (remainingCacheTimeInS &gt; CACHE_REVALIDATION_INTERVAL_IN_S) {
    return false
  }
  if (remainingCacheTimeInS &lt;= 0) {
    return true
  }
  let currentMinute = Math.min(Math.floor(remainingCacheTimeInS/60), 4)
  return Math.random() &lt; CACHE_REVALIDATION_PROBABILITY_PER_MIN[currentMinute]
}</code></pre>
            <p><i>Codeblock 6: Image server with early-revalidation and a probabilistic cache using piecewise uniform distribution</i></p>
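<p>As a sanity check, we can plug Table 1's values into the expectancy function used earlier in this post, $E(r, t)=1-(1-p)^{rt}$: at each bucket's target request rate, a revalidation within that 60-second window is near-certain, while a much lower rate rarely triggers one. A quick standalone sketch:</p>

```javascript
// Sanity check of Table 1, assuming the expectancy function used earlier
// in this post: with probability p per request and r requests per second,
// the chance of at least one revalidation within t seconds is
// E(r, t) = 1 - (1 - p)^(r * t).
function expectancy(p, r, t) {
  return 1 - Math.pow(1 - p, r * t)
}

// Each row pairs a revalidation probability with its target request rate.
const buckets = [
  { p: 1 / 100_000, r: 10_000 },
  { p: 1 / 10_000, r: 1_000 },
  { p: 1 / 1_000, r: 100 },
  { p: 1 / 100, r: 10 },
  { p: 1 / 10, r: 1 },
]

// At the target rate, a revalidation within a 60s bucket is near-certain
// (about 99.8% in every row, since p * r * 60 = 6 for each of them).
for (const { p, r } of buckets) {
  console.log(`p=${p} r=${r}rps -> E=${expectancy(p, r, 60).toFixed(3)}`)
}
```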
    <div>
      <h2>Optimal cache stampede solution</h2>
      <a href="#optimal-cache-stampede-solution">
        
      </a>
    </div>
    <p>There are a lot of decisions being made here. To guide them, we can reference an academic paper written by A. Vattani, F. Chierichetti, and K. Lowenstein in 2015, called <a href="https://cseweb.ucsd.edu/~avattani/papers/cache_stampede.pdf"><u>Optimal Probabilistic Cache Stampede Prevention</u></a>. If you read it, you'll recognise that what we have been discussing until now is close to what the paper presents: both the cache revalidation algorithm structure and the early revalidation function look similar.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1MLwwNsci3XxhK0uAvWcFu/62a17cc9056d6b12832160acc9e40fb9/BLOG-2639_8.png" />
          </figure><p><i>Figure 7: Probabilistic early expiration of a cache item as defined by Figure 2 of the Optimal Probabilistic Cache Stampede Prevention paper. In our case, $\mathcal{D}=300$</i></p><p>One takeaway from the paper is that instead of discretizing (one probability from 0 to 60s, another from 60s to 120s, and so on), the probability function can be continuous. Instead of a fixed $p$, there is a function $p(t)$ of time $t$.</p><p>$p(t)=e^{-\lambda (expiry-t)}, \text{ with } expiry=300, \text{ and } t \in [0, 300]$</p><p>We call $\lambda$ the steepness parameter, and set it to $\frac{1}{300}$, $300$ being our early expiration gap.</p><p>The expectancy over time is $E(r, t)=1-e^{-r\lambda t}$. This leads to the expectancy below for various request rates. Note that when $r=1$, there is not a $100\%$ chance that the request will be revalidated before expiry.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3lHElc96ZDNbcMfXVQ9pUm/e70e98b4be158dcde11b95366a736742/BLOG-2639_9.jpg" />
          </figure><p><i>Figure 8: Revalidation time $E(t)$ for multiple $r$ with an exponential distribution.</i></p><p>This leads to the final code snippet:</p>
            <pre><code>const CACHE_REVALIDATION_INTERVAL_IN_S = 300
const REVALIDATION_STEEPNESS = 1/300

function shouldRevalidate(expirationDate) {
  let remainingCacheTimeInS = (expirationDate.getTime() - Date.now()) / 1000

  if (remainingCacheTimeInS &gt; CACHE_REVALIDATION_INTERVAL_IN_S) {
    return false
  }
  if (remainingCacheTimeInS &lt;= 0) {
    return true
  }
  // p(t) is evaluated here
  return Math.random() &lt; Math.exp(-REVALIDATION_STEEPNESS * remainingCacheTimeInS)
}
</code></pre>
            <p><i>Codeblock 7: Image server with early-revalidation and a probabilistic cache using exponential distribution</i></p><p>And that's it. Given that <code>Date.now()</code> has a finite granularity and is not continuous, it would also be possible to discretise these functions, even though the gains are minimal. This is what we have done in a <a href="https://github.com/cloudflare/privacypass-issuer/blob/main/src/cache.ts#L60-L103"><u>production worker implementation</u></a>, where the request volume is significant. It is a service that benefits from caching for performance reasons, and that cannot use built-in <a href="https://developers.cloudflare.com/workers/runtime-apis/cache/"><u>stale-while-revalidate</u></a> from within Cloudflare Workers. Probabilistic cache stampede prevention is well-suited here, as no new component has to be built, and it performs well at different request rates.</p>
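<p>Plugging numbers into $E(r, t)=1-e^{-r\lambda t}$ confirms the caveat about $r=1$: with $\lambda=\frac{1}{300}$, a single request per second only reaches about $63\%$ by expiry, while higher rates approach certainty. A quick standalone check:</p>

```javascript
// Expectancy of the exponential scheme, using the same steepness as
// Codeblock 7: E(r, t) = 1 - e^(-r * lambda * t).
const REVALIDATION_STEEPNESS = 1 / 300

function expectancy(r, t) {
  return 1 - Math.exp(-r * REVALIDATION_STEEPNESS * t)
}

// At one request per second, revalidation before expiry is not guaranteed:
console.log(expectancy(1, 300).toFixed(3)) // 0.632
// At higher request rates it becomes a near-certainty:
console.log(expectancy(100, 300).toFixed(3)) // 1.000
```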
    <div>
      <h2>Conclusion</h2>
      <a href="#conclusion">
        
      </a>
    </div>
    <p>We have seen how to solve cache stampedes without a lock, how to implement the solution, and why it is optimal. In the real world, you likely will not encounter this issue: either because it’s good enough to optimize your origin service to serve more requests, or because you can leverage a CDN cache. In fact, most HTTP caches provide an API that follows <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control"><u>Cache-Control</u></a>, and likely have all the tools you need. This primitive is also built into certain products, such as <a href="https://developers.cloudflare.com/kv/platform/limits/"><u>Cloudflare KV</u></a>.</p><p>If you have not done so, you can go and experiment with all the code snippets presented in this blog on the Cloudflare Workers Playground at <a href="https://cloudflareworkers.com"><u>cloudflareworkers.com</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[Research]]></category>
            <category><![CDATA[Cache]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <guid isPermaLink="false">4Xek2BRcXVKNsI4vCa3Zuj</guid>
            <dc:creator>Thibault Meunier</dc:creator>
        </item>
        <item>
            <title><![CDATA[Cloudflare helps verify the security of end-to-end encrypted messages by auditing key transparency for WhatsApp]]></title>
            <link>https://blog.cloudflare.com/key-transparency/</link>
            <pubDate>Tue, 24 Sep 2024 13:00:00 GMT</pubDate>
            <description><![CDATA[ Cloudflare is now verifying WhatsApp’s Key Transparency audit proofs to ensure the security of end-to-end encrypted messaging conversations without having to manually check QR codes. We are publishing the results of the proof verification to https://dash.key-transparency.cloudflare.com for independent researchers and security experts to compare against WhatsApp’s. Cloudflare does not have access to underlying public key material or message metadata as part of this infrastructure. ]]></description>
            <content:encoded><![CDATA[ <p>Chances are good that today you’ve sent a message through an <a href="https://www.cloudflare.com/learning/privacy/what-is-end-to-end-encryption"><u>end-to-end encrypted (E2EE)</u></a> messaging app such as WhatsApp, Signal, or iMessage. While we often take the privacy of these conversations for granted, they in fact rely on decades of research, testing, and standardization efforts, the foundation of which is a <a href="https://www.cloudflare.com/learning/ssl/how-does-public-key-encryption-work/"><u>public-private key exchange</u></a>. There is, however, an oft-overlooked implicit trust inherent in this model: that the messaging app infrastructure is distributing the public keys of all of its users correctly.</p><p>Here’s an example: if Joe and Alice are messaging each other on WhatsApp, Joe uses Alice’s phone number to retrieve Alice’s public key from the WhatsApp database, and Alice receives Joe’s public key. Their messages are then encrypted using this key exchange, so that no one besides Alice and Joe themselves — not even WhatsApp — can see the contents of their messages. However, in the unlikely situation where an attacker, Bob, manages to register a different public key in WhatsApp’s database, Joe would try to message Alice but unknowingly be messaging Bob instead. 
And while this threat is most salient for journalists, activists, and those most vulnerable to cyber attacks, we believe that protecting the privacy and integrity of end-to-end encrypted conversations is for everyone.</p><p>There are several methods that end-to-end encrypted messaging apps have deployed thus far to protect the integrity of public key distribution, the most common of which is to do an in-person verification of the QR code fingerprint of your public key (<a href="https://faq.whatsapp.com/2416198805185327?helpref=faq_content"><u>WhatsApp</u></a> and <a href="https://signal.org/blog/safety-number-updates/"><u>Signal</u></a> both have a version of this). As you can imagine, this experience is inconvenient and unwieldy, especially as your number of contacts and group chats increases.</p><p>Over the past few years, there have been significant developments in this area of cryptography, and WhatsApp has paved the way with their <a href="https://engineering.fb.com/2023/04/13/security/whatsapp-key-transparency/"><u>Key Transparency announcement</u></a>. But as an independent third party, Cloudflare can provide stronger reassurance: that’s why we’re excited to announce that we’re now verifying WhatsApp’s Key Transparency audit proofs. </p>
    <div>
      <h2>Auditing: the next frontier of encryption </h2>
      <a href="#auditing-the-next-frontier-of-encryption">
        
      </a>
    </div>
    <p>We didn’t build this in a vacuum: similar to how the web and messaging apps became encrypted over time, we see auditing public key infrastructure as the next logical step in securing Internet infrastructure. This solution builds upon learnings from <a href="https://en.wikipedia.org/wiki/Certificate_Transparency"><u>Certificate Transparency</u></a> and <a href="https://binary.transparency.dev/"><u>Binary Transparency</u></a>, which share some of the underlying data structure and cryptographic techniques, and we’re excited about the formation of a <a href="https://mailarchive.ietf.org/arch/browse/keytrans/"><u>working group at the IETF</u></a> to make multi-party operation of Key Transparency-like systems tractable for a broader set of use cases. </p><p>We see our role here as a pioneer of a real-world deployment of this auditing infrastructure, working through and sharing the operational challenges of running a system that is critical for a messaging app used by billions of people around the world.   </p><p>We’ve also done this before — in 2022, Cloudflare announced <a href="https://blog.cloudflare.com/cloudflare-verifies-code-whatsapp-web-serves-users/"><u>Code Verify</u></a>, a partnership in which we verify that the code delivered in the browser for <a href="https://web.whatsapp.com/"><u>WhatsApp Web</u></a> has not been tampered with. When users run WhatsApp in their browser, the <a href="https://faq.whatsapp.com/639766781216714?cms_platform=web&amp;helpref=faq_content"><u>WhatsApp Code Verify extension</u></a> compares a hash of the code that is executing in the browser with the hash that Cloudflare has of the codebase, enabling WhatsApp web users to easily see whether the code that is executing is the code that was publicly committed to. </p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/451BUpYOdkMZ9KqtGKBbFQ/69edee8fe9e64d124b16908a95bf0933/Code_Verify_-_Blog.png" />
            
            </figure><p><sup><i>In Code Verify, Cloudflare builds a non-mutable chain associating the WhatsApp version with the hash of its code.</i></sup></p><p>Cloudflare’s role in Key Transparency is similar in that we are checking that a tree-based directory of public keys (more on this later) has been constructed correctly, and has been done so consistently over time.</p>
    <div>
      <h2>How Key Transparency works</h2>
      <a href="#how-key-transparency-works">
        
      </a>
    </div>
    <p>The architectural foundation of Key Transparency is the <a href="https://github.com/facebook/akd/"><u>Auditable Key Directory (AKD)</u></a>: a tree-shaped data structure, constructed and maintained by WhatsApp, in which the nodes contain hashed contact details of each user. We’ll explain the basics here but if you’re interested in learning more, check out the <a href="https://eprint.iacr.org/2018/607.pdf"><u>SEEMless</u></a> and <a href="https://eprint.iacr.org/2023/081.pdf"><u>Parakeet</u></a> papers.</p><p>The AKD is constructed by building a binary tree in which each parent node is a hash of its left and right child nodes:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5Dl16PIqBNBiKn6qhzanCU/b67a1128bef4ab8733337f720c53f160/Mini_tree_-_Blog.png" />
            
            </figure><p><sup><i>Each child node on the tree contains contact and public key details for a user (shown here for illustrative purposes). In reality, Cloudflare only sees a hash of each node rather than Alice and Bob’s contact info in plaintext.</i></sup></p><p>An epoch describes a specific version of the tree at a given moment in time, identified by its root node. Using a structure similar to Code Verify, the WhatsApp Log stores each root node hash as part of an append-only time structure of updates.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/43FGTWfTV2aIaaMkfyplY2/8576b9c096bead64ede52a034481149b/Key_Transparency_combined_tree_-_Blog.png" />
            
            </figure><p>What kind of changes are valid to be included in a given epoch? When a new person, Brian, joins WhatsApp, WhatsApp inserts a new “B” node into the AKD tree and cuts a new epoch. If Alice loses her phone and rotates her key, her “version” is updated to <code>v1</code> in the next epoch.</p>
    <div>
      <h2>How we built the Auditor on Cloudflare Workers </h2>
      <a href="#how-we-built-the-auditor-on-cloudflare-workers">
        
      </a>
    </div>
    <p>The role of the Auditor is to provide two main guarantees: that epochs are globally unique, and that they are valid. These guarantees are, however, quite different: global uniqueness requires consistency on whether an epoch and its associated root hash have been seen, while validity is a matter of computation — is the transition from the previous epoch to the current one a correct tree transformation?</p>
    <div>
      <h3>Timestamping service</h3>
      <a href="#timestamping-service">
        
      </a>
    </div>
    
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5QHtfUaKoIBcIcK5o6H53j/ebaa147f3b22de32a29754d3cd7e73bd/Key_Transparency_Auditor_3_.png" />
            
            </figure><p><sup><i>Timestamping service architecture (Cloudflare Workers in Rust, using a Durable Object for storage)</i></sup></p><p>At regular intervals, the WhatsApp Log puts all new updates into the tree, and cuts a new epoch in the format “{counter}/{previous}/{current}”. The counter is a number, “previous” is a hexadecimal-encoded hash of the previous tree root, and “current” is a hexadecimal-encoded hash of the new tree root. As a shorthand, epochs can be referred to by their counter only.</p><p>Here’s an example:</p><p><code>1001/d0bbf29c48716f26a951ae2a244eb1d070ee38865c29c8ad8174e8904e3cdc1a/e1006114485e8f0bbe2464e0ebac77af37bce76851745592e8dd5991ff2cd411</code></p><p>Once an epoch is constructed, the WhatsApp Log sends it to the Auditor for cross-signing, to ensure it has only been seen once. The Auditor adds a timestamp recording when this new epoch was seen. Cloudflare’s Auditor uses a <a href="https://developers.cloudflare.com/durable-objects/platform/known-issues/#global-uniqueness"><u>Durable Object</u></a> for every epoch to create its timestamp. This guarantees the global uniqueness of an epoch, and allows for replay in the event the WhatsApp Log experiences an outage or is distributed across multiple locations. The WhatsApp Log is expected to produce new epochs at regular intervals, since this bounds how quickly public key updates propagate to its users. Therefore, Cloudflare’s Auditor does not have to keep the Durable Object state forever. Once replay and consistency have been accounted for, this state is cleared. This is done after a month, thanks to Durable Object <a href="https://developers.cloudflare.com/durable-objects/api/alarms/"><u>alarms</u></a>.</p><p>The service performs additional checks as well, such as verifying that epochs are consecutive and that each digest is unique. 
This enforces a chain of epochs and their associated digests, provided by the WhatsApp Log and signed by the Auditor, giving everyone a consistent view.</p><p>We decided to write this service in Rust because Workers provide Rust bindings through <a href="https://github.com/cloudflare/workers-rs"><u>cloudflare/workers-rs</u></a>, and the auditable key directory library (<a href="https://github.com/facebook/akd"><u>facebook/akd</u></a>) is also written in Rust.</p>
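<p>The chaining checks described above can be sketched as follows, assuming the “{counter}/{previous}/{current}” format. The digests below are shortened placeholders for illustration, and the real service additionally timestamps and signs each accepted epoch:</p>

```javascript
// Sketch of the epoch chaining checks, assuming the
// "{counter}/{previous}/{current}" format described above.
function parseEpoch(s) {
  const [counter, previous, current] = s.split("/")
  return { counter: Number(counter), previous, current }
}

// Accept a new epoch only if its counter is consecutive and its "previous"
// digest matches the last accepted "current" digest.
function acceptsNext(last, next) {
  return next.counter === last.counter + 1 && next.previous === last.current
}

const last = parseEpoch("1001/aaaa/bbbb")
console.log(acceptsNext(last, parseEpoch("1002/bbbb/cccc"))) // true
console.log(acceptsNext(last, parseEpoch("1003/bbbb/cccc"))) // false: counter gap
console.log(acceptsNext(last, parseEpoch("1002/aaaa/cccc"))) // false: digest mismatch
```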
    <div>
      <h3>Tree validation service</h3>
      <a href="#tree-validation-service">
        
      </a>
    </div>
    <p>With the timestamping service above, WhatsApp users (as well as their Log) have assurance that epochs are transparent. WhatsApp’s directory can be audited at any point in time, and if it were to be tampered with by WhatsApp or an intermediary, the WhatsApp Log can be held accountable for it.</p><p>Epochs and their digests are only representations of their underlying key directory. To fully audit the directory, the transition from the previous digest to the current digest has to be validated. Specifically, we want to run <a href="https://github.com/facebook/akd/blob/fcd665aa20f829cd9e06cb3d70cbe0c32ffe6b67/akd/src/auditor.rs#L56"><u>verify_consecutive_append_only</u></a> on every epoch constructed by the Log. The size of an epoch varies with the number of updates it contains, and therefore with the number of associated tree nodes to construct. While Workers are able to run such validation for a small number of updates, this is a compute-intensive task. Therefore, still using the same Rust codebase, tree construction and validation run in a dedicated container. The Auditor retrieves the updates for a given epoch, copies them into its own R2 bucket, and delegates the validation to a <a href="https://github.com/cloudflare/workers-sdk/tree/main/packages/wrangler/src/cloudchamber"><u>container</u></a> running on Cloudflare. Once validated, the epoch is marked as verified.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5fNbebayPfLZPn4pEIoWpP/b8d2da4ac9867aeb44e9e1ff52405be9/Key_Transparency_Auditor_4_.png" />
            
            </figure><p><sup><i>Architecture for Cloudflare’s Plexi Auditor. The proof verification and signatures stored do not contain personally identifiable information such as your phone number, public key, or other metadata tied to your WhatsApp account.</i></sup></p><p>This decouples the global uniqueness check from epoch validation, which happen at two distinct times. It allows the validation to take more time, without being latency-sensitive.</p>
    <div>
      <h2>How can I verify Cloudflare has signed an epoch?</h2>
      <a href="#how-can-i-verify-cloudflare-has-signed-an-epoch">
        
      </a>
    </div>
    <p>Anyone can perform audit proof verification — <a href="https://dash.key-transparency.cloudflare.com"><u>the proofs are publicly available</u></a> — but Cloudflare will be doing so automatically and publicly to make the results accessible to all. Verify that Cloudflare’s signature matches WhatsApp’s by visiting our <a href="https://dash.key-transparency.cloudflare.com"><u>Key Transparency website</u></a>, or via our <a href="https://github.com/cloudflare/plexi"><u>command line</u></a> tool.</p><p>To use our command line tool, you’ll need to download the <a href="https://github.com/cloudflare/plexi"><u>plexi</u></a> client. It helps construct data structures which are used for signatures, and requires you to have git and cargo installed.</p>
            <pre><code>cargo install plexi</code></pre>
            <p>With the client installed, let’s now check the audit proofs for the WhatsApp namespace: <code>whatsapp.key-transparency.v1</code>. The Plexi Auditor is represented by one public key, which can verify and vouch for multiple Logs, each with its own dedicated “namespace.” To validate an epoch, such as epoch 458298 (the epoch at which the log decided to start sharing data), you can run the following command:</p>
            <pre><code>plexi audit --remote-url 'https://akd-auditor.cloudflare.com' --namespace 'whatsapp.key-transparency.v1' --long
Namespace
  Name              	: whatsapp.key-transparency.v1
  Ciphersuite       	: ed25519(protobuf)

Signature (2024-09-23T16:53:45Z)
  Epoch height      	: 489193
  Epoch digest      	: cbe5097ae832a3ae51ad866104ffd4aa1f7479e873fd18df9cb96a02fc91ebfe
  Signature         	: fe94973e19da826487b637c019d3ce52f0c08093ada00b4fe6563e2f8117b4345121342bc33aae249be47979dfe704478e2c18aed86e674df9f934b718949c08
  Signature verification: success
  Proof verification	: success</code></pre>
            
    <div>
      <h2>Interested in having Cloudflare audit your public key infrastructure?</h2>
      <a href="#interested-in-having-cloudflare-audit-your-public-key-infrastructure">
        
      </a>
    </div>
    <p>At the end of the day, security threats shouldn’t become usability problems — everyday messaging app users shouldn’t have to worry about whether the public keys of the people they’re talking to have been compromised. In the same way that <a href="https://ct.cloudflare.com/"><u>certificate transparency</u></a> is now built into the issuance and use of digital certificates to encrypt web traffic, we think that public key transparency and auditing should be built into end-to-end encrypted systems by default, so that users never have to do manual QR code verification again. </p><p>We built our auditing service to be general purpose, reliable, and fast, and WhatsApp’s Key Transparency is just the first of several use cases – Cloudflare is interested in helping audit the delivery of code binaries and the integrity of all types of end-to-end encrypted infrastructure. If your company or organization is interested in working with us, you can <a href="https://www.cloudflare.com/lp/privacy-edge/"><u>reach out to us here</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Privacy]]></category>
            <category><![CDATA[Security]]></category>
            <category><![CDATA[Research]]></category>
            <guid isPermaLink="false">4tzeNEI3qyBtnWskn5TbUa</guid>
            <dc:creator>Thibault Meunier</dc:creator>
            <dc:creator>Mari Galicer</dc:creator>
        </item>
        <item>
            <title><![CDATA[Harnessing chaos in Cloudflare offices]]></title>
            <link>https://blog.cloudflare.com/harnessing-office-chaos/</link>
            <pubDate>Fri, 08 Mar 2024 14:00:24 GMT</pubDate>
            <description><![CDATA[ This blog post will cover the new sources of “chaos” that have been added to LavaRand and how you can make use of that harnessed chaos in your next application ]]></description>
            <content:encoded><![CDATA[ <p></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6VAXGAjHjvvY5IAEG63gPu/4c199f8bb127b03fe613ab8dc6c0016f/image12-1.png" />
            
            </figure><p>In the children’s book <a href="https://en.wikipedia.org/wiki/The_Snail_and_the_Whale">The Snail and the Whale</a>, after an unexpectedly far-flung adventure, the principal character returns to declarations of “How time’s flown” and “Haven’t you grown?” It has been about four years since we last wrote about LavaRand and during that time the story of how Cloudflare uses physical sources of entropy to add to the security of the Internet has continued to travel and be a source of interest to many. What was initially just a single species of physical entropy source – lava lamps – has grown and diversified. We want to catch you up a little on the story of LavaRand. This blog post will cover the new sources of “chaos” that have been added to LavaRand and how you can make use of that harnessed chaos in your next application. We’ll cover how public randomness can open up uses of publicly trusted randomness — imagine not needing to take the holders of a “random draw” at their word when they claim the outcome was not manipulated in some way. And finally, we’ll discuss timelock encryption, a way to ensure that a message cannot be decrypted until some chosen time in the future.</p>
    <div>
      <h2>LavaRand origins</h2>
      <a href="#lavarand-origins">
        
      </a>
    </div>
    <p>The entropy sourced from our wall of lava lamps in San Francisco has long played its part in the randomness that secures connections made through Cloudflare.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5XdIuQkEWKat2c9YanaCY0/aa873b127b5eea8cea19982f3552ccc2/image11-3.png" />
            
            </figure><p>Lava lamps with flowing wax.</p><p>Cloudflare’s servers collectively handle upwards of 55 million HTTP requests per second, the <a href="https://radar.cloudflare.com/adoption-and-usage#http-vs-https">vast majority of which are secured via the TLS protocol</a> to ensure authenticity and confidentiality. Under the hood, cryptographic protocols like TLS require an underlying source of secure randomness – otherwise, the security guarantees fall apart.</p><p>Secure randomness used in cryptography needs to be computationally indistinguishable from “true” randomness. For this, it must both pass <a href="https://en.wikipedia.org/wiki/Randomness_test">statistical randomness tests</a> and produce output that is unpredictable to any computationally-bounded adversary, no matter how much previous output they’ve already seen. The typical way to achieve this is to take some random ‘seed’ and feed it into a <a href="https://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator"><i>Cryptographically Secure Pseudorandom Number Generator</i></a> (CSPRNG) that can produce an essentially-endless stream of unpredictable bytes upon request. The properties of a CSPRNG ensure that all outputs are practically indistinguishable from truly random outputs to anyone that does not know its internal state. However, this all depends on having a secure random seed to begin with. Take a look at <a href="/lavarand-in-production-the-nitty-gritty-technical-details">this blog</a> for more details on true randomness versus pseudorandomness, and this blog for some great examples of <a href="/why-randomness-matters">what can go wrong with insecure randomness</a>.</p><p>For many years, Cloudflare’s servers relied on local sources of entropy (such as the precise timing of packet arrivals or keyboard events) to seed their entropy pools. 
While there’s no reason to believe that the local entropy sources on those servers are insecure or could be easily compromised, we wanted to hedge our bets against that possibility. Our solution was to set up a system where our servers could periodically refresh their entropy pools with true randomness from an external source.</p><p>That brings us to LavaRand. “Lavarand” has long been the name given to <a href="https://en.wikipedia.org/wiki/Lavarand">systems used for the generation of randomness</a> (first by Silicon Graphics in 1997). Cloudflare <a href="/randomness-101-lavarand-in-production/">launched its instantiation of a LavaRand</a> system in 2017 as a system that collects entropy from the wall of lava lamps in our San Francisco office and makes it available via an internal API. Our servers then periodically query the API to retrieve fresh randomness from LavaRand and incorporate it into their entropy pools. The contributions made by LavaRand can be considered spice added to the entropy pool mix! (For more technical details on <a href="/lavarand-in-production-the-nitty-gritty-technical-details">contributions made by LavaRand</a>, read our previous blog post.)</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1IPp9Lizp0pL83clLWGGNa/19d397787b5a5adbb337f581d9639fce/image10.jpg" />
            
            </figure><p>Lava lamps in Cloudflare’s San Francisco office.</p>
    <div>
      <h2>Adding to the office chaos</h2>
      <a href="#adding-to-the-office-chaos">
        
      </a>
    </div>
    <p>Our lava lamps in San Francisco have been working tirelessly for years to supply fresh entropy to our systems, but they now have siblings across the world to help with their task! As Cloudflare has grown, so has the variety of entropy sources found in and sourced from our offices. <a href="/cloudflare-top-100-most-loved-workplaces-in-2022">Cloudflare’s Places team works hard</a> to ensure that our offices reflect aspects of our values and culture. Several of our larger office locations include installations of physical systems of entropy, and it is these installations that we have worked to incorporate into LavaRand over time. The tangible and exciting draw of these systems is their basis in physical mechanics that we intuitively consider random. The gloops of warmed ascending “lava” floating past cooler sinking blobs within lava lamps attract our attention just as other unpredictable (and often beautiful) dynamic systems capture our interest.</p>
    <div>
      <h3>London’s unpredictable pendulums</h3>
      <a href="#londons-unpredictable-pendulums">
        
      </a>
    </div>
    <p>Visible to visitors of our London office is a wall of double pendulums whose beautiful swings translate to another source of entropy to LavaRand and to the pool of randomness that Cloudflare’s servers pull from.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1JjgKso6GgfvLX74LEyYsE/7688dcdd10f3f3219f0c569724cb42ab/image8.jpg" />
            
            </figure><p>Close-up of double pendulum display in Cloudflare’s London office.</p><p>To the untrained eye the shadows of the pendulum stands and those cast by the rotating arms on the rear wall might seem chaotic. If so, then this installation should be labeled a success! Different light conditions and those shadows add to the chaos that is captured from this entropy source.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2JWrKSqoaPJQC2VSbHygj/e87c2936282e55730a1db4af7d4f7e7f/Screenshot-2024-03-08-at-13.13.12.png" />
            
            </figure><p>Double pendulum display in Cloudflare’s London office with changing light conditions.</p><p>Indeed, even with these arms restricted to motion in two dimensions, the path traced by the arms is mesmerizingly varied, and can be shown to be <a href="https://en.wikipedia.org/wiki/Double_pendulum">mathematically chaotic</a>. Even if we ignore air resistance, temperature, and the environment, and assume that the motion is completely deterministic, the resulting long-term motion is still hard to predict. In particular, the system is very sensitive to initial conditions: the initial state – how the arms are set in motion – paired with deterministic behavior produces a unique path that is traced until the pendulum comes to rest, and the system is set in motion by a Cloudflare employee in London once again.</p>
    <div>
      <h3>Austin’s mesmerizing mobiles</h3>
      <a href="#austins-mesmerizing-mobiles">
        
      </a>
    </div>
    <p>The beautiful new Cloudflare office in Austin, Texas recently celebrated its first year since opening. This office contributes its own spin on physical entropy: suspended above the entrance of the Cloudflare office in downtown Austin is an installation of translucent rainbow mobiles. These twirl, reflecting the changing light, and cast coloured patterns on the enclosing walls. The display of hanging mobiles and their shadows are very sensitive to a physical environment which includes the opening and closing of doors, HVAC changes, and ambient light. This chaotic system’s mesmerizing and changing scene is captured periodically and fed into the stream of LavaRand randomness.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5mfXP2V8pX0C0CoheE369Q/83fe1b4bdba232b8c8c722bc49987bfe/Screenshot-2024-03-08-at-13.14.22.png" />
            
            </figure><p>Hanging rainbow mobiles in Cloudflare’s Austin office.</p>
    <div>
      <h2>Mixing new sources into LavaRand</h2>
      <a href="#mixing-new-sources-into-lavarand">
        
      </a>
    </div>
    <p>We incorporated the new sources of office chaos into the LavaRand system (still called LavaRand despite including much more than lava lamps) in the same way as the existing lava lamps, which we’ve previously <a href="/lavarand-in-production-the-nitty-gritty-technical-details">described in detail</a>.</p><p>To recap, at repeated intervals, a camera captures an image of the current state of the randomness display. Since the underlying system is truly random, the produced image contains true randomness. Even shadows and changing light conditions play a part in producing something unique and unpredictable! There is another secret that we should share: at a base level, image sensors in the real world are often a source of sufficient noise that even images taken with the lens cap still on could work well as a source of entropy! We consider this added noise to be a serendipitous addition to the beautiful chaotic motion of these installations.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7zrgpZA2xosqvTzU6dk8V/6e1f061640192f7de4585d7f2959f4a7/Screenshot-2024-03-08-at-13.16.23.png" />
            
            </figure><p>Close-up of hanging rainbow mobiles in Cloudflare’s Austin office.</p><p>Once we have a still image that captures the state of the randomness display at a particular point in time, we compute a compact representation – a hash – of the image to derive a fixed-sized output of truly random bytes.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1VjuWFkK83t3EkTjPxYGc6/2ddc9da8c2553a8a1dbb04513de6acbd/image4-26.png" />
            
            </figure><p>Process of converting physical entropy displays into random byte strings.</p><p>The random bytes are then used as an input (along with the previous seed and some randomness from the system’s local entropy sources) to a <i>Key Derivation Function</i> (KDF) to compute a new randomness seed that is fed into a <a href="https://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator"><i>Cryptographically Secure Pseudorandom Number Generator</i></a> (CSPRNG) that can produce an essentially-endless stream of unpredictable bytes upon request. The properties of a CSPRNG ensure that all outputs are practically indistinguishable from truly random outputs to anyone that does not know its internal state. LavaRand then exposes this stream of randomness via a simple internal API where clients can request fresh randomness.</p>
            <pre><code>seed = KDF(new image || previous seed || system randomness)
rng = CSPRNG(seed)
…
rand1 = rng.random()
rand2 = rng.random()</code></pre>
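<p>The pseudocode above can be sketched concretely. The following Python sketch uses HMAC-SHA256 as an illustrative KDF and a SHAKE256 output stream as a stand-in CSPRNG (these primitive choices are assumptions for illustration; they are not the ones the production LavaRand system uses):</p>

```python
import hashlib
import hmac
import os

def derive_seed(image: bytes, previous_seed: bytes, system_randomness: bytes) -> bytes:
    # seed = KDF(new image || previous seed || system randomness)
    # HMAC-SHA256 stands in for the KDF (illustrative choice).
    return hmac.new(previous_seed, image + system_randomness, hashlib.sha256).digest()

class Csprng:
    """Toy CSPRNG: reads successive slices of a SHAKE256 output stream."""
    def __init__(self, seed: bytes):
        self._xof = hashlib.shake_256(seed)
        self._offset = 0

    def random(self, n: int = 32) -> bytes:
        out = self._xof.digest(self._offset + n)[self._offset:]
        self._offset += n
        return out

# Hash of a fresh camera frame, mixed with the previous seed and local entropy.
seed = derive_seed(b"camera frame bytes", b"previous seed", os.urandom(32))
rng = Csprng(seed)
rand1 = rng.random()
rand2 = rng.random()
```

<p>As in the real system, anyone holding the same seed replays the same output stream, which is why the seed itself must never leave the randomness service.</p>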
            
    <div>
      <h2>How can I use LavaRand?</h2>
      <a href="#how-can-i-use-lavarand">
        
      </a>
    </div>
    <p>Applications typically use secure randomness in one of two flavors: private and public.</p><p><b>Private randomness</b> is used for generating passwords, cryptographic keys, user IDs, and other values that are meant to stay secret forever. As we’ve <a href="/lavarand-in-production-the-nitty-gritty-technical-details">previously described</a>, our servers periodically request fresh private randomness from LavaRand to help to update their entropy pools. Because of this, randomness from LavaRand is essentially available to the outside world! One easy way for developers to tap into private randomness from LavaRand is to use the <a href="https://developers.cloudflare.com/workers/runtime-apis/web-crypto/#methods">Web Crypto API’s getRandomValues function</a> from a Cloudflare Worker, or use one that someone has already built, like <a href="https://csprng.xyz/">csprng.xyz</a> (<a href="https://github.com/ejcx/csprng.xyz">source</a>).</p><p><b>Public randomness</b> consists of unpredictable and unbiased random values that are made available to everyone once they are published, and for this reason <b><i>should not be used for generating cryptographic keys</i></b>. The winning lottery numbers and the coin flip at the start of a sporting event are some examples of public random values. A double-headed coin would <i>not</i> be an unbiased and unpredictable source of entropy and would have drastic impacts on the sports betting world.</p><p>In addition to being unpredictable and unbiased, it’s also desirable for public randomness to be <i>trustworthy</i> so that consumers of the randomness are assured that the values were faithfully produced. Not many people would buy lottery tickets if they believed that the winning ticket was going to be chosen unfairly! 
Indeed, there are known cases of corrupt insiders subverting public randomness for personal gain, like the <a href="https://www.nytimes.com/interactive/2018/05/03/magazine/money-issue-iowa-lottery-fraud-mystery.html">state lottery employee</a> who co-opted the lottery random number generator, allowing his friends and family to win millions of dollars.</p><p>A fundamental challenge of public randomness is that one must trust the authority producing the random outputs. Trusting a well-known authority like <a href="https://beacon.nist.gov/home">NIST</a> may suffice for many applications, but could be problematic for others (especially for applications where decentralization is important).</p>
    <div>
      <h2>drand: distributed and verifiable public randomness</h2>
      <a href="#drand-distributed-and-verifiable-public-randomness">
        
      </a>
    </div>
    <p>To help solve this problem of trust, Cloudflare joined forces with seven other independent and geographically distributed organizations back in 2019 to form the <a href="/league-of-entropy/">League of Entropy</a> to launch a public randomness beacon using the <a href="/inside-the-entropy">drand</a> (pronounced dee-rand) protocol. Each organization contributes its own unique source of randomness into the joint pool of entropy used to seed the drand network – with Cloudflare using randomness from LavaRand, of course!</p><p>While the League of Entropy started out as an experimental network, with the guidance and support from the drand team at <a href="https://protocol.ai/">Protocol Labs</a>, it’s become a reliable and production-ready core Internet service, relied upon by applications ranging from <a href="https://spec.filecoin.io/libraries/drand/">distributed file storage</a> to <a href="https://twitter.com/etherplay/status/1734875536608882799">online gaming</a> to <a href="https://medium.com/tierion/tierion-joins-the-league-of-entropy-replaces-nist-randomness-beacon-with-drand-in-chainpoint-9f3c32f0cd9b">timestamped proofs</a> to <a href="https://drand.love/docs/timelock-encryption/">timelock encryption</a> (discussed further below). The League of Entropy has also grown, and there are now 18 organizations across four continents participating in the drand network.</p><p>The League of Entropy’s drand beacons (each of which runs with different parameters, such as how frequently random values are produced and whether the randomness is <i>chained</i> – more on this below) have two important properties that contribute to their trustworthiness: they are <i>decentralized</i> and <i>verifiable</i>. 
Decentralization ensures that one or two bad actors cannot subvert or bias the randomness beacon, and verifiability allows anyone to check that the random values are produced according to the drand protocol and with participation from a threshold (at least half, but usually more) of the participants in the drand network. Thus, with each new member, the trustworthiness and reliability of the drand network continues to increase.</p><p>We give a brief overview of how drand achieves these properties using distributed key generation and threshold signatures below, but for an in-depth dive see our <a href="/inside-the-entropy">previous blog post</a> and some of the <a href="https://drand.love/blog/">excellent posts</a> from the drand team.</p>
    <div>
      <h3>Distributed key generation and threshold signatures</h3>
      <a href="#distributed-key-generation-and-threshold-signatures">
        
      </a>
    </div>
    <p>During the initial setup of a drand beacon, nodes in the network run a distributed key generation (DKG) protocol based on the <a href="https://en.wikipedia.org/wiki/Distributed_key_generation">Pedersen commitment scheme</a>, the result of which is that each node holds a “share” (a keypair) for a distributed group key, which remains fixed for the lifetime of the beacon. In order to do something useful with the group secret key like signing a message, at least a threshold (for example 7 out of 9) of nodes in the network must participate in constructing a <a href="https://en.wikipedia.org/wiki/BLS_digital_signature">BLS threshold signature</a>. The group information for the <a href="https://drand.love/blog/2023/10/16/quicknet-is-live/">quicknet</a> beacon on the League of Entropy’s mainnet drand network is shown below:</p>
            <pre><code>curl -s https://drand.cloudflare.com/52db9ba70e0cc0f6eaf7803dd07447a1f5477735fd3f661792ba94600c84e971/info | jq
{
  "public_key": "83cf0f2896adee7eb8b5f01fcad3912212c437e0073e911fb90022d3e760183c8c4b450b6a0a6c3ac6a5776a2d1064510d1fec758c921cc22b0e17e63aaf4bcb5ed66304de9cf809bd274ca73bab4af5a6e9c76a4bc09e76eae8991ef5ece45a",
  "period": 3,
  "genesis_time": 1692803367,
  "hash": "52db9ba70e0cc0f6eaf7803dd07447a1f5477735fd3f661792ba94600c84e971",
  "groupHash": "f477d5c89f21a17c863a7f937c6a6d15859414d2be09cd448d4279af331c5d3e",
  "schemeID": "bls-unchained-g1-rfc9380",
  "metadata": {
    "beaconID": "quicknet"
  }
}</code></pre>
            <p>(The hex value 52db9b… in the URL above is the hash of the beacon’s configuration. Visit <a href="https://drand.cloudflare.com/chains">https://drand.cloudflare.com/chains</a> to see all beacons supported by our mainnet drand nodes.)</p><p>The nodes in the network are configured to periodically (every 3s for quicknet) work together to produce a signature over some agreed-upon message, like the current round number and previous round signature (more on this below). Each node uses its share of the group key to produce a partial signature over the current round message, and broadcasts it to other nodes in the network. Once a node has enough partial signatures, it can aggregate them to produce a group signature for the given round.</p>
            <pre><code>curl -s https://drand.cloudflare.com/52db9ba70e0cc0f6eaf7803dd07447a1f5477735fd3f661792ba94600c84e971/public/13335 | jq
{
  "round": 13335,
  "randomness": "f4eb2e59448d155b1bc34337f2a4160ac5005429644ba61134779a8b8c6087b6",
  "signature": "a38ab268d58c04ce2d22b8317e4b66ecda5fa8841c7215bf7733af8dbaed6c5e7d8d60b77817294a64b891f719bc1b40"
}</code></pre>
            <p>The group signature for a round <i>is</i> the randomness (in the output above, the randomness value is simply the sha256 hash of the signature, for applications that prefer a shorter, fixed-size output). The signature is unpredictable in advance as long as enough (at least a majority, though the threshold can be configured to be higher) of the nodes in the drand network are honest and do not collude. Further, anyone can validate the signature for a given round using the beacon’s group public key. It’s recommended that developers use the drand client <a href="https://drand.love/developer/clients/">libraries</a> or <a href="https://drand.love/developer/drand-client/">CLI</a> to perform verification on every value obtained from the beacon.</p>
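<p>As a quick consistency check, the short randomness value can be recomputed locally from the signature shown above; note that this only checks the hash relationship and does not replace BLS verification of the signature itself:</p>

```python
import hashlib

def randomness_from_signature(signature_hex: str) -> str:
    # The "randomness" field is the sha256 hash of the raw signature bytes.
    return hashlib.sha256(bytes.fromhex(signature_hex)).hexdigest()

# The round 13335 signature from the output above.
sig = ("a38ab268d58c04ce2d22b8317e4b66ecda5fa8841c7215bf7733af8dbaed6c5e"
       "7d8d60b77817294a64b891f719bc1b40")
short = randomness_from_signature(sig)  # compare against the "randomness" field
```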
    <div>
      <h3>Chained vs unchained randomness</h3>
      <a href="#chained-vs-unchained-randomness">
        
      </a>
    </div>
    <p>When the League of Entropy launched its first generation of drand beacons in 2019, the per-round message over which the group signature was computed included the previous round’s signature. This creates a chain of randomness rounds all the way to the first “genesis” round. Chained randomness provides some nice properties for single-source randomness beacons, and is included as a requirement in <a href="https://csrc.nist.gov/projects/interoperable-randomness-beacons">NIST’s spec for interoperable public randomness beacons</a>.</p><p>However, back in 2022 the drand team introduced the notion of <a href="https://drand.love/blog/2022/02/21/multi-frequency-support-and-timelock-encryption-capabilities/#unchained-randomness-timed-encryption">unchained randomness</a>, where the message to be signed is <i>predictable</i> and doesn’t depend on any randomness from previous rounds, and showed that it provides the same security guarantees as chained randomness for the drand network (both require an honest threshold of nodes). In the implementation of unchained randomness in the <a href="https://drand.love/blog/2023/10/16/quicknet-is-live/">quicknet</a>, the message to be signed simply consists of the round number.</p>
            <pre><code># chained randomness
signature = group_sign(round || previous_signature)

# unchained randomness
signature = group_sign(round)</code></pre>
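<p>Concretely, drand hashes the per-round message before the threshold signature is computed. A sketch of the message construction, assuming the round number is encoded as an 8-byte big-endian integer and hashed with SHA-256 (check the drand beacon scheme specification before relying on this encoding):</p>

```python
import hashlib
import struct

def unchained_message(round_number: int) -> bytes:
    # Unchained (quicknet-style): the message depends only on the round number,
    # so it is predictable in advance.
    return hashlib.sha256(struct.pack(">Q", round_number)).digest()

def chained_message(round_number: int, previous_signature: bytes) -> bytes:
    # Chained: the previous round's signature is mixed in, linking every round
    # back to the genesis round.
    return hashlib.sha256(previous_signature + struct.pack(">Q", round_number)).digest()

m = unchained_message(13335)
```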
            <p>Unchained randomness provides some powerful properties and usability improvements. In terms of usability, a consumer of the randomness beacon does not need to reconstruct the full chain of randomness to the genesis round to fully validate a particular round – the only information needed is the current round number and the group public key. This provides much more flexibility for clients, as they can choose how frequently they consume randomness rounds without needing to continuously follow the randomness chain.</p><p>Since the messages to be signed are known in advance (since they’re just the round number), unchained randomness also unlocks a powerful new property: timelock encryption.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5PVw1hyLALNYG3p20U2f2D/eeac0fd2fe805cabc1b75055cc0b0076/image7-7.png" />
            
            </figure><p>Rotating double pendulums.</p>
    <div>
      <h2>Timelock encryption</h2>
      <a href="#timelock-encryption">
        
      </a>
    </div>
    <p>Timelock (or “timed-release”) encryption is a method for encrypting a message such that it cannot be decrypted until a certain amount of time has passed. Two basic approaches to timelock encryption were described by <a href="https://dspace.mit.edu/bitstream/handle/1721.1/149822/MIT-LCS-TR-684.pdf">Rivest, Shamir, and Wagner</a>:</p><blockquote><p>There are two natural approaches to implementing timed release cryptography:</p><ul><li><p>Use “time-lock puzzles” – computational problems that cannot be solved without running a computer continuously for at least a certain amount of time.</p></li><li><p>Use trusted agents who promise not to reveal certain information until a specified date.</p></li></ul><p>Using trusted agents has the obvious problem of ensuring that the agents are trustworthy. Secret sharing approaches can be used to alleviate this concern.</p></blockquote><p>The drand network is a group of independent agents using secret sharing for trustworthiness, and the ‘certain information’ not to be revealed until a specified date sounds a lot like the per-round randomness! We describe next how timelock encryption can be implemented on top of a drand network with unchained randomness, and finish with a practical demonstration. While we don’t delve into the bilinear groups and pairings-based cryptography that make this possible, if you’re interested we encourage you to read <a href="https://eprint.iacr.org/2023/189">tlock: Practical Timelock Encryption from Threshold BLS</a> by Nicolas Gailly, Kelsey Melissaris, and Yolan Romailler.</p>
    <div>
      <h3>How to timelock your secrets</h3>
      <a href="#how-to-timelock-your-secrets">
        
      </a>
    </div>
    <p>First, identify the randomness round that, once revealed, will allow your timelock-encrypted message to be decrypted. An important observation is that since drand networks produce randomness at fixed intervals, each round in a drand beacon is closely tied to a specific timestamp (modulo small delays for the network to actually produce the beacon), which can be easily computed by taking the beacon’s genesis timestamp and adding one beacon period for each elapsed round.</p><p>Once the round is decided upon, the properties of bilinear groups allow you to encrypt your message to some round with the drand beacon’s group public key.</p>
            <pre><code>ciphertext = EncryptToRound(msg, round, beacon_public_key)</code></pre>
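<p>The round-to-timestamp relationship described above can be sketched using the genesis_time (1692803367) and period (3 seconds) values from the quicknet info output earlier; this is an illustration of the arithmetic, not the official drand client logic:</p>

```python
GENESIS = 1692803367  # quicknet genesis_time (Unix seconds)
PERIOD = 3            # quicknet period (seconds); round 1 is produced at genesis

def time_of(round_number: int) -> int:
    # Timestamp at which a given round is scheduled to be produced.
    return GENESIS + (round_number - 1) * PERIOD

def round_at(timestamp: int) -> int:
    # Smallest round produced at or after the given timestamp: encrypting to
    # this round keeps the message sealed until then.
    if timestamp <= GENESIS:
        return 1
    return (timestamp - GENESIS + PERIOD - 1) // PERIOD + 1

target_round = round_at(time_of(1) + 60)  # roughly one minute after genesis
```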
            <p>After the nodes in the drand network cooperate to derive the randomness for the round (really, just the signature on the round number using the beacon’s group secret key), <i>anyone</i> can decrypt the ciphertext (this is where the magic of bilinear groups comes in).</p>
            <pre><code>random = Randomness(round)
message = Decrypt(ciphertext,random)</code></pre>
            <p>To make this practical, the timelocked message is actually the secret key for a symmetric scheme. This means that we encrypt the message with a symmetric key and timelock-encrypt that key, allowing for decryption in the future.</p><p>Now, for a practical demonstration of timelock encryption, we use a tool that one of our own engineers built on top of Cloudflare Workers. The <a href="https://github.com/thibmeu/tlock-worker">source code</a> is publicly available if you’d like to take a look under the hood at how it works.</p>
            <pre><code># 1. Create a file
echo "A message from the past to the future..." &gt; original.txt

# 2. Get the drand round 1 minute into the future (20 rounds) 
BEACON="52db9ba70e0cc0f6eaf7803dd07447a1f5477735fd3f661792ba94600c84e971"
ROUND=$(curl "https://drand.cloudflare.com/$BEACON/public/latest" | jq ".round+20")

# 3. Encrypt and require that round number
curl -X POST --data-binary @original.txt --output encrypted.pem https://tlock-worker.crypto-team.workers.dev/encrypt/$ROUND

# 4. Try to decrypt it (and only succeed 20 rounds x 3s later)
curl -X POST --data-binary @encrypted.pem --fail --show-error https://tlock-worker.crypto-team.workers.dev/decrypt</code></pre>
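<p>The hybrid pattern described above (encrypt the message symmetrically, timelock only the key) can be sketched as follows; stream_xor is a toy cipher for illustration and encrypt_key_to_round is a placeholder for the actual tlock operation:</p>

```python
import hashlib
import os

def stream_xor(key: bytes, data: bytes) -> bytes:
    # Toy symmetric cipher: XOR with a SHAKE256 keystream (illustration only;
    # a real deployment would use an authenticated cipher).
    keystream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, keystream))

def timelock_encrypt(message: bytes, round_number: int, encrypt_key_to_round):
    # Hybrid construction: a fresh symmetric key protects the message, and only
    # the short key is timelock-encrypted to the target round.
    key = os.urandom(32)
    locked_key = encrypt_key_to_round(key, round_number)
    return locked_key, stream_xor(key, message)
```

<p>Once the target round's randomness is published, anyone can unwrap the key and run the symmetric decryption, which is just stream_xor applied a second time.</p>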
            
    <div>
      <h2>What’s next?</h2>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>We hope you’ve enjoyed revisiting the tale of LavaRand as much as we have, and are inspired to visit one of Cloudflare’s offices in the future to see the randomness displays first-hand, and to use verifiable public randomness and timelock encryption from drand in your next project.</p><p>Chaos is required by the encryption that secures the Internet. LavaRand at Cloudflare will continue to turn the chaotic beauty of our physical world into a randomness stream – even as new sources are added – for novel uses all of us explorers – just like that snail – have yet to dream up.</p><p>And she gazed at the sky, the sea, the land<br>The waves and the caves and the golden sand.<br>She gazed and gazed, amazed by it all,<br>And she said to the whale, “I feel so small.”</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/aUx8oEz7t6W649nYlAmzD/f4658fe8a6b467804f2e6c21c9dec2cb/image1-30.png" />
            
            </figure><p>A snail on a whale.</p><div>
  
</div><p>Tune in for more news, announcements and thought-provoking discussions! Don't miss the full <a href="https://cloudflare.tv/shows/security-week">Security Week hub page</a>.</p> ]]></content:encoded>
            <category><![CDATA[Security Week]]></category>
            <category><![CDATA[Randomness]]></category>
            <category><![CDATA[LavaRand]]></category>
            <category><![CDATA[Research]]></category>
            <guid isPermaLink="false">2V4nElKOJ2taKnxH7Q9pw6</guid>
            <dc:creator>Cefan Daniel Rubin</dc:creator>
            <dc:creator>Luke Valenta</dc:creator>
            <dc:creator>Thibault Meunier</dc:creator>
        </item>
        <item>
            <title><![CDATA[Privacy Pass: upgrading to the latest protocol version]]></title>
            <link>https://blog.cloudflare.com/privacy-pass-standard/</link>
            <pubDate>Thu, 04 Jan 2024 16:07:22 GMT</pubDate>
            <description><![CDATA[ In this post, we explore the latest changes to Privacy Pass protocol. We are also excited to introduce a public implementation of the latest IETF draft of the Privacy Pass protocol — including a set of open-source templates that can be used to implement Privacy Pass Origins, Issuers, and Attesters ]]></description>
            <content:encoded><![CDATA[ <p></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2LZJxp89GI8PxGwGSPRQJL/9cfe61e756369dcad6cb78f5ad89ec1f/image9.png" />
            
            </figure>
    <div>
      <h2>Enabling anonymous access to the web with privacy-preserving cryptography</h2>
      <a href="#enabling-anonymous-access-to-the-web-with-privacy-preserving-cryptography">
        
      </a>
    </div>
    <p>The challenge of telling humans and bots apart is almost as old as the web itself. From online ticket vendors to dating apps, to ecommerce and finance — there are many legitimate reasons why you'd want to know if it's a person or a machine knocking on the front door of your website.</p><p>Unfortunately, the tools for doing this on the web have traditionally been clunky and have sometimes involved a bad user experience. None more so than the CAPTCHA — an irksome solution that humanity wastes a <a href="/introducing-cryptographic-attestation-of-personhood/">staggering</a> amount of time on. A more subtle but intrusive approach is IP tracking, which uses IP addresses to identify and take action on suspicious traffic, but that too can come with <a href="/consequences-of-ip-blocking/">unforeseen consequences</a>.</p><p>And yet, the problem of distinguishing legitimate human requests from automated bots remains as vital as ever. This is why for years Cloudflare has invested in the Privacy Pass protocol — a novel approach to establishing a user’s identity by relying on cryptography, rather than crude puzzles — all while providing a streamlined, privacy-preserving, and often frictionless experience to end users.</p><p>Cloudflare began <a href="/cloudflare-supports-privacy-pass/">supporting Privacy Pass</a> in 2017, with the release of browser extensions for Chrome and Firefox. Web admins with their sites on Cloudflare would have Privacy Pass enabled in the Cloudflare Dash; users who installed the extension in their browsers would see fewer CAPTCHAs on websites they visited that had Privacy Pass enabled.</p><p>Since then, Cloudflare <a href="/end-cloudflare-captcha/">stopped issuing CAPTCHAs</a>, and Privacy Pass has come a long way. Apple uses a version of Privacy Pass for its <a href="https://developer.apple.com/news/?id=huqjyh7k">Private Access Tokens</a> system which works in tandem with a device’s secure enclave to attest to a user’s humanity. 
And Cloudflare uses Privacy Pass as an important signal in our Web Application Firewall and Bot Management products — which means millions of websites natively offer Privacy Pass.</p><p>In this post, we explore the latest changes to Privacy Pass protocol. We are also excited to introduce a public implementation of the latest IETF draft of the <a href="https://www.ietf.org/archive/id/draft-ietf-privacypass-protocol-16.html">Privacy Pass protocol</a> — including a <a href="https://github.com/cloudflare?q=pp-&amp;type=all&amp;language=&amp;sort=#org-repositories">set of open-source templates</a> that can be used to implement Privacy Pass <a href="https://github.com/cloudflare/pp-origin"><i>Origins</i></a><i>,</i> <a href="https://github.com/cloudflare/pp-issuer"><i>Issuers</i></a>, and <a href="https://github.com/cloudflare/pp-attester"><i>Attesters</i></a>. These are based on Cloudflare Workers, and are the easiest way to get started with a new deployment of Privacy Pass.</p><p>To complement the updated implementations, we are releasing a new version of our Privacy Pass browser extensions (<a href="https://addons.mozilla.org/en-US/firefox/addon/privacy-pass/">Firefox</a>, <a href="https://chromewebstore.google.com/detail/privacy-pass/ajhmfdgkijocedmfjonnpjfojldioehi">Chrome</a>), which are rolling out with the name: <i>Silk - Privacy Pass Client</i>. Users of these extensions can expect to see fewer bot-checks around the web, and will be contributing to research about privacy preserving signals via a set of trusted attesters, which can be configured in the extension’s settings panel.</p><p>Finally, we will discuss how Privacy Pass can be used for an array of scenarios beyond differentiating bot from human traffic.</p><p><b>Notice to our users</b></p><ul><li><p>If you use the Privacy Pass API that controls Privacy Pass configuration on Cloudflare, you can remove these calls. 
This API is no longer needed since Privacy Pass is now included by default in our Challenge Platform. Out of an abundance of caution for our customers, we are doing a <a href="https://developers.cloudflare.com/fundamentals/api/reference/deprecations/">four-month deprecation notice</a>.</p></li><li><p>If you have the Privacy Pass extension installed, it should automatically update to <i>Silk - Privacy Pass Client</i> (<a href="https://addons.mozilla.org/en-US/firefox/addon/privacy-pass/">Firefox</a>, <a href="https://chromewebstore.google.com/detail/privacy-pass/ajhmfdgkijocedmfjonnpjfojldioehi">Chrome</a>) over the next few days. We have renamed it to keep the distinction clear between the protocol itself and a client of the protocol.</p></li></ul>
    <div>
      <h2>Brief history</h2>
      <a href="#brief-history">
        
      </a>
    </div>
    <p>In the last decade, we've seen the <a href="/next-generation-privacy-protocols/">rise of protocols</a> with privacy at their core, including <a href="/building-privacy-into-internet-standards-and-how-to-make-your-app-more-private-today/">Oblivious HTTP (OHTTP)</a>, <a href="/deep-dive-privacy-preserving-measurement/">Distributed aggregation protocol (DAP)</a>, and <a href="/unlocking-quic-proxying-potential/">MASQUE</a>. These protocols improve privacy when browsing and interacting with services online. By protecting users' privacy, these protocols also ask origins and website owners to revise their expectations around the data they can glean from user traffic. This might lead them to reconsider existing assumptions and mitigations around suspicious traffic, such as <a href="/consequences-of-ip-blocking/">IP filtering</a>, which often has unintended consequences.</p><p>In 2017, Cloudflare announced <a href="/cloudflare-supports-privacy-pass/">support for Privacy Pass</a>. At launch, this meant improving content accessibility for web users who would see a lot of interstitial pages (such as <a href="https://www.cloudflare.com/learning/bots/how-captchas-work/">CAPTCHAs</a>) when browsing websites protected by Cloudflare. Privacy Pass tokens provide a signal about the user’s capabilities to website owners while protecting their privacy by ensuring each token redemption is unlinkable to its issuance context. Since then, the technology has turned into a <a href="https://datatracker.ietf.org/wg/privacypass/documents/">fully fledged protocol</a> used by millions thanks to academic and industry effort. The existing browser extension accounts for hundreds of thousands of downloads. 
During the same time, Cloudflare has dramatically evolved the way it allows customers to challenge their visitors, being <a href="/end-cloudflare-captcha/">more flexible about the signals</a> it receives, and <a href="/turnstile-ga/">moving away from CAPTCHA</a> as a binary legitimacy signal.</p><p>Deployments of this research have led to a broadening of use cases, opening the door to different kinds of attestation. An attestation is a cryptographically-signed data point supporting facts. This can include a signed token indicating that the user has successfully solved a CAPTCHA, having a user’s hardware attest it’s untampered, or a piece of data that an attester can verify against another data source.</p><p>For example, in 2022, Apple hardware devices began to offer Privacy Pass tokens to websites who wanted to reduce how often they show CAPTCHAs, by using the hardware itself as an attestation factor. Before showing images of buses and fire hydrants to users, CAPTCHA providers can request a <a href="https://developer.apple.com/news/?id=huqjyh7k">Private Access Token</a> (PAT). This native support does not require installing extensions, or any user action to benefit from a smoother and more private web browsing experience.</p><p>Below is a brief overview of changes to the protocol we participated in:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3YImfph78oDPj3kgEcyvV6/37bcd89ffcfff8b636b00c8e931f3218/image8.png" />
            
            </figure><p>The timeline presents cryptographic changes, community inputs, and industry collaborations. These changes helped shape better standards for the web, such as VOPRF (<a href="https://www.rfc-editor.org/rfc/rfc9497">RFC 9497</a>), or RSA Blind Signatures (<a href="https://www.rfc-editor.org/rfc/rfc9474">RFC 9474</a>). In the next sections, we dive in the Privacy Pass protocol to understand its ins and outs.</p>
    <div>
      <h2>Anonymous credentials in real life</h2>
      <a href="#anonymous-credentials-in-real-life">
        
      </a>
    </div>
    <p>Before explaining the protocol in more depth, let's use an analogy. You are at a music festival. You bought your ticket online with a student discount. When you arrive at the gates, an agent scans your ticket, checks your student status, and gives you a yellow wristband and two drink tickets.</p><p>During the festival, you go in and out by showing your wristband. When a friend asks you to grab a drink, you pay with your tickets. One for your drink and one for your friend. You give your tickets to the bartender, they check the tickets, and give you a drink. What makes this interaction private is that the drink tickets cannot be traced back to you or your payment method, yet they can still be verified as unused and valid for the purchase of a drink.</p><p>In the web use case, the Internet is a festival. When you arrive at the gates of a website, an agent scans your request, and gives you a session cookie as well as two Privacy Pass tokens. They could have given you just one token, or more than two, but in our example ‘two tokens’ is the given website’s policy. You can use these tokens to attest to your humanity, to authenticate on certain websites, or even to confirm the legitimacy of your hardware.</p><p>Now, you might wonder: if this is a technique we have been using for years, why do we need fancy cryptography and standardization efforts? Well, unlike at a real-world music festival where most people don’t carry around photocopiers, on the Internet it is pretty easy to copy tokens. For instance, how do we stop people from using a token twice? We could put a unique number on each token, and check it is not spent twice, but that would allow the gate attendant to tell the bartender which numbers were linked to which person. So, we need cryptography.</p><p>When another website presents a challenge to you, you provide your Privacy Pass token and are then allowed to view a gallery of beautiful cat pictures. The difference from the festival is that this challenge might be interactive, which would be similar to the bartender giving you a numbered ticket that has to be signed by the agent before you get a drink. The website owner can verify that the token is valid but has no way of tracing or connecting the user back to the action that provided them with the Privacy Pass tokens. In Privacy Pass terminology, you are a Client, the website is an Origin, the agent is an Attester, and the bar an Issuer. The next section goes through these in more detail.</p>
    <div>
      <h2>Privacy Pass protocol</h2>
      <a href="#privacy-pass-protocol">
        
      </a>
    </div>
    <p>Privacy Pass specifies an extensible protocol for creating and redeeming anonymous and transferable tokens. In fact, Apple has its own implementation with Private Access Tokens (PAT), and later we will describe another implementation with the Silk browser extension. Given that PAT was the first implementation of the IETF-defined protocol, Privacy Pass is sometimes referred to as PAT in the literature.</p><p>The protocol is generic, and defines four components:</p><ul><li><p>Client: Web user agent with a Privacy Pass-enabled browser. This could be your <a href="/eliminating-captchas-on-iphones-and-macs-using-new-standard/">Apple device with PAT</a>, or your web browser with <a href="https://github.com/cloudflare/pp-browser-extension">the Silk extension installed</a>. Typically, this is the actor who is requesting content and is asked to share some attribute of themselves.</p></li><li><p>Origin: Serves content requested by the Client. The Origin trusts one or more Issuers, and presents Privacy Pass challenges to the Client. For instance, Cloudflare Managed Challenge is a Privacy Pass origin serving two Privacy Pass challenges: one for the Apple PAT Issuer, one for the Cloudflare Research Issuer.</p></li><li><p>Issuer: Signs Privacy Pass tokens upon request from a trusted party, either an Attester or a Client depending on the deployment model. Different Issuers have their own set of trusted parties, depending on the security level they are looking for, as well as their privacy considerations. An Issuer validating device integrity, for example, should support multiple attestation methods that vouch for this attribute, to acknowledge the diversity of Client configurations.</p></li><li><p>Attester: Verifies an attribute of the Client and, when satisfied, requests a signed Privacy Pass token from the Issuer to pass back to the Client. Before vouching for the Client, an Attester may ask the Client to complete a specific task. 
This task could be a CAPTCHA, a location check, age verification, or some other check that yields a single binary result. The Privacy Pass token then shares this one bit of information in an unlinkable manner.</p></li></ul><p>They interact as illustrated below.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7tX1xRQv6Ltif1NRj2fCOa/eeb412fa39d73e2232f4b062d95cd708/Frame-699-1-.png" />
            
            </figure><p>Let's dive into what's really happening with an example. The User wants to access an Origin, say store.example.com. This website has suffered attacks or abuse in the past, and it now uses Privacy Pass to help prevent them going forward. To that end, the Origin returns <a href="https://www.rfc-editor.org/rfc/rfc9110#field.www-authenticate">an authentication request</a> to the Client: <code>WWW-Authenticate: PrivateToken challenge="A==",token-key="B=="</code>. In this way, the Origin signals that it accepts tokens from the Issuer with public key “B==” to satisfy the challenge. That Issuer in turn trusts reputable Attesters to vouch for the Client not being an attacker by means of the presence of a cookie, CAPTCHA, Turnstile, or <a href="/introducing-cryptographic-attestation-of-personhood/">CAP challenge</a>, for example. For our example, let us say that the Client prefers the Turnstile method for accessibility reasons. The User’s browser prompts them to solve a Turnstile challenge. On success, it contacts the Issuer “B==” with that solution, and then replays the initial request to store.example.com, this time sending along the token header <code>Authorization: PrivateToken token="C=="</code>, which the Origin accepts before returning the desired content to the Client. And that’s it.</p><p>We’ve described the Privacy Pass authentication protocol. While Basic authentication (<a href="https://www.rfc-editor.org/rfc/rfc7617">RFC 7617</a>) asks you for a username and a password, the PrivateToken authentication scheme allows the browser to be more flexible on the type of check, while retaining privacy. The Origin store.example.com does not know your attestation method; it just knows you are reputable according to the token Issuer. In the same spirit, the Issuer "B==" does not see your IP, nor the website you are visiting. 
This separation between issuance and redemption, also referred to as unlinkability, is what <a href="https://www.ietf.org/archive/id/draft-ietf-privacypass-architecture-16.html">makes Privacy Pass private</a>.</p>
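<p>To make the header exchange concrete, here is a minimal sketch of how a Client-side script could parse the <code>WWW-Authenticate: PrivateToken</code> challenge shown above. The <code>parsePrivateTokenChallenge</code> helper is hypothetical and written for illustration only; real deployments should rely on a Privacy Pass library such as <code>@cloudflare/privacypass-ts</code>.</p>

```javascript
// Hypothetical helper: extract the parameters of a PrivateToken challenge.
// The header format matches the example in the text:
//   WWW-Authenticate: PrivateToken challenge="A==",token-key="B=="
function parsePrivateTokenChallenge(header) {
  if (!header.startsWith('PrivateToken ')) {
    throw new Error('not a PrivateToken challenge');
  }
  const params = {};
  // Collect every key="value" pair in the parameter list.
  for (const [, key, value] of header.matchAll(/([a-z-]+)="([^"]*)"/g)) {
    params[key] = value;
  }
  return params;
}

const challenge = parsePrivateTokenChallenge(
  'PrivateToken challenge="A==",token-key="B=="'
);
console.log(challenge['token-key']); // prints B==
```

<p>The <code>token-key</code> value is what lets the Client figure out which Issuer (and hence which Attester) to contact next.</p>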
    <div>
      <h2>Demo time</h2>
      <a href="#demo-time">
        
      </a>
    </div>
    <p>To put the above into practice, let’s see how the protocol works with Silk, a browser extension providing Privacy Pass support. First, download the relevant <a href="https://chromewebstore.google.com/detail/privacy-pass/ajhmfdgkijocedmfjonnpjfojldioehi">Chrome</a> or <a href="https://addons.mozilla.org/en-US/firefox/addon/privacy-pass/">Firefox</a> extension.</p><p>Then, head to <a href="https://demo-pat.research.cloudflare.com/login">https://demo-pat.research.cloudflare.com/login</a>. The page returns a 401 with the message “Privacy Pass Token not presented”. In fact, the Origin expects you to perform PrivateToken authentication. If you don’t have the extension installed, the flow stops here. If you do, the extension orchestrates the flow required to get you a token requested by the Origin.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2ZPDrhytZNVoB81Q7RILu5/7c115c9ed069aa09694373ec1adcc4d0/image10.png" />
            
            </figure><p>With the extension installed, you are directed to a new tab <a href="https://pp-attester-turnstile.research.cloudflare.com/challenge">https://pp-attester-turnstile.research.cloudflare.com/challenge</a>. This is a page provided by an Attester able to deliver you a token signed by the Issuer requested by the Origin. In this case, the Attester checks that you’re able to solve a Turnstile challenge.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7fmDWo3548oMK8jgZ7V0Kd/94ee9ab9bc1df6fee6e6a76dc4fb3e02/image2.png" />
            
            </figure><p>You click, and that’s it. The Turnstile challenge solution is sent to the Attester, which, upon validation, sends back a token from the requested Issuer. This page appears only briefly: once the extension has the token, the challenge page is no longer needed.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3KROIlp9njiXlfceDzRU7W/d1e306da3012c949e3fa5b80934f83a4/image11.png" />
            
            </figure><p>The extension, now having a token requested by the Origin, sends your initial request for a second time, with an Authorization header containing a valid Issuer PrivateToken. Upon validation, the Origin allows you in with a 200 Privacy Pass Token valid!</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3qOSkMc5wIqS50CuNNNoZY/b36b88ba01ffa1c5f4d78727e602062f/image3.png" />
            
            </figure><p>If you want to check behind the scenes, you can right-click on the extension logo and go to the preference/options page. It contains a list of attesters trusted by the extension, one per line. You can add your own attestation method (API described below). This allows the Client to decide on their preferred attestation methods.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/78BCHYQuOBC2aFlnPshu83/c6ee6b54d1d24b6f92f34577267a1146/image7.png" />
            
            </figure>
    <div>
      <h2>Privacy Pass protocol — extended</h2>
      <a href="#privacy-pass-protocol-extended">
        
      </a>
    </div>
    <p>The Privacy Pass protocol is new and not a standard yet, which implies that it’s not uniformly supported on all platforms. To improve flexibility beyond the existing standard proposal, we are introducing two mechanisms: an API for Attesters, and a replay API for web clients. The API for Attesters allows developers to build new attestation methods, which only need to provide their URL to interface with the Silk browser extension. The replay API for web clients is a mechanism that enables websites to cooperate with the extension to make PrivateToken authentication work on Chrome-based browsers.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2TLz1CPx9OHczqLabCRmyc/c54b0b4bb637a97812c637ca0eebc78c/image12.png" />
            
            </figure><p>Because more than one Attester may be supported on your machine, your Client needs to understand which Attester to use depending on the requested Issuer. As mentioned before, you as the Client do not communicate directly with the Issuer, because you don’t necessarily know its relation with the Attester, so you cannot retrieve its public key. To this end, the Attester API exposes all Issuers reachable by a given Attester via an endpoint: /v1/private-token-issuer-directory. This way, your client selects an appropriate Attester, one associated with an Issuer that the Origin trusts, before triggering a validation.</p><p>In addition, we propose a replay API. Its goal is to allow clients to fetch a resource a second time if the first response presented a Privacy Pass challenge. Some platforms do this automatically, like Silk on Firefox, but some don’t. That’s the case with the Silk Chrome extension for instance, which in its support of <a href="https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/manifest.json/manifest_version">manifest v3</a> cannot block requests and only supports Basic authentication in the onAuthRequired extension event. The Privacy Pass authentication scheme proposes that the request be sent once to get a challenge, and then a second time to get the actual resource. Between these requests to the Origin, the platform orchestrates the issuance of a token. To keep clients informed about the state of this process, we introduce a <code>private-token-client-replay: UUID</code> header alongside WWW-Authenticate. Using a platform-defined endpoint, this UUID informs web clients of the current state of authentication: pending, fulfilled, not-found.</p><p>To learn more about how you can use these today, and to deploy your own attestation method, read on.</p>
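<p>As a concrete illustration, the polling loop a web client might run against such a replay-state endpoint could look like the sketch below. The endpoint URL, query parameter, and JSON response shape are assumptions made for this example; only the three states (pending, fulfilled, not-found) come from the mechanism described above.</p>

```javascript
// Hypothetical sketch: poll a platform-defined endpoint with the UUID from
// the private-token-client-replay header until issuance settles.
// statusUrl and the { state } JSON shape are assumptions for illustration.
async function waitForToken(uuid, statusUrl, { intervalMs = 250, maxTries = 20 } = {}) {
  for (let attempt = 0; attempt < maxTries; attempt++) {
    const res = await fetch(`${statusUrl}?uuid=${encodeURIComponent(uuid)}`);
    const { state } = await res.json();
    if (state === 'fulfilled') return true;  // token obtained: safe to replay the request
    if (state === 'not-found') return false; // no issuance in progress for this UUID
    // Otherwise the state is 'pending': wait before polling again.
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
  return false;
}
```

<p>On a <code>true</code> result, the web client replays the original request, which the platform can now complete with an Authorization header.</p>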
    <div>
      <h2>How to use Privacy Pass today?</h2>
      <a href="#how-to-use-privacy-pass-today">
        
      </a>
    </div>
    <p>As seen in the section above, Privacy Pass is structured around four components: Origin, Client, Attester, Issuer. That’s why we created four repositories: <a href="https://github.com/cloudflare/pp-origin">cloudflare/pp-origin</a>, <a href="https://github.com/cloudflare/pp-browser-extension">cloudflare/pp-browser-extension</a>, <a href="https://github.com/cloudflare/pp-attester">cloudflare/pp-attester</a>, <a href="https://github.com/cloudflare/pp-issuer">cloudflare/pp-issuer</a>. In addition, the underlying cryptographic libraries are available at <a href="https://github.com/cloudflare/privacypass-ts">cloudflare/privacypass-ts</a>, <a href="https://github.com/cloudflare/blindrsa-ts">cloudflare/blindrsa-ts</a>, and <a href="https://github.com/cloudflare/voprf-ts">cloudflare/voprf-ts</a>. In this section, we dive into how to use each one of these depending on your use case.</p><blockquote><p>Note: All examples below are written in JavaScript and targeted at Cloudflare Workers. Privacy Pass is also implemented in <a href="https://github.com/ietf-wg-privacypass/base-drafts#existing-implementations">other languages</a> and can be deployed with a configuration that suits your needs.</p></blockquote>
    <div>
      <h3>As an Origin - website owners, service providers</h3>
      <a href="#as-an-origin-website-owners-service-providers">
        
      </a>
    </div>
    <p>You are an online service that people critically rely upon (health or messaging for instance). You want to provide private payment options to users to maintain your users’ privacy. You only have one subscription tier at $10 per month. You have <a href="https://datatracker.ietf.org/doc/html/draft-davidson-pp-architecture-00#autoid-60">heard</a> people are making privacy preserving apps, and want to use the latest version of Privacy Pass.</p><p>To access your service, users are required to prove they've paid for the service through a payment provider of their choosing (that you deem acceptable). This payment provider acknowledges the payment and requests a token for the user to access the service. As a sequence diagram, it looks as follows:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3CDt5NsDY4c2DuYbggdleT/c2084b1b7cb141a8b528de78392833b3/image4.png" />
            
            </figure><p>To implement it in Workers, we rely on the <a href="https://www.npmjs.com/package/@cloudflare/privacypass-ts"><code>@cloudflare/privacypass-ts</code></a> library, which can be installed by running:</p>
            <pre><code>npm i @cloudflare/privacypass-ts</code></pre>
            <p>This section is going to focus on the Origin work. We assume you have an Issuer up and running, which is described in a later section.</p><p>The Origin defines two flows:</p><ol><li><p>User redeeming token</p></li><li><p>User requesting a token issuance</p></li></ol>
            <pre><code>import { TokenResponse } from '@cloudflare/privacypass-ts'

// Static Issuer configuration: the Origin trusts this Issuer's public key.
const issuer = { publicKey: 'static issuer public key' }

const handleRedemption = (req) =&gt; {
    const token = TokenResponse.parse(req.headers.get('authorization'))
    const isValid = token.verify(issuer.publicKey)
    if (!isValid) {
        return new Response('Invalid token', { status: 401 })
    }
    return new Response('Privacy Pass Token valid')
}

const handleIssuance = () =&gt; {
    return new Response('Please pay to access the service', {
        status: 401,
        headers: { 'www-authenticate': 'PrivateToken challenge=, token-key=, max-age=300' }
    })
}

const handleAuth = (req) =&gt; {
    const authorization = req.headers.get('authorization')
    if (authorization &amp;&amp; authorization.startsWith(`PrivateToken token=`)) {
        return handleRedemption(req)
    }
    return handleIssuance(req)
}

export default {
    fetch(req: Request) {
        return handleAuth(req)
    }
}</code></pre>
            <p>From the user’s perspective, the overhead is minimal. Their client (possibly the Silk browser extension) receives a WWW-Authenticate header with the information required for a token issuance. Then, depending on their client configuration, they are taken to the payment provider of their choice to validate their access to the service.</p><p>With a successful response to the PrivateToken challenge a session is established, and the traditional web service flow continues.</p>
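<p>For illustration, the Client side of this challenge-then-replay exchange can be sketched as follows, assuming a hypothetical <code>getTokenFromAttester</code> callback that runs the issuance flow (in practice, the Silk extension or the platform handles that step).</p>

```javascript
// Sketch of a Client fetching a Privacy Pass-protected resource.
// getTokenFromAttester is a hypothetical placeholder for the issuance flow.
async function fetchWithPrivateToken(url, getTokenFromAttester) {
  // First attempt: an Origin that wants a token answers 401 with a
  // WWW-Authenticate: PrivateToken challenge.
  const first = await fetch(url);
  const challenge = first.headers.get('www-authenticate');
  if (first.status !== 401 || !challenge || !challenge.startsWith('PrivateToken ')) {
    return first; // no Privacy Pass challenge: nothing else to do
  }
  // Obtain a token for this challenge (issuance is out of scope here).
  const token = await getTokenFromAttester(challenge);
  // Replay the request, this time presenting the token to the Origin.
  return fetch(url, {
    headers: { authorization: `PrivateToken token="${token}"` },
  });
}
```

<p>Note that the resource is fetched twice: once to learn the challenge, once with the token attached.</p>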
    <div>
      <h3>As an Attester - CAPTCHA providers, authentication provider</h3>
      <a href="#as-an-attester-captcha-providers-authentication-provider">
        
      </a>
    </div>
    <p>You are the author of a new attestation method, such as <a href="/introducing-cryptographic-attestation-of-personhood/">CAP</a>, a new CAPTCHA mechanism, or a new way to validate cookie consent. You know that website owners already use Privacy Pass to trigger such challenges on the user side, and an Issuer is willing to trust your method because it guarantees a high security level. In addition, because of the Privacy Pass protocol, you never see which website your attestation is being used for.</p><p>So you decide to expose your attestation method as a Privacy Pass Attester. An Issuer with public key B== trusts you, and that's the Issuer you are going to request a token from. You can test this with the Yes/No Attester below, whose code is on the <a href="https://cloudflareworkers.com/#eedc5a7a6560c44b23a24cc1414b29d7:https://tutorial.cloudflareworkers.com/v1/challenge">Cloudflare Workers playground</a>.</p>
            <pre><code>const ISSUER_URL = 'https://pp-issuer-public.research.cloudflare.com/token-request'

const b64ToU8 = (b) =&gt;  Uint8Array.from(atob(b), c =&gt; c.charCodeAt(0))

const handleGetChallenge = (req) =&gt; {
    return new Response(`
    &lt;html&gt;
    &lt;head&gt;
      &lt;title&gt;Challenge Response&lt;/title&gt;
    &lt;/head&gt;
    &lt;body&gt;
    	&lt;button onclick="sendResponse('Yes')"&gt;Yes&lt;/button&gt;
		&lt;button onclick="sendResponse('No')"&gt;No&lt;/button&gt;
	&lt;/body&gt;
	&lt;script&gt;
	function sendResponse(choice) {
		fetch(location.href, { method: 'POST', headers: { 'private-token-attester-data': choice } })
	}
	&lt;/script&gt;
	&lt;/html&gt;
	`, { status: 401, headers: { 'content-type': 'text/html' } })
}

const handlePostChallenge = (req) =&gt; {
    const choice = req.headers.get('private-token-attester-data')
    if (choice !== 'Yes') {
        return new Response('Unauthorised', { status: 401 })
    }

    // hardcoded token request
    // debug here https://pepe-debug.research.cloudflare.com/?challenge=PrivateToken%20challenge=%22AAIAHnR1dG9yaWFsLmNsb3VkZmxhcmV3b3JrZXJzLmNvbSBE-oWKIYqMcyfiMXOZpcopzGBiYRvnFRP3uKknYPv1RQAicGVwZS1kZWJ1Zy5yZXNlYXJjaC5jbG91ZGZsYXJlLmNvbQ==%22,token-key=%22MIIBUjA9BgkqhkiG9w0BAQowMKANMAsGCWCGSAFlAwQCAqEaMBgGCSqGSIb3DQEBCDALBglghkgBZQMEAgKiAwIBMAOCAQ8AMIIBCgKCAQEApqzusqnywE_3PZieStkf6_jwWF-nG6Es1nn5MRGoFSb3aXJFDTTIX8ljBSBZ0qujbhRDPx3ikWwziYiWtvEHSLqjeSWq-M892f9Dfkgpb3kpIfP8eBHPnhRKWo4BX_zk9IGT4H2Kd1vucIW1OmVY0Z_1tybKqYzHS299mvaQspkEcCo1UpFlMlT20JcxB2g2MRI9IZ87sgfdSu632J2OEr8XSfsppNcClU1D32iL_ETMJ8p9KlMoXI1MwTsI-8Kyblft66c7cnBKz3_z8ACdGtZ-HI4AghgW-m-yLpAiCrkCMnmIrVpldJ341yR6lq5uyPej7S8cvpvkScpXBSuyKwIDAQAB%22
    const body = b64ToU8('AALoAYM+fDO53GVxBRuLbJhjFbwr0uZkl/m3NCNbiT6wal87GEuXuRw3iZUSZ3rSEqyHDhMlIqfyhAXHH8t8RP14ws3nQt1IBGE43Q9UinwglzrMY8e+k3Z9hQCEw7pBm/hVT/JNEPUKigBYSTN2IS59AUGHEB49fgZ0kA6ccu9BCdJBvIQcDyCcW5LCWCsNo57vYppIVzbV2r1R4v+zTk7IUDURTa4Mo7VYtg1krAWiFCoDxUOr+eTsc51bWqMtw2vKOyoM/20Wx2WJ0ox6JWdPvoBEsUVbENgBj11kB6/L9u2OW2APYyUR7dU9tGvExYkydXOfhRFJdKUypwKN70CiGw==')
    // You can perform some check here to confirm the body is a valid token request

    console.log('requesting token for tutorial.cloudflareworkers.com')
    return fetch(ISSUER_URL, {
      method: 'POST',
      headers: { 'content-type': 'application/private-token-request' },
      body: body,
    })
}

const handleIssuerDirectory = async () =&gt; {
    // These are fake issuers
    // Issuer data can be fetched at https://pp-issuer-public.research.cloudflare.com/.well-known/private-token-issuer-directory
    const TRUSTED_ISSUERS = {
        "issuer1": { "token-keys": [{ "token-type": 2, "token-key": "A==" }] },
        "issuer2": { "token-keys": [{ "token-type": 2, "token-key": "B==" }] },
    }
    return new Response(JSON.stringify(TRUSTED_ISSUERS), { headers: { "content-type": "application/json" } })
}

const handleRequest = (req) =&gt; {
    const pathname = new URL(req.url).pathname
    console.log(pathname, req.url)
    if (pathname === '/v1/challenge') {
        if (req.method === 'POST') {
            return handlePostChallenge(req)
        }
        return handleGetChallenge(req)
    }
    if (pathname === '/v1/private-token-issuer-directory') {
        return handleIssuerDirectory()
    }
    return new Response('Not found', { status: 404 })
}

addEventListener('fetch', event =&gt; {
    event.respondWith(handleRequest(event.request))
})</code></pre>
            <p>The validation method above simply checks whether the user selected Yes. Your method might be more complex; the wrapping stays the same.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5PnBuinoRKUpYjrBsHQbn/966c266e7de411503c5bf9a5dc9a184d/Screenshot-2024-01-04-at-10.30.04.png" />
            
            </figure><p><i>Screenshot of the Yes/No Attester example</i></p><p>Because users might have multiple Attesters configured for a given Issuer, we recommend that your Attester implement one additional endpoint exposing the keys of the Issuers it is in contact with. You can try this code on the <a href="https://cloudflareworkers.com/#4eeeef2fa895e519addb3ae442ee351d:https://tutorial.cloudflareworkers.com/v1/private-token-issuer-directory">Cloudflare Workers playground</a>.</p>
            <pre><code>const handleIssuerDirectory = () =&gt; {
    const TRUSTED_ISSUERS = {
        "issuer1": { "token-keys": [{ "token-type": 2, "token-key": "A==" }] },
        "issuer2": { "token-keys": [{ "token-type": 2, "token-key": "B==" }] },
    }
    return new Response(JSON.stringify(TRUSTED_ISSUERS), { headers: { "content-type": "application/json" } })
}

export default {
    fetch(req: Request) {
        const pathname = new URL(req.url).pathname
        if (pathname === '/v1/private-token-issuer-directory') {
            return handleIssuerDirectory()
        }
    }
}</code></pre>
            <p>Et voilà. You have an Attester that can be used directly with the Silk browser extension (<a href="https://addons.mozilla.org/en-US/firefox/addon/privacy-pass/">Firefox</a>, <a href="https://chromewebstore.google.com/detail/privacy-pass/ajhmfdgkijocedmfjonnpjfojldioehi">Chrome</a>). As you progress through your deployment, it can also be directly integrated into your applications.</p><p>If you would like a more advanced Attester and deployment pipeline, look at the <a href="https://github.com/cloudflare/pp-attester">cloudflare/pp-attester</a> template.</p>
    <div>
      <h3>As an Issuer - foundation, consortium</h3>
      <a href="#as-an-issuer-foundation-consortium">
        
      </a>
    </div>
    <p>We've mentioned the Issuer multiple times already. The role of an Issuer is to select a set of Attesters it wants to operate with, and communicate its public key to Origins. The whole cryptographic behavior of an Issuer is specified <a href="https://www.ietf.org/archive/id/draft-ietf-privacypass-protocol-16.html">by the IETF draft</a>. In contrast to the Client and Attesters, which have discretionary behavior, the Issuer is fully standardized. Its task is to choose a signal that is strong enough for the Origin, while preserving the privacy of Clients.</p><p>Cloudflare Research is operating a public Issuer for experimental purposes at <a href="https://pp-issuer-public.research.cloudflare.com">https://pp-issuer-public.research.cloudflare.com</a>. It is the simplest solution to start experimenting with Privacy Pass today. Once your project matures, you can consider joining a production Issuer, or deploying your own.</p><p>To deploy your own, you should:</p>
            <pre><code>git clone https://github.com/cloudflare/pp-issuer</code></pre>
            <p>Update wrangler.toml with your Cloudflare Workers account ID and zone ID. The open source Issuer API works as follows:</p><ul><li><p>/.well-known/private-token-issuer-directory returns the Issuer configuration. Note that it does not expose the non-standard token-key-legacy.</p></li><li><p>/token-request returns a token. This endpoint should be gated (by Cloudflare Access, for instance) to only allow trusted Attesters to call it.</p></li><li><p>/admin/rotate generates a new public key. It should only be accessible to your team, and be called before the Issuer becomes available.</p></li></ul><p>Then run <code>wrangler publish</code>, and you're good to onboard Attesters.</p>
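<p>As a sketch, the routing for these three endpoints could look like the following. The handler bodies are placeholders only; the real logic lives in <a href="https://github.com/cloudflare/pp-issuer">cloudflare/pp-issuer</a>.</p>

```javascript
// Minimal routing skeleton mirroring the three issuer endpoints listed above.
// Handler bodies are placeholders, not the real cryptographic logic.
function handleRequest(req) {
  const { pathname } = new URL(req.url);
  if (pathname === '/.well-known/private-token-issuer-directory') {
    // Public: the issuer configuration (token types and public keys).
    return new Response(JSON.stringify({ 'token-keys': [] }), {
      headers: { 'content-type': 'application/private-token-issuer-directory' },
    });
  }
  if (pathname === '/token-request') {
    // Gate this route (with Cloudflare Access, for instance) so that only
    // trusted Attesters can request token signatures.
    return new Response('token response placeholder');
  }
  if (pathname === '/admin/rotate') {
    // Restricted to your team: generates a new public key.
    return new Response('key rotated placeholder');
  }
  return new Response('Not found', { status: 404 });
}
```

<p>In a Worker, this function would be wired up with <code>export default { fetch: handleRequest }</code>.</p>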
    <div>
      <h2>Development of Silk extension</h2>
      <a href="#development-of-silk-extension">
        
      </a>
    </div>
    <p>Just like the protocol, the browser technology on which Privacy Pass was proven viable has changed as well. For five years, the protocol was deployed along with a browser extension for Chrome and Firefox. In 2021, Chrome released a new version of its extension configuration, usually referred to as <a href="https://developer.chrome.com/docs/extensions/mv3/intro/platform-vision/">Manifest version 3</a> (MV3). Chrome also started enforcing this new configuration for all newly released extensions.</p><p>Privacy Pass <i>the extension</i> is based on an agreed-upon Privacy Pass <a href="https://datatracker.ietf.org/doc/draft-ietf-privacypass-auth-scheme/"><i>authentication protocol</i></a>. Briefly looking at <a href="https://developer.chrome.com/docs/extensions/reference/webRequest/">Chrome’s API documentation</a>, we should be able to use the onAuthRequired event. However, with PrivateToken authentication not yet being a standard, browsers provide no hooks for extensions to add logic to this event.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1iQsRopHuLfmHqjsppwImc/1a379a0cdd3de3e17de04811b1c08ac0/Screenshot-2024-01-04-at-10.32.44.png" />
            
            </figure><p><i>Image available under CC-BY-SA 4.0 provided by</i> <a href="https://developer.chrome.com/docs/extensions/reference/webRequest/"><i>Google For Developers</i></a></p><p>The approach we decided to use is to define a client side replay API. When a response comes with 401 WWW-Authenticate PrivateToken, the browser lets it through, but triggers the private token redemption flow. The original page is notified when a token has been retrieved, and replays the request. For this second request, the browser is able to attach an authorization token, and the request succeeds. This is an active replay performed by the client, rather than a transparent replay done by the platform. A specification is available on <a href="https://github.com/cloudflare/pp-browser-extension#chrome-support-via-client-replay-api">GitHub</a>.</p><p>We are looking forward to the standard progressing, and simplifying this part of the project. This should improve diversity in attestation methods. As we see in the next section, this is key to identifying new signals that can be leveraged by origins.</p>
    <div>
      <h2>A standard for anonymous credentials</h2>
      <a href="#a-standard-for-anonymous-credentials">
        
      </a>
    </div>
    <p>IP remains a key identifier in anti-abuse systems. At the same time, IP fingerprinting techniques have become a bigger concern, and platforms have started to remove some of these ways of tracking users. To enable anti-abuse systems to avoid relying on IP while ensuring user privacy, Privacy Pass offers a reasonable alternative for dealing with potentially abusive or suspicious traffic. The attestation methods vary and can be chosen as needed for a particular deployment. For example, Apple decided to back their attestation with hardware when using Privacy Pass as the authorization technology for iCloud Private Relay. Another example is Cloudflare Research, which decided to deploy a Turnstile Attester to signal a successful solve for Cloudflare’s challenge platform.</p><p>In all these deployments, Privacy Pass-like technology has allowed for specific bits of information to be shared. Instead of sharing your location, past traffic, and possibly your name and phone number simply by connecting to a website, your device is able to prove specific information to a third party in a privacy-preserving manner. Which user information and attestation methods are sufficient to prevent abuse is an open question. We are looking to empower researchers with the release of this software to help in the quest for finding these answers. This could be via new experiments, such as testing out new attestation methods, or fostering other privacy protocols by providing a framework for specific information sharing.</p>
    <div>
      <h2>Future recommendations</h2>
      <a href="#future-recommendations">
        
      </a>
    </div>
    <p>Just as we expect this latest version of Privacy Pass to lead to new applications and ideas, we also expect further evolution of the standard and the clients that use it. Future development of Privacy Pass promises to cover topics like batch token issuance and rate limiting. From our work building and deploying this version of Privacy Pass, we have encountered limitations that we expect to be resolved in the future as well.</p><p>The division of labor between Attesters and Issuers, and the clear directions of trust relationships between the Origin and Issuer, and the Issuer and Attester, make reasoning about the implications of a breach of trust clear. Issuers can trust more than one Attester, but since many current deployments of Privacy Pass do not identify the Attester that led to issuance, a breach of trust in one Attester would render untrusted all tokens issued by any Issuer that trusts that Attester. This is because it would not be possible to tell which Attester was involved in the issuance process. Time will tell if this promotes a 1:1 correspondence between Attesters and Issuers.</p><p>The process of developing a browser extension supported by both Firefox and Chrome-based browsers can at times require quite baroque (and brittle) code paths. Privacy Pass the protocol seems a good fit for an extension of the <a href="https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/onAuthRequired">webRequest.onAuthRequired</a> browser event. Just as Privacy Pass appears as an alternate authentication message in the WWW-Authenticate HTTP header, browsers could fire the onAuthRequired event for PrivateToken authentication too, and allow request blocking within the onAuthRequired event. This seems a natural evolution of the use of this event, which is currently limited to the now rather long-in-the-tooth Basic authentication.</p>
    <div>
      <h2>Conclusion</h2>
      <a href="#conclusion">
        
      </a>
    </div>
    <p>Privacy Pass provides a solution to one of the longstanding challenges of the web: anonymous authentication. By leveraging cryptography, the protocol allows websites to get the information they need from users, and solely this information. It's already used by millions to help distinguish human requests from automated bots in a manner that is privacy protective and often seamless. We are excited by the protocol’s broad and growing adoption, and by the novel use cases that are unlocked by this latest version.</p><p>Cloudflare’s Privacy Pass implementations are available on GitHub, and are compliant with the standard. We have open-sourced a <a href="https://github.com/cloudflare?q=pp-&amp;type=all&amp;language=&amp;sort=#org-repositories">set of templates</a> that can be used to implement Privacy Pass <a href="https://github.com/cloudflare/pp-origin"><i>Origins</i></a><i>,</i> <a href="https://github.com/cloudflare/pp-issuer"><i>Issuers</i></a>, and <a href="https://github.com/cloudflare/pp-attester"><i>Attesters</i></a>, which leverage Cloudflare Workers to get up and running quickly.</p><p>For those looking to try Privacy Pass out for themselves right away, download the <i>Silk - Privacy Pass Client</i> browser extensions (<a href="https://addons.mozilla.org/en-US/firefox/addon/privacy-pass/">Firefox</a>, <a href="https://chromewebstore.google.com/detail/privacy-pass/ajhmfdgkijocedmfjonnpjfojldioehi">Chrome</a>, <a href="https://github.com/cloudflare/pp-browser-extension">GitHub</a>) and start browsing a web with fewer bot checks today.</p> ]]></content:encoded>
            <category><![CDATA[Research]]></category>
            <category><![CDATA[Privacy Pass]]></category>
            <category><![CDATA[Firefox]]></category>
            <category><![CDATA[Chrome]]></category>
            <category><![CDATA[Privacy]]></category>
            <guid isPermaLink="false">47vZ5BZfqt5cU38XabKyUA</guid>
            <dc:creator>Thibault Meunier</dc:creator>
            <dc:creator>Cefan Daniel Rubin</dc:creator>
            <dc:creator>Armando Faz-Hernández</dc:creator>
        </item>
        <item>
            <title><![CDATA[Serving Cloudflare Pages sites to the IPFS network]]></title>
            <link>https://blog.cloudflare.com/cloudflare-pages-on-ipfs/</link>
            <pubDate>Mon, 16 May 2022 12:57:44 GMT</pubDate>
            <description><![CDATA[ Today, we're announcing we're bridging the two. We will make it possible for our customers to serve their sites on the IPFS network ]]></description>
            <content:encoded><![CDATA[ <p></p><p>Four years ago, <a href="/distributed-web-gateway/">Cloudflare went Interplanetary</a> by offering a gateway to the IPFS network. This meant that if you hosted content on IPFS, we offered to make it available to every user of the Internet through HTTPS and with Cloudflare protection. IPFS allows you to choose a storage provider you are comfortable with, while providing a standard interface for Cloudflare to serve this data.</p><p>Since then, businesses have new tools to streamline web development. <a href="https://workers.dev">Cloudflare Workers</a>, <a href="https://pages.cloudflare.com">Pages</a>, and <a href="/introducing-r2-object-storage/">R2</a> are enabling developers to bring services online in a matter of minutes, with built-in scaling, security, and analytics.</p><p>Today, we're announcing we're bridging the two. We will make it possible for our customers to serve their sites on the IPFS network.</p><p>In this post, we'll learn how you will be able to build your website with Cloudflare Pages, and leverage the IPFS integration to make your content accessible and available across multiple providers.</p>
    <div>
      <h2>A primer on IPFS</h2>
      <a href="#a-primer-on-ipfs">
        
      </a>
    </div>
    <p>The InterPlanetary FileSystem (IPFS) is a peer-to-peer network for storing content on a distributed file system. It is composed of a set of computers called nodes that store and relay content using a common addressing system. In short, a set of participants agree to maintain a shared index of the content the network can provide, and of where to find it.</p><p>Let's take two random participants in the network: Alice, a cat person, and Bob, a dog person.</p><p>As a participant in the network, Alice keeps connections with a subset of participants, referred to as her peers. When Alice makes her content 🐱 available on IPFS, she announces to her peers that she has content 🐱, and these peers record in their routing tables that 🐱 is provided by Alice's node.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6BzC2QWnSrnF6lIqzMd0S8/df749e6c4e59acccb288c078235cab46/image7-8.png" />
            
            </figure><p>Each node has a routing table, and a datastore. The routing table retains a mapping of content to peers, and the datastore retains the content a given node stores. In the above case, only Alice has content, a 🐱.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1JKs4dKwaokLGutycmasSr/ee51d2f5c4b313c920f8f9ff22778b55/image1-41.png" />
            
            </figure><p>When Bob wants to retrieve 🐱, he tells his peers he is looking for 🐱. These peers point him to Alice. Alice then provides 🐱 to Bob. Bob can verify that 🐱 is the content he was looking for, because the content identifier he requested is derived from the 🐱 content itself, using a secure, cryptographic hash function.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6kYdWRivzzXiVbC9LsDWPf/9bdcb0ae5393a7c3551b7395ab5efd61/image3-28.png" />
            
            </figure><p>Even better, if Bob becomes a cat person, he can announce to his peers that he is also providing a cat. Bob's love for cats could be genuine, or he could have an interest in providing the content, such as a contract with Alice. IPFS provides a common ground to share content, without being opinionated on how this content has to be stored or what guarantees it comes with.</p>
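The exchange above can be sketched in a few lines of Python. This is a toy model rather than IPFS itself: real CIDs encode a codec and a multihash rather than a bare SHA-256 digest, and real routing uses a DHT, but the announce, lookup, and verify steps have the same shape.

```python
import hashlib

def cid(content: bytes) -> str:
    # Real IPFS CIDs encode a codec and a multihash; a bare SHA-256
    # hex digest stands in for that here.
    return hashlib.sha256(content).hexdigest()

class Node:
    def __init__(self, name: str):
        self.name = name
        self.datastore = {}      # cid -> content this node stores
        self.routing_table = {}  # cid -> node believed to provide it
        self.peers = []

    def provide(self, content: bytes) -> str:
        """Store content locally and announce it to peers."""
        c = cid(content)
        self.datastore[c] = content
        for peer in self.peers:
            peer.routing_table[c] = self
        return c

    def fetch(self, c: str) -> bytes:
        """Ask peers who provides `c`, retrieve it, and verify the hash."""
        if c in self.datastore:
            return self.datastore[c]
        for peer in self.peers:
            provider = peer.routing_table.get(c)
            if provider is not None and c in provider.datastore:
                content = provider.datastore[c]
                assert cid(content) == c  # the CID is derived from the content
                return content
        raise KeyError("content not found")

alice, bob, relay = Node("Alice"), Node("Bob"), Node("Relay")
alice.peers, bob.peers = [relay], [relay]
cat = alice.provide(b"a cat")
assert bob.fetch(cat) == b"a cat"
```

Because the identifier is the hash of the content, Bob does not need to trust the provider: any node, Alice or otherwise, can serve 🐱 and the verification step still holds.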
    <div>
      <h2>How Pages websites are made available on IPFS</h2>
      <a href="#how-pages-websites-are-made-available-on-ipfs">
        
      </a>
    </div>
    <p>Content is made available as follows.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7y0PyVwr6OWWm4Ql8BDeZz/22401175f3cc9f85635a155ad1b4c78c/image2-37.png" />
            
            </figure><p>The components are:</p><ul><li><p>Pages storage: Storage solution for Cloudflare Pages.</p></li><li><p>IPFS Index Proxy: Service maintaining a map between IPFS CIDs and the location of the data. It runs on Cloudflare Workers and uses Workers KV to store the mapping.</p></li><li><p>IPFS node: Cloudflare-hosted IPFS node serving Pages content. It has an in-house datastore module, able to communicate with the IPFS Index Proxy.</p></li><li><p>IPFS network: The rest of the IPFS network.</p></li></ul><p>When you opt in to serving your Cloudflare Pages site on IPFS, a call is made to the IPFS Index Proxy. This call first fetches your Pages content, computes a CID from it, then both populates IndexDB to associate the CID with the content and reaches out to Cloudflare's IPFS nodes to tell them they can provide the CID.</p><p>For example, imagine your website structure is as follows:</p><ul><li><p>/</p><ul><li><p>index.html</p></li><li><p>static/</p><ul><li><p>cats.txt</p></li><li><p>beautiful_cats.txt</p></li></ul></li></ul></li></ul><p>To provide this website on IPFS, Cloudflare has to compute a CID for <code>/</code>. CIDs are built recursively: to compute the CID of the folder <code>/</code>, one needs the CIDs of <code>index.html</code> and <code>static/</code>. The CID of <code>index.html</code> is derived from its binary representation, and that of <code>static/</code> from the CIDs of <code>cats.txt</code> and <code>beautiful_cats.txt</code>.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7stjDpQPAhOugUqBcwRV8H/1bf1f181f61793fa0d6c1d1c2a860d77/image6-13.png" />
            
            </figure><p>This works similarly to a Merkle tree, except nodes can reference each other as long as they still form a Directed Acyclic Graph (DAG). This structure is therefore referred to as a <a href="https://docs.ipfs.io/concepts/merkle-dag/">MerkleDAG</a>.</p><p>In our example, it's not unlikely for <code>cats.txt</code> and <code>beautiful_cats.txt</code> to have data in common. In this case, Cloudflare can be smart in the way it builds the MerkleDAG.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6mqAMRlu7HoiESker0Tiur/49e0b2c53cdf847dc283391335408f15/image4-22.png" />
            
            </figure><p>This reduces the storage and bandwidth requirement when accessing the website on IPFS.</p><p><i>If you want to learn more about how you can model a file system on IPFS, you can check the</i> <a href="https://github.com/ipfs/specs/blob/master/UNIXFS.md"><i>UnixFS</i></a> <i>specification.</i></p><p>Cloudflare stores every CID and the content it references in IndexDB. This allows Cloudflare IPFS nodes to serve Cloudflare Pages assets when requested.</p>
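The recursive CID construction can be sketched as follows. This is a hypothetical simplification of the MerkleDAG encoding (the file contents are made up for the example): each block is keyed by the hash of its bytes, so two files with identical content collapse into a single stored block, which is the deduplication described above.

```python
import hashlib

def cid(data: bytes) -> str:
    # Stand-in for a real CID: a bare SHA-256 hex digest.
    return hashlib.sha256(data).hexdigest()

def dag_cid(node, store: dict) -> str:
    """Compute a CID for a file (bytes) or a directory (dict of name -> node),
    storing each block in `store` keyed by its CID."""
    if isinstance(node, bytes):
        block = node
    else:
        # A directory block references its children by (name, child CID),
        # like links in a MerkleDAG.
        links = sorted((name, dag_cid(child, store)) for name, child in node.items())
        block = repr(links).encode()
    c = cid(block)
    store[c] = block  # identical blocks map to the same key: free deduplication
    return c

store = {}
site = {
    "index.html": b"<html>hello</html>",
    "static": {
        "cats.txt": b"cats are great",
        "beautiful_cats.txt": b"cats are great",  # same bytes as cats.txt
    },
}
root = dag_cid(site, store)
```

On this layout the store ends up holding four blocks for five tree nodes, since <code>cats.txt</code> and <code>beautiful_cats.txt</code> share one block.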
    <div>
      <h2>Let's put this together</h2>
      <a href="#lets-put-this-together">
        
      </a>
    </div>
    <p>Let’s take an example: pages-on-ipfs.com is hosted on IPFS. It is built using Hugo, a static site generator, and Cloudflare Pages with the <a href="https://developers.cloudflare.com/pages/framework-guides/deploy-a-hugo-site/">public documentation template</a>. Its source is available on <a href="https://github.com/thibmeu/pages-on-ipfs">GitHub</a>. If you have an IPFS-compatible client, you can access it at ipns://pages-on-ipfs.com as well.</p><p>1. Read the Cloudflare Pages deployment documentation</p><p>For the purpose of this blog, I want to create a simple Cloudflare Pages website. I have experience with Hugo, so I choose it as my framework for the project.</p><p>I type "<a href="https://lmddgtfy.net/?q=cloudflare%20pages">cloudflare pages</a>" in the search bar of my web browser, then head to Read the docs &gt; Framework Guide &gt; <a href="https://developers.cloudflare.com/pages/framework-guides/deploy-a-hugo-site/">Deploy a Hugo site</a>.</p><p>2. Create a site</p><p>This is where I use Hugo, and your configuration might differ. In short, I run <code>hugo new site pages-on-ipfs</code>, create an index and two static resources, et voilà. The result is available in the project's <a href="https://github.com/thibmeu/pages-on-ipfs">GitHub repository</a>.</p><p>3. Deploy using Cloudflare Pages</p><p>On the Cloudflare Dashboard, I go to Account Home &gt; Pages &gt; Create a project. I select the GitHub repository I created, and configure the build tool to build the Hugo site. Basically, I follow the <a href="https://developers.cloudflare.com/pages/framework-guides/deploy-a-hugo-site/#deploying-with-cloudflare-pages">Cloudflare Pages documentation</a>.</p><p>Upon clicking Save and Deploy, my website is deployed on pages-on-ipfs.pages.dev. I also configure it to be available at pages-on-ipfs.com.</p><p>4. Serve my content on IPFS</p><p>First, I opt my zone in to the Cloudflare Pages integration with IPFS. This feature is not yet available for everyone to try out.</p><p>This allows Cloudflare to index the content of my website. Once indexed, I get the CID for my deployment, baf...1. I can check that my content is available at this hash on IPFS using an IPFS gateway: <a href="https://bafybeig7hluox6xefqdgmwcntvsguxcziw2oeogg2fbvygex2aj6qcfo64.ipfs.cf-ipfs.com">https://baf...1.ipfs.cf-ipfs.com/</a>.</p><p>5. Make my IPFS website available at pages-on-ipfs.com</p><p>Having one domain name that serves both the Cloudflare Pages and IPFS versions, depending on whether the client supports IPFS, is ideal. Fortunately, the IPFS ecosystem supports such a feature via DNSLink. DNSLink is a way to specify the IPFS content a domain is associated with.</p><p>For pages-on-ipfs.com, I create a TXT record on _dnslink.pages-on-ipfs.com with the value dnslink=/ipfs/baf...1. Et voilà. I can now access ipns://pages-on-ipfs.com via an IPFS client.</p><p>6. (Optional) Replicate my website</p><p>The content of my website can now easily be replicated and <a href="https://docs.ipfs.io/how-to/pin-files/">pinned</a> by other IPFS nodes. This can be done either at home via an IPFS client or using a pinning service such as <a href="https://www.pinata.cloud/">Pinata</a>.</p>
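The DNSLink convention is simple enough to sketch: the TXT record value is the string dnslink= followed by an IPFS or IPNS path. Below is a minimal parser for that value; it assumes the record has already been fetched (a real client would first query the TXT record on _dnslink.&lt;domain&gt; over DNS), and the record shown is a hypothetical example.

```python
def parse_dnslink(txt_value: str) -> str:
    """Extract the content path from a _dnslink TXT record value,
    e.g. 'dnslink=/ipfs/<cid>' or 'dnslink=/ipns/<name>'."""
    prefix = "dnslink="
    if not txt_value.startswith(prefix):
        raise ValueError("not a DNSLink record")
    path = txt_value[len(prefix):]
    if not (path.startswith("/ipfs/") or path.startswith("/ipns/")):
        raise ValueError("unsupported DNSLink path")
    return path

# Hypothetical record value, as would be found on _dnslink.<domain>:
record = "dnslink=/ipfs/bafybeig7hluox6xefqdgmwcntvsguxcziw2oeogg2fbvygex2aj6qcfo64"
path = parse_dnslink(record)
```

An IPFS client resolving the domain would then fetch the returned /ipfs/ path as usual; a /ipns/ path adds one more resolution step.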
    <div>
      <h2>What's next</h2>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>We'll make this service available later this year as it is refined. We are committed to making information move freely and to helping build a better Internet. Cloudflare Pages' work of solving developer problems continues, as developers are now able to make their sites accessible to more users.</p><p>Over the years, IPFS has been used by more and more people. While Cloudflare started by making IPFS content available to web users through an HTTP interface, we now think it's time to give back. Allowing Cloudflare-hosted assets to be served over a public distributed network extends what developers and users can do on an open web.</p>
    <div>
      <h2>Common questions</h2>
      <a href="#common-questions">
        
      </a>
    </div>
    <ul><li><p>I am already hosting my website on IPFS. Can I pin it using Cloudflare?</p><ul><li><p>No. This project aims to serve <a href="https://www.cloudflare.com/developer-platform/solutions/hosting/">Cloudflare-hosted content</a> via IPFS. We are still investigating how to best replicate and re-provide already-existing IPFS content via Cloudflare infrastructure.</p></li></ul></li><li><p>Does this make IPFS more centralized?</p><ul><li><p>No. Cloudflare does not have the authority to decide who can be a node operator or what content they provide.</p></li></ul></li><li><p>Are there guarantees the content is going to be available?</p><ul><li><p>Yes. As long as you choose Cloudflare to host your website on IPFS, it will be available on IPFS. Should you move to another provider, it would be your responsibility to make sure the content remains available. IPFS allows for this transition to be smooth via a pinning service.</p></li></ul></li><li><p>Is IPFS private?</p><ul><li><p>It depends. Generally, no. IPFS is a p2p protocol. The nodes you peer with and request content from know what content you are looking for. Your privacy depends on how much you trust your peers not to snoop on the data you request.</p></li></ul></li><li><p>Can users verify the integrity of my website?</p><ul><li><p>Yes. They need to access your website through an IPFS-compatible client. Ideally, IPFS content integrity would be turned into a web standard, similar to <a href="https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity">subresource integrity</a>.</p></li></ul></li></ul><p></p> ]]></content:encoded>
            <category><![CDATA[Platform Week]]></category>
            <category><![CDATA[Cloudflare Pages]]></category>
            <category><![CDATA[Distributed Web]]></category>
            <category><![CDATA[Research]]></category>
            <category><![CDATA[IPFS]]></category>
            <guid isPermaLink="false">yHRMEOkly3EmYimxcRF3u</guid>
            <dc:creator>Thibault Meunier</dc:creator>
        </item>
        <item>
            <title><![CDATA[Gaining visibility in IPFS systems]]></title>
            <link>https://blog.cloudflare.com/ipfs-measurements/</link>
            <pubDate>Mon, 16 May 2022 12:57:39 GMT</pubDate>
            <description><![CDATA[ We've developed the IPFS Gateway monitor, an observability tool that runs various IPFS scenarios on a given gateway endpoint.  ]]></description>
            <content:encoded><![CDATA[ <p></p><p>We have been operating an IPFS gateway for the last four years. It started as a <a href="/distributed-web-gateway/">research experiment in 2018</a>, providing <a href="/e2e-integrity/">end-to-end integrity with IPFS</a>. A year later, we made <a href="/continuing-to-improve-our-ipfs-gateway/">IPFS content faster to fetch</a>. Last year, we announced we were committed to <a href="/announcing-web3-gateways/">making the IPFS gateway part of our product offering</a>. Throughout this process, we needed to know how our setup performed in order to inform our design decisions.</p><p>To this end, we've developed the IPFS Gateway monitor, an observability tool that runs various IPFS scenarios against a given gateway endpoint. In this post, you'll learn how we use this tool, and we'll go over discoveries we made along the way.</p>
    <div>
      <h2>Refresher on IPFS</h2>
      <a href="#refresher-on-ipfs">
        
      </a>
    </div>
    <p>IPFS is a distributed system for storing and accessing files, websites, applications, and data. It is different from a traditional centralized file system in that IPFS is completely distributed: any participant can join and leave at any time without a loss of overall performance.</p><p>However, users cannot access files on IPFS with just a web browser: they need to run an IPFS node that fetches files using IPFS's own protocol. IPFS gateways exist to bridge this gap, enabling users to access IPFS content with nothing more than a web browser.</p><p>Cloudflare provides an IPFS gateway at <a href="https://cloudflare-ipfs.com">https://cloudflare-ipfs.com</a>, so anyone can access IPFS files simply by using the gateway URL in their browser.</p><p>As IPFS and the Cloudflare IPFS Gateway have become more and more popular, we need a way to know how performant the gateway is: how much time it takes to retrieve IPFS-hosted content, and how reliable it is. However, IPFS gateways are not like normal websites, which only receive HTTP requests and return HTTP responses. The gateways run IPFS nodes internally and sometimes perform content routing and peer routing to find the nodes that provide IPFS content. They sometimes also need to resolve IPNS names to discover the content. So, to measure performance, we need to run measurements many times, across many scenarios.</p>
    <div>
      <h2>Enter the IPFS Gateway monitor</h2>
      <a href="#enter-the-ipfs-gateway-monitor">
        
      </a>
    </div>
    <p><a href="https://github.com/cloudflare/ipfs-gateway-monitor">IPFS Gateway monitor</a> is this tool. It allows anyone to check the performance of their gateway and export the results to the Prometheus monitoring system.</p><p>This monitor is composed of three independent binaries:</p><ol><li><p><code>ipfs-gw-measure</code> is the tool that calls the gateway URL and runs the measurement scenarios.</p></li><li><p><code>ipfs-gw-monitor</code> is a way to call the measurement tool multiple times.</p></li><li><p>A Prometheus exporter exposes Prometheus-readable metrics.</p></li></ol><p>To interact with the IPFS network, the codebase also provides a way to start an IPFS node.</p><p>A scenario is a set of instructions a user performs given the state of our IPFS system. For instance, we want to know how fast newly uploaded content can be found by the gateway, or whether popular content has a low query time. We'll discuss more of this in the next section.</p><p>Putting this together, the system is the following.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7nUi5dEwj15NtiSduijehE/75a34120b8d85e155bd2c21c83c5e155/image4-20.png" />
            
            </figure><p>During this experiment, we operated both the IPFS monitor and a test IPFS node. The IPFS node allows the monitor to provide content to the IPFS network, which lets us be sure that the content provided is fresh and actually hosted. Peers were not fixed; we relied on the default IPFS peer discovery mechanism.</p><p>The Cloudflare IPFS Gateway is treated as an opaque system. The monitor performs end-to-end tests by contacting the gateway via its public API, either <a href="https://cloudflare-ipfs.com">https://cloudflare-ipfs.com</a> or a custom domain registered with the gateway.</p><p>The following scenarios were run on consumer hardware in March. They are not representative of all IPFS users. All values provided below have been sourced against Cloudflare’s IPFS gateway endpoint.</p>
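A single end-to-end probe boils down to timing one request against the gateway. The sketch below is a hypothetical simplification of what a monitor scenario does; the probe function and the stand-in fetch callable are ours, and a real probe would issue an HTTP GET against the gateway (for example, https://cloudflare-ipfs.com/ipfs/&lt;cid&gt;) with a client-side timeout.

```python
import time

def probe(fetch):
    """Time one end-to-end gateway request.
    `fetch` is any callable performing the retrieval; returns (ok, seconds).
    A failure (e.g. the gateway timing out on unavailable content) is
    recorded rather than raised, so the monitor can keep running."""
    start = time.monotonic()
    try:
        fetch()
        ok = True
    except Exception:
        ok = False
    return ok, time.monotonic() - start

# Stand-in fetch: sleep instead of a network call, so the sketch is runnable.
ok, seconds = probe(lambda: time.sleep(0.01))
```

Repeating such probes across the scenarios below, and exporting the (ok, seconds) pairs as metrics, gives the min/max/avg tables that follow.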
    <div>
      <h3>First scenarios: Gateway to serve immutable IPFS contents</h3>
      <a href="#first-scenarios-gateway-to-serve-immutable-ipfs-contents">
        
      </a>
    </div>
    
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/qrn6V5qZnqdJzL5V1qzFl/c9bf76a05f738831ba8737918b062e79/image3-26.png" />
            
            </figure><p>IPFS contents are the most primitive contents served by the IPFS network and IPFS gateways. By their nature, they are immutable and addressable only by the hash of the content. Users can access IPFS contents by putting the Content IDentifier (CID) in the URL path of the gateway. For example, <a href="https://cloudflare-ipfs.com/ipfs/bafybeig7hluox6xefqdgmwcntvsguxcziw2oeogg2fbvygex2aj6qcfo64">ipfs://bafybeig7hluox6xefqdgmwcntvsguxcziw2oeogg2fbvygex2aj6qcfo64</a>. We measure three common scenarios that users will often encounter.</p><p>The first one is when users fetch popular content, which has a high probability of already being found in our cache. During our experiment, we measured a response time of around 116 ms for such content.</p><p>The second one is the case where users create and upload content to the IPFS network, such as via the integration between <a href="/cloudflare-pages-on-ipfs/">Cloudflare Pages and IPFS</a>. This scenario is a lot slower than the first because the content is not in our cache yet, and it takes some time to discover it. The content we upload during this measurement is a random 32-byte payload.</p><p>The last measurement is when users try to download content that does not exist. This one is the slowest: because of the nature of IPFS content routing, there is no signal that tells us content doesn't exist, so setting a timeout is the only way to tell whether the content exists. 
Currently, the Cloudflare IPFS gateway has a timeout of around five minutes.</p><table><tr><td><p><b></b></p></td><td><p><b>Scenario</b></p></td><td><p><b>Min (s)</b></p></td><td><p><b>Max (s)</b></p></td><td><p><b>Avg (s)</b></p></td></tr><tr><td><p>1</p></td><td><p>ipfs/newly-created-content</p></td><td><p>0.276</p></td><td><p>343</p></td><td><p>44.4</p></td></tr><tr><td><p>2</p></td><td><p>ipfs/in-cache-content</p></td><td><p>0.0825</p></td><td><p>0.539</p></td><td><p>0.116</p></td></tr><tr><td><p>3</p></td><td><p>ipfs/unavailable-cid</p></td><td><p>90</p></td><td><p>341</p></td><td><p>279</p></td></tr></table>
    <div>
      <h3>Second scenarios: Gateway to serve mutable IPNS named contents</h3>
      <a href="#second-scenarios-gateway-to-serve-mutable-ipns-named-contents">
        
      </a>
    </div>
    
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/8qZ6yQPbI8x5Y3iMrkKbh/9b74a004d5c414ecd0dfa09126179bc6/image1-40.png" />
            
            </figure><p>Since IPFS contents are immutable, when users want to change the content, the only way to do so is to create new content and distribute the new CID to other users. Sometimes distributing the new CID is hard, and is out of scope of IPFS. The InterPlanetary Naming System (IPNS) tries to solve this. IPNS is a naming system that — instead of locating the content using the CID — allows users to locate the content using the IPNS name instead. This name is a hash of a user's public key. Internally, IPNS maintains a section of the IPFS DHT which maps from a name to a CID. Therefore, when the users want to download the contents using names through the gateway, the gateway has to first resolve the name to get the CID, then use the CID to query the IPFS content as usual.</p><p>The example for fetching the IPNS named content is at ipns://k51qzi5uqu5diu2krtwp4jbt9u824cid3a4gbdybhgoekmcz4zhd5ivntan5ey</p><p>We measured many scenarios for IPNS as shown in the table below. Three scenarios are similar to the ones involving IPFS contents. 
Two more scenarios are added: the first measures the response time when users query the gateway using an existing IPNS name, and the second measures the response time when users query the gateway immediately after new content is published under that name.</p><table><tr><td><p></p></td><td><p><b>Scenarios</b></p></td><td><p><b>Min (s)</b></p></td><td><p><b>Max (s)</b></p></td><td><p><b>Avg (s)</b></p></td></tr><tr><td><p>1</p></td><td><p>ipns/newly-created-name</p></td><td><p>5.50</p></td><td><p>110</p></td><td><p>33.7</p></td></tr><tr><td><p>2</p></td><td><p>ipns/existing-name</p></td><td><p>7.19</p></td><td><p>113</p></td><td><p>28.0</p></td></tr><tr><td><p>3</p></td><td><p>ipns/republished-name</p></td><td><p>5.62</p></td><td><p>80.4</p></td><td><p>43.8</p></td></tr><tr><td><p>4</p></td><td><p>ipns/in-cache-content</p></td><td><p>0.0353</p></td><td><p>0.0886</p></td><td><p>0.0503</p></td></tr><tr><td><p>5</p></td><td><p>ipns/unavailable-name</p></td><td><p>60.0</p></td><td><p>146</p></td><td><p>81.0</p></td></tr></table>
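The extra indirection IPNS introduces can be sketched as follows. This is a toy model: real IPNS records are signed with the key the name is derived from and published to the DHT, but the two-step resolution (name to CID, then CID to content) is the part that costs the extra lookups measured above.

```python
import hashlib

def cid(data: bytes) -> str:
    # Stand-in for a real CID or IPNS name: a bare SHA-256 hex digest.
    return hashlib.sha256(data).hexdigest()

ipns_records = {}  # name -> current CID (stand-in for the IPNS part of the DHT)

def publish(public_key: bytes, content: bytes, datastore: dict) -> str:
    """Publish content under a stable name derived from a public key."""
    name = cid(public_key)   # the IPNS name is a hash of the key
    c = cid(content)
    datastore[c] = content
    ipns_records[name] = c   # updating content re-points the same name
    return name

def resolve(name: str, datastore: dict) -> bytes:
    # Two lookups: name -> CID, then CID -> content.
    return datastore[ipns_records[name]]

datastore = {}
name = publish(b"alice-public-key", b"v1", datastore)
publish(b"alice-public-key", b"v2", datastore)  # same name, new CID
```

The name stays stable across updates while the CID it points at changes, which is exactly why a gateway must resolve the name before it can fetch the content.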
    <div>
      <h3>Third scenarios: Gateway to serve DNSLink websites</h3>
      <a href="#third-scenarios-gateway-to-serve-dnslink-websites">
        
      </a>
    </div>
    <p>Even though users can use IPNS to provide others a stable address to fetch the content, the address can still be hard to remember. This is what DNSLink is for. Users can address their content using DNSLink, which is just a normal <a href="https://www.cloudflare.com/learning/dns/glossary/what-is-a-domain-name/">domain name</a> in the Domain Name System (DNS). As a domain owner, you only have to create a TXT record with the value dnslink=/ipfs/baf…1, and your domain can be fetched from a gateway.</p><p>There are two ways to access the DNSLink websites using the gateway. The first way is to access the website using the domain name as a URL hostname, for example, <a href="https://ipfs.example.com">https://ipfs.example.com</a>. The second way is to put the domain name as a URL path, for example, <a href="https://cloudflare-ipfs.com/ipns/ipfs.example.com">https://cloudflare-ipfs.com/ipns/ipfs.example.com</a>.</p><table><tr><td><p></p></td><td><p><b>Scenarios</b></p></td><td><p><b>Min (s)</b></p></td><td><p><b>Max (s)</b></p></td><td><p><b>Avg (s)</b></p></td></tr><tr><td><p>1</p></td><td><p>dnslink/ipfs-domain-as-url-hostname</p></td><td><p>0.251</p></td><td><p>18.6</p></td><td><p>0.831</p></td></tr><tr><td><p>2</p></td><td><p>dnslink/ipfs-domain-as-url-path</p></td><td><p>0.148</p></td><td><p>1.70</p></td><td><p>0.346</p></td></tr><tr><td><p>3</p></td><td><p>dnslink/ipns-domain-as-url-hostname</p></td><td><p>7.87</p></td><td><p>44.2</p></td><td><p>21.0</p></td></tr><tr><td><p>4</p></td><td><p>dnslink/ipns-domain-as-url-path</p></td><td><p>6.87</p></td><td><p>72.6</p></td><td><p>19.0</p></td></tr></table>
    <div>
      <h2>What does this mean for regular IPFS users?</h2>
      <a href="#what-does-this-mean-for-regular-ipfs-users">
        
      </a>
    </div>
    <p>These measurements highlight that IPFS content is fetched fastest when cached: an order of magnitude faster than fetching new content. This result is expected, as newly published content can take time for nodes to retrieve, as highlighted in <a href="https://youtu.be/yylsaXz00_g?t=823">previous studies</a>. When it comes to naming IPFS resources, leveraging DNSLink appears to be the faster strategy. This is likely due to the poor connectivity of the IPFS node used in this setup, preventing IPNS names from propagating rapidly. Overall, IPNS names would benefit from using a resolver to speed up resolution without losing the trust they provide.</p><p>As we mentioned in September, IPFS use has seen significant growth. So has our tooling. The IPFS Gateway monitor can be found on <a href="https://github.com/cloudflare/ipfs-gateway-monitor">GitHub</a>, and we will keep improving this first set of metrics.</p><p>At the time of writing, using IPFS via a gateway seems to provide lower retrieval times, while allowing finer-grained control over security settings in the browser context. This configuration preserves the content validity properties offered by IPFS, but reduces the number of nodes a user peers with to one: the gateway. Ideally, we would like users to peer with Cloudflare because we offer the best service, while still having the possibility to retrieve content from external sources if they want to. We'll be conducting more measurements to better understand how to leverage Cloudflare's presence in 270 cities to better serve the IPFS network.</p> ]]></content:encoded>
            <category><![CDATA[Platform Week]]></category>
            <category><![CDATA[Distributed Web]]></category>
            <category><![CDATA[Research]]></category>
            <category><![CDATA[IPFS]]></category>
            <guid isPermaLink="false">4jdX3TTEDdxnin4XGswBv1</guid>
            <dc:creator>Pop Chunhapanya</dc:creator>
            <dc:creator>Thibault Meunier</dc:creator>
        </item>
        <item>
            <title><![CDATA[Web3 — A vision for a decentralized web]]></title>
            <link>https://blog.cloudflare.com/what-is-web3/</link>
            <pubDate>Fri, 01 Oct 2021 12:59:31 GMT</pubDate>
            <description><![CDATA[ In this blog we start to explain Web3 in the context of the web's evolution, and how Cloudflare might help to support it. ]]></description>
            <content:encoded><![CDATA[ 
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3k6VIEkUCQIjvEUmgilCba/039389ac2ae80961acc14f6b359533f9/image1-1.png" />
            
            </figure><p>By reading this, you are a participant in the web. It's amazing that we can write this blog and have it appear to you without operating a server or writing a line of code. In general, the web of today empowers us to participate more than we could at any point in the past.</p><p>Last year, <a href="/internet-privacy/">we mentioned</a> the next phase of the Internet would be always on, always secure, always private. Today, we dig into a similar trend for the web, referred to as Web3. In this blog we'll start to explain Web3 in the context of the web's evolution, and how Cloudflare might help to support it.</p>
    <div>
      <h3>Going from Web 1.0 to Web 2.0</h3>
      <a href="#going-from-web-1-0-to-web-2-0">
        
      </a>
    </div>
    <p>When <a href="https://webfoundation.org/about/vision/history-of-the-web/">Sir Tim Berners-Lee</a> wrote his seminal 1989 document “<a href="https://www.w3.org/History/1989/proposal.html">Information Management: A Proposal</a>”, he outlined a vision of the “web” as a network of information systems interconnected via hypertext links. It is often conflated with the Internet, the computer network it operates on. Key practical requirements for this web included being able to access the network in a decentralized manner through remote machines and allowing systems to be linked together without requiring any central control or coordination.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1XimrXMTgt5iEuXbsJreOz/8d6427517dada20a3a3cd8d462acb9b9/image4-3.png" />
            
            </figure><p>The original proposal for what we know as the web, fitting in one diagram - Source: <a href="https://www.w3.org/History/1989/proposal.html">w3</a></p><p>This vision materialized into an initial version of the web that was composed of interconnected <b>static</b> resources delivered via a distributed network of servers and accessed primarily on a read-only basis from the client side — “Web 1.0”. Usage of the web soared with the number of websites growing well over 1,000% in the ~2 years following the introduction of the Mosaic graphical browser in 1993, based on data from the <a href="https://www.mit.edu/people/mkgray/growth/">World Wide Web Wanderer</a>.</p><p>The early 2000s marked an inflection point in the growth of the web and a key period of its development, as technology companies that survived the dot-com crash evolved to deliver value to customers in new ways amidst heightened skepticism around the web:</p><ul><li><p>Desktop browsers like Netscape became commoditized and paved the way for native web services for discovering content like search engines.</p></li><li><p>Network effects that were initially driven by hyperlinks in web directories like Yahoo! 
were hyperscaled by platforms that enabled user engagement and harnessed collective intelligence like review sites.</p></li><li><p>The massive volume of data generated by Internet activity and the growing realization of its competitive value forced companies to become experts at database management.</p></li></ul><p>O’Reilly Media coined the concept of <a href="https://www.oreilly.com/pub/a/web2/archive/what-is-web-20.html?page=1">Web 2.0</a> in an attempt to capture such shifts in design principles, which were transformative to the usability and <b>interactiveness</b> of the web and continue to be core building blocks for Internet companies nearly two decades later.</p><p>However, in the midst of the web 2.0 transformation, the web fell out of touch with one of its initial core tenets — decentralization.</p><blockquote><p><i>Decentralization: No permission is needed from a central authority to post anything on the web, there is no central controlling node, and so no single point of failure … and no “kill switch”!— History of the web by</i> <a href="https://webfoundation.org/about/vision/history-of-the-web/"><i>Web Foundation</i></a></p></blockquote>
    <div>
      <h3>A new paradigm for the Internet</h3>
      <a href="#a-new-paradigm-for-the-internet">
        
      </a>
    </div>
    <p>This is where Web3 comes in. The last two decades have proven that building a scalable system that decentralizes content is a challenge. While the technology to build such systems exists, no content platform achieves decentralization at scale.</p><p>There is one notable exception: Bitcoin. Bitcoin was conceptualized in a 2008 <a href="https://bitcoin.org/bitcoin.pdf">whitepaper</a> by <a href="https://en.wikipedia.org/wiki/Satoshi_Nakamoto">Satoshi Nakamoto</a> as a type of distributed ledger known as a blockchain designed so that a peer-to-peer (P2P) network could transact in a public, consistent, and tamper-proof manner.</p><p>That’s a lot said in one sentence. Let’s break it down by term:</p><ul><li><p>A peer-to-peer network is a network architecture. It consists of a set of computers, called nodes, that store and relay information. Each node is equally privileged, preventing one node from becoming a single point of failure. In the Bitcoin case, nodes can send, receive, and process Bitcoin transactions.</p></li><li><p>A ledger is a collection of accounts in which transactions are recorded. For Bitcoin, the ledger records Bitcoin transactions.</p></li><li><p>A distributed ledger is a ledger that is shared and synchronized among multiple computers. This happens through a consensus, so each computer holds a similar replica of the ledger. With Bitcoin, the consensus process is performed over a P2P network, the Bitcoin network.</p></li><li><p>A blockchain is a type of distributed ledger that stores data in “blocks” that are cryptographically linked together into an immutable chain that preserves their chronological order. 
Bitcoin leverages blockchain technology to establish a shared, single source of truth of transactions and the sequence in which they occurred, thereby mitigating the <a href="https://en.wikipedia.org/wiki/Double-spending">double-spending problem</a>.</p></li></ul><p>Bitcoin — which currently has <a href="https://luke.dashjr.org/programs/bitcoin/files/charts/software.html">over 40,000 nodes</a> in its network and processes <a href="https://coinmarketcap.com/currencies/bitcoin/">over $30B in transactions</a> each day — demonstrates that an application can be run in a distributed manner at scale, without compromising security. It inspired the development of other blockchain projects such as Ethereum which, in addition to transactions, allows participants to deploy code that can verifiably run on each of its nodes.</p><p>Today, these programmable blockchains are seen as <a href="https://www.coindesk.com/web3-transform-the-web">ideal open and trustless platforms</a> to serve as the infrastructure of a distributed Internet. They are home to a rich and growing ecosystem of <a href="https://dappradar.com/rankings">nearly 7,000 decentralized applications</a> (“Dapps”) that do not rely on any single entity to be available. This provides them with greater flexibility on how to best serve their users in all jurisdictions.</p>
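The "blocks cryptographically linked into an immutable chain" idea can be illustrated with a minimal sketch. This is a toy, not Bitcoin's actual block format: real blocks carry headers, Merkle roots of transactions, and proof-of-work, none of which appear here.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's contents deterministically."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def append_block(chain: list, transactions: list) -> None:
    """Link a new block to the chain via the previous block's hash."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "transactions": transactions})

def verify(chain: list) -> bool:
    """Recompute every link; tampering anywhere breaks a downstream hash."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True

chain = []
append_block(chain, ["alice -> bob: 1 BTC"])
append_block(chain, ["bob -> carol: 0.5 BTC"])
assert verify(chain)

chain[0]["transactions"] = ["alice -> mallory: 1 BTC"]  # rewrite history
assert not verify(chain)  # the chain no longer validates
```

Because each block commits to the hash of its predecessor, rewriting an old transaction invalidates every later block, which is what makes the ledger's chronological order tamper-evident.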
    <div>
      <h3>The web is for the end user</h3>
      <a href="#the-web-is-for-the-end-user">
        
      </a>
    </div>
    <p>Distributed systems are inherently different from centralized systems, and should not be thought about in the same way. Distributed systems allow data and its processing not to be held by a single party. This is useful for companies to provide resilience, but it’s also useful for P2P-based networks where data can stay in the hands of the participants.</p><p>For instance, if you were to host a blog the old-fashioned way, you would put up a server, expose it to the Internet (via <a href="/dyi-web-server-raspberry-pi-cloudflare/">Cloudflare :D</a>), <i>et voilà</i>. Nowadays, your blog would be hosted on a platform like WordPress, Ghost, Notion, or even Twitter. If one of these companies has an outage, it affects far more people at once. In a distributed fashion, via IPFS for instance, your blog content can be <a href="https://www.cloudflare.com/developer-platform/solutions/hosting/">hosted</a> and served from multiple locations operated by different entities.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3Eu8UmmUj6bkyAw4o4U8vM/bc7a2d1e14e089c96b8e2ba23d744fdd/image5-2.png" />
            
            </figure><p>Web 1.0</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2NW1nw9JDPAFBcqoJ6fQpy/b17c7a09c6c022492725a5ce4ad03725/image2-2.png" />
            
            </figure><p>Web 2.0</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1STcnM6O3g2IGbIoIxMMyC/5fa32e4b4a59d146edf9071a662aa512/image3-1.png" />
            
</figure><p>Web3</p><p>Each participant in the network can choose what they host/provide and can be home to different content. Similar to your home network, you are in control of what you share, and you don’t share everything.</p><p>This is a core tenet of decentralized identity. The same cryptographic principles underpinning cryptocurrencies like Bitcoin and Ethereum are being leveraged by applications to provide secure, cross-platform identity services. This is fundamentally different from other authentication systems such as OAuth 2.0, where a trusted party has to be reached to assess one's identity. This materializes in the form of “Login with …” buttons offered by a handful of cloud providers, the only ones with enough data, resources, and technical expertise to run identity services at that scale.</p><p>In a decentralized web, each participant holds a secret key. They can then use it to identify each other. You can learn about this cryptographic system in a <a href="/introducing-cryptographic-attestation-of-personhood/#the-technical-explanation">previous blog</a>. In a Web3 setting where web participants own their data, they can selectively share this data with the applications they interact with. Participants can also leverage this system to prove interactions they had with one another. For example, if a college issues you a <a href="https://www.w3.org/TR/did-core/">Decentralized Identifier</a> (DID), you can later prove you were registered at that college without reaching out to the college again. Decentralized Identities can also serve as a placeholder for a public profile, where participants agree to use a blockchain as a source of trust. This is what projects such as ENS or Unlock aim to provide: a way to verify your identity online based on your control over a public key.</p><p>This trend of proving ownership via a shared source of trust is key to the NFT craze. We have <a href="/cloudflare-stream-now-supports-nfts/">discussed NFTs</a> before on this blog. 
Blockchain-based NFTs are a medium of conveying ownership. Blockchain enables this information to be publicly verified and updated. If the blockchain states a public key I control is the owner of an NFT, I can refer to it on other platforms to prove ownership of it. For instance, if my profile picture on social media is a cat, I can prove the said cat is associated with my public key. What this means depends on what I want to prove, especially with the proliferation of NFT contracts. If you want to understand how an NFT contract works, you can <a href="https://blog.mycrypto.com/so-you-wanna-build-your-own-pfp-nft-project/">build your own</a>.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3xVWfY7VlFkOM2cuYhef8b/2c0467e47fb32ff4c582e44f1f681a41/image6-1.png" />
            
            </figure>
    <div>
      <h3>How does Cloudflare fit in Web3?</h3>
      <a href="#how-does-cloudflare-fit-in-web3">
        
      </a>
    </div>
    <p>Decentralization and privacy are challenges we are tackling at Cloudflare as part of our mission to help build a better Internet.</p><p>In a <a href="/next-generation-privacy-protocols/">previous post</a>, Nick Sullivan described Cloudflare’s contributions to enabling privacy on the web. We launched initiatives to fix information leaks in HTTPS through <a href="/encrypted-client-hello">Encrypted Client Hello (ECH)</a>, make DNS even more private by supporting <a href="/oblivious-dns">Oblivious DNS-over-HTTPS (ODoH)</a>, and develop <a href="/opaque-oblivious-passwords">OPAQUE</a>, which makes password breaches less likely to occur. We have also released our <a href="/introducing-the-cloudflare-data-localization-suite/">data localization suite</a> to help businesses navigate the ever-evolving <a href="/data-privacy-day-2021-looking-ahead-at-the-always-on-always-secure-always-private-internet/">regulatory landscape</a> by giving them control over where their data is stored without compromising performance and security. We’ve even built a <a href="/introducing-zero-knowledge-proofs-for-private-web-attestation-with-cross-multi-vendor-hardware/">privacy-preserving attestation</a> that is based on the same zero-knowledge proof techniques that are core to distributed systems such as <a href="https://electriccoin.co/blog/explaining-halo-2/">ZCash</a> and <a href="https://filecoin.io/blog/posts/filecoin-zk-snarks-zero-knowledge-but-a-lot-of-zero-knowledge/">Filecoin</a>.</p><p>It’s exciting to think that there are already ways we can change the web to improve the experience for its users. However, there are some limitations to building on top of the existing infrastructure. This is why projects such as Ethereum and IPFS build on their own architecture. They still rely on the Internet, but do not operate with the web as we know it. To ease the transition, Cloudflare operates <a href="https://www.cloudflare.com/distributed-web-gateway/">distributed web gateways</a>. 
These gateways provide an HTTP interface to Web3 protocols: Ethereum and IPFS. Since HTTP is core to the web we know today, distributed content can be accessed securely and easily without requiring the user to operate experimental software.</p>
    <div>
      <h3>Where do we go next?</h3>
      <a href="#where-do-we-go-next">
        
      </a>
    </div>
    <p>The journey to a different web is long but exciting. The infrastructure built over the last two decades is truly stunning. The Internet and the web are now part of 4.6 billion people's lives. At the same time, the top 35 websites had <a href="https://www.ncta.com/whats-new/the-expanding-consolidation-of-the-consumer-internet-3">more visits</a> than all others (circa 2014). Users have less control over their data and are even more reliant on a few players.</p><p>The early web was static. Then Web 2.0 came to provide the interactiveness and services we use daily, at the cost of centralization. Web3 is a trend that tries to challenge this. With distributed networks built on open protocols, users of the web are empowered to participate.</p><p>At Cloudflare, we are embracing this distributed future. Applying the knowledge and experience we have gained from running one of the largest edge networks, we are making it easier for users and businesses to benefit from Web3. This includes operating a <a href="/announcing-web3-gateways">distributed web product suite</a>, contributing to <a href="/cloudflares-approach-to-research/">open standards</a>, and <a href="/internet-privacy/">moving privacy forward</a>.</p><p>If you would like to help build a better web with us, we are <a href="https://www.cloudflare.com/careers/jobs/">hiring</a>.</p> ]]></content:encoded>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Research]]></category>
            <category><![CDATA[Distributed Web]]></category>
            <category><![CDATA[Privacy]]></category>
            <guid isPermaLink="false">5YXVZmOkyscnB5e0JOZAY5</guid>
            <dc:creator>Thibault Meunier</dc:creator>
            <dc:creator>In-Young Jo</dc:creator>
        </item>
        <item>
            <title><![CDATA[How Cloudflare provides tools to help keep IPFS users safe]]></title>
            <link>https://blog.cloudflare.com/cloudflare-ipfs-safe-mode/</link>
            <pubDate>Wed, 29 Sep 2021 23:02:00 GMT</pubDate>
            <description><![CDATA[ The Cloudflare IPFS module protects users from threats like phishing and ransomware. ]]></description>
            <content:encoded><![CDATA[ <p></p><p>Cloudflare's journey with IPFS started in 2018 when we announced a <a href="/distributed-web-gateway/">public gateway for the distributed web</a>. Since then, the number of infrastructure providers for the InterPlanetary FileSystem (IPFS) has grown and matured substantially. This is a huge benefit for users and application developers as they have the ability to choose their infrastructure providers.</p><p>Today, we’re excited to announce new secure filtering capabilities in IPFS. The Cloudflare IPFS module is a tool to protect users from threats like phishing and ransomware. We believe that other participants in the network should have the same ability. We are releasing that software as open source, for the benefit of the entire community.</p><p>Its code is available on <a href="https://github.com/cloudflare/go-ipfs/tree/v0.9.1-safemode">github.com/cloudflare/go-ipfs</a>. To understand how we built it and how to use it, read on.</p>
    <div>
      <h3>A brief introduction on IPFS content retrieval</h3>
      <a href="#a-brief-introduction-on-ipfs-content-retrieval">
        
      </a>
    </div>
    <p>Before we get to understand how IPFS filtering works, we need to dive a little deeper into the operation of an IPFS node.</p><p>The InterPlanetary FileSystem (IPFS) is a peer-to-peer network for storing content on a distributed file system. It is composed of a set of computers called nodes that store and relay content using a common addressing system.</p><p>Nodes communicate with each other over the Internet using a Peer-to-Peer (P2P) architecture, preventing one node from becoming a single point of failure. This is even more true given that anyone can operate a node with limited resources. This can be light hardware such as a Raspberry Pi, a server at a cloud provider, or even your web browser.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7D6yuQR7r8BmcBPQS6YSjk/02c4a5bb7a86be4cebdc52bd54ae532f/image2-4.png" />
            
</figure><p>This creates a challenge since not all nodes may support the same protocols, and networks may block some types of connections. For instance, your web browser does not expose a TCP API and your home router likely doesn’t allow inbound connections. This is where <a href="https://libp2p.io/">libp2p</a> comes to help.</p><p>libp2p is a modular system of <i>protocols</i>, <i>specifications</i>, and <i>libraries</i> that enable the development of peer-to-peer network applications - <a href="https://docs.libp2p.io/introduction/what-is-libp2p/">libp2p documentation</a></p><p>That’s exactly what IPFS nodes need to connect to the IPFS network. From a node’s point of view, the architecture is the following:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2KWwoCkKQL46RFHelilMtE/a5718f571f330746fc421f49a20cc76f/image6-2.png" />
            
</figure><p>Any node that we maintain a connection with is a peer. A peer that does not have a piece of content can ask its peers, including you, whether they have it. If you do have it, you provide it to them. If you don’t, you can give them information about the network to help them find someone who might have it. As each node chooses which resources it stores, some content may be stored on only a limited number of nodes.</p><p>For instance, popular content is stored by many nodes that dedicate resources to it, while less popular content may be provided by only a few nodes.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6lZKi9WAAYORStwkYGTnD1/88b6e06f42bf6d5ac7585ddd3d337874/image3-3.png" />
            
</figure><p>This assumption does not hold for public gateways like Cloudflare. A gateway is an HTTP interface to an IPFS node. On our gateway, we allow a user of the Internet to retrieve arbitrary content from IPFS. If a user asks for popular content, we provide it. If they ask for rare content, we’ll find it for them.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7Aj6yGIAk8iAzBoMZAuayV/025e8fe16774f5fe81abffa46c519ed4/image1-6.png" />
            
            </figure><p>Cloudflare’s IPFS gateway is simply a cache in front of IPFS. Cloudflare does not have the ability to modify or remove content from the IPFS network. However, IPFS is a decentralized and open network, so there is the possibility of users sharing threats like phishing or malware. This is content we do not want to provide to the P2P network or to our HTTP users.</p><p>In the next section, we describe how an IPFS node can protect its users from such threats.</p><p><i>If you would like to learn more about the inner workings of libp2p, you can go to</i> <a href="https://proto.school/introduction-to-libp2p"><i>ProtoSchool</i></a> <i>which has a great tutorial about it.</i></p>
    <div>
      <h3>How IPFS filtering works</h3>
      <a href="#how-ipfs-filtering-works">
        
      </a>
    </div>
    <p>As we described earlier, an IPFS node provides content in two ways: to its peers through the IPFS P2P network and to its users via an HTTP gateway.</p><p>Filtering content on the HTTP interface is no different from the current protection Cloudflare already has in place. If a piece of content is considered malicious and is available through a cloudflare-ipfs.com/ipfs/ URL, we can filter those requests, so the end user is kept safe.</p><p>The P2P layer is different. We cannot filter URLs because that’s not how content is requested. IPFS is content-addressed. This means that instead of asking for a specific location such as a cloudflare-ipfs.com/ipfs/ URL, peers request the content directly using its Content IDentifier (CID).</p><p>More precisely, a CID is an abstraction of the content address. A CID looks like QmXnnyufdzAWL5CqZ2RnSNgPbvCc1ALT73s6epPrRnZ1Xy (which happens to be the hash of a .txt file containing the string "I’m trying out IPFS"). A CID is a convenient way to refer to content in a cryptographically verifiable manner.</p><p>This is great, because it means that when peers ask for malicious content, we can prevent our node from serving it. This applies to both the P2P layer and the HTTP gateway.</p><p>In addition, the way IPFS works makes content easy to reuse. A directory’s address, for instance, is a CID derived from the CIDs of its files. This way, a file can be shared across multiple directories and still be referred to by the same CID. It allows IPFS nodes to store content efficiently without duplicating it. This can be used to share <a href="https://blog.ipfs.io/2020-02-14-improved-bitswap-for-container-distribution/">docker container layers</a> for example.</p><p>In the filtering use case, it means that if malicious content is included in other IPFS content, our node can also prevent the content linking to it from being served: a directory mixing valid and malicious content gets blocked along with the malicious content itself.</p>
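The "cryptographically verifiable" property of content addressing can be sketched in a few lines. This toy uses a raw SHA-256 hex digest as the address; real IPFS CIDs wrap the hash in a multihash and base-encode it (hence the Qm... prefix), but the verification idea is the same.

```python
import hashlib

def content_id(data: bytes) -> str:
    # Simplified content address: a raw SHA-256 hex digest.
    # Real IPFS CIDs use a multihash plus base encoding.
    return hashlib.sha256(data).hexdigest()

def fetch_and_verify(store: dict, cid: str) -> bytes:
    """Retrieve content from an untrusted store, then check it matches its address."""
    data = store[cid]
    if content_id(data) != cid:
        raise ValueError("content does not match its CID; reject it")
    return data

store = {}
data = b"I'm trying out IPFS"
cid = content_id(data)
store[cid] = data
assert fetch_and_verify(store, cid) == data

store[cid] = b"tampered bytes"  # an untrusted peer swaps the content
try:
    fetch_and_verify(store, cid)
    raise AssertionError("tampering should have been detected")
except ValueError:
    pass
```

Because the address is derived from the bytes themselves, any node can serve the content and any client can check, without trusting the node, that it got exactly what it asked for.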
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/CDrcC6XQcEbIs3DcdkA49/c12890df760859ad3f2ea2129bfc090a/image5-3.png" />
            
</figure><p>This cryptographic method of linking content together is known as a MerkleDAG. You can learn more about it on <a href="https://proto.school/merkle-dags">ProtoSchool</a>, and Consensys wrote an article explaining the <a href="https://media.consensys.net/ever-wonder-how-merkle-trees-work-c2f8b7100ed3">basic cryptographic construction with bananas</a>.</p>
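A minimal sketch of how blocking propagates through a MerkleDAG. The addresses are simplified to raw SHA-256 hex digests, and the directory encoding is an invented stand-in for IPFS's real UnixFS/dag-pb format; the point is only that a parent's CID commits to its children's CIDs, so a denylisted child taints every parent linking to it.

```python
import hashlib

def cid(data: bytes) -> str:
    # Simplified content address (raw SHA-256 hex, not a real IPFS CID).
    return hashlib.sha256(data).hexdigest()

# Two leaf files.
good = cid(b"a perfectly fine file")
bad = cid(b"known phishing page")

# A "directory" node is addressed by hashing the CIDs it links to,
# so its own CID changes whenever any child changes.
nodes = {cid("|".join(sorted([good, bad])).encode()): [good, bad]}

denylist = {bad}

def is_blocked(c: str) -> bool:
    """Block a CID if it is denylisted or transitively links to one."""
    if c in denylist:
        return True
    return any(is_blocked(child) for child in nodes.get(c, []))

directory = next(iter(nodes))
assert is_blocked(bad)            # the threat itself is blocked
assert not is_blocked(good)       # unrelated content is untouched
assert is_blocked(directory)      # the directory linking to it is blocked too
```

In practice a node would also cache these decisions rather than re-walk the DAG on every request, but the recursive check captures the policy described above.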
    <div>
      <h3>How to use IPFS secure filtering</h3>
      <a href="#how-to-use-ipfs-secure-filtering">
        
      </a>
    </div>
    <p>By now, you should have an understanding of how an IPFS node retrieves and provides content, as well as how a shared node can protect its peers and users from accessing threats. Using this knowledge, Cloudflare implemented IPFS Safemode, a node protection layer on top of <a href="https://github.com/ipfs/go-ipfs">go-ipfs</a>. It is up to every node operator to build their own list of threats to be blocked based on their policy.</p><p>To use it, we are going to follow the instructions available in the <a href="https://github.com/cloudflare/go-ipfs/tree/v0.9.1-safemode#build-from-source">cloudflare/go-ipfs repository</a>.</p><p>First, clone the git repository:</p>
            <pre><code>git clone https://github.com/cloudflare/go-ipfs.git
cd go-ipfs/</code></pre>
            <p>Then, you have to check out the commit where IPFS safemode is implemented. This version is based on v0.9.1 of go-ipfs.</p>
            <pre><code>git checkout v0.9.1-safemode</code></pre>
            <p>Now that you have the source code on your machine, we need to <a href="https://github.com/cloudflare/go-ipfs/tree/v0.9.1-safemode#build-from-source">build the IPFS client from source</a>.</p>
            <pre><code>make build</code></pre>
            <p><i>Et voilà</i>. You are ready to use your IPFS node, with safemode capabilities.</p>
            <pre><code># alias ipfs command to make it easier to use
alias ipfs='./cmd/ipfs/ipfs'
# run an ipfs daemon
ipfs daemon &amp;
# understand how to use IPFS safemode
ipfs safemode --help
USAGE
ipfs safemode - Interact with IPFS Safemode to prevent certain CIDs from being provided.
...</code></pre>
            
    <div>
      <h3>Going further</h3>
      <a href="#going-further">
        
      </a>
    </div>
    <p>IPFS nodes run in a diverse set of environments and are operated by parties at various scales. The same software has to accommodate configurations in which it is accessed by a single user, and others where it is shared by thousands of participants.</p><p>At Cloudflare, we believe that decentralization is going to be the next major step for content networks, but there is still work to be done to get these technologies in the hands of everyone. Content filtering is part of this story. If the community aims to embed a P2P node in every computer, there need to be ways to prevent nodes from serving harmful content. Users need to be able to choose which content they are willing to serve, and which they aren’t.</p><p>By providing an IPFS safemode tool, we hope to make this protection more widely available.</p>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[IPFS]]></category>
            <category><![CDATA[Research]]></category>
            <category><![CDATA[Distributed Web]]></category>
            <guid isPermaLink="false">2Rfcw9nEZ4DUBHIp1OOLXm</guid>
            <dc:creator>Thibault Meunier</dc:creator>
        </item>
        <item>
            <title><![CDATA[Humanity wastes about 500 years per day on CAPTCHAs. It’s time to end this madness]]></title>
            <link>https://blog.cloudflare.com/introducing-cryptographic-attestation-of-personhood/</link>
            <pubDate>Thu, 13 May 2021 13:00:00 GMT</pubDate>
            <description><![CDATA[ We want to remove CAPTCHAs completely. The idea is rather simple: a real human should be able to touch or look at their device to prove they are human, without revealing their identity. Let's explore! ]]></description>
            <content:encoded><![CDATA[ <p></p><p>Select all the buses. Click on bikes. Does this photo have traffic lights? As ridiculous as these questions are, you’re almost guaranteed to have seen one recently. They are a way for online services to separate humans from bots, and they’re called <a href="https://en.wikipedia.org/wiki/CAPTCHA">CAPTCHAs</a>. CAPTCHAs strengthen the security of online services. But while they do that, there’s a very real cost associated with them.</p><p>Based on our data, it takes a user on average 32 seconds to complete a CAPTCHA challenge. There are 4.6 billion global Internet users. We assume a typical Internet user sees approximately one CAPTCHA every 10 days.</p><p>This very simple back of the envelope math equates to somewhere in the order of 500 human years wasted every single day — just for us to prove our humanity.</p><p>Today, we are launching an experiment to end this madness. We want to get rid of CAPTCHAs completely. The idea is rather simple: a real human should be able to touch or look at their device to prove they are human, without revealing their identity. We want you to be able to prove that you are human without revealing which human you are! You may ask if this is even possible? And the answer is: Yes! We’re starting with trusted USB keys (like <a href="https://www.yubico.com/">YubiKey</a>) that have been around for a while, but increasingly phones and computers come equipped with this ability by default.</p><p>Today marks the beginning of the end for fire hydrants, cross walks, and traffic lights on the Internet.</p>
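The back-of-the-envelope math above checks out. Using the figures stated (4.6 billion users, one CAPTCHA per user every ~10 days, 32 seconds per challenge):

```python
# Back-of-the-envelope check of the figures above.
users = 4.6e9                          # global Internet users
captchas_per_day = users / 10          # one CAPTCHA per user every ~10 days
seconds_lost = captchas_per_day * 32   # 32 seconds per challenge

years_per_day = seconds_lost / (60 * 60 * 24 * 365)
print(round(years_per_day))  # 467 -- on the order of 500 human years a day
```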
    <div>
      <h2>Why CAPTCHAs?</h2>
      <a href="#why-captchas">
        
      </a>
    </div>
    <p>In many instances, businesses need a way to tell whether an online user is human or not. Typically, those reasons relate to security, or abuse of an online service. Back at the turn of the century, CAPTCHAs were created to do just that. The first one <a href="https://en.wikipedia.org/wiki/CAPTCHA#:~:text=The%20term%20was%20coined%20in,two%20groups%20working%20in%20parallel.">was developed back in 1997</a>, and the term ("Completely Automated Public Turing test to tell Computers and Humans Apart") was coined in 2003 by Luis von Ahn, Manuel Blum, Nicholas J. Hopper, and John Langford.</p><p>By their very nature, CAPTCHA challenges have to be automated, so they can scale across both the humans they need to let through and the bots they need to catch.</p>
    <div>
      <h2>Why get rid of CAPTCHAs?</h2>
      <a href="#why-get-rid-of-captchas">
        
      </a>
    </div>
    <p>Put simply: we all hate them.</p><p>The best we’ve been able to do to date has been to minimize them. For example, at Cloudflare, we’ve continuously <a href="/cloudflare-bot-management-machine-learning-and-more/">improved our Bot management solution</a> to get as smart as possible about when to serve a CAPTCHA to the user. However, over the years the web moved from simple CAPTCHAs based on text recognition against backgrounds to OCRing old books to identifying objects from pictures as <a href="https://www.cloudflare.com/learning/ai/what-is-artificial-intelligence/">AI</a> has improved (see <a href="https://arxiv.org/abs/1312.6082">Google paper on Street Numbers</a>). This creates some real problems for the human users of the Internet:</p><ol><li><p>Productivity: Time is lost — as is focus on the task at hand — and often in exchange for some frustration.</p></li><li><p>Accessibility: Users are assumed to have the physical and cognitive capabilities required to solve the tests, which may not be the case. A visual disability, for example, may make it impossible to perform a CAPTCHA-solving task.</p></li><li><p>Cultural Knowledge: The people on the planet who have seen a US fire hydrant are in the minority, as are the number who speak English. Cabs are yellow in New York City, and black in London — heck, ‘cabs’ are only cabs in a few places, and ‘taxis’ everywhere else!</p></li><li><p>Interactions on Mobile Devices: Phones and mobile devices are the primary — and most often only — means of Internet access for a large part of the world. CAPTCHAs put a strain on their data plans and battery usage, in addition to being more difficult on small screens.</p></li></ol><p>In fact, the World Wide Web Consortium (W3C) worked on multiple drafts — as early as 2003 — pointing out the <a href="https://www.w3.org/TR/turingtest/">inaccessibility of CAPTCHAs</a>.</p><p><i>And this is just from the user side</i>. 
Inflicting all these costs on users has very real costs for businesses, too. There’s a reason why businesses spend so much time optimizing the performance and layout of their websites and applications. That work stops users from bouncing when you want them to register. It stops shopping carts from being abandoned before checkout. In general, you want to stop customers from getting frustrated and simply not coming back.</p><p>CAPTCHAs are effectively businesses putting friction in front of their users, and as anyone who has managed a high-performing online business will tell you, it’s not something you want to do unless you have no choice.</p><p>We started tackling these issues when we <a href="/moving-from-recaptcha-to-hcaptcha/">moved from Google reCAPTCHA to hCAPTCHA</a>. Today, we are going further.</p>
    <div>
      <h2>CAPTCHA without Picture: Cryptographic Attestation of Personhood</h2>
      <a href="#captcha-without-picture-cryptographic-attestation-of-personhood">
        
      </a>
    </div>
    
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1WeboSgc6BWaS2HxVOGRcb/590fba693766943506c4587a3172ed15/image4-4.png" />
            
            </figure><p>Hardware security keys are devices with an embedded secret that can connect to your computer or your phone</p><p>From a user perspective, a Cryptographic Attestation of Personhood works as follows:</p><ol><li><p>The user accesses a website protected by Cryptographic Attestation of Personhood, such as <a href="http://cloudflarechallenge.com">cloudflarechallenge.com</a>.</p></li><li><p>Cloudflare serves a challenge.</p></li><li><p>The user clicks I am human (beta) and gets prompted for a security device.</p></li><li><p>User decides to use a Hardware Security Key.</p></li><li><p>The user plugs the device into their computer or taps it to their phone for wireless signature (using NFC).</p></li><li><p>A cryptographic attestation is sent to Cloudflare, which allows the user in upon verification of the <a href="https://w3c.github.io/webauthn/#test-of-user-presence">user presence test</a>.</p></li></ol><p>Completing this flow takes five seconds. More importantly, this challenge protects users' privacy since the attestation is not uniquely linked to the user device. All device manufacturers trusted by Cloudflare are part of the FIDO Alliance. As such, each hardware key shares its identifier with other keys manufactured in the same batch (see <a href="https://fidoalliance.org/specs/fido-u2f-overview-ps-20150514.pdf">Universal 2nd Factor Overview, Section 8</a>). From Cloudflare’s perspective, your key looks like all other keys in the batch.</p><p>There are at most three clicks required to complete a Cryptographic Attestation of Personhood. There is no looping, where a user is asked to click on buses 10 times in a row.</p><p>While there is a variety of hardware security keys, our initial rollout is limited to a set of USB and NFC keys that are both certified by the FIDO alliance and have no known security issues according to the FIDO metadata service (MDS). 
Our demo only includes support for YubiKeys, which we had the chance to use and test; HyperFIDO keys; and Thetis FIDO U2F keys.</p><blockquote><p>“Driving open authentication standards like WebAuthn has long been at the heart of Yubico’s mission to deliver powerful security with a delightful user experience,” said Christopher Harrell, Chief Technology Officer at Yubico. “By offering a CAPTCHA alternative via a single touch backed by YubiKey hardware and public key cryptography, Cloudflare’s Cryptographic Attestation of Personhood experiment could help further reduce the cognitive load placed on users as they interact with sites under strain or attack. I hope this experiment will enable people to accomplish their goals with minimal friction and strong privacy, and that the results will show it is worthwhile for other sites to consider using hardware security for more than just authentication.”</p></blockquote>
    <div>
      <h2>How does it work?</h2>
      <a href="#how-does-it-work">
        
      </a>
    </div>
    <p>The Cryptographic Attestation of Personhood relies on <a href="https://www.w3.org/TR/webauthn-2/#sctn-attestation">Web Authentication (WebAuthn) Attestation</a>. This is an API that has been standardized at the W3C and is already implemented in most modern web browsers and operating systems. It aims to provide a standard interface to authenticate users on the web and to use the cryptographic capabilities of their devices.</p><p>As the need for stronger security with improved usability grows, we expect deployments of WebAuthn to rise.</p><table><tr><td><p><b>Platform</b></p></td><td><p><b>Compatible Browsers</b></p></td></tr><tr><td><p>iOS 14.5</p></td><td><p>All browsers</p></td></tr><tr><td><p>Android 10 and later</p></td><td><p>Chrome</p></td></tr><tr><td><p>Windows</p></td><td><p>All browsers</p></td></tr><tr><td><p>macOS</p></td><td><p>All browsers</p></td></tr><tr><td><p>Ubuntu</p></td><td><p>All browsers</p></td></tr></table><p>Assuming you are using a hardware device with a compatible configuration, you might be wondering what is happening behind the scenes.</p>
    <div>
      <h3>The elevator pitch</h3>
      <a href="#the-elevator-pitch">
        
      </a>
    </div>
    <p>The short version is that your device has an embedded secure module containing a unique secret sealed by your manufacturer. The security module is capable of proving it owns such a secret without revealing it. Cloudflare asks you for proof and checks that your manufacturer is legitimate.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5msss4WaXkkCMEgBwldIwi/196f2f1e6caec88053f575f4c3e342c8/image7.png" />
            
            </figure>
    <div>
      <h3>The technical explanation</h3>
      <a href="#the-technical-explanation">
        
      </a>
    </div>
    <p>The longer version is that this verification involves public-key cryptography and digital certificates.</p><p>Public-key cryptography provides a way to produce unforgeable digital signatures. A user generates a signing key that can sign messages and a verification key that can be used by anyone to verify a message is authentic. This is akin to a <a href="https://en.wikipedia.org/wiki/Seal_(emblem)#Signet_rings">signet ring</a>, where the imprint of the ring is the signature and the ring itself the signing key.</p><p>Signature schemes are used widely to prove authenticity. Right now, your browser has verified that the server claiming to be “blog.cloudflare.com” is legitimate by verifying a signature made by someone who has a signing key associated with “blog.cloudflare.com”. To show the verification key is legitimate, the server provides a certificate that links the verification key to “blog.cloudflare.com”, itself signed by another verification key in another certificate. This chain goes all the way up to a root certificate from a <i>Certificate Authority</i> built into your browser.</p><p>Let's take another example. Alice owns a laptop with a secure module embedded. This module holds a signing key, sk_a. Alice says she sent a love letter to Bob yesterday. However, Bob is suspicious. Despite the letter stating "Hi Bob, it's Alice", Bob would like to be sure this letter comes from Alice. To do so, Bob asks Alice to provide her signature for the following message "musical-laboratory-ground". Since Bob chooses the message, if Alice can provide a signature associated with her verification key (pk_a), Bob would be convinced the love letter is from Alice. Alice does provide the signature, sk_a(“musical-laboratory-ground”). Bob confirms sk_a(“musical-laboratory-ground”) is associated with pk_a. He can now securely engage in their cryptographer relationship.</p>
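<p>Bob’s challenge to Alice can be sketched in a few lines of Python. This is a toy illustration only: the key parameters below are tiny, hypothetical textbook-RSA values chosen for readability, whereas real hardware keys use schemes such as ECDSA over 256-bit curves.</p>

```python
# Toy sketch of Bob's challenge to Alice using textbook RSA with tiny,
# hypothetical parameters. Never use such parameters in practice.
import hashlib

# Alice's key pair
p, q = 61, 53
n = p * q                            # public modulus
e = 17                               # verification key is (e, n)
d = pow(e, -1, (p - 1) * (q - 1))    # signing key, kept in Alice's secure module

def sign(message: bytes, d: int, n: int) -> int:
    # hash the message, then apply the signing key
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(h, d, n)

def verify(message: bytes, sig: int, e: int, n: int) -> bool:
    # anyone holding (e, n) can check the signature
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(sig, e, n) == h

# Bob picks the challenge, so a valid signature proves Alice holds d
challenge = b"musical-laboratory-ground"
sig = sign(challenge, d, n)
print(verify(challenge, sig, e, n))  # True
```

Because Bob chooses the challenge message, Alice cannot replay an old signature: only someone holding the signing key can answer correctly.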
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2az1pFbAuzJIcBMppOa1YC/19a6ebd82f20d88e7db696b42182025e/image1-4.png" />
            
            </figure><p>Thinking back to the Cryptographic Attestation of Personhood, you now know that your hardware key embeds a signing key. However, Cloudflare does not and cannot know the signing keys of all users of the Internet. To alleviate this problem, Cloudflare requests a different kind of proof. When asked if you are a human, we ask you to prove you are in control of a public key signed by a trusted manufacturer. When shipping devices with a secure module, <a href="https://fidoalliance.org/specs/fido-security-requirements/fido-authenticator-security-requirements-v1.4-fd-20201102.html">manufacturers sign</a> the associated attestation public key with a digital certificate.</p><p>Digital certificates usually contain a public key, information about the organization they are provisioned for, a validity period, the allowed usage, and a signature from a Certificate Authority making sure the certificate is legitimate. They allow metadata to be associated with a public key and therefore provide information about the issuer of a signature.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7zLUNHHDgaWpnCniSwkThu/3236b7b22b54c8319e30aa17c78937a3/image6-1.png" />
            
            </figure><p>So when Cloudflare asks you to provide a signature, it verifies your public key has been signed by the public key of a manufacturer. Since manufacturers have multiple levels of certificates, your device provides a chain of certificates that Cloudflare is able to verify. Each link in the chain is signed by its predecessor and signs its successor. Cloudflare trusts the root certificate of manufacturers. Because their numbers are limited, we have the capacity to verify them manually.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/O9TXLinQvwKA19xqCzcGb/332b024322a0ea832073e7e3ef98bfd7/image5-3.png" />
            
            </figure>
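<p>The chain walk can be sketched as follows. This is a minimal sketch assuming a toy textbook-RSA signature scheme and a hypothetical three-level chain; real attestation chains are X.509 certificates with signatures such as ECDSA.</p>

```python
# Sketch of walking an attestation certificate chain: each link's signature
# is checked with its issuer's public key, ending at a root Cloudflare trusts.
# Toy textbook RSA stands in for real X.509/ECDSA; parameters are hypothetical.
import hashlib

def h(msg: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

def make_key(p: int, q: int, e: int = 17):
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (e, n), d                     # (verification key, signing key)

def sign(msg, d, n): return pow(h(msg, n), d, n)
def verify(msg, sig, e, n): return pow(sig, e, n) == h(msg, n)

# Hypothetical three-level chain: root CA -> intermediate -> device key
root_pub,  root_priv  = make_key(61, 53)
inter_pub, inter_priv = make_key(67, 71)
dev_pub,   dev_priv   = make_key(73, 79)

def cert(subject_pub, issuer_priv, issuer_n):
    body = repr(subject_pub).encode()    # certificate body names the subject key
    return {"body": body, "subject": subject_pub,
            "sig": sign(body, issuer_priv, issuer_n)}

chain = [cert(dev_pub, inter_priv, inter_pub[1]),   # device cert
         cert(inter_pub, root_priv, root_pub[1])]   # intermediate cert

def chain_valid(chain, trusted_root):
    # each cert is verified by the next one's subject key; the last by the root
    issuers = [c["subject"] for c in chain[1:]] + [trusted_root]
    return all(verify(c["body"], c["sig"], *iss)
               for c, iss in zip(chain, issuers))

print(chain_valid(chain, root_pub))  # True
```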
    <div>
      <h2>Privacy first</h2>
      <a href="#privacy-first">
        
      </a>
    </div>
    <p>Designing a challenge asking users to prove they are in control of a key from a certain manufacturer comes with a privacy and security challenge.</p><p>The privacy properties of the Cryptographic Attestation of Personhood are summarized in the following table.</p><table><tr><td><p><b>Property</b></p></td><td><p><b>Cloudflare Could</b></p></td><td><p><b>Cloudflare Does</b></p></td></tr><tr><td><p>Get your fingerprints or face</p></td><td><p>NO</p></td><td><p>N/A</p></td></tr><tr><td><p>Know the manufacturer of your key</p></td><td><p>YES - limited to the number of keys in your batch*</p></td><td><p>YES</p></td></tr><tr><td><p>Associate a unique ID to your key</p></td><td><p>YES**</p></td><td><p>NO</p></td></tr></table><p><i>* There must be 100,000 or more keys per batch (</i><a href="https://fidoalliance.org/specs/fido-uaf-v1.1-ps-20170202/fido-uaf-protocol-v1.1-ps-20170202.html#full-basic-attestation"><i>FIDO UAF Protocol Specification #4.1.2.1.1</i></a><i>). However, self-signed keys and keys from certain manufacturers have been found to</i> <a href="https://www.chromium.org/security-keys"><i>not meet this requirement</i></a><i>.</i></p><p><i>**This would require that we set a separate and distinct cookie to track your key. This is antithetical to privacy on the Internet, and to the goals of this project. You can learn more about how we are removing cookies like</i> <a href="/deprecating-cfduid-cookie/"><i>__cfduid here</i></a><i>.</i></p>
    <div>
      <h3>Attestation without collecting biometrics</h3>
      <a href="#attestation-without-collecting-biometrics">
        
      </a>
    </div>
    <p>The aim of this project: we want to know <i>that</i> you’re human. But we’re not interested in <i>which</i> human you are.</p><p>Happily, the WebAuthn API goes a long way toward taking care of this for us. Even if we wanted to collect biometrics such as a fingerprint, the WebAuthn API prevents it. When your device asks for biometric authentication, such as via a fingerprint sensor, it all happens locally. The verification unlocks the secure module of your device, which then provides a signature associated with your platform.</p><p>For our challenge, we leverage the <a href="https://w3c.github.io/webauthn/#sctn-sample-registration">WebAuthn registration</a> process. It was designed to set up credentials for repeated authentications, which we have no use for. We therefore assign the same constant value to the required username field, which protects users from deanonymization.</p>
    <div>
      <h3>No hidden work</h3>
      <a href="#no-hidden-work">
        
      </a>
    </div>
    <p>A common use of CAPTCHA is to label datasets that AI has difficulty identifying. This could be for books, street numbers, or fire hydrants. While this is useful for science, it has also been used as a way for companies to leverage human recognition ability for commercial gain without their users’ knowledge.</p><p>With the Cryptographic Attestation of Personhood, this does not happen. We have more flexibility designing the user flow, as we are not constrained by the CAPTCHA challenge model any more.</p>
    <div>
      <h3>What Cloudflare is doing to push privacy even further</h3>
      <a href="#what-cloudflare-is-doing-to-push-privacy-even-further">
        
      </a>
    </div>
    <p>While the Cryptographic Attestation of Personhood has a lot of upsides in terms of privacy, it is not perfect. Cloudflare still needs to know your manufacturer to let you in. As WebAuthn works with any certificate, we need to make sure Cloudflare receives certificates from untampered hardware keys. We would prefer to not have that information, further preserving your privacy.</p><p>We have worked on privacy standards in the past, leading the efforts with Privacy Pass for instance. <a href="http://privacypass.cloudflare.com/">Privacy Pass</a> allows you to solve a challenge once and provide a proof you passed it, meaning you don’t have to solve multiple CAPTCHAs. It greatly improved the user experience of VPN users, who face more challenges than other Internet users.</p><p>For the Cryptographic Attestation of Personhood, we dig into an emerging field in cryptography called <a href="https://en.wikipedia.org/wiki/Zero-knowledge_proof">Zero Knowledge proofs</a> (ZK proof). It allows our users to prove their manufacturer is part of a set of manufacturers trusted by Cloudflare. Using a ZK proof, the devices from a single manufacturer become indistinguishable from each other, as well as from devices from other manufacturers. This new system requires more technical details and deserves a dedicated blog post. Stay tuned.</p>
    <div>
      <h2>A never-ending quest</h2>
      <a href="#a-never-ending-quest">
        
      </a>
    </div>
    <p>Designing a challenge aimed at protecting millions of Internet properties is no easy task. In the current setup, we believe Cryptographic Attestation of Personhood offers strong security and usability guarantees compared to traditional CAPTCHA challenges. During a preliminary user study, users indicated a strong preference for touching their hardware key over clicking on pictures. Nevertheless, we know that this is a new system with room for improvement.</p><p>This experiment will be available on a limited basis in English-speaking regions. This allows us to have diversity in the pool of users and test this process in various locations. However, we recognize this is insufficient coverage, and we intend to test further. If you have specific needs, feel free to reach out.</p><p>Another issue that we keep a close eye on is security. The security of this challenge depends on the underlying hardware provided by trusted manufacturers. We are confident they are secure. If any breach were to occur, we would be able to quickly deauthorize manufacturers’ public keys at various levels of granularity.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3hbowSpXCTcXvLzs9vgpGV/8b46a33594c8b7f70f268dff8d05d44a/image3-3.png" />
            
            </figure><p>We also have to consider the possibility of facing automated button-pressing systems. A <a href="https://en.wikipedia.org/wiki/Drinking_bird">drinking bird</a> able to press the capacitive touch sensor could pass the Cryptographic Attestation of Personhood. At best, the bird solving rate matches the time it takes for the hardware to generate an attestation. With our current set of trusted manufacturers, this would be slower than the solving rate of professional CAPTCHA-solving services, while allowing legitimate users to pass through with certainty. In addition, existing Cloudflare mitigations would remain in place, efficiently protecting Internet properties.</p>
    <div>
      <h2>Final words</h2>
      <a href="#final-words">
        
      </a>
    </div>
    <p>For Cloudflare, it always comes back to: helping build a better Internet. The very idea that we’re all wasting 500 years per day on the Internet — that nobody had revisited the fundamental assumptions of CAPTCHAs since the turn of the century — seemed absurd to us.</p><p>We’re very proud of the work we’ve done here to release the Cryptographic Attestation of Personhood. This challenge has been built with a user-first approach while maintaining a high level of security for accessing Internet properties sitting behind Cloudflare’s global network. We’re now in the process of augmenting our existing humanity challenge with the Cryptographic Attestation of Personhood. You should expect to see it more frequently as time passes. You can try it out today at <a href="https://cloudflarechallenge.com">cloudflarechallenge.com</a>.</p><p>We want to acknowledge the work of other teams at Cloudflare. While this work is led by the Research team, we have been extremely privileged to get support from all across the company. If you want to help us build a better Internet, <a href="https://www.cloudflare.com/careers/jobs/?department=Technology%20Research&amp;location=default">we are hiring</a>.</p><p>Finally: we’re excited to bring about the demise of the fire hydrant on the Internet. It’s no longer needed.</p>
    <div>
      <h2>Feedback and Common errors</h2>
      <a href="#feedback-and-common-errors">
        
      </a>
    </div>
    <p>As this is currently an experimental project from the Cloudflare Research Team, only USB and NFC security keys work today. We welcome feedback and will look into adding other authenticators as soon as possible. If you use an unsupported device, you are likely to get a somewhat cryptic error message from your browser. On Google Chrome you would see:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2DhXRbBd40SKy0GhKMfMXT/73a01215e4eaf839fce6e8bf73598d4a/Chrome-WebAuth-Fail-Screenshot.png" />
            
            </figure><p>If you would like to give us feedback on the Cryptographic Attestation of Personhood please fill out our <a href="https://forms.gle/HQxJtXgryg4oRL3e8">Google Form</a>.</p> ]]></content:encoded>
            <category><![CDATA[Product News]]></category>
            <category><![CDATA[Security]]></category>
            <category><![CDATA[Research]]></category>
            <category><![CDATA[CAPTCHA]]></category>
            <guid isPermaLink="false">XSbvOPEifCz8EVsPEDM4h</guid>
            <dc:creator>Thibault Meunier</dc:creator>
        </item>
        <item>
            <title><![CDATA[A Name Resolver for the Distributed Web]]></title>
            <link>https://blog.cloudflare.com/cloudflare-distributed-web-resolver/</link>
            <pubDate>Wed, 13 Jan 2021 12:00:00 GMT</pubDate>
            <description><![CDATA[ We are proud to announce a new resolver for the Distributed Web, where IPFS content indexed by the Ethereum Name Service (ENS) can be accessed. ]]></description>
            <content:encoded><![CDATA[ <p>The Domain Name System (DNS) matches names to resources. Instead of typing 104.18.26.46 to access the Cloudflare Blog, you type blog.cloudflare.com and, using DNS, the <a href="https://www.cloudflare.com/learning/dns/glossary/what-is-a-domain-name/">domain name</a> resolves to 104.18.26.46, the Cloudflare Blog IP address.</p><p>Similarly, distributed systems such as Ethereum and IPFS rely on a naming system to be usable. DNS could be used, but its resolvers’ attributes run contrary to properties valued in distributed Web (dWeb) systems. Namely, dWeb resolvers ideally provide (i) locally verifiable data, (ii) built-in history, and (iii) no single trust anchor.</p><p>At Cloudflare Research, we have been exploring alternative ways to resolve queries to responses that align with these attributes. We are proud to announce a new resolver for the Distributed Web, where IPFS content indexed by the <a href="http://ens.domains/">Ethereum Name Service</a> (ENS) can be accessed.</p><p>To discover how it has been built, and how you can use it today, read on.</p>
    <div>
      <h2>Welcome to the Distributed Web</h2>
      <a href="#welcome-to-the-distributed-web">
        
      </a>
    </div>
    
    <div>
      <h3>IPFS and its addressing system</h3>
      <a href="#ipfs-and-its-addressing-system">
        
      </a>
    </div>
    <p>The InterPlanetary FileSystem (IPFS) is a peer-to-peer network for storing content on a distributed file system. It is composed of a set of computers called nodes that store and relay content using a common addressing system.</p><p>This addressing system relies on the use of <a href="https://github.com/multiformats/cid">Content IDentifiers</a> (CID). CIDs are self-describing identifiers, because the identifier is derived from the content itself. For example, QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco is the CID version 0 (CIDv0) of the <a href="https://en.wikipedia-on-ipfs.org">wikipedia-on-ipfs homepage</a>.</p><p>To understand why a CID is defined as self-describing, we can look at its binary representation. For QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco, the CID looks like the following:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2A7q58l0WReM66ndaxQmms/fc93187cd6f358d2291944cd57cbc419/image2-1.png" />
            
            </figure><p>The first byte identifies the hash algorithm used to generate the CID (sha2-256 in this case); then comes the length of the digest (32 bytes for a sha2-256 hash), and finally the hash of the content itself. When referring to the <a href="https://github.com/multiformats/multicodec/blob/master/table.csv">multicodec table</a>, it is possible to understand how the content is encoded.</p><table><tr><td><p><b>Name</b></p></td><td><p><b>Code (in hexadecimal)</b></p></td></tr><tr><td><p>identity</p></td><td><p>0x00</p></td></tr><tr><td><p>sha1</p></td><td><p>0x11</p></td></tr><tr><td><p>sha2-256</p></td><td><p>0x12 = 00010010</p></td></tr><tr><td><p>keccak-256</p></td><td><p>0x1b</p></td></tr></table><p>This encoding mechanism is useful, because it creates a unique and upgradable content-addressing system across multiple protocols.</p><p>If you want to learn more, have a look at <a href="https://proto.school/#/anatomy-of-a-cid">ProtoSchool’s tutorial</a>.</p>
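<p>You can observe this structure yourself by decoding a CIDv0, which is a base58btc-encoded multihash. A short Python sketch:</p>

```python
# Decode a CIDv0 to expose its self-describing multihash structure:
# one byte for the hash function (0x12 = sha2-256), one byte for the
# digest length (0x20 = 32), then the 32-byte digest itself.
BASE58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58_decode(s: str) -> bytes:
    num = 0
    for ch in s:
        num = num * 58 + BASE58.index(ch)
    raw = num.to_bytes((num.bit_length() + 7) // 8, "big")
    # leading '1' characters encode leading zero bytes
    return b"\x00" * (len(s) - len(s.lstrip("1"))) + raw

cid = "QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco"
raw = base58_decode(cid)
print(hex(raw[0]), hex(raw[1]), len(raw) - 2)  # 0x12 0x20 32
```

The two prefix bytes are exactly the sha2-256 entry from the multicodec table above, followed by the digest length, so any software reading the CID knows how to interpret it.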
    <div>
      <h3>Ethereum and decentralised applications</h3>
      <a href="#ethereum-and-decentralised-applications">
        
      </a>
    </div>
    <p>Ethereum is an account-based blockchain with smart contract capabilities. Each account is identified by an address, and account state is modified by operations grouped into blocks and sealed by Ethereum’s consensus algorithm, Proof-of-Work.</p><p>There are two categories of accounts: user accounts and contract accounts. User accounts are controlled by a private key, which is used to sign transactions from the account. Contract accounts hold bytecode, which is executed by the network when a transaction is sent to their account. A transaction can include both funds and data, allowing for rich interaction between accounts.</p><p>When a transaction is created, it gets verified by each node on the network. For a transaction between two user accounts, the verification consists of checking the origin account signature. When the transaction is between a user and a smart contract, every node runs the smart contract bytecode on the Ethereum Virtual Machine (EVM). Therefore, all nodes perform the same suite of operations and end up in the same state. If one actor is malicious, nodes will not add its contribution. Since nodes have diverse ownership, they have an incentive not to cheat.</p>
    <div>
      <h2>How to access IPFS content</h2>
      <a href="#how-to-access-ipfs-content">
        
      </a>
    </div>
    <p>As you may have noticed, while a CID describes a piece of content, it doesn't describe where to find it. The location of the content has to be retrieved by querying an IPFS node.</p><p>An IPFS URL (Uniform Resource Locator) looks like this: <code>ipfs://QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco</code>. Accessing this URL means retrieving <code>QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco</code> using the IPFS protocol, denoted by ipfs://. However, typing such a URL is quite error-prone. Also, these URLs are not very human-friendly, because there is no good way to remember such long strings. To get around this issue, you can use DNSLink. DNSLink is a way of specifying IPFS CIDs within a DNS TXT record. For instance, <a href="http://wikipedia-on-ipfs.org">wikipedia on ipfs</a> has the following TXT record</p><p><code>$ dig +short TXT _dnslink.en.wikipedia-on-ipfs.org</code></p><p><code>_dnslink=/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco</code></p><p>In addition, its A record points to an IPFS gateway. This means that, when you access en.wikipedia-on-ipfs.org, your request is directed to an IPFS HTTP Gateway, which then looks up the CID using your domain's TXT record, and returns the content associated with this CID using the IPFS network.</p><p>This trades ease of access against security. The web browser of the user doesn't verify the integrity of the content served. This could be because the browser does not implement IPFS, or because it has no way of validating the domain signature via <a href="https://www.cloudflare.com/dns/dnssec/how-dnssec-works/">DNSSEC</a>. We wrote about this issue in our previous blog post on <a href="/e2e-integrity/">End-to-End Integrity</a>.</p>
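<p>Extracting the CID from such a TXT record is straightforward. A minimal Python sketch, assuming the record value has already been fetched (for example with dig as above):</p>

```python
# Parse a DNSLink TXT record value into the CID an IPFS gateway should fetch.
# The value has the shape "dnslink=/ipfs/<cid>" (shown above as "_dnslink=...").
def dnslink_cid(txt: str) -> str:
    value = txt.split("=", 1)[1]              # "/ipfs/<cid>"
    scheme, cid = value.lstrip("/").split("/", 1)
    if scheme != "ipfs":
        raise ValueError("not an /ipfs/ DNSLink record")
    return cid

record = "_dnslink=/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco"
print(dnslink_cid(record))  # QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco
```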
    <div>
      <h2>Human-readable identifiers</h2>
      <a href="#human-readable-identifiers">
        
      </a>
    </div>
    <p>DNS simplifies referring to IP addresses, in the same way that postal addresses are a way of referring to geolocation data, and contacts in your mobile phone abstract phone numbers. All these systems provide a human-readable format and reduce the error rate of an operation.</p><p>To verify these data, the trusted anchors, or “sources of truth”, are:</p><ul><li><p><a href="https://www.cloudflare.com/en-gb/dns/dnssec/root-signing-ceremony/">Root DNS Keys</a> for DNS.</p></li><li><p>The government registry for postal addresses. In the UK, addresses <a href="https://www.nidirect.gov.uk/articles/how-streets-are-named-and-numbered">are handled</a> by cities, boroughs and local councils.</p></li><li><p>When it comes to your contacts, you are the trust anchor.</p></li></ul>
    <div>
      <h2>Ethereum Name Service, an index for the Distributed Web</h2>
      <a href="#ethereum-name-service-an-index-for-the-distributed-web">
        
      </a>
    </div>
    <p>An account is identified by its address. An address starts with "0x" and is followed by 20 bytes (<a href="https://ethereum.github.io/yellowpaper/paper.pdf">ref 4.1 Ethereum yellow paper</a>), for example: 0xf10326c1c6884b094e03d616cc8c7b920e3f73e0. This is not very readable, and can be pretty scary when transactions are not reversible and one can easily mistype a single character.</p><p>A first mitigation strategy was to introduce a new notation that capitalises some letters based on the hash of the address: 0xF10326C1c6884b094E03d616Cc8c7b920E3F73E0. This can help detect mistypes, but it is still not readable. If I have to send a transaction to a friend, I have no way of confirming she hasn't mistyped the address.</p><p>The <a href="https://ens.domains/">Ethereum Name Service</a> (ENS) was created to tackle this issue. It is a system capable of turning human-readable names, referred to as domains, into blockchain addresses. For instance, the domain <a href="https://app.ens.domains/name/privacy-pass.eth">privacy-pass.eth</a> points to the Ethereum address 0xF10326C1c6884b094E03d616Cc8c7b920E3F73E0.</p><p>To achieve this, the system is organised in <a href="https://docs.ens.domains/">two components</a>, registries and resolvers.</p><p>A registry is a smart contract that maintains a list of domains and some information about each domain: the domain owner and the domain resolver. The owner is the account allowed to manage the domain. They can create subdomains and change ownership of their domain, as well as modify the resolver associated with their domain.</p><p>Resolvers are responsible for keeping records. For instance, Public Resolver is a smart contract capable of associating a name not only with blockchain addresses, but also with an IPFS content identifier. The resolver address is stored in a registry. Users then contact the registry to retrieve the resolver associated with the name.</p><p>Consider a user, Alice, who has direct access to the Ethereum state. 
The flow goes as follows: Alice would like to get Privacy Pass’s Ethereum address, for which the domain is privacy-pass.eth. She looks for privacy-pass.eth in the ENS Registry and figures out the resolver for privacy-pass.eth is at 0x1234... . She now looks for the address of privacy-pass.eth at the resolver address, which turns out to be 0xf10326c....</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6yVOgswqi6HQ38qlhTaTdK/0956f47fa306bebb2233e9601d0b430c/image1-3.png" />
            
            </figure><p>Accessing the IPFS content identifier for privacy-pass.eth works similarly. The resolver is the same, only the accessed data is different — Alice calls a different method from the smart contract.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1CjIbo1euB3DG5EZgnL3Bm/9b59d7bd3c7fa63870238274ef5a0200/image5.png" />
            
            </figure>
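<p>Under the hood, ENS contracts are keyed not on raw name strings but on a fixed-size node derived by the namehash algorithm. Here is a sketch of the recursion in Python; note that real ENS uses keccak-256, which is not in Python’s standard library, so hashlib.sha3_256 stands in purely to illustrate the structure (the resulting node values differ from ENS’s):</p>

```python
import hashlib

# Real ENS uses keccak-256; hashlib.sha3_256 is a stand-in for illustration.
def namehash(name: str) -> bytes:
    if name == "":
        return b"\x00" * 32              # the root node is 32 zero bytes
    label, _, rest = name.partition(".")
    label_hash = hashlib.sha3_256(label.encode()).digest()
    # hash the parent node together with the hash of the leftmost label
    return hashlib.sha3_256(namehash(rest) + label_hash).digest()

node = namehash("privacy-pass.eth")
print(len(node))  # 32

# Alice's two lookups then become two contract calls (hypothetical stubs):
#   resolver_addr = registry.resolver(node)  # 1. ask the ENS Registry
#   eth_addr      = resolver.addr(node)      # 2. ask that resolver
```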
    <div>
      <h2>Cloudflare Distributed Web Resolver</h2>
      <a href="#cloudflare-distributed-web-resolver">
        
      </a>
    </div>
    <p>The goal was to be able to use this new way of indexing IPFS content directly from your web browser. However, accessing the ENS registry requires access to the Ethereum state, and retrieving the content it points to requires access to the IPFS network.</p><p>To tackle this, we are going to use Cloudflare’s Distributed Web Gateway. Cloudflare operates both an Ethereum Gateway and an IPFS Gateway, respectively available at cloudflare-eth.com and cloudflare-ipfs.com.</p>
    <div>
      <h3>EthLink</h3>
      <a href="#ethlink">
        
      </a>
    </div>
    <p>The <a href="https://github.com/wealdtech/coredns-ens">first version</a> of EthLink was built by Jim McDonald and is operated by True Name LTD at eth.link. Starting from next week, eth.link will transition to using the Cloudflare Distributed Web Resolver. To that end, we have built EthLink on top of Cloudflare Workers. It is a proxy to IPFS: it proxies all ENS-registered domains when .link is appended. For instance, privacy-pass.eth should render the Privacy Pass homepage. From your web browser, <a href="https://privacy-pass.eth.link">https://privacy-pass.eth.link</a> does it.</p><p>The resolution is done at the Cloudflare edge using a Cloudflare Worker. Cloudflare Workers allows JavaScript code to be run on Cloudflare infrastructure, eliminating the need to maintain a server and increasing the reliability of the service. In addition, it follows the Service Workers API, so results returned from the resolver can be checked by end users if needed.</p><p>To do this, we set up a wildcard DNS record for *.eth.link to be proxied through Cloudflare and handled by a Cloudflare Worker. When a user, Alice, accesses <a href="https://privacy-pass.eth.link">privacy-pass.eth.link</a>, the worker first retrieves from Ethereum the CID of the content to be served. Then, it requests the content matching this CID from IPFS, and returns it to Alice.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/WivGoQhRDcSaDh99sVdNI/0e9bca31a378146d49d92c3af71180e8/image3.png" />
            
            </figure><p>All parts can be run locally. The worker can run in a Service Worker, the Ethereum Gateway can point to a local Ethereum node, and the IPFS Gateway can be replaced by the one provided by IPFS Companion. This means that while Cloudflare provides resolution-as-a-service, none of the components has to be trusted.</p>
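<p>The two-step resolution performed by the worker can be sketched with hypothetical stub functions standing in for the Ethereum and IPFS gateway calls:</p>

```python
# Sketch of the eth.link worker's flow with hypothetical stubs; the real
# worker calls the cloudflare-eth.com and cloudflare-ipfs.com gateways.
def resolve_ens_to_cid(name: str) -> str:
    # stub for the Ethereum gateway lookup (ENS name -> IPFS CID)
    return {"privacy-pass.eth": "QmExampleCid"}[name]

def fetch_from_ipfs(cid: str) -> str:
    # stub for the IPFS gateway fetch (CID -> content)
    return f"<content for {cid}>"

def handle_request(host: str) -> str:
    ens_name = host.removesuffix(".link")  # "privacy-pass.eth.link" -> "privacy-pass.eth"
    cid = resolve_ens_to_cid(ens_name)     # 1. get the content's CID from Ethereum
    return fetch_from_ipfs(cid)            # 2. fetch that CID from IPFS

print(handle_request("privacy-pass.eth.link"))  # <content for QmExampleCid>
```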
    <div>
      <h2>Final notes</h2>
      <a href="#final-notes">
        
      </a>
    </div>
    <p>So <a href="https://arewedistributedyet.com/">are we distributed yet</a>? No, but we are getting closer, building bridges between emerging technologies and current web infrastructure. By providing a gateway dedicated to the distributed web, we hope to make these services more accessible to everyone.</p><p>We thank the ENS team for their support in bringing a new resolver to the distributed web. The ENS team has been running a similar service at <a href="https://eth.link">https://eth.link</a>. On January 18th, they will switch <a href="https://eth.link">https://eth.link</a> to using our new service.</p><p>These services benefit from the added speed and security of the Cloudflare Workers platform, while paving the way to run distributed protocols in browsers.</p>
            <category><![CDATA[Research]]></category>
            <category><![CDATA[IPFS]]></category>
            <category><![CDATA[Ethereum]]></category>
            <category><![CDATA[Distributed Web]]></category>
            <guid isPermaLink="false">5GZYZoddJJgvOcmn2ALKWL</guid>
            <dc:creator>Thibault Meunier</dc:creator>
        </item>
    </channel>
</rss>