The Cloudflare Blog

Why we cannot wait for better post-quantum signature algorithms

Bas Westerbaan — Thu, 09 Jul 2026 14:00:00 GMT

RSA and ECC, cryptographic algorithms that we’ve all relied on for decades, are vulnerable to the attack of sufficiently advanced quantum computers. Such quantum computers do not exist yet, but they seem to be coming sooner than expected. Luckily, the solution is already available: migrate to ML-KEM encryption and ML-DSA signatures, which are designed to be resistant to quantum attack. They were standardized in 2024 by the U.S. National Institute of Standards and Technology (NIST) after an eight-year open international competition.

The migration to post-quantum cryptography is in full swing now. At the time of writing, the majority of traffic handled by Cloudflare is already using ML-KEM encryption, and is thus secured against the threat to data posed by harvest-now-decrypt-later attacks. But encryption is only one part of the equation: to be fully secure against quantum computers capable of breaking classical cryptography, we aim to deploy post-quantum signatures to protect authentication systems from unauthorized access. We are targeting 2029 for Cloudflare to be fully post-quantum secure.

ML-DSA, the best all-around post-quantum signature scheme standardized today, has its downsides: it’s much larger on the wire, and many tricks we were able to perform with RSA and ECC simply cannot be done with ML-DSA. There are better post-quantum signature schemes on the horizon: last month, NIST announced that it is advancing nine post-quantum signature schemes to the third round of the “signatures on-ramp”. And a draft standard for FN-DSA (née Falcon), which was picked from the previous competition, is expected imminently.

We have been very interested in advances in post-quantum signature algorithms, and wrote about the progress in 2021, 2022, 2024, and 2025. In this blog post we’ll treat you to the latest developments in great detail.

But first we have to deal with the elephant in the room: These new signature algorithms will not be ready in time for the PQ transition — not even close, as we will see later on. The problem is arriving too soon for us to wait. ML-DSA is available today, and it will have to do for the first migration. As Eric Rescorla wrote in 2024:

You go to war with the algorithms you have, not the ones you wish you had.

Nonetheless, the search for better post-quantum signature algorithms is crucial for several reasons, and we firmly believe it is still the best use of NIST’s limited resources.

Let’s have a look at the signature algorithms in detail. After that we’ll look at the timeline for their availability, and the reasons why we still need them.

The signature algorithms

In the table below, we compare the candidate signature algorithms that progressed to the third round (marked by 🤔), with classical algorithms vulnerable to quantum attack (marked by ❌), and the post-quantum algorithms that are already standardized ( ✅) or soon will be (📝). Each candidate proposes several variants. We list the most relevant variants to TLS, the protocol used to secure connections on the Internet. To explore all variants, check out Thom Wiggers' signatures zoo.

			Sizes (bytes)	CPU time (lower is better)
Family	Name _variant	A	Public key	Signature	Signing	Verification
Elliptic curves	Ed25519	❌	32	64	0.15	1.3
Factoring	RSA ₂₀₄₈	❌	272	256	80	0.4
Lattices	ML-DSA ₄₄	✅	1,312	2,420	1 (baseline)	1 (baseline)
Symmetric	SLH-DSA _128s	✅	32	7,856	14,000	40
SLH-DSA _128f	✅	32	17,088	720	110
SLH-DSA _128-24	📝	32	3,856	7,000,000 ⚠️	4
LMS _{M24_H20_W8}	✅	48	1,112	2.9 ⚠️	8.4
Lattices	FN-DSA ₅₁₂	📝	897	666	3 ⚠️	0.7
Lattices	HAWK ₅₁₂	🤔	1,024	555	0.25	1.2
Proof of knowledge	MQOM _{L1-gf16-fast-5r}	🤔	60	3,280	8	20
SDitH _{SDitH2-L1-gf2-fast}	🤔	70	4,484	15	40
FAEST _EM-128f	🤔	32	5,060	4.2	9
Isogeny	SQIsign _I	🤔	65	148	300 ⚠️	50
Multivariate	MAYO _one	🤔	1,420	454	2.1	0.4
MAYO _two	🤔	4,912	186	1.1	0.8
QR-UOV _{I-(127 156 54 3)}	🤔	24,225	200	9.3	20
SNOVA _(24,5,4)	🤔	1,016	248	1.2	1.7
SNOVA _(25,8,3)	🤔	2,320	165	1	1.5
SNOVA _(37,17,2)	🤔	9,842	124	0.8	1.3
UOV _Is-pkc	🤔	66,576	96	0.3	2.4
UOV _Ip-pkc	🤔	43,576	128	0.3	2

A few more remarks on this table: Most candidates have multiple variants in every security level. We show the most relevant variants for TLS at the 128-bit security level, the gold standard for security. CPU times are taken from the signatures zoo in June 2026, which collected them from the round two submission documents and later advances. Candidates are allowed to make changes for the third round, which will influence these numbers. Some will improve (both in compute and size), whereas others will regress to counter new attacks. Check out the zoo for the latest numbers. We marked FN-DSA and SQIsign signing with a ⚠️️, as both are hard to implement in a fast and timing side-channel secure manner. LMS signing has a ⚠️, as secure LMS signing requires keeping state across signatures, and the listed signing time assumes a 32MB cache. The 128-24 variant of SLH-DSA is marked with a ⚠️️ as it’s meant to create fewer than 2²⁴ signatures.

No "all-star" algorithm

One thing that stands out immediately is that the quantum-vulnerable elliptic curves signature algorithm Ed25519 is by far the best all-around choice (ignoring its quantum vulnerability): it has the best numbers in almost every single metric, including public key size, signature size, and signing time. It’s only beaten on verification time, but it’s more than fast enough for the vast majority of applications.

This is quite different than the roster of post-quantum algorithms. Instead of a single "all-star" algorithm, we have roughly two categories of schemes: the "specialists" that approach our trusty elliptic curve signatures on some metrics, but are problematic on others, which make them great in the right deployment scenario. Then there are the “generalists”, such as ML-DSA, which don’t perform as well as elliptic curves on all metrics, but so far as downsides go, are pretty balanced.

Specialists

Let’s start with the specialists.

SQIsign: small signatures / slow signing

If you just look at the bytes on the wire, then SQIsign looks like an almost perfect drop-in replacement for elliptic curve cryptography. With signatures of 148 bytes and public keys of 65 bytes, it beats RSA-2048. Unfortunately there is no free lunch: SQIsign has three weak points. First, it’s the most complex algorithm on the docket. Secondly, its signature creation and verification is quite slow. Finally, it’s difficult to implement signature creation in a timing side-channel secure way and doing so comes with a performance penalty to boot.

That doesn’t sound great so far, but it was much worse: when we had a look back in 2024, there was not yet any timing side-channel secure implementation and signature verification was 20x slower. Furthermore there has been welcome progress on simplifying the scheme.

Despite these dramatic improvements, it is unlikely (side-channel secure) signing will be fast enough in the foreseeable future to be used in typical online cases such as the TLS handshake. However, for offline cases, such as CA signatures or DNSSEC, where it’s the verification time that’s more important than the signing time, SQIsign might have an application.

But the topic we should really discuss is security. SQIsign is based on isogenies. Rather famously, SIKE, another algorithm based on isogenies, got broken badly in a late stage of the first NIST PQC competition that standardized ML-DSA. SIKE is often brought up as a cautionary example showing that post-quantum cryptography could break suddenly. This requires some nuance. First, there were already concerns about SIKE’s security, and in particular the torsion points that led to the break. Because of these concerns, SIKE was not selected for standardization, but deferred to an additional round of evaluation before it was broken. (Indeed, this is an example of the NIST process working well.) SQIsign doesn’t use torsion points, and there is no similar concern as there was for SIKE.

One other notable security property is that the best known attacks on SQIsign are generic brute force, just like with classical attacks on well-selected elliptic curves. This is quite different from RSA, lattices, and multivariate where the attack algorithms have been slowly improving, pushing the parameters towards bigger signatures. Nonetheless, the mathematics behind isogenies is very rich, and compared to the other algorithms, there is a lot of mathematical attack surface. Still, its security seems more sound than the structured multivariate algorithms we’ll discuss later.

SQIsign is an algorithm with tremendous potential. It’d be a shame to standardize it too early. To the authors, we’d like to share the following wishlist:

Ideally verification time is decreased even further, even if this trades off against signing time and signature size: SQIsign signatures are already small enough, and offline signing time has some slack anyway.
The timing side-channel secure implementation should be the default, especially if signing time is decreased further, which would tempt some online signing applications.
But above all, our wish is for SQIsign to be simplified.

UOV: tiny signatures / huge public keys

UOV (unbalanced oil and vinegar) is a classic multivariate signature algorithm originally proposed in 1999. It has tiny signatures: only 96 bytes. The trade-off? A huge public key: 66kB. That wouldn’t help for a TLS server certificate, whose public key is transmitted over the wire when setting up a connection, but it would be a help for cases where the public key is predistributed.

Let’s take the WebPKI as an example. A typical browser trusts about a hundred root certificates and 30 certificate transparency logs, whose public keys would add up to about 8MB when using UOV.

The public keys and signatures in a typical TLS connection.

Since the root certificate is transmitted to clients out of band, one idea is to use a UOV signature there. But this is not a slam dunk; because of its size, a UOV root certificate would be impractical to be cross-signed where the root is used as an intermediate. At the same time, cross signs and intermediates become less attractive anyway with any larger post-quantum signatures. This encourages more root certificates to be included directly with clients. This would again favor UOV, but to a point: if the number of root certificates grows above a thousand, we’d be dealing with more than 66MB of key material, which would make up a substantial portion of the browsers’ download size (e.g., 90MB for Firefox 151.)

Multivariate security

What about the security? Over the years, many variants of UOV have been proposed that use some extra mathematical structure to reduce the size of the public key. These structured multivariate schemes have had a spotty track record with schemes such as Rainbow and GeMMS being broken quite badly. It is important to distinguish these from UOV itself, whose security track record is much better, but not perfect.

As with many cryptographic schemes, there were growing pains in the early years, as basic attacks and parametrization pitfalls were discovered. In fact, the “U” in UOV is a remnant of that: it stands for unbalanced, which is a fix to a parameter-setting mistake in the 1997 oil-and-vinegar scheme on which UOV is based: the original scheme had an equal number of oil and vinegar variables in the quadratic system of equations used as the public key, which turns out to allow for an attack. In case you’re curious about the colorful name: the system of equations contains vinegar x vinegar and oil x vinegar, but no oil x oil terms. It’s like vinaigrette with small separate oil droplets. Back to the history: from 2005 to 2020 was a quiet period for multivariate signatures: understanding of UOV grew, but there weren’t any new attacks on typical parameters.

This changed in 2020 with the discovery of the intersection attack which built on the ideas of the original attack on balanced oil-and-vinegar. The intersection attack removes about 30 bits of security from a then-proposed 128-bit parameter set. A considerable blow, but not fatal: slightly adjusting parameters mitigates the attack completely, with minor increase in key and signature size.

A bigger shock was the 2025 publication of the idea to use wedges to attack multivariate schemes. The initial impact on UOV was minor: only a few bits (again at the 128-bit security level.) The worry was that this idea came out of left field, and it wasn’t clear how far the approach could be taken. That concern was partly justified: the wedges idea was very fruitful and several subsequent attacks have been built on it, reducing security by about 15 bits. However, it also became clear that the wedges attack and generalizations can be seen as a special case of an intersection attack over truncated rings — thus much more familiar than we thought. Again, these attacks can be mitigated with only minor increases in key and signature size.

What to make of all of this? Such a history of attacks is not uncommon: over the last 25 years lattices have seen larger reductions in security, although this has calmed down over the recent years. Notwithstanding, lattice-based cryptography deployed in production today uses conservative parameter sets well above 128-bits to hedge against future cryptanalysis. We’d want to do the same with UOV. Signature size only grows linearly with the security level, costing just 260 bytes even at the 256-bit security level. Unfortunately, the public key size is cubic in security level: 446kB for 256-bit. Conveniently, UOV (as most multivariate schemes) has a lot of flexibility in picking parameter sets at various intermediate security levels.

UOV is a foundational scheme with narrow but real use cases. Going forward, we’d like to see a parameter set with a bit of margin above 128 bits, say 160 bits, to hedge against future cryptanalytic improvements.

QR-UOV: small signatures / large public keys

Like SNOVA and MAYO which we’ll discuss later on, QR-UOV is a structured multivariate scheme: it’s a variant of UOV that adds more structure to the public key to reduce its size. The gains are modest: at best we’re looking at 12kB public keys, but signature verification is impractically slow for that particular parameter set. The more realistic parameter sets start at 24kB public keys.

With respect to security, QR-UOV is the only multivariate scheme that did not have to adjust its original (round one) parameters in response to new attacks. This is somewhat surprising as any attack on UOV can also be applied to QR-UOV. The explanation is that the attacks do apply, but the natural parameters for QR-UOV happen to make them ineffective. On the other hand, there were already several attacks known that use the specific extra structure that QR-UOV adds: indeed, for some of the parameter sets, the structure-specific attacks are the best attacks. This should be contrasted with MAYO, where there is no known attack against the extra structure MAYO adds. (We’ll get back to MAYO and SNOVA later in this post.)

Compared to last round, QR-UOV signing and verification time improved significantly, but it is still comparatively slow. All in all, QR-UOV is a hard sell: it adds exploitable structure to UOV without pushing key sizes down to general-purpose sizes.

Hash-based signatures

Stateful hash-based signatures

The very first standardized post-quantum signature algorithms are the stateful hash-based LMS, HSS and XMSS^(MT). They have very small public keys, and for many parameter sets the signatures are much smaller than those of ML-DSA-44. To boot, their security is based on that of hashes, which are well-understood and already a cornerstone of cryptography. That makes hash-based signature algorithms a very conservative choice, and there is no need to hedge with higher security levels.

So, what’s the catch?

There are two. The big one is keeping the eponymous state. These stateful hash-based signature schemes are built out of one-time-signature keys which are collected into Merkle trees. The signer has to keep track of which one-time-signature keys have been used, which can be as simple as just a counter. If the signer mucks it up, though, and accidentally uses the same one-time-signature key twice on a different message, then anyone can likely use those two signatures to create their own signature on any message. You have to keep a lot in mind to keep the state correctly. Some considerations: you want to make sure that updates are written to storage before handing out the signature; you don’t want the old state to be restored from a backup; and you can’t export/import a private key from one place to another without agreeing on how to split or keep the state. The state is, as Adam Langley pointed out several years ago, a huge foot-cannon.

Another downside is that the most competitive parameter sets can only create a modest number of signatures. The 1,112 byte signatures (as listed in the table above) can only be used to create about a million signatures. You can explore the trade-offs with this calculator.

Together this leaves a very small niche for stateful hash-based signatures: signers have to be able to keep state; signature size has to be a real concern; and signers have to be OK with a hard limit on the number of signatures.

SLH-DSA: conservative security / large and slow

SLH-DSA is a hash-based signature that doesn’t have the low signature limit and avoids the problem of keeping the state. The basic idea is to make the number of one-time-signature keys so large that you can pick one at random without having to worry about using the same one twice, since the chance of picking the same one twice is diminishingly small. SLH-DSA is a bit more efficient than that, by replacing the one-time-signature key as a building block with a few-time-signature key, where security degrades gracefully if keys are occasionally reused. It still comes at a cost. SLH-DSA has two variants, one that optimizes for small signature size, and one that optimizes for fast signing. The size-optimized one is not small at all at 8kB, and the signing-optimized one is even slower than SQIsign.

Fewer signature variants of SLH-DSA

NIST has proposed to standardize an additional parameter set for SLH-DSA with much smaller signatures, but that can only be used to create about 16 million signatures before security reduces. At 3.8kB the signatures are still larger than those of ML-DSA-44, but the combined public key and signature size is very close. The parameter set was chosen to make signature verification fast at the cost of signing time. The signing time is very bad indeed.

Use cases

So why ever use SLH-DSA? The selling point is the conservative security. For a long-term trusted key that is hard to replace, it could make sense if the application can stomach the large signature and slow verification of the standardized variants or the slow signing time of the newly proposed one. There are two more caveats to add. First, it’s better to set things up so that key algorithms are not burned-in and can be replaced after the fact. And secondly, in most cases systems (such as secure connections with TLS) do not just depend on signatures, but also on key agreement. There is no hash-based key agreement mechanism, so we end up needing to trust something less conservative, like lattices, anyway.

FN-DSA: small key and signatures / subtle signing

Comparing the numbers, FN-DSA-512 (née Falcon) looks much better than ML-DSA-44 on almost every metric: faster verification, smaller public key, and much smaller signatures at 666 bytes. Signing is three times slower, but it’s still 25x faster than RSA-2048. To boot it’s already picked to become FIPS 206. So why don’t we consider FN-DSA to be a general-purpose algorithm?

It’s because it’s difficult to implement FN-DSA signing securely. The most well-known sharp edge of FN-DSA is that it is most naturally and efficiently implemented using hardware-accelerated floating-point arithmetic. This is a first for a cryptographic standard. One big challenge with it is that we have little experience implementing fast floating-point arithmetic in a side-channel safe way. What we know so far is that it’s subtle and not very robust: a safe implementation of FN-DSA signing using the Floating-Point Unit (FPU) for one processor might not be safe for another. Instead of relying on the FPU, the floating point operations can be emulated. This is easier to get right, but about 20 times slower, making it about as slow as RSA-2048. There has been some welcome progress recently to implement FN-DSA signing safely using fixed-point arithmetic, which is much faster than the floating-point emulation. So just use that and FN-DSA is good to go? This presumes a level of awareness that might not be warranted. Anecdotally at conferences, every time we saw a presenter compare post-quantum signature algorithms including FN-DSA in benchmarks, they couldn’t answer whether floating-point emulation was used.

Another consequence of using floating points is that it’s difficult to make test vectors for signing. Just one example of this is that the outcome of a+(b+c) and (a+b)+c are only guaranteed to be close, but not the same. That means that to have useful test vectors, the FN-DSA specification would need to be very precise on the order of floating-point operations. Another example is a*b+c, which can be computed in two steps (multiply and then add), or at once using fused-multiply-add (FMA). The latter is faster, but again gives a slightly different answer as rounding happens only once. Not all processors support FMA, but for those that do, compilers typically automatically use FMA for the performance boost. There are also mathematical optimizations that cause trouble. For instance, the reference implementation computes a value (norm) in a faster roundabout way using Parseval’s theorem. Mathematically the answer is exactly the same, but as floating-points are only an approximation, the resulting value is ever so slightly different. Similarly, the safe fixed-point arithmetic implementation gives slightly different results.

Why is this a problem? It is because it is still the humble test vector that catches most implementation bugs in practice. Other more refined methods like formal verification will certainly catch more, but test vectors are hard to beat in simplicity.

Another surprising sharp edge from not having a fixed implementation is the following. From two deterministic signatures created by slightly different implementations from the same private key, one can derive parts of that private key. FN-DSA does not use deterministic signatures, instead adding a randomizer to thwart this. There is a tension with testing: you need a deterministic interface to test signing, but you don’t want that to be used to create actual signatures.

How to deal with the wiggle room in the FN-DSA specification will undoubtedly be a point of discussion. The discrepancy between the implementations might actually have a silver lining: NIST could decide to generate the test vectors (CAVP) from the fixed-point arithmetic implementation. That the more risky floating-point implementation wouldn’t pass the test vectors would be a feature, not a bug, as it would steer implementations towards the safer fixed-point version!

You can read about a few other interesting sharp edges in this blog post. Stepping back from the specifics, the main point is that FN-DSA is a complicated scheme. It’s not a surprise that it took NIST a couple of years (not counting the current limbo) just to write the draft standard. It’ll take longer than usual for the final standard to come out and for cryptographic libraries to add support. FN-DSA is farther away than it seems. We’ll compare timelines later in this blog post.

If the numbers are still very tempting, there is one last thing you should be aware of: FN-DSA-512 is parametrized for 128-bit security compared to ML-DSA-44’s generous 160 bits. If lattice cryptanalysis improves, there is no middle security level: the next step-up is all the way to FN-DSA-1024 at 256 bits. FN-DSA-1024 has double the key and signature sizes and signing and verifying times of FN-DSA-512. An FN-DSA-1024 signature is still half the size of that of ML-DSA-44, but the public key+signature only differs about 20%.

To close the discussion of FN-DSA, it is good to emphasize that all difficulties with FN-DSA are on the signing side: the verification of an FN-DSA signature is very straightforward.

General-purpose algorithms

Now let’s turn to the algorithms that are meant to be general-purpose replacements for ML-DSA.

HAWK

HAWK is a curious case. In many aspects it’s similar to FN-DSA: a structured lattice hash-then-sign scheme with similar sizes for signatures and public keys with a missing middle security level. The main benefit of HAWK over FN-DSA is that signing is very fast and doesn’t use floating-point arithmetic, although it’s not a simple algorithm either. This comes with a trade-off: HAWK is based on and introduces a new security assumption, the lattice isomorphism problem (LIP). In 2024, two years after the introduction of HAWK, it was discovered that this problem is easy to solve in the special case of totally real number fields, which aren’t used in HAWK or any other cryptography. In 2025, this attack was extended to a broader class of number fields. This hasn’t yet applied to HAWK, but it’s getting closer. A new paper published in June 2026 suggests there is a way to extend the attack to HAWK. An error has been found in the paper, although it’s yet unclear how fundamental it is to the approach. Regardless, the trajectory is concerning.

Even ignoring the potential attacks, HAWK faces some headwinds: its additional security assumption prevents it from displacing FN-DSA, but its practical benefits (especially considering the lack of middle security level) fall short of that of the structured multivariate candidates. It also doesn't increase diversity in security assumptions, an outcome that NIST is hoping for.

Proof-of-knowledge schemes

FAEST, MQOM, and SDitH all share a similar overall structure. Their public keys are instances of some hard problem and their secret keys are the solutions.

A FAEST public key is the AES-encryption of a known plaintext under a secret key.
MQOM gets its name from the Multivariate Quadratic problem, which is closely related to (but more conservative than) the cryptographic assumptions underlying the multivariate schemes. The public key is a system of quadratic equations, and the secret key is a solution to that system of equations.
SDitH is based on the hardness of the Syndrome Decoding problem for random linear codes. This problem is related to the code-based schemes submitted to the original NIST competition, but these were eliminated in the third round.

In all cases, a signature is a zero-knowledge proof that the signer knows the solution of that hard problem, while at the same time (almost as an afterthought) acknowledging the message-to-be-signed as part of the proof.

Many signature schemes are zero-knowledge proofs like this behind the scenes, notably ML-DSA, SQIsign, and Ed25519. Why don’t we group those with proof of knowledge schemes too?

The difference is generalizability: the zero-knowledge proof used for ML-DSA is only able to prove something about a specific LWE problem as used in ML-DSA: the proof uses mathematical structure in the key. There are ways to create zero-knowledge proofs using lattices for any general statement, but those proof systems are very different from ML-DSA, and would create rather larger signatures on the order of 50kB.

In contrast, the proof system used in FAEST, MQOM, and SDitH can be used to prove arbitrary statements. For instance, FAEST can be modified to use the hard problem of MQOM instead. This leads to a more efficient scheme called KuMQuat. (We’ll get to some performance numbers later on.) Conversely, MQOM can be adjusted to use AES as the hard problem.

This flexibility is great for two reasons. First, it doesn’t require any specific mathematical structure in the hard problem used, and thus we can pick a very conservative problem such as breaking AES. Some problems lead to a more efficient signature than others, as we see with MQ as used in MQOM. MQ is still quite a conservative assumption: it does not contain the hidden subspace used in UOV and thus the other multivariate signatures. Neither the intersection nor wedges attacks apply to it. In fact, the MQ-problem is NP-hard. To be secure, one still needs to pick the correct size of the problem, and although MQ has been studied for quite a while, it certainly has not seen the same scrutiny as deployed algorithms like AES.

The second and greater benefit is that we’re able to create much more than just a plain signature scheme from a general zero-knowledge proof system: we can create blind signatures and even full-fledged anonymous credentials.

Here it’s good to note a limitation: the size of the proofs for all three grows linearly with the statement proven. In technical terms: they’re not succinct like STARKs and LaBRADOR, which outperform them handily for large statements. It’s another example where sometimes it’s better to pick the approach that’s not optimal asymptotically.

Back to advantages: apart from the hard problem chosen, and the security of hash functions, these three schemes don’t require any further security assumption. This makes FAEST as conservative as SLH-DSA.

So what’s the difference except for the chosen hard problem? These schemes started off quite differently, but have been improving and converging since the first round. The proof system in MQOM is a bit simpler than FAEST, but it also does not perform as well: KuMQuat (FAEST+MQ) outperforms MQOM.

Talking about performance, let’s start with a comparison to SLH-DSA. All three schemes have variants that outperform any standardised SLH-DSA parameter set and often by a good margin. SLH-DSA does have one distinct advantage: the verification routine is simpler to implement.

Against ML-DSA-44 the comparison is more interesting. All schemes have a smooth trade-off between runtime and signature size. To illustrate, here are trade-offs reported for KuMQuat (FAEST+MQ.) Verification times are close to signing times.

KuMQuat can be parametrized to have somewhat smaller signatures than ML-DSA-44 at the cost of long signing (and verification) runtime. At the other end, it can have similar signing time as ML-DSA-44, at the cost of larger signatures, although the public key+signature size is still similar.

These schemes have improved quite a bit over the years, and we expect some improvements still. Although they won’t improve upon ML-DSA as dramatically as some of the other schemes considered, their conservative security and especially their potential for broader applications like anonymous credentials make them very appealing. To showcase the flexibility of the underlying zero-knowledge proof system, we’d like each scheme in this category to present numbers on how well they’d perform with a different underlying hard problem.

Structured multivariate: MAYO versus SNOVA

Like QR-UOV discussed earlier, MAYO and SNOVA are variants of UOV that add extra structure to the public key to reduce its size. MAYO and SNOVA take two different approaches: SNOVA makes aggressive bets to get the best performance, whereas MAYO treads carefully with a conservative design.

SNOVA does have impressive performance. Its main parameter set has 248 byte signatures (smaller than RSA-2048!) with only a 1kB public key. It beats every other post-quantum scheme on public key+signature size and has great runtime.

MAYO’s performance is nothing to scoff at either. MAYO_one has the best verification time and its 454 byte signatures are still smaller than those of FN-DSA-512, HAWK-512, and RSA-4096. Combined with its 1,420 byte public key, MAYO_one does slightly fall behind FN-DSA-512 and HAWK-512. However, MAYO takes the lead again if we ask for some security margin. FN-DSA and HAWK have a missing middle security level and thus need to bump all the way to the 256-bit security, whereas MAYO’s granularity can add extra security at the cost of slightly increasing public key and signature sizes.

	Security	Public key	Signature	PK + Sig
HAWK-1024	256	2,440	1,221	3,661
FN-DSA-1024	256	1,793	1,280	3,079
MAYO at 174 bit security	174	1,600	550	2,150

If that wasn’t good enough, both MAYO and SNOVA allow for a trade-off between signature and public key size. Thus, we can get even smaller signatures for public keys that are transmitted ahead of time. Pushed to the extreme, MAYO becomes UOV.

So far we have discussed performance. What about the security? MAYO adds a “whipping” structure on top of UOV: any attack on UOV will also work for MAYO, but there might be attacks specific to the whipping structure of MAYO. So far no attacks on the whipping structure, and thus on MAYO specifically, have been found. The worst that has happened is that some UOV attacks have affected some MAYO variants more than typical UOV parameter sets, due to the UOV parameter choices that are natural for MAYO.

This is in stark contrast to SNOVA. SNOVA has been hit quite hard on its specific structure several times. In response, the SNOVA team has not just tweaked parameters, but continuously changed the actual structure. Every time, they take the leap forward and propose a new SNOVA with even better performance. We noted this last year and the pattern has continued, whereas MAYO’s basic design is stable.

Furthermore, the structure SNOVA uses can be seen as a special form of the whipping map that MAYO uses. That means that any MAYO-specific attack would apply to SNOVA, but not the other way around.

All in all, we’ve seen a lot of progress on the understanding of multivariate security. NIST wrote that they expect an extra round before standardizing a multivariate scheme. That seems prudent. To us, it’s unclear whether SNOVA would be ready by then, but MAYO so far seems to have matured well.

Timelines

Now, let’s have a look ahead and sketch when these new signature algorithms might become usable.

Progress for ML-DSA so far

It’s illustrative to look at ML-DSA.

November 2017	Submitted to the competition
January 2019	Progressed to the second round
July 2020	Progressed to third round
July 2022	Selected for standardization
August 2023	Initial public draft
August 2024	Final NIST standard
October 2025	ML-DSA certificate standard (RFC 9881)
April 2025	OpenSSL 3.5.0 adds support for ML-DSA
August 2025	Debian Trixie released with OpenSSL 3.5.0
December 2025	TLS IANA codepoint for ML-DSA registered
March 2026	First CMVP certificates for ML-DSA module
July 2026 (expected)	Hybrid ML-DSA certificate standard
August 2026 (expected)	RFC for use of ML-DSA in TLS
Early 2027 (expected)	Availability first ML-DSA certificates in WebPKI

After NIST selected Dilithium to become ML-DSA, it took a year to draft a proposal for the standard, and another year for the algorithm standard to be published. The algorithm standard is not enough: protocols need to agree on how to integrate ML-DSA. For certificates that took another year. That’s not the end of it: software needs to add support for ML-DSA and its integration into protocols.

These steps are not purely sequential: work on software implementation of ML-DSA started before the final standard. Also, protocol integration standards are often “done” before they’re a final standard. For instance, the use of ML-DSA in TLS is done, but at the time of writing it’ll take a couple of months before the RFC for that is out. Notably OpenSSL jumped the gun and added support for ML-DSA before the IANA codepoints were assigned. Notably missing still is agreement on which hybrid signatures should be used in TLS (or at all), for which (at the time of writing) no IANA code points have been assigned.

When will these new signature algorithms be ready for use?

So where does that leave us for new signature algorithms? If the FN-DSA draft is released today, and it progresses at the same rate as ML-DSA, then we’d perhaps have some early software support in early 2029, but no significant deployment. Looking at the time it took to write the FN-DSA draft standard, it is likely that the final standard, protocol integrations, and software support will progress slowly as well. We would not expect FN-DSA to be widely available before 2033.

The progress in cryptanalysis of multivariate schemes gave NIST pause: they wrote that they expect multivariate to at least take another round of about two years. On the other hand, multivariate schemes are reasonably easy to implement. That means we might see a multivariate NIST standard in 2031, and wider product availability not earlier than 2034.

NIST is more confident in the security of SQIsign than that of multivariate. Not unlike FN-DSA, SQIsign is a difficult scheme to standardise and implement. At the same time, a lot of progress is made in simplifying SQIsign. It seems likely that SQIsign will make large changes for the third round, and will thus require a fourth round of evaluation. In either case, wide availability before 2035 seems unlikely.

As discussed above, HAWK occupies an awkward middle ground between FN-DSA and structured multivariate candidates. If it were standardized, which seems unlikely even before the recent progress in cryptanalysis, we wouldn’t expect product availability before 2034.

That leaves the proof of knowledge algorithms MQOM, SDitH, and FAEST. We’ve seen dramatic improvements to these schemes over the rounds. If that rate of change holds, it’ll require another round, but if it’s stable now, a proof of knowledge algorithm will be the first new NIST standard to see the light in 2030. If it’s out this early, it’ll likely not outperform ML-DSA dramatically. Nonetheless, it’ll still be very welcome to build anonymous credentials and other primitives beyond signatures.

So, should you wait on one of these signatures for your post-quantum migration? Given recent advances in quantum hardware and software, we don’t believe we can afford to wait. At Cloudflare, we’re aiming to be fully migrated by 2029. None of these signatures will be out in time. Deadlines of most regulators vary between 2030 and 2035. These did not account for recent progress, and we expect them to be adjusted. We saw just this with the June 2026 US executive order setting a 2031 deadline. Even if deadlines weren’t changed, we wouldn’t advise waiting.

Why? Deploying post-quantum signatures in 2034 to beat a 2035 deadline is not enough. In a system of any reasonable size, you can’t upgrade everything all at once. You’ll need a transition period where both post-quantum and traditional signatures are supported. And supporting both allows for a downgrade attack. The most straightforward way to prevent such downgrades is to disable classical cryptography. That will take time, and is frankly not even an option in many sufficiently distributed systems like the WebPKI. We will cover how to deal with downgrades in a future blog post. In the meantime here is some reading if you’re curious. In any case, dealing with downgrades will take time.

It seems clear these new post-quantum signature algorithms will not be ready to use in time for the first migration. So why bother?

Why we still need them

We’ve had 50 years to weave public key cryptography all through our digital society. We have a few short years left to make it all quantum secure. For most of these upgrades the procedure is clear: drop in post-quantum cryptography. Easier said than done: it’s a monumental task. But then there are cases that are fundamentally harder. There's no all-star signature in a post-quantum world, and there are cases where the size of ML-DSA is a problem. With enough resources and stakeholder agreement, systems can be re-engineered to work well with these larger signatures. Indeed, thanks to ongoing re-engineering, the post-quantum WebPKI is shaping up to perform better than the quantum-vulnerable one of today. It is unrealistic to expect that this will happen for every system before it’s too late. Some will have to accept a performance cost. Others will need to deal with the security gap in other ways, such as restricting access, tunneling, more monitoring, or a myriad of other measures that are costly on their own. Once smaller post-quantum signatures arrive, these compensating controls can be removed, and full efficiency and security restored.

An indirect, but no less important benefit of the ongoing NIST competition is its help in furthering post-quantum cryptography beyond basic primitives: it isn’t just key agreement and signatures that are quantum vulnerable. There is a long tail of fancy cryptographic primitives out there used in production, such as anonymous credentials, PAKEs, and threshold signatures to name a few. For most, post-quantum variants are not readily available or are understudied. For some, the same goal can be achieved without fancy cryptography, but with a regrettable regression in subtle privacy goals. NIST cannot run a competition to define a post-quantum standard for each of these specific primitives, but luckily the signatures competition has been a huge help here.

The most clear example is FAEST. Although designed as a signature scheme, its underlying machinery (VOLEitH) can be repurposed in combination with a multivariate scheme like MAYO, to create an efficient post-quantum anonymous credential. Without the signatures competition, VOLEitH wouldn’t be as developed and vetted as it is today.

Many of the candidate schemes briefly point out their usefulness apart from signatures. We hope to see more of the indirect applications of these schemes highlighted.

Despite great signatures and more advanced cryptography on the horizon, we should not forget the task at hand: staying secure in the immediate future.

Anonymous credentials: rate-limiting bots and agents without compromising privacy

Thibault Meunier — Thu, 30 Oct 2025 13:00:00 GMT

The way we interact with the Internet is changing. Not long ago, ordering a pizza meant visiting a website, clicking through menus, and entering your payment details. Soon, you might just ask your phone to order a pizza that matches your preferences. A program on your device or on a remote server, which we call an AI agent, would visit the website and orchestrate the necessary steps on your behalf.

Of course, agents can do much more than order pizza. Soon we might use them to buy concert tickets, plan vacations, or even write, review, and merge pull requests. While some of these tasks will eventually run locally, for now, most are powered by massive AI models running in the biggest datacenters in the world. As agentic AI increases in popularity, we expect to see a large increase in traffic from these AI platforms and a corresponding drop in traffic from more conventional sources (like your phone).

This shift in traffic patterns has prompted us to assess how to keep our customers online and secure in the AI era. On one hand, the nature of requests are changing: Websites optimized for human visitors will have to cope with faster, and potentially greedier, agents. On the other hand, AI platforms may soon become a significant source of attacks, originating from malicious users of the platforms themselves.

Unfortunately, existing tools for managing such (mis)behavior are likely too coarse-grained to manage this transition. For example, when Cloudflare detects that a request is part of a known attack pattern, the best course of action often is to block all subsequent requests from the same source. When the source is an AI agent platform, this could mean inadvertently blocking all users of the same platform, even honest ones who just want to order pizza. We started addressing this problem earlier this year. But as agentic AI grows in popularity, we think the Internet will need more fine-grained mechanisms of managing agents without impacting honest users.

At the same time, we firmly believe that any such security mechanism must be designed with user privacy at its core. In this post, we'll describe how to use anonymous credentials (AC) to build these tools. Anonymous credentials help website operators to enforce a wide range of security policies, like rate-limiting users or blocking a specific malicious user, without ever having to identify any user or track them across requests.

Anonymous credentials are under development at IETF in order to provide a standard that can work across websites, browsers, platforms. It's still in its early stages, but we believe this work will play a critical role in keeping the Internet secure and private in the AI era. We will be contributing to this process as we work towards real-world deployment. This is still early days. If you work in this space, we hope you will follow along and contribute as well.

Let’s build a small agent

To help us discuss how AI agents are affecting web servers, let’s build an agent ourselves. Our goal is to have an agent that can order a pizza from a nearby pizzeria. Without an agent, you would open your browser, figure out which pizzeria is nearby, view the menu and make selections, add any extras (double pepperoni), and proceed to checkout with your credit card. With an agent, it’s the same flow —except the agent is opening and orchestrating the browser on your behalf.

In the traditional flow, there’s a human all along the way, and each step has a clear intent: list all pizzerias within 3 Km of my current location; pick a pizza from the menu; enter my credit card; and so on. An agent, on the other hand, has to infer each of these actions from the prompt "order me a pizza."

In this section, we’ll build a simple program that takes a prompt and can make outgoing requests. Here’s an example of a simple Worker that takes a specific prompt and generates an answer accordingly. You can find the code on GitHub:

export default {
   async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise {
       const out = await env.AI.run("@cf/meta/llama-3.1-8b-instruct-fp8", {
           prompt: `I'd like to order a pepperoni pizza with extra cheese.
                    Please deliver it to Cloudflare Austin office.
                    Price should not be more than $20.`,
       });


       return new Response(out.response);
   },
} satisfies ExportedHandler;

In this context, the LLM provides its best answer. It gives us a plan and instruction, but does not perform the action on our behalf. You and I are able to take a list of instructions and act upon it because we have agency and can affect the world. To allow our agent to interact with more of the world, we’re going to give it control over a web browser.

Cloudflare offers a Browser Rendering service that can bind directly into our Worker. Let’s do that. The following code uses Stagehand, an automation framework that makes it simple to control the browser. We pass it an instance of Cloudflare remote browser, as well as a client for Workers AI.

import { Stagehand } from "@browserbasehq/stagehand";
import { endpointURLString } from "@cloudflare/playwright";
import { WorkersAIClient } from "./workersAIClient"; // wrapper to convert cloudflare AI


export default {
   async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise {
       const stagehand = new Stagehand({
           env: "LOCAL",
           localBrowserLaunchOptions: { cdpUrl: endpointURLString(env.BROWSER) },
           llmClient: new WorkersAIClient(env.AI),
           verbose: 1,
       });
       await stagehand.init();


       const page = stagehand.page;
       await page.goto("https://mini-ai-agent.cloudflareresearch.com/llm");


       const { extraction } = await page.extract("what are the pizza available on the menu?");
       return new Response(extraction);
   },
} satisfies ExportedHandler;

You can access that code for yourself on https://mini-ai-agent.cloudflareresearch.com/llm. Here’s the response we got on October 10, 2025:

Margherita Classic: $12.99
Pepperoni Supreme: $14.99
Veggie Garden: $13.99
Meat Lovers: $16.99
Hawaiian Paradise: $15.49

Using the screenshot API of browser rendering, we can also inspect what the agent is doing. Here's how the browser renders the page in the example above:

Stagehand allows us to identify components on the page, such as page.act(“Click on pepperoni pizza”) and page.act(“Click on Pay now”). This eases interaction between the developer and the browser.

To go further, and instruct the agent to perform the whole flow autonomously, we have to use the appropriately named agent mode of Stagehand. This feature is not yet supported by Cloudflare Workers, but is provided below for completeness.

import { Stagehand } from "@browserbasehq/stagehand";
import { endpointURLString } from "@cloudflare/playwright";
import { WorkersAIClient } from "./workersAIClient";


export default {
   async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise {
       const stagehand = new Stagehand({
           env: "LOCAL",
           localBrowserLaunchOptions: { cdpUrl: endpointURLString(env.BROWSER) },
           llmClient: new WorkersAIClient(env.AI),
           verbose: 1,
       });
       await stagehand.init();
       
       const agent = stagehand.agent();
       const result = await agent.execute(`I'd like to order a pepperoni pizza with extra cheese.
                                           Please deliver it to Cloudflare Austin office.
                                           Price should not be more than $20.`);


       return new Response(result.message);
   },
} satisfies ExportedHandler;

We can see that instead of adding step-by-step instructions, the agent is provided control. To actually pay, it would need access to a payment method such as a virtual credit card.

The prompt had some subtlety in that we’ve scoped the location to Cloudflare’s Austin office. This is because while the agent responds to us, it needs to understand our context. In this case, the agent operates out of Cloudflare edge, a location remote to us. This implies we are unlikely to pick up a pizza from this data center if it was ever delivered.

The more capabilities we provide to the agent, the more it has the ability to create some disruption. Instead of someone having to make 5 clicks at a slow rate of 1 request per 10 seconds, they’d have a program running in a data center possibly making all 5 requests in a second.

This agent is simple, but now imagine many thousands of these — some benign, some not — running at datacenter speeds. This is the challenge origins will face.

Protecting origins

For humans to interact with the online world, they need a web browser and some peripherals with which to direct the behavior of that browser. Agents are another way of directing a browser, so it may be tempting to think that not much is actually changing from the origin's point of view. Indeed, the most obvious change from the origin's point of view is merely where traffic comes from:

The reason this change is significant has to do with the tools the server has to manage traffic. Websites generally try to be as permissive as possible, but they also need to manage finite resources (bandwidth, CPU, memory, storage, and so on). There are a few basic ways to do this:

Global security policy: A server may opt to slow down, CAPTCHA, or even temporarily block requests from all users. This policy may be applied to an entire site, a specific resource, or to requests classified as being part of a known or likely attack pattern. Such mechanisms may be deployed in reaction to an observed spike in traffic, as in a DDoS attack, or in anticipation of a spike in legitimate traffic, as in Waiting Room.
Incentives: Servers sometimes try to incentivize users to use the site when more resources are available. For instance, a server price may be lower depending on the location or request time. This could be implemented with a Cloudflare Snippet.

While both tools can be effective, they also sometimes cause significant collateral damage. For example, while rate limiting a website's login endpoint can help prevent credential stuffing attacks, it also degrades the user experience for non-attackers. Before resorting to such measures, servers will first try to apply the security policy (whether a rate limit, a CAPTCHA, or an outright block) to individual users or groups of users.

However, in order to apply a security policy to individuals, the server needs some way of identifying them. Historically, this has been done via some combination of IP addresses, User-Agent, an account tied to the user identity (if available), and other fingerprints. Like most cloud service providers, Cloudflare has a dedicated offering for per-user rate limits based on such heuristics.

Fingerprinting works for the most part. However, it's unequitably distributed. On mobile, users have an especially difficult time solving CAPTCHAs, when using a VPN they’re more likely to be blocked, and when using reading mode they can mess up their fingerprint, preventing rendering of the page.

Likewise, agentic AI only exacerbates the limitations of fingerprinting. Not only will more traffic be concentrated on a smaller source IP range, the agents themselves will run the same software and hardware platform, making it harder to distinguish honest from malicious users.

Something that could help is Web Bot Auth, which would allow agents to identify to the origin which platform they're operated by. However, we wouldn't want to extend this mechanism — intended for identifying the platform itself — to identifying individual users of the platforms, as this would create an unacceptable privacy risk for these users.

We need some way of implementing security controls for individual users without identifying them. But how? The Privacy Pass protocol provides a partial solution.

Privacy Pass and its limitations

Today, one of the most prominent use cases for Privacy Pass is to rate limit requests from a user to an origin, as we have discussed before. The protocol works roughly as follows. The client is issued a number of tokens. Each time it wants to make a request, it redeems one of its tokens to the origin; the origin allows the request through only if the token is fresh, i.e., has never been observed before by the origin.

In order to use Privacy Pass for per-user rate-limiting, it's necessary to limit the number of tokens issued to each user (e.g., 100 tokens per user per hour). To rate limit an AI agent, this role would be fulfilled by the AI platform. To obtain tokens, the user would log in with the platform, and said platform would allow the user to get tokens from the issuer. The AI platform fulfills the attester role in Privacy Pass parlance. The attester is the party guaranteeing the per-user property of the rate limit. The AI platform, as an attester, is incentivized to enforce this token distribution as it stakes its reputation: Should it allow for too many tokens to be issued, the issuer could distrust them.

The issuance and redemption protocols are designed to have two properties:

Tokens are unforgeable: only the issuer can issue valid tokens.
Tokens are unlinkable: no party, including the issuer, attester, or origin, can tell which user a token was issued to.

These properties can be achieved using a cryptographic primitive called a blind signature scheme. In a conventional signature scheme, the signer uses its private key to produce a signature for a message. Later on, a verifier can use the signer’s public key to verify the signature. Blind signature schemes work in the same way, except that the message to be signed is blinded such that the signer doesn't know the message it's signing. The client “blinds” the message to be signed and sends it to the server, which then computes a blinded signature over the blinded message. The client obtains the final signature by unblinding the signature.

This is exactly how the standardised Privacy Pass issuance protocols are defined by RFC 9578:

Issuance: The user generates a random message $k$ which we call the nullifier. Concretely, this is just a random, 32-byte string. It then blinds the nullifier and sends it to the issuer. The issuer replies with a blind signature. Finally, the user unblinds the signature to get $\sigma$, a signature for the nullifier $k$. The token is the pair $(k, \sigma)$.
Redemption: When the user presents $(k, \sigma)$, the origin checks that $\sigma$ is a valid signature for the nullifier $k$ and that $k$ is fresh. If both conditions hold, then it accepts and lets the request through.

Blind signatures are simple, cheap, and perfectly suited for many applications. However, they have some limitations that make them unsuitable for our use case.

First, the communication cost of the issuance protocol is too high. For each token issued, the user sends a 256-byte, blinded nullifier and the issuer replies with a 256-byte blind signature (assuming RSA-2048 is used). That's 0.5KB of additional communication per request, or 500KB for every 1,000 requests. This is manageable as we’ve seen in a previous experiment for Privacy Pass, but not ideal. Ideally, the bandwidth would be sublinear in the rate limit we want to enforce. An alternative to blind signatures with lower compute time are Oblivious Pseudorandom Functions (VOPRF), but the bandwidth is still asymptotically linear. We’ve discussed them in the past, as they served as the basis for early deployments of Privacy Pass.

Second, blind signatures can't be used to rate-limit on a per-origin basis. Ideally, when issuing $N$ tokens to the client, the client would be able to redeem at most $N$ tokens at any origin server that can verify the token's validity. However, the client can't safely redeem the same token at more than one server because it would be possible for the servers to link those redemptions to the same client. What's needed is some mechanism for what we'll call late origin-binding: transforming a token for redemption at a particular origin in a way that's unlinkable to other redemptions of the same token.

Third, once a token is issued, it can't be revoked: it remains valid as long as the issuer's public key is valid. This makes it impossible for an origin to block a specific user if it detects an attack, or if its tokens are compromised. The origin can block the offending request, but the user can continue to make requests using its remaining token budget.

Anonymous credentials and the future of Privacy Pass

As noted by Chaum in 1985, an anonymous credential system allows users to obtain a credential from an issuer, and later prove possession of this credential, in an unlinkable way, without revealing any additional information. Also, it is possible to demonstrate that some attributes are attached to the credential.

One way to think of an anonymous credential is as a kind of blind signature with some additional capabilities: late-binding (link a token to an origin after issuance), multi-show (generate multiple tokens from a single issuer response), and expiration distinct from key rotation (token validity decoupled of the issuer cryptographic key validity). In the redemption flow for Privacy Pass, the client presents the unblinded message and signature to the server. To accept the redemption, the server needs to verify the signature. In an AC system, the client only presents a part of the message. In order for the server to accept the request, the client needs to prove to the server that it knows a valid signature for the entire message without revealing the whole thing.

The flow we described above would therefore include this additional presentation step.

Note that the tokens generated through blind signatures or VOPRFs can only be used once, so they can be regarded as single-use tokens. However, there exists a type of anonymous credentials that allows tokens to be used multiple times. For this to work, the issuer grants a credential to the user, who can later derive at most N many single-use tokens for redemption. Therefore, the user can send multiple requests, at the expense of a single issuance session.

The table below describes how blind signatures and anonymous credentials provide features of interest to rate limiting.

Feature	Blind Signature	Anonymous Credential
Issuing Cost	Linear complexity: issuing 10 signatures is 10x as expensive as issuing one signature	Sublinear complexity: signing 10 attributes is cheaper than 10 individual signatures
Proof Capability	Only prove that a message has been signed	Allow efficient proving of partial statements (i.e., attributes)
State Management	Stateless	Stateful
Attributes	No attributes	Public (e.g. expiry time) and private state

Let's see how a simple anonymous credential scheme works. The client's message consists of the pair $(k, C)$, where $k$ is a nullifier and $C$ is a counter representing the remaining number of times the client can access a resource. The value of the counter is controlled by the server: when the client redeems its credential, it presents both the nullifier and the counter. In response, the server checks that signature of the message is valid and that the nullifier is fresh, as before. Additionally, the server also

checks that the counter is greater than zero; and
decrements the counter issuing a new credential for the updated counter and a fresh nullifier.

A blind signature could be used to meet this functionality. However, whereas the nullifier can be blinded as before, it would be necessary to handle the counter in plaintext so that the server can check that the counter is valid (Step 1) and update it (Step 2). This creates an obvious privacy risk since the server, which is in control of the counter, can use it to link multiple presentations by the same client. For example, when you reach out to buy a pepperoni pizza, the origin could assign you a special counter value, which eases fingerprinting when you present it a second time. Fortunately, there exist anonymous credentials designed to close this kind of privacy gap.

The scheme above is a simplified version of Anonymous Credit Tokens (ACT), one of the anonymous credential schemes being considered for adoption by the Privacy Pass working group at IETF. The key feature of ACT is its statefulness: upon successful redemption, the server re-issues a new credential with updated nullifier and counter values. This creates a feedback loop between the client and server that can be used to express a variety of security policies.

By design, it's not possible to present ACT credentials multiple times simultaneously: the first presentation must be completed so that the re-issued credential can be presented in the next request. Parallelism is the key feature of Anonymous Rate-limited Credential (ARC), another scheme under discussion at the Privacy Pass working group. ARCs can be presented across multiple requests in parallel up to the presentation limit determined during issuance.

Another important feature of ARC is its support for late origin-binding: when a client is issued an ARC with presentation limit $N$, it can safely use its credential to present up to $N$ times to any origin that can verify the credential.

These are just examples of relevant features of some anonymous credentials. Some applications may benefit from a subset of them; others may need additional features. Fortunately, both ACT and ARC can be constructed from a small set of cryptographic primitives that can be easily adapted for other purposes.

Building blocks for anonymous credentials

ARC and ACT share two primitives in common: algebraic MACs, which provide for limited computations on the blinded message; and zero-knowledge proofs (ZKP) for proving validity of the part of the message not revealed to the server. Let's take a closer look at each.

Algebraic MACs

A Message Authenticated Code (MAC) is a cryptographic tag used to verify a message's authenticity (that it comes from the claimed sender) and integrity (that it has not been altered). Algebraic MACs are built from mathematical structures like group actions. The algebraic structure gives them some additional functionality, one of them being a homomorphism that we can blind easily to conceal the actual value of the MAC. Adding a random value on an algebraic MAC blinds the value.

Unlike blind signatures, both ACT and ARC are only privately verifiable, meaning the issuer and the origin must both have the issuer's private key. Taking Cloudflare as an example, this means that a credential issued by Cloudflare can only be redeemed by an origin behind Cloudflare. Publicly verifiable variants of both are possible, but at an additional cost.

Zero-Knowledge Proofs for linear relations

Zero knowledge proofs (ZKP) allow us to prove a statement is true without revealing the exact value that makes the statement true. The ZKP is constructed by a prover in such a way that it can only be generated by someone who actually possesses the secret. The verifier can then run a quick mathematical check on this proof. If the check passes, the verifier is convinced that the prover's initial statement is valid. The crucial property is that the proof itself is just data that confirms the statement; it contains no other information that could be used to reconstruct the original secret.

For ARC and ACT, we want to prove linear relations of secrets. In ARC, a user needs to prove that different tokens are linked to the same original secret credential. For example, a user can generate a proof showing that a request token was derived from a valid issued credential. The system can verify this proof to confirm the tokens are legitimately connected, all without ever learning the underlying secret credential that ties them together. This allows the system to validate user actions while guaranteeing their privacy.

Proving simple linear relations can be extended to prove a number of powerful statements, for example that a number is in range. For example, this is useful to prove that you have a positive balance on your account. To prove your balance is positive, you prove that you can encode your balance in binary. Let’s say you can at most have 1024 credits in your account. To prove your balance is non-zero when it is, for example, 12, you prove two things simultaneously: first, that you have a set of binary bits, in this case 12=(1100)₂, and second, that a linear equation using these bits (8*1 + 4*1 + 2*0 + 1*0) correctly adds up to your total committed balance. This convinces the verifier that the number is validly constructed without them learning the exact value. This is how it works for powers of two, but it can easily be extended to arbitrary ranges.

The mathematical structure of algebraic MACs allows easy blinding and evaluation. The structure also allows for an easy proof that a MAC has been evaluated with the private key without revealing the MAC. In addition, ARC could use ZKPs to prove that a nonce has not been spent before. In contrast, ACT uses ZKPs to prove we have enough of a balance left on our token. The balance is subtracted homomorphically using more group structure.

How much does this all cost?

Anonymous credentials allow for more flexibility, and have the potential to reduce the communication cost, compared to blind signatures in certain applications. To identify such applications, we need to measure the concrete communication cost of these new protocols. In addition, we need to understand how their CPU usage compares to blind signatures and oblivious pseudorandom functions.

We measure the time that each participant spends at each stage of some AC schemes. We also report the size of messages transmitted across the network. For ARC, ACT, and VOPRF, we'll use ristretto255 as the prime group and SHAKE128 for hashing. For Blind RSA, we'll use a 2048-bit modulus and SHA-384 for hashing.

Each algorithm was implemented in Go, on top of the CIRCL library. We plan to open source the code once the specifications of ARC and ACT begin to stabilize.

Let’s take a look at the most widely used deployment in Privacy Pass: Blind RSA. Redemption time is low, and most of the cost lies with the server at issuance time. Communication cost is mostly constant and in the order of 256 bytes.

Blind RSA RFC9474(RSA-2048+SHA384)	1 Token
Time	Message Size
Issuance	Client (Blind)	63 µs	256 B
Server (Evaluate)	2.69 ms	256 B
Client (Finalize)	37 µs	256 B
Redemption	Client	–	300 B
Server	37 µs	–

When looking at VOPRF, verification time on the server is slightly higher than for Blind RSA, but communication cost and issuance are much faster. Evaluation time on the server is 10x faster for 1 token, and more than 25x faster when using amortized token issuance. Communication cost per token is also more appealing, with a message size at least 3x lower.

VOPRF RFC9497(Ristretto255+SHA512)	1 Token	1000 Amortized issuances
Time	Message Size	Time (per token)	Message Size (per token)
Issuance	Client (Blind)	54 µs	32 B	54 µs	32 B
Server (Evaluate)	260 µs	96 B	99 µs	32.064 B
Client (Finalize)	376 µs	64 B	173 µs	64 B
Redemption	Client	–	96 B	–
Server	57 µs	–

This makes VOPRF tokens appealing for applications requiring a lot of tokens that can accept a slightly higher redemption cost, and that don’t need public verifiability.

Now, let’s take a look at the figures for ARC and ACT anonymous credential schemes. For both schemes we measure the time to issue a credential that can be presented at most $N=1000$ times.

Issuance Credential Generation	ARC	ACT
Time	Message Size	Time	Message Size
Client (Request)	323 µs	224 B	64 µs	141 B
Server (Response)	1349 µs	448 B	251 µs	176 B
Client (Finalize)	1293 µs	128 B	204 µs	176 B

Redemption Credential Presentation	ARC	ACT
Time	Message Size	Time	Message Size
Client (Present)	735 µs	288 B	1740 µs	1867 B
Server (Verify/Refund)	740 µs	–	1785 µs	141 B
Client (Update)	–	–	508 µs	176 B

As we would hope, the communication cost and the server’s runtime is much lower than a batched issuance with either Blind RSA or VOPRF. For example, a VOPRF issuance of 1000 tokens takes 99 ms (99 µs per token) vs 1.35 ms for issuing one ARC credential that allows for 1000 presentations. This is about 70x faster. The trade-off is that presentation is more expensive, both for the client and server.

How about ACT? Like ARC, we would expect the communication cost of issuance grows much slower with respect to the credits issued. Our implementation bears this out. However, there are some interesting performance differences between ARC and ACT: issuance is much cheaper for ACT than it is for ARC, but redemption is the opposite.

What's going on? The answer has largely to do with what each party needs to prove with ZKPs at each step. For example, during ACT redemption, the client proves to the server (in zero-knowledge) that its counter $C$ is in the desired range, i.e., $0 \leq C \leq N$. The proof size is on the order of $\log_{2} N$, which accounts for the larger message size. In the current version, ARC redemption does not involve range proofs, but a range proof may be added in a future version. Meanwhile, the statements the client and server need to prove during ARC issuance are a bit more complicated than for ARC presentation, which accounts for the difference in runtime there.

The advantage of anonymous credentials, as discussed in the previous sections, is that issuance only has to be performed once. When a server evaluates its cost, it takes into account the cost of all issuances and the cost of all verifications. At present, only accounting for credentials costs, it’s cheaper for a server to issue and verify tokens than verify an anonymous credential presentation.

The advantage of multiple-use anonymous credentials is that instead of the issuer generating $N$ tokens, the bulk of computation is offloaded to the clients. This is more scoped. Late origin binding allows them to work for multiple origins/namespace, range proof to decorrelate expiration from key rotation, and refund to provide a dynamic rate limit. Their current applications are dictated by the limitation of single-use token based schemes, more than by the added efficiency they provide. This seems to be an exciting area to explore, and see if closing the gap is possible.

Managing agents with anonymous credentials

Managing agents will likely require features from both ARC and ACT.

ARC already has much of the functionality we need: it supports rate limiting, is communication-efficient, and it supports late origin-binding. Its main downside is that, once an ARC credential is issued, it can't be revoked. A malicious user can always make up to N requests to any origin it wants.

We can allow for a limited form of revocation by pairing ARC with blind signatures (or VOPRF). Each presentation of the ARC credential is accompanied by a Privacy Pass token: upon successful presentation, the client is issued another Privacy Pass token it can use during the next presentation. To revoke a credential, the server would simply not re-issue the token:

This scheme is already quite useful. However, it has some important limitations:

Parallel presentation across origins is not possible: the client must wait for the request to one origin to succeed before it can initiate a request to a second origin.
Revocation is global rather than per-origin, meaning the credential is not only revoked for the origin to whom it was presented, but for every origin it can be presented to. We suspect this will be undesirable in some cases. For example, an origin may want to revoke if a request violates its robots.txt policy; but the same request may have been accepted by other origins.

A more fundamental limitation of this design is that the decision to revoke can only be made on the basis of a single request — the one in which the credential was presented. It may be risky to decide to block a user on the basis of a single request; in practice, attack patterns may only emerge across many requests. ACT's statefulness enables at least a rudimentary form of this kind of defense. Consider the following scheme:

Issuance: The client is issued an ARC with presentation limit $N=1$.
Presentation:
- When the client presents its ARC credential to an origin for the first time, the server issues an ACT credential with a valid initial state.
- When the client presents an ACT with valid state (e.g., credit counter greater than 0), the origin either:
  - refuses to issue a new ACT, thereby revoking the credential. It would only do so if it had high confidence that the request was part of an attack; or
  - issues a new ACT with state updated to reduce the ACT credit by the amount of resources consumed while processing the request.

Benign requests wouldn't change the state by much (if at all), but suspicious requests might impact the state in a way that gets the user closer to their rate limit much faster.

Demo

To see how this idea works in practice, let's look at a working example that uses the Model Context Protocol. The demo below is built using MCP Tools. Tools are extensions the AI agent can call to extend its capabilities. They don't need to be integrated at release time within the MCP client. This provides a nice and easy prototyping avenue for anonymous credentials.

Tools are offered by the server via an MCP compatible interface. You can see details on how to build such MCP servers in a previous blog.

In our pizza context, this could look like a pizzeria that offers you a voucher. Each voucher gets you 3 pizza slices. Mocking a design, an integration within a chat application could look as follows:

The first panel presents all tools exposed by the MCP server. The second one showcases an interaction performed by the agent calling these tools.

To look into how such a flow would be implemented, let’s write the MCP tools, offer them in an MCP server, and manually orchestrate the calls with the MCP Inspector.

The MCP server should provide two tools:

act-issue which issues an ACT credential valid for 3 requests. The code used here is an earlier version of the IETF draft which has some limitations.
act-redeem makes a presentation of the local credential, and fetches our pizza menu.

First, we run act-issue. At this stage, we could ask the agent to run an OAuth flow, fetch an internal authentication endpoint, or to compute a proof of work.

This gives us 3 credits to spend against an origin. Then, we run act-redeem

Et voilà. If we run act-redeem once more, we see we have one fewer credit.

You can test it yourself, here are the source codes available. The MCP server is written in Rust to integrate with the ACT rust library. The browser-based client works similarly, check it out.

Moving further

In this post, we’ve presented a concrete approach to rate limit agent traffic. It is in full control of the client, and is built to protect the user's privacy. It uses emerging standards for anonymous credentials, integrates with MCP, and can be readily deployed on Cloudflare Workers.

We're on the right track, but there are still questions that remain. As we touched on before, a notable limitation of both ARC and ACT is that they are only privately verifiable. This means that the issuer and origin need to share a private key, for issuing and verifying the credential respectively. There are likely to be deployment scenarios for which this isn't possible. Fortunately, there may be a path forward for these cases using pairing-based cryptography, as in the BBS signature specification making its way through IETF. We’re also exploring post-quantum implications in a concurrent post.

If you are an agent platform, an agent developer, or a browser, all our code is available on GitHub for you to experiment. Cloudflare is actively working on vetting this approach for real-world use cases.

The specification and discussion are happening within the IETF and W3C. This ensures the protocols are built in the open, and receive participation from experts. Improvements are still to be made to clarify the right performance-to-privacy tradeoff, or even the story to deploy on the open web.

If you’d like to help us, we’re hiring 1,111 interns over the course of next year, and have open positions.

Policy, privacy and post-quantum: anonymous credentials for everyone

Lena Heimberger — Thu, 30 Oct 2025 13:00:00 GMT

The Internet is in the midst of one of the most complex transitions in its history: the migration to post-quantum (PQ) cryptography. Making a system safe against quantum attackers isn't just a matter of replacing elliptic curves and RSA with PQ alternatives, such as ML-KEM and ML-DSA. These algorithms have higher costs than their classical counterparts, making them unsuitable as drop-in replacements in many situations.

Nevertheless, we're making steady progress on the most important systems. As of this writing, about 50% of TLS connections to Cloudflare's edge are safe against store-now/harvest-later attacks. Quantum safe authentication is further out, as it will require more significant changes to how certificates work. Nevertheless, this year we've taken a major step towards making TLS deployable at scale with PQ certificates.

That said, TLS is only the lowest hanging fruit. There are many more ways we have come to rely on cryptography than key exchange and authentication and which aren’t as easy to migrate. In this blog post, we'll take a look at Anonymous Credentials (ACs).

ACs solve a common privacy dilemma: how to prove a specific fact (for example that one has had a valid driver’s license for more than three years) without over-sharing personal information (like the place of birth)? Such problems are fundamental to a number of use cases, and ACs may provide the foundation we need to make these applications as private as possible.

Just like for TLS, the central question for ACs is whether there are drop-in, PQ replacements for its classical primitives that will work at the scale required, or will it be necessary to re-engineer the application to mitigate the cost of PQ.

We'll take a stab at answering this question in this post. We'll focus primarily on an emerging use case for ACs described in a concurrent post: rate-limiting requests from agentic AI platforms and users. This demanding, high-scale use case is the perfect lens through which to evaluate the practical readiness of today's post-quantum research. We'll use it as our guiding problem to measure each cryptographic approach.

We'll first explore the current landscape of classical AC adoption across the tech industry and the public sector. Then, we’ll discuss what cryptographic researchers are currently looking into on the post-quantum side. Finally, we’ll take a look at what it'll take to bridge the gap between theory and real-world applications.

While anonymous credentials are only seeing their first real-world deployments in recent years, it is critical to start thinking about the post-quantum challenge concurrently. This isn’t a theoretical, too-soon problem given the store-now decrypt-later threat. If we wait for mass adoption before solving post-quantum anonymous credentials, ACs risk being dead on arrival. Fortunately, our survey of the state of the art shows the field is close to a practical solution. Let’s start by reviewing real-world use-cases of ACs.

Real world (classical) anonymous credentials

In 2026, the European Union is set to launch its digital identity wallet, a system that will allow EU citizens, residents and businesses to digitally attest to their personal attributes. This will enable them, for example, to display their driver’s license on their phone or perform age verification. Cloudflare's use cases for ACs are a bit different and revolve around keeping our customers secure by, for example, rate limiting bots and humans as we currently do with Privacy Pass. The EU wallet is a massive undertaking in identity provisioning, and our work operates at a massive scale of traffic processing. Both initiatives are working to solve a shared fundamental problem: allowing an entity to prove a specific attribute about themselves without compromising their privacy by revealing more than they have to.

The EU's goal is a fully mobile, secure, and user-friendly digital ID. The current technical plan is ambitious, as laid out in the Architecture Reference Framework (ARF). It defines the key privacy goals of unlinkability to guarantee that if a user presents attributes multiple times, the recipients cannot link these separate presentations to conclude that they concern the same user. However, currently proposed solutions fail to achieve this. The framework correctly identifies the core problem: attestations contain unique, fixed elements such as hash values, […], public keys, and signatures that colluding entities could store and compare to track individuals.

In its present form, the ARF's recommendation to mitigate cross-session linkability is limited-time attestations. The framework acknowledges in the text that this would only partially mitigate Relying Party linkability. An alternative proposal that would mitigate linkability risks are single-use credentials. They are not considered at the moment due to complexity and management overhead. The framework therefore leans on organisational and enforcement measures to deter collusion instead of providing a stronger guarantee backed by cryptography.

This reliance on trust assumptions could become problematic, especially in the sensitive context of digital identity. When asked for feedback, cryptographic researchers agree that the proper solution would be to adopt anonymous credentials. However, this solution presents a long-term challenge. Well-studied methods for anonymous credentials, such as those based on BBS signatures, are vulnerable to quantum computers. While some anonymous schemes are PQ-unlinkable, meaning that user privacy is preserved even when cryptographically relevant quantum computers exist, new credentials could be forged. This may be an attractive target for, say, a nation state actor.

New cryptography also faces deployment challenges: in the EU, only approved cryptographic primitives, as listed in the SOG-IS catalogue, can be used. At the time of writing, this catalogue is limited to established algorithms such as RSA or ECDSA. But when it comes to post-quantum cryptography, SOG-IS is leaving the problem wide open.

The wallet's first deployment will not be quantum-secure. However, with the transition to post-quantum algorithms being ahead of us, as soon as 2030 for high-risk use cases per the EU roadmap, research in a post-quantum compatible alternative for anonymous credentials is critical. This will encompass standardizing more cryptography.

Regarding existing large scale deployments, the US has allowed digital ID on smartphones since 2024. They can be used at TSA checkpoints for instance. The Department of Homeland Security lists funding for six privacy-preserving digital credential wallets and verifiers on their website. This early exploration and engagement is a positive sign, and highlights the need to plan for privacy-preserving presentations.

Finally, ongoing efforts at the Internet Engineering Task Force (IETF) aim to build a more private Internet by standardizing advanced cryptographic techniques. Active individual drafts (i.e., not yet adopted by a working group), such as Longfellow and Anonymous Credit Tokens (ACT), and adopted drafts like Anonymous Rate-limited Credentials (ARC), propose more flexible multi-show anonymous credentials that incorporate developments over the last several years. At IETF 117 in 2023, post-quantum anonymous credentials and deployable generic anonymous credentials were presented as a research opportunity. Check out our post on rate limiting agents for details.

Before we get into the state-of-the-art for PQ, allow us to try to crystalize a set of requirements for real world applications.

Requirements

Given the diversity of use cases, adoption of ACs will be made easier by the fact that they can be built from a handful of powerful primitives. (More on this in our concurrent post.) As we'll see in the next section, we don't yet have drop-in, PQ alternatives for these kinds of primitives. The "building blocks" of PQ ACs are likely to look quite different, and we're going to know something about what we're building towards.

For our purposes, we can think of an anonymous credential as a kind of fancy blind signature. What's that you ask? A blind signature scheme has two phases: issuance, in which the server signs a message chosen by the client; and presentation, in which the client reveals the message and the signature to the server. The scheme should be unlinkable in the sense that the server can't link any message and signature to the run of the issuance protocol in which it was produced. It should also be unforgeable in the sense that no client can produce a valid signature without interacting with the server.

The key difference between ACs and blind signatures is that, during presentation of an AC, the client only presents part of the message in plaintext; the rest of the message is kept secret. Typically, the message has three components:

Private state, such as a counter that, for example, keeps track of the number of times the credential was presented. The client would prove to the server that the state is "valid", for example, a counter with value $0 \leq C \leq N$, without revealing $C$. In many situations, it's desirable to allow the server to update this state upon successful presentation, for example, by decrementing the counter. In the context of rate limiting, this is the number of how many requests are left for a credential.
A random value called the nullifier that is revealed to the server during presentation. In rate-limiting, the nullifier prevents a user from spending a credential with a given state more than once.
Public attributes known to both the client and server that bind the AC to some application context. For example, this might represent the window of time in which the credential is valid (without revealing the exact time it was issued).

Such ACs are well-suited for rate limiting requests made by the client. Here the idea is to prevent the client from making more than some maximum number of requests during the credential's lifetime. For example, if the presentation limit is 1,000 and the validity window is one hour, then the clients can make up to 0.27 requests/second on average before it gets throttled.

It's usually desirable to enforce rate limits on a per-origin basis. This means that if the presentation limit is 1,000, then the client can make at most 1,000 requests to any website that can verify the credential. Moreover, it can do so safely, i.e., without breaking unlinkability across these sites.

The current generation of ACs being considered for standardization at IETF are only privately verifiable, meaning the server issuing the credential (the issuer) must share a private key with the server verifying the credential (the origin). This will be sufficient for some deployment scenarios, but many will require public verifiability, where the origin only needs the issuer's public key. This is possible with BBS-based credentials, for example.

Finally, let us say a few words about round complexity. An AC is round optimal if issuance and presentation both complete in a single HTTP request and response. In our survey of PQ ACs, we found a number of papers that discovered neat tricks that reduce bandwidth (the total number of bits transferred between the client and server) at the cost of additional rounds. However, for use cases like ours, round optimality is an absolute necessity, especially for presentation. Not only do multiple rounds have a high impact on latency, they also make the implementation far more complex.

Within these constraints, our goal is to develop PQ ACs that have as low communication cost (i.e., bandwidth consumption) and runtime as possible in the context of rate-limiting.

"Ideal world" (PQ) anonymous credentials

The academic community has produced a number of promising post-quantum ACs. In our survey of the state of the art, we evaluated several leading schemes, scoring them on their underlying primitives and performance to determine which are truly ready for the Internet. To understand the challenges, it is essential to first grasp the cryptographic building blocks used in ACs today. We’ll now discuss some of the core concepts that frequently appear in the field.

Relevant cryptographic paradigms

Zero-knowledge proofs

Zero-knowledge proofs (ZKPs) are a cryptographic protocol that allows a prover to convince a verifier that a statement is true without revealing the secret information, or witness. ZKPs play a central role in ACs: they allow proving statements of the secret part of the credential's state without revealing the state itself. This is achieved by transforming the statement into a mathematical representation, such as a set of polynomial equations over a finite field. The prover then generates a proof by performing complex operations on this representation, which can only be completed correctly if they possess the valid witness.

General-purpose ZKP systems, like Scalable Transparent Arguments of Knowledge (STARKs), can prove the integrity of any computation up to a certain size. In a STARK-based system, the computational trace is represented as a set of polynomials. The prover then constructs a proof by evaluating these polynomials and committing to them using cryptographic hash functions. The verifier can then perform a quick probabilistic check on this proof to confirm that the original computation was executed correctly. Since the proof itself is just a collection of hashes and sampled polynomial values, it is secure against quantum computers, providing a statistically sound guarantee that the claimed result is valid.

Cut-and-Choose

Cut-and-choose is a cryptographic technique designed to ensure a prover’s honest behaviour by having a verifier check a random subset of their work. The prover first commits to multiple instances of a computation, after which the verifier randomly chooses a portion to be cut open by revealing the underlying secrets for inspection. If this revealed subset is correct, the verifier gains high statistical confidence that the remaining, un-opened instances are also correct.

This technique is important because while it is a generic tool used to build protocols secure against malicious adversaries, it also serves as a crucial case study. Its security is not trivial; for example, practical attacks on cut-and-choose schemes built with (post-quantum) homomorphic encryption have succeeded by attacking the algebraic structure of the encoding, not the encryption itself. This highlights that even generic constructions must be carefully analyzed in their specific implementation to prevent subtle vulnerabilities and information leaks.

Sigma Protocols

Sigma protocols follow a more structured approach that does not require us to throw away any computations. The three-move protocol starts with a commitment phase where the prover generates some randomness, which is added to the input to generate the commitment, and sends the commitment to the verifier. Then, the verifier challenges the prover with an unpredictable challenge. To finish the proof, the prover provides a response in which they combine the initial randomness with the verifier’s challenge in a way that is only possible if the secret value, such as the solution to a discrete logarithm problem, is known.

^{Depiction of a Sigma protocol flow, where the prover commits to their witness $w$, the verifier challenges the prover to prove knowledge about $w$, and the prover responds with a mathematical statement that the verifier can either accept or reject.}

In practice, the prover and verifier don't run this interactive protocol. Instead, they make it non-interactive using a technique known as the Fiat-Shamir transformation. The idea is that the prover generates the challenge itself, by deriving it from its own commitment. It may sound a bit odd, but it works quite well. In fact, it's the basis of signatures like ECDSA and even PQ signatures like ML-DSA.

MPC in the head

Multi-party computation (MPC) is a cryptographic tool that allows multiple parties to jointly compute a function over their inputs without revealing their individual inputs to the other parties. MPC in the Head (MPCitH) is a technique to generate zero-knowledge proofs by simulating a multi-party protocol in the head of the prover.

The prover simulates the state and communication for each virtual party, commits to these simulations, and shows the commitments to the verifier. The verifier then challenges the prover to open a subset of these virtual parties. Since MPC protocols are secure even if a minority of parties are dishonest, revealing this subset doesn't leak the secret, yet it convinces the verifier that the overall computation was correct.

This paradigm is particularly useful to us because it's a flexible way to build post-quantum secure ZKPs. MPCitH constructions build their security from symmetric-key primitives (like hash functions). This approach is also transparent, requiring no trusted setup. While STARKs share these post-quantum and transparent properties, MPCitH often offers faster prover times for many computations. Its primary trade-off, however, is that its proofs scale linearly with the size of the circuit to prove, while STARKs are succinct, meaning their proof size grows much slower.

Rejection sampling

When a randomness source is biased or outputs numbers outside the desired range, rejection sampling can correct the distribution. For example, imagine you need a random number between 1 and 10, but your computer only gives you random numbers between 0 and 255. (Indeed, this is the case!) The rejection sampling algorithm calls the RNG until it outputs a number below 11 and above 0:

Calling the generator over and over again may seem a bit wasteful. An efficient implementation can be realized with an eXtendable Output Function (XOF). A XOF takes an input, for example a seed, and computes an arbitrarily-long output. An example is the SHAKE family (part of the SHA3 standard), and the recently proposed round-reduced version of SHAKE called TurboSHAKE.

Let’s imagine you want to have three numbers between 1 and 10. Instead of calling the XOF over and over, you can also ask the XOF for several bytes of output. Since each byte has a probability of 3.52% to be in range, asking the XOF for 174 bytes is enough to have a greater than 99% chance of finding at least three usable numbers. In fact, we can be even smarter than this: 10 fits in four bits, so we can split the output bytes into lower and higher nibbles. The probability of a nibble being in the desired range is now 56.4%:

^{Rejection sampling by batching queries.}

Rejection sampling is a part of many cryptographic primitives, including many we'll discuss in the schemes we look at below.

Building post-quantum ACs

Classical anonymous credentials (ACs), such as ARC and ACT, are built from algebraic groups- specifically, elliptic curves, which are very efficient. Their security relies on the assumption that certain mathematical problems over these groups are computationally hard. The premise of post-quantum cryptography, however, is that quantum computers can solve these supposedly hard problems. The most intuitive solution is to replace elliptic curves with a post-quantum alternative. In fact, cryptographers have been working on a replacement for a number of years: CSIDH.

This raises the key question: can we simply adapt a scheme like ARC by replacing its elliptic curves with CSIDH? The short answer is no, due to a critical roadblock in constructing the necessary zero-knowledge proofs. While we can, in theory, build the required Sigma protocols or MPC-in-the-Head (MPCitH) proofs from CSIDH, they have a prerequisite that makes them unusable in practice: they require a trusted setup to ensure the prover cannot cheat. This requirement is a non-starter, as no algorithm for performing a trusted setup in CSIDH exists. The trusted setup for sigma protocols can be replaced by a combination of generic techniques from multi-party computation and cut-and-choose protocols, but that adds significant computation cost to the already computationally expensive isogeny operations.

This specific difficulty highlights a more general principle. The high efficiency of classical credentials like ARC is deeply tied to the rich algebraic structure of elliptic curves. Swapping this component for a post-quantum alternative, or moving to generic constructions, fundamentally alters the design and its trade-offs. We must therefore accept that post-quantum anonymous credentials cannot be a simple "lift-and-shift" of today's schemes. They will require new designs built from different cryptographic primitives, such as lattices or hash functions.

Prefabricated schemes from generic approaches

At Cloudflare, we explored a post-quantum privacy pass construction in 2023 that closely resembles the functionality needed for anonymous credentials. The main result is a generic construction that composes separate, quantum-secure building blocks: a digital signature scheme and a general-purpose ZKP system:

The figure shows a cryptographic protocol divided into two main phases: (1.) Issuance: The user commits to a message (without revealing it) and sends the commitment to the server. The server signs the commitment and returns this signed commitment, which serves as a token. The user verifies the server's signature. (2.) Redemption: To use the token, the user presents it and constructs a proof. This proof demonstrates they have a valid signature on the commitment and opens the commitment to reveal the original message. If the server validates the proof, the user and server continue (e.g., to access a rate-limited origin).

The main appeal of this modular design is its flexibility. The experimental implementation uses a modified version of the signature ML-DSA signatures and STARKs, but the components can be easily swapped out. The design provides strong, composable security guarantees derived directly from the underlying parts. A significant speedup for the construction came from replacing the hash function SHA3 in ML-DSA with the zero-knowledge friendly Poseidon.

However, the modularity of our post-quantum Privacy Pass construction incurs a significant performance overhead demonstrated in a clear trade-off between proof generation time and size: a fast 300 ms proof generation requires a large 173 kB signature, while a 4.8s proof generation time cuts the size of the signature nearly in half. A balanced parameter set, which serves as a good benchmark for any dedicated solution to beat, took 660 ms to sign and resulted in a 112 kB signature. The implementation is currently a proof of concept, with perhaps some room for optimization. Alternatively, a different signature like FN-DSA could offer speed improvements: while its issuance is more complex, its verification is far more straightforward, boiling down to a simple hash-to-lattice computation and a norm check.

However, while this construction gives a functional baseline, these figures highlight the performance limitations for a real-time rate limiting system, where every millisecond counts. The 660 ms signing time strongly motivates the development of dedicated cryptographic constructions that trade some of the modularity for performance.

Solid structure: Lattices

Lattices are a natural starting point when discussing potential post-quantum AC candidates. NIST standardized ML-DSA and ML-KEM as signature and KEM algorithms, both of which are based on lattices. So, are lattices the answer to post-quantum anonymous credentials?

The answer is a bit nuanced. While explicit anonymous credential schemes from lattices exist, they have shortcomings that prevent real-world deployment: for example, a recent scheme sacrifices round-optimality for smaller communication size, which is unacceptable for a service like Privacy Pass where every second counts. Given that our RTT is 100ms or less for the majority of users, each extra communication round adds tangible latency especially for those on slower Internet connections. When the final credential size is still over 100 kB, the trade-offs are hard to justify. So, our search continues. We expand our horizon by looking into blind signatures and whether we can adapt them for anonymous credentials.

Two-step approach: Hash-and-sign

A prominent paradigm in lattice-based signatures is the hash-and-sign construction. Here, the message is first hashed to a point in the lattice. Then, the signer uses their secret key, a lattice trapdoor, to generate a vector that, when multiplied with the private key, evaluates to the hashed point in the lattice. This is the core mechanism behind signature schemes like FN-DSA.

Adapting hash-and-sign for blind signatures is tricky, since the signer may not learn the message. This introduces a significant security challenge: If the user can request signatures on arbitrary points, they can mount an attack to extract the trapdoor by repeatedly requesting signatures for carefully chosen arbitrary points. These points can be used to reconstruct a short basis, which is equivalent to a key recovery.

The standard defense against this attack is to require the user to prove in zero-knowledge that the point they are asking to be signed is the blinded output of the specified hash function. However, proving hash preimages leads to the same problem as in the generic post-quantum privacy pass paper: proving a conventional hash function (like SHA3) inside a ZKP is computationally expensive and has a large communication complexity.

This difficult trade-off is at the heart of recent academic work. The state-of-the-art paper presents two lattice-based blind signature schemes with small signature sizes of 22 KB for a signature and 48 bytes for a privately-verifiable protocol that may be more useful in a setting like anonymous credential. However, this focus on the final signature size comes at the cost of an impractical issuance. The user must provide ZKPs for the correct hash and lattice relations that, by the paper’s own analysis, can add to several hundred kilobytes and take 20 seconds to generate and 10 seconds to verify.

While these results are valuable for advancing the field, this trade-off is a significant barrier for any large-scale, practical system. For our use case, a protocol that increases the final signature size moderately in exchange for a more efficient and lightweight issuance process would be a more suitable and promising direction.

Best of two signatures: Hash-and-sign with aborts

A promising technique for blind signatures combines the hash-and-sign paradigm with Fiat-Shamir with aborts, a method that relies on rejection sampling signatures. In this approach, the signer repeatedly attempts to generate a signature and aborts any result that may leak information about the secret key. This process ensures the final signature is statistically independent of the key and is used in modern signatures like ML-DSA. The Phoenix signature scheme uses hash-and-sign with aborts, where a message is first hashed into the lattice and signed, with rejection sampling employed to break the dependency between the signature and the private key.

Building on this foundation is an anonymous credential scheme for hash-and-sign with aborts. The main improvement over hash-and-sign anonymous credentials is that, instead of proving the validity of a hash, the user commits to their attributes, which avoids costly zero-knowledge proofs.

The scheme is fully implemented and credentials with attribute proofs just under 80 KB and signatures under 7 kB. The scheme takes less than 400 ms for issuance and 500 ms for showing the credential. The protocol also has a lot of features necessary for anonymous credentials, allowing users to prove relations between attributes and request pseudonyms for different instances.

This research presents a compelling step towards real-world deployability by combining state-of-the-art techniques to achieve a much healthier balance between performance and security. While the underlying mathematics are a bit more complex, the scheme is fully implemented and with a proof of knowledge of a signature at 40 kB and a prover time under a second, the scheme stands out as a great contender. However, for practical deployment, these figures would likely need a significant speedup to be usable in real-time systems. An improvement seems plausible, given recent advances in lattice samplers. Though the exact scale we can achieve is unclear. Still, we think it would be worthwhile to nudge the underlying design paradigm a little closer to our use cases.

Do it yourself: MPC-in-the-head

While the lattice-based hash-and-sign with aborts scheme provides one path to post-quantum signatures, an alternative approach is emerging from the MPCitH variant VOLE-in-the-Head (VOLEitH).

This scheme builds on Vector Oblivious Linear Evaluation (VOLE), an interactive protocol where one party's input vector is processed with another's secret value delta, creating a correlation. This VOLE correlation is used as a cryptographic commitment to the prover’s input. The system provides a zero-knowledge proof because the prover is bound by this correlation and cannot forge a solution without knowing the secret delta. The verifier, in turn, just has to verify that the final equation holds when the commitment is opened. This system is linearly homomorphic, which means that two commitments can be combined. This property is ideal for the commit-and-prove paradigm, where the prover first commits to the witnesses and then proves the validity of the circuit gate by gate. The primary trade-off is that the proofs are linear in the size of the circuit, but they offer substantially better runtimes. We also use linear-sized proofs for ARC and ACT.

^{Example of evaluating a circuit gate by first committing to each wire and then proving the composition. This is easy for linear gates.}

This commit-and-prove approach allows VOLEitH to efficiently prove the evaluation of symmetric ciphers, which are quantum-resistant. The transformation to a non-interactive protocol follows the standard MPCitH method: the prover commits to all secret values, a challenge is used to select a subset to reveal, and the prover proves consistency.

Efficient implementations operate over two mathematical fields (binary and prime) simultaneously, allowing these ZK circuits to handle both arithmetic and bitwise functions (like XORs) efficiently. Based on this foundation, a recent talk teased the potential for blind signatures from the multivariate quadratic signature scheme MAYO with sizes of just 7.5 kB and signing/verification times under 50 ms.

The VOLEitH approach, as a general-purpose solution system, represents a promising new direction for performant constructions. There are a number of competing in-the-head schemes in the NIST competition for additional signature schemes, including one based on VOLEitH. The current VOLEitH literature focuses on high-performance digital signatures, and an explicit construction for a full anonymous credential system has not yet been proposed. This means that features standard to ACs, such as multi-show unlinkability or the ability to prove relations between attributes, are not yet part of the design, whereas they are explicitly supported by the lattice construction. However, the preliminary results show great potential for performance, and it will be interesting to see the continued cryptanalysis and feature development from this line of VOLEitH in the area of anonymous credentials, especially since the general-purpose construction allows adding features easily.

Approach	Pros	Cons	Practical Viability
Generic Composition	Flexible construction, strong security	Large signatures (112 kB), slow (660 ms)	Low: Performance is not great
Hash-and-sign	Potentially tiny signatures, lots of optimization potential	Current implementation large and slow	Low: Performance is not great
Hash-and-sign with aborts	Full AC system, good balance in communication	Slow runtimes (1s)	Medium: promising but performance would need to improve
VOLEitH	Excellent potential performance (<50ms, 7.5 kB)	not a full AC system, not peer-reviewed	Medium: promising research direction, no full solution available so far

Closing the gap

My (that is Lena's) internship focused on a critical question: what should we look at next to build ACs for the Internet? For us, "the right direction" means developing protocols that can be integrated with real world applications, and developed collaboratively at the IETF. To make these a reality, we need researchers to look beyond blind signatures; we need a complete privacy-preserving protocol that combines blind signatures with efficient zero-knowledge proofs and properties like multi-show credentials that have an internal state. The issuance should also be sublinear in communication size with the number of presentations.

So, with the transition to post-quantum cryptography on the horizon, what are our thoughts on the current IETF proposals? A 2022 NIST presentation on the current state of anonymous credentials states that efficient post-quantum secure solutions are basically non-existent. We argue that the last three years show nice developments in lattices and MPCitH anonymous credentials, but efficient post-quantum protocols still need work. Moving protocols into a post-quantum world isn't just a matter of swapping out old algorithms for new ones. A common approach on constructing post-quantum versions of classical protocols is swapping out the building blocks for their quantum-secure counterpart.

We believe this approach is essential, but not forward-looking. In addition to identifying how modern concerns can be accommodated on old cryptographic designs, we should be building new, post-quantum native protocols.

For ARC, the conceptual path to a post-quantum construction seems relatively straightforward. The underlying cryptography follows a similar structure as the lattice-based anonymous credentials, or, when accepting a protocol with fewer features, the generic post-quantum privacy-pass construction. However, we need to support per-origin rate-limiting, which allows us to transform a token at an origin without leaking us being able to link the redemption to redemptions at other origins, a feature that none of the post-quantum anonymous credential protocols or blind signatures support. Also, ARC is sublinear in communication size with respect to the number of tokens issued, which so far only the hash-and-sign with abort lattices achieve, although the notion of “limited shows” is not present in the current proposal. In addition, it would be great to gauge efficient implementations, especially for blind signatures, as well as looking into efficient zero-knowledge proofs.
For ACT, we need the protocols for ARC and an additional state. Even for the simplest counter, we need the ability to homomorphically subtract from that balance within the credential itself. This is a much more complex cryptographic requirement. It would also be interesting to see a post-quantum double-spend prevention that enforces the sequential nature of ACT.

Working on ACs and other privacy-preserving cryptography inevitably leads to a major bottleneck: efficient zero-knowledge proofs, or to be more exact, efficiently proving hash function evaluations. In a ZK circuit, multiplications are expensive. Each wire in the circuit that performs a multiplication requires a cryptographic commitment, which adds communication overhead. In contrast, other operations like XOR can be virtually "free." This makes a huge difference in performance. For example, SHAKE (the primitive used in ML-DSA) can be orders of magnitude slower than arithmetization-friendly hash functions inside a ZKP. This is why researchers and implementers are already using Poseidon or Poseidon2 to make their protocols faster.

Currently, Ethereum is seriously considering migrating Ethereum to the Poseidon hash and calls for cryptanalysis, but there is no indication of standardization. This is a problem: papers increasingly use different instantiations of Poseidon to fit their use-case, and there are more and more zero-knowledge friendly hash functions coming out, tailored to different use-cases. We would like to see at least one XOF and one hash each for a prime field and for a binary field, ideally with some security levels. And also, is Poseidon the best or just the most well-known ZK friendly cipher? Is it always secure against quantum computers (like we believe AES to be), and are there other attacks like the recent attacks on round-reduced versions?

Looking at algebra and zero-knowledge brings us to a fundamental debate in modern cryptography. Imagine a line representing the spectrum of research: On one end, you have protocols built on very well-analyzed standard assumptions like the SIS problem on lattices or the collision resistance of SHA3. On the other end, you have protocols that gain massive efficiency by using more algebraic structure, which in turn relies on newer, stronger cryptographic assumptions. Breaking novel hash functions is somewhere in the middle.

The answer for the Internet can’t just be to relent and stay at the left end of our graph to be safe. For the ecosystem to move forward, we need to have confidence in both. We need more research to validate the security of ZK-friendly primitives like Poseidon, and we need more scrutiny on the stronger assumptions that enable efficient algebraic methods.

Conclusion

As we’ve explored, the cryptographic properties that make classical ACs efficient, particularly the rich structure of elliptic curves, do not have direct post-quantum equivalents. Our survey of the state of the art from generic compositions using STARKs, to various lattice-based schemes, and promising new directions like MPC-in-the-head, reveals a field full of potential but with no clear winner. The trade-offs between communication cost, computational cost, and protocol rounds remain a significant barrier to practical, large-scale deployment, especially in comparison to elliptic curve constructions.

To bridge this gap, we must move beyond simply building post-quantum blind signatures. We challenge our colleagues in academia and industry to develop complete, post-quantum native protocols that address real-world needs. This includes supporting essential features like the per-origin rate-limiting required for ARC or the complex stateful credentials needed for ACT.

A critical bottleneck for all these approaches is the lack of efficient, standardized, and well-analyzed zero-knowledge-friendly hash functions. We need to research zero-knowledge friendly primitives and build industry-wide confidence to enable efficient post-quantum privacy.

If you’re working on these problems, or you have experience in the management and deployment of classical credentials, now is the time to engage. The world is rapidly adopting credentials for everything from digital identity to bot management, and it is our collective responsibility to ensure these systems are private and secure for a post-quantum future. We can tell for certain that there are more discussions to be had, and if you’re interested in helping to build this more secure and private digital world, we’re hiring 1,111 interns over the course of next year, and have open positions!

Keeping the Internet fast and secure: introducing Merkle Tree Certificates

Luke Valenta — Tue, 28 Oct 2025 13:00:00 GMT

The world is in a race to build its first quantum computer capable of solving practical problems not feasible on even the largest conventional supercomputers. While the quantum computing paradigm promises many benefits, it also threatens the security of the Internet by breaking much of the cryptography we have come to rely on.

To mitigate this threat, Cloudflare is helping to migrate the Internet to Post-Quantum (PQ) cryptography. Today, about 50% of traffic to Cloudflare's edge network is protected against the most urgent threat: an attacker who can intercept and store encrypted traffic today and then decrypt it in the future with the help of a quantum computer. This is referred to as the harvest now, decrypt later threat.

However, this is just one of the threats we need to address. A quantum computer can also be used to crack a server's TLS certificate, allowing an attacker to impersonate the server to unsuspecting clients. The good news is that we already have PQ algorithms we can use for quantum-safe authentication. The bad news is that adoption of these algorithms in TLS will require significant changes to one of the most complex and security-critical systems on the Internet: the Web Public-Key Infrastructure (WebPKI).

The central problem is the sheer size of these new algorithms: signatures for ML-DSA-44, one of the most performant PQ algorithms standardized by NIST, are 2,420 bytes long, compared to just 64 bytes for ECDSA-P256, the most popular non-PQ signature in use today; and its public keys are 1,312 bytes long, compared to just 64 bytes for ECDSA. That's a roughly 20-fold increase in size. Worse yet, the average TLS handshake includes a number of public keys and signatures, adding up to 10s of kilobytes of overhead per handshake. This is enough to have a noticeable impact on the performance of TLS.

That makes drop-in PQ certificates a tough sell to enable today: they don’t bring any security benefit before Q-day — the day a cryptographically relevant quantum computer arrives — but they do degrade performance. We could sit and wait until Q-day is a year away, but that’s playing with fire. Migrations always take longer than expected, and by waiting we risk the security and privacy of the Internet, which is dear to us.

It's clear that we must find a way to make post-quantum certificates cheap enough to deploy today by default for everyone — not just those that can afford it. In this post, we'll introduce you to the plan we’ve brought together with industry partners to the IETF to redesign the WebPKI in order to allow a smooth transition to PQ authentication with no performance impact (and perhaps a performance improvement!). We'll provide an overview of one concrete proposal, called Merkle Tree Certificates (MTCs), whose goal is to whittle down the number of public keys and signatures in the TLS handshake to the bare minimum required.

But talk is cheap. We know from experience that, as with any change to the Internet, it's crucial to test early and often. Today we're announcing our intent to deploy MTCs on an experimental basis in collaboration with Chrome Security. In this post, we'll describe the scope of this experiment, what we hope to learn from it, and how we'll make sure it's done safely.

The WebPKI today — an old system with many patches

Why does the TLS handshake have so many public keys and signatures?

Let's start with Cryptography 101. When your browser connects to a website, it asks the server to authenticate itself to make sure it's talking to the real server and not an impersonator. This is usually achieved with a cryptographic primitive known as a digital signature scheme (e.g., ECDSA or ML-DSA). In TLS, the server signs the messages exchanged between the client and server using its secret key, and the client verifies the signature using the server's public key. In this way, the server confirms to the client that they've had the same conversation, since only the server could have produced a valid signature.

If the client already knows the server's public key, then only 1 signature is required to authenticate the server. In practice, however, this is not really an option. The web today is made up of around a billion TLS servers, so it would be unrealistic to provision every client with the public key of every server. What's more, the set of public keys will change over time as new servers come online and existing ones rotate their keys, so we would need some way of pushing these changes to clients.

This scaling problem is at the heart of the design of all PKIs.

Trust is transitive

Instead of expecting the client to know the server's public key in advance, the server might just send its public key during the TLS handshake. But how does the client know that the public key actually belongs to the server? This is the job of a certificate.

A certificate binds a public key to the identity of the server — usually its DNS name, e.g., cloudflareresearch.com. The certificate is signed by a Certification Authority (CA) whose public key is known to the client. In addition to verifying the server's handshake signature, the client verifies the signature of this certificate. This establishes a chain of trust: by accepting the certificate, the client is trusting that the CA verified that the public key actually belongs to the server with that identity.

Clients are typically configured to trust many CAs and must be provisioned with a public key for each. Things are much easier however, since there are only 100s of CAs instead of billions. In addition, new certificates can be created without having to update clients.

These efficiencies come at a relatively low cost: for those counting at home, that's +1 signature and +1 public key, for a total of 2 signatures and 1 public key per TLS handshake.

That's not the end of the story, however. As the WebPKI has evolved, so have these chains of trust grown a bit longer. These days it's common for a chain to consist of two or more certificates rather than just one. This is because CAs sometimes need to rotate their keys, just as servers do. But before they can start using the new key, they must distribute the corresponding public key to clients. This takes time, since it requires billions of clients to update their trust stores. To bridge the gap, the CA will sometimes use the old key to issue a certificate for the new one and append this certificate to the end of the chain.

That's +1 signature and +1 public key, which brings us to 3 signatures and 2 public keys. And we still have a little ways to go.

Trust but verify

The main job of a CA is to verify that a server has control over the domain for which it’s requesting a certificate. This process has evolved over the years from a high-touch, CA-specific process to a standardized, mostly automated process used for issuing most certificates on the web. (Not all CAs fully support automation, however.) This evolution is marked by a number of security incidents in which a certificate was mis-issued to a party other than the server, allowing that party to impersonate the server to any client that trusts the CA.

Automation helps, but attacks are still possible, and mistakes are almost inevitable. Earlier this year, several certificates for Cloudflare's encrypted 1.1.1.1 resolver were issued without our involvement or authorization. This apparently occurred by accident, but it nonetheless put users of 1.1.1.1 at risk. (The mis-issued certificates have since been revoked.)

Ensuring mis-issuance is detectable is the job of the Certificate Transparency (CT) ecosystem. The basic idea is that each certificate issued by a CA gets added to a public log. Servers can audit these logs for certificates issued in their name. If ever a certificate is issued that they didn't request itself, the server operator can prove the issuance happened, and the PKI ecosystem can take action to prevent the certificate from being trusted by clients.

Major browsers, including Firefox and Chrome and its derivatives, require certificates to be logged before they can be trusted. For example, Chrome, Safari, and Firefox will only accept the server's certificate if it appears in at least two logs the browser is configured to trust. This policy is easy to state, but tricky to implement in practice:

Operating a CT log has historically been fairly expensive. Logs ingest billions of certificates over their lifetimes: when an incident happens, or even just under high load, it can take some time for a log to make a new entry available for auditors.
Clients can't really audit logs themselves, since this would expose their browsing history (i.e., the servers they wanted to connect to) to the log operators.

The solution to both problems is to include a signature from the CT log along with the certificate. The signature is produced immediately in response to a request to log a certificate, and attests to the log's intent to include the certificate in the log within 24 hours.

Per browser policy, certificate transparency adds +2 signatures to the TLS handshake, one for each log. This brings us to a total of 5 signatures and 2 public keys in a typical handshake on the public web.

The future WebPKI

The WebPKI is a living, breathing, and highly distributed system. We've had to patch it a number of times over the years to keep it going, but on balance it has served our needs quite well — until now.

Previously, whenever we needed to update something in the WebPKI, we would tack on another signature. This strategy has worked because conventional cryptography is so cheap. But 5 signatures and 2 public keys on average for each TLS handshake is simply too much to cope with for the larger PQ signatures that are coming.

The good news is that by moving what we already have around in clever ways, we can drastically reduce the number of signatures we need.

Crash course on Merkle Tree Certificates

Merkle Tree Certificates (MTCs) is a proposal for the next generation of the WebPKI that we are implementing and plan to deploy on an experimental basis. Its key features are as follows:

All the information a client needs to validate a Merkle Tree Certificate can be disseminated out-of-band. If the client is sufficiently up-to-date, then the TLS handshake needs just 1 signature, 1 public key, and 1 Merkle tree inclusion proof. This is quite small, even if we use post-quantum algorithms.
The MTC specification makes certificate transparency a first class feature of the PKI by having each CA run its own log of exactly the certificates they issue.

Let's poke our head under the hood a little. Below we have an MTC generated by one of our internal tests. This would be transmitted from the server to the client in the TLS handshake:

Looks like your average PEM encoded certificate. Let's decode it and look at the parameters:

$ openssl x509 -in merkle-tree-cert.pem -noout -text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 531 (0x213)
        Signature Algorithm: 1.3.6.1.4.1.44363.47.0
        Issuer: 1.3.6.1.4.1.44363.47.1=44363.48.3
        Validity
            Not Before: Oct 21 15:33:26 2025 GMT
            Not After : Oct 28 15:33:26 2025 GMT
        Subject: CN=cloudflareresearch.com
        Subject Public Key Info:
            Public Key Algorithm: id-ecPublicKey
                Public-Key: (256 bit)
                pub:
                    04:70:ed:e1:96:87:b4:22:ef:fb:dc:a9:cd:9c:5c:
                    ef:1e:9e:ab:1b:6d:d7:11:74:7b:76:c8:3c:a1:5f:
                    94:37:45:99:d8:80:e3:5c:24:4f:28:46:b5:bf:84:
                    60:d8:fc:eb:82:5a:c4:4e:33:90:c7:b3:36:51:0c:
                    92:6d:bf:88:27
                ASN1 OID: prime256v1
                NIST CURVE: P-256
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature
            X509v3 Extended Key Usage:
                TLS Web Server Authentication
            X509v3 Subject Alternative Name:
                DNS:cloudflareresearch.com, DNS:static-ct.cloudflareresearch.com
    Signature Algorithm: 1.3.6.1.4.1.44363.47.0
    Signature Value:
        00:00:00:00:00:00:02:00:00:00:00:00:00:00:02:58:00:e0:
        44:be:03:a5:bd:6a:b7:f2:9e:39:77:4c:16:4c:f8:06:e5:e1:
        55:c0:93:21:c6:79:83:3c:dd:5b:e6:57:89:c0:75:b3:4c:ec:
        75:8a:0b:53:a0:ca:1c:07:0c:1a:92:dd:c7:7c:a2:23:5d:83:
        0e:e4:23:43:38:af:43:20:a8:66:44:34:95:87:ea:2b:f0:0f:
        16:52:bb:ea:67:67:1e:89:36:4f:90:d4:05:55:89:46:f1:b7:
        b6:68:84:d3:57:31:ae:2b:c3:79:31:86:85:9d:24:ed:cf:25:
        a4:5c:fd:8f:f6:76:14:55:dd:67:2e:df:d6:8c:25:0d:52:48:
        c8:e3:fe:f9:7c:e6:a5:30:52:a5:b5:c7:3a:89:a5:c1:f6:4b:
        5b:95:ef:70:b8:91:fc:61:0f:6d:16:de:39:e9:a0:59:49:2b:
        34:71:7c:2a:16:da:c7:af:de:f7:01:94:10:c4:62:d1:f5:00:
        87:bd:e8:a2:f4:df:3b:35:79:27:0e:fc:cc:43:e7:60:5a:df:
        df:06:e8:d3:7e:eb:b3:bf:7b:25:43:0f:34:9a:26:c0:d3:6d:
        5d:0c:28:bc:87:58:58:15:00:00

While some of the parameters probably look familiar, others will look unusual. On the familiar side, the subject and public key are exactly what we might expect: the DNS name is cloudflareresearch.com and the public key is for a familiar signature algorithm, ECDSA-P256. This algorithm is not PQ, of course — in the future we would put ML-DSA-44 there instead.

On the unusual side, OpenSSL appears to not recognize the signature algorithm of the issuer and just prints the raw OID and bytes of the signature. There's a good reason for this: the MTC does not have a signature in it at all! So what exactly are we looking at?

The trick to leave out signatures is that a Merkle Tree Certification Authority (MTCA) produces its signatureless certificates in batches rather than individually. In place of a signature, the certificate has an inclusion proof of the certificate in a batch of certificates signed by the MTCA.

To understand how inclusion proofs work, let's think about a slightly simplified version of the MTC specification. To issue a batch, the MTCA arranges the unsigned certificates into a data structure called a Merkle tree that looks like this:

Each leaf of the tree corresponds to a certificate, and each inner node is equal to the hash of its children. To sign the batch, the MTCA uses its secret key to sign the head of the tree. The structure of the tree guarantees that each certificate in the batch was signed by the MTCA: if we tried to tweak the bits of any one of the certificates, the treehead would end up having a different value, which would cause the signature to fail.

An inclusion proof for a certificate consists of the hash of each sibling node along the path from the certificate to the treehead:

Given a validated treehead, this sequence of hashes is sufficient to prove inclusion of the certificate in the tree. This means that, in order to validate an MTC, the client also needs to obtain the signed treehead from the MTCA.

This is the key to MTC's efficiency:

Signed treeheads can be disseminated to clients out-of-band and validated offline. Each validated treehead can then be used to validate any certificate in the corresponding batch, eliminating the need to obtain a signature for each server certificate.
During the TLS handshake, the client tells the server which treeheads it has. If the server has a signatureless certificate covered by one of those treeheads, then it can use that certificate to authenticate itself. That's 1 signature,1 public key and 1 inclusion proof per handshake, both for the server being authenticated.

Now, that's the simplified version. MTC proper has some more bells and whistles. To start, it doesn’t create a separate Merkle tree for each batch, but it grows a single large tree, which is used for better transparency. As this tree grows, periodically (sub)tree heads are selected to be shipped to browsers, which we call landmarks. In the common case browsers will be able to fetch the most recent landmarks, and servers can wait for batch issuance, but we need a fallback: MTC also supports certificates that can be issued immediately and don’t require landmarks to be validated, but these are not as small. A server would provision both types of Merkle tree certificates, so that the common case is fast, and the exceptional case is slow, but at least it’ll work.

Experimental deployment

Ever since early designs for MTCs emerged, we’ve been eager to experiment with the idea. In line with the IETF principle of “running code”, it often takes implementing a protocol to work out kinks in the design. At the same time, we cannot risk the security of users. In this section, we describe our approach to experimenting with aspects of the Merkle Tree Certificates design without changing any trust relationships.

Let’s start with what we hope to learn. We have lots of questions whose answers can help to either validate the approach, or uncover pitfalls that require reshaping the protocol — in fact, an implementation of an early MTC draft by Maximilian Pohl and Mia Celeste did exactly this. We’d like to know:

What breaks? Protocol ossification (the tendency of implementation bugs to make it harder to change a protocol) is an ever-present issue with deploying protocol changes. For TLS in particular, despite having built-in flexibility, time after time we’ve found that if that flexibility is not regularly used, there will be buggy implementations and middleboxes that break when they see things they don’t recognize. TLS 1.3 deployment took years longer than we hoped for this very reason. And more recently, the rollout of PQ key exchange in TLS caused the Client Hello to be split over multiple TCP packets, something that many middleboxes weren't ready for.

What is the performance impact? In fact, we expect MTCs to reduce the size of the handshake, even compared to today's non-PQ certificates. They will also reduce CPU cost: ML-DSA signature verification is about as fast as ECDSA, and there will be far fewer signatures to verify. We therefore expect to see a reduction in latency. We would like to see if there is a measurable performance improvement.

What fraction of clients will stay up to date? Getting the performance benefit of MTCs requires the clients and servers to be roughly in sync with one another. We expect MTCs to have fairly short lifetimes, a week or so. This means that if the client's latest landmark is older than a week, the server would have to fallback to a larger certificate. Knowing how often this fallback happens will help us tune the parameters of the protocol to make fallbacks less likely.

In order to answer these questions, we are implementing MTC support in our TLS stack and in our certificate issuance infrastructure. For their part, Chrome is implementing MTC support in their own TLS stack and will stand up infrastructure to disseminate landmarks to their users.

As we've done in past experiments, we plan to enable MTCs for a subset of our free customers with enough traffic that we will be able to get useful measurements. Chrome will control the experimental rollout: they can ramp up slowly, measuring as they go and rolling back if and when bugs are found.

Which leaves us with one last question: who will run the Merkle Tree CA?

Bootstrapping trust from the existing WebPKI

Standing up a proper CA is no small task: it takes years to be trusted by major browsers. That’s why Cloudflare isn’t going to become a “real” CA for this experiment, and Chrome isn’t going to trust us directly.

Instead, to make progress on a reasonable timeframe, without sacrificing due diligence, we plan to "mock" the role of the MTCA. We will run an MTCA (on Workers based on our StaticCT logs), but for each MTC we issue, we also publish an existing certificate from a trusted CA that agrees with it. We call this the bootstrap certificate. When Chrome’s infrastructure pulls updates from our MTCA log, they will also pull these bootstrap certificates, and check whether they agree. Only if they do, they’ll proceed to push the corresponding landmarks to Chrome clients. In other words, Cloudflare is effectively just “re-encoding” an existing certificate (with domain validation performed by a trusted CA) as an MTC, and Chrome is using certificate transparency to keep us honest.

Conclusion

With almost 50% of our traffic already protected by post-quantum encryption, we’re halfway to a fully post-quantum secure Internet. The second part of our journey, post-quantum certificates, is the hardest yet though. A simple drop-in upgrade has a noticeable performance impact and no security benefit before Q-day. This means it’s a hard sell to enable today by default. But here we are playing with fire: migrations always take longer than expected. If we want to keep an ubiquitously private and secure Internet, we need a post-quantum solution that’s performant enough to be enabled by default today.

Merkle Tree Certificates (MTCs) solves this problem by reducing the number of signatures and public keys to the bare minimum while maintaining the WebPKI's essential properties. We plan to roll out MTCs to a fraction of free accounts by early next year. This does not affect any visitors that are not part of the Chrome experiment. For those that are, thanks to the bootstrap certificates, there is no impact on security.

We’re excited to keep the Internet fast and secure, and will report back soon on the results of this experiment: watch this space! MTC is evolving as we speak, if you want to get involved, please join the IETF PLANTS mailing list.

Prepping for post-quantum: a beginner’s guide to lattice cryptography

Christopher Patton — Fri, 21 Mar 2025 13:00:00 GMT

The cryptography that secures the Internet is evolving, and it's time to catch up. This post is a tutorial on lattice cryptography, the paradigm at the heart of the post-quantum (PQ) transition.

Twelve years ago (in 2013), the revelation of mass surveillance in the US kicked off the widespread adoption of TLS for encryption and authentication on the web. This transition was buoyed by the standardization and implementation of new, more efficient public-key cryptography based on elliptic curves. Elliptic curve cryptography was both faster and required less communication than its predecessors, including RSA and Diffie-Hellman over finite fields.

Today's transition to PQ cryptography addresses a looming threat for TLS and beyond: once built, a sufficiently large quantum computer can be used to break all public-key cryptography in use today. And we continue to see advancements in quantum-computer engineering that bring us closer to this threat becoming a reality.

Fortunately, this transition is well underway. The research and standards communities have spent the last several years developing alternatives that resist quantum cryptanalysis. For its part, Cloudflare has contributed to this process and is an early adopter of newly developed schemes. In fact, PQ encryption has been available at our edge since 2022 and is used in over 35% of non-automated HTTPS traffic today (2025). And this year we're beginning a major push towards PQ authentication for the TLS ecosystem.

Lattice-based cryptography is the first paradigm that will replace elliptic curves. Apart from being PQ secure, lattices are often as fast, and sometimes faster, in terms of CPU time. However, this new paradigm for public key crypto has one major cost: lattices require much more communication than elliptic curves. For example, establishing an encryption key using lattices requires 2272 bytes of communication between the client and the server (ML-KEM-768), compared to just 64 bytes for a key exchange using a modern elliptic-curve-based scheme (X25519). Accommodating such costs requires a significant amount of engineering, from dealing with TCP packet fragmentation, to reworking TLS and its public key infrastructure. Thus, the PQ transition is going to require the participation of a large number of people with a variety of backgrounds, not just cryptographers.

The primary audience for this blog post is those who find themselves involved in the PQ transition and want to better understand what's going on under the hood. However, more fundamentally, we think it's important for everyone to understand lattice cryptography on some level, especially if we're going to trust it for our security and privacy.

We'll assume you have a software-engineering background and some familiarity with concepts like TLS, encryption, and authentication. We'll see that the math behind lattice cryptography is, at least at the highest level, not difficult to grasp. Readers with a crypto-engineering background who want to go deeper might want to start with the excellent tutorial by Vadim Lyubashevsky on which this blog post is based. We also recommend Sophie Schmieg's blog on this subject.

While the transition to lattice cryptography incurs costs, it also creates opportunities. Many things we can build with elliptic curves we can also build with lattices, though not always as efficiently; but there are also things we can do with lattices that we don't know how to do efficiently with anything else. We'll touch on some of these applications at the very end.

We're going to cover a lot of ground in this post. If you stick with it, we hope you'll come away feeling empowered, not only to tackle the engineering challenges the PQ transition entails, but to solve problems you didn't know how to solve before.

Strap in — let's have some fun!

Encryption

The most pressing problem for the PQ transition is to ensure that tomorrow's quantum computers don't break today's encryption. An attacker today can store the packets exchanged between your laptop and a website you visit, and then, some time in the future, decrypt those packets with the help of a quantum computer. This means that much of the sensitive information transiting the Internet today — everything from API tokens and passwords to database encryption keys — may one day be unlocked by a quantum computer.

In fact, today's encryption in TLS is mostly PQ secure: what's at risk is the process by which your browser and a server establish an encryption key. Today this is usually done with elliptic-curve-based schemes, which are not PQ secure; our goal for this section is to understand how to do key exchange with lattices-based schemes, which are.

We will work through and implement a simplified version of ML-KEM, a.k.a. Kyber, the most widely deployed PQ key exchange in use today. Our code will be less efficient and secure than a spec-compliant, production-quality implementation, but will be good enough to grasp the main ideas.

Our starting point is a protocol that looks an awful lot like Diffie-Hellman (DH) key exchange. For those readers unacquainted with DH, the goal is for Alice and Bob to establish a shared secret over an insecure network. To do so, each picks a random secret number, computes the corresponding "key share", and sends the key share to the other:

Alice's secret number is $s$ and her key share is $g^s$; Bob's secret number is $r$ and his key share is $g^r$. Then given their secret and their peer's key share, each can compute $g^{rs}$. The security of this protocol comes from how we choose $g$, $s$, and $r$ and how we do arithmetic. The most efficient instantiation of DH uses elliptic curves.

In ML-KEM we replace operations on elliptic curves with matrix operations. It's not quite a drop-in replacement, so we'll need a little linear algebra to make sense of it. But don't worry: we're going to work with Python so we have running code to play with, and we'll use NumPy to keep things high level.

All the math we'll need

A matrix is just a two-dimensional array of numbers. In NumPy, we can create a matrix as follows (importing numpy as np):

A = np.matrix([[1, 2, 3],
               [4, 5, 6],
               [7, 8, 9]])

This defines A to be the 3-by-3 matrix with entries A[0,0]==1, A[0,1]==2, A[0,2]==3, A[1,0]==4, and so on.

For the purposes of this post, the entries of our matrices will always be integers. Furthermore, whenever we add, subtract, or multiply two integers, we then reduce the result, just like we do with hours on a clock, so that we end up with a number in range(Q) for some positive number Q, called the modulus. The exact value doesn’t really matter now, but for ML-KEM it’s Q=3329, so let's go with that for now. (The modulus for a clock would be Q=12.)

In Python, we write multiplication of integers a and b modulo Q as c = a*b % Q. Here we compute a*b, divide the result by Q, then set c to the remainder. For example, 42*1337 % Q is equal to 2890 rather than 56154. Modular addition and subtraction are done analogously. For the rest of this blog, we will sometimes omit "% Q" when it's clear in context that we mean modular arithmetic.

Next, we'll need three operations on matrices.

The first is matrix transpose, written A.T in NumPy. This operation flips the matrix along its diagonal so that A.T[j,i] == A[i,j] for all rows i and columns j:

print(A.T)
# [[1 4 7]
#  [2 5 8]
#  [3 6 9]]

To visualize this, imagine writing down a matrix on a translucent piece of paper. Draw a line from the top left corner to the bottom right corner of that paper, then rotate the paper 180° around that line:

The second operation we'll need is matrix multiplication. Normally, we will multiply a matrix by a column vector, which is just a matrix with one column. For example, the following 3-by-1 matrix is a column vector:

s = np.matrix([[0],
               [1],
               [0]])

We can also write s more concisely as np.matrix([[0,1,0]]).T. To multiply a square matrix A by a column vector s, we compute the dot product of each row of A with s. That is, if t = A*s % Q, then t[i] == (A[i,0]*s[0,0] + A[i,1]*s[1,0] + A[i,2]*s[2,0]) % Q for each row i. The output will always be a column vector:

print(A*s % Q)
# [[2]
#  [5]
#  [8]]

The number of rows of this column vector is equal to the number of rows of the matrix on the left hand side. In particular, if we take our column vector s, transpose it into a 1-by-3 matrix, and multiply it by a 3-by-1 matrix r, then we end up with a 1-by-1 matrix:

r = np.matrix([[1,2,3]]).T
print(s.T*r % Q)
# [[2]]

The final matrix operation we'll need is matrix addition. If A and B are both N-by-M matrices, then C = (A+B) % Q is the N-by-M matrix for which C[i,j] == (A[i,j]+B[i,j]) % Q. Of course, this only works if the matrices we're adding have the same dimensions.

Warm up

Enough maths — let's get to exchanging some keys. We start with the DH diagram from before and swap out the computations with matrix operations. Note that this protocol is not secure, but will be the basis of a secure key exchange mechanism we'll develop in the next section:

Alice and Bob agree on a public, N-by-N matrix A. This is analogous to the number $g$ that Alice and Bob agree on in the DH diagram.
Alice chooses a random length-N vector s and sends t = A*s % Q to Bob.
Bob chooses a random length-N vector r and sends u = r.T*A % Q to Alice. You can also compute this as (A.T*r).T % Q.

The vectors t and u are analogous to DH key shares. After the exchange of these key shares, Alice and Bob can compute a shared secret. Alice computes the shared secret as u*s % Q and Bob computes the shared secret as r.T*t % Q. To see why they compute the same key, notice that u*s == (r.T*A)*s == r.T*(A*s) == r.T*t.

In fact, this key exchange is essentially what happens in ML-KEM. However, we don't use this directly, but rather as part of a public key encryption scheme. Public key encryption involves three algorithms:

key_gen(): The key generation algorithm that outputs a public encryption key pk and the corresponding secret decryption key sk.
encrypt(): The encryption algorithm that takes the public key and a plaintext and outputs a ciphertext.
decrypt(): The decryption algorithm that takes the secret key and a ciphertext and outputs the underlying plaintext. That is, decrypt(sk, encrypt(pk, ptxt)) == ptxt for any plaintext ptxt.

We'll say the scheme is secure if, given a ciphertext and the public key used to encrypt it, no attacker can discern any information about the underlying plaintext without knowledge of the secret key. Once we have this encryption scheme, we then transform it into a key-encapsulation mechanism (the "KEM" in "ML-KEM") in the last step. A KEM is very similar to encryption except that the plaintext is always a randomly generated key.

Our encryption scheme is as follows:

key_gen(): To generate a key pair, we choose a random, square matrix A and a random column vector s. We set our public key to (A,t=A*s % Q) and our secret key to s. Notice that t is Alice's key share from the key exchange protocol above.
encrypt(): Suppose our plaintext ptxt is an integer in range(Q). To encrypt ptxt, Bob generates his key share u. He then derives the shared secret and adds it to ptxt. The ciphertext has two components:

u = r.T*A % Q
v = (r.T*t + m) % Q

Here m is a 1-by-1 matrix containing the plaintext, i.e., m = np.matrix([[ptxt]]), and r is a random column vector.

decrypt(): To decrypt, Alice computes the shared secret and subtracts it from v:

m = (v - u*s) % Q

Some readers will notice that this looks an awful lot like El Gamal encryption. This isn't a coincidence. Good cryptographers roll their own crypto; great cryptographers steal from good cryptographers.

Let's now put this together into code. The last thing we'll need is a method of generating random matrices and column vectors. We call this function gen_mat() below. Take a crack at implementing this yourself. Our scheme has two parameters: the modulus Q; and the dimension of N of the matrix and column vectors. The choice of N matters for security, but for now feel free to pick whatever value you want.

def key_gen():
    # Here `gen_mat()` returns an N-by-N matrix with entries
    # randomly chosen from `range(0, Q)`.
    A = gen_mat(N, N, 0, Q)
    # Like above except the matrix is N-by-1.
    s = gen_mat(N, 1, 0, Q)
    t = A*s % Q
    return ((A, t), s)

def encrypt(pk, ptxt):
    (A, t) = pk
    m = np.matrix([[ptxt]])
    r = gen_mat(N, 1, 0, Q)
    u = r.T*A % Q
    v = (r.T*t + m) % Q
    return (u, v)

def decrypt(sk, ctxt):
    s = sk
    (u, v) = ctxt
    m = (v - u*s) % Q
    return m[0,0]

# Test
assert decrypt(sk, encrypt(pk, 1)) == 1

Making the scheme secure (or "What is a lattice?")

By now, you might be wondering what on Earth a lattice even is. We promise we'll define it, but before we do, it'll help to understand why our warm-up scheme is insecure and what it'll take to fix it.

Readers familiar with linear algebra may already see the problem: in order for this scheme to be secure, it should be impossible for the attacker to recover the secret key s; but given the public (A,t), we can immediately solve for s using Gaussian elimination.

In more detail, if A is invertible, we can write the secret key as A^-1*t == A^-1*(A*s) == (A^-1*A)*s == s, where A^-1 is the inverse of A. (When you multiply a matrix by its inverse, you get the identity matrix I, which simply takes a column vector to itself, i.e., I*s == s.) We can use Gaussian elimination to compute this matrix. Intuitively, all we're doing is solving a set of linear equations, where the entries of s are the unknown variables. (Note that this is possible even if A is not invertible.)

In order to make this encryption scheme secure, we need to make it a little... “messier”.

Let's get messy

For starters, we need to make it hard to recover the secret key from the public key. Let's try the following: generate another random vector e and add it into A*s. Our key generation algorithm becomes:

def key_gen():
    A = gen_mat(N, N, 0, Q)
    s = gen_mat(N, 1, 0, Q)
    e = gen_mat(N, 1, 0, Q)
    t = (A*s + e) % Q
    return ((A, t), s)

Our formula for the column vector component of the public key, t, now includes an additive term e, which we'll call the error. Like the secret key, the error is just a random vector.

Notice that the previous attack no longer works: since A^-1*t == A^-1*(A*s + e) == A^-1*(A*s) + A^-1*e == s + A^-1*e, we need to know e in order to compute s.

Great, but this patch creates another problem. Take a second to plug in this new key generation algorithm into your implementation and test it out. What happens?

You should see that decrypt() now outputs garbage. We can see why using a little algebra:

(v - u*s) == (r.T*t + m) - (r.T*A)*s
                == r.T*(A*s + e) + m - (r.T*A)*s
                == r.T*(A*s) + r.T*e + m - r.T*(A*s)
                == r.T*e + m

The entries of r and e are sampled randomly, so r.T*e is also uniformly random. It's as if we encrypted m with a one-time pad, then threw away the one-time pad!

Handling decryption errors

What can we do about this? First, it would help if r.T*e were small so that decryption yields something that's close to the plaintext. Imagine we could generate r and e in such a way that r.T*e were in range(-epsilon, epsilon+1) for some small epsilon. Then decrypt would output a number in range(ptxt-epsilon, ptxt+epsilon+1), which would be pretty close to the actual plaintext.

However, we need to do better than get close. Imagine your browser failing to load your favorite website one-third of the time because of a decryption error. Nobody has time for that.

ML-KEM reduces the probability of decryption errors by being clever about how we encode the plaintext. Suppose all we want to do is encrypt a single bit, i.e., ptxt is either 0 or 1. Consider the numbers in range(Q), and split the number line into four chunks of roughly equal length:

Here we've labeled the region around zero (-Q/4 to Q/4 modulo Q) with ptxt=0 and the region far away from zero with ptxt=1. To encode the bit, we set it to the integer corresponding to the middle of its range, i.e., m = np.matrix([[ptxt * Q//2]]). (Note the double "//" — this denotes integer division in Python.) To decode, we choose the ptxt corresponding to whatever range m[0,0] is in. That way if the decryption error is small, then we're highly likely to end up in the correct range.

Now all that's left is to ensure the decryption error, r.T*e, is small. We do this by sampling short vectors r and e. By "short" we mean the entries of these vectors are sampled from a range that is much smaller than range(Q). In particular, we'll pick some small positive integer beta and sample entries range(-beta,beta+1).

How do we choose beta? Well, it should be small enough that decryption succeeds with overwhelming probability, but not so small that r and e are easy to guess and our scheme is broken. Take a minute or two to play with this. The parameters we can vary are:

the modulus Q
the dimension of the column vectors N
the shortness parameter beta

For what ranges of these parameters is the decryption error low but the secret vectors are hard to guess? For what ranges is our scheme most efficient, in terms of runtime and communication cost (size of the public key plus the ciphertext)? We'll give a concrete answer at the end of this section, but in the meantime, we encourage you to play with this a bit.

Gauss strikes back

At this point, we have a working encryption scheme that mitigates at least one key-recovery attack. We've come pretty far, but we have at least one more problem.

Take another look at our formula for the ciphertext ctxt = (u,v). What would happen if we managed to recover the random vector r? That would be catastrophic, since v == r.T*t + m, and we already know t (part of the public key) and v (part of the ciphertext).

Just as we were able to compute the secret key from the public key in our initial scheme, we can recover the encryption randomness r from the ciphertext component u using Gaussian elimination. Again, this is just because r is the solution to a system of linear equations.

We can mitigate this plaintext-recovery attack just as before, by adding some noise. In particular, we'll generate a short vector according to gen_mat(N,1,-beta,beta+1) and add it into u. We also need to add noise to v in the same way, for reasons that we'll discuss in the next section.

Once again, adding noise increases the probability of a decryption error, but this time the magnitude of the error also depends on the secret key s. To see this, recall that during decryption, we multiply u by s (to compute the shared secret), and the error vector is an additive term. We'll therefore need s to be a short vector as well.

Let's now put together everything we've learned into an updated encryption scheme. Our scheme now has three parameters, Q, N, and beta, and can be used to encrypt a single bit:

def key_gen():
    A = gen_mat(N, N, 0, Q)
    s = gen_mat(N, 1, -beta, beta+1)
    e1 = gen_mat(N, 1, -beta, beta+1)
    t = (A*s + e1) % Q
    return ((A, t), s)

def encrypt(pk, ptxt):
    (A, t) = pk
    m = np.matrix([[ptxt*(Q//2) % Q]])
    r = gen_mat(N, 1, -beta, beta+1)
    e2 = gen_mat(N, 1, -beta, beta+1)
    e3 = gen_mat(1, 1, -beta, beta+1)
    u = (r.T*A + e2) % Q
    v = (r.T*t + e3 + m) % Q
    return (u, v)

def decrypt(sk, ctxt):
    s = sk
    (u, v) = ctxt
    m = (v - u*s) % Q
    if m[0,0] in range(Q//4, 3*Q//4):
        return 1
    return 0

# Test
assert decrypt(sk, encrypt(pk, 0)) == 0
assert decrypt(sk, encrypt(pk, 1)) == 1

Before moving on, try to find parameters for which the scheme works and for which the secret and error vectors seem hard to guess.

Learning with errors

So far we have a functioning encryption scheme for which we've mitigated two attacks, one a key-recovery attack and the other a plaintext-recovery attack. There seems to be no other obvious way of breaking our scheme, unless we choose parameters that are so weak that an attacker can easily guess the secret key s or ciphertext randomness r. Again, these vectors need to be short in order to prevent decryption errors, but not so short that they are easy to guess. (Likewise for the error terms.)

Still, there may be other attacks that require a little more sophistication to pull off. For instance, there might be some mathematical analysis we can do to recover, or at least make a good guess of, a portion of the ciphertext randomness. This raises a more fundamental question: in general, how do we establish that cryptosystems like this are actually secure?

As a first step, cryptographers like to try and reduce the attack surface. Modern cryptosystems are designed so that the problem of attacking the scheme reduces to solving some other problem that is easier to reason about.

Our public key encryption scheme is an excellent illustration of this idea. Think back to the key- and plaintext-recovery attacks from the previous section. What do these attacks have in common?

In both instances, the attacker knows some public vector that allowed it to recover a secret vector:

In the key-recovery attack, the attacker knew t for which A*s == t.
In the plaintext-recovery attack, the attacker knew u for which r.T*A == u (or, equivalently, A.T*r == u.T).

The fix in both cases was to construct the public vector in such a manner that it is hard to solve for the secret, namely, by adding an error term. However, ideally the public vector would reveal no information about the secret whatsoever. This ideal is formalized by the Learning With Errors (LWE) problem.

The LWE problem asks the attacker to distinguish between two distributions. Concretely, imagine we flip a coin, and if it comes up heads, we sample from the first distribution and give the sample to the attacker; and if the coin comes up tails, we sample from the second distribution and give the sample to the attacker. The distributions are as follows:

(A,t=A*s + e) where A is a random matrix generated with gen_mat(N,N,0,Q) and s and e are short vectors generated with gen_mat(N,1,-beta,beta+1).
(A,t) where A is a random matrix generated with gen_mat(N,N,0,Q) and t is a random vector generated with gen_mat(N,1,0,Q).

The first distribution corresponds to what we actually do in the encryption scheme; in the second, t is just a random vector, and no longer a secret vector at all. We say that the LWE problem is "hard" if no attacker is able to guess the coin flip with probability significantly better than one-half.

Our encryption is passively secure — meaning the ciphertext doesn't leak any information about the plaintext — if the LWE problem is hard for the parameters we chose. To see why, notice that both the public key and ciphertext look like LWE instances; if we can replace each instance with an instance of the random distribution, then the ciphertext would be completely independent of the plaintext and therefore leak no information about it at all. Note that, for this argument to go through, we also have to add the error term e3 to the ciphertext component v.

Choosing the parameters

We've established that if solving the LWE problem is hard for parameters N, Q, and beta, then so is breaking our public key encryption scheme. What's left for us to do is tune the parameters so that solving LWE is beyond the reach of any attacker we can think of. This is where lattices come in.

Lattices

A lattice is an infinite grid of points in high-dimensional space. A two-dimensional lattice might look something like this:

The points always follow a clear pattern that resembles "lattice work" you might see in a garden:

^{(Source: https://picryl.com/media/texture-wood-vintage-backgrounds-textures-8395bb)}

For cryptography, we care about a special class of lattices, those defined by a matrix P that "recognizes" points in the lattice. That is, the lattice recognized by P is the set of vectors v for which P*v == 0, where "0" denotes the all-zero vector. The all-zero vector is np.zeros((N,1), dtype=int) in NumPy.

Readers familiar with linear algebra may have a different definition of lattices in mind: in general, a lattice is the set of points obtained by taking linear combinations of some basis. Our lattices can also be formulated in this way, i.e., for a matrix P that recognizes a lattice, we can compute the basis vectors that generate the lattice. However, we don't much care about this representation here.

The LWE problem boils down to distinguishing a set of points that are "close to" the lattice from a set of points that are "far away from" the lattice. We construct these points from an LWE instance and a random (A,t) respectively. Here we have an LWE sample (left) and a sample from the random distribution (right):

What this shows is that the points of the LWE instance are much closer to the lattice than the random instance. This is indeed the case on average. However, while distinguishing LWE instances from random is easy in two dimensions, it gets harder in higher dimensions.

Let's take a look at how we construct these points. First, let's take an LWE instance (A,t=(A*s + e) % Q) and consider the lattice recognized by the matrix P we get by concatenating A with the identity matrix I. This might look something like this (N=3):

A = gen_mat(N, N, 0, Q)
P = np.concatenate((A, np.identity(N, dtype=int)), axis=1)
print(P)
# [[1570  634  161	1	0	0]
#  [1522 1215  861	0	1	0]
#  [ 344 2651 1889	0	0	1]]

Notice that we can compute t by multiplying P by the vector we get by concatenating s and e (beta=2):

s = gen_mat(N, 1, -beta, beta+1)
e = gen_mat(N, 1, -beta, beta+1)
t = (A*s + e) % Q
z = np.concatenate((s, e))
print(z)
# [[-2]
#  [ 0]
#  [-2]
#  [ 0]
#  [-1]
#  [ 2]]
assert np.array_equal(t, P*z % Q)

Let z denote this vector and consider the set of points v for which P*v == t. By definition, we say this set of points is "close to" the lattice because z is a short vector. (Remember: by "short" we mean its entries are bounded around 0 by beta.)

Now consider a random (A,t) and consider the set of points v for which P*v == t. We won't prove it, but it is a fact that this set of points is likely to be "far away from" the lattice in the sense that there is no short vector z for which P*z == t.

Intuitively, solving LWE gets harder as z gets longer. Indeed, increasing the average length of z (by making beta larger) increases the average distance to the lattice, making it look more like a random instance:

On the other hand, making z too long creates another problem.

Breaking lattice cryptography by finding short vectors

Given a random matrix A, the Short Integer Solution (SIS) problem is to find short vectors (i.e., whose entries are bounded by beta) z1 and z2 for which (A*z1 + z2) % Q is zero. Notice that this is equivalent to finding a short vector z in the lattice recognized by P:

z = np.concatenate((z1, z2))
assert np.array_equal((A*z1 + z2) % Q, P*z % Q)

If we had a (quantum) computer program for solving SIS, then we could use this program to solve LWE as well: if (A,t) is an LWE instance, then z1.T*t will be small; otherwise, if (A,t) is random, then z1.T*t will be uniformly random. (You can convince yourself of this using a little algebra.) Therefore, in order for our encryption scheme to be secure, it must be hard to find short vectors in the lattice defined by those parameters.

Intuitively, finding long vectors in the lattice is easier than finding short ones, which means that solving the SIS problem gets easier as beta gets closer to Q. On the other hand, as beta gets closer to 0, it gets easier to distinguish LWE instances from random!

This suggests a kind of Goldilocks zone for LWE-based encryption: if the secret and noise vectors are too short, then LWE is easy; but if the secret and noise vectors are too long, then SIS is easy. The optimal choice is somewhere in the middle.

Enough math, just give me my parameters!

To tune our encryption scheme, we want to choose parameters for which the most efficient known algorithms (quantum or classical) for solving LWE are out of reach for any attacker with as many resources as we can imagine (and then some, in case new algorithms are discovered). But how do we know which attacks to look out for?

Fortunately, the community of expert lattice cryptographers and cryptanalysts maintains a tool called lattice-estimator that estimates the complexity of the best known (quantum) algorithms for lattice problems relevant to cryptography. Here's what we get when we run this tool for ML-KEM (this requires Sage to run):

sage: from estimator import *
sage: res = LWE.estimate.rough(schemes.Kyber768)
usvp        :: rop: ≈2^182.2, red: ≈2^182.2, δ: 1.002902, β: 624, d: 1427, tag: usvp
dual_hybrid :: rop: ≈2^174.3, red: ≈2^174.3, guess: ≈2^162.5, β: 597, p: 4, ζ: 10, t: 60, β': 597, N: ≈2^122.7, m: 768

The number that we're most interested in is "rop", which estimates the amount of computation the attack would consume. Playing with this tool a bit, we eventually find some parameters for our scheme for which the "usvp" and "dual_hybrid" attacks have comparable complexity. However, lattice-estimator identifies an attack it calls "arora-gb" that applies to our scheme, but not to ML-KEM, that has much lower complexity. (N=600, Q=3329, and beta=4):

sage: res = LWE.estimate.rough(LWE.Parameters(n=600, q=3329, Xs=ND.Uniform(-4,4), Xe=ND.Uniform(-4,4)))
usvp        :: rop: ≈2^180.2, red: ≈2^180.2, δ: 1.002926, β: 617, d: 1246, tag: usvp
dual_hybrid :: rop: ≈2^226.2, red: ≈2^225.4, guess: ≈2^224.9, β: 599, p: 3, ζ: 10, t: 0, β': 599, N: ≈2^174.8, m: 600
arora-gb    :: rop: ≈2^129.4, dreg: 9, mem: ≈2^129.4, t: 4, m: ≈2^64.7

We'd have to bump the parameters even further to the scheme to a regime that has comparable security to ML-KEM.

Finally, a word of warning: when designing lattice cryptography, determining whether our scheme is secure requires a lot more than estimating the cost of generic attacks on our LWE parameters. In the absence of a mathematical proof of security in a realistic adversarial model, we can't rule out other ways of breaking our scheme. Tread lightly, fair traveler, and bring a friend along for the journey.

Making the scheme efficient

Now that we understand how to encrypt with LWE, let's take a quick look at how to make our scheme efficient.

The main problem with our scheme is that we can only encrypt a bit at a time. This is because we had to split the range(Q) into two chunks, one that encodes 1 and another that encodes 0. We could improve the bit rate by splitting the range into more chunks, but this would make decryption errors more likely.

Another problem with our scheme is that the runtime depends heavily on our security parameters. Encryption requires O(N²) multiplications (multiplication is the most expensive part of a secure implementation of modular arithmetic), and in order for our scheme to be secure, we need to make N quite large.

ML-KEM solves both of these problems by replacing modular arithmetic with arithmetic over a polynomial ring. This means the entries of our matrices will be polynomials rather than integers. We need to define what it means to add, subtract, and multiply polynomials, but once we've done that, everything else about the encryption scheme is the same.

In fact, you probably learned polynomial arithmetic in grade school. The only thing you might not be familiar with is polynomial modular reduction. To multiply two polynomials $f(X)$ and $g(X)$, we start by multiplying $f(X)\cdot g(X)$ as usual. Then we're going to divide $f(X)\cdot g(X)$ by some special polynomial — ML-KEM uses $X^{256}+1$ — and take the remainder. We won't try to explain this algorithm, but the takeaway is that the result is a polynomial with $256$ coefficients, each of which is an integer in range(Q).

The main advantage of using a polynomial ring for arithmetic is that we can pack more bits into the ciphertext. Our formula for the ciphertext is exactly the same (u=r.T*A + e2, v=r.T*t + e3 + m), but this time the plaintext m encodes a polynomial. Each coefficient of the polynomial encodes a bit, and we'll handle decryption errors just as we did before, by splitting range(Q) into two chunks, one that encodes 1 and another that encodes 0. This allows us to reliably encrypt 256 bits (32 bytes) per ciphertext.

Another advantage of using polynomials is that it significantly reduces the dimension of the matrix without impacting security. Concretely, the most widely used variant of ML-KEM, ML-KEM-768, uses a 3-by-3 matrix A, so just 9 polynomials in total. (Note that $256 \cdot 3 = 768$, hence the name "ML-KEM-768".) However, note that we have to be careful in how we choose the modulus: $X^{256}+1$ is special in that it does not exhibit any algebraic structure that is known to permit attacks.

The choices of Q=3329 for the coefficient modulus and $X^{256}+1$ for the polynomial modulus have one more benefit. They allow polynomial multiplication to be carried out using the NTT algorithm, which massively reduces the number of multiplications and additions we have to perform. In fact, this optimization is a major reason why ML-KEM is sometimes faster in terms of CPU time than key exchange with elliptic curves.

We won't get into how NTT works here, except to say that the algorithm will look familiar to you if you've ever implemented RSA. In both cases we use the Chinese Remainder Theorem to split multiplication up into multiple, cheaper multiplications with smaller moduli.

From public key encryption to ML-KEM

The last step to build ML-KEM is to make the scheme secure against chosen ciphertext attacks (CCA). Currently, it's only secure against chosen plaintext attacks (CPA), which basically means that the ciphertext leaks no information about the plaintext, regardless of the distribution of plaintexts. CCA security is stronger in that it gives the attacker access to decryptions of ciphertexts of its choosing. (Of course, it's not allowed to decrypt the target ciphertext itself.) The specific transform used in ML-KEM results in a CCA-secure KEM ("Key-Encapsulation Mechanism").

Chosen ciphertext attacks might seem a bit abstract, but in fact they formalize a realistic threat model for many applications of KEMs (and public key encryption for that matter). For example, suppose we use the scheme in a protocol in which the server authenticates itself to a client by proving it was able to decrypt a ciphertext generated by the client. In this kind of protocol, the server acts as a sort of "decryption oracle" in which its responses to clients depend on the secret key. Unless the scheme is CCA secure, this oracle can be abused by an attacker to leak information about the secret key over time, allowing it to eventually impersonate the server.

ML-KEM incorporates several more optimizations to make it as fast and as compact as possible. For example, instead of generating a random matrix A, we can derive it from a random, 32-byte string (called a "seed") using a hash-based primitive called a XOF ("eXtendable Output Function"), in the case of ML-KEM this XOF is SHAKE128. This significantly reduces the size of the public key.

Another interesting optimization is that the polynomial coefficients (integers in range(Q)) in the ciphertext are compressed by rounding off the least significant bits of each coefficient, thereby reducing the overall size of the ciphertext.

All told, for the most widely deployed parameters (ML-KEM-768), the public key is 1184 bytes and the ciphertext is 1088 bytes. There's no obvious way to reduce this, except by reducing the size of the encapsulated key or the size of the public matrix A. The former would make ML-KEM useful for fewer applications, and the latter would reduce the security margin.

Note that there are other lattice schemes that are smaller, but they are based on different hardness assumptions and are still undergoing analysis.

Authentication

In the previous section, we learned about ML-KEM, the algorithm already in use to make encryption PQ-secure. However, encryption is only one piece of the puzzle: establishing a secure connection also requires authenticating the server — and sometimes the client, depending on the application.

Authentication is usually provided by a digital signature scheme, which uses a secret key to sign a message and a public key to verify the signature. The signature schemes used today aren't PQ-secure: a quantum computer can be used to compute the secret key corresponding to a server's public key, then use this key to impersonate the server.

While this threat is less urgent than the threat to encryption, mitigating it is going to be more complicated. Over the years, we've bolted a number of signatures onto the TLS handshake in order to meet the evolving requirements of the web PKI. We have PQ alternatives for these signatures, one of which we'll study in this section, but so far these signatures and their public keys are too large (i.e., take up too many bytes) to make comfortable replacements for today's schemes. Barring some breakthrough in NIST's ongoing standardization effort, we will have to re-engineer TLS and the web PKI to use fewer signatures.

For now, let's dive into the PQ signature scheme we're likely to see deployed first: ML-DSA, a.k.a. Dillithium. The design of ML-DSA follows a similar template as ML-KEM. We start by building some intermediate primitive, then we transform that primitive into the primitive we want, in this case a signature scheme.

ML-DSA is quite a bit more involved than ML-KEM, so we're going to try to boil it down even further and just try to get across the main ideas.

Warm up

Whereas ML-KEM is basically El Gamal encryption with elliptic curves replaced with lattices, ML-DSA is basically the Schnorr identification protocol with elliptic curves replaced with lattices. Schnorr's protocol is used by a prover to convince a verifier that it knows the secret key associated with its public key without revealing the secret key itself. The protocol has three moves and is executed with four algorithms:

initialize(): The prover initializes the protocol and sends a commitment to the verifier
challenge(): The verifier receives the commitment and sends the prover a challenge
finish(): The prover receives the challenge and sends the verifier the proof
verify(): Finally, the verifier uses the proof to decide whether the prover knows the secret key

We get the high-level structure of ML-DSA by making this protocol non-interactive. In particular, the prover derives the challenge itself by hashing the commitment together with the message to be signed. The signature consists of the commitment and proof: to verify the signature, the verifier recomputes the challenge from the commitment and message and runs verify()as usual.

Let's jump right in to building Schnorr's identification protocol from lattices. If you've never seen this protocol before, then this will look a little like black magic at first. We'll go through it slowly enough to see how and why it works.

Just like for ML-KEM, our public key is an LWE instance (A,t=A*s1 + s2). However, this time our secret key is the pair of short vectors (s1,s2), i.e., it includes the error term. Otherwise, key generation is exactly the same:

def key_gen():
    A = gen_mat(N, N, 0, Q)
    s1 = gen_mat(N, 1, -beta, beta+1)
    s2 = gen_mat(N, 1, -beta, beta+1)
    t = (A*s1 + s2) % Q
    return ((A, t), (s1, s2))

To initialize the protocol, the prover generates another LWE instance (A,w=A*y1 + y2). You'll see why in just a moment. The prover sends the hash of w as its commitment:

def initialize(A):
    y1 = gen_mat(N, 1, -beta, beta+1)
    y2 = gen_mat(N, 1, -beta, beta+1)
    w = (A*y1 + y2) % Q
    return (H(w), (y1, y2))

Here H is some cryptographic hash function, like SHA-3. The prover stores the secret vectors (y1,y2) for use in its next move.

Now it's time for the verifier's challenge. The challenge is just an integer, but we need to be careful about how we choose it. For now let's just pick it at random:

def challenge():
    return random.randrange(0, Q)

Remember: when we turn this protocol into a digital signature, the challenge is derived from the commitment, H(w), and the message. The range of this hash function must be the same as the set of outputs of challenge().

Now comes the fun part. The proof is a pair of vectors (z1,z2) satisfying A*z1 + z2 == c*t + w. We can easily produce this proof if we know the secret key:

z1 = (c*s1 + y1) % Q

z2 = (c*s2 + y2) % Q

Then A*z1 + z2 == A*(c*s1 + y1) + (c*s2 + y2) == c*(A*s1 + s2) + (A*y1 + y2) == c*t + w. Our goal is to design the protocol such that it's hard to come up with (z1,z2) without knowing (s1,s2), even after observing many executions of the protocol.

Here are the finish() and verify() algorithms for completeness:

def finish(s1, s2, y1, y2, c):
    z1 = (c*s1 + y1) % Q
    z2 = (c*s2 + y2) % Q
    return (z1, z2)

def verify(A, t, hw, c, z1, z2):
	return H((A*z1 + z2 - c*t) % Q) == hw

# Test
((A, t), (s1, s2)) = key_gen()
(hw, (y1, y2)) = initialize(A)        # hw: prover -> verifier
c = challenge()                       # c: verifier -> prover
(z1, z2) = finish(s1, s2, y1, y2, c)  # (z1, z2): prover -> verifier
assert verify(A, t, hw, c, z1, z2)    # verifier

Notice that the verifier doesn't actually check A*z1 + z2 == c*t + w directly; we have to rearrange the equation so that we can set the commitment to H(w) rather than w. We'll explain the need for hashing in the next section.

Making this scheme secure

The question of whether this protocol is secure boils down to whether it's possible to impersonate the prover without knowledge of the secret key. Let's put our attacker hat on and poke around.

Perhaps there's a way to compute the secret key, either from the public key directly or by eavesdropping on executions of the protocol with the honest prover. If LWE is hard, then clearly there's no way we're going to extract the secret key from the public key t. Likewise, the commitment H(w)doesn't leak any information that would help us extract the secret key from the proof (z1,z2).

Let's take a closer look at the proof. Notice that the vectors (y1,y2) "mask" the secret key vectors, sort of how the shared secret masks the plaintext in ML-KEM. However, there's one big exception: we also scale the secret key vectors by the challenge c.

What's the effect of scaling these vectors? If we squint at a few proofs, we start to see a pattern emerge. Let's look at z1 first (N=3, Q=3329, beta=4):

((A, t), (s1, s2)) = key_gen()
print('s1={}'.format(s1.T % Q))
for _ in range(10):
    (w, (y1, y2)) = initialize(A)
    c = challenge()
    (z1, z2) = finish(s1, s2, y1, y2, c)
    print('c={}, z1={}'.format(c, z1.T))
# s1=[[   1	0 3326]]
# c=1123, z1=[[1121 3327 3287]]
# c=1064, z1=[[1060	4  137]]
# c=1885, z1=[[1884 3327  999]]
# c=269, z1=[[ 270 3325 2524]]
# c=1506, z1=[[1510 3325 2141]]
# c=3147, z1=[[3149	4  547]]
# c=703, z1=[[ 700	4 1219]]
# c=1518, z1=[[1518 3327 2104]]
# c=1726, z1=[[1726	0 1478]]
# c=2591, z1=[[2589	4 2217]]

Indeed, with enough proof samples, we should be able to make a pretty good guess of the value of s1. In fact, for these parameters, there is a simple statistical analysis we can do to compute s1 exactly. (Hint: Q is a prime number, which means c*pow(c,-1,Q)==1 whenever c>0.) We can also apply this analysis to s2, or compute it directly from t, s1, and A.

The main flaw in our protocol is that, although our secret vectors are short, scaling them makes them so long that they're not completely masked by (y1,y2). Since c spans the entire range(Q), so do the entries of c*s1. and c*s2, which means in order to mask these entries, we need the entries of (y1,y2) to span range(Q) as well. However, doing this would make solving LWE for (A,w) easy, by solving SIS. We somehow need to strike a balance between the length of the vectors of our LWE instances and the leakage induced by the challenge.

Here's where things get tricky. Let's refer to the set of possible outputs of challenge() as the challenge space. We need the challenge space to be fairly large, large enough that the probability of outputting the same challenge twice is negligible.

Why would such a collision be a problem? It's a little easier to see in the context of digital signatures. Let's say an attacker knows a valid signature for a message m. The signature includes the commitment H(m), so the attacker also knows the challenge is c == H(H(w),m). Suppose it manages to find a different message m^* for which c == H(H(w),m^*). Then the signature is also valid for m! And this attack is easy to pull off if the challenge space, that is, the set of possible outputs of H, is too small.

Unfortunately, we can't make the challenge space larger simply by increasing the size of the modulus Q: the larger the challenge might be, the more information we'd leak about the secret key. We need a new idea.

The best of both worlds

Remember that the hardness of LWE depends on the ratio between beta and Q. This means that y1 and y2 don't need to be short in absolute terms, but short relative to random vectors.

With that in mind, consider the following idea. Let's take a larger modulus, say Q=2**31 - 1, and we'll continue to sample from the same challenge space, range(2**16).

First, notice that z1 is now "relatively" short, since its entries are now in range(-gamma, gamma+1), where gamma = beta*(2**16-1), rather than uniform over range(Q). Let's also modify initialize() to sample the entries of (y1,y2) from the same range and see what happens:

def initialize(A):
	y1 = gen_mat(N, 1, -gamma, gamma+1)
	y2 = gen_mat(N, 1, -gamma, gamma+1)
	w = (A*y1 + y2) % Q
	return (H(w), (y1, y2))

((A, t), (s1, s2)) = key_gen()
print('s1={}'.format(s1.T % Q))
for _ in range(10):
    (w, (y1, y2)) = initialize(A)
    c = challenge()
    (z1, z2) = finish(s1, s2, y1, y2, c)
    print('c={}, z1={}'.format(c, z1.T))
# s1=[[3 0 1]]
# c=31476, z1=[[175933 141954  93186]]
# c=27360, z1=[[    136404 2147438807     283758]]
# c=33536, z1=[[2147430945 2147377022     190671]]
# c=23283, z1=[[186516  73400   4955]]
# c=24756, z1=[[    328377 2147438906 2147388768]]
# c=12428, z1=[[2147340715     188675      90282]]
# c=24266, z1=[[    175498 2147261581 2147301553]]
# c=45331, z1=[[357595 185269 177155]]
# c=45641, z1=[[     21592 2147249191 2147446200]]
# c=57893, z1=[[297750 113335 144894]]

This is definitely going in the right direction, since there are no obvious correlations between z1 and s1. (Likewise for z2 and s2.) However, we're not quite there.

One problem is that the challenge space is still quite small. With only 2**16 challenges to choose from, we're likely to see a collision even after only a handful of protocol executions. We need the challenge space to be much, much larger, say around 2**256. But then Q has to be an insanely large number in order for the beta to Q ratio to be secure.

ML-DSA is able to side step this problem due to its use of arithmetic over polynomial rings. It uses the same modulus polynomial as ML-KEM, so the challenge is a polynomial with 256 coefficients. The coefficients are chosen carefully so that the challenge space is large, but multiplication by the challenge scales the secret vector by a small amount. Note that we still end up using a slightly larger modulus (Q=8380417) for ML-DSA than for ML-KEM, but only by about twelve bits.

However, there is a more fundamental problem here, which is that we haven't completely ruled out that signatures may leak information about the secret key.

Cause and effect

Suppose we run the protocol a number of times, and in each run, we happen to choose a relatively small value for some entry of y1. After enough runs, this would eventually allow us to reconstruct the corresponding entry of s1. To rule this out as a possibility, we need to make y1 even longer. (Likewise for y2.) But how long?

Suppose we know that the entries of z1 and z2 are always in range(-beta_loose,beta_loose+1) for some beta_loose > beta. Then we can simulate an honest run of the protocol as follows:

def simulate(A, t):
    z1 = gen_mat(N, 1, -beta_loose, beta_loose+1)
    z2 = gen_mat(N, 1, -beta_loose, beta_loose+1)
    c = challenge()
    w = (A*z1 + z2 - c*t) % Q
    return (H(w), c, (z1, z2))

# Test
((A, t), (s1, s2)) = key_gen()
(hw, c, (z1, z2)) = simulate(A, t)
assert verify(A, t, hw, c, z1, z2)

This procedure perfectly simulates honest runs of the protocol, in the sense that the output of simulate() is indistinguishable from the transcript of a real run of the protocol with the honest prover. To see this, notice that the w, c, z1, and z2 all have the same mathematical relationship (the verification equation still holds) and have the same distribution.

And here's the punch line: since this procedure doesn't use the secret key, it follows that the attacker learns nothing from eavesdropping on the honest prover that it can't compute from the public key itself. Pretty neat!

What's left to do is arrange for z1 and z2 to fall in this range. First, we modify initialize() by increasing the range of y1 and y2 by beta_loose:

def initialize(A):
    y1 = gen_mat(N, 1, -gamma+beta_loose, gamma+beta_loose+1)
    y2 = gen_mat(N, 1, -gamma+beta_loose, gamma+beta_loose+1)
    w = (A*y1 + y2) % Q
    return (H(w), (y1, y2))

This ensures the proof vectors z1 and z2 are roughly uniform over range(-beta_loose, beta_loose+1). However, they may fall slightly outside of this range, so need to modify finalize() to abort if not. Correspondingly, verify() should reject proof vectors that are out of range:

def finish(s1, s2, y1, y2, c):
    z1 = (c*s1 + y1) % Q
    z2 = (c*s2 + y2) % Q
    if not in_range(z1, beta_loose) or not in_range(z2, beta_loose):
        return (None, None)
    return (z1, z2)

def verify(A, t, hw, c, z1, z2):
    if not in_range(z1, beta_loose) or not in_range(z2, beta_loose):
        return False
    return H((A*z1 + z2 - c*t) % Q) == hw

If finish() returns (None,None), then the prover and verifier are meant to abort the protocol and retry until the protocol succeeds:

((A, t), (s1, s2)) = key_gen()
while True:
    (hw, (y1, y2)) = initialize(A)        # hw: prover -> verifier
    c = challenge()                       # c: verifier -> prover
    (z1, z2) = finish(s1, s2, y1, y2, c)  # (z1, z2): prover -> verifier
    if z1 is not None and z2 is not None:
        break
assert verify(A, t, hw, c, z1, z2)

Interestingly, we should expect aborts to be quite common. The parameters of ML-DSA are tuned so that the protocol runs five times on average before it succeeds.

Another interesting point is that the security proof requires us to simulate not only successful protocol runs, but aborted protocol runs as well. More specifically, the protocol simulator must abort with the same probability as the real protocol, which implies that the rejection probability is independent of the secret key.

The simulator also needs to be able to produce realistic looking commitments for aborted transcripts. This is exactly why the prover commits to the hash of w rather than w itself: in the security proof, we can easily simulate hashes of random inputs.

Making this scheme efficient

ML-DSA benefits from many of the same optimizations as ML-KEM, including using polynomial rings, NTT for polynomial multiplication, and encoding polynomials with a fixed number of bits. However, ML-DSA has a few more tricks to make things smaller.

First, in ML-DSA, instead of the pair of short vectors z1 and z2, the proof consists of a single vector z=c*s1 + y, where y was committed to in the previous step. In turn, we only end up with a single proof vector z rather than two as before. Getting this to work requires a special encoding of the commitment so that we can't compute y from it. ML-DSA uses a related trick to reduce the size of the t vector of the public key, but the details are more complicated.

For the parameters we expect to deploy first (ML-DSA-44), the public key is 1312 bytes long and the signature is a whopping 2420 bytes. In contrast to ML-KEM, it is possible to shave off some more bytes. This does not come for free and requires complicating the scheme. An example is HAETAE, which changes the distributions used. Falcon takes it a step further with even smaller signatures, using a completely different approach, which although elegant is also more complex to implement.

Wrap up

Lattice cryptography underpins the first generation of PQ algorithms to get widely deployed on the Internet. ML-KEM is already widely used today to protect encryption from quantum computers, and in the coming years we expect to see ML-DSA deployed to get ahead of the threat of quantum computers to authentication.

Lattices are also the basis of a new frontier for cryptography: computing on encrypted data.

Suppose you wanted to aggregate some metrics submitted by clients without learning the metrics themselves. With LWE-based encryption, you can arrange for each client to encrypt their metrics before submission, aggregate the ciphertexts, then decrypt to get the aggregate.

Suppose instead that a server has a database that it wants to provide clients access to without revealing to the server which rows of the database the client wants to query. LWE-based encryption allows the database to be encoded in a manner that permits encrypted queries.

These applications are special cases of a paradigm known as FHE ("Fully Homomorphic Encryption"), which allows for arbitrary computations on encrypted data. FHE is an extremely powerful primitive, and the only way we know how to build it today is with lattices. However, for most applications, FHE is far less practical than a special-purpose protocol would be (lattice-based or not). Still, over the years we've seen FHE get better and better, and for many applications it is already a decent option. Perhaps we'll dig into this and other lattice schemes in a future blog post.

We hope you enjoyed this whirlwind tour of lattices. Thanks for reading!

An early look at cryptographic watermarks for AI-generated content

Teresa Brooks-Mejia — Wed, 19 Mar 2025 13:00:00 GMT

Generative AI is reshaping many aspects of our lives, from how we work and learn, to how we play and interact. Given that it's Security Week, it's a good time to think about some of the unintended consequences of this information revolution and the role that we play in bringing them about.

Today's web is full of AI-generated content: text, code, images, audio, and video can all be generated by machines, normally based on a prompt from a human. Some models have become so sophisticated that distinguishing their artifacts — that is, the text, audio, and video they generate — from everything else can be quite difficult, even for machines themselves. This difficulty creates a number of challenges. On the one hand, those who train and deploy generative AI need to be able to identify AI-created artifacts they scrape from websites in order to avoid polluting their training data. On the other hand, the origin of these artifacts may be intentionally misrepresented, creating myriad problems for society writ large.

Part of the solution to this problem might be watermarking. The basic idea of watermarking is to modify the training process, the inference process, or both so that an artifact of the model embeds some identifying information of the model from which it originates. This way a model operator, or potentially the consumer of the content themselves, can determine whether some artifact came from the model by checking for the presence of the watermark.

Watermarking shares many of the same goals as the C2PA initiative. C2PA seeks to add provenance of media from a variety of sources, not just AI. Think of it as a chain of digital signatures, where each link in the chain corresponds to some modification of the artifact. For example, if you're a Cloudflare customer using Images to serve C2PA-tagged content, you can opt in to preserve the provenance by extending this signature chain, even after the image is compressed on our network.

The challenge of this approach is that it requires participation by each entity in the chain of custody of the artifact. Watermarking has the potential to make C2PA more robust by preserving the origin of the artifact even after unattributed modification. Whereas the C2PA signature is encoded in an image’s metadata, a watermark is embedded in the pixels of the image itself.

In this post, we're going to take a look at an emerging paradigm for AI watermarking. Based on cryptography, these new watermarks aim to provide rigorous, mathematical guarantees of quality preservation and robustness to modification of the content. This field is, as of 2025, only a couple of years old, and we don't yet know if it will yield schemes that are practical to deploy. Nevertheless, we believe this is a promising area of research, and we hope this post inspires someone to come up with the next big idea.

The case for cryptography

It's often said that cryptography is necessary but not sufficient for security. In other words, cryptographers make certain assumptions about the state of the system, like that a key is kept secret from the attacker, or that some computational puzzle is hard to solve. When these assumptions hold, a good cryptosystem provides a mathematical proof of security against the class of attacks it is designed to prevent.

Artifact watermarking usually has three security goals:

Robustness: Users of the model should not be able to easily misrepresent the origin of its artifacts. The watermark should be verifiable even after the artifact is modified to some extent.
Undetectability: Watermarking should have negligible impact on the quality of the model output. In particular, watermarked artifacts should be indistinguishable from non-watermarked artifacts of the same model.
Unforgeability: It should be impossible for anyone but the model operator to produce watermarked artifacts. No one should be able to convince the model operator that an artifact was generated by the model when it wasn't.

Today's state-of-the-art watermarks, including Google's SynthID and Meta's Video Seal, are often based on deep learning. These schemes involve training a machine learning model, typically one with an encoder–decoder architecture where the encoder encodes a signature into an artifact and the decoder decodes the signature:

^{Figure 1: Illustration of the process of training an encoder-decoder watermarking model.}

The training process involves subjecting the watermarked artifact to a series of known attacks. The more attacks the model thwarts, the higher the model quality. For example, the trainer would alter the artifact in various ways and run the decoder on the outputs: the model scores high if the decoder manages to correctly output the signature most of the time.

This idea is quite beautiful. It's like a scaled up version of penetration testing, an essential practice of security engineering whereby the system is subjected to a suite of known attacks until all known vulnerabilities are patched. Of course, there will always be new attack variants or new attack strategies that the model was not trained on and that may evade the model.

And so ensues the proverbial game of cat-and-mouse that consumes so much time in security engineering. Coping with new attacks on robustness, undetectability, or unforgeability of deep-learning based watermarks requires continual intelligence gathering and re-training of deployed models to keep up with attackers.

The promise of cryptography is that it helps us break out of these kinds of cat-and-mouse games. Cryptography reduces the attack surface by focusing the attacker's attention on breaking some narrow aspect of the system that is easier to reason about. This might be gaining access to some secret key, or solving some (seemingly unrelated) computational puzzle that is believed to be impossible to solve.

Pseudorandom codes

To the best of our knowledge, the first cryptographic AI watermark was proposed by Scott Aaronson in the summer of 2022 while he was working at OpenAI. Tailored specifically for chatbots, Aaronson's simple scheme was both undetectable and unforgeable. However, it was susceptible to some simple attacks on robustness.

^{Figure 2: The "emoji attack": Ask the chatbot to embed a simple pattern in its response, then remove the pattern manually. This is sufficient to remove some cryptographic watermarks from the model output.}

In the year or so that followed, other cryptographic watermarks were proposed, all making different trade-offs between detectability and robustness. Two years later, in a paper that appeared at CRYPTO 2024, Miranda Christ and Sam Gunn articulated a new framework for watermarks that, if properly instantiated, would provide all three properties simultaneously.

Along with the prompt provided by the user, generative AI models typically take as input some randomness generated by the model operator. For many such models, it is often possible to run the model "in reverse" such that, given an artifact of the model, one can recover an approximation of the initial randomness used to generate it. We'll see why this is important in a moment.

The starting point for Christ-Gunn-2024 is a mathematical tool called an error correcting code. These codes are normally used to transmit messages over noisy channels and are found in just about every system we rely on, including fiber optics, satellites, the data bus on your motherboard, and even quantum computers.

To transmit a message m, one first encodes it into a codeword c=encode(m). Then the receiver attempts to decode the codeword as decode(c). Error correcting codes are designed to tolerate some fraction of the codeword bits being flipped: if too many bits are flipped, then decoding will fail.

Now, ignoring undetectability and unforgeability for a moment, we can use an error correcting code to make a robust watermark as follows:

Generate the initial randomness.
Embed a codeword c=encode(m) into the randomness in some manner, by overwriting bits of randomness with bits of c. The message m can be whatever we want, for example a short string identifying the version of our model.
Run the model with the modified randomness.

To verify the watermark, we:

Run the model "in reverse" on the artifact, obtaining an approximation of the initial randomness.
Extract the codeword c^* from the randomness.
If decode(c^*) succeeds, then the watermark is present.

Why does this work? Since c is a codeword, we can verify the watermark even if our approximation of the initial randomness isn't perfect. Some of the bits will be flipped, but the error correcting property of the code allows us to compensate for this. In fact, this is also what makes the watermark robust, since we can also tolerate bit flips caused by an attacker munging the artifact. Of course, the better our approximation of the initial randomness, the more robust our watermark will be, since we'll be able to correct for more bit flips.

To see why this watermark is detectable, notice that overwriting bits of the randomness with a fixed codeword (c=encode(m)) biases the randomness and thereby the output of the model. Thus, the distribution of watermarked artifacts will be slightly different from unwatermarked artifacts, perhaps even noticeably so. This watermark is also forgeable, since the encoding algorithm is public and can be run by anyone.

The challenge then is to design error-correcting codes for which codewords look random, and generating codewords requires knowledge of a secret key held by the model operator. Christ-Gunn-2024 names these pseudorandom error-correcting codes, or simply pseudorandom codes.

A pseudorandom code consists of three algorithms:

k = key_gen(): the key generation algorithm. Let's call k the watermarking key.
c = encode(k,m): the encoding algorithm takes in a message m and outputs a codeword c.
m = decode(k,c): the decoding algorithm takes in a codeword c and outputs the underlying message m, or an indication that decoding failed.

The term "pseudorandom" refers to the fact that codewords aren't technically random bit strings. Intuitively, an attacker can distinguish a codeword from a random string if it manages to guess the watermarking key. Thus, our goal is to choose parameters for the code such that distinguishing codewords from random — for example, by guessing the watermarking key — is hard for any computationally bounded attacker.

To use a pseudorandom code for watermarking, the operator first generates a watermarking key k. Then each time it gets a prompt from a user, it generates the initial randomness, embeds c=encode(k,m) into the initial randomness, and runs the model. To verify the watermark, the operator runs the model in reverse to get the inverted randomness, extracts the inverted codeword, c^*, and computes decode(k,c^*): if decoding succeeds, then the watermark is present.

In order for this watermark to be undetectable, we need to pick an embedding of the codeword that doesn't change the distribution of the initial randomness. The details of this embedding depend on the model. Let's take a look at Stable Diffusion as an example.

A watermark for Stable Diffusion

Stable Diffusion is a model used for image generation that takes as input a tensor of normally distributed floating point numbers called a latent. The model uses the user's prompt to "denoise" the latent tensor over a number of iterations, then converts the final version of the latent to an image.

Approximating the initial latent

Diffusion inversion is an iterative process that returns an exact or approximate initial latent by reversing the sampling process that generated an image. Inversion for text to image diffusion models is a relatively new area of research. A common application of diffusion inversion is editing images by text prompts.

Denoising Diffusion Implicit Models (DDIMs) are iterative, implicit probabilistic models that can generate high quality images using a faster sampling process than other approaches, as it only requires a relatively small number of timesteps to generate a sample. This makes DDIM Inversion a popular inversion technique because it is computationally fast, as it requires only a few timesteps to return an approximate initial latent of a generated image. Despite its popularity, it has some known limitations and can be problematic to use for tasks where exact image reproduction is required. These limitations have led researchers to explore techniques that produce exact initial latents. However, since watermarks based on pseudorandom codes can tolerate errors, it's worth investigating whether DDIM Inversion suffices for our purposes.

Before we can generate an approximate initial latent, we need a generated image. To do this we use a pretrained Stable Diffusion model that uses a DDIM scheduler. The scheduler performs the “denoising” process that generates an image from a random noise seed (initial latent). By default, the pipeline computes random latents; when embedding a watermark we will generate this latent ourselves as described in the next section. The Stable Diffusion pipelines in the code snippets below sets the number of inference steps to 50. This parameter controls the number of steps the denoising process takes. 50 steps provides a nice balance between speed and image quality.

from stable_diffusion.utils import build_stable_diffusion_pipeline
from stable_diffusion.schedulers import ddim_scheduler

# Instantiate Stable Diffusion pipeline
model_cache_path = './model_cache'
model = 'stabilityai/stable-diffusion-2-1-base'
scheduler, _ = ddim_scheduler()
pipe, device = build_stable_diffusion_pipeline(model, model_cache_path, scheduler)

# Generate image
prompt = 'grainy photo of a UFO at night'
image, _ = pipe(prompt, num_inference_steps=50, return_dict=False)

To compute the approximate initial latent for the image we generated, we run the sampling process backwards. We could include the prompt, but in the case of verifying a watermark, we will usually not know the initial prompt, so we instead just set it to the empty string:

from PIL import Image
from stable_diffusion.utils import build_stable_diffusion_pipeline,   
                                   convert_pil_to_latents
from stable_diffusion.schedulers import ddim_inverse_scheduler

# Load image
img = Image.open(image_path)

# Instantiate Stable Diffusion pipeline with DDIM Inverse scheduler
model_cache_path = './model_cache'
model = 'stabilityai/stable-diffusion-2-1-base'
scheduler, _= ddim_inverse_scheduler()
pipe, _ = build_stable_diffusion_pipeline(model, model_cache_path, scheduler)

# Convert the input image to latent space
image_latent = convert_pil_to_latents(pipe, img)

# Invert the sampling process that generated the image with an empty prompt
inverted_latent, _ = pipe('', output_type='latent', latents=image_latent,  
                          num_inference_steps=50, return_dict=False)

Embedding the code

The initial latent used for stable diffusion consists of a bunch of floating point numbers, each independently and normally distributed with a mean of zero.

The following observation is from a recent evaluation of the watermark of Christ and Gunn for stable diffusion. (There they used a more sophisticated but expensive inversion method than DDIM.) Observe that the probability that each number is negative is equal to the probability that the number is positive. Likewise, if the code is indeed pseudorandom, then each bit of the codeword is computationally indistinguishable from a bit that is one with probability ½ and zero with probability ½.

To embed the codeword in the latent, we just set the sign of each number according to the corresponding bit of the codeword:

from stable_diffusion.utils import build_stable_diffusion_pipeline
from stable_diffusion.schedulers import ddim_scheduler
import numpy as np
import torch

# Generate a normally distributed latent. For the default image
# size of 512x512 pixels, the latent shape is `[1, 4, 64, 64]`.
initial_latent = np.abs(np.random.randn(*LATENTS_SHAPE))

with np.nditer(initial_latent, op_flags=['readwrite']) as it:
    	codeword = encode(k, m)
    	for (i, x) in enumerate(it):
             # `codeword[i]` is a `bool` representing the `i`-th bit of
             # the codeword.
             x *= 1 if codeword[i] else -1

watermarked_latent = torch.from_numpy(initial_latent).to(dtype=torch.float32)

# Instantiate Stable Diffusion pipeline
model_cache_path = './model_cache'
model = 'stabilityai/stable-diffusion-2-1-base'
scheduler, _ = ddim_scheduler()
pipe, _ = build_stable_diffusion_pipeline(model, model_cache_path, scheduler)

# Generate watermarked image
prompt = 'grainy photo of a UFO at night'
watermarked_image, _ = pipe(prompt, num_inference_steps=50,
                                              latents=watermarked_latent, 
                                              return_dict=False)

To verify this watermark, we compute the inverted latent, extract the codeword, and attempt to decode:

from PIL import Image
from stable_diffusion.utils import build_stable_diffusion_pipeline,   
                                   convert_pil_to_latents
from stable_diffusion.schedulers import ddim_inverse_scheduler
import numpy as np

# Load image
img = Image.open(image_path)

# Instantiate Stable Diffusion pipeline with DDIM Inverse scheduler
model_cache_path = './model_cache'
model = 'stabilityai/stable-diffusion-2-1-base'
scheduler, _ = ddim_inverse_scheduler()
pipe, _ = build_stable_diffusion_pipeline(model, model_cache_path, scheduler)

# Convert the input image to latent space
image_latent = convert_pil_to_latents(pipe, img)

# Invert the sampling process that generated the image
inverted_latent, _ = pipe('', output_type='latent', latents=image_latent,     
                         num_inference_steps=50, return_dict=False)

watermark_verified = False
with np.nditer(inverted_latent.cpu().numpy()) as it:
    inverted_codeword = []
    for x in it:
         inverted_codeword.append(x > 0)

    if decode(k, inverted_codeword) == m:
         watermark_verified = True

This should work in theory given the error-correcting properties of the code. But does it work in practice?

Evaluation

A good approximate initial latent is one that is very similar to the original latent that generated an image. Given our embedding of a codeword into the latent, we define similarity as latents that have a high percentage of overlapping or matching signs.

To get a feel for this difference, we can visualize it by comparing a generated image to the same image, but with the inverted latent (these images are unwatermarked):

^{Figure 3: Image generated with prompt 'grainy photo of a UFO at night' (left) and the same image generated using the inverted latent (right).}

To evaluate how good the approximate latents are for preserving the robustness of watermarks, we randomly sampled 1,000 prompts from the PartiPrompts benchmark dataset. For each of these prompts we generated initial latent and inverted latent pairs. We then computed our similarity metric for each pair. We found that on average, 82% of the signs matched for all initial latent and inverted latent pairs, and at least 75% of signs matched for 90% of the pairs.

We were pleasantly surprised with how accurate the approximation was on average. If 75% of the signs are preserved, then this gives us a decent margin for correcting for errors introduced by an attacker attempting to remove the watermark. Of course a better approximation would give us a better robustness margin. More study is required to fully understand the strengths and limitations of using DDIM Inversion for watermark decoding.

Candidate pseudorandom codes

Now that we have a feel for how to apply pseudorandom codes, let's take a look at how we actually build them. Although the field is barely a year old, we already have a handful of candidates.

One obvious idea to try is to compose a plain error-correcting code with some cryptographic primitive to make the code pseudorandom. For instance, we might use some standard authenticated encryption scheme, like AES-GCM-SIV, to encrypt m, then apply an error correcting code to the ciphertext. (The watermarking key would be the encryption key.) This "encrypt-then-encode" composition seems natural because encryption schemes are already designed so that their ciphertexts are pseudorandom. Unfortunately, error correcting codes are generally highly structured, and this structure would be betrayed by the codeword, even when applied to a (pseudo)random input.

The dual composition, "encode-then-encrypt", also doesn't work. If the ciphertext is non-malleable, as in AES-GCM-SIV, then we wouldn't be able to tolerate any number of bit flips. On the other hand, if the ciphertext were malleable, as in AES-CTR, then the attacker would be able to forge codewords by manipulating a known codeword in a targeted manner.

The strategy of Christ-Gunn-2024 is to modify an existing error-correcting code to make it pseudorandom.

Pseudorandom LDPC codes

Their starting point is the widely used Low-Density Parity-Check (LDPC) code. This code is defined by a parity check matrix P with and a corresponding generator matrix G. The parity check matrix has bit entries and might look something like this:

import numpy as np
P = np.matrix([[1, 0, 1],
               [0, 0, 1],
               [0, 1, 1],
               [1, 1, 1],
               [1, 0, 0],
               [0, 1, 0]])

This matrix is used to check if a given bit string is a codeword. By definition, a codeword is any bit string c for which the weight of P*c (the number ones in the output of P*c) is small. (Note that arithmetic is modulo 2 here.) The generator matrix G is constructed from P so that it can be used as the encoder. In particular, for any bit string m, c=G*m is a codeword. The performance of this code depends in large part on the sparsity of the parity check matrix: roughly speaking, the more zero entries the matrix has, the more the bit flips the code can tolerate.

The main idea of Christ-Gunn-2024 is to tune the parameters of the LDPC code (the dimensions of the parity check matrix and its density) so that when the parity-check matrix P is chosen at random, the generator matrix G is pseudorandom. This means that, intuitively, when we encode a random input m as c=G*m, the codeword c is also pseudorandom. (There is a bit more that goes into constructing the input m, but this is roughly the idea.)

It's easy to see that a watermark based on this construction is robust, as it follows immediately from the capacity of LDPC to tolerate bit flips. Ensuring the watermark remains undetectable is more delicate, as it relies on relatively strong and understudied computational assumptions. As a result, it's not clear today for what parameter ranges this scheme is concretely secure. (There has been some progress here: a recent preprint by Surendra Ghentiyala and Venkatesan Guruswami showed that the pseudorandomness of Christ-Gunn-2024 can be proved with slightly weaker assumptions.)

To get a feel for how things might go wrong, consider what happens if the attacker manages to guess one of the rows of the parity check matrix P. When we take the dot product of this row with a codeword, then the output will be 0 with high probability. (By definition, c is a codeword if the sum of the dot products of c with each row of P is small.) But if we take the dot product of this row with a random bit string, then we should expect to see 0 with probability roughly ½. This gives us a way of distinguishing codewords from random bit strings.

Guessing a row of P is easy if the matrix is too sparse. In the extreme case, if each row has only one bit set, then there are only n possible values for that row, where n is the number of columns of P. On the other hand, making P too dense will degrade the code's ability to detect bit flips.

Similarly, it may be easy to guess a row of P if the length of the codeword itself (n) is too small. Thus, in order for this code to be pseudorandom, it is necessary (but not sufficient) for the number of possible parity check matrices to be so large that exhaustively searching for P is not feasible. This can be done by increasing the size of the codeword or tolerating fewer bit flips.

Pseudorandom codes from PRFs

Another approach to constructing pseudorandom codes comes from a 2024 preprint from Noah Golowich and Ankur Moitra. Their starting point is a common cryptographic primitive called a pseudorandom function (PRF). They require a PRF that takes as input a key k, a bit string x of length m and outputs a bit, denoted F(k,m).

Suppose our codewords are x₁, ⁣w₁=F(k,x₁), …, x_n, w_n=F(k,x_n) where x₁, …, x_n are random m-bit strings. (Notice that the codeword length is (m+1)*n.) To verify if a string is a codeword, we parse the codeword into x₁, w₁, …, x_n, w_n and check if w_i = F(k,x_i) for all i. If the number of checks that pass is sufficiently high, then the string is likely a codeword.

It's easy to see that this code is pseudorandom if the output of F is pseudorandom. However, it's not very robust: to make the i-th check fail, we just need to flip a single bit, w_i. The attacker just needs to flip a sufficient number of these bits to cause verification to fail. To defeat this attack, the encoder permutes the bits of the codeword with a secret, random permutation. That way the attacker has to guess the position of a sufficient number of w_is in the permuted bit string. (A bit more is required to make this scheme provably robust, but this is the idea.)

Note that the number of bit flips we can tolerate with this scheme depends significantly on the number of PRF checks. This in turn determines the length of the codeword, so we may only get a reasonable degree of robustness for longer codewords. Note that we can increase the number of PRF checks by decreasing the length m of the x_is, but making these strings too short is detrimental for pseudorandomness. (What happens if we happen to randomly pick x_i==x_j for i!=j?)

Are these schemes practical?

In our own experiments with Stable Diffusion, we were able to tune the LDPC code to tolerate up to 33% of the codeword bits being mangled, which is likely more than sufficient for robustness in practice. However, achieving this required making the parity check matrix so sparse that the code is not strongly pseudorandom. Thus, the resulting watermark cannot be considered cryptographically undetectable. Among the parameter sets for which the code is plausibly pseudorandom, we didn't find any for which the code tolerates more than 5% bit flips.

Our findings were similar for the PRF-based code: with plausibly pseudorandom parameters, we couldn't tune the code to tolerate more than 1% bit flips. Like the LDPC code, we can crank this higher by sacrificing pseudorandomness, but we weren't able to crack 5% with any parameters we tried.

There are a few ways to think about this.

First, for both codes, robustness improves as the codeword gets larger. In particular, if the latent space for Stable Diffusion were larger, then we'd expect to be able to tolerate more bit flips. In general, cryptographic watermarks of all kinds perform better when there is more randomness to work with. For example, short responses produced by chatbots are especially hard for any watermarking strategy, including pseudorandom codes.

Another takeaway is that we need a better approximation of the initial latent than provided by DDIM. Indeed, in their own evaluation of the LDPC-based code, Sam Gunn, Xuandong Zhao, and Dawn Song chose a much more sophisticated inversion method, which exhibited better results, albeit at a higher computational cost.

A third view is that, as a practical matter, cryptographic undetectability might not be all that important for some applications. For instance, we might decide the watermark is good enough on the basis of statistical tests to check for biases within, or correlations across, codewords. Of course, such tests can't rule out the possibility of perceptible differences between watermarked and unwatermarked artifacts.

^{Figure 4: Images with verified LDPC watermarks generated with prompt 'grainy photo of a UFO at night'.}

Conclusion

The sign of a good abstraction boundary is that it allows folks to collaborate across disciplines. With pseudorandom codes, it seems like we've landed on the right abstraction for AI watermarking: it's up to cryptography experts to figure out how to instantiate them; and it's up to AI/ML experts to figure out how to embed them in their applications. We believe this separation of concerns has the potential to make watermarking easier to deploy, especially for operators like Cloudflare's Workers AI, who don't train and maintain the models themselves.

After spending a few weeks playing around with this stuff, we're excited by the potential of pseudorandom codes to make strong watermarks for generative AI. However, we feel it will take some time for this field to yield practical schemes.

Existing candidates will require further study to determine the parameter ranges for which they provide good security. It is also worthwhile to investigate new approaches to building pseudorandom codes, perhaps starting with some other error correcting code besides LDPC. We should also examine what is even theoretically possible in this space: perhaps there is a fundamental tension between detectability and robustness that can't be resolved for some parameter regimes.

It's also going to be important for watermarks based on pseudorandom codes to be publicly verifiable, as some other cryptographic watermarks are. Concretely, the LDPC code is sort of analogous to public key encryption, where the ciphertext corresponds to a codeword. It might be possible to flip this paradigm around and make a digital signature where the signature is a codeword. Of course, this only works when the weights of the model are also publicly available.

On the AI/ML side, we need to look closer at methods of approximating the initial randomness for different types of models. This blog looked at what is perhaps the simplest possible method for Stable Diffusion, and while this seems to work pretty well, it's obvious that we can do a lot better. It's just a matter of keeping costs low. A good rule of thumb might be that verifying the watermark should not be more expensive than watermarked inference.

Pseudorandom codes may also have applications beyond watermarking. When we circulated this blog post internally, an idea that came up a lot was to somehow apply this technology to non-AI content, to embed in the content its provenance. Indeed, this is the idea behind the C2PA integration. Pseudorandom codes aren't immediately applicable, but may be in the future. Wherever you have a source of randomness in the process of generating some artifact, like digital photography, you can embed in that randomness a codeword.

Thanks for reading! We hope we've managed to pique your interest in this field. We certainly will be following along. If you'd like to play with the code we used to produce the numbers in this blog, or just make some cool watermarked AI content, you can find our demo on GitHub.

Post-quantum cryptography goes GA

Wesley Evans — Fri, 29 Sep 2023 13:05:00 GMT

Over the last twelve months, we have been talking about the new baseline of encryption on the Internet: post-quantum cryptography. During Birthday Week last year we announced that our beta of Kyber was available for testing, and that Cloudflare Tunnel could be enabled with post-quantum cryptography. Earlier this year, we made our stance clear that this foundational technology should be available to everyone for free, forever.

Today, we have hit a milestone after six years and 31 blog posts in the making: we’re starting to roll out General Availability¹ of post-quantum cryptography support to our customers, services, and internal systems as described more fully below. This includes products like Pingora for origin connectivity, 1.1.1.1, R2, Argo Smart Routing, Snippets, and so many more.

This is a milestone for the Internet. We don't yet know when quantum computers will have enough scale to break today's cryptography, but the benefits of upgrading to post-quantum cryptography now are clear. Fast connections and future-proofed security are all possible today because of the advances made by Cloudflare, Google, Mozilla, the National Institutes of Standards and Technology in the United States, the Internet Engineering Task Force, and numerous academic institutions

What does General Availability mean? In October 2022 we enabled X25519+Kyber as a beta for all websites and APIs served through Cloudflare. However, it takes two to tango: the connection is only secured if the browser also supports post-quantum cryptography. Starting August 2023, Chrome is slowly enabling X25519+Kyber by default.

The user’s request is routed through Cloudflare’s network (2). We have upgraded many of these internal connections to use post-quantum cryptography, and expect to be done upgrading all of our internal connections by the end of 2024. That leaves as the final link the connection (3) between us and the origin server.

We are happy to announce that we are rolling out support for X25519+Kyber for most inbound and outbound connections as Generally Available for use including origin servers and Cloudflare Workers fetch()es.

Plan	Support for post-quantum outbound connections
Free	Started roll-out. Aiming for 100% by the end of the October.
Pro and business	Aiming for 100% by the end of year.
Enterprise	Roll-out begins February 2024. 100% by March 2024.

For our Enterprise customers, we will be sending out additional information regularly over the course of the next six months to help prepare you for the roll-out. Pro, Business, and Enterprise customers can skip the roll-out and opt-in within your zone today, or opt-out ahead of time using an API described in our companion blog post on post-quantum cryptography. Before rolling out for Enterprise in February 2024, we will add a toggle on the dashboard to opt out.

If you're excited to get started now, check out our blog with the technical details and flip on post-quantum cryptography support via the API!

What’s included and what is next?

With an upgrade of this magnitude, we wanted to focus on our most used products first and then expand outward to cover our edge cases. This process has led us to include the following products and systems in this roll out:

1.1.1.1
AMP
API Gateway
Argo Smart Routing
Auto Minify
Automatic Platform Optimization
Automatic Signed Exchange
Cloudflare Egress
Cloudflare Images
Cloudflare Rulesets
Cloudflare Snippets
Cloudflare Tunnel
Custom Error Pages
Flow Based Monitoring
Health checks
Hermes
Host Head Checker
Magic Firewall
Magic Network Monitoring
Network Error Logging
Project Flame
Quicksilver
R2 Storage
Request Tracer
Rocket Loader
Speed on Cloudflare Dash
SSL/TLS
Traffic Manager
WAF, Managed Rules
Waiting Room
Web Analytics

If a product or service you use is not listed here, we have not started rolling out post-quantum cryptography to it yet. We are actively working on rolling out post-quantum cryptography to all products and services including our Zero Trust products. Until we have achieved post-quantum cryptography support in all of our systems, we will publish an update blog in every Innovation Week that covers which products we have rolled out post-quantum cryptography to, the products that will be getting it next, and what is still on the horizon.

Products we are working on bringing post-quantum cryptography support to soon:

Cloudflare Gateway
Cloudflare DNS
Cloudflare Load Balancer
Cloudflare Access
Always Online
Zaraz
Logging
D1
Cloudflare Workers
Cloudflare WARP
Bot Management

Why now?

As we announced earlier this year, post-quantum cryptography will be included for free in all Cloudflare products and services that can support it. The best encryption technology should be accessible to everyone - free of charge - to help support privacy and human rights globally.

As we mentioned in March:

“What was once an experimental frontier has turned into the underlying fabric of modern society. It runs in our most critical infrastructure like power systems, hospitals, airports, and banks. We trust it with our most precious memories. We trust it with our secrets. That’s why the Internet needs to be private by default. It needs to be secure by default.”

Our work on post-quantum cryptography is driven by the thesis that quantum computers that can break conventional cryptography create a similar problem to the Year 2000 bug. We know there is going to be a problem in the future that could have catastrophic consequences for users, businesses, and even nation states. The difference this time is we don’t know how the date and time that this break in the computational paradigm will occur. Worse, any traffic captured today could be decrypted in the future. We need to prepare today to be ready for this threat.

We are excited for everyone to adopt post-quantum cryptography into their systems. To follow the latest developments of our deployment of post-quantum cryptography and third-party client/server support, check out pq.cloudflareresearch.com and keep an eye on this blog.

***

¹We are using a preliminary version of Kyber, NIST’s pick for post-quantum key agreement. Kyber has not been finalized. We expect a final standard to be published in 2024 under the name ML-KEM, which we will then adopt promptly while deprecating support for X25519Kyber768Draft00.

Privacy-preserving measurement and machine learning

Christopher Patton — Fri, 29 Sep 2023 13:00:45 GMT

In 2023, data-driven approaches to making decisions are the norm. We use data for everything from analyzing x-rays to translating thousands of languages to directing autonomous cars. However, when it comes to building these systems, the conventional approach has been to collect as much data as possible, and worry about privacy as an afterthought.

The problem is, data can be sensitive and used to identify individuals – even when explicit identifiers are removed or noise is added.

Cloudflare Research has been interested in exploring different approaches to this question: is there a truly private way to perform data collection, especially for some of the most sensitive (but incredibly useful!) technology?

Some of the use cases we’re thinking about include: training federated machine learning models for predictive keyboards without collecting every user’s keystrokes; performing a census without storing data about individuals’ responses; providing healthcare authorities with data about COVID-19 exposures without tracking peoples’ locations en masse; figuring out the most common errors browsers are experiencing without reporting which websites are visiting.

It’s with those use cases in mind that we’ve been participating in the Privacy Preserving Measurement working group at the IETF whose goal is to develop systems for collecting and using this data while minimizing the amount of per-user information exposed to the data collector.

So far, the most promising standard in this space is DAP – Distributed Aggregation Protocol – a clever way to use multi-party computation to aggregate data without exposing individual measurements. Early versions of the algorithms used by DAP have been implemented by Google and Apple for exposure notifications.

In this blog post, we’ll do a deep dive into the fundamental concepts behind the DAP protocol and give an example of how we’ve implemented it into Daphne, our open source aggregator server. We hope this will inspire others to collaborate with us and get involved in this space!

The principles behind DAP, an open standard for privacy preserving measurement

At a high level, using the DAP protocol forces us to think in terms of data minimization: collect only the data that we use and nothing more. Abstractly, our goal is to devise a system with which a data collector can compute some function $ f(m_{1},...,m_{N}) $ of measurements $ m_{1},...,m_{N} $ uploaded by users without observing the measurements in the clear.

Alice wants to know some aggregate statistic – like the average salary of the people at the party – without knowing how much each individual person makes.

This may at first seem like an impossible task: to compute on data without knowing the data we're computing on. Nevertheless, —and, as is often the case in cryptography— once we've properly constrained the problem, solutions begin to emerge.

Strawperson solution: delegate the calculation to a trusted third party, Bob. The problem with this is that Bob can see the private inputs in the clear

In an ideal world (see above), there would be some server somewhere on the Internet that we could trust to consume measurements, aggregate them, and send the result to the data collector without ever disclosing anything else. However, in reality there's no reason for users to trust such a server more than the data collector; Indeed, both are subject to the usual assortment of attacks that can lead to a data breach.

_MPC solution: secret-share the inputs across multiple parties, a.k.a. Bob and Daphne. If at least one person is honest, Alice gets the aggregate result without anyone knowing individual inputs in the clear._‌ ‌

Instead, what we do in DAP is distribute the computation across the servers such that no single server has a complete measurement. The key idea that makes this possible is secret sharing.

Computing on secret shared data

To set things up, let's make the problem a little more concrete. Suppose each measurement $ m_{i} $ is a number and our goal is to compute the sum of the measurements. That is, $ f(m_{1},...,m_{N}) = m_{1} + \cdots + m_{N} $. Our goal is to use secret sharing to allow two servers, which we'll call aggregators, to jointly compute this sum.

To understand secret sharing, we're going to need a tiny bit of math—modular arithmetic. The expression $ X + Y (\textrm{mod}) \textit{q} $ means "add $ X $ and $ Y $, then divide the sum by $ q $ and return the remainder". For now the modulus $ q $ can be any large number, as long as it's larger than any sum we'd ever want to compute ($ 2 ^{64} $, say). In the remainder of this section, we'll omit $ q $ and simply write $ X + Y $ for addition modulo $ q $.

The goal of secret sharing is to shard a measurement (i.e., a "secret") into two "shares" such that (i) the measurement can be recovered by combining the shares together and (ii) neither share leaks any information about the measurement. To secret share each $ m_{i} $, we choose a random number $ R_{i} \in \lbrace 0,...,q - 1\rbrace $, set the first share to be $X_{i} = m_{i} - R_{i} $ and set the other share to be $ Y_{i} = R_{i} $. To recover the measurement, we simply add the shares together. This works because $ X_{i} + Y_{i} = (m_{i} - R_{i}) + R_{i} = m_{i} $. Moreover, each share is indistinguishable from a random number: For example, $ 1337 $ might be secret-shared into $ 11419752798245067454 $ and $ 7026991275464485499 $ (modulo $ q = 2^{64} $).

With this scheme we can devise a simple protocol for securely computing the sum:

Each client shards its measurement $ m_{i} $ into $ X_{i} $ and $ Y_{i} $ and sends one share to each server.
The first aggregator computes $ X = X_{1} + \cdots + X_{N} $ and reveals $ X $ to the data collector. The second aggregator computes $ Y = Y_{1} + \cdots + Y_{N} $ and reveals $ Y $ to the data collector.
The data collector unshards the result as $ r = X + Y $.

This works because the secret shares are additive, and the order in which we add things up is irrelevant to the function we're computing:

$ r = m_{1} + \cdots + m_{N} $ // by definition$ r = (m_{1} - R_{1}) + R_{1} + \cdots (m_{N} - R_{N}) + R_{N} $ // apply sharding$ r = (m_{1} - R_{1}) + \cdots + (m_{N} - R_{N}) + R_{1} + \cdots R_{N} $ // rearrange the sum$ r = X + Y $ // apply aggregation

Rich data types

This basic template for secure aggregation was described in a paper from Henry Corrigan-Gibbs and Dan Boneh called "Prio: Private, Robust, and Scalable Computation of Aggregate Statistics" (NSDI 2017). This paper is a critical milestone in DAP's history, as it showed that a wide variety of aggregation tasks (not just sums) can be solved within one, simple protocol framework, Prio. With DAP, our goal in large part is to bring this framework to life.

All Prio tasks are instances of the same template. Measurements are encoded in a form that allows the aggregation function to be expressed as the sum of (shares of) the encoded measurements. For example:

To get arithmetic mean, we just divide the sum by the number of measurements.
Variance and standard deviation can be expressed as a linear function of the sum and the sum of squares (i.e., $ m_{i}, m_{i}^{2} $ for each $ i $).
Quantiles (e.g., median) can be estimated reasonably well by mapping the measurements into buckets and aggregating the histogram.
Linear regression (i.e., finding a line of best fit through a set of data points) is a bit more complicated, but can also be expressed in the Prio framework.

This degree of flexibility is essential for wide-spread adoption because it allows us to get the most value we can out of a relatively small amount of software. However, there are a couple problems we still need to overcome, both of which entail the need for some form of interaction.

Input validation

The first problem is input validation. Software engineers, especially those of us who operate web services, know in our bones that validating inputs we get from clients is of paramount importance. (Never, ever stick a raw input you got from a client into an SQL query!) But if the inputs are secret shared, then there is no way for an aggregator to discern even a single bit of the measurement, let alone check that it has an expected value. (A secret share of a valid measurement and a number sampled randomly from $ \lbrace 0,...,q - 1 \rbrace $ look identical.) At least, not on its own.

The solution adopted by Prio (and the standard, with some improvements), is a special kind of zero-knowledge proof (ZKP) system designed to operate on secret shared data. The goal is for a prover to convince a verifier that a statement about some data it has committed to is true (e.g., the user has a valid hardware key), without revealing the data itself (e.g. which hardware key is in-use).

Our setting is exactly the same, except that we're working on secret-shared data rather than committed data. Along with the measurement shares, the client sends shares of a validity proof; then during aggregation, the aggregators interact with one another in order to check and verify the proof. (One round-trip over the network is required.)

A happy consequence of working with secret shared data is that proof generation and verification are much faster than for committed (or encrypted) data. This is mainly because we avoid the use of public-key cryptography (i.e., elliptic curves) and are less constrained in how we choose cryptographic parameters. (We require the modulus $ q $ to be a prime number with a particular structure, but such primes are not hard to find.)

Non-linear aggregation

There are a variety of aggregation tasks for which Prio is not well-suited, in particular those that are non-linear. One such task is to find the "heavy hitters" among the set of measurements. The heavy hitters are the subset of the measurements that occur most frequently, say at least $ t $ times for some threshold $ t $. For example, the measurements might be the URLs visited on a given day by users of a web browser; the heavy hitters would be the set of URLs that were visited by at least $ t $ users.

This computation can be expressed as a simple program:

def heavy_hitters(measurements: list[bytes], t: int) -> set[bytes]:
    hh = defaultdict(lambda: 0)
    for measurement in measurements:
        hh[measurement] += 1
    return set(map(lambda x: x[0], filter(lambda x: x[1] >= t, hh.items())))

However, it cannot be expressed as a linear function, at least not efficiently (with sub-exponential space). This would be required to perform this computation on secret-shared measurements.

In order to enable non-linear computation on secret shared data, it is necessary to introduce some form of interaction. There are a few possibilities. For the heavy hitters problem in particular, Henry Corrigan-Gibbs and others devised a protocol called Poplar (IEEE Security & Privacy 2021) in which several rounds of aggregation and unsharding are performed, where in each round, information provided by the collector is used to "query" the measurements to obtain a refined aggregate result.

Helping to build a world of multi-party computation

Protocols like Prio or Poplar that enable computation over secret shared data fit into a rich tradition in cryptography known as multi-party computation (MPC). MPC is at once an active research area in theoretical computer science and a class of protocols that are beginning to see real-world use—in our case, to minimize the amount of privacy-sensitive information we collect in order to keep the Internet moving.

The PPM working group at IETF represents a significant effort, by Cloudflare and others, to standardize MPC techniques for privacy preserving measurement. This work has three main prongs:

To identify the types of problems that need to be solved.
To provide cryptography researchers from academia, industry, and the public sector with "templates" for solutions that we know how to deploy. One such template is called a "Verifiable Distributed Aggregation Function (VDAF)", which specifies a kind of "API boundary" between protocols like Prio and Poplar and the systems that are built around them. Cloudflare Research is leading development of the standard, contributing to implementations, and providing security analysis.
To provide a deployment roadmap for emerging protocols. DAP is one such roadmap: it specifies execution of a generic VDAF over HTTPS and attends to the various operational considerations that arise as deployments progress. As well as contributing to the standard itself, Cloudflare has developed its own implementation designed for our own infrastructure (see below).

The IETF is working on its first set of drafts (DAP/VDAF). These drafts are mature enough to deploy, and a number of deployments are scaling up as we speak. Our hope is that we have initiated positive feedback between theorists and practitioners: as new cryptographic techniques emerge, more practitioners will begin to work with them, which will lead to identifying new problems to solve, leading to new techniques, and so on.

Daphne: Cloudflare’s implementation of a DAP Aggregation Server

Our emerging technology group has been working on Daphne, our Rust-based implementation of a DAP aggregator server. This is only half of a deployment – DAP architecture requires two aggregator servers to interoperate, both operated by different parties. Our current version only implements the DAP Helper role; the other role is the DAP Leader. Plans are in the works to implement the Leader as well, which will open us up to deploy Daphne for more use cases.

We made two big decisions in our implementation here: using Rust and using Workers. Rust has been skyrocketing in popularity in the past few years due to its performance and memory management – a favorite of cryptographers for similar reasons. Workers is Cloudflare’s serverless execution environment that allows developers to easily deploy applications globally across our network – making it a favorite tool to prototype with at Cloudflare. This allows for easy integration with our Workers-based storage solutions like: Durable Objects, which we’re using for storing various data artifacts as required by the DAP protocol; and KV, which we’re using for managing aggregation task configuration. We’ve learned a lot from our interop tests and deployment, which has helped improve our own Workers products and which we have also fed back into the PPM working group to help improve the DAP standard.

If you’re interested in learning more about Daphne or collaborating with us in this space, you can fill out this form. If you’d like to get involved in the DAP standard, you can check out the working group.

Experiment with post-quantum cryptography today

Bas Westerbaan — Thu, 04 Aug 2022 13:00:00 GMT

Practically all data sent over the Internet today is at risk in the future if a sufficiently large and stable quantum computer is created. Anyone who captures data now could decrypt it.

Luckily, there is a solution: we can switch to so-called post-quantum (PQ) cryptography, which is designed to be secure against attacks of quantum computers. After a six-year worldwide selection process, in July 2022, NIST announced they will standardize Kyber, a post-quantum key agreement scheme. The standard will be ready in 2024, but we want to help drive the adoption of post-quantum cryptography.

Today we have added support for the X25519Kyber512Draft00 and X25519Kyber768Draft00 hybrid post-quantum key agreements to a number of test domains, including pq.cloudflareresearch.com.

Do you want to experiment with post-quantum on your test website for free? Mail ask-research@cloudflare.com to enroll your test website, but read the fine-print below.

What does it mean to enable post-quantum on your website?

If you enroll your website to the post-quantum beta, we will add support for these two extra key agreements alongside the existing classical encryption schemes such as X25519. If your browser doesn’t support these post-quantum key agreements (and none at the time of writing do), then your browser will continue working with a classically secure, but not quantum-resistant, connection.

Then how to test it?

We have open-sourced a fork of BoringSSL and Go that has support for these post-quantum key agreements. With those and an enrolled test domain, you can check how your application performs with post-quantum key exchanges. We are working on support for more libraries and languages.

What to look for?

Kyber and classical key agreements such as X25519 have different performance characteristics: Kyber requires less computation, but has bigger keys and requires a bit more RAM to compute. It could very well make the connection faster if used on its own.

We are not using Kyber on its own though, but are using hybrids. That means we are doing both an X25519 and Kyber key agreement such that the connection is still classically secure if either is broken. That also means that connections will be a bit slower. In our experiments, the difference is very small, but it’s best to check for yourself.

The fine-print

Cloudflare’s post-quantum cryptography support is a beta service for experimental use only. Enabling post-quantum on your website will subject the website to Cloudflare’s Beta Services terms and will impact other Cloudflare services on the website as described below.

No stability or support guarantees

Over the coming months, both Kyber and the way it’s integrated into TLS will change for several reasons, including:

Kyber will see small, but backward-incompatible changes in the coming months.
We want to be compatible with other early adopters and will change our integration accordingly.
As, together with the cryptography community, we find issues, we will add workarounds in our integration.

We will update our forks accordingly, but cannot guarantee any long-term stability or continued support. PQ support may become unavailable at any moment. We will post updates on pq.cloudflareresearch.com.

Features in enrolled domains

For the moment, we are running enrolled zones on a slightly different infrastructure for which not all features, notably QUIC, are available.

With that out of the way, it’s…

Demo time!

BoringSSL

With the following commands build our fork of BoringSSL and create a TLS connection with pq.cloudflareresearch.com using the compiled bssl tool. Note that we do not enable the post-quantum key agreements by default, so you have to pass the -curves flag.

$ git clone https://github.com/cloudflare/boringssl-pq
[snip]
$ cd boringssl-pq && mkdir build && cd build && cmake .. -GNinja && ninja 
[snip]
$ ./tool/bssl client -connect pq.cloudflareresearch.com -server-name pq.cloudflareresearch.com -curves Xyber512D00
	Connecting to [2606:4700:7::a29f:8a55]:443
Connected.
  Version: TLSv1.3
  Resumed session: no
  Cipher: TLS_AES_128_GCM_SHA256
  ECDHE curve: X25519Kyber512Draft00
  Signature algorithm: ecdsa_secp256r1_sha256
  Secure renegotiation: yes
  Extended master secret: yes
  Next protocol negotiated: 
  ALPN protocol: 
  OCSP staple: no
  SCT list: no
  Early data: no
  Encrypted ClientHello: no
  Cert subject: CN = *.pq.cloudflareresearch.com
  Cert issuer: C = US, O = Let's Encrypt, CN = E1

Go

Our Go fork doesn’t enable the post-quantum key agreement by default. The following simple Go program enables PQ by default for the http package and GETs pq.cloudflareresearch.com.

package main

import (
    "context"
    "crypto/tls"
    "fmt"
    "net/http"
)

func main() {
    req, err := http.NewRequestWithContext(
        context.WithValue(
            context.Background(),
            tls.CFEventHandlerContextKey{},
            func(ev tls.CFEvent) {
                switch e := ev.(type) {
                case tls.CFEventTLS13HRR:
                    fmt.Printf("HelloRetryRequest\n")
                case tls.CFEventTLS13NegotiatedKEX:
                    switch e.KEX {
                    case tls.X25519Kyber512Draft00:
                        fmt.Printf("Used X25519Kyber512Draft00\n")
                    default:
                        fmt.Printf("Used %d\n", e.KEX)
                    }
                }
            },
        ),
        "GET",
        "https://pq.cloudflareresearch.com",
        nil,
    )
    if err != nil {
        panic(err)
    }

    http.DefaultTransport.(*http.Transport).TLSClientConfig = &tls.Config{
        CurvePreferences: []tls.CurveID{tls.X25519Kyber512Draft00, tls.X25519},
    }

    if _, err = (&http.Client{}).Do(req); err != nil {
        fmt.Println(err)
    }
}

To run we need to compile our Go fork:

$ git clone https://github.com/cloudflare/go
[snip]
$ cd go/src && ./all.bash
[snip]
$ ../bin/go run path/to/example.go
Used X25519Kyber512Draft00

On the wire

So what does this look like on the wire? With Wireshark we can capture the packet flow. First a non-post quantum HTTP/2 connection with X25519:

This is a normal TLS 1.3 handshake: the client sends a ClientHello with an X25519 keyshare, which fits in a single packet. In return, the server sends its own 32 byte X25519 keyshare. It also sends various other messages, such as the certificate chain, which requires two packets in total.

Let’s check out Kyber:

As you can see the ClientHello is a bit bigger, but still fits within a single packet. The response takes three packets now, instead of two, because of the larger server keyshare.

Under the hood

Want to add client support yourself? We are using a hybrid of X25519 and Kyber version 3.02. We are writing out the details of the latter in version 00 of this CRFG IETF draft, hence the name. We are using TLS group identifiers 0xfe30 and 0xfe31 for X25519Kyber512Draft00 and X25519Kyber768Draft00 respectively.

There are some differences between our Go and BoringSSL forks that are interesting to compare.

Our Go fork uses our fast AVX2 optimized implementation of Kyber from CIRCL. In contrast, our BoringSSL fork uses the simpler portable reference implementation. Without the AVX2 optimisations it’s easier to evaluate. The downside is that it’s slower. Don’t be mistaken: it is still very fast, but you can check yourself.
Our Go fork only sends one keyshare. If the server doesn’t support it, it will respond with a HelloRetryRequest message and the client will fallback to one the server does support. This adds a roundtrip.Our BoringSSL fork, on the other hand, will send two keyshares: the post-quantum hybrid and a classical one (if a classical key agreement is still enabled). If the server doesn’t recognize the first, it will be able to use the second. In this way we avoid a roundtrip if the server does not support the post-quantum key agreement.

Looking ahead

The quantum future is here. In the coming years the Internet will move to post-quantum cryptography. Today we are offering our customers the tools to get a headstart and test post-quantum key agreements. We love to hear your feedback: e-mail it to ask-research@cloudflare.com.

This is just a small, but important first step. We will continue our efforts to move towards a secure and private quantum-secure Internet. Much more to come — watch this space.

Handshake Encryption: Endgame (an ECH update)

Christopher Wood — Tue, 12 Oct 2021 12:59:22 GMT

Privacy and security are fundamental to Cloudflare, and we believe in and champion the use of cryptography to help provide these fundamentals for customers, end-users, and the Internet at large. In the past, we helped specify, implement, and ship TLS 1.3, the latest version of the transport security protocol underlying the web, to all of our users. TLS 1.3 vastly improved upon prior versions of the protocol with respect to security, privacy, and performance: simpler cryptographic algorithms, more handshake encryption, and fewer round trips are just a few of the many great features of this protocol.

TLS 1.3 was a tremendous improvement over TLS 1.2, but there is still room for improvement. Sensitive metadata relating to application or user intent is still visible in plaintext on the wire. In particular, all client parameters, including the name of the target server the client is connecting to, are visible in plaintext. For obvious reasons, this is problematic from a privacy perspective: Even if your application traffic to crypto.cloudflare.com is encrypted, the fact you’re visiting crypto.cloudflare.com can be quite revealing.

And so, in collaboration with other participants in the standardization community and members of industry, we embarked towards a solution for encrypting all sensitive TLS metadata in transit. The result: TLS Encrypted ClientHello (ECH), an extension to protect this sensitive metadata during connection establishment.

Last year, we described the current status of this standard and its relation to the TLS 1.3 standardization effort, as well as ECH's predecessor, Encrypted SNI (ESNI). The protocol has come a long way since then, but when will we know when it's ready? There are many ways by which one can measure a protocol. Is it implementable? Is it easy to enable? Does it seamlessly integrate with existing protocols or applications? In order to assess these questions and see if the Internet is ready for ECH, the community needs deployment experience. Hence, for the past year, we’ve been focused on making the protocol stable, interoperable, and, ultimately, deployable. And today, we’re pleased to announce that we’ve begun our initial deployment of TLS ECH.

What does ECH mean for connection security and privacy on the network? How does it relate to similar technologies and concepts such as domain fronting? In this post, we’ll dig into ECH details and describe what this protocol does to move the needle to help build a better Internet.

Connection privacy

For most Internet users, connections are made to perform some type of task, such as loading a web page, sending a message to a friend, purchasing some items online, or accessing bank account information. Each of these connections reveals some limited information about user behavior. For example, a connection to a messaging platform reveals that one might be trying to send or receive a message. Similarly, a connection to a bank or financial institution reveals when the user typically makes financial transactions. Individually, this metadata might seem harmless. But consider what happens when it accumulates: does the set of websites you visit on a regular basis uniquely identify you as a user? The safe answer is: yes.

This type of metadata is privacy-sensitive, and ultimately something that should only be known by two entities: the user who initiates the connection, and the service which accepts the connection. However, the reality today is that this metadata is known to more than those two entities.

Making this information private is no easy feat. The nature or intent of a connection, i.e., the name of the service such as crypto.cloudflare.com, is revealed in multiple places during the course of connection establishment: during DNS resolution, wherein clients map service names to IP addresses; and during connection establishment, wherein clients indicate the service name to the target server. (Note: there are other small leaks, though DNS and TLS are the primary problems on the Internet today.)

As is common in recent years, the solution to this problem is encryption. DNS-over-HTTPS (DoH) is a protocol for encrypting DNS queries and responses to hide this information from onpath observers. Encrypted Client Hello (ECH) is the complementary protocol for TLS.

The TLS handshake begins when the client sends a ClientHello message to the server over a TCP connection (or, in the context of QUIC, over UDP) with relevant parameters, including those that are sensitive. The server responds with a ServerHello, encrypted parameters, and all that’s needed to finish the handshake.

The goal of ECH is as simple as its name suggests: to encrypt the ClientHello so that privacy-sensitive parameters, such as the service name, are unintelligible to anyone listening on the network. The client encrypts this message using a public key it learns by making a DNS query for a special record known as the HTTPS resource record. This record advertises the server's various TLS and HTTPS capabilities, including ECH support. The server decrypts the encrypted ClientHello using the corresponding secret key.

Conceptually, DoH and ECH are somewhat similar. With DoH, clients establish an encrypted connection (HTTPS) to a DNS recursive resolver such as 1.1.1.1 and, within that connection, perform DNS transactions.

With ECH, clients establish an encrypted connection to a TLS-terminating server such as crypto.cloudflare.com, and within that connection, request resources for an authorized domain such as cloudflareresearch.com.

There is one very important difference between DoH and ECH that is worth highlighting. Whereas a DoH recursive resolver is specifically designed to allow queries for any domain, a TLS server is configured to allow connections for a select set of authorized domains. Typically, the set of authorized domains for a TLS server are those which appear on its certificate, as these constitute the set of names for which the server is authorized to terminate a connection.

Basically, this means the DNS resolver is open, whereas the ECH client-facing server is closed. And this closed set of authorized domains is informally referred to as the anonymity set. (This will become important later on in this post.) Moreover, the anonymity set is assumed to be public information. Anyone can query DNS to discover what domains map to the same client-facing server.

Why is this distinction important? It means that one cannot use ECH for the purposes of connecting to an authorized domain and then interacting with a different domain, a practice commonly referred to as domain fronting. When a client connects to a server using an authorized domain but then tries to interact with a different domain within that connection, e.g., by sending HTTP requests for an origin that does not match the domain of the connection, the request will fail.

From a high level, encrypting names in DNS and TLS may seem like a simple feat. However, as we’ll show, ECH demands a different look at security and an updated threat model.

A changing threat model and design confidence

The typical threat model for TLS is known as the Dolev-Yao model, in which an active network attacker can read, write, and delete packets from the network. This attacker’s goal is to derive the shared session key. There has been a tremendous amount of research analyzing the security of TLS to gain confidence that the protocol achieves this goal.

The threat model for ECH is somewhat stronger than considered in previous work. Not only should it be hard to derive the session key, it should also be hard for the attacker to determine the identity of the server from a known anonymity set. That is, ideally, it should have no more advantage in identifying the server than if it simply guessed from the set of servers in the anonymity set. And recall that the attacker is free to read, write, and modify any packet as part of the TLS connection. This means, for example, that an attacker can replay a ClientHello and observe the server’s response. It can also extract pieces of the ClientHello — including the ECH extension — and use them in its own modified ClientHello.

The design of ECH ensures that this sort of attack is virtually impossible by ensuring the server certificate can only be decrypted by either the client or client-facing server.

Something else an attacker might try is masquerade as the server and actively interfere with the client to observe its behavior. If the client reacted differently based on whether the server-provided certificate was correct, this would allow the attacker to test whether a given connection using ECH was for a particular name.

ECH also defends against this attack by ensuring that an attacker without access to the private ECH key material cannot actively inject anything into the connection.

The attacker can also be entirely passive and try to infer encrypted information from other visible metadata, such as packet sizes and timing. (Indeed, traffic analysis is an open problem for ECH and in general for TLS and related protocols.) Passive attackers simply sit and listen to TLS connections, and use what they see and, importantly, what they know to make determinations about the connection contents. For example, if a passive attacker knows that the name of the client-facing server is crypto.cloudflare.com, and it sees a ClientHello with ECH to crypto.cloudflare.com, it can conclude, with reasonable certainty, that the connection is to some domain in the anonymity set of crypto.cloudflare.com.

The number of potential attack vectors is astonishing, and something that the TLS working group has tripped over in prior iterations of the ECH design. Before any sort of real world deployment and experiment, we needed confidence in the design of this protocol. To that end, we are working closely with external researchers on a formal analysis of the ECH design which captures the following security goals:

Use of ECH does not weaken the security properties of TLS without ECH.
TLS connection establishment to a host in the client-facing server’s anonymity set is indistinguishable from a connection to any other host in that anonymity set.

We’ll write more about the model and analysis when they’re ready. Stay tuned!

There are plenty of other subtle security properties we desire for ECH, and some of these drill right into the most important question for a privacy-enhancing technology: Is this deployable?

Focusing on deployability

With confidence in the security and privacy properties of the protocol, we then turned our attention towards deployability. In the past, significant protocol changes to fundamental Internet protocols such as TCP or TLS have been complicated by some form of benign interference. Network software, like any software, is prone to bugs, and sometimes these bugs manifest in ways that we only detect when there’s a change elsewhere in the protocol. For example, TLS 1.3 unveiled middlebox ossification bugs that ultimately led to the middlebox compatibility mode for TLS 1.3.

While itself just an extension, the risk of ECH exposing (or introducing!) similar bugs is real. To combat this problem, ECH supports a variant of GREASE whose goal is to ensure that all ECH-capable clients produce syntactically equivalent ClientHello messages. In particular, if a client supports ECH but does not have the corresponding ECH configuration, it uses GREASE. Otherwise, it produces a ClientHello with real ECH support. In both cases, the syntax of the ClientHello messages is equivalent.

This hopefully avoids network bugs that would otherwise trigger upon real or fake ECH. Or, in other words, it helps ensure that all ECH-capable client connections are treated similarly in the presence of benign network bugs or otherwise passive attackers. Interestingly, active attackers can easily distinguish -- with some probability -- between real or fake ECH. Using GREASE, the ClientHello carries an ECH extension, though its contents are effectively randomized, whereas a real ClientHello using ECH has information that will match what is contained in DNS. This means an active attacker can simply compare the ClientHello against what’s in the DNS. Indeed, anyone can query DNS and use it to determine if a ClientHello is real or fake:

$ dig +short crypto.cloudflare.com TYPE65
\# 134 0001000001000302683200040008A29F874FA29F884F000500480046 FE0D0042D500200020E3541EC94A36DCBF823454BA591D815C240815 77FD00CAC9DC16C884DF80565F0004000100010013636C6F7564666C 6172652D65736E692E636F6D00000006002026064700000700000000 0000A29F874F260647000007000000000000A29F884F

Despite this obvious distinguisher, the end result isn’t that interesting. If a server is capable of ECH and a client is capable of ECH, then the connection most likely used ECH, and whether clients and servers are capable of ECH is assumed public information. Thus, GREASE is primarily intended to ease deployment against benign network bugs and otherwise passive attackers.

Note, importantly, that GREASE (or fake) ECH ClientHello messages are semantically different from real ECH ClientHello messages. This presents a real problem for networks such as enterprise settings or school environments that otherwise use plaintext TLS information for the purposes of implementing various features like filtering or parental controls. (Encrypted DNS protocols like DoH also encountered similar obstacles in their deployment.) Fundamentally, this problem reduces to the following: How can networks securely disable features like DoH and ECH? Fortunately, there are a number of approaches that might work, with the more promising one centered around DNS discovery. In particular, if clients could securely discover encrypted recursive resolvers that can perform filtering in lieu of it being done at the TLS layer, then TLS-layer filtering might be wholly unnecessary. (Other approaches, such as the use of canary domains to give networks an opportunity to signal that certain features are not permitted, may work, though it’s not clear if these could or would be abused to disable ECH.)

We are eager to collaborate with browser vendors, network operators, and other stakeholders to find a feasible deployment model that works well for users without ultimately stifling connection privacy for everyone else.

Next steps

ECH is rolling out for some FREE zones on our network in select geographic regions. We will continue to expand the set of zones and regions that support ECH slowly, monitoring for failures in the process. Ultimately, the goal is to work with the rest of the TLS working group and IETF towards updating the specification based on this experiment in hopes of making it safe, secure, usable, and, ultimately, deployable for the Internet.

ECH is one part of the connection privacy story. Like a leaky boat, it’s important to look for and plug all the gaps before taking on lots of passengers! Cloudflare Research is committed to these narrow technical problems and their long-term solutions. Stay tuned for more updates on this and related protocols.

Good-bye ESNI, hello ECH!

Christopher Patton — Tue, 08 Dec 2020 12:00:00 GMT

Most communication on the modern Internet is encrypted to ensure that its content is intelligible only to the endpoints, i.e., client and server. Encryption, however, requires a key and so the endpoints must agree on an encryption key without revealing the key to would-be attackers. The most widely used cryptographic protocol for this task, called key exchange, is the Transport Layer Security (TLS) handshake.

In this post we'll dive into Encrypted Client Hello (ECH), a new extension for TLS that promises to significantly enhance the privacy of this critical Internet protocol. Today, a number of privacy-sensitive parameters of the TLS connection are negotiated in the clear. This leaves a trove of metadata available to network observers, including the endpoints' identities, how they use the connection, and so on.

ECH encrypts the full handshake so that this metadata is kept secret. Crucially, this closes a long-standing privacy leak by protecting the Server Name Indication (SNI) from eavesdroppers on the network. Encrypting the SNI is important because it is the clearest signal of which server a given client is communicating with. However, and perhaps more significantly, ECH also lays the groundwork for adding future security features and performance enhancements to TLS while minimizing their impact on the privacy of end users.

ECH is the product of close collaboration, facilitated by the IETF, between academics and the tech industry leaders, including Cloudflare, our friends at Fastly and Mozilla (both of whom are the affiliations of co-authors of the standard), and many others. This feature represents a significant upgrade to the TLS protocol, one that builds on bleeding edge technologies, like DNS-over-HTTPS, that are only now coming into their own. As such, the protocol is not yet ready for Internet-scale deployment. This article is intended as a sign post on the road to full handshake encryption.

Background

The story of TLS is the story of the Internet. As our reliance on the Internet has grown, so the protocol has evolved to address ever-changing operational requirements, use cases, and threat models. The client and server don't just exchange a key. They negotiate a wide variety of features and parameters: the exact method of key exchange; the encryption algorithm; who is authenticated and how; which application layer protocol to use after the handshake; and much, much more. All of these parameters impact the security properties of the communication channel in one way or another.

SNI is a prime example of a parameter that impacts the channel's security. The SNI extension is used by the client to indicate to the server the website it wants to reach. This is essential for the modern Internet, as it's common nowadays for many origin servers to sit behind a single TLS operator. In this setting, the operator uses the SNI to determine who will authenticate the connection: without it, there would be no way of knowing which TLS certificate to present to the client. The problem is that SNI leaks to the network the identity of the origin server the client wants to connect to, potentially allowing eavesdroppers to infer a lot of information about their communication. (Of course, there are other ways for a network observer to identify the origin — the origin's IP address, for example. But co-locating with other origins on the same IP address makes it much harder to use this metric to determine the origin than it is to simply inspect the SNI.)

Although protecting SNI is the impetus for ECH, it is by no means the only privacy-sensitive handshake parameter that the client and server negotiate. Another is the ALPN extension, which is used to decide which application-layer protocol to use once the TLS connection is established. The client sends the list of applications it supports — whether it's HTTPS, email, instant messaging, or the myriad other applications that use TLS for transport security — and the server selects one from this list, and sends its selection to the client. By doing so, the client and server leak to the network a clear signal of their capabilities and what the connection might be used for.

Some features are so privacy-sensitive that their inclusion in the handshake is a non-starter. One idea that has been floated is to replace the key exchange at the heart of TLS with password-authenticated key-exchange (PAKE). This would allow password-based authentication to be used alongside (or in lieu of) certificate-based authentication, making TLS more robust and suitable for a wider range of applications. The privacy issue here is analogous to SNI: servers typically associate a unique identifier to each client (e.g., a username or email address) that is used to retrieve the client's credentials; and the client must, somehow, convey this identity to the server during the course of the handshake. If sent in the clear, then this personally identifiable information would be easily accessible to any network observer.

A necessary ingredient for addressing all of these privacy leaks is handshake encryption, i.e., the encryption of handshake messages in addition to application data. Sounds simple enough, but this solution presents another problem: how do the client and server pick an encryption key if, after all, the handshake is itself a means of exchanging a key? Some parameters must be sent in the clear, of course, so the goal of ECH is to encrypt all handshake parameters except those that are essential to completing the key exchange.

In order to understand ECH and the design decisions underpinning it, it helps to understand a little bit about the history of handshake encryption in TLS.

Handshake encryption in TLS

TLS had no handshake encryption at all prior to the latest version, TLS 1.3. In the wake of the Snowden revelations in 2013, the IETF community began to consider ways of countering the threat that mass surveillance posed to the open Internet. When the process of standardizing TLS 1.3 began in 2014, one of its design goals was to encrypt as much of the handshake as possible. Unfortunately, the final standard falls short of full handshake encryption, and several parameters, including SNI, are still sent in the clear. Let's take a closer look to see why.

The TLS 1.3 protocol flow is illustrated in Figure 1. Handshake encryption begins as soon as the client and server compute a fresh shared secret. To do this, the client sends a key share in its ClientHello message, and the server responds in its ServerHello with its own key share. Having exchanged these shares, the client and server can derive a shared secret. Each subsequent handshake message is encrypted using the handshake traffic key derived from the shared secret. Application data is encrypted using a different key, called the application traffic key, which is also derived from the shared secret. These derived keys have different security properties: to emphasize this, they are illustrated with different colors.

The first handshake message that is encrypted is the server's EncryptedExtensions. The purpose of this message is to protect the server's sensitive handshake parameters, including the server's ALPN extension, which contains the application selected from the client's ALPN list. Key-exchange parameters are sent unencrypted in the ClientHello and ServerHello.

Figure 1: The TLS 1.3 handshake.

All of the client's handshake parameters, sensitive or not, are sent in the ClientHello. Looking at Figure 1, you might be able to think of ways of reworking the handshake so that some of them can be encrypted, perhaps at the cost of additional latency (i.e., more round trips over the network). However, extensions like SNI create a kind of "chicken-and-egg" problem.

The client doesn't encrypt anything until it has verified the server's identity (this is the job of the Certificate and CertificateVerify messages) and the server has confirmed that it knows the shared secret (the job of the Finished message). These measures ensure the key exchange is authenticated, thereby preventing monster-in-the-middle (MITM) attacks in which the adversary impersonates the server to the client in a way that allows it to decrypt messages sent by the client. Because SNI is needed by the server to select the certificate, it needs to be transmitted before the key exchange is authenticated.

In general, ensuring confidentiality of handshake parameters used for authentication is only possible if the client and server already share an encryption key. But where might this key come from?

Full handshake encryption in the early days of TLS 1.3. Interestingly, full handshake encryption was once proposed as a core feature of TLS 1.3. In early versions of the protocol (draft-10, circa 2015), the server would offer the client a long-lived public key during the handshake, which the client would use for encryption in subsequent handshakes. (This design came from a protocol called OPTLS, which in turn was borrowed from the original QUIC proposal.) Called "0-RTT", the primary purpose of this mode was to allow the client to begin sending application data prior to completing a handshake. In addition, it would have allowed the client to encrypt its first flight of handshake messages following the ClientHello, including its own EncryptedExtensions, which might have been used to protect the client's sensitive handshake parameters.

Ultimately this feature was not included in the final standard (RFC 8446, published in 2018), mainly because its usefulness was outweighed by its added complexity. In particular, it does nothing to protect the initial handshake in which the client learns the server's public key. Parameters that are required for server authentication of the initial handshake, like SNI, would still be transmitted in the clear.

Nevertheless, this scheme is notable as the forerunner of other handshake encryption mechanisms, like ECH, that use public key encryption to protect sensitive ClientHello parameters. The main problem these mechanisms must solve is key distribution.

Before ECH there was (and is!) ESNI

The immediate predecessor of ECH was the Encrypted SNI (ESNI) extension. As its name implies, the goal of ESNI was to provide confidentiality of the SNI. To do so, the client would encrypt its SNI extension under the server's public key and send the ciphertext to the server. The server would attempt to decrypt the ciphertext using the secret key corresponding to its public key. If decryption were to succeed, then the server would proceed with the connection using the decrypted SNI. Otherwise, it would simply abort the handshake. The high-level flow of this simple protocol is illustrated in Figure 2.

Figure 2: The TLS 1.3 handshake with the ESNI extension. It is identical to the TLS 1.3 handshake, except the SNI extension has been replaced with ESNI.

For key distribution, ESNI relied on another critical protocol: Domain Name Service (DNS). In order to use ESNI to connect to a website, the client would piggy-back on its standard A/AAAA queries a request for a TXT record with the ESNI public key. For example, to get the key for crypto.dance, the client would request the TXT record of _esni.crypto.dance:

$ dig _esni.crypto.dance TXT +short
"/wGuNThxACQAHQAgXzyda0XSJRQWzDG7lk/r01r1ZQy+MdNxKg/mAqSnt0EAAhMBAQQAAAAAX67XsAAAAABftsCwAAA="

The base64-encoded blob contains an ESNI public key and related parameters such as the encryption algorithm.

But what's the point of encrypting SNI if we're just going to leak the server name to network observers via a plaintext DNS query? Deploying ESNI this way became feasible with the introduction of DNS-over-HTTPS (DoH), which enables encryption of DNS queries to resolvers that provide the DoH service (1.1.1.1 is an example of such a service.). Another crucial feature of DoH is that it provides an authenticated channel for transmitting the ESNI public key from the DoH server to the client. This prevents cache-poisoning attacks that originate from the client's local network: in the absence of DoH, a local attacker could prevent the client from offering the ESNI extension by returning an empty TXT record, or coerce the client into using ESNI with a key it controls.

While ESNI took a significant step forward, it falls short of our goal of achieving full handshake encryption. Apart from being incomplete — it only protects SNI — it is vulnerable to a handful of sophisticated attacks, which, while hard to pull off, point to theoretical weaknesses in the protocol's design that need to be addressed.

ESNI was deployed by Cloudflare and enabled by Firefox, on an opt-in basis, in 2018, an experience that laid bare some of the challenges with relying on DNS for key distribution. Cloudflare rotates its ESNI key every hour in order to minimize the collateral damage in case a key ever gets compromised. DNS artifacts are sometimes cached for much longer, the result of which is that there is a decent chance of a client having a stale public key. While Cloudflare's ESNI service tolerates this to a degree, every key must eventually expire. The question that the ESNI protocol left open is how the client should proceed if decryption fails and it can't access the current public key, via DNS or otherwise.

Another problem with relying on DNS for key distribution is that several endpoints might be authoritative for the same origin server, but have different capabilities. For example, a request for the A record of "example.com" might return one of two different IP addresses, each operated by a different CDN. The TXT record for "_esni.example.com" would contain the public key for one of these CDNs, but certainly not both. The DNS protocol does not provide a way of atomically tying together resource records that correspond to the same endpoint. In particular, it's possible for a client to inadvertently offer the ESNI extension to an endpoint that doesn't support it, causing the handshake to fail. Fixing this problem requires changes to the DNS protocol. (More on this below.)

The future of ESNI. In the next section, we'll describe the ECH specification and how it addresses the shortcomings of ESNI. Despite its limitations, however, the practical privacy benefit that ESNI provides is significant. Cloudflare intends to continue its support for ESNI until ECH is production-ready.

The ins and outs of ECH

The goal of ECH is to encrypt the entire ClientHello, thereby closing the gap left in TLS 1.3 and ESNI by protecting all privacy-sensitive handshake-parameters. Similar to ESNI, the protocol uses a public key, distributed via DNS and obtained using DoH, for encryption during the client's first flight. But ECH has improvements to key distribution that make the protocol more robust to DNS cache inconsistencies. Whereas the ESNI server aborts the connection if decryption fails, the ECH server attempts to complete the handshake and supply the client with a public key it can use to retry the connection.

But how can the server complete the handshake if it's unable to decrypt the ClientHello? As illustrated in Figure 3, the ECH protocol actually involves two ClientHello messages: the ClientHelloOuter, which is sent in the clear, as usual; and the ClientHelloInner, which is encrypted and sent as an extension of the ClientHelloOuter. The server completes the handshake with just one of these ClientHellos: if decryption succeeds, then it proceeds with the ClientHelloInner; otherwise, it proceeds with the ClientHelloOuter.

Figure 3: The TLS 1.3 handshake with the ECH extension.

The ClientHelloInner is composed of the handshake parameters the client wants to use for the connection. This includes sensitive values, like the SNI of the origin server it wants to reach (called the backend server in ECH parlance), the ALPN list, and so on. The ClientHelloOuter, while also a fully-fledged ClientHello message, is not used for the intended connection. Instead, the handshake is completed by the ECH service provider itself (called the client-facing server), signaling to the client that its intended destination couldn't be reached due to decryption failure. In this case, the service provider also sends along the correct ECH public key with which the client can retry handshake, thereby "correcting" the client's configuration. (This mechanism is similar to how the server distributed its public key for 0-RTT mode in the early days of TLS 1.3.)

At a minimum, both ClientHellos must contain the handshake parameters that are required for a server-authenticated key-exchange. In particular, while the ClientHelloInner contains the real SNI, the ClientHelloOuter also contains an SNI value, which the client expects to verify in case of ECH decryption failure (i.e., the client-facing server). If the connection is established using the ClientHelloOuter, then the client is expected to immediately abort the connection and retry the handshake with the public key provided by the server. It's not necessary that the client specify an ALPN list in the ClientHelloOuter, nor any other extension used to guide post-handshake behavior. All of these parameters are encapsulated by the encrypted ClientHelloInner.

This design resolves — quite elegantly, I think — most of the challenges for securely deploying handshake encryption encountered by earlier mechanisms. Importantly, the design of ECH was not conceived in a vacuum. The protocol reflects the diverse perspectives of the IETF community, and its development dovetails with other IETF standards that are crucial to the success of ECH.

The first is an important new DNS feature known as the HTTPS resource record type. At a high level, this record type is intended to allow multiple HTTPS endpoints that are authoritative for the same domain name to advertise different capabilities for TLS. This makes it possible to rely on DNS for key distribution, resolving one of the deployment challenges uncovered by the initial ESNI deployment. For a deep dive into this new record type and what it means for the Internet more broadly, check out Alessandro Ghedini's recent blog post on speeding up HTTPS with DNS.

The second is the CFRG's Hybrid Public Key Encryption (HPKE) standard, which specifies an extensible framework for building public key encryption schemes suitable for a wide variety of applications. In particular, ECH delegates all of the details of its handshake encryption mechanism to HPKE, resulting in a much simpler and easier-to-analyze specification. (Incidentally, HPKE is also one of the main ingredients of Oblivious DNS-over-HTTPS.)

The road ahead

The current ECH specification is the culmination of a multi-year collaboration. At this point, the overall design of the protocol is fairly stable. In fact, the next draft of the specification will be the first to be targeted for interop testing among implementations. Still, there remain a number of details that need to be sorted out. Let's end this post with a brief overview of the road ahead.

Resistance to traffic analysis

Ultimately, the goal of ECH is to ensure that TLS connections made to different origin servers behind the same ECH service provider are indistinguishable from one another. In other words, when you connect to an origin behind, say, Cloudflare, no one on the network between you and Cloudflare should be able to discern which origin you reached, or which privacy-sensitive handshake-parameters you and the origin negotiated. Apart from an immediate privacy boost, this property, if achieved, paves the way for the deployment of new features for TLS without compromising privacy.

Encrypting the ClientHello is an important step towards achieving this goal, but we need to do a bit more. An important attack vector we haven't discussed yet is traffic analysis. This refers to the collection and analysis of properties of the communication channel that betray part of the ciphertext's contents, but without cracking the underlying encryption scheme. For example, the length of the encrypted ClientHello might leak enough information about the SNI for the adversary to make an educated guess as to its value (this risk is especially high for domain names that are either particularly short or particularly long). It is therefore crucial that the length of each ciphertext is independent of the values of privacy-sensitive parameters. The current ECH specification provides some mitigations, but their coverage is incomplete. Thus, improving ECH's resistance to traffic analysis is an important direction for future work.

The spectre of ossification

An important open question for ECH is the impact it will have on network operations.

One of the lessons learned from the deployment of TLS 1.3 is that upgrading a core Internet protocol can trigger unexpected network behavior. Cloudflare was one of the first major TLS operators to deploy TLS 1.3 at scale; when browsers like Firefox and Chrome began to enable it on an experimental basis, they observed a significantly higher rate of connection failures compared to TLS 1.2. The root cause of these failures was network ossification, i.e., the tendency of middleboxes — network appliances between clients and servers that monitor and sometimes intercept traffic — to write software that expects traffic to look and behave a certain way. Changing the protocol before middleboxes had the chance to update their software led to middleboxes trying to parse packets they didn't recognize, triggering software bugs that, in some instances, caused connections to be dropped completely.

This problem was so widespread that, instead of waiting for network operators to update their software, the design of TLS 1.3 was altered in order to mitigate the impact of network ossification. The ingenious solution was to make TLS 1.3 "look like" another protocol that middleboxes are known to tolerate. Specifically, the wire format and even the contents of handshake messages were made to resemble TLS 1.2. These two protocols aren't identical, of course — a curious network observer can still distinguish between them — but they look and behave similar enough to ensure that the majority of existing middleboxes don't treat them differently. Empirically, it was found that this strategy reduced the connection failure rate enough to make deployment of TLS 1.3 viable.

Once again, ECH represents a significant upgrade for TLS for which the spectre of network ossification looms large. The ClientHello contains parameters, like SNI, that have existed in the handshake for a long time, and we don't yet know what the impact will be of encrypting them. In anticipation of the deployment issues ossification might cause, the ECH protocol has been designed to look as much like a standard TLS 1.3 handshake as possible. The most notable difference is the ECH extension itself: if middleboxes ignore it — as they should, if they are compliant with the TLS 1.3 standard — then the rest of the handshake will look and behave very much as usual.

It remains to be seen whether this strategy will be enough to ensure the wide-scale deployment of ECH. If so, it is notable that this new feature will help to mitigate the impact of future TLS upgrades on network operations. Encrypting the full handshake reduces the risk of ossification since it means that there are less visible protocol features for software to ossify on. We believe this will be good for the health of the Internet overall.

Conclusion

The old TLS handshake is (unintentionally) leaky. Operational requirements of both the client and server have led to privacy-sensitive parameters, like SNI, being negotiated completely in the clear and available to network observers. The ECH extension aims to close this gap by enabling encryption of the full handshake. This represents a significant upgrade to TLS, one that will help preserve end-user privacy as the protocol continues to evolve.

The ECH standard is a work-in-progress. As this work continues, Cloudflare is committed to doing its part to ensure this important upgrade for TLS reaches Internet-scale deployment.

Roughtime: Securing Time with Digital Signatures

Christopher Patton — Fri, 21 Sep 2018 12:00:00 GMT

When you visit a secure website, it offers you a TLS certificate that asserts its identity. Every certificate has an expiration date, and when it’s passed due, it is no longer valid. The idea is almost as old as the web itself: limiting the lifetime of certificates is meant to reduce the risk in case a TLS server’s secret key is compromised.

Certificates aren’t the only cryptographic artifacts that expire. When you visit a site protected by Cloudflare, we also tell you whether its certificate has been revoked (see our blog post on OCSP stapling) — for example, due to the secret key being compromised — and this value (a so-called OCSP staple) has an expiration date, too.

Thus, to determine if a certificate is valid and hasn’t been revoked, your system needs to know the current time. Indeed, time is crucial for the security of TLS and myriad other protocols. To help keep clocks in sync, we are announcing a free, high-availability, and low-latency authenticated time service called Roughtime, available at roughtime.cloudflare.com on port 2002.

Time is tricky

It may surprise you to learn that, in practice, clients’ clocks are heavily skewed. A recent study of Chrome users showed that a significant fraction of reported TLS-certificate errors are caused by client-clock skew. During the period in which error reports were collected, 6.7% of client-reported times were behind by more than 24 hours. (0.05% were ahead by more than 24 hours.) This skew was a causal factor for at least 33.5% of the sampled reports from Windows users, 8.71% from Mac OS, 8.46% from Android, and 1.72% from Chrome OS. These errors are usually presented to users as warnings that the user can click through to get to where they’re going. However, showing too many warnings makes users grow accustomed to clicking through them; this is risky, since these warnings are meant to keep users away from malicious websites.

Clock skew also holds us back from improving the security of certificates themselves. We’d like to issue certificates with shorter lifetimes because the less time the certificate is valid, the lower the risk of the secret key being exposed. (This is why Let’s Encrypt issues certificates valid for just 90 days by default.) But the long tail of skewed clocks limits the effective lifetime of certificates; shortening the lifetime too much would only lead to more warnings.

Endpoints on the Internet often synchronize their clocks using a protocol like the Network Time Protocol (NTP). NTP aims for precise synchronization, and even accounts for network latency. However, it is usually deployed without security features, as the added overhead on high-load servers degrades precision significantly. As a result, a man-in-the-middle attacker between the client and server can easily influence the client’s clock. By moving the client back in time, the attacker can force it to accept expired (and possibly compromised) certificates; by moving forward in time, it can force the client to accept a certificate that is not yet valid.

Fortunately, for settings in which both security and precision are paramount, workable solutions are on the horizon. But for many applications, precise network time isn’t essential; it suffices to be accurate, say, within 10 seconds of real time. This observation is the primary motivation of Google’s Roughtime protocol, a simple protocol by which clients can synchronize their clocks with one or more authenticated servers. Roughtime lacks the precision of NTP, but aims to be accurate enough for cryptographic applications, and since the responses are authenticated, man-in-the-middle attacks aren’t possible.

The protocol is designed to be simple and flexible. A client can get Roughtime from just one server it trusts, or it may contact many servers to make its calculation more robust. But its most distinctive feature is that it adds accountability to time servers. If a server misbehaves by providing the wrong time, then the protocol allows clients to produce publicly verifiable, cryptographic proof of this misbehavior. Making servers auditable in this manner makes them accountable to provide accurate time.

We are deploying a Roughtime service for two reasons.

First, the clock we use for this service is the same as the clock we use to determine whether our customers’ certificates are valid and haven’t been revoked; as a result, exposing this service makes us accountable for the validity of TLS artifacts we serve to clients on behalf of our customers.

Second, Roughtime is a great idea whose time has come. But it is only useful if several independent organizations participate; the more Roughtime servers there are, the more robust the ecosystem becomes. Our hope is that putting our weight behind it will help the Roughtime ecosystem grow.

The Roughtime protocol

At its most basic level, Roughtime is a one-round protocol in which the client requests the current time and the server sends a signed response. The response is comprised of a timestamp (the number of microseconds since the Unix epoch) and a radius (in microseconds) used to indicate the server’s certainty about the reported time. For example, a radius of 1,000,000μs means the server is reasonably sure that the true time is within one second of the reported time.

The server proves freshness of its response as follows. The request consists of a short, random string commonly called a nonce (pronounced /nän(t)s/, or sometimes /ˈen wən(t)s/). The server incorporates the nonce into its signed response so that it’s needed to verify the signature. If the nonce is sufficiently long (say, 16 bytes), then the number of possible nonces is so large that it’s extremely unlikely the server has encountered (or will ever encounter) a request with the same nonce. Thus, a valid signature serves as cryptographic proof that the response is fresh.

The client uses the server’s root public key to verify the signature. (The key is obtained out-of-band; you can get our key here.) When the server starts, it generates an online public/secret key pair; the root secret key is used to create a delegation for the online public key, and the online secret key is used to sign the response. The delegation serves the same function as a traditional X.509 certificate on the web: as illustrated in the figure below, the client first uses the root public key to verify the delegation, then uses the online public key to verify the response. This allows for operational separation of the delegator and the server and limits exposure of the root secret key.

Simplified Roughtime (without delegation)

Roughtime with delegation

Roughtime offers two features designed to make it scalable. First, when the volume of requests is high, the server may batch-sign a number of clients’ requests by constructing a Merkle tree from the nonces. The server signs the root of the tree and sends in its response the information needed to prove to the client that its request is in the tree. (The data structure is a binary tree, so the amount of information is proportional to the base-2 logarithm of the number of requests in the batch; see the figure below) Second, the protocol is executed over UDP. In order to prevent the Roughtime server from being an amplifier for DDoS attacks, the request is padded to 1KB; if the UDP packet is too short, then it’s dropped without further processing. Check out this blog post for a more in-depth discussion.

Roughtime with batching

Using Roughtime

The protocol is flexible enough to support a variety of use cases. A web browser could use a Roughtime server to proactively synchronize its clock when validating TLS certificates. It could also be used retroactively to avoid showing the user too many warnings: when a certificate validation error occurs — in particular, when the browser believes it’s expired or not yet valid — Roughtime could be used to determine if the clock skew was the root cause. Instead of telling the user the certificate is invalid, it could tell the user that their clock is incorrect.

Using just one server is sufficient if that server is trustworthy, but a security-conscious user could make requests to many servers; the delta might be computed by eliminating outliers and averaging the responses, or by some more sophisticated method. This makes the calculation robust to one or more of the servers misbehaving.

Making servers accountable

The real power of Roughtime is that it’s auditable. Consider the following mode of operation. The client has a list of servers it will query in a particular order. The client generates a random string — called a blind in the parlance of Roughtime — hashes it, and uses the output as the nonce for its request to the server. For subsequent requests, it computes the nonce as follows: generate a blind, compute the hash of this string and the response from the previous server (including the timestamp and signature), and use this hash as the nonce for the next request.

Chaining multiple Roughtime servers

Creating a chain of timestamps in this way binds each response to the response that precedes it. Thus, the sequence of blinds and signatures constitute a publicly verifiable, cryptographic proof that the timestamps were requested in order (a “clockchain” if you will ?). If the servers are roughly synchronized, then we expect that the sequence to monotonically increase, at least roughly. If one of the servers were consistently behind or ahead of the others, then this would be evident in the sequence. Suppose you get the following sequence of timestamps, each from different servers:

Server	Timestamp
ServerA-Roughtime	2018-08-29 14:51:50 -0700 PDT
ServerB-Roughtime	2018-08-29 14:51:51 -0700 PDT +0:00:01
Cloudflare-Roughtime	2018-08-29 12:51:52 -0700 PDT -1:59:59
ServerC-Roughtime	2018-08-29 14:51:53 -0700 PDT +2:00:01

Servers B and C corroborate the time given by server A, but — oh no! Cloudflare is two hours behind! Unless servers A, B, and C are in cahoots, it’s likely that the time offered by Cloudflare is incorrect. Moreover, you have verifiable, cryptographic proof. In this way, the Roughtime protocol makes our server (and all Roughtime servers) accountable to provide accurate time, or, at least, to be in sync with the others.

The Roughtime ecosystem

The infrastructure for monitoring and auditing the Roughtime ecosystem hasn’t been built yet. Right now there’s only a handful of servers: in addition to Cloudflare’s and Google’s, there’s also a really nice Rust implementation. The more diversity there is, the healthier the ecosystem becomes. We hope to see more organizations adopt this protocol.

Cloudflare’s Roughtime service

For the initial deployment of this service, our primary goals are to ensure high availability and minimal maintenance overhead. Each machine at each Cloudflare location executes an instance of the service and responds to queries using its system clock. The server signs each request individually rather than batch-signing them as described above; we rely on our load balancer to ensure no machine is overwhelmed. There are three ways in which we envision this service could be used:

TLS authentication. When a TLS application (a web browser for example) starts, it could make a request to roughtime.cloudflare.com and compute the difference between the reported time and its system time. Whenever it authenticates a TLS server, it would add this difference to the system time to get the current time.
Roughtime daemon. One could implement an OS daemon that periodically requests the current time. If the reported time differs from the system time by more than a second, it might issue an alert.
Server auditing. As the Roughtime ecosystem grows, it will be important to ensure that all of the servers are in sync. Individuals or organizations may take it upon themselves to monitor the ecosystem and ensure that the servers are in sync with one another.

The service is reachable wherever you are via our anycast network. This is important for a service like Roughtime, because minimizing network latency helps improve accuracy. For information about how to configure a client to use Cloudflare-Roughtime, check out the developer documentation. Note that our initial release is somewhat experimental. As such, our root public key may change in the future. See the developer docs for information on obtaining the current public key.

If you want to see what time our Roughtime server returns, click the button below!

function getTime() { document.getElementById("time-txt-box").innerHTML = "Loading..." fetch("/cdn-cgi/trace").then((res) => res.text()).then((txt) => { let ts = txt.match(/ts=([0-9\.]+)/)[1] let str = new Date(parseFloat(ts) * 1000) document.getElementById("time-txt-box").innerHTML = str }) .catch((err) => { console.log(err) document.getElementById("time-txt-box").innerHTML = "Request failed." }) }

Get Time!

Subscribe to the blog for daily updates on our announcements.