the first blogpost of the adventures in the Post-Quantum land series: a taxonomy of challenges.

At Cloudflare, we help to build a better Internet. In the face of quantum computers and their threat to cryptography, we want to provide protections for this future challenge. The only way that we can change the future is by analyzing and perusing the past. Only in the present, with the past in mind and the future in sight, can we categorize and unfold events. Predicting, understanding and anticipating quantum computers (with the opportunities and challenges they bring) is a daunting task. We can, though, create a taxonomy of these challenges, so the future can be better unrolled.

This is the first blog post in a post-quantum series, where we talk about our past, present and future “adventures in the Post-Quantum land”. We have written about previous post-quantum efforts at Cloudflare, but we think that here first we need to understand and categorize the problem by looking at what we have done and what lies ahead. So, welcome to our adventures!

A taxonomy of the challenges ahead that quantum computers and their threat to cryptography bring (for more information about it, read our other blog posts) could be a good way to approach this problem. This taxonomy should not only focus at the technical level, though. Quantum computers fundamentally change the way certain protocols, properties, storage and retrieval systems, and infrastructure need to work. Following the biological tradition, we can see this taxonomy as the idea of grouping together problems into spaces and the arrangement of them into a classification that helps to understand what and who the problems impact. Following the same tradition, we can use the idea of kingdoms to classify those challenges. The kingdoms are (in no particular order):

  1. Challenges at the Protocols level, Protocolla
  2. Challenges at the implementation level, Implementa
  3. Challenges at the standardization level, Regulae
  4. Challenges at the community level: Communitates
  5. Challenges at the research level, Investigationes

Let’s explore them one by one.

Image of the taxonomy tree of the post-quantum challenges.
The taxonomy tree of the post-quantum challenges.

Challenges at the Protocols level, Protocolla

One conceptual model of the Internet is that it is composed of layers that stack on top of each other, as defined by the Open Systems Interconnection (OSI) model. Communication protocols in each layer enable parties to interact with a corresponding one at the same layer. While quantum computers are a threat to the security and privacy of digital connections, they are not a direct threat to communication itself (we will see, though, that one of the consequences of the existence of quantum computers is how new algorithms that are safe to their attacks can impact the communication itself). But, if the protocol used in a layer aims to provide certain security or privacy properties, those properties can be endangered by a quantum computer. The properties that these protocols often aim to provide are confidentiality (no one can read the content of communications), authentication (communication parties are assured who they are talking to) and integrity (the content of the communication is assured to have not been changed in transit).

Well-known examples of protocols that aim to provide security or privacy properties to protect the different layers are:

We know how to provide those properties (and protect the protocols) against the threat of quantum computers: the solution is to use post-quantum cryptography, and, for this reason, the National Institute of Standards and Technology (NIST) has been running an ongoing process to choose the most suitable algorithms. At this protocol level, then, the problem doesn’t seem to be adding these new algorithms, as nothing impedes it from a theoretical perspective. But protocols and our connections often have other requirements or constraints. Requirements or constraints are, for example, that data to be exchanged fits into a specific packet or segment size. In the case of the Transport Control Protocol (TCP), which ensures that data packets are delivered and received in order, for example, the Maximum Segment SizeMSS — sets the largest segment size that a network-connected device can receive and, if a packet exceeds the MSS, it is dropped (certain middleboxes or software desire all data to fit into a single segment). IPv4 hosts are required to handle an MSS of 536 bytes and IPv6 hosts are required to handle an MSS of 1220 bytes. In the case of DNS, it has primarily answered queries over the User Datagram Protocol (UDP), which limits the answer size to 512 bytes unless the extension EDNS is in use. Larger packets are subject to fragmentation or to the request that TCP should be used: this results in added round-trips, which should be avoided.

Another important requirement is that the operations that algorithms execute (such as multiplication or addition) or the resources they consume (which can be time but also space/memory) are fast enough that connection times are not impacted. This is even more pressing when fast, reliable and cheap Internet access is not available, as access to this “type” of Internet is not globally available.

TLS is a protocol that needs to handle heavy HTTPS traffic load and one that gets impacted by the additional cost that cryptographic operations add. Since 2010, TLS has been deemed “not computationally expensive any more” due to the usage of fast cryptography implementations and abbreviated handshakes (by using resumption, for example). But what if we add post-quantum algorithms into the TLS handshake? Some post-quantum algorithms can be suitable, while others might not be.

Many of the schemes that are part of the third round of the NIST post-quantum process, that can be used for confidentiality, seem to have encryption and decryption performance time comparable (or faster) to fast elliptic-curve cryptography. This in turn means that, from this point of view, they can be practically used. But, what about storage or transmission of the public/private keys that they produce (an attack can be mounted, for example, to force a server into constantly storing big public keys, which can lead to a denial-of-service attack or variants of the SYN flood attack)? What about their impacts on bandwidth and latency? Does having larger keys that span multiple packets (or congestion windows) affect performance especially in scenarios with degraded networks or packet loss?

In 2019, we ran a wide-scale post-quantum experiment with Google’s Chrome Browser team to test these ideas. The goal of that experiment was to assess, in a controlled environment, the impact of adding post-quantum algorithms to the Key Exchange phase of TLS handshakes. This investigation gave us some good insight into what kind of post-quantum algorithms can be used in the Key Agreement part of a TLS handshake (mainly lattice-based schemes, or so it seems) and allowed us to test them in a real-world setting. It is worth noting that these algorithms were tested in a “hybrid mode”: a combination of classical cryptographic algorithms with post-quantum ones.

The key exchange of TLS ensures confidentiality: it is the most pressing task to update it as non-post-quantum traffic captured today can be decrypted by a quantum computer in the future. Luckily, many of the post-quantum Key Encapsulation Mechanisms (KEMs) that are under consideration by NIST seem to be well suited with minimal performance impact. Unfortunately, a problem that has arisen before when considering protocol changes is the presence and behavior of old servers and clients, and of “middleboxes” (devices that manipulate traffic —by inspecting— it for purposes other than continuing the communication process). For instance, some middleboxes assume that the first message of a handshake from a browser fits in a single network packet, which is not the case when adding the majority of post-quantum KEMs. Such false assumptions (“protocol ossification”) are not a problem unique to post-quantum cryptography: the TLS 1.3 standard is carefully crafted to work around quirks of older clients, servers and middleboxes.

While all the data seems to suggest that replacing classical cryptography by post-quantum cryptography in the key exchange phase of TLS handshakes is a straightforward exercise, the problem seems to be much harder for handshake authentication (or for any protocol that aims to give authentication, such as DNSSEC or IPsec). The majority of TLS handshakes achieve authentication by using digital signatures generated via advertised public keys in public certificates (what is called “certificate-based” authentication). Most of the post-quantum signature algorithms currently being considered for standardization in the NIST post-quantum process, have signatures or public keys that are much larger than their classical counterparts. Their operations’ computation time, in the majority of cases, is also much bigger. It is unclear how this will affect the TLS handshake latency and round-trip times, though we have a better insight now in respect to which sizes can be used. We still need to know how much slowdown will be acceptable for early adoption.

There seems to be several ways by which we can add post-quantum cryptography to the authentication phase of TLS. We can:

  • Change the standard to reduce the number of signatures needed.
  • Use different post-quantum signature schemes that fit.
  • Or achieve authentication in a novel way.

On the latter, a novel way to achieve certificate-based TLS authentication is to use KEMs, as their post-quantum versions have smaller sizes than post-quantum signatures. This mechanism is called KEMTLS and we ran a controlled experiment showing that it performs well, even when it adds an extra or full round trip to the handshake (KEMTLS adds half a round trip for server-only authentication and a full round-trip for mutual authentication). It is worth noting that we only experimented with replacing the authentication algorithm in the handshake itself and not all the authentication algorithms needed for the certificate chain. We used a mechanism called “delegated credentials” for this: since we can’t change the whole certificate chain to post-quantum cryptography (as it involves other actors beyond ourselves), we use this short-lived credential that advertises new algorithms. More details around this experiment can be found in our paper.

Lastly, on the TLS front, we wanted to test the notion that having bigger signatures (such as the post-quantum ones) noticeably impacts TLS handshake times. Since it is difficult to deploy post-quantum signatures to real-world connections, we found a way to emulate bigger signatures without having to modify clients. This emulation was done by using dummy data. The result of this experiment showed that even if large signatures fit in the TCP congestion window, there will still be a double-digit percentage slowdown due to the relatively low average Internet speed. This slowdown is a hard sell for browser vendors and for content servers to adopt. The ideal situation for early adoption seems to be that the six signatures and two public keys of the TLS handshake fit together within 9kB (the signatures are: two in the certificate chain, one handshake signature, one OCSP staple and two SCTs for certificate transparency).

After this TLS detour, we can now list the challenges at this kingdom, Protocolla, level. The challenges (in no order in particular) seem to be (divided into sections):

Storage of cryptographic parameters used during the protocol’s execution:

  • How are we going to properly store post-quantum cryptographic parameters, such as keys or certificates, that are generated for/during protocol execution (their sizes are bigger than what we are accustomed to)?
  • How is post-quantum cryptography going to work with stateless servers, ones that do not store session state and where every client request is treated as a new one, such as NFS, Sun’s Network File System (for an interesting discussion on the matter, see this paper)?

Long-term operations and ephemeral ones:

  • What are the impacts of using post-quantum cryptography for long-term operations or for ephemeral ones: will bigger parameters make ephemeral connections a problem?
  • Are security properties assumed in protocols preserved and could we relax others (such as IND-CCA or IND-CPA. For an interesting discussion on the matter, see this paper)?

Managing bigger keys and signatures:

  • What are the impacts on latency and bandwidth?
  • Does the usage of post-quantum increase the roundtrips at the Network layer, for example? And, if so, are these increases tolerable?
  • Will the increased sizes cause dropped or fragmented packets?
  • Devices can occasionally have settings for packets smaller than expected: a router, for example, along a network path can have a maximum transmission unit, MTU (the MSS plus the TCP and IP headers), value set lower than the typical 1,500 bytes. In these scenarios, will post-quantum cryptography make these settings more difficult (one can apply MSS clamping for some cases)?

Preservation of protocols as we know them:

  • Can we achieve the same security or privacy properties as we use them today?
  • Can protocols change: should we change, for example, the way DNSSEC or the PKI work? Can we consider this radical change?
  • Can we integrate and deploy novel ways to achieve authentication?

Hardware (or novel alternative to hardware) usage during protocol’s execution:

Novel attacks:

  • Will post-quantum cryptography increase the possibility of mounting denial of service attacks?

Challenges at the Implementation level, Implementa

The second kingdom that we are going to look at is the one that deals with the implementation of post-quantum algorithms. The ongoing NIST process is standardizing post-quantum algorithms on two fronts: those that help preserve confidentiality (KEMs) and those that provide authentication (signatures). There are other algorithms not currently part of the process that already can be used in a post-quantum world, such as hash-based signatures (for example, XMSS).

What must happen for algorithms to be widely deployed? What are the steps they need to take in order to be usable by protocols or for data at rest? The usual path that algorithms take is:

  1. Standardization: usually by a standardization body. We will talk about it further in the next section.
  2. Efficiency at an algorithmic level: by finding new ways to speed up operations in an algorithm. In the case of Elliptic Curve Cryptography, for example, it happened with the usage of endomorphisms for faster scalar multiplication.
  3. Efficient software implementations: by identifying the pitfalls that cause increase in time or space consumption (in the case of ECC, this paper can illustrate these efforts), and fixing them. An optimal implementation is always dependent on the target, though: where it will be used.
  4. Hardware efficiency: by using, for example, hardware acceleration.
  5. Avoidance of attacks: by looking at the usual pitfalls of algorithms which, in practice, are side-channel attacks.

Implementation of post-quantum cryptography will follow (and is following) the same path. Lattice-based cryptography for KEMs, for example, has taken many steps in order to be faster than ECC (but, from a protocol level perspective, it is inefficient than them, as their parameters are bigger than ECC ones and might cause extra round-trips). Isogeny-based cryptography, on the other hand, is still too slow (due to long isogenies evaluation), but it is an active area of research.

The challenges at this kingdom, Implementa, (in no particular order) level are:

  • Efficiency of algorithms: can we make them faster at the software, hardware (by using  acceleration or FPGA-based research) or at an algorithmic level (with new data structures or parallelization techniques) to meet the requirements of network protocols and ever-fastest connections?
  • Can we use new mechanisms to accelerate algorithms (such as, for example, the usage of floating point numbers as in the Falcon signature scheme)? Will this lead to portability issues as it might be dependent on the underlying architecture?
  • What is the asymptotic complexity of post-quantum algorithms (how they impact time and space)?
  • How will post-quantum algorithms work on embedded devices due to their limited capacity (see this paper for more explanations)?
  • How can we avoid attacks, failures in security proofs and misuse of APIs?
  • Can we provide correct testing of these algorithms?
  • Can we ensure constant-time needs for the algorithms?
  • What will happen in a disaster-recovery mode: what happens if an algorithm is found to be weaker than expected or is fully broken? How will we be able to remove or update this algorithm? How can we make sure there are transition paths to recover from a cryptographical weakening?

At Cloudflare, we have also worked on implementation of post-quantum algorithms. We published our own library (CIRCL) that contains high-speed assembly versions of several post-quantum algorithms (like Kyber, Dilithium, SIKE and CSIDH). We believe that providing these implementations for public use will help others with the transition to post-quantum cryptography by giving easy-to-use APIs that developers can integrate into their projects.

Challenges at the Standards level, Regulae

The third kingdom deals with the standardization process as done by different bodies of organizations (such NIST or the Internet Engineering Task Force — IETF). We have talked a little about the matter in the previous section as it involves the standardization of both protocols and algorithms. Standardization can be a long process due to the need for careful discussion, and this discussion will be needed for the standardization of post-quantum algorithms. Post-quantum cryptography is based on mathematical constructions that are not widely known by the engineering community, which can then lead to difficulty levels when standardizing.

The challenges in this kingdom, Regulae, (in no particular order) are:

  • The mathematical base of post-quantum cryptography is an active area of development and research, and there are some concerns in the security they give (are there new attacks in the confidentiality or authentication they give?). How will standardization bodies approach this problem?
  • Post-quantum cryptography introduces new models in which to analyze the security of algorithms (for example, the usage of the Quantum Random Oracle Model). Will this mean that new attacks or adversaries will not be noted at the standards level?
  • What will be the recommendation of migrating to post-quantum cryptography from the standards' perspective: will we use a hybrid approach?
  • How can we bridge the academic/research community into the standardization community, so analysis of protocols are executed and attacks are found on time (prior to being widely deployed)1? How can we make sure that standards bodies are informed enough to make the right practical/theoretical trade-offs?

At Cloudflare, we are closely collaborating with standardization bodies to prepare the path for post-quantum cryptography (see, for example, the AuthKEM draft at IETF).

Challenges at the Community level, Comunitates

The Internet is not an isolated system: it is a community of different actors coming together to make protocols and systems work. Migrating to post-quantum cryptography means sitting together as a community to update systems and understand the different needs. This is one of the reasons why, at Cloudflare, we are organizing a second installment of the PQNet workshop (expected to be colocated with Real World Crypto 2022 on April 2022) for experts on post-quantum cryptography to talk about the challenges of putting it into protocols, systems and architectures.

The challenges in this kingdom, Comunitates, are:

  • What are the needs of different systems? While we know what the needs of different protocols are, we don’t know exactly how all deployed systems and services work. Are there further restrictions?
  • On certain systems (for example, on the PKI), when will the migration happen, and how will it be coordinated?
  • How will the migration be communicated to the end-user?
  • How will we deprecate pre-quantum cryptography?
  • How will we integrate post-quantum cryptography into systems where algorithms are hardcoded (such as IoT devices)?
  • Who will maintain implementations of post-quantum algorithms and protocols? Is there incentive and funding for a diverse set of interoperable implementations?

Challenges at the Research level, Investigationes

Post-quantum cryptography is an active area of research. This research is not devoted only to how algorithms interact with protocols, systems and architectures (as we have seen), but it is heavily interested at the foundational level. The open challenges on this front are many. We will list four that are of the most interest to us:

  • Are there any efficient and secure post-quantum non-interactive key exchange (NIKE) algorithms?

NIKE is a cryptographic algorithm which enables two participants, who know each others’ public keys, to agree on a shared key, without requiring any interaction. An example of a NIKE is the Diffie-Hellman algorithm. There are no efficient and secure post-quantum NIKEs. A candidate seems to be CSIDH, which is rather slow and whose security is debated.

  • Are there post-quantum alternatives to (V)OPRFs based protocols, such as Privacy Pass or OPAQUE?
  • Are there post-quantum alternatives to other cryptographic schemes such as threshold signature schemes, credential based signatures and more?
  • How can post-quantum algorithms be formally verified with new notions such as the QROM?

Post-Quantum Future

Image of an atom.
The future of Cloudflare is post-quantum.

What is the post-quantum future at Cloudflare? There are many avenues that we explore in this blog series. While all of these experiments have given us some good and reliable information for the post-quantum migration, we further need tests in different network environments and with broader connections. We also need to test how post-quantum cryptography fits into different architectures and systems. We are preparing a bigger, wide-scale post-quantum effort that will give more insight into what can be done for real-world connections.

In this series of blog posts, we will be looking at:

  • What has been happening in the quantum research world in the last years and how it impacts post-quantum efforts.
  • What is a post-quantum algorithm: explanations of KEMs and signatures, and their security properties.
  • How one integrates post-quantum algorithms into protocols.
  • What is formal verification, analysis and implementation, and how it is needed for the post-quantum transition.
  • What does it mean to implement post-quantum cryptography: we will look at our efforts making Logfwdr, Cloudflare Tunnel and Gokeyless post-quantum.
  • What does the future hold for a post-quantum Cloudflare.

See you in the next blogpost and prepare for a better, safer, faster and quantum-protected Internet!

If you are a student enrolled in a PhD or equivalent research program and looking for an internship for 2022, see open opportunities.

If you’re interested in contributing to projects helping Cloudflare, our engineering teams are hiring.

You can reach us with questions, comments, and research ideas at [email protected].

.......

1A success scenario of this is the standardization of TLS 1.3 (in comparison to TLS 1.0-1.2) as it involved the formal verification community, which helped bridge the academic-standards communities to good effect. Read the analysis of this novel process.