The Cloudflare Blog

Integrating Turnstile with the Cloudflare WAF to challenge fetch requests

Adam Martinetti — Mon, 18 Dec 2023 14:00:17 GMT

Two months ago, we made Cloudflare Turnstile generally available, giving website owners everywhere an easy way to fend off bots without ever issuing a CAPTCHA. Turnstile allows any website owner to embed a frustration-free Cloudflare challenge on their website with a simple code snippet, making it easy to help ensure that only human traffic makes it through. In addition to protecting a website’s frontend, Turnstile also empowers web administrators to harden browser-initiated (AJAX) API calls running under the hood. These APIs are commonly used by dynamic single-page web apps, like those created with React, Angular, or Vue.js.

Today, we’re excited to announce that we have integrated Turnstile with the Cloudflare Web Application Firewall (WAF). This means that web admins can add the Turnstile code snippet to their websites, and then configure the Cloudflare WAF to manage these requests. This is completely customizable using WAF Rules; for instance, you can allow a user authenticated by Turnstile to interact with all of an application’s API endpoints without facing any further challenges, or you can configure certain sensitive endpoints, like Login, to always issue a challenge.

Challenging fetch requests in the Cloudflare WAF

Millions of websites protected by Cloudflare’s WAF leverage our JS Challenge, Managed Challenge, and Interactive Challenge to stop bots while letting humans through. For each of these challenges, Cloudflare intercepts the matching request and responds with an HTML page rendered by the browser, where the user completes a basic task to demonstrate that they’re human. When a user successfully completes a challenge, they receive a cf_clearance cookie, which tells Cloudflare that a user has successfully passed a challenge, the type of challenge, and when it was completed. A clearance cookie can’t be shared between users, and is only valid for the time set by the Cloudflare customer in their Security Settings dashboard.

This process works well, except when a browser receives a challenge on a fetch request and the browser has not previously passed a challenge. On a fetch request, or an XMLHttpRequest (XHR), the browser expects to get back simple text (in JSON or XML formats) and cannot render the HTML necessary to run a challenge.

As an example, let’s imagine a pizzeria owner who built an online ordering form in React with a payment page that submits data to an API endpoint that processes payments. When a user views the web form to add their credit card details they can pass a Managed Challenge, but when the user submits their credit card details by making a fetch request, the browser won’t execute the code necessary for a challenge to run. The pizzeria owner’s only option for handling suspicious (but potentially legitimate) requests is to block them, which runs the risk of false positives that could cause the restaurant to lose a sale.

This is where Turnstile can help. Turnstile allows anyone on the Internet to embed a Cloudflare challenge anywhere on their website. Before today, the output of Turnstile was only a one-time use token. To enable customers to issue challenges for these fetch requests, Turnstile can now issue a clearance cookie for the domain that it's embedded on. Customers can issue their challenge within the HTML page before a fetch request, pre-clearing the visitor to interact with the Payment API.

Turnstile Pre-Clearance mode

Returning to our pizzeria example, the three big advantages of using Pre-Clearance to integrate Turnstile with the Cloudflare WAF are:

Improved user experience: Turnstile’s embedded challenge can run in the background while the visitor is entering their payment details.
Blocking more requests at the edge: Because Turnstile now issues a clearance cookie for the domain that it’s embedded on, our pizzeria owner can use a Custom Rule to issue a Managed Challenge for every request to the payment API. This ensures that automated attacks attempting to target the payment API directly are stopped by Cloudflare before they can reach the API.
(Optional) Securing the action and the user: No backend code changes are necessary to get the benefit of Pre-Clearance. However, further Turnstile integration will increase security for the integrated API. The pizzeria owner can adjust their payment form to validate the received Turnstile token, ensuring that every payment attempt is individually validated by Turnstile to protect their payment endpoint from session hijacking.

A Turnstile widget with Pre-Clearance enabled will still issue Turnstile tokens, which gives customers the flexibility to decide whether an endpoint is critical enough to require a security check on every request to it, or just once per session. Clearance cookies issued by a Turnstile widget are automatically applied to the Cloudflare zone the Turnstile widget is embedded on, with no configuration necessary. How long the clearance cookie remains valid is still controlled by the zone-specific “Challenge Passage” time.

Implementing Turnstile with Pre-Clearance

Let’s make this concrete by walking through a basic implementation. Before we start, we’ve set up a simple demo application where we emulate a frontend talking to a backend on a /your-api endpoint.

To this end, we have the following code:




[Code listing not preserved: an HTML page titled "Turnstile Pre-Clearance Demo" with the heading "Pre-clearance Demo".]

We've created a button. When it is clicked, the page makes a fetch() request to the /your-api endpoint and shows the result in the response container.

Now let’s consider that we have a Cloudflare WAF rule set up that protects the /your-api endpoint with a Managed Challenge.

Due to this rule, the app that we just wrote is going to fail for the reason described earlier (the browser is expecting a JSON response, but instead receives the challenge page as HTML).

If we inspect the Network Tab, we can see that the request to /your-api has been given a 403 response.

Upon inspection, the Cf-Mitigated header shows that the response was challenged by Cloudflare’s firewall, as the visitor has not solved a challenge before.

To address this problem in our app, we set up a Turnstile Widget in Pre-Clearance mode for the Turnstile sitekey that we want to use.

In our application, we override the fetch() function to invoke Turnstile once a Cf-Mitigated response has been received.

There is a lot going on in the snippet above: First, we create a hidden overlay element and override the browser’s fetch() function. The fetch() function is changed to introspect the Cf-Mitigated header for ‘challenge’. If a challenge is issued, the initial result will be unsuccessful; instead, a Turnstile overlay (with Pre-Clearance enabled) will appear in our web application. Once the Turnstile challenge has been completed we will retry the previous request after Turnstile has obtained the cf_clearance cookie to get through the Cloudflare WAF.
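A minimal sketch of the override described above, not the demo's actual code. The helper name `withPreClearance` and the `solveChallenge` callback are assumptions made for illustration; the only behavior taken from the text is checking the Cf-Mitigated header for "challenge" and retrying once the challenge has been solved.

```javascript
// Wrap a fetch-like function so that a challenged response triggers a
// Turnstile solve followed by one retry of the same request.
// `solveChallenge` is a hypothetical callback that shows the widget and
// resolves once Turnstile (in Pre-Clearance mode) reports success.
function withPreClearance(originalFetch, solveChallenge) {
  return async function challengedFetch(resource, options) {
    const response = await originalFetch(resource, options);

    // Cloudflare marks challenged responses with the Cf-Mitigated header.
    if (response.headers.get("cf-mitigated") !== "challenge") {
      return response;
    }

    // Wait for the visitor to pass the challenge; the browser then holds
    // the cf_clearance cookie and the retry can get through the WAF.
    await solveChallenge();
    return originalFetch(resource, options);
  };
}

// Browser wiring (not executed here; the overlay selector and sitekey are
// placeholders):
// window.fetch = withPreClearance(window.fetch.bind(window), () =>
//   new Promise((resolve) =>
//     turnstile.render("#turnstile-overlay", {
//       sitekey: "<YOUR_SITEKEY>",
//       callback: resolve,
//     })
//   )
// );
```

Passing the original fetch and the challenge step in as parameters keeps the retry logic independent of the page, so it can be exercised without a browser.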

Upon solving the Turnstile widget, the overlay disappears, and the requested API result is shown successfully:

Pre-Clearance is available to all Cloudflare customers

Every Cloudflare user with a free plan or above can use Turnstile in managed mode free for an unlimited number of requests. If you’re a Cloudflare user looking to improve your security and user experience for your critical API endpoints, head over to our dashboard and create a Turnstile widget with Pre-Clearance today.

Announcing Turnstile, a user-friendly, privacy-preserving alternative to CAPTCHA

Reid Tatoris — Wed, 28 Sep 2022 13:01:00 GMT

Today, we’re announcing the open beta of Turnstile, an invisible alternative to CAPTCHA. Anyone, anywhere on the Internet, who wants to replace CAPTCHA on their site will be able to call a simple API, without having to be a Cloudflare customer or sending traffic through the Cloudflare global network. Sign up here for free.

There is no point in rehashing the fact that CAPTCHA provides a terrible user experience. It's been discussed in detail before on this blog, and countless times elsewhere. The creator of the CAPTCHA has even publicly lamented that he “unwittingly created a system that was frittering away, in ten-second increments, millions of hours of a most precious resource: human brain cycles.” We hate it, you hate it, everyone hates it. Today we’re giving everyone a better option.

Turnstile is our smart CAPTCHA alternative. It automatically chooses from a rotating suite of non-intrusive browser challenges based on telemetry and client behavior exhibited during a session. We talked in an earlier post about how we’ve used our Managed Challenge system to reduce our use of CAPTCHA by 91%. Now anyone can take advantage of this same technology to stop using CAPTCHA on their own site.

UX isn’t the only big problem with CAPTCHA — so is privacy

While having to solve a CAPTCHA is a frustrating user experience, there is also a potential hidden tradeoff a website must make when using CAPTCHA. If you are a small site using CAPTCHA today, you essentially have one option: an 800 pound gorilla with 98% of the CAPTCHA market share. This tool is free to use, but in fact it has a privacy cost: you have to give your data to an ad sales company.

According to security researchers, one of the signals that Google uses to decide if you are malicious is whether you have a Google cookie in your browser, and if you have this cookie, Google will give you a higher score. Google says they don’t use this information for ad targeting, but at the end of the day, Google is an ad sales company. Meanwhile, at Cloudflare, we make money when customers choose us to protect their websites and make their services run better. It's a simple, direct relationship that perfectly aligns our incentives.

Less data collection, more privacy, same security

In June, we announced an effort with Apple to use Private Access Tokens. Visitors using operating systems that support these tokens, including the upcoming versions of macOS or iOS, can now prove they’re human without completing a CAPTCHA or giving up personal data.

By collaborating with third parties like device manufacturers, who already have the data that would help us validate a device, we are able to abstract portions of the validation process, and confirm data without actually collecting, touching, or storing that data ourselves. Rather than interrogating a device directly, we ask the device vendor to do it for us.

Private Access Tokens are built directly into Turnstile. While Turnstile has to look at some session data (like headers, user agent, and browser characteristics) to validate users without challenging them, Private Access Tokens allow us to minimize data collection by asking Apple to validate the device for us. In addition, Turnstile never looks for cookies (like a login cookie), or uses cookies to collect or store information of any kind. Cloudflare has a long track record of investing in user privacy, which we will continue with Turnstile.

We are opening our CAPTCHA replacement to everyone

To improve the Internet for everyone, we decided to open up the technology that powers our Managed Challenge to everyone in beta as a standalone product called Turnstile.

Rather than try to unilaterally deprecate and replace CAPTCHA with a single alternative, we built a platform to test many alternatives and rotate new challenges in and out as they become more or less effective. With Turnstile, we adapt the actual challenge outcome to the individual visitor/browser. First we run a series of small non-interactive JavaScript challenges gathering more signals about the visitor/browser environment. Those challenges include proof-of-work, proof-of-space, probing for web APIs, and various other challenges for detecting browser-quirks and human behavior. As a result, we can fine-tune the difficulty of the challenge to the specific request.

Turnstile also includes machine learning models that detect common features of end visitors who were able to pass a challenge before. The computational hardness of those initial challenges may vary by visitor, but is targeted to run fast.

Swap out your existing CAPTCHA in a few minutes

You can take advantage of Turnstile and stop bothering your visitors with a CAPTCHA even without being on the Cloudflare network. While we make it as easy as possible to use our network, we don't want this to be a barrier to improving privacy and user experience.

To switch from a CAPTCHA service, all you need to do is:

Create a Cloudflare account, navigate to the `Turnstile` tab on the navigation bar, and get a sitekey and secret key.
Copy our JavaScript from the dashboard and paste over your old CAPTCHA JavaScript.
Update the server-side integration by replacing the old siteverify URL with ours.

There is more detail on the process below, including options you can configure, but that’s really it. We’re excited about the simplicity of making a change.

Deployment options and analytics

To use Turnstile, first create an account and get your site and secret keys.

Then, copy and paste our HTML snippet:

Once the script is embedded, you can use implicit rendering. Here, the HTML is scanned for elements that have a cf-turnstile class:

Once a challenge has been solved, a token is injected in your form, with the name cf-turnstile-response. This token can be used with our siteverify endpoint to validate a challenge response. A token can only be validated once; it cannot be redeemed a second time. The validation can be done on the server side or even in the cloud, for example using a simple Workers fetch (see a demo here):

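async function handleRequest() {
    // ... Receive token
    let formData = new FormData();
    formData.append('secret', turnstileSecretKey);
    formData.append('response', receivedToken);

    // Ask siteverify whether the token is valid (each token can be redeemed once).
    const result = await fetch('https://challenges.cloudflare.com/turnstile/v0/siteverify',
        {
            body: formData,
            method: 'POST'
        });
    // The JSON response carries a boolean `success` field.
    const outcome = await result.json();
    if (!outcome.success) {
        // ... Reject the request
    }
    // ...
}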

For more complex use cases, the challenge can be invoked explicitly via JavaScript:
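A hedged sketch of explicit rendering. It assumes the Turnstile script has been loaded in explicit mode (the `?render=explicit` query parameter on api.js), and the function name `renderLoginWidget`, the container selector, and the "login" label are placeholders chosen for this example. The `turnstile` global is passed in as a parameter so the function can also be run against a stub.

```javascript
// Render a widget on demand instead of relying on the cf-turnstile class.
// `turnstile` is the global object defined by api.js in the browser.
function renderLoginWidget(turnstile, container, sitekey, onToken) {
  return turnstile.render(container, {
    sitekey,            // the widget's sitekey from the dashboard
    action: "login",    // optional label to tell widgets apart in analytics
    callback: onToken,  // receives the token once the challenge is solved
  });
}

// Browser usage (not executed here):
// renderLoginWidget(window.turnstile, "#login-widget", "<YOUR_SITEKEY>", (token) => {
//   console.log("Challenge solved, token:", token);
// });
```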

You can also create what we call 'Actions': custom labels that allow you to distinguish between different pages where you're using Turnstile, like a login, checkout, or account creation page.

Once you’ve deployed Turnstile, you can go back to the dashboard and see analytics on where you have widgets deployed, how users are solving them, and view any defined actions.

Why are we giving this away for free?

While this is sometimes hard for people outside to believe, helping build a better Internet truly is our mission. This isn’t the first time we’ve built free tools that we think will make the Internet better, and it won’t be the last. It's really important to us.

So whether or not you’re a Cloudflare customer today, if you’re using a CAPTCHA, try Turnstile for free, instead. You’ll make your users happier, and minimize the data you send to third parties.

Visit this page to sign up for the best invisible, privacy-first, CAPTCHA replacement and to retrieve your Turnstile beta sitekey.

If you want to read more, refer to our documentation.

Using HPKE to Encrypt Request Payloads

Miguel de Moura — Fri, 19 Feb 2021 12:00:00 GMT

The Managed Rules team was recently given the task of allowing Enterprise users to debug Firewall Rules by viewing the part of a request that matched the rule. This makes it easier to determine what specific attacks a rule is stopping or why a request was a false positive, and what possible refinements of a rule could improve it.

The fundamental problem, though, was how to securely store this debugging data as it may contain sensitive data such as personally identifiable information from submissions, cookies, and other parts of the request. We needed to store this data in such a way that only the user who is allowed to access it can do so. Even Cloudflare shouldn't be able to see the data, following our philosophy that any personally identifiable information that passes through our network is a toxic asset.

This means we needed to encrypt the data in such a way that we can allow the user to decrypt it, but not Cloudflare. This means public key encryption.

Now we needed to decide on which encryption algorithm to use. We came up with some questions to help us evaluate which one to use:

What requirements do we have for the algorithm?
What language do we implement it in?
How do we make this as secure as possible for users?

Here's how we made those decisions.

Algorithm Requirements

While we knew we needed to use public key encryption, we also needed to keep an eye on performance. This led us to select Hybrid Public Key Encryption (HPKE) early on as it has a best-of-both-worlds approach to using symmetric as well as public-key cryptography to increase performance. While these best-of-both-worlds schemes aren’t new [1][2][3], HPKE aims to provide a single, future-proof, robust, interoperable combination of a general key encapsulation mechanism and a symmetric encryption algorithm.

HPKE is an emerging standard developed by the Crypto Forum Research Group (CFRG), the research body that supports the development of Internet standards at the IETF. The CFRG produces specifications called RFCs (such as RFC 7748 for elliptic curves) that are then used in higher level protocols including two we talked about previously: ODoH and ECH. Cloudflare has long been a supporter of Internet standards, so HPKE was a natural choice to use for this feature. Additionally, HPKE was co-authored by one of our colleagues at Cloudflare.

How HPKE Works

HPKE combines an asymmetric algorithm such as elliptic curve Diffie-Hellman and a symmetric cipher such as AES. One of the upsides of HPKE is that the algorithms aren't dictated to the implementer, but making a combination that’s provably secure and meets the developer’s intuitive notions of security is important. All too often developers reach for a scheme without carefully understanding what it does, resulting in security vulnerabilities.

HPKE solves these problems by providing a high level of security in a generic manner and providing necessary hooks to tie messages to the context in which they are generated. This is the application of decades of research into the correct security notions and schemes.

HPKE is built in stages. First it turns a Diffie-Hellman key agreement into a Key Encapsulation Mechanism. A key encapsulation mechanism has two algorithms: Encap and Decap. The Encap algorithm creates a symmetric secret and wraps it in a public key, so that only the holder of the private key can unwrap it. An attacker with the encapsulation cannot recover the random key. Decap takes the encapsulation and the private key associated to the public key, and computes the same random key. This translation gives HPKE the flexibility to work almost unchanged with any kind of public key encryption or key agreement algorithm.

HPKE mixes this key with an optional info argument, as well as information relating to the cryptographic parameters used by each side. This ensures that attackers cannot modify messages’ meaning by taking them out of context. A postcard marked “So happy to see you again soon” is ominous from the dentist and endearing from one’s grandmother.
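The Encap and Decap steps, and the way the info argument binds a key to its context, can be illustrated with a simplified sketch built on Node.js's built-in crypto module. This is not an implementation of the HPKE specification: the choice of X25519 and HKDF-SHA256, the function names, and the way the context is assembled are all assumptions made for illustration.

```javascript
const crypto = require("node:crypto");

// Derive a 32-byte symmetric key from a Diffie-Hellman shared secret,
// mixing in the encapsulation and an application-chosen `info` string.
function deriveKey(sharedSecret, enc, info) {
  const salt = Buffer.alloc(0);
  const context = Buffer.concat([enc, Buffer.from(info, "utf8")]);
  return Buffer.from(crypto.hkdfSync("sha256", sharedSecret, salt, context, 32));
}

// Encap: create a fresh ephemeral key pair, agree on a secret with the
// recipient's public key, and return the encapsulation plus the derived key.
function encap(recipientPublicKey, info) {
  const ephemeral = crypto.generateKeyPairSync("x25519");
  const sharedSecret = crypto.diffieHellman({
    privateKey: ephemeral.privateKey,
    publicKey: recipientPublicKey,
  });
  const enc = ephemeral.publicKey.export({ type: "spki", format: "der" });
  return { enc, key: deriveKey(sharedSecret, enc, info) };
}

// Decap: use the recipient's private key and the encapsulation to
// recompute the same derived key.
function decap(enc, recipientPrivateKey, info) {
  const ephemeralPublicKey = crypto.createPublicKey({
    key: enc,
    format: "der",
    type: "spki",
  });
  const sharedSecret = crypto.diffieHellman({
    privateKey: recipientPrivateKey,
    publicKey: ephemeralPublicKey,
  });
  return deriveKey(sharedSecret, enc, info);
}
```

With the same info string, `decap` recovers the key that `encap` produced; with a different info string the derived key changes, which is the property that keeps a message from being reinterpreted outside its original context.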

The specification for HPKE is open and available on the IETF website. It is on its way to becoming an RFC after passing multiple rounds of review and analysis by cryptography experts at the CFRG. HPKE is already gaining adoption in IETF protocols like ODoH, ECH, and the new Messaging Layer Security (MLS) protocol. HPKE is also designed with the post-quantum future in mind, since it is built to work with any KEM, including all the NIST finalists for post-quantum public-key encryption.

Implementation Language

Once we had an encryption scheme selected, we needed to settle on an implementation. HPKE is still fairly new, so the libraries aren't quite mature yet. There is a reference implementation, and we’re in the process of developing an implementation in Go as part of CIRCL. However, in the absence of a clear "go to" that is widely known to be the best, we decided to go with an implementation leveraging the same language already powering much of the Firewall code running at the Cloudflare edge - Rust.

Aside from this, the language benefits from features like native primitives, and crucially the ability to easily compile to WebAssembly (WASM).

As we mentioned in a previous blog post, customers are able to generate a key pair and decrypt payloads either from the dashboard UI or from a CLI. Instead of writing and maintaining two different codebases for these, we opted to reuse the same implementation across the edge component that encrypts the payloads and the UI and CLI that decrypt them. To achieve this we compile our library to target WASM so it can be used in the dashboard UI code that runs in the browser. While this approach may yield a slightly larger JavaScript bundle size and relatively small computational overhead, we found it preferable to spending a significant amount of time securely re-implementing HPKE using JavaScript WebCrypto primitives.

The HPKE implementation we decided on comes with the caveat of not yet being formally audited, so we performed our own internal security review. We analyzed the cryptography primitives being used and the corresponding libraries. Between the composition of said primitives and secure programming practices like correctly zeroing memory and safe usage of random number generators, we found no security issues.

Making It Secure For Users

To encrypt on behalf of users, we need them to provide us with a public key. To make this as easy as possible, we built a CLI tool along with the ability to do it right in the browser. Either option allows the user to generate a public/private key pair without needing to talk to Cloudflare servers at all.

In our API, we specifically do not accept the private key of the key pair — we don't want it! We don't need and don't want to be able to decrypt the data we're storing.

For the dashboard, once the user provides the private key for decryption, the key is held in a temporary JavaScript variable and used for the in-browser decryption. This saves the user from having to provide the key repeatedly while browsing the Firewall event logs. The private key is also not persisted in any way in the browser, so any action that reloads the page, such as refreshing or navigating away, will require the user to provide the key again. We believe this is an acceptable usability compromise for better security.

How Payload Extraction Works

After deciding how to encrypt the data, we just had to figure out the rest of the feature: what data to encrypt, how to store and transmit it, and how to allow users to decrypt it.

When an HTTP request reaches the L7 Firewall, it is evaluated against a set of rulesets. Each of these rulesets contains several rules written in the wirefilter syntax.

An example of one such rule would be:

http.request.version eq "HTTP/1.1"
and
(
    http.request.uri.path matches "\n+."
    or
    http.request.uri.query matches "\x00+."
)

This expression evaluates to a boolean “true” for HTTP/1.1 requests that either contain one or more newlines followed by a character in the request path or one or more NULL bytes followed by a character in the query string.

Say we had the following request that would match the rule above:

GET /cms/%0Aadmin?action=%00post HTTP/1.1
Host: example.com

If matched data logging is enabled, the rules that match are executed again in a special context that tags all fields that are accessed during execution. We run this second pass because tagging adds noticeable computational overhead: since the vast majority of requests don't trigger any rule at all, tagging during the first pass would add that overhead to every request unnecessarily. Requests that do match will typically match only a few rules, so we don't need to re-execute a large portion of the ruleset.

You may notice that although http.request.uri.query matches "\x00+." evaluates to true for this request, it won’t be executed, because the expression short-circuits with the first or condition that also matches. This results in only http.request.version and http.request.uri.path being tagged as accessed:

http.request.version -> HTTP/1.1
http.request.uri.path -> /cms/%0Aadmin
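The effect of short-circuiting on tagging can be reproduced with a small sketch. The `taggingContext` helper and the hand-translated `exampleRule` are hypothetical and are not how the Firewall engine is implemented; the field values are also written already decoded, which is an assumption made so the JavaScript regular expressions can match them directly.

```javascript
// Wrap a request's fields so that every read is recorded.
function taggingContext(fields) {
  const accessed = new Map();
  const get = (name) => {
    accessed.set(name, fields[name]);
    return fields[name];
  };
  return { get, accessed };
}

// Hand-translated version of the example rule. JavaScript's && and ||
// short-circuit, so the query string is only read if the path check fails.
function exampleRule(get) {
  return (
    get("http.request.version") === "HTTP/1.1" &&
    (/\n+./.test(get("http.request.uri.path")) ||
      /\x00+./.test(get("http.request.uri.query")))
  );
}

const context = taggingContext({
  "http.request.version": "HTTP/1.1",
  "http.request.uri.path": "/cms/\nadmin",
  "http.request.uri.query": "action=\x00post",
});
const matched = exampleRule(context.get);
console.log(matched, [...context.accessed.keys()]);
```

Because the path check succeeds first, the query string is never read and does not appear among the tagged fields, even though it would also have matched.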

Having gathered the fields that were accessed, the Firewall engine does some post-processing: removing fields that are a subset of others (e.g., the query string and the full URI), or truncating fields that are beyond a certain character length.
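As a rough illustration of that post-processing step, the sketch below drops any field whose value is fully contained in a longer field and truncates what remains. The function name `postProcess`, the substring-based notion of "subset", and the length limit are all assumptions; the engine's actual rules are not described in this post.

```javascript
// Drop fields whose value is contained in another, longer field, then
// truncate the remaining values to `maxLength` characters.
function postProcess(accessed, maxLength) {
  const entries = Object.entries(accessed);
  const result = {};
  for (const [name, value] of entries) {
    const covered = entries.some(
      ([other, otherValue]) =>
        other !== name &&
        otherValue.length > value.length &&
        otherValue.includes(value)
    );
    if (!covered) {
      result[name] = value.length > maxLength ? value.slice(0, maxLength) : value;
    }
  }
  return result;
}

const processed = postProcess(
  {
    "http.request.version": "HTTP/1.1",
    "http.request.uri": "/cms/admin?action=post",
    "http.request.uri.query": "action=post",
  },
  10
);
console.log(processed);
```

In this example the query string is dropped because the full URI already contains it, and the URI itself is cut to the ten-character limit.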

Finally, these get serialized as JSON, encrypted with the customer's public key, serialized again as a set of bytes, and prefixed with a version number should we need to change/update it in the future. To simplify consumption of these blobs, our APIs display a base64 encoded version of the bytes:

Now that we have encrypted the data at the edge and persisted it in ClickHouse, we need to allow users to decrypt it. As part of the setup of turning this feature on, users generated a key-pair: the public key which was used to encrypt the payloads and a private key which is used to decrypt them. Decryption is done completely offline via either the command line using cloudflare/matched-data-cli:

$ MATCHED_DATA=AkjQDktMX4FudxeQhwa0UPNezhkgLAUbkglNQ8XVCHYqPgAAAAAAAACox6cEwqWQpFVE2gCFyOFsSdm2hCoE0/oWKXZJGa5UPd5mWSRxNctuXNtU32hcYNR/azLjsGO668Jwk+qCdFvmKjEqEMJgI+fvhwLQmm4=
$ matched-data-cli decrypt -d $MATCHED_DATA -k $PRIVATE_KEY
{"http.request.version": "HTTP/1.1", "http.request.uri.path": "/cms/%0Aadmin"}

Or the dashboard UI:

Since our CLI tool is open-source and HPKE is interoperable, it can also be used in other tooling as part of a user's logging pipeline, for example in security information and event management (SIEM) software.

Conclusion

This was a team effort with help from our Research and Security teams throughout the process. We relied on them for recommendations on how best to evaluate the algorithms as well as vetting the libraries we wanted to use.

We're very pleased with how HPKE has worked out for us from an ease-of-implementation and performance standpoint. It was also an easy choice for us to make due to its impending standardization and best-of-both-worlds approach to security.


Making the WAF 40% faster

Miguel de Moura — Wed, 01 Jul 2020 11:00:00 GMT

Cloudflare’s Web Application Firewall (WAF) protects against malicious attacks aiming to exploit vulnerabilities in web applications. It is continuously updated to provide comprehensive coverage against the most recent threats while ensuring a low false positive rate.

As with all Cloudflare security products, the WAF is designed to not sacrifice performance for security, but there is always room for improvement.

This blog post provides a brief overview of the latest performance improvements that were rolled out to our customers. Visit our learning center to learn more about how a Web Application Firewall (WAF) works.

Transitioning from PCRE to RE2

Back in July 2019, the WAF transitioned from using a regular expression engine based on PCRE to one inspired by RE2, which is based around using a deterministic finite automaton (DFA) instead of backtracking algorithms. This change came as a result of an outage where an update added a regular expression which backtracked enormously on certain HTTP requests, resulting in exponential execution time.

After the migration was finished, we saw no measurable difference in CPU consumption at the edge, but noticed that execution time outliers in the 95th and 99th percentiles decreased, something we expected given RE2's guarantee of execution time linear in the size of the input.

As the WAF engine uses a thread pool, we also had to implement and tune a regex cache shared between the threads to avoid excessive memory consumption (the first implementation turned out to use a staggering amount of memory).

These changes, along with others outlined in the postmortem blog post, helped us improve reliability and safety at the edge and have the confidence to explore further performance improvements.

But while we’ve highlighted regular expressions, they are only one of the many capabilities of the WAF engine.

Matching Stages

When an HTTP request reaches the WAF, it gets organized into several logical sections to be analyzed: method, path, headers, and body. These sections are all stored in Lua variables. If you are interested in more detail on the implementation of the WAF itself you can watch this old presentation.

Before matching these variables against specific malicious request signatures, some transformations are applied. These transformations are functions that range from simple modifications like lowercasing strings to complex tokenizers and parsers looking to fingerprint certain malicious payloads.

As the WAF currently uses a variant of the ModSecurity syntax, this is what a rule might look like:

SecRule REQUEST_BODY "@rx /\x00+evil" "drop, t:urlDecode, t:lowercase"

It takes the request body stored in the REQUEST_BODY variable, applies the urlDecode() and lowercase() functions to it and then compares the result with the regular expression signature /\x00+evil.

In pseudo-code, we can represent it as:

rx( "/\x00+evil", lowercase( urlDecode( REQUEST_BODY ) ) )

Which in turn would match a request whose body contained a slash and percent-encoded NULL bytes followed by the word "evil", e.g.:

POST /cms/admin?action=post HTTP/1.1
Host: example.com
Content-Type: text/plain; charset=utf-8
Content-Length: 16

thiSis%2F%00eVil

The WAF contains thousands of these rules and its objective is to execute them as quickly as possible to minimize any added latency to a request. And to make things harder, it needs to run most of the rules on nearly every request. That’s because almost all HTTP requests are non-malicious and no rules are going to match. So we have to optimize for the worst case: execute everything!

To help mitigate this problem, one of the first matching steps executed for many rules is pre-filtering. By checking if a request contains certain bytes or sets of strings we are able to potentially skip a considerable number of expressions.

In the previous example, doing a quick check for the NULL byte (represented by \x00 in the regular expression) allows us to completely skip the rule if it isn’t found:

contains( "\x00", REQUEST_BODY )
and
rx( "/\x00+evil", lowercase( urlDecode( REQUEST_BODY ) ) )

Since most requests don’t match any rule and these checks are quick to execute, overall we aren’t doing more operations by adding them.
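The rule and its pre-filter can be sketched in JavaScript as follows. The function names mirror the pseudo-code but are otherwise hypothetical, and the `regexRuns` counter exists only to show when the regular expression is skipped. One assumption is made explicit: the pre-filter here inspects the URL-decoded body so that a percent-encoded %00 is still seen, which the pseudo-code above does not specify.

```javascript
let regexRuns = 0;

// Percent-decode the input; malformed sequences are left untouched.
function urlDecode(input) {
  try {
    return decodeURIComponent(input);
  } catch {
    return input;
  }
}

function lowercase(input) {
  return input.toLowerCase();
}

// Run the (comparatively expensive) regular expression and count the call.
function rx(pattern, input) {
  regexRuns += 1;
  return pattern.test(input);
}

function ruleMatches(requestBody) {
  const decoded = urlDecode(requestBody);
  // Pre-filter: without a NULL byte the signature cannot match, so skip it.
  if (!decoded.includes("\x00")) {
    return false;
  }
  return rx(/\/\x00+evil/, lowercase(decoded));
}

console.log(ruleMatches("thiSis%2F%00eVil"));          // malicious example body
console.log(ruleMatches("name=alice&topping=cheese")); // benign body, regex skipped
console.log(regexRuns);
```

Only the first body reaches the regular expression; the benign body is rejected by the cheap substring check alone.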

Other steps can also be used to scan through and combine several regular expressions and avoid execution of rule expressions. As usual, doing less work is often the simplest way to make a system faster.

Memoization

Which brings us to memoization - caching the output of a function call to reuse it in future calls.

Let’s say we have the following expressions:

1. rx( "\x00+evil", lowercase( url_decode( body ) ) )
2. rx( "\x00+EVIL", remove_spaces( url_decode( body ) ) )
3. rx( "\x00+evil", lowercase( url_decode( headers ) ) )
4. streq( "\x00evil", lowercase( url_decode( body ) ) )

In this case, we can reuse the result of the nested function calls in (1), as they’re the same in (4). By saving intermediate results we are also able to take advantage of the result of url_decode( body ) from (1) and use it in (2) and (4). Sometimes it is also possible to swap the order in which functions are applied to improve caching, though in this case we would get different results.

A naive implementation of this system can simply be a hash table with each entry having the function(s) name(s) and arguments as the key and its output as the value.
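A minimal sketch of such a hash table, applied to the four expressions above. The `makeMemoizer` helper, the key format, and the sample inputs are assumptions for illustration; the cache is keyed by variable name, so this version would need a fresh memoizer for every request.

```javascript
// Cache every intermediate result of a transformation chain, keyed by the
// nested call it represents, e.g. "lowercase(url_decode(body))".
function makeMemoizer(transforms) {
  const cache = new Map();
  const stats = { hits: 0, misses: 0 };

  // `names` lists the functions innermost first.
  function apply(names, variableName, variables) {
    let value = variables[variableName];
    let key = variableName;
    for (const name of names) {
      key = `${name}(${key})`;
      if (cache.has(key)) {
        stats.hits += 1;
        value = cache.get(key);
      } else {
        stats.misses += 1;
        value = transforms[name](value);
        cache.set(key, value);
      }
    }
    return value;
  }

  return { apply, stats };
}

const memo = makeMemoizer({
  url_decode: (s) => decodeURIComponent(s),
  lowercase: (s) => s.toLowerCase(),
  remove_spaces: (s) => s.replace(/\s+/g, ""),
});
const variables = { body: "a%00EVIL", headers: "x-test: %00Evil" };

memo.apply(["url_decode", "lowercase"], "body", variables);     // (1)
memo.apply(["url_decode", "remove_spaces"], "body", variables); // (2)
memo.apply(["url_decode", "lowercase"], "headers", variables);  // (3)
memo.apply(["url_decode", "lowercase"], "body", variables);     // (4)
console.log(memo.stats);
```

Across the four expressions the transformations run five times instead of eight: (2) reuses url_decode( body ) from (1), and (4) reuses both nested results.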

Some of these functions are expensive and caching the result does lead to significant savings. To give a sense of magnitude, one of the rules we modified to ensure memoization took place saw its execution time reduced by about 95%:

Execution time per rule

The WAF engine implements memoization and the rules take advantage of it, but there’s always room to increase cache hits.

Rewriting Rules and Results

Cloudflare has a very regular cadence of releasing updates and new rules to the Managed Rulesets. However, as more rules are added and new functions implemented, the memoization cache hit rate tends to decrease.

To improve this, we first looked into the rules taking the most wall-clock time to execute using some of our performance metrics:

Execution time per rule

Having these, we cross-referenced them with the ones having cache misses (output is truncated with [...]):

$ ./parse.py --profile
Hit Ratio:
-------------
0.5608

Hot entries:
-------------
[urlDecode, replaceComments, REQUEST_URI, REQUEST_HEADERS, ARGS_POST]
[urlDecode, REQUEST_URI]
[urlDecode, htmlEntityDecode, jsDecode, replaceNulls, removeWhitespace, REQUEST_URI, REQUEST_HEADERS]
[urlDecode, lowercase, REQUEST_FILENAME]
[urlDecode, REQUEST_FILENAME]
[urlDecode, lowercase, replaceComments, compressWhitespace, ARGS, REQUEST_FILENAME]
[urlDecode, replaceNulls, removeWhitespace, REQUEST_URI, REQUEST_HEADERS, ARGS_POST]
[...]

Candidates:
-------------
100152A - replace t:removeWhitespace with t:compressWhitespace,t:removeWhitespace
100214 - replace t:lowercase with (?i)
100215 - replace t:lowercase with (?i)
100300 - consider REQUEST_URI over REQUEST_FILENAME
100137D - invert order of t:replaceNulls,t:lowercase
[...]

After identifying more than 40 rules, we rewrote them to take full advantage of memoization and added pre-filter checks where possible. Many of these changes were not immediately obvious, which is why we’re also creating tools to aid analysts in creating even more efficient rules. This also helps ensure they run in accordance with the latency budgets the team has set.
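To make one of the candidate rewrites concrete, the sketch below compares a rule that lowercases its input with one that uses a case-insensitive pattern instead. JavaScript expresses `(?i)` as the `i` flag; the signature and sample inputs are made up for this example. One plausible reading of the suggestion is that dropping t:lowercase lets the rule share the cached value of its remaining transformations with other rules, though the tool's output does not say so explicitly.

```javascript
const signature = /\x00+evil/;
const caseInsensitiveSignature = /\x00+evil/i;

// Original form: lowercase the input, then match a lowercase-only pattern.
function matchWithLowercase(input) {
  return signature.test(input.toLowerCase());
}

// Rewritten form: no lowercase transformation, case handled by the pattern.
function matchCaseInsensitive(input) {
  return caseInsensitiveSignature.test(input);
}

const samples = ["a\x00EVIL", "a\x00evil", "nothing here", "\x00\x00EvIl"];
for (const sample of samples) {
  console.log(matchWithLowercase(sample) === matchCaseInsensitive(sample));
}
```

The two forms agree on these ASCII samples; case mapping for non-ASCII input can differ between a lowercase transformation and a case-insensitive match, so a rewrite like this would still need checking rule by rule.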

This change resulted in an increase of the cache hit percentage from 56% to 74%, which crucially included the most expensive transformations.

Most importantly, we also observed a sharp decrease of 40% in the average time the WAF takes to process and analyze an HTTP request at the Cloudflare edge.

WAF Request Processing - Time Average

A comparable decrease was also observed for the 95th and 99th percentiles.

Finally, we saw a drop in CPU consumption at the edge of around 4.3%.

Next Steps

While the Lua WAF has served us well throughout all these years, we are currently porting it to use the same engine powering Firewall Rules. It is based on our open-sourced wirefilter execution engine, which uses a filter syntax inspired by Wireshark®. In addition to allowing more flexible filter expressions, it provides better performance and safety.

The rule optimizations we've described in this blog post are not lost when moving to the new engine, however, as the changes were deliberately not specific to the current Lua engine’s implementation. And while we're routinely profiling, benchmarking and making complex optimizations to the Firewall stack, sometimes just relatively simple changes can have a surprisingly huge effect.