Subscribe to receive notifications of new posts:

Optimising Caching on Pwned Passwords (with Workers)

2018-08-09

5 min read

In February, Troy Hunt unveiled Pwned Passwords v2. Containing over half a billion real world leaked passwords, this database provides a vital tool for correcting the course of how the industry combats modern threats against password security.

In supporting this project; I built a k-Anonymity model to add a layer of security to performed queries. This model allows for enhanced caching by mapping multiple leaked password hashes to a single hash prefix and additionally being performed in a deterministic HTTP-friendly way (which allows caching whereas other implementations of Private Set Intersection require a degree of randomness).

Since launch, PwnedPasswords, using this anonymity model and delivered by Cloudflare, has been implemented in a widespread way across a wide variety of platforms - from site like EVE Online and Kogan to tools like 1Password and Okta's PassProtect. The anonymity model is also used by Firefox Monitor when checking if an email is in a data breach.

Since it has been adopted, Troy has tweeted out about the high cache hit ratio; and people have been asking me about my "secret ways" of gaining such a high cache hit ratio. Over time I touched various pieces of Cloudflare's caching systems; in late 2016 I worked to bring Bypass Cache on Cookie functionality to our self-service Business plan users and wrestled with cache implications of CSRF tokens - however Pwned Passwords was far more fun to help show the power of Cloudflare's cache functionality from the perspective of a user.

Looks like Pwned Passwords traffic has started to double over the norm, trending around 8M requests a day now. @IcyApril made a cache change to improve stability but reduce hit ratio around the 10th, but that's improving again now with higher volumes (94% for the last week). pic.twitter.com/HwMDLlmBEY

— Troy Hunt (@troyhunt) June 25, 2018

Will @IcyApril secret ways ever be released?!

— Neal (@tun35) May 7, 2018

It is worth noting that PwnedPasswords is not like a typical website in terms of caching - it contains 16^5 possible API queries (any possible form of five hexadecimal charecters, in total over a million possible queries) in order to guarantee k-Anonymity in the API. Whilst the API guarantees k-Anonymity, it does not guarantee l-Diversity, meaning individual queries can occur more than others.

For ordinary websites, with fewer assets, the cache hit ratio can be far greater. An example of this is another site Troy set-up using our barebones free plan; by simply configuring a Page Rule with the Cache Everything option (and setting an Edge Cache TTL option, should the Cache-Control headers from your origin not do so), you are able to cache static HTML easily.

When I've written about really high cache-hit ratios on @haveibeenpwned courtesy of @Cloudflare, some people have suggested it's due to higher-level plans. Here's https://t.co/Y4GlsInvu2 running on the *free* plan: 99.0% cache hit ratio on requests and 99.5% on bandwidth. Free! pic.twitter.com/pP0wo7qKF3

— Troy Hunt (@troyhunt) July 31, 2018

Origin Headers

Indeed, the fact the queries are usually API queries makes a substantial difference. When optimising caching, the most important thing to look for is instances where the same cache asset is stored multiple times for different cache keys; for some assets this may involve selectively ignoring query strings for cache purposes, but for APIs the devil is more in the detail.

When a HTTP request is made from a JavaScript asset (as is done when PwnedPasswords is directly implemented in login forms) - the site will also send an Origin header to indicate where a fetch originates from.

When you make a search on haveibeenpwned.com/Passwords, there's a bit of JavaScript which takes the password and applies the k-Anonymity model by SHA-1 hashing the password and truncating the hash to the first five charecters and sending that request off to https://api.pwnedpasswords.com/range/A94A8 (then performing a check to see if any of the contained suffixes are in the response).

In the headers of this request to PwnedPasswords.com, you can see the request contains an Origin header of the querying site.

PwnedPasswords Headers

This header is often useful for mitigating Cross-Site Request Forgery (CSRF) vulnerabilities by only allowing certain Origins to make HTTP requests using Cross-Origin Resource Sharing (CORS).

In the context of an API, this does not nessecarily make sense where there is no state (i.e. cookies). However, Cloudflare's default Cache Key contains this header for those who wish to use it. This means, Cloudflare will store a new cached copy of the asset whenever a different Origin header is present. Whilst this is ordinarily not a problem (most sites have one Origin header, or just a handful when using CORS), PwnedPasswords has Origin headers coming from websites all over the internet.

As Pwned Passwords will always respond with the same for a given request, regardless of the Origin header - we are able to remove this header from the Cache Key using our Custom Cache Key functionality.

Incidently, JavaScript CDNs will frequently be requested to fetch assets as sub-resources from another JavaScript asset - removing the Origin header from their Cache Key can have similar benefits:

Just applied some @Cloudflare cache magic I experimented with to get @troyhunt's Pwned Passwords API cache hit ratio to ~91%, to a large JS CDN (@unpkg) during a slow traffic period. Traffic 30mins post deploy shows a growing ~94% Cache Hit Ratio (with a planned cache purge!). pic.twitter.com/ZQmfzEi4Y2

— Junade Ali (@IcyApril) May 6, 2018

Case Insensitivity

One thing I realised after speaking to Stefán Jökull Sigurðarson from EVE Online was that different users were querying assets using different casing; for example, instead of range/A94A8 - a request to range/a94a8 would result in the same asset. As the Cache Key accounted for case sensitivity, the asset would be cached twice.

Unfortuantely, the API was already public with both forms of casing being acceptable once I started these optimisations.

Enter Cloudflare Workers

Instead of adjusting the cache key to solve this problem, I decided to use Cloudflare Workers - allowing me to adjust cache behaviour using JavaScript.

Troy initially had a simple worker on the site to enable CORS:

addEventListener('fetch', event => {
    event.respondWith(checkAndDispatchReports(event.request))
})

async function checkAndDispatchReports(req) {
    if(req.method === 'OPTIONS') {
        let responseHeaders = setCorsHeaders(new Headers())
        return new Response('', {headers:responseHeaders})
    } else {
        return await fetch(req)
    }
}

function setCorsHeaders(headers) {
    headers.set('Access-Control-Allow-Origin', '*')
    headers.set('Access-Control-Allow-Methods', 'GET')
    headers.set('Access-Control-Allow-Headers', 'access-control-allow-headers')
    headers.set('Access-Control-Max-Age', 1728000)
    return headers
}

I added to this worker to ensure that when a request left Workers, the hash prefix would always be upper case, additionally I used the cacheKey flag to allow the Cache Key to be set directly in Workers when making the request (instead of using our internal Custom Cache Key configuration):

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request));
})

/**
 * Fetch request after making casing of hash prefix uniform
 * @param {Request} request
 */
async function handleRequest(request) {
      
  if(request.method === 'OPTIONS') {
    let responseHeaders = setCorsHeaders(new Headers())
    return new Response('', {headers:responseHeaders})
  }

  const url = new URL(request.url);

  if (!url.pathname.startsWith("/range/")) {
    const response = await fetch(request)
    return response;
  }

  const prefix = url.pathname.substr(7);
  const newRequest = "https://api.pwnedpasswords.com/range/" + prefix.toUpperCase()

  if (prefix === prefix.toUpperCase()) {
    const response = await fetch(request, { cf: { cacheKey: newRequest } })
    return response;
  }

  const init = {
      method: request.method,
      headers: request.headers
  }
  
  const modifiedRequest = new Request(newRequest, init)
  const response = await fetch(modifiedRequest, { cf: { cacheKey: newRequest } })
  return response
}

function setCorsHeaders(headers) {
    headers.set('Access-Control-Allow-Origin', '*')
    headers.set('Access-Control-Allow-Methods', 'GET')
    headers.set('Access-Control-Allow-Headers', 'access-control-allow-headers')
    headers.set('Access-Control-Max-Age', 1728000)
    return headers
}

Incidentially, our Workers team are working on some really cool stuff around controlling our cache APIs at a fine grained level, you'll be able to see some of that stuff in due course by following this blog.

Argo

Finally, Argo plays an important part in improving Cache Hit ratio. Once toggled on, it is known for optimising speed at which traffic travels around the internet - but it also means that when traffic is routed from one Cloudflare data center to another, if an asset is cached closer to the origin web server, the asset will be served from that data center. In essence, it offers Tiered Cache functionality; by making sure when traffic comes from a less used Cloudflare data center, it can still utilise the cache from a data center recieving greater traffic (and more likely to have an asset in cache). This prevents an asset from having to travel all the way around the world whilst still being served from cache (even if not optimally close to the user).

Argo Infographic

Conclusion

By using Cloudflare's caching functionality, we are able to reduce the amount of times a single asset is in cache by accidental variations in the request parameters. Workers offers a mechanism to control the cache of assets on Cloudflare, with more fine-grained controls under active development.

By implementing this on Pwned Passwords; we are able to provide developers a simple and fast interface to reduce password reuse amonst their users, thereby limiting the effects of Credential Stuffing attacks on their system. If only Irene Adler had used a password manager:

Interested in helping debug performance, cache and security issues for websites of all sizes? We're hiring for Support Engineers to join us in London, and additionally those speaking Japanese, Korean or Mandarin in our Singapore office.

Cloudflare's connectivity cloud protects entire corporate networks, helps customers build Internet-scale applications efficiently, accelerates any website or Internet application, wards off DDoS attacks, keeps hackers at bay, and can help you on your journey to Zero Trust.

Visit 1.1.1.1 from any device to get started with our free app that makes your Internet faster and safer.

To learn more about our mission to help build a better Internet, start here. If you're looking for a new career direction, check out our open positions.
Cloudflare WorkersCacheSpeed & ReliabilityServerlessDeveloper PlatformDevelopers

Follow on X

Junade Ali|@IcyApril
Cloudflare|@cloudflare

Related posts

October 31, 2024 1:00 PM

Moving Baselime from AWS to Cloudflare: simpler architecture, improved performance, over 80% lower cloud costs

Post-acquisition, we migrated Baselime from AWS to the Cloudflare Developer Platform and in the process, we improved query times, simplified data ingestion, and now handle far more events, all while cutting costs. Here’s how we built a modern, high-performing observability platform on Cloudflare’s network....

October 24, 2024 1:05 PM

Build durable applications on Cloudflare Workers: you write the Workflows, we take care of the rest

Cloudflare Workflows is now in open beta! Workflows allows you to build reliable, repeatable, long-lived multi-step applications that can automatically retry, persist state, and scale out. Read on to learn how Workflows works, how we built it on top of Durable Objects, and how you can deploy your first Workflows application....