
There’s Always Cache in the Banana Stand

by Anthony Davanzo.



We’re happy to announce that we now support all HTTP Cache-Control response directives. This puts powerful control in the hands of you, the people running origin servers around the world. We believe we have the strongest support for Internet standard Cache-Control directives of any large-scale cache on the Internet.

Documentation on Cache-Control is available here.

Cloudflare runs a Content Distribution Network (CDN) across our globally distributed network edge. Our CDN works by caching our customers’ web content at over 119 data centers around the world and serving that content to the visitors nearest to each of our network locations. In turn, our customers’ websites and applications are much faster, more available, and more secure for their end users.

A CDN’s fundamental working principle is simple: storing stuff closer to where it’s needed means it will get to its ultimate destination faster. And, serving something from more places means it’s more reliably available.

Caching

To use a simple banana analogy: say you want a banana. You go to your local fruit stand to pick up a bunch to feed your inner monkey. You expect the store to have bananas in stock, which would satisfy your request instantly. But, what if they’re out of stock? Or what if all of the bananas are old and stale? Then, the store might need to place an order with the banana warehouse. That order might take some time to fill, time you would spend waiting in the store for the banana delivery to arrive. But you don’t want bananas that badly; you’ll probably just walk out and figure out some other way to get your tropical fix.

Now, what if we think about the same scenario in the context of an Internet request? Instead of bananas, you are interested in the latest banana meme. You go to bananameme.com, which sits behind Cloudflare’s edge network. Because a copy of the meme is already in stock at an edge location near you, you get served your meme faster, with no wait for a delivery from the origin!

Of course, there’s a catch. A CDN between your server (the “origin” of your content) and your visitor (the “eyeball” in network engineer slang) might cache content that is out of date or incorrect. There are two ways to manage this:

1) The origin should give the best instructions it can on when to treat content as stale.

2) The origin can tell the edge when it has made a change that makes cached content stale.
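To make the two approaches concrete, here is a minimal sketch of a toy in-memory edge cache. It is an illustration only, not how Cloudflare's edge is implemented: the class name ToyEdgeCache, its methods, and the injected clock are all hypothetical. Expiry by time-to-live models approach 1, and purge models approach 2.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Entry:
    body: str
    stored_at: float  # clock reading when the copy was stored
    ttl: float        # seconds the origin said the copy stays fresh


class ToyEdgeCache:
    """Hypothetical in-memory cache used only to illustrate the two approaches."""

    def __init__(self, fetch_from_origin: Callable[[str], str],
                 clock: Callable[[], float] = time.time) -> None:
        self.fetch_from_origin = fetch_from_origin
        self.clock = clock
        self.entries: Dict[str, Entry] = {}

    def get(self, url: str, ttl: float) -> str:
        entry = self.entries.get(url)
        # Approach 1: the origin's instructions (here, a TTL) decide freshness.
        if entry is not None and self.clock() - entry.stored_at < entry.ttl:
            return entry.body
        # Missing or stale: go back to the origin and store a new copy.
        body = self.fetch_from_origin(url)
        self.entries[url] = Entry(body, self.clock(), ttl)
        return body

    def purge(self, url: str) -> None:
        # Approach 2: the origin tells the edge that the content changed.
        self.entries.pop(url, None)
```

Passing the clock in as a parameter keeps the sketch easy to exercise without waiting for real time to pass.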

Cache-Control headers allow servers and administrators to give explicit instructions to the edge on how to handle content.

Challenges of Storing Ephemeral Content (or: No Stale Bananas)

When using an edge cache like Cloudflare between your origin and visitors, the origin server no longer has direct control over the cached assets being served. Internet standards allow the origin to emit a Cache-Control header with each response it serves. These headers give intermediate and browser caches fine-grained instructions on how content should be cached.
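As a rough illustration of an origin emitting the header, here is a small sketch built on Python's standard http.server module. The handler name MemeHandler, the helper build_cache_control, the placeholder SVG body, the one-hour max-age, and port 8000 are all assumptions made for the example, not anything a real origin is required to use.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_cache_control(max_age_seconds: int, public: bool = True) -> str:
    """Build a simple Cache-Control value (hypothetical helper)."""
    scope = "public" if public else "private"
    return f"{scope}, max-age={max_age_seconds}"


class MemeHandler(BaseHTTPRequestHandler):
    """Hypothetical origin handler that serves one placeholder SVG."""

    def do_GET(self) -> None:
        body = b"<svg xmlns='http://www.w3.org/2000/svg'/>"
        self.send_response(200)
        self.send_header("Content-Type", "image/svg+xml")
        self.send_header("Content-Length", str(len(body)))
        # This is the header that intermediate and browser caches read.
        self.send_header("Cache-Control", build_cache_control(3600))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 8000) -> HTTPServer:
    # Call .serve_forever() on the returned server to start serving.
    return HTTPServer(("127.0.0.1", port), MemeHandler)
```

Running make_server().serve_forever() and requesting any path should return the SVG along with the Cache-Control header the helper built.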

The current RFC covering these directives (and HTTP caching in general) is RFC 7234. It’s worth a skim if you’re into this kind of stuff. The relevant material on Response Cache-Control is laid out in section 5.2.2 of that document. In addition, some interesting extensions to the core directives were defined in RFC 5861 (stale-while-revalidate and stale-if-error), covering how caches should behave when origins are unreachable or while a cached response is being revalidated.
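For the curious, the two RFC 5861 directives, stale-while-revalidate and stale-if-error, each give a cache a window of extra seconds after content goes stale. The function below is a simplified sketch of that decision, with hypothetical names and parameters; it is a reading of the RFC's intent, not a description of any particular cache's implementation.

```python
def may_serve_stored(age: int, max_age: int,
                     stale_while_revalidate: int = 0,
                     stale_if_error: int = 0,
                     origin_error: bool = False) -> bool:
    """Return True if a cache may answer from its stored copy right now.

    All arguments are in seconds. `origin_error` means the attempt to
    reach the origin failed (simplified model of the RFC 5861 windows).
    """
    staleness = age - max_age
    if staleness <= 0:
        return True  # still fresh, no extension needed
    if origin_error:
        # stale-if-error: the stale copy may be used while the origin is failing.
        return staleness <= stale_if_error
    # stale-while-revalidate: serve stale now, revalidate in the background.
    return staleness <= stale_while_revalidate
```

For example, with max-age=600 and stale-if-error=1200, a copy that is 900 seconds old may still be served if the origin is returning errors, but not a copy that is 2000 seconds old.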

To put this in terms of bananas:

George Michael sells bananas at a small stand. He receives a shipment of bananas for resale from Anthony’s Banana Company (ABC) on Monday. Anthony’s Banana Company serves as the origin for bananas for stores spread across the country. ABC is keenly interested in protecting their brand; they want people to associate them with only the freshest, perfectly ripe bananas with no stale or spoiled fruit to their name.

To ensure freshness, ABC provides explicit instructions to its vendors and eaters of its bananas. Bananas can’t be held longer than 3 days before sale to prevent overripening/staleness. Past 3 days, if a customer tries to buy a banana, George Michael must call ABC to revalidate that the bananas are fresh. If ABC can’t be reached, the bananas must not be sold.

To put this in terms of banana meme SVGs:

Kari uses Cloudflare to cache banana meme SVGs at edge locations around the world to reduce visitor latency. Banana memes should only be cached for up to 3 days to prevent the memes from going stale. Past 3 days, if a visitor requests https://bananameme.com/, Cloudflare must make a revalidation request to the bananameme.com origin. If the request to origin fails, Cloudflare must serve the visitor an error page instead of their zesty meme.

If only ABC and Kari had strong support for Cache-Control response headers!

If they did, they could serve their banana-related assets with the following header:

Cache-Control: public, max-age=259200, proxy-revalidate

Public means this banana is allowed to be served from an edge cache. Max-age=259200 means it can stay in cache for up to 3 days (3 days * 24 hours * 60 minutes * 60 seconds = 259200 seconds). Proxy-revalidate means that once that expiration time is up, the edge cache must revalidate the content with the origin before serving it again, no exceptions.
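Here is a small sketch of how a cache might read that header and act on it. The function names parse_cache_control and edge_action and the action strings are hypothetical, and the parser is deliberately naive (it does not handle quoted values that contain commas, and it ignores directives such as s-maxage).

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def edge_action(directives: Dict[str, Optional[str]],
                age_seconds: int, origin_reachable: bool) -> str:
    """Pick what a shared cache does with a stored copy (simplified model)."""
    max_age = int(directives.get("max-age") or 0)
    if age_seconds < max_age:
        return "serve-from-cache"
    if origin_reachable:
        return "revalidate-with-origin"
    # Stale and the origin cannot be reached.
    if "proxy-revalidate" in directives or "must-revalidate" in directives:
        return "serve-error"  # no exceptions: do not sell the old banana
    return "may-serve-stale"


header = "public, max-age=259200, proxy-revalidate"
directives = parse_cache_control(header)
three_days = 3 * 24 * 60 * 60  # 259200 seconds
```

With this header, edge_action(directives, three_days + 1, origin_reachable=False) returns "serve-error", which matches Kari's scenario: past 3 days, if the origin cannot be reached, the visitor gets an error page instead of a stale meme.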

For a full list of supported directives and a lot more examples (but no more bananas), check out the documentation in our Help Center.
