Today, I’m delighted to announce Cache Analytics: a new tool that lets you explore in depth what Cloudflare’s caching and content delivery services are doing for your web presence.
Caching is the most effective way to improve the performance and economics of serving your website to the world. Unsurprisingly, customers consistently ask us how they can optimize their cache performance to get the most out of Cloudflare.
With Cache Analytics, it’s easier than ever to learn how to speed up your website, and reduce traffic sent to your origin. Some of my favorite capabilities include:
See what resources are missing from cache, expired, or never eligible for cache in the first place
Slice and dice your data as you see fit: filter by hostnames, or see a list of top URLs that miss cache
Switch between views of requests and data transfer to understand both performance and cost
An overview of Cache Analytics
Cache Analytics is available today for all customers on our Pro, Business, and Enterprise plans.
In this blog post, I’ll explain why we built Cache Analytics and how you can get the most out of it.
Why do we need analytics focused on caching?
If you want to scale the delivery of a fast, high-performance website, then caching is critical. Caching has two main goals:
First, caching improves performance. Cloudflare data centers are within 100ms of 90% of the planet; putting your content in Cloudflare’s cache gets it physically closer to your customers and visitors, so they will see your website faster when they request it! (Plus, reading assets from our edge SSDs is much faster than waiting for origins to generate a response.)
Second, caching helps reduce bandwidth costs associated with operating a presence on the Internet. Origin data transfer is one of the biggest expenses of running a web service, so serving content out of Cloudflare’s cache can significantly reduce costs incurred by origin infrastructure.
Because it’s not safe to cache all content (we wouldn’t want to cache your bank balance by default), Cloudflare relies on customers to tell us what’s safe to cache with HTTP Cache-Control headers and page rules. But even with page rules, it can be hard to understand what’s actually getting cached — or more importantly, what’s not getting cached, and why. Is a resource expired? Or was it even eligible for cache in the first place?
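To make the Cache-Control idea concrete, here is a minimal sketch of how an origin might decide which header value to send. The function name, the set of static content types, and the TTL are my own illustrative assumptions, not a Cloudflare API or a recommended policy.

```python
# Hypothetical origin-side policy; the names and values are illustrative only.
STATIC_TYPES = {"text/css", "application/javascript", "image/png", "image/jpeg"}


def cache_control_for(content_type: str, is_personalized: bool) -> str:
    """Return a Cache-Control header value for a response."""
    if is_personalized:
        # Per-user data (such as a bank balance) must never be stored by a shared cache.
        return "private, no-store"
    if content_type in STATIC_TYPES:
        # Static assets: any cache may keep them for a day (86400 seconds).
        return "public, max-age=86400"
    # Everything else: may be stored, but must be revalidated with the origin before reuse.
    return "no-cache"


print(cache_control_for("image/png", is_personalized=False))  # public, max-age=86400
print(cache_control_for("text/html", is_personalized=True))   # private, no-store
```

The point of the sketch is that the origin, not the cache, knows which responses are safe to share.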
Faster or cheaper? Why not both!
Cache Analytics was designed to help users understand how Cloudflare’s cache is performing, but it can also be used as a general-purpose analytics tool. Here I’ll give a quick walkthrough of the interface.
First, at the top-left, you should decide if you want to focus on requests or data transfer.
Cache Analytics enables you to toggle between views of requests and data transfer.
As a general rule, the requests view (the default) is more useful for understanding performance, because every request that misses cache results in a performance hit. The data transfer view is more useful for understanding cost, because most hosts charge for every byte that leaves their network, so every gigabyte served by Cloudflare translates into money saved at the origin.
You can always toggle between these two views while keeping filters enabled.
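The two views can tell very different stories about the same traffic. As a rough sketch, assuming a hypothetical list of request records (the `Request` fields below are made up for illustration and are not the Cache Analytics data model), the ratios could be computed like this:

```python
from dataclasses import dataclass


@dataclass
class Request:
    url: str
    bytes_sent: int
    served_by_cloudflare: bool  # hypothetical flag: answered without contacting the origin


def cache_ratios(requests):
    """Return (share of requests, share of bytes) served by Cloudflare."""
    total_requests = len(requests)
    total_bytes = sum(r.bytes_sent for r in requests)
    edge = [r for r in requests if r.served_by_cloudflare]
    request_ratio = len(edge) / total_requests if total_requests else 0.0
    byte_ratio = sum(r.bytes_sent for r in edge) / total_bytes if total_bytes else 0.0
    return request_ratio, byte_ratio


sample = [
    Request("/app.js", 50_000, True),
    Request("/logo.png", 20_000, True),
    Request("/style.css", 10_000, True),
    Request("/video.mp4", 920_000, False),
]
print(cache_ratios(sample))  # (0.75, 0.08)
```

In this made-up sample, three out of four requests are served from cache, yet only 8% of the bytes are, because one large uncached file dominates data transfer.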
A filter for every occasion
Let’s say you’re focused on improving the performance of a specific subdomain on your zone. Cache Analytics allows flexible filtering of the data that’s important to you:
Cache Analytics enables flexible filtering of data.
Filtering is essential for zooming in on the chunk of traffic that you’re most interested in. You can filter by cache status, hostname, path, content type, and more. This is helpful, for example, if you’re trying to reduce data transfer for a specific subdomain, or are trying to tune the performance of your HTML pages.
Seeing the big picture
When analyzing traffic patterns, it’s essential to understand how things change over time. Perhaps you just applied a configuration change and want to see the impact, or just launched a big sale on your e-commerce site.
“Served by Cloudflare” indicates traffic that we were able to serve from our edge without reaching your origin server. “Served by Origin” indicates traffic that was proxied back to origin servers. (It can be really satisfying to add a page rule and see the amount of traffic “Served by Cloudflare” go up!)
Note that this graph will change significantly when you switch between “Requests” and “Data Transfer.” Revalidated requests are particularly interesting: because Cloudflare checks with the origin before returning a result from cache, these count as “Served by Cloudflare” in the “Data Transfer” view, but as “Served by Origin” in the “Requests” view.
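The rule for revalidated requests can be written down as a small lookup. This is only a sketch of the behavior described above: the function name and the view identifiers are assumptions, and it covers just the statuses discussed in this post.

```python
def served_by(cache_status: str, view: str) -> str:
    """Classify a request as served by 'cloudflare' or 'origin' for a given view.

    `view` is either "requests" or "data_transfer" (hypothetical identifiers).
    """
    if view not in ("requests", "data_transfer"):
        raise ValueError(f"unknown view: {view!r}")
    status = cache_status.lower()
    if status == "hit":
        return "cloudflare"
    if status == "revalidated":
        # The origin was contacted (a request reached it), but the body came from cache.
        return "cloudflare" if view == "data_transfer" else "origin"
    # Statuses such as miss, expired, and dynamic go back to the origin.
    return "origin"


print(served_by("revalidated", "requests"))       # origin
print(served_by("revalidated", "data_transfer"))  # cloudflare
```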
Slicing the pie
After the high-level summary, we show an overview of cache status, which explains why traffic might be served from Cloudflare or from origin. We also show a breakdown of cache status by Content-Type to give an overview of how different components of your website perform:
Cache statuses are also essential for understanding what you need to do to optimize cache ratios. For example:
Dynamic indicates that a request was never eligible for cache, and went straight to origin. This is the default for many file types, including HTML. Learn more about making more content eligible for cache using page rules. Fixing this is one of the fastest ways to reduce origin data transfer cost.
Revalidated indicates content that was expired, but after Cloudflare checked the origin, it was still fresh! If you see a lot of revalidated content, it’s a good sign you should increase your Edge Cache TTLs through a page rule or max-age origin directive. Updating TTLs is one of the easiest ways to make your site faster.
Expired resources are ones that were in our cache but had expired. Consider whether you can extend TTLs on these, or at least support revalidation at your origin.
A miss indicates that Cloudflare has not seen that resource recently. These can be tricky to optimize, but there are a few potential remedies: enable Argo Tiered Caching to check another data center’s cache before going to origin, or use a Custom Cache Key to make multiple URLs match the same cached resource (for example, by ignoring the query string).
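To illustrate why ignoring the query string reduces misses, here is a simplified sketch of cache key normalization. It is not Cloudflare’s actual cache key format; the `cache_key` function is hypothetical and only shows how several URLs can collapse into one cached resource.

```python
from urllib.parse import urlsplit


def cache_key(url: str, ignore_query: bool = True) -> str:
    """Build a simplified cache key from a URL (illustrative only)."""
    parts = urlsplit(url)
    key = f"{parts.scheme}://{parts.netloc.lower()}{parts.path}"
    if not ignore_query and parts.query:
        # Keeping the query string makes every variant a separate cache entry.
        key += "?" + parts.query
    return key


a = cache_key("https://example.com/logo.png?v=1")
b = cache_key("https://example.com/logo.png?utm_source=newsletter")
print(a == b)  # True
```

With the query string ignored, both URLs map to the same key, so the second request can be a hit instead of a miss. This is only safe when the query string does not change the response.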
For a full explanation of each cache status, see our help center.
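Supporting revalidation at the origin, as suggested for expired resources, usually means answering conditional requests. Below is a minimal sketch using an ETag and the `If-None-Match` request header. The `respond` function is hypothetical, and it only handles a single exact ETag match (no weak validators or lists of ETags).

```python
import hashlib
from typing import Optional


def respond(body: bytes, if_none_match: Optional[str] = None):
    """Return (status, headers, body) for a resource, honoring If-None-Match."""
    # Derive a validator from the content; any stable fingerprint would do.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {"ETag": etag, "Cache-Control": "public, max-age=3600"}
    if if_none_match == etag:
        # The cached copy is still good: send 304 Not Modified with no body.
        return 304, headers, b""
    return 200, headers, body


status, headers, _ = respond(b"hello world")
print(status)                                             # 200
print(respond(b"hello world", headers["ETag"])[0])        # 304
print(respond(b"hello, new world", headers["ETag"])[0])   # 200
```

A 304 response carries no body, which is why revalidated traffic saves origin data transfer even though a request still reaches the origin.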
To the Nth dimension
Finally, Cache Analytics shows a number of what we call “Top Ns” — various ways to slice and dice the above data on useful dimensions.
It’s often helpful to apply filters (for example, to a specific cache status) before looking at these lists. For example, when trying to tune performance, I often filter to just “expired” or “revalidated,” then see if there are a few URLs that dominate these stats.
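The filter-then-rank workflow is easy to reproduce on your own logs. Here is a small sketch, assuming hypothetical records of `(url, cache_status)` pairs; it is not how Cache Analytics is implemented.

```python
from collections import Counter


def top_urls(records, status, n=3):
    """Return the n most requested URLs among records with the given cache status."""
    counts = Counter(url for url, cache_status in records if cache_status == status)
    return counts.most_common(n)


records = [
    ("/products", "expired"),
    ("/products", "expired"),
    ("/products", "expired"),
    ("/about", "expired"),
    ("/logo.png", "hit"),
    ("/logo.png", "hit"),
    ("/checkout", "dynamic"),
]
print(top_urls(records, "expired", n=2))  # [('/products', 3), ('/about', 1)]
```

If one or two URLs dominate the list, as `/products` does in this made-up sample, extending the TTL for just those resources can have an outsized effect.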
But wait, there’s more
Cache Analytics is available now for customers on our Pro, Business, and Enterprise plans. Pro customers have access to up to 3 days of analytics history. Business and Enterprise customers have access to up to 21 days, with more coming soon.
This is just the first step for Cache Analytics. We’re planning to add more dimensions to drill into the data. And we’re planning to add even more essential statistics — for example, about how cache keys are being used.
Finally, I’m really excited about Cache Analytics because it shows what we have in store for Cloudflare Analytics more broadly. We know that you’ve asked for many features, such as per-hostname analytics or the ability to see top URLs, for a long time, and we’re hard at work on bringing these to Zone Analytics. Stay tuned!