Update: More Page View Counting Refinement

2012-05-28

We've written about the challenge of classifying what counts as a "page view" a couple of times before (see Understanding Analytics: When Is a Page View Not a Page View? and A Quick Update on Page Views). CloudFlare sees all web traffic to the sites on our network, so we're able to report numbers like Unique IPs and Hits more accurately than analytics systems like Google Analytics that use beacon-based tracking.

Beacon-Based vs. Hits Tracking

Google Analytics only sees the visitors that trigger its JavaScript beacon, so it can't report on crawlers and bots that don't fire the request. It also only sees views of pages where the beacon fires, which means that if you need an operations number, like the number of requests per second your server is handling, you have to estimate it rather than know it precisely.

CloudFlare has the opposite challenge. We see every request from every visitor to your site. As a result, our hits number is not an estimate but an exact count. Similarly, our Uniques number is a precise, deduplicated count of the unique IP addresses that visited your site. With CloudFlare, a visitor doesn't need to trigger JavaScript to be counted, so we end up counting a lot of traffic that beacon-based analytics systems miss.
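To make the distinction between hits and uniques concrete, here is a minimal sketch in Python. It is not CloudFlare's implementation; the `summarize` function and the `(client_ip, path)` record shape are hypothetical, chosen only to illustrate that hits count every request while uniques count each IP address once.

```python
from typing import Iterable, Tuple


def summarize(requests: Iterable[Tuple[str, str]]) -> Tuple[int, int]:
    """Return (hits, uniques) for an iterable of (client_ip, path) records.

    Hypothetical record shape, assumed for illustration only.
    """
    hits = 0
    unique_ips = set()
    for client_ip, _path in requests:
        hits += 1                  # every request is a hit
        unique_ips.add(client_ip)  # a set keeps each IP only once
    return hits, len(unique_ips)


sample = [
    ("203.0.113.5", "/"),
    ("203.0.113.5", "/logo.png"),
    ("198.51.100.7", "/"),
]
hits, uniques = summarize(sample)
```

In this sample there are three requests from two distinct addresses, so the hits number is higher than the uniques number even though no JavaScript ever ran.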

That's not to say Google Analytics and other beacon-based tracking systems are bad. There's a place for both. If you're trying to see how many ads you serve, which is Google's primary interest, then it's good to use a tracking method that works the same way ad tags are triggered, so beacon-based tracking makes sense. If you're trying to understand the total load on your server and other operations issues, which is CloudFlare's primary interest, then it makes sense to count total requests.

What Hits Are Page Views?

Our challenge is that we then have to look at all the "hits" we see and classify which of those actually count as page views. We're constantly refining the algorithm to make it more accurate, and we pushed out a change late last week. It fixes some cases where objects reported their content type as text/html when they were actually images or other non-HTML content that shouldn't be counted as page views. It also fixes instances where we were counting some 3xx redirects as page views, which effectively caused double counting of page views in some cases.
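The two fixes described above can be sketched as a simple filter. This is an illustration under stated assumptions, not the actual CloudFlare algorithm: the function name `is_page_view`, its parameters, and the idea of checking the first bytes of the body against well-known image signatures are all hypothetical, and "redirects" is assumed to mean any 3xx status code.

```python
# Leading bytes of common image formats (PNG, JPEG, GIF).
IMAGE_SIGNATURES = (b"\x89PNG\r\n\x1a\n", b"\xff\xd8\xff", b"GIF87a", b"GIF89a")


def is_page_view(status: int, content_type: str, body_prefix: bytes) -> bool:
    """Decide whether a single hit should count as a page view.

    Hypothetical sketch: the real classifier is not described in detail.
    """
    # Assumption: a redirect is followed by a second request for the
    # target page, so counting both would double count the view.
    if 300 <= status < 400:
        return False

    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "text/html":
        return False

    # Assumption: an object labelled text/html whose body starts with an
    # image signature is mislabelled and should not count.
    if body_prefix.startswith(IMAGE_SIGNATURES):
        return False

    return True
```

For example, a `200` response labelled `text/html; charset=utf-8` whose body begins with `<!DOCTYPE html>` counts, while a `302` response or a body that begins with the PNG signature does not.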

The net impact on overall page view stats for most sites is very small (less than 1%). However, some sites will see a more significant drop (as much as 20%). The change affects analytics data from last Thursday onward. Unfortunately, we don't store the raw historical data, so it's not possible for us to update past analytics reports.

The sites with a larger drop are the ones that previously showed a large gap between our page view numbers and those reported by Google Analytics. The two page view stats should now be closer to one another, although CloudFlare should still report a higher number because we're picking up page views from crawlers and other visitors that don't trigger JavaScript. As you'd expect, there's no change to the uniques or hits numbers, since those are much more straightforward for us to count and report. We'll continue to refine all our analytics systems to report data about your site as accurately as possible.

Matthew Prince|@eastdakota