This post is also available in Deutsch, Français.

Need to Keep Analytics Data in the EU? Cloudflare Zaraz Can Offer a Solution

A recent decision from the Austrian Data Protection Authority (the Datenschutzbehörde) has network engineers scratching their heads and EU companies that use Google Analytics scrambling. The Datenschutzbehörde found that an Austrian website’s use of Google Analytics violates the EU General Data Protection Regulation (GDPR) as interpreted by the “Schrems II” case because Google Analytics can involve sending full or truncated IP addresses to the United States.

While disabling such trackers might be one (extreme) solution, doing so would leave website operators blind to how users are engaging with their site. A better approach: find a way to use tools like Google Analytics, but do so with an approach that protects the privacy of personal information and keeps it in the EU, avoiding a data transfer altogether. Enter Cloudflare Zaraz.

But before we get into just how Cloudflare Zaraz can help, we need to explain a bit of the background for the Datenschutzbehörde’s ruling, and why it’s a big deal.

What are the privacy and data localization issues?

The GDPR is a comprehensive data privacy law that applies to EU residents’ personal data, regardless of where it is processed. The GDPR itself does not insist that personal data must be processed only in Europe. Instead, it provides a number of legal mechanisms to ensure that GDPR-level protections are available for EU personal data if it is transferred outside the EU to a third country like the United States. Data transfers from the EU to the US were, until the 2020 “Schrems II” decision, permitted under an agreement called the EU-US Privacy Shield Framework.

The Schrems II decision refers to the July 2020 decision by the Court of Justice of the European Union that invalidated the EU-US Privacy Shield. The Court found that the Privacy Shield was not an effective means to protect EU data from US government surveillance authorities once data was transferred to the US, and therefore that under the Privacy Shield, EU personal data would not receive the level of protection guaranteed by the GDPR. However, the court upheld other valid transfer mechanisms designed to allow EU personal data to be transferred to the US in a way that is consistent with the GDPR that ensure EU personal data won’t be accessed by US government authorities in a way that violates the GDPR. One of those was the use of Standard Contractual Clauses, which are legal agreements approved by the EU Commission that enable data transfers – but they can only be used if supplementary measures are also in place.

Following the Schrems II case, the “NOYB” advocacy group founded by Max Schrems (the lawyer and activist who brought the legal action against Facebook that ultimately ended with the Schrems II ruling) filed 101 complaints against European websites that used Google Analytics and Facebook Connect trackers on the grounds that use of these trackers violates the Schrems II ruling because they send EU personal data to the United States without putting in place sufficient supplementary measures.

That issue of supplementary measures figured prominently in the Austrian data regulator’s decision. In its decision, the Datenschutzbehörde said that a European company could not use Google Analytics on its Austrian website because Google Analytics was sending the IP addresses of visitors to that website to Google’s servers in the United States. The Datenschutzbehörde reiterated earlier case law out of the EU that IP addresses can be sufficiently linked to individuals and therefore constitute personal data, so the GDPR applies. The regulator also found that IP addresses are not pseudonymous, and that Google doesn’t have sufficient supplementary measures in place to prevent US government authorities from accessing the data. As a result, the regulator found the use of Google Analytics and the transmission of IP addresses to the United States in this case violated the GDPR as interpreted by the Schrems II case. Since the Datenschutzbehörde announced its decision, Norway’s data protection authority announced it is joining the Austrian decision.

Google Analytics decision sets worrisome precedent

It’s important to remember that the Austrian ruling relates to one website’s use and implementation of Google Analytics. It is not a ban on Google Analytics throughout Europe. But is it a harbinger of more sweeping actions from data regulators? Any website might use dozens of third-party tools. If any of the third-party tools are transferring personal data to the US, they could attract the attention of an EU data regulator. Even if those tools are not collecting personal data or sensitive information intentionally, there remains a concern with the use of third-party tools, which evolves from how the Internet is built and operates.

Every time a user loads a website, those tools load and establish a connection between the end user’s browser and the third-party server. This connection is used for multiple purposes, such as requesting a script, reporting analytics data, or downloading an image pixel. In every such communication, the IP address of the visitor is exposed. This is how communication between a browser and a server has worked over the Internet since the Internet’s infancy.

The implications of the decision are therefore profound. If other European regulators adopt the Austrian ruling, and its conclusion that even the transfer of truncated IP addresses to the United States could constitute transfers of personal data that violate GDPR, the industry will likely need to fundamentally rethink current Internet architecture and the way IP addresses are used. Cloudflare increasingly believes that we’ll eventually solve these challenges by completely disassociating IP addresses from identity. We’ve partnered with others in the industry to pioneer new protocols like Oblivious DNS over HTTPS that divorce IP addresses from content being queried online to help begin to make this future a reality.

While we can envision this future, our customers need immediate ways to address regulators’ concerns. The median website in 2021 used 21 third-party solutions on mobile and 23 on desktop. At the 90th percentile, these numbers climbed to 89 third-party solutions on mobile, and 91 on desktop. Taking into account the Austrian DPA ruling, according to which the EU company itself is responsible for making sure no personal data is transmitted to the United States without proper handling, we can conclude that companies may soon become responsible for every one of their third-party solutions implemented on their website. And since this is a staggering amount of tools, it demands a scalable solution. Luckily, that is exactly what we have built.

Zaraz’s solution leverages Cloudflare’s global network and Workers platform

Zaraz is a third-party manager, built for speed, privacy and security. With Zaraz, customers can load analytics tools, advertising pixels, interactive widgets, and many other types of third-party tools without making any changes to their code.

Zaraz loads third party tools on the cloud, using Cloudflare Workers. There are multiple reasons why we chose to build on Workers, and you can read more about it in this blog post. By using Workers to offload third-party tools to the cloud and away from the browser, Zaraz creates an extra layer of security and control over Personal Identifiable Information (PII), Protected Health Information (PHI), or other sensitive pieces of information that are often unintentionally passed to third-party vendors.

In the traditional way of loading third-party tools, either via a Tag Management Software (TMS), a Customer Data Platform (CDP) or by including JavaScript snippets directly in the HTML, the browser always sends requests to the third-party domain. This is problematic for a bunch of reasons, but mainly because even if you wanted to, you can’t hide the user’s IP address. It is revealed with every HTTP request. It is also problematic because those tools execute remote JavaScript resources, and you have almost no visibility over the actions they take in the browser or the data they transmit.

We can use the Google Analytics example to illustrate the difference. When a website is loading Google Analytics either via Google Tag Manager or directly from the HTML, the browser downloads the analytics.js file that loads Google Analytics. It then sends an HTTP POST request from the browser to Google’s endpoint: https://www.google-analytics.com/collect. Both of these requests reveal the end-user’s IP address and might append to the URL some personal data, such as the Google Client ID, as query parameters for example.

In comparison, when you use Zaraz to load Google Analytics, there’s simply no communication at all between the browser and Google’s endpoint. Instead, Zaraz works as an intermediary, and the entire communication is between Zaraz (which runs on Workers servers, isolated from the browser) and the third party. You can think of Zaraz as an extra protection layer between the browser and the third-party endpoint, and this extra layer allows us to include some powerful privacy features.

For example, Zaraz allows customers to decide whether to transfer an end user's IP address to Google Analytics or not. As simple as that. When configuring a new third-party tool like Google Analytics, you can choose in the tools settings page to hide IP addresses.

You can use this feature currently with Google Analytics and the Facebook Pixel/Conversion API. But with more and more tools opening up their API and allowing server-to-server integrations, we expect the number of tools you can apply this on to grow rapidly.

A somewhat similar feature Zaraz offers is the Zaraz Data Loss Prevention (DLP) feature, currently used by several of our Enterprise customers. The DLP feature scans every request going to a third-party endpoint to make sure it doesn’t include sensitive information such as names, email addresses, social  security number, credit card numbers, IP addresses, and more. Using this feature, customers can either mask the data or simply be alerted when a tool is collecting such personal data. It gives full visibility and control over the information shared with third parties.

How Zaraz Can Help with Data Localization

Right now, you might be asking yourself, “wait, but how is Cloudflare different from Google, and won’t end users' logs go to Cloudflare’s US servers as well?” This is a great question, and where the combination of Zaraz with the Cloudflare global network makes us shine. We offer Enterprise customers Zaraz in combination with two powerful features of Cloudflare’s Data Localisation Suite: Regional Services, and the Customer Metadata Boundary.

Cloudflare Regional Services allows you to choose where you want the Cloudflare services to run, including the Zaraz service. To meet your compliance obligations, you may need control over where your data is inspected. Cloudflare Regional Services helps you decide where your data should be handled, without losing the performance benefits our network provides.

Let’s say you run a website for a European bank. Let’s also assume you enabled the Data Localisation Suite for the EU. When a person in the EU visits your website, an HTTP request is sent to activate Zaraz. Since Zaraz is running in a first-party context, meaning under your own domain, all the Data Localisation settings will apply on it as well. So the network will direct the traffic to the EU, without inspecting its content, and run Zaraz there.

The EU Customer Metadata Boundary expands the Data Localisation Suite to ensure that a customer’s end-user traffic metadata stays in the EU. “Metadata” can be a scary term, but it’s a simple concept — it just means “data about data.” In other words, it’s a description of activity that happened on our network. Using the EU Customer Metadata Boundary means that this type of metadata would be saved only in the EU.

And what about the end user’s personal data handled by Zaraz? By default, Zaraz doesn’t log or save any piece of information about the end user, with one exception in the case of error logging. To make our service better, we are saving logs of errors, so we can fix any issues. For customers that are using the Data Localisation Suite, this is something we can toggle off, which means that no log data whatsoever will be saved by Zaraz.

What Does the Future Hold for Privacy Features?

Since the Zaraz acquisition, we have been talking to hundreds of Cloudflare enterprise customers, and thousands of users using the beta for the free version of Zaraz. And we have gathered a shortlist of features that we plan to develop in 2022.

  • The Zaraz Consent Manager. Zaraz is fundamentally changing the way third-party tools are implemented on the web. So, in order to provide our customers with full control over user consent management, we realized we should build our own tool to allow customers to do so easily. The Zaraz consent manager will be fully integrated with Zaraz and will allow customers to take actions according to the user choices in a few clicks.
  • Geolocation Triggers. We are planning to add the option to create trigger rules based on an end user’s current location. This means you could configure tools to only load if the user is visiting your site from a specific region. You’d be able to even send specific events or properties according to the end-user’s location. This feature should help global companies to set granular configurations that meet the requirements of their global operations.
  • DLP pattern templates. At the moment, our DLP feature can scan requests going to third-party tools according to the patterns that enterprise customers create themselves. In the near future, we will introduce templates to help customers scan for common PII with more ease.

This is just a taste of what’s coming. If you have any ideas for privacy features you’d like to see, reach out to [email protected] – we would love to hear from you!

If you would like to explore the free beta version, please click here. Provided you are an Enterprise customer and want to learn more about Zaraz’s privacy features, please click here to join the waitlist. To join our Discord channel, click here.