The most famous data breaches–the ones that keep security practitioners up at night–involved the leak of millions of user records. Companies have lost names, addresses, email addresses, Social Security numbers, passwords, and a wealth of other sensitive information. Protecting this data is the highest priority of most security teams, yet many teams still struggle to actually detect these leaks.
Cloudflare’s Data Loss Prevention suite already includes the ability to identify sensitive data like credit card numbers, but with the volume of data being transferred every day, it can be challenging to understand which of the transactions that include sensitive data are actually problematic. We hear customers tell us, “I don’t care when one of my employees uses a personal credit card to buy something online. Tell me when one of my customers’ credit cards are leaked.”
In response, we looked for a method to distinguish between any credit card and one belonging to a specific customer. We are excited to announce the launch of our newest Data Loss Prevention feature, Exact Data Match. With Exact Data Match (EDM), customers securely tell us what data they want to protect, and then we identify, log, and block the presence or movement of that data. For example, if you provide us with a set of credit card numbers, we will DLP scan your traffic or repositories for only those cards. This allows you to create targeted DLP detections for your organization.
What is Exact Data Match?
Many Data Loss Prevention (DLP) detections begin with a generic identification of a pattern, often using a regular expression, and then are validated by additional criteria. Validation can leverage a wide range of techniques from checksums to machine learning models. However, this validates that the pattern is a credit card, not that it is your credit card.
With Exact Data Match, you tell us exactly the data you want to protect, but we never see it in cleartext. You provide a list of data of your choosing, such as a list of names, addresses, or credit card numbers, and that data is hashed before ever reaching Cloudflare. We store the hashes and scan your traffic or content for matches of the hashes. When we find a match, we log or block it according to your policy.
By using a finite list of data, we drastically reduce false positives compared to generic pattern matching. Meanwhile, hashing the data maintains your data privacy. Our goal is to meet your data protection and privacy needs.
How do I use it?
We now offer you the ability to upload DLP datasets. These allow you to provide batches of data to be used for your DLP detections.
When creating a dataset, provide a name, description, and a file containing the data to match.
When you upload the file, Cloudflare one-way hashes the data right in your browser. The hashed data is then transferred via API to Cloudflare, while the cleartext data never leaves the browser.
You can see the status of the upload in the datasets table.
The dataset can now be added to a DLP profile for detection. You can also add other predefined and custom entries to the same DLP profile.
DLP Profiles can be used for inline scanning and protection with Cloudflare Gateway or scanning your data at rest with Cloudflare CASB.
Can I join the beta?
Exact data match is now available for every DLP customer. If you are not a DLP customer but would like to learn more about Cloudflare One and DLP, reach out for a consultation.
What’s next?
Customers have many different formats to store data, and many different ways in which they want to monitor it. Our goal is to offer as much flexibility as your organization needs to meet your data protection goals.