The Cloudflare Blog

Make Your Website Conversational for People and Agents with NLWeb and AutoRAG

Catarina Pires Mota — Thu, 28 Aug 2025 14:00:00 GMT

Publishers and content creators have historically relied on traditional keyword-based search to help users navigate their website’s content. However, traditional search is built on outdated assumptions: users type in keywords to indicate intent, and the site returns a list of links for the most relevant results. It’s up to the visitor to click around, skim pages, and piece together the answer they’re looking for.

AI has reset expectations and that paradigm is breaking: how we search for information has fundamentally changed.

Your New Type of Visitors

Users no longer want to search websites the old way. They’re used to interacting with AI systems like Copilot, Claude, and ChatGPT, where they can simply ask a question and get an answer. We’ve moved from search engines to answer engines.

At the same time, websites now have a new class of visitors, AI agents. Agents face the same pain with keyword search: they have to issue keyword queries, click through links, and scrape pages to piece together answers. But they also need more: a structured way to ask questions and get reliable answers across websites. This means that websites need a way to give the agents they trust controlled access, so that information is retrieved accurately.

Website owners need a way to participate in this shift.

A New Search Model for the Agentic Web

If AI has reset expectations, what comes next? To meet both people and agents where they are, websites need more than incremental upgrades to keyword search. They need a model that makes conversational access to content a first-class part of the web itself.

That’s what we want to deliver: combining an open standard (NLWeb) with the infrastructure (AutoRAG) to make it simple for any website to become AI-ready.

NLWeb is an open project developed by Microsoft that defines a standard protocol for natural-language queries on websites. Each NLWeb instance also operates as a Model Context Protocol (MCP) server. Cloudflare is building to this spec and actively working with Microsoft to extend the standard with the goal to let every site function like an AI app, so users and agents alike can query its contents naturally.

AutoRAG, Cloudflare’s managed retrieval engine, can automatically crawl your website, store the content in R2, and embed it into a managed vector database. AutoRAG keeps the index fresh with continuous re-crawling and re-indexing. Model inference and embedding can be served through Workers AI. Each AutoRAG is paired with an AI Gateway that can provide observability and insights into your AI model usage. This gives you a complete, managed pipeline for conversational search without the burden of managing custom infrastructure.

“Together, NLWeb and AutoRAG let publishers go beyond search boxes, making conversational interfaces for websites simple to create and deploy. This integration will enable every website to easily become AI-ready for both people and trusted agents.” – R.V. Guha, creator of NLWeb, CVP and Technical Fellow at Microsoft.

We are optimistic this will open up new monetization models for publishers:

"The challenges publishers have faced are well known, as are the risks of AI accelerating the collapse of already challenged business models. However, with NLWeb and AutoRAG, there is an opportunity to reset the nature of relationships with audiences for the better. More direct engagement on Publisher Owned and Operated (O&O) environments, where audiences value the brand and voice of the Publisher, means new potential for monetization. This would be the reset the entire industry needs." – Joe Marchese, General & Build Partner at Human Ventures.

One-Click to Make Your Site Conversational

By combining NLWeb's standard with Cloudflare’s AutoRAG infrastructure, we’re making it possible to easily bring conversational search to any website.

Simply select your domain in AutoRAG, and it will crawl and index your site for semantic querying. It then deploys a Cloudflare Worker, which acts as the access layer. This Worker implements the NLWeb standard and UI defined by the NLWeb project and exposes your indexed content to both people and AI agents. The Worker includes:

`/ask` endpoint: The defined standard for how conversational web searches should be served. Powers the conversational UI at the root `/` as well as the embeddable preview at `/snippet.html`. It supports chat history so queries can build on one another within the same session, and includes automatic query decontextualization to improve retrieval quality.
`/mcp` endpoint: Implements an MCP server that trusted AI agents can connect to for structured access.

With this setup, your site content is immediately available in two ways for you to experiment: through a conversational UI that you can serve to your visitors, and through a structured MCP interface that lets trusted agents query your site reliably on your terms.

Additionally, if you prefer to deploy and host your own version of the NLWeb project, there’s also the option to use AutoRAG as the retrieval engine powering the NLWeb instance.

How Your Site Becomes Conversational

From your perspective, making your site conversational is just a single click. Behind the scenes, AutoRAG spins up a full retrieval pipeline to make that possible:

Crawling and ingestion: AutoRAG explores your site like a search engine, following `sitemap.xml` and `robots.txt` files to understand what pages are available and allowed for crawling. From there, it follows your sitemap to discover pages within your domain (up to 100k pages). Browser Rendering is used to load each page so that it can capture dynamic, JavaScript content. Crawled pages are downloaded into an R2 bucket in your account before being ingested.
Continuous Indexing: Once ingested, the content is parsed and embedded into Vectorize, making it queryable beyond keyword matching through semantic search. AutoRAG automatically re-crawls and re-indexes to keep your knowledge base aligned with your latest content.
Access & Observability: A Cloudflare Worker is deployed in your account to serve as the access layer that implements the NLWeb protocol (you can also find the deployable Worker in the Workers templates repository). Workers AI is used to seamlessly power the summarization and decontextualized query capabilities to improve responses. Soon, with the AI Gateway and Secret Store BYO keys, you’ll be able to connect models from any provider and select them directly in the AutoRAG dashboard.

Road to Making Websites a First-Class Data Source

Until now, AutoRAG only supported R2 as a data source. That worked well for structured files, but we needed to make a website itself a first-class data source to be indexed and searchable. Making that possible meant building website crawling into AutoRAG and strengthening the system to handle large, dynamic sources like websites.

Before implementing our web crawler, we needed to improve the reliability of data syncs. Prior users of AutoRAG lacked visibility into when indexing syncs ran and whether they were successful. To fix this, we introduced a Job module to track all syncs, store history, and provide logs. This required two new Durable Objects to be added into AutoRAG’s architecture:

JobManager runs a complete sync, and its duties include queuing files, embedding content, and keeping the Vectorize database up to date. To ensure data consistency, only one JobManager can run per RAG at a time, enforced by the RagManager (a Durable Object in our existing architecture), which cancels any running jobs before starting new ones which can be triggered either manually or by a scheduled sync.
FileManager solved scalability issues we hit when Workers ran out of memory during parallel processing. Originally, a single Durable Object was responsible for handling multiple files, but with a 128MB memory limit it quickly became a bottleneck. The solution was to break the work apart: JobManager now distributes files across many FileManagers, each responsible for a single file. By processing 20 files in parallel through 20 different FileManagers, we expanded effective memory capacity from 128MB to roughly 2.5GB per batch.

With these improvements, we were ready to build the website parser. By reusing our existing R2-based queuing logic, we added crawling with minimal disruption:

A JobManager designated for a website crawl begins by reading the sitemaps associated with the RAG configuration.
Instead of listing objects from an R2 bucket, it queues each website link into our existing R2-based queue, using the full URL as the R2 object key.
From here, the process is nearly identical to our file-based sync. A FileManager picks up the job and checks if the RAG is configured for website parsing.
If it is, the FileManager crawls the link and places the page's HTML contents into the user's R2 bucket, again using the URL as the object key.

After these steps, we index the data and serve it at query time. This approach maximized code reuse, and any improvements to our HTML-to-Markdown conversion now benefit both file and website-based RAGs automatically.

Get Started Today

Getting your website ready for conversational search through NLWeb and AutoRAG is simple. Here’s how:

In the Cloudflare Dashboard, navigate to Compute & AI > AutoRAG.
Select Create in AutoRAG, then choose the NLWeb Website quick deploy option.
Select the domain from your Cloudflare account that you want indexed.
Click Start indexing.

That’s it! You can now try out your NLWeb search experience via the provided link, and test out how it will look on your site by using the embeddable snippet.

We’d love to hear your feedback as you experiment with this new capability and share your thoughts with us at nlweb@cloudflare.com.

Robotcop: enforcing your robots.txt policies and stopping bots before they reach your website

Celso Martinho — Tue, 10 Dec 2024 14:00:00 GMT

Cloudflare’s AI Crawl Control (formerly AI Audit) dashboard allows you to easily understand how AI companies and services access your content. AI Crawl Control gives a summary of request counts broken out by bot, detailed path summaries for more granular insights, and the ability to filter by categories like AI Search or AI Crawler.

Today, we're going one step further. You can now quickly see which AI services are honoring your robots.txt policies, which aren’t, and then programmatically enforce these policies.

What is robots.txt?

Robots.txt is a plain text file hosted on your domain that implements the Robots Exclusion Protocol, a standard that has been around since 1994. This file tells crawlers like Google, Bing, and many others which parts of your site, if any, they are allowed to access.

There are many reasons why site owners would want to define which portions of their websites crawlers are allowed to access: they might not want certain content available on search engines or social networks, they might trust one platform more than another, or they might simply want to reduce automated traffic to their servers.

With the advent of generative AI, AI services have started crawling the Internet to collect training data for their models. These models are often proprietary and commercial and are used to generate new content. Many content creators and publishers that want to exercise control over how their content is used have started using robots.txt to declare policies that cover these AI bots, in addition to the traditional search engines.

Here’s an abbreviated real-world example of the robots.txt policy from a top online news site:

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

This policy declares that the news site doesn't want ChatGPT, Anthropic AI, Google Gemini, or ByteDance’s Bytespider to crawl any of their content.

From voluntary compliance to enforcement

Compliance with the Robots Exclusion Protocol has historically been voluntary.

That’s where our new feature comes in. We’ve extended AI Crawl Control to give our customers both the visibility into how AI services providers honor their robots.txt policies and the ability to enforce those policies at the network level in your WAF.

Your robots.txt file declares your policy, but now we can help you enforce it. You might even call it … your Robotcop.

How it works

AI Crawl Control takes the robots.txt files from your web properties, parses them, and then matches their rules against the AI bot traffic we see for the selected property. The summary table gives you an aggregated view of the number of requests and violations we see for every Bot across all paths. If you hover your mouse over the Robots.txt column, we will show you the defined policies for each Bot in the tooltip. You can also filter by violations from the top of the page.

In the “Most popular paths” section, whenever a path in your site gets traffic that has violated your policy, we flag it for visibility. Ideally, you wouldn't see violations in the Robots.txt column — if you do see them, someone's not complying.

But that's not all… More importantly, AI Crawl Control allows you to enforce your robots.txt policy at the network level. By pressing the "Enforce robots.txt rules" button on the top of the summary table, we automatically translate the rules defined for AI Bots in your robots.txt into an advanced firewall rule, redirect you to the WAF configuration screen, and allow you to deploy the rule in our network.

This is how the robots.txt policy mentioned above looks after translation:

Once you deploy a WAF rule built from your robots.txt policies, you are no longer simply requesting that AI services respect your policy, you're enforcing it.

Conclusion

With AI Crawl Control, we are giving our customers even more visibility into how AI services access their content, helping them define their policies and then enforcing them at the network level.

This feature is live today for all Cloudflare customers. Simply log into the dashboard and navigate to your domain to begin auditing the bot traffic from AI services and enforcing your robots.txt directives.

Billions and billions (of logs): scaling AI Gateway with the Cloudflare Developer Platform

Catarina Pires Mota — Thu, 24 Oct 2024 13:00:00 GMT

With the rapid advancements occurring in the AI space, developers face significant challenges in keeping up with the ever-changing landscape. New models and providers are continuously emerging, and understandably, developers want to experiment and test these options to find the best fit for their use cases. This creates the need for a streamlined approach to managing multiple models and providers, as well as a centralized platform to efficiently monitor usage, implement controls, and gather data for optimization.

AI Gateway is specifically designed to address these pain points. Since its launch in September 2023, AI Gateway has empowered developers and organizations by successfully proxying over 2 billion requests in just one year, as we highlighted during September’s Birthday Week. With AI Gateway, developers can easily store, analyze, and optimize their AI inference requests and responses in real time.

With our initial architecture, AI Gateway faced a significant challenge: the logs, those critical trails of data interactions between applications and AI models, could only be retained for 30 minutes. This limitation was not just a minor inconvenience; it posed a substantial barrier for developers and businesses needing to analyze long-term patterns, ensure compliance, or simply debug over more extended periods.

In this post, we'll explore the technical challenges and strategic decisions behind extending our log storage capabilities from 30 minutes to being able to store billions of logs indefinitely. We'll discuss the challenges of scale, the intricacies of data management, and how we've engineered a system that not only meets the demands of today, but is also scalable for the future of AI development.

Background

AI Gateway is built on Cloudflare Workers, a serverless platform that runs on the Cloudflare network, allowing developers to write small JavaScript functions that can execute at the point of need, near the user, on Cloudflare's vast network of data centers, without worrying about platform scalability.

Our customers use multiple providers and models and are always looking to optimize the way they do inference. And, of course, in order to evaluate their prompts, performance, cost, and to troubleshoot what’s going on, AI Gateway’s customers need to store requests and responses. New requests show up within 15 seconds and customers can check a request’s cost, duration, number of tokens, and provide their feedback (thumbs up or down).

This scales in a way where an account can have multiple gateways and each gateway has its own settings. In our first implementation, a backend worker was responsible for storing Real Time Logs and other background tasks. However, in the rapidly evolving domain of artificial intelligence, where real-time data is as precious as the insights it provides, managing log data efficiently becomes paramount. We recognized that to truly empower our users, we needed to offer a solution where logs weren't just transient records but could be stored permanently. Permanent log storage means developers can now track the performance, security, and operational insights of their AI applications over time, enabling not only immediate troubleshooting but also longitudinal studies of AI behavior, usage trends, and system health.

The diagram above describes our old architecture, which could only store 30 minutes of data.

Tracing the path of a request through the AI Gateway, as depicted in the sequence above:

A developer sends a new inference request, which is first received by our Gateway Worker.
The Gateway Worker then performs several checks: it looks for cached results, enforces rate limits, and verifies any other configurations set by the user for their gateway. Provided all conditions are met, it forwards the request to the selected inference provider (in this diagram, OpenAI).
The inference provider processes the request and sends back the response.
Simultaneously, as the response is relayed back to the developer, the request and response details are also dispatched to our Backend Worker. This worker's role is to manage and store the log of this transaction.

The challenge: Store two billion logs

First step: real-time logs

Initially, the AI Gateway project stored both request metadata and the actual request bodies in a D1 database. This approach facilitated rapid development in the project's infancy. However, as customer engagement grew, the D1 database began to fill at an accelerating rate, eventually retaining logs for only 30 minutes at a time.

To mitigate this, we first optimized the database schema, which extended the log retention to one hour. However, we soon encountered diminishing returns due to the sheer volume of byte data from the request bodies. Post-launch, it became clear that a more scalable solution was necessary. We decided to migrate the request bodies to R2 storage, significantly alleviating the data load on D1. This adjustment allowed us to incrementally extend log retention to 24 hours.

Consequently, D1 functioned primarily as a log index, enabling users to search and filter logs efficiently. When users needed to view details or download a log, these actions were seamlessly proxied through to R2.

This dual-system approach provided us with the breathing room to contemplate and develop more sophisticated storage solutions for the future.

Second step: persistent logs and Durable Object transactional storage

As our traffic surged, we encountered a growing number of requests from customers wanting to access and compare older logs.

Upon learning that the Durable Objects team was seeking beta testers for their new Durable Objects with SQLite, we eagerly signed up.

Originally, we considered Durable Objects as the ideal solution for expanding our log storage capacity, which required us to shard the logs by a unique string. Initially, this string was the account ID, but during a mid-development load test, we hit a cap at 10 million logs per Durable Object. This limitation meant that each account could only support up to this number of logs.

Given our commitment to the DO migration, we saw an opportunity rather than a constraint. To overcome the 10 million log limit per account, we refined our approach to shard by both account ID and gateway name. This adjustment effectively raised the storage ceiling from 10 million logs per account to 10 million per gateway. With the default setting allowing each account up to 10 gateways, the potential storage for each account skyrocketed to 100 million logs.

This strategic pivot not only enabled us to store a significantly larger number of logs. But also enhanced our flexibility in gateway management. Now, when a gateway is deleted, we can simply remove the corresponding Durable Object.

Additionally, this sharding method isolates high-volume request scenarios. If one customer's heavy usage slows down log insertion, it only impacts their specific Durable Object, thereby preserving performance for other customers.

Taking a glance at the revised architecture diagram, we replaced the Backend Worker with our newly integrated Durable Object. The rest of the request flow remains unchanged, including the concurrent response to the user and the interaction with the Durable Object, which occurs in the fourth step.

Leveraging Cloudflare’s network, our Gateway Worker operates near the user's location, which in turn positions the user's Durable Object close by. This proximity significantly enhances the speed of log insertion and query operations.

Third step: managing thousands of Durable Objects

As the number of users and requests on AI Gateway grows, managing each unique Durable Object (DO) becomes increasingly complex. New customers join continuously, and we needed an efficient method to track each DO, ensure users stay within their 10 gateway limit, and manage the storage capacity for free users.

To address these challenges, we introduced another layer of control with a new Durable Object we've named the Account Manager. The primary function of the Account Manager is straightforward yet crucial: it keeps user activities in check.

Here's how it works: before any Gateway commits a new log to permanent storage, it consults the Account Manager. This check determines whether the gateway is allowed to insert the log based on the user's current usage and entitlements. The Account Manager uses its own SQLite database to verify the total number of rows a user has and their service level. If all checks pass, it signals the Gateway that the log can be inserted. It was paramount to guarantee that this entire validation process occurred in the background, ensuring that the user experience remains seamless and uninterrupted.

The Account Manager stays updated by periodically receiving data from each Gateway’s Durable Object. Specifically, after every 1000 inference requests, the Gateway sends an update on its total rows to the Account Manager, which then updates its local records. This system ensures that the Account Manager has the most current data when making its decisions.

Additionally, the Account Manager is responsible for monitoring customer entitlements. It tracks whether an account is on a free or paid plan, how many gateways a user is permitted to create, and the log storage capacity allocated to each gateway.

Through these mechanisms, the Account Manager not only helps in maintaining system integrity but also ensures fair usage across all users of AI Gateway.

AI evaluations and Durable Objects sharding

As we continue to develop evaluations to fully automatic and, in the future, use Large Language Models (LLMs), we are now taking the first step towards this goal and launching the open beta phase of comprehensive AI evaluations, centered on Human-in-the-Loop feedback.

This feature empowers users to create bespoke datasets from their application logs, thereby enabling them to score and evaluate the performance, speed, and cost-effectiveness of their models, with a primary focus on LLMs and automated scoring, analyzing the performance of LLMs, providing developers with objective, data-driven insights to refine their models.

To do this, developers require a reliable logging mechanism that persists logs from multiple gateways, storing up to 100 million logs in total (10 million logs per gateway, across 10 gateways). This represents a significant volume of data, as each request made through the AI Gateway generates a log entry, with some log entries potentially exceeding 50 MB in size.

This necessity leads us to work on the expansion of log storage capabilities. Since log storage is limited to 10 million logs per gateway, in future iterations, we aim to scale this capacity by implementing sharded Durable Objects (DO), allowing multiple Durable Objects per gateway to handle and store logs. This scaling strategy will enable us to store significantly larger volumes of logs, providing richer data for evaluations (using LLMs as a judge or from user input), all through AI Gateway.

Coming Soon

We are working on improving our existing Universal Endpoint, the next step on an enhanced solution that builds on existing fallback mechanisms to offer greater resilience, flexibility, and intelligence in request management.

Currently, when a provider encounters an error or is unavailable, our system falls back to an alternative provider to ensure continuity. The improved Universal Endpoint takes this a step further by introducing automatic retry capabilities, allowing failed requests to be reattempted before fallback is triggered. This significantly improves reliability by handling transient errors and increasing the likelihood of successful request fulfillment. It will look something like this:

curl --location 'https://aig.example.com/' \
--header 'CF-AIG-TOKEN: Bearer XXXX' \
--header 'Content-Type: application/json' \
--data-raw '[
    {
        "id": "0001",
        "provider": "openai",
        "endpoint": "chat/completions",
        "headers": {
            "Authorization": "Bearer XXXX",
            "Content-Type": "application/json"
        },
        "query": {
            "model": "gpt-3.5-turbo",
            "messages": [
                {
                    "role": "user",
                    "content": "generate a prompt to create cloudflare random images"
                }
            ]
        },
        "option": {
            "retry": 2,
            "delay": 200,
            "onComplete": {
                "provider": "workers-ai",
                "endpoint": "@cf/stabilityai/stable-diffusion-xl-base-1.0",
                "headers": {
                    "Authorization": "Bearer XXXXXX",
                    "Content-Type": "application/json"
                },
                "query": {
                    "messages": [
                        {
                            "role": "user",
                            "content": ""
                        }
                    ]
                }
            }
        }
    },
    {
        "provider": "workers-ai",
        "endpoint": "@cf/stabilityai/stable-diffusion-xl-base-1.0",
        "headers": {
            "Authorization": "Bearer XXXXXX",
            "Content-Type": "application/json"
        },
        "query": {
            "messages": [
                {
                    "role": "user",
                    "content": "create a image of a missing cat"
                }
            ]
        }
    }
]'

The request to the improved Universal Endpoint system demonstrates how it handles multiple providers with integrated retry mechanisms and fallback logic. In this example, the first request is sent to a provider like OpenAI, asking it to generate a text-to-image prompt. The “retry” option ensures that transient issues don’t result in immediate failure.

The system’s ability to seamlessly switch between providers while applying retry strategies ensures higher reliability and robustness in managing requests. By leveraging fallback logic, the Improved Universal Endpoint can dynamically adapt to provider failures, ensuring that tasks are completed successfully even in complex, multi-step workflows.

In addition to retry logic, we will have the ability to inspect requests and responses and make dynamic decisions based on the content of the result. This enables developers to create conditional workflows where the system can adapt its behavior depending on the nature of the response, creating a highly flexible and intelligent decision-making process.

If you haven’t yet used AI Gateway, check out our developer documentation on how to get started. If you have any questions, reach out on our Discord channel.

Email Routing subdomain support, new APIs and security protocols

Celso Martinho — Thu, 26 Oct 2023 13:10:06 GMT

It's been two years since we announced Email Routing, our solution to create custom email addresses for your domains and route incoming emails to your preferred mailbox. Since then, the team has worked hard to evolve the product and add more powerful features to meet our users' expectations. Examples include Route to Workers, which allows you to process your Emails programmatically using Workers scripts, Public APIs, Audit Logs, or DMARC Management.

We also made significant progress in supporting more email security extensions and protocols, protecting our customers from unwanted traffic, and keeping our IP space reputation for email egress impeccable to maximize our deliverability rates to whatever inbox upstream provider you chose.

Since leaving beta, Email Routing has grown into one of our most popular products; it’s used by more than one million different customer zones globally, and we forward around 20 million messages daily to every major email platform out there. Our product is mature, robust enough for general usage, and suitable for any production environment. And it keeps evolving: today, we announce three new features that will help make Email Routing more secure, flexible, and powerful than ever.

New security protocols

The SMTP email protocol has been around since the early 80s. Naturally, it wasn't designed with the best security practices and requirements in mind, at least not the ones that the Internet expects today. For that reason, several protocol revisions and extensions have been standardized and adopted by the community over the years. Cloudflare is known for being an early adopter of promising emerging technologies; Email Routing already supports things like SPF, DKIM signatures, DMARC policy enforcement, TLS transport, STARTTLS, and IPv6 egress, to name a few. Today, we are introducing support for two new standards to help increase email security and improve deliverability to third-party upstream email providers.

ARC

Authenticated Received Chain (ARC) is an email authentication system designed to allow an intermediate email server (such as Email Routing) to preserve email authentication results. In other words, with ARC, we can securely preserve the results of validating sender authentication mechanisms like SPF and DKIM, which we support when the email is received, and transport that information to the upstream provider when we forward the message. ARC establishes a chain of trust with all the hops the message has passed through. So, if it was tampered with or changed in one of the hops, it is possible to see where by following that chain.

We began rolling out ARC support to Email Routing a few weeks ago. Here’s how it works:

As you can see, joe@example.com sends an Email to henry@domain.example, an Email Routing address, which in turn is forwarded to the final address, example@gmail.com.

Email Routing will use @example.com’s DMARC policy to check the SPF and DKIM alignments (SPF, DKIM, and DMARC help authenticate email senders by verifying that the emails came from the domain that they claim to be from.) It then stores this authentication result by adding a Arc-Authentication-Results header in the message:

ARC-Authentication-Results: i=1; mx.cloudflare.net; dkim=pass header.d=cloudflare.com header.s=example09082023 header.b=IRdayjbb; dmarc=pass header.from=example.com policy.dmarc=reject; spf=none (mx.cloudflare.net: no SPF records found for postmaster@example.com) smtp.helo=smtp.example.com; spf=pass (mx.cloudflare.net: domain of joe@example.com designates 2a00:1440:4824:20::32e as permitted sender) smtp.mailfrom=joe@example.com; arc=none smtp.remote-ip=2a00:1440:4824:20::32e

Then we take a snapshot of all the headers and the body of the original message, and we generate an Arc-Message-Signature header with a DKIM-like cryptographic signature (in fact ARC uses the same DKIM keys):

ARC-Message-Signature: i=1; a=rsa-sha256; s=2022; d=email.cloudflare.net; c=relaxed/relaxed; h=To:Date:Subject:From:reply-to:cc:resent-date:resent-from:resent-to :resent-cc:in-reply-to:references:list-id:list-help:list-unsubscribe :list-subscribe:list-post:list-owner:list-archive; t=1697709687; bh=sN/+...aNbf==;

Finally, before forwarding the message to example@gmail.com, Email Routing generates the Arc-Seal header, another DKIM-like signature, composed out of the Arc-Authentication-Results and Arc-Message-Signature, and cryptographically “seals” the message:

ARC-Seal: i=1; a=rsa-sha256; s=2022; d=email.cloudflare.net; cv=none; b=Lx35lY6..t4g==;

When Gmail receives the message from Email Routing, it not only normally authenticates the last hop domain.example domain (Email Routing uses SRS), but it also checks the ARC seal header, which provides the authentication results of the original sender.

ARC increases the traceability of the message path through email intermediaries, allowing for more informed delivery decisions by those who receive emails as well as higher deliverability rates for those who transport them, like Email Routing. It has been adopted by all the major email providers like Gmail and Microsoft. You can read more about the ARC protocol in the RFC8617.

MTA-STS

As we said earlier, SMTP is an old protocol. Initially Email communications were done in the clear, in plain-text and unencrypted. At some point in time in the late 90s, the email providers community standardized STARTTLS, also known as Opportunistic TLS. The STARTTLS extension allowed a client in a SMTP session to upgrade to TLS encrypted communications.

While at the time this seemed like a step forward in the right direction, we later found out that because STARTTLS can start with an unencrypted plain-text connection, and that can be hijacked, the protocol is susceptible to man-in-the-middle attacks.

A few years ago MTA Strict Transport Security (MTA-STS) was introduced by email service providers including Microsoft, Google and Yahoo as a solution to protect against downgrade and man-in-the-middle attacks in SMTP sessions, as well as solving the lack of security-first communication standards in email.

Suppose that example.com uses Email Routing. Here’s how you can enable MTA-STS for it.

First, log in to the Cloudflare dashboard and select your account and zone. Then go to DNS > Records and create a new CNAME record with the name “_mta-sts” that points to Cloudflare’s record “_mta-sts.mx.cloudflare.net”. Make sure to disable the proxy mode.

Confirm that the record was created:

$ dig txt _mta-sts.example.com
_mta-sts.example.com.	300	IN	CNAME	_mta-sts.mx.cloudflare.net.
_mta-sts.mx.cloudflare.net. 300	IN	TXT	"v=STSv1; id=20230615T153000;"

This tells the other end client that is trying to connect to us that we support MTA-STS.

Next you need an HTTPS endpoint at mta-sts.example.com to serve your policy file. This file defines the mail servers in the domain that use MTA-STS. The reason why HTTPS is used here instead of DNS is because not everyone uses DNSSEC yet, so we want to avoid another MITM attack vector.

To do this you need to deploy a very simple Worker that allows Email clients to pull Cloudflare’s Email Routing policy file using the “well-known” URI convention. Go to your Account > Workers & Pages and press Create Application. Pick the “MTA-STS” template from the list.

This Worker simply proxies https://mta-sts.mx.cloudflare.net/.well-known/mta-sts.txt to your own domain. After deploying it, go to the Worker configuration, then Triggers > Custom Domains and Add Custom Domain.

You can then confirm that your policy file is working:

$ curl https://mta-sts.example.com/.well-known/mta-sts.txt
version: STSv1
mode: enforce
mx: *.mx.cloudflare.net
max_age: 86400

This says that we enforce MTA-STS. Capable email clients will only deliver email to this domain over a secure connection to the specified MX servers. If no secure connection can be established the email will not be delivered.

Email Routing also supports MTA-STS upstream, which greatly improves security when forwarding your Emails to service providers like Gmail or Microsoft, and others.

While enabling MTA-STS involves a few steps today, we plan to simplify things for you and automatically configure MTA-STS for your domains from the Email Routing dashboard as a future improvement.

Sending emails and replies from Workers

Last year we announced Email Workers, allowing anyone using Email Routing to associate a Worker script to an Email address rule, and programmatically process their incoming emails in any way they want. Workers is our serverless compute platform, it provides hundreds of features and APIs, like databases and storage. Email Workers opened doors to a flood of use-cases and applications that weren’t possible before like implementing allow/block lists, advanced rules, notifications to messaging applications, honeypot aggregators and more.

Still, you could only act on the incoming email event. You could read and process the email message, you could even manipulate and create some headers, but you couldn’t rewrite the body of the message or create new emails from scratch.

Today we’re announcing two new powerful Email Workers APIs that will further enhance what you can do with Email Routing and Workers.

Send emails from Workers

Now you can send an email from any Worker, from scratch, whenever you want, not just when you receive incoming messages, to any email address verified on Email Routing under your account. Here are a few practical examples where sending email from Workers to your verified addresses can be helpful:

Daily digests with the news from your favorite publications.
Alert messages whenever the weather conditions are adverse.
Automatic notifications when systems complete tasks.
Receive a message composed of the inputs of a form online on a contact page.

Let's see a simple example of a Worker sending an email. First you need to create “send_email” bindings in your wrangler.toml configuration:

send_email = [
    {type = "send_email", name = "EMAIL_OUT"}
 ]

And then creating a new message and sending it in a Workers is as simple as:

import { EmailMessage } from "cloudflare:email";
import { createMimeMessage } from "mimetext";

export default {
 async fetch(request, env) {
   const msg = createMimeMessage();
   msg.setSender({ name: "Workers AI story", addr: "joe@example.com" });
   msg.setRecipient("mary@domain.example");
   msg.setSubject("An email generated in a worker");
   msg.addMessage({
       contentType: 'text/plain',
       data: `Congratulations, you just sent an email from a worker.`
   });

   var message = new EmailMessage(
     "joe@example.com",
     "mary@domain.example",
     msg.asRaw()
   );
   try {
     await env.EMAIL_OUT.send(message);
   } catch (e) {
     return new Response(e.message);
   }

   return new Response("email sent!");
 },
};

This example makes use of mimetext, an open-source raw email message generator.

Again, for security reasons, you can only send emails to the addresses for which you confirmed ownership in Email Routing under your Cloudflare account. If you’re looking for sending email campaigns or newsletters to destination addresses that you do not control or larger subscription groups, you should consider other options like our MailChannels integration.

Since sending Emails from Workers is not tied to the EmailEvent, you can send them from any type of Worker, including Cron Triggers and Durable Objects, whenever you want, you control all the logic.

Reply to emails

One of our most-requested features has been to provide a way to programmatically respond to incoming emails. It has been possible to do this with Email Workers in a very limited capacity by returning a permanent SMTP error message — but this may or may not be visible to the end user depending on the client implementation.

export default {
  async email(message, env, ctx) {
      message.setReject("Address not allowed");
  }
}

As of today, you can now truly reply to incoming emails with another new message and implement smart auto-responders programmatically, adding any content and context in the main body of the message. Think of a customer support email automatically generating a ticket and returning the link to the sender, an out-of-office reply with instructions when you're on vacation, or a detailed explanation of why you rejected an email. Here’s a code example:

To mitigate security risks and abuse, replying to incoming emails has a few requirements:

The incoming email has to have valid DMARC.
The email can only be replied to once.
The In-Reply-To header of the reply message must match the Message-ID of the incoming message.
The recipient of the reply must match the incoming sender.
The outgoing sender domain must match the same domain that received the email.

If these and other internal conditions are not met, then reply() will fail with an exception, otherwise you can freely compose your reply message and send it back to the original sender.

For more information the documentation to these APIs is available in our Developer Docs.

Subdomains support

This is a big one.

Email Routing is a zone-level feature. A zone has a top-level domain (the same as the zone name) and it can have subdomains (managed under the DNS feature.) As an example, I can have the example.com zone, and then the mail.example.com and corp.example.com subdomains under it. However, we can only use Email Routing with the top-level domain of the zone, example.com in this example. While this is fine for the vast majority of use cases, some customers — particularly bigger organizations with complex email requirements — have asked for more flexibility.

This changes today. Now you can use Email Routing with any subdomain of any zone in your account. To make this possible we redesigned the dashboard UI experience to make it easier to get you started and manage all your Email Routing domains and subdomains, rules and destination addresses in one single place. Let’s see how it works.

To add Email Routing features to a new subdomain, log in to the Cloudflare dashboard and select your account and zone. Then go to Email > Email Routing > Settings and click “Add subdomain”.

Once the subdomain is added and the DNS records are configured, you can see it in the Settings list under the Subdomains section:

Now you can go to Email > Email Routing > Routing rules and create new custom addresses that will show you the option of using either the top domain of the zone or any other configured subdomain.

After the new custom address for the subdomain is created you can see it in the list with all the other addresses, and manage it from there.

It’s this easy.

Final words

We hope you enjoy the new features that we are announcing today. Still, we want to be clear: there are no changes in pricing, and Email Routing is still free for Cloudflare customers.

Ever since Email Routing was launched, we’ve been listening to customers’ feedback and trying to adjust our roadmap to both our requirements and their own ideas and requests. Email shouldn't be difficult; our goal is to listen, learn and keep improving the email security service with better, more powerful features.

You can find detailed information about the new features and more in our Email Routing Developer Docs.

If you have any questions or feedback about Email Routing, please come see us in the Cloudflare Community and the Cloudflare Discord.

How we built DMARC Management using Cloudflare Workers

André Cruz — Fri, 17 Mar 2023 13:00:00 GMT

What are DMARC reports

DMARC stands for Domain-based Message Authentication, Reporting, and Conformance. It's an email authentication protocol that helps protect against email phishing and spoofing.

When an email is sent, DMARC allows the domain owner to set up a DNS record that specifies which authentication methods, such as SPF (Sender Policy Framework) and DKIM (DomainKeys Identified Mail), are used to verify the email's authenticity. When the email fails these authentication checks DMARC instructs the recipient's email provider on how to handle the message, either by quarantining it or rejecting it outright.

DMARC has become increasingly important in today's Internet, where email phishing and spoofing attacks are becoming more sophisticated and prevalent. By implementing DMARC, domain owners can protect their brand and their customers from the negative impacts of these attacks, including loss of trust, reputation damage, and financial loss.

In addition to protecting against phishing and spoofing attacks, DMARC also provides reporting capabilities. Domain owners can receive reports on email authentication activity, including which messages passed and failed DMARC checks, as well as where these messages originated from.

DMARC management involves the configuration and maintenance of DMARC policies for a domain. Effective DMARC management requires ongoing monitoring and analysis of email authentication activity, as well as the ability to make adjustments and updates to DMARC policies as needed.

Some key components of effective DMARC management include:

Setting up DMARC policies: This involves configuring the domain's DMARC record to specify the appropriate authentication methods and policies for handling messages that fail authentication checks. Here’s what a DMARC DNS record looks like:

v=DMARC1; p=reject; rua=mailto:dmarc@example.com

This specifies that we are going to use DMARC version 1, our policy is to reject emails if they fail the DMARC checks, and the email address to which providers should send DMARC reports.

Monitoring email authentication activity: DMARC reports are an important tool for domain owners to ensure email security and deliverability, as well as compliance with industry standards and regulations. By regularly monitoring and analyzing DMARC reports, domain owners can identify email threats, optimize email campaigns, and improve overall email authentication.
Making adjustments as needed: Based on analysis of DMARC reports, domain owners may need to make adjustments to DMARC policies or authentication methods to ensure that email messages are properly authenticated and protected from phishing and spoofing attacks.
Working with email providers and third-party vendors: Effective DMARC management may require collaboration with email providers and third-party vendors to ensure that DMARC policies are being properly implemented and enforced.

Today we launched DMARC management. This is how we built it.

How we built it

As a leading provider of cloud-based security and performance solutions, we at Cloudflare take a specific approach to test our products. We "dogfood" our own tools and services, which means we use them to run our business. This helps us identify any issues or bugs before they affect our customers.

We use our own products internally, such as Cloudflare Workers, a serverless platform that allows developers to run their code on our global network. Since its launch in 2017, the Workers ecosystem has grown significantly. Today, there are thousands of developers building and deploying applications on the platform. The power of the Workers ecosystem lies in its ability to enable developers to build sophisticated applications that were previously impossible or impractical to run so close to clients. Workers can be used to build APIs, generate dynamic content, optimize images, perform real-time processing, and much more. The possibilities are virtually endless. We used Workers to power services like Radar 2.0, or software packages like Wildebeest.

Recently our Email Routing product joined forces with Workers, enabling processing incoming emails via Workers scripts. As the documentation states: “With Email Workers you can leverage the power of Cloudflare Workers to implement any logic you need to process your emails and create complex rules. These rules determine what happens when you receive an email.” Rules and verified addresses can all be configured via our API.

Here’s how a simple Email Worker looks like:

export default {
  async email(message, env, ctx) {
    const allowList = ["friend@example.com", "coworker@example.com"];
    if (allowList.indexOf(message.headers.get("from")) == -1) {
      message.setReject("Address not allowed");
    } else {
      await message.forward("inbox@corp");
    }
  }
}

Pretty straightforward, right?

With the ability to programmatically process incoming emails in place, it seemed like the perfect way to handle incoming DMARC report emails in a scalable and efficient manner, letting Email Routing and Workers do the heavy lifting of receiving an unbound number of emails from across the globe. A high level description of what we needed is:

Receive email and extract report
Publish relevant details to analytics platform
Store the raw report

Email Workers enable us to do #1 easily. We just need to create a worker with an email() handler. This handler will receive the SMTP envelope elements, a pre-parsed version of the email headers, and a stream to read the entire raw email.

For #2 we can also look into the Workers platform, and we will find the Workers Analytics Engine. We just need to define an appropriate schema, which depends both on what’s present in the reports and the queries we plan to do later. Afterwards we can query the data using either the GraphQL or SQL API.

For #3 we don’t need to look further than our R2 object storage. It is trivial to access R2 from a Worker. After extracting the reports from the email we will store them in R2 for posterity.

We built this as a managed service that you can enable on your zone, and added a dashboard interface for convenience, but in reality all the tools are available for you to deploy your own DMARC reports processor on top of Cloudflare Workers, in your own account, without having to worry about servers, scalability or performance.

Architecture

Email Workers is a feature of our Email Routing product. The Email Routing component runs in all our nodes, so any one of them is able to process incoming mail, which is important because we announce the Email ingress BGP prefix from all our datacenters. Sending emails to an Email Worker is as easy as setting a rule in the Email Routing dashboard.

When the Email Routing component receives an email that matches a rule to be delivered to a Worker, it will contact our internal version of the recently open-sourced workerd runtime, which also runs on all nodes. The RPC schema that governs this interaction is defined in a Capnproto schema, and allows the body of the email to be streamed to Edgeworker as it’s read. If the worker script decides to forward this email, Edgeworker will contact Email Routing using a capability sent in the original request.

jsg::Promise ForwardableEmailMessage::forward(kj::String rcptTo, jsg::Optional> maybeHeaders) {
  auto req = emailFwdr->forwardEmailRequest();
  req.setRcptTo(rcptTo);

  auto sendP = req.send().then(
      [](capnp::Response res) mutable {
    auto result = res.getResponse().getResult();
    JSG_REQUIRE(result.isOk(), Error, result.getError());
  });
  auto& context = IoContext::current();
  return context.awaitIo(kj::mv(sendP));
}

In the context of DMARC reports this is how we handle the incoming emails:

Fetch the recipient of the email being processed, this is the RUA that was used. RUA is a DMARC configuration parameter that indicates where aggregate DMARC processing feedback should be reported pertaining to a certain domain. This recipient can be found in the “to” attribute of the message.

const ruaID = message.to

Since we handle DMARC reports for an unbounded number of domains, we use Workers KV to store some information about each one and key this information on the RUA. This also lets us know if we should be receiving these reports.

const accountInfoRaw = await env.KV_DMARC_REPORTS.get(dmarc:${ruaID})

At this point, we want to read the entire email into an arrayBuffer in order to parse it. Depending on the size of the report we may run into the limits of the free Workers plan. If this happens, we recommend that you switch to the Workers Unbound resource model which does not have this issue.

const rawEmail = new Response(message.raw)
const arrayBuffer = await rawEmail.arrayBuffer()

Parsing the raw email involves, among other things, parsing its MIME parts. There are multiple libraries available that allow one to do this. For example, you could use postal-mime:

const parser = new PostalMime.default()
const email = await parser.parse(arrayBuffer)

Having parsed the email we now have access to its attachments. These attachments are the DMARC reports themselves and they can be compressed. The first thing we want to do is store them in their compressed form in R2 for long-term storage. They can be useful later on for re-processing or investigating interesting reports. Doing this is as simple as calling put() on the R2 binding. In order to facilitate retrieval later we recommend that you spread the report files across directories based on the current time.

await env.R2_DMARC_REPORTS.put(
    `${date.getUTCFullYear()}/${date.getUTCMonth() + 1}/${attachment.filename}`,
    attachment.content
  )

We now need to look into the attachment mime type. The raw form of DMARC reports is XML, but they can be compressed. In this case we need to decompress them first. DMARC reporter files can use multiple compression algorithms. We use the MIME type to know which one to use. For Zlib compressed reports pako can be used while for ZIP compressed reports unzipit is a good choice.
Having obtained the raw XML form of the report, fast-xml-parser has worked well for us in parsing them. Here’s how the DMARC report XML looks:


  
    example.com
    
   http://example.com/dmarc/support
    9391651994964116463
    
      1335521200
      1335652599
    
  
  
    business.example
    r
    r
    none
    none
    100
  
  
    
      192.0.2.1
      2
      
        none
        fail
        pass
      
    
    
      business.example
    
    
      
        business.example
        fail
        
      
      
        business.example
        pass

We now have all the data in the report at our fingertips. What we do from here on depends a lot on how we want to present the data. For us, the goal was to display meaningful data extracted from them in our Dashboard. Therefore we needed an Analytics platform where we could push the enriched data. Enter, Workers Analytics Engine. The Analytics engine is perfect for this task since it allows us to send data to it from a worker, and exposes a GraphQL API to interact with the data afterwards. This is how we obtain the data to show in our dashboard.

In the future, we are also considering integrating Queues in the workflow to asynchronously process the report and avoid waiting for the client to complete it.

We managed to implement this project end-to-end relying only on the Workers infrastructure, proving that it’s possible, and advantageous, to build non-trivial apps without having to worry about scalability, performance, storage and security issues.

Open sourcing

As we mentioned before, we built a managed service that you can enable and use, and we will manage it for you. But, everything we did can also be deployed by you, in your account, so that you can manage your own DMARC reports. It’s easy, and free. To help you with that, we are releasing an open-source version of a Worker that processes DMARC reports in the way described above: https://github.com/cloudflare/dmarc-email-worker

If you don’t have a dashboard where to show the data, you can also query the Analytics Engine from a Worker. Or, if you want to store them in a relational database, then there’s D1 to the rescue. The possibilities are endless and we are excited to find out what you’ll build with these tools.

Please contribute, make your own, we’ll be listening.

Final words

We hope that this post has furthered your understanding of the Workers platform. Today Cloudflare takes advantage of this platform to build most of our services, and we think you should too.

Feel free to contribute to our open-source version and show us what you can do with it.

The Email Routing is also working on expanding the Email Workers API more functionally, but that deserves another blog soon.

Email Routing leaves Beta

Celso Martinho — Tue, 25 Oct 2022 13:00:00 GMT

Email Routing was announced during Birthday Week in 2021 and has been available for free to every Cloudflare customer since early this year. When we launched in beta, we set out to make a difference and provide the most uncomplicated, more powerful email forwarding service on the Internet for all our customers, for free.

We feel we've met and surpassed our goals for the first year. Cloudflare Email Routing is now one of our most popular features and a top leading email provider. We are processing email traffic for more than 550,000 inboxes and forwarding an average of two million messages daily, and still growing month to month.

In February, we also announced that we were acquiring Area1. Merging their team, products, and know-how with Cloudflare was a significant step in strengthening our Email Security capabilities.

All this is good, but what about more features, you ask?

The team has been working hard to enhance Email Routing over the last few months. Today Email Routing leaves beta.

Also, we feel that this could be a good time to give you an update on all the new things we've been adding to the service, including behind-the-scenes and not-so-visible improvements.

Let’s get started.

Public API and Terraform

Cloudflare has a strong API-first philosophy. All of our services expose their primitives in our vast API catalog and gateway, which we then “dogfood” extensively. For instance, our customer's configuration dashboard is built entirely on top of these APIs.

The Email Routing APIs didn't quite make it to this catalog on day one and were kept private and undocumented for a while. This summer we made those APIs available on the public Cloudflare API catalog. You can programmatically use them to manage your destination emails, rules, and other Email Routing settings. The methods' definitions and parameters are documented, and we provide curl examples if you want to get your hands dirty quickly.

Even better, if you're an infrastructure as code type of user and use Terraform to configure your systems automatically, we have you covered too. The latest releases of Cloudflare's Terraform provider now incorporate the Email Routing API resources, which you can use with HCL.

IPv6 egress

IPv6 adoption is on a sustained growth path. Our latest IPv6 adoption report shows that we're nearing the 30% penetration figure globally, with some countries, where mobile usage is prevalent, exceeding the 50% mark. Cloudflare has offered full IPv6 support since 2011 as it aligns entirely with our mission to help build a better Internet.

We are IPv6-ready across the board in our network and our products, and Email Routing has had IPv6 ingress support since day one.

➜  ~ dig celso.io MX +noall +answer
celso.io.		300	IN	MX	91 isaac.mx.cloudflare.net.
celso.io.		300	IN	MX	2 linda.mx.cloudflare.net.
celso.io.		300	IN	MX	2 amir.mx.cloudflare.net.
➜  ~ dig linda.mx.cloudflare.net AAAA +noall +answer
linda.mx.cloudflare.net. 300	IN	AAAA	2606:4700:f5::b
linda.mx.cloudflare.net. 300	IN	AAAA	2606:4700:f5::c
linda.mx.cloudflare.net. 300	IN	AAAA	2606:4700:f5::d

More recently, we closed the loop and added egress IPv6 as well. Now we also use IPv6 when sending emails to upstream servers. If the MX server to which an email is being forwarded supports IPv6, then we will try to use it. Gmail is one good example of a high traffic destination that has IPv6 MX records.

➜  ~ dig gmail.com MX +noall +answer
gmail.com.		3362	IN	MX	30 alt3.gmail-smtp-in.l.google.com.
gmail.com.		3362	IN	MX	5 gmail-smtp-in.l.google.com.
gmail.com.		3362	IN	MX	10 alt1.gmail-smtp-in.l.google.com.
gmail.com.		3362	IN	MX	20 alt2.gmail-smtp-in.l.google.com.
gmail.com.		3362	IN	MX	40 alt4.gmail-smtp-in.l.google.com.
➜  ~ dig gmail-smtp-in.l.google.com AAAA +noall +answer
gmail-smtp-in.l.google.com. 116	IN	AAAA	2a00:1450:400c:c03::1a

We’re happy to report that we’re now delivering most of our email to upstreams using IPv6.

Observability

Email Routing is effectively another system that sits in the middle of the life of an email message. No one likes to navigate blindly, especially when using and depending on critical services like email, so it's our responsibility to provide as much observability as possible about what's going on when messages are transiting through our network.

End to end email deliverability is a complex topic and often challenging to troubleshoot due to the nature of the protocol and the number of systems and hops involved. We added two widgets, Analytics and Detailed Logs, which will hopefully provide the needed insights and help increase visibility.

The Analytics section of Email Routing shows general statistics about the number of emails received during the selected timeframe, how they got handled to the upstream destination addresses, and a convenient time-series chart.

On the Activity Log, you can get detailed information about what happened to each individual message that was received and then delivered to the destination. That information includes the sender and the custom address used, the timestamp, and the delivery attempt result. It also has the details of our SPF, DMARC, and DKIM validations. We also provide filters to help you find what you're looking for in case your message volume is higher.

More recently, the Activity Log now also shows bounces. A bounce message happens when the upstream SMTP server accepts the delivery, but then, for any reason (exceeded quota, virus checks, forged messages, or other issues), the recipient inbox decides to reject it and return a new message back with an error to the latest MTA in the chain, read from the Return-Path headers, which is us.

Audit Logs

Audit Logs are available on all plan types and summarize the history of events, like login and logout actions, or zone configuration changes, made within your Cloudflare account. Accounts with multiple members or companies that must comply with regulatory obligations rely on Audit logs for tracking and evidence reasons.

Email Routing now integrates with Audit Logs and records all configuration changes, like adding a new address, changing a rule, or editing the catch-all address. You can find the Audit Logs on the dashboard under "Manage Account" or use our API to download the list.

Anti-spam

Unsolicited and malicious messages plague the world of email and are a big problem for end users. They affect the user experience and efficiency of email, and often carry security risks that can lead to scams, identity theft, and manipulation.

Since day one, we have supported and validated SPF (Sender Policy Framework) records, DKIM (DomainKeys Identified Mail) signatures, and DMARC (Domain-based Message Authentication) policies in incoming messages. These steps are important and mitigate some risks associated with authenticating the origin of an email from a specific legitimate domain, but they don't solve the problem completely. You can still have bad actors generating spam or phishing Attacks from other domains who ignore SPF or DKIM completely.

Anti-spam techniques today are often based on blocking emails whose origin (the IP address of the client trying to deliver the message) confidence score isn't great. This is commonly known in the industry as IP reputation. Other companies specialize in maintaining reputation lists for IPs and email domains, also known as RBL lists, which are then shared across providers and used widely.

Simply put, an IP or a domain gets a bad reputation when it starts sending unsolicited or malicious emails. If your IP or domain has a bad reputation, you'll have a hard time delivering Emails from them to any major email provider. A bad reputation goes away when the IP or domain stops acting bad.

Cloudflare is a security company that knows a few things about IP threat scores and reputation. Working with the Area1 team and learning from them, we added support to flag and block emails received from what we consider bad IPs at the SMTP level. Our approach uses a combination of heuristics and reputation databases, including some RBL lists, which we constantly update.

This measure benefits not only those customers that receive a lot of spam, who will now get another layer of protection and filtering, but also everyone else using Email Routing. The reputation of our own IP space and forwarding domain, which we use to deliver messages to other email providers, will improve, and with it, so will our deliverability success rate.

IDN support

Internationalized domain names, or IDNs for short, are domains that contain at least one non-ASCII character. To accommodate backward compatibility with older Internet protocols and applications, the IETF approved the IDNA protocol (Internationalized Domain Names in Applications), which was then adopted by many browsers, top-level domain registrars and other service providers.

Cloudflare was one of the first platforms to adopt IDNs back in 2012. Supporting internationalized domain names on email, though, is challenging. Email uses DNS, SMTP, and other standards (like TLS and DKIM signatures) stacked on top of each other. IDNA conversions need to work end to end, or something will break.

Email Routing didn’t support IDNs until now. Starting today, Email Routing can be used with IDNs and everything will work end to end as expected.

8-bit MIME transport

The SMTP protocol supports extensions since the RFC 2821 revision. When an email client connects to an SMTP server, it announces its capabilities on the EHLO command.

➜  ~ telnet linda.mx.cloudflare.net 25
Trying 162.159.205.24...
Connected to linda.mx.cloudflare.net.
Escape character is '^]'.
220 mx.cloudflare.net Cloudflare Email ESMTP Service ready
EHLO celso.io
250-mx.cloudflare.net greets celso.io
250-STARTTLS
250-8BITMIME
250 ENHANCEDSTATUSCODES

This tells our client that we support the Secure SMTP over TLS, Enhanced Error Codes, and the 8-bit MIME Transport, our latest addition.

Most modern clients and servers support the 8BITMIME extension, making transmitting binary files easier and more efficient without additional conversions to and from 7-bit.

Email Routing now supports transmitting 8BITMIME SMTP messages end to end and handles DKIM signatures accordingly.

Other fixes

We’ve been making other smaller improvements to Email Routing too:

We ported our SMTP server to use BoringSSL, Cloudflare’s SSL/TLS implementation of choice, and now support more ciphers when clients connect to us using STARTTLS and when we connect to upstream servers.
We made a number of improvements when we added our own DKIM signatures in the messages. We keep our Rust ?DKIM implementation open source on GitHub, and we also contribute to Lettre, a Rust mailer library that we use.
When a destination address domain has multiple MX records, we now try them all in their preference value order, as described in the RFC, until we get a good delivery, or we fail.

Route to Workers update

We announced Route to Workers in May this year. Route to Workers enables everyone to programmatically process their emails and use them as triggers for any other action. In other words, you can choose to process any incoming email with a Cloudflare Worker script and then implement any logic you wish before you deliver it to a destination address or drop it. Think about it as programmable email.

The good news, though, is that we're near completing the project. The APIs, the dashboard configuration screens, the SMTP service, and the necessary Cap'n Proto interface to Workers are mostly complete, and "all" we have left now is adding the Email Workers primitives to the runtime and testing the hell out of everything before we ship.

Thousands of users are waiting for Email Workers to start creating advanced email processing workflows, and we're excited about the possibilities this will open. We promise we're working hard to open the public beta as soon as possible.

What’s next?

We keep looking at ways to improve email and will add more features and support to emerging protocols and extensions. Two examples are ARC (Authenticated Received Chain), a new signature-based authentication system designed with email forwarding services in mind, and EAI (Email Address Internationalization), which we will be supporting soon.

In the meantime, you can start using Email Routing with your own domain if you haven't yet, it only takes a few minutes to set up, and it's free. Our Developers Documentation page has details on how to get started, troubleshooting, and technical information.

Ping us on our Discord server, community forum, or Twitter if you have suggestions or questions, the team is listening.