
Dynamic URL Rewriting at the edge with Cloudflare

2021-04-08


URLs are ugly. They are hard to read, difficult to memorise and often auto-generated for the benefit of the origin server - not the user.

Today we are announcing the immediate availability of Transform Rules for all Cloudflare plans. Transform Rules provide Cloudflare administrators with the ability to create URL rewrite rules. These rules transform HTTP requests as they flow through Cloudflare, providing an interpretation layer between the human-friendly and the computer-friendly.

Ease of understanding

Imagine you are going on a much-needed around-the-world trip and want to buy a copy of John Graham-Cumming’s book The Geek Atlas: 128 Places Where Science and Technology Come Alive to use as inspiration. Would the link https://www.travelbooks247.com/dp/0596523203/ make sense to you? Chances are the answer is no. It's hard for humans to understand these complex, contextless URLs.

This is why companies instead provide user-friendly alternatives such as https://www.travelbooks247.com/Geek-Atlas-Places-Science-Technology/dp/0596523203/ and use web servers as the interpreter. This interpretation is known as URL rewriting.

Large ecommerce retailers take HTTP requests to these human-friendly URLs and rewrite them using a simple pattern that strips the content Geek-Atlas-Places-Science-Technology/ before sending the HTTP request to the backend. The human readable hyperlink is transformed into a simple format the back-end services can understand. This is an example of a URL rewrite.

This is common practice amongst online retailers. Large online auction platforms follow a similar approach, accepting HTTP requests for user-friendly URI Paths such as /itm/The-Geek-Atlas-by-John-Graham-Cumming/333892143938 and rewriting them to /itm/333892143938. Again, the vanity text is stripped before the HTTP request is sent to the origin. Literally any text can be entered in place of the ...Geek-Atlas... segment in these HTTP requests. It all gets stripped.
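To make the idea concrete, here is a minimal Python sketch of vanity-text stripping for the bookshop example. The regular expression, the function name strip_vanity, and the assumption that product identifiers are alphanumeric are all illustrative; this is not how any particular retailer implements it.

```python
import re

# Hypothetical pattern: an optional vanity segment followed by /dp/<id>/.
# The alphanumeric identifier format is an assumption for illustration.
VANITY_DP = re.compile(r"^/(?:[^/]+/)?dp/([0-9A-Za-z]+)/?$")


def strip_vanity(path: str) -> str:
    """Return the backend path for a human-friendly product path."""
    match = VANITY_DP.match(path)
    if match is None:
        return path  # not a product path; leave it untouched
    return f"/dp/{match.group(1)}/"


print(strip_vanity("/Geek-Atlas-Places-Science-Technology/dp/0596523203/"))
print(strip_vanity("/any-text-at-all/dp/0596523203/"))
```

Both calls yield the same backend path, which is why any text can stand in for the vanity segment.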

Maintaining control of your traffic

URL rewriting occurs when the request is received by the web server. This web server understands the friendly URL and knows its computer-generated counterpart. The web server retrieves the correct data and then sends it to the browser with no change to the URL in the browser’s address bar. Common server-side implementations include the well-known mod_rewrite and ngx_http_rewrite_module modules.

Historically these web servers were located physically within a company's data center. Administrators then had full control over the URLs received, and could create the interpretation rules as and when needed.

As the world rapidly migrates on-premises applications and solutions to the cloud, administrators can find themselves in a situation where they can no longer do what they previously could. Not being responsible for the origin has a number of benefits, but it also comes with drawbacks such as lack of control. Previously, an administrator could quickly add a few config lines to the web server in front of their ecommerce platform. Moving to a hosted platform makes this much more difficult to do. With the introduction of Cloudflare’s Transform Rules we are giving traffic control back to administrators, allowing them to reroute or modify HTTP requests before they're passed to servers they do not administer.

Announcing Transform Rules

Transform Rules allow the creation of traffic modification rules using URL rewrites, with plans to support additional rule types in the near future (such as HTTP request header modification).

Dynamic and static rewrites

The first available Transform Rule action is rewrite. It allows users to match on HTTP requests and modify the URI Path and URI Query using either static or dynamic rewrites.

A static rewrite changes a specified URI Path/Query to another. For example, users may want to transform all traffic addressed at the URI Path /index.php to /landing.php.

With a dynamic rewrite you can use expressions within the filter to transform traffic based on the specified pattern. For example, you might want to rewrite HTTP requests addressed to www.example.com/assets/* so that they go to www.example.com/internal/files/assets/*, using a single dynamic rewrite rule. In this case, you would need to modify the first component of the path using the regex_replace() function. This function allows replacing parts of the value, based on an RE2-compatible regular expression.
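As a rough local illustration of what such a rule does to a path, the Python sketch below emulates the /assets/ example. The regex_replace helper here is a stand-in written for this sketch, not Cloudflare's implementation, and it assumes only the first match is replaced. Python's re module is not RE2, but this simple anchored pattern should behave the same in both.

```python
import re


def regex_replace(value: str, pattern: str, replacement: str) -> str:
    """Local stand-in: replace the first match of pattern in value."""
    return re.sub(pattern, replacement, value, count=1)


def rewrite_assets(path: str) -> str:
    """Rewrite /assets/* to /internal/files/assets/*; other paths pass through."""
    return regex_replace(path, r"^/assets/", "/internal/files/assets/")


print(rewrite_assets("/assets/css/site.css"))
print(rewrite_assets("/blog/assets/logo.png"))  # no leading /assets/, unchanged
```

Anchoring the pattern with ^ keeps the rule from touching paths that merely contain /assets/ somewhere in the middle.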

Another function is concat(). For example, if you wanted to change all requests with a URI Path of /news/2012/* to /archive/news/2012/*, you could use the concat() function in the dynamic rewrite expression. In our example, the expression would prepend /archive to the existing URI Path.
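The effect of the concat() approach can be sketched locally in Python. The concat helper and the rewrite_news_2012 function below are hypothetical names for this illustration; the prefix check plays the role of the rule's filter, and the concatenation plays the role of the rewrite expression.

```python
def concat(*parts: str) -> str:
    """Local stand-in for a string concatenation function."""
    return "".join(parts)


def rewrite_news_2012(path: str) -> str:
    """Prepend /archive to paths under /news/2012/; leave others alone."""
    if path.startswith("/news/2012/"):  # the rule's filter
        return concat("/archive", path)  # the rewrite expression
    return path


print(rewrite_news_2012("/news/2012/olympics"))
print(rewrite_news_2012("/news/2013/budget"))
```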

You can use rewrite rules, both static and dynamic, to modify both the URI Path and URI Query, either in conjunction or independently. For example, you could use a URI rewrite to strip the URI Query value from matching HTTP requests by setting up a static rewrite and leaving the field blank.
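A static rewrite of the URI Query to an empty value has roughly the following effect, sketched here in Python with the standard library. The function name strip_query and the sample URL are assumptions for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return url with its query string removed; the path is left as-is."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(query=""))


print(strip_query("https://www.example.com/landing.php?utm_source=mail&id=7"))
```

Because every variant of the query string collapses to one URL, the origin and any cache in front of it see a single canonical request.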

This kind of rewrite can be used for SEO purposes and to prevent cache poisoning.

When do we rewrite requests?

One question that arose during the development of this feature was: “Where should Transform Rules happen in the Cloudflare traffic flow?”

Originally, the “rewrite” action was added to the Firewall Rules section as a bolt-on. This allowed us to quickly develop the functionality and iterate, given it shares the same underlying engine. Once we began testing, we learned that Transform Rules must happen practically before anything else (at the application layer or layer 7), otherwise it may get confusing.

For example, suppose a user has a simple rewrite rule that takes /soccer/* and transforms it to /football/*. What would happen to Page Rules, Firewall Rules, and Worker Routes that filter on football if the rewrite ran after them? The answer is they wouldn't trigger, since the URL they received would still have been /soccer. Therefore, to simplify the experience, we made the decision to execute URL rewrite rules on traffic immediately as it enters the Cloudflare edge. This way, we can guarantee the URL that is passed to subsequent Cloudflare products. Predictability is absolutely key.
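The ordering problem can be shown with a small Python sketch. Both functions are hypothetical: rewrite stands for the /soccer/* rule, and football_rule_matches stands for any downstream rule that filters on /football/.

```python
def rewrite(path: str) -> str:
    """Hypothetical rewrite rule: /soccer/* becomes /football/*."""
    prefix = "/soccer/"
    if path.startswith(prefix):
        return "/football/" + path[len(prefix):]
    return path


def football_rule_matches(path: str) -> bool:
    """Hypothetical downstream rule that filters on /football/."""
    return path.startswith("/football/")


incoming = "/soccer/results"

# Rewrite runs last: the downstream rule sees the original path.
matched_when_rewrite_last = football_rule_matches(incoming)

# Rewrite runs first: the downstream rule sees the rewritten path.
matched_when_rewrite_first = football_rule_matches(rewrite(incoming))

print(matched_when_rewrite_last, matched_when_rewrite_first)  # False True
```

Running the rewrite first means every later rule can be written against the rewritten URL alone.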

Redirect vs. rewrite

There are two common methods to change where an HTTP request is sent.

Firstly, there is a URL redirect, also known as ‘forwarding’. This is a server-side response that tells the client to go to another URL. This means that the URL displayed in the browser’s address bar gets updated to the new URL.

Secondly, there is a URL rewrite. This is a server-side modification of the URL before it is fully processed by the web server. This will not change what is seen in the user’s browser.
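The difference between the two methods can be sketched in Python with two toy request handlers. Everything here is hypothetical (the PAGES table, the handler names, the choice of a 301 status); the point is only that a redirect answers the client with a new location, while a rewrite changes the path internally and answers with the content.

```python
from typing import Dict, Tuple

Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)

PAGES = {"/landing.php": "landing page"}  # hypothetical origin content


def serve(path: str) -> Response:
    """Return the page for path, or a 404 if it does not exist."""
    body = PAGES.get(path)
    if body is None:
        return 404, {}, "not found"
    return 200, {}, body


def handle_with_redirect(path: str) -> Response:
    """Tell the client to request the new URL itself."""
    if path == "/index.php":
        return 301, {"Location": "/landing.php"}, ""
    return serve(path)


def handle_with_rewrite(path: str) -> Response:
    """Swap the path server-side; the client never sees the new URL."""
    if path == "/index.php":
        path = "/landing.php"
    return serve(path)


print(handle_with_redirect("/index.php"))
print(handle_with_rewrite("/index.php"))
```

With the redirect, the browser makes a second request and its address bar shows /landing.php. With the rewrite, there is one request and the address bar still shows /index.php.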

One of the most common uses of URL rewriting is creating human-friendly, memorable links. Rather than http://example.com/abcsd232sxa112, which may be easily generated and parsed as a database ID, it's easier for humans to read a URL like http://example.com/some-descriptive-product-name.

Normalization

We have also added new security functionality which closes a potential attack vector. This feature prevents malicious actors from potentially bypassing security rules within Cloudflare products using URL-encoding in HTTP requests. We have made this functionality available to all plans and, unless we’ve reached out to you directly via email, it was enabled by default in your zones before this blog was posted.

A number of Cloudflare products have historically used the URI of an incoming HTTP request in a literal sense when comparing it against user defined filters. For example, to block a URL like “https://example.com/%6Cogin”, a user would have to create a Firewall Rule explicitly matching the URI Path /%6Cogin, rather than simply entering /login and expecting Cloudflare to figure out all the possible URL-encoded matches.

URL Normalization is now available for all Cloudflare users, with Edge Normalization enabled by default. This enhanced protection ensures that URL encoding cannot be used to bypass security features. In addition, it also simplifies the user experience by normalizing all inbound traffic into a standard format before it reaches other Cloudflare products such as Firewall Rules, Page Rules, and Workers.
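The bypass that normalization closes can be sketched in Python. The BLOCKED_PATHS rule and function names are hypothetical, and urllib.parse.unquote decodes every percent-escape, which is a simplification: a production normalizer would typically be more selective about which characters it decodes.

```python
from urllib.parse import unquote

BLOCKED_PATHS = {"/login"}  # hypothetical security rule


def is_blocked_literal(raw_path: str) -> bool:
    """Compare the path exactly as received, with no decoding."""
    return raw_path in BLOCKED_PATHS


def is_blocked_normalized(raw_path: str) -> bool:
    """Decode percent-escapes first, then compare (simplified normalization)."""
    return unquote(raw_path) in BLOCKED_PATHS


encoded = "/%6Cogin"  # %6C is the URL-encoded letter "l"

print(is_blocked_literal(encoded))     # False: the literal match is bypassed
print(is_blocked_normalized(encoded))  # True: normalization closes the gap
```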

For more information please refer to the KB article here.

A new home for page rules

You may have noticed that Page Rules has been renamed to “Rules” in the top-level navigation in the UI. You can now find Page Rules under ‘Rules’, alongside Transform Rules.

This move allows us to add new rule categories such as Transform Rules. All API endpoints remain unchanged.

Try it now

URL Rewriting can be used to improve SEO, secure your zone further, and to improve the experience of your users and customers. Try out the new Transform Rules yourself today.
