
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
    <channel>
        <title><![CDATA[ The Cloudflare Blog ]]></title>
        <description><![CDATA[ Get the latest news on how products at Cloudflare are built, technologies used, and join the teams helping to build a better Internet. ]]></description>
        <link>https://blog.cloudflare.com</link>
        <atom:link href="https://blog.cloudflare.com/" rel="self" type="application/rss+xml"/>
        <language>en-us</language>
        <image>
            <url>https://blog.cloudflare.com/favicon.png</url>
            <title>The Cloudflare Blog</title>
            <link>https://blog.cloudflare.com</link>
        </image>
        <lastBuildDate>Fri, 03 Apr 2026 20:22:57 GMT</lastBuildDate>
        <item>
            <title><![CDATA[Partnering to make full-stack fast: deploy PlanetScale databases directly from Workers]]></title>
            <link>https://blog.cloudflare.com/planetscale-postgres-workers/</link>
            <pubDate>Thu, 25 Sep 2025 14:00:00 GMT</pubDate>
            <description><![CDATA[ We’ve teamed up with PlanetScale to make shipping full-stack applications on Cloudflare Workers even easier.  ]]></description>
            <content:encoded><![CDATA[ <p>We’re not burying the lede on this one: you can now connect <a href="https://www.cloudflare.com/developer-platform/products/workers/"><u>Cloudflare Workers</u></a> to your PlanetScale databases directly and ship full-stack applications backed by Postgres or MySQL. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3tcLGobPxPIHoDYEiGcY0X/d970a4a6b8a9e6ebc7d06ab57b168007/Frame_1321317798__1_.png" />
          </figure><p>We’ve teamed up with <a href="https://planetscale.com/"><u>PlanetScale</u></a> because we wanted to partner with a database provider that we could confidently recommend to our users: one that shares our obsession with performance, reliability and developer experience. These are all critical factors for any development team building a serious application. </p><p>Now, when connecting to PlanetScale databases, your connections are automatically configured for optimal performance with <a href="https://www.cloudflare.com/developer-platform/products/hyperdrive/"><u>Hyperdrive</u></a>, ensuring that you have the fastest access from your Workers to your databases, regardless of where your Workers are running.</p>
    <div>
      <h3>Building full-stack</h3>
      <a href="#building-full-stack">
        
      </a>
    </div>
    <p>As Workers has matured into a full-stack platform, we’ve introduced more ways to connect to your data. With <a href="https://developers.cloudflare.com/kv/"><u>Workers KV</u></a>, we made it easy to store configuration and cache unstructured data on the edge. With <a href="https://www.cloudflare.com/developer-platform/products/d1/"><u>D1</u></a> and <a href="https://www.cloudflare.com/developer-platform/products/durable-objects/"><u>Durable Objects</u></a>, we made it possible to build multi-tenant apps with simple, isolated SQL databases. And with Hyperdrive, we made connecting to external databases fast and scalable from Workers.</p><p>Today, we’re introducing a new choice for building on Cloudflare: Postgres and MySQL PlanetScale databases, directly accessible from within the Cloudflare dashboard. Link your Cloudflare and PlanetScale accounts, stop manually copying API keys back and forth, and connect Workers to any of your PlanetScale databases (production or otherwise!).</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/71rXsGZgXWem4yvkhdtHsP/55f9433b5447c09703ef39a547881497/image3.png" />
          </figure><p><sup>Connect to a PlanetScale database — no figuring things out on your own</sup></p><p>Postgres and MySQL are the most popular options for building applications, and with good reason. Many large companies (including Cloudflare!) have built and scaled on these databases, creating a robust ecosystem around them. You may want access to the power, familiarity, and functionality that these databases provide. </p><p>Importantly, all of this builds on <a href="https://blog.cloudflare.com/it-it/how-hyperdrive-speeds-up-database-access/"><u>Hyperdrive</u></a>, our distributed connection pooler and query caching infrastructure. Hyperdrive keeps connections to your databases warm to avoid incurring latency penalties for every new request, reduces the CPU load on your database by managing a connection pool, and can cache the results of your most frequent queries, removing load from your database altogether. Given that about 80% of queries for a typical transactional database are read-only, this can be substantial, and we’ve seen exactly that in practice.</p>
    <div>
      <h3>No more copying credentials around</h3>
      <a href="#no-more-copying-credentials-around">
        
      </a>
    </div>
    <p>Starting today, you can <a href="https://dash.cloudflare.com/?to=/:account/workers/hyperdrive?step=1&amp;modal=1"><u>connect to your PlanetScale databases from the Cloudflare dashboard</u></a> in just a few clicks. Connecting is now secure by default with a one-click password rotation option, without needing to copy and manage credentials back and forth. A Hyperdrive configuration will be created for your PlanetScale database, providing you with the optimal setup to start building on Workers.</p><p>And the experience spans both Cloudflare and PlanetScale dashboards: you can also create and view attached Hyperdrive configurations for your databases from the PlanetScale dashboard.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3I7WyAGXCLY8xhugPlIhl5/0ec38f0248140a628d805df7bb62dcc3/image2.png" />
          </figure><p>By automatically integrating with Hyperdrive, your PlanetScale databases are optimally configured for access from Workers. When you connect your database via Hyperdrive, Hyperdrive’s Placement system automatically determines the location of the database and places its pool of database connections in Cloudflare data centers with the lowest possible latency. </p><p>When one of your Workers connects to your Hyperdrive configuration for your PlanetScale database, Hyperdrive will ensure the fastest access to your database by eliminating the unnecessary roundtrips included in a typical database connection setup. Hyperdrive will resolve connection setup within the Hyperdrive client and use existing connections from the pool to quickly serve your queries. Better yet, Hyperdrive allows you to cache your query results in case you need to scale for high-read workloads. </p><p>This is a peek under the hood of how Hyperdrive makes access to PlanetScale as fast as possible. We’ve previously blogged about <a href="https://blog.cloudflare.com/it-it/how-hyperdrive-speeds-up-database-access/"><u>Hyperdrive’s technical underpinnings</u></a> — it’s worth a read. And with this integration with Hyperdrive, you can easily connect to your databases across different Workers applications or environments, without having to reconfigure your credentials. All in all, a perfect match.</p>
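    <p>For readers wiring this up manually rather than through the dashboard flow: a Worker reaches its Hyperdrive configuration through a binding. A rough sketch of the Wrangler configuration (the binding name and placeholder ID below are illustrative; see the Hyperdrive docs for the exact steps):</p>
            <pre><code># wrangler.toml (illustrative sketch)
[[hyperdrive]]
binding = "HYPERDRIVE"
id = "&lt;your-hyperdrive-config-id&gt;"</code></pre>
    <p>The Worker then reads <code>env.HYPERDRIVE.connectionString</code> and passes it to a standard Postgres or MySQL driver; with the dashboard flow described here, that wiring is set up for you.</p>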
    <div>
      <h3>Get started with PlanetScale and Workers</h3>
      <a href="#get-started-with-planetscale-and-workers">
        
      </a>
    </div>
    <p>With this partnership, we’re making it trivially easy to build on Workers with PlanetScale. Want to build a new application on Workers that connects to your existing PlanetScale cluster? With just a few clicks, you can create a globally deployed app that can query your database, cache your hottest queries, and keep your database connections warmed for fast access from Workers.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3eTtJKz4sxeNvClVQMWIFg/9c91fb02b1cd4eca7ad5ef013e7ab0f0/image4.png" />
          </figure><p><sup><i>Connect directly to your PlanetScale MySQL or Postgres databases from the Cloudflare dashboard, for optimal configuration with Hyperdrive.</i></sup></p><p>To get started, you can:</p><ul><li><p>Head to the <a href="https://dash.cloudflare.com/?to=/:account/workers/hyperdrive?step=1&amp;modal=1"><u>Cloudflare dashboard</u></a> and connect your PlanetScale account</p></li><li><p>… or head to <a href="https://app.planetscale.com/"><u>PlanetScale</u></a> and connect your Cloudflare account</p></li><li><p>… and then deploy a Worker</p></li></ul><p>Review the <a href="https://developers.cloudflare.com/hyperdrive/"><u>Hyperdrive docs</u></a> and/or the <a href="https://planetscale.com/docs"><u>PlanetScale docs</u></a> to learn more about how to connect Workers to PlanetScale and start shipping.</p> ]]></content:encoded>
            <category><![CDATA[Hyperdrive]]></category>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Partnership]]></category>
            <category><![CDATA[Database]]></category>
            <guid isPermaLink="false">7ibt13YouHX6Ew1wLZn5pi</guid>
            <dc:creator>Matt Silverlock</dc:creator>
            <dc:creator>Thomas Gauvin</dc:creator>
            <dc:creator>Adrian Gracia</dc:creator>
        </item>
        <item>
            <title><![CDATA[Migrating billions of records: moving our active DNS database while it’s in use]]></title>
            <link>https://blog.cloudflare.com/migrating-billions-of-records-moving-our-active-dns-database-while-in-use/</link>
            <pubDate>Tue, 29 Oct 2024 14:00:00 GMT</pubDate>
            <description><![CDATA[ DNS records have moved to a new database, bringing improved performance and reliability to all customers. ]]></description>
            <content:encoded><![CDATA[ <p>According to a survey done by <a href="https://w3techs.com/technologies/overview/dns_server"><u>W3Techs</u></a>, as of October 2024, Cloudflare is used as an <a href="https://www.cloudflare.com/en-gb/learning/dns/dns-server-types/"><u>authoritative DNS</u></a> provider by 14.5% of all websites. As an authoritative DNS provider, we are responsible for managing and serving all the DNS records for our clients’ domains. This means we have an enormous responsibility to provide the best service possible, starting at the data plane. As such, we are constantly investing in our infrastructure to ensure the reliability and performance of our systems.</p><p><a href="https://www.cloudflare.com/learning/dns/what-is-dns/"><u>DNS</u></a> is often referred to as the phone book of the Internet, and is a key component of the Internet. If you have ever used a phone book, you know that they can become extremely large depending on the size of the physical area it covers. A <a href="https://www.cloudflare.com/en-gb/learning/dns/glossary/dns-zone/#:~:text=What%20is%20a%20DNS%20zone%20file%3F"><u>zone file</u></a> in DNS is no different from a phone book. It has a list of records that provide details about a domain, usually including critical information like what IP address(es) each hostname is associated with. For example:</p>
            <pre><code>example.com      59 IN A 198.51.100.0
blog.example.com 59 IN A 198.51.100.1
ask.example.com  59 IN A 198.51.100.2</code></pre>
            <p>It is not unusual for these zone files to reach millions of records in size, just for a single domain. The biggest single zone on Cloudflare holds roughly 4 million DNS records, but the vast majority of zones hold fewer than 100 DNS records. Given our scale according to W3Techs, you can imagine how much DNS data alone Cloudflare is responsible for. Given this volume of data, and all the complexities that come at that scale, there needs to be a very good reason to move it from one database cluster to another. </p>
    <div>
      <h2>Why migrate </h2>
      <a href="#why-migrate">
        
      </a>
    </div>
    <p>When initially measured in 2022, DNS data took up approximately 40% of the storage capacity in Cloudflare’s main database cluster (<b>cfdb</b>). This database cluster, consisting of a primary system and multiple replicas, is responsible for storing DNS zones, propagated to our <a href="https://www.cloudflare.com/network/"><u>data centers in over 330 cities</u></a> via our distributed KV store <a href="https://blog.cloudflare.com/introducing-quicksilver-configuration-distribution-at-internet-scale/"><u>Quicksilver</u></a>. <b>cfdb</b> is accessed by most of Cloudflare's APIs, including the <a href="https://developers.cloudflare.com/dns/manage-dns-records/how-to/create-dns-records/"><u>DNS Records API</u></a>. Today, the DNS Records API is the API most used by our customers, with each request resulting in a query to the database. As such, it’s always been important to optimize the DNS Records API and its surrounding infrastructure to ensure we can successfully serve every request that comes in.</p><p>As Cloudflare scaled, <b>cfdb</b> was becoming increasingly strained under the pressures of several services, many unrelated to DNS. During spikes of requests to our DNS systems, other Cloudflare services experienced degraded database performance. It was understood that in order to properly scale, we needed to optimize our database access and improve the systems that interact with it. However, it was evident that system-level improvements could only go so far, and the growing pains were becoming unbearable. In late 2022, the DNS team, with the help of 25 other teams, decided to detach from <b>cfdb</b> and move our DNS record data to another database cluster.</p>
    <div>
      <h2>Pre-migration</h2>
      <a href="#pre-migration">
        
      </a>
    </div>
    <p>From a DNS perspective, this migration to an improved database cluster was in the works for several years. Cloudflare initially relied on a single <a href="https://www.postgresql.org/"><u>Postgres</u></a> database cluster, <b>cfdb</b>. At Cloudflare's inception, <b>cfdb</b> was responsible for storing information about zones and accounts, and the majority of services on the Cloudflare control plane depended on it. Since around 2017, as Cloudflare grew, many services moved their data out of <b>cfdb</b> to be served by a <a href="https://en.wikipedia.org/wiki/Microservices"><u>microservice</u></a>. Unfortunately, the difficulty of such a migration is directly proportional to the number of services that depend on the data being migrated, and in this case, most services require knowledge of both zones and DNS records.</p><p>Although the term “zone” was born from the DNS point of view, it has since evolved into something more. Today, zones on Cloudflare store many different types of non-DNS related settings and help link several non-DNS related products to customers' websites. Therefore, it didn’t make sense to move both zone data and DNS record data together. This separation of two historically tightly coupled DNS concepts proved to be an incredibly challenging problem, involving many engineers and systems. In addition, it was clear that if we were going to dedicate the resources to solving this problem, we should also remove some of the legacy issues that came along with the original solution. </p><p>One of the main issues with the legacy database was that the DNS team had little control over which systems accessed exactly what data and at what rate. Moving to a new database gave us the opportunity to create a more tightly controlled interface to the DNS data. 
This was manifested as an internal DNS Records <a href="https://blog.cloudflare.com/moving-k8s-communication-to-grpc/"><u>gRPC API</u></a> which allows us to make sweeping changes to our data while only requiring a single change to the API, rather than coordinating with other systems.  For example, the DNS team can alter access logic and auditing procedures under the hood. In addition, it allows us to appropriately rate-limit and cache data depending on our needs. The move to this new API itself was no small feat, and with the help of several teams, we managed to migrate over 20 services, using 5 different programming languages, from direct database access to using our managed gRPC API. Many of these services touch very important areas such as <a href="https://developers.cloudflare.com/dns/dnssec/"><u>DNSSEC</u></a>, <a href="https://developers.cloudflare.com/ssl/"><u>TLS</u></a>, <a href="https://developers.cloudflare.com/email-routing/"><u>Email</u></a>, <a href="https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/"><u>Tunnels</u></a>, <a href="https://developers.cloudflare.com/workers/"><u>Workers</u></a>, <a href="https://developers.cloudflare.com/spectrum/"><u>Spectrum</u></a>, and <a href="https://developers.cloudflare.com/r2/"><u>R2 storage</u></a>. Therefore, it was important to get it right. </p><p>One of the last issues to tackle was the logical decoupling of common DNS database functions from zone data. Many of these functions expect to be able to access both DNS record data and DNS zone data at the same time. For example, at record creation time, our API needs to check that the zone is not over its maximum record allowance. Originally this check occurred at the SQL level by verifying that the record count was lower than the record limit for the zone. However, once you remove access to the zone itself, you are no longer able to confirm this. 
Our DNS Records API also made use of SQL functions to audit record changes, which required access to both DNS record and zone data. Luckily, over the past several years, we have migrated this functionality out of our monolithic API and into separate microservices. This allowed us to move the auditing and zone setting logic to the application level rather than the database level. Ultimately, we are still taking advantage of SQL functions in the new database cluster, but they are fully independent of any other legacy systems, and can benefit from the latest Postgres version.</p><p>Now that Cloudflare DNS was mostly decoupled from the zones database, it was time to proceed with the data migration. For this, we built what would become our <b>Change Data Capture and Transfer Service (CDCTS).</b></p>
    <div>
      <h2>Requirements for the Change Data Capture and Transfer Service</h2>
      <a href="#requirements-for-the-change-data-capture-and-transfer-service">
        
      </a>
    </div>
    <p>The Database team is responsible for all Postgres clusters within Cloudflare, and was tasked with executing the data migration of two tables that store DNS data: <i>cf_rec</i> and <i>cf_archived_rec</i>, from the original <b>cfdb</b> cluster to a new cluster we called <b>dnsdb</b>.  We had several key requirements that drove our design:</p><ul><li><p><b>Don’t lose data. </b>This is the number one priority when handling any sort of data. Losing data means losing trust, and it is incredibly difficult to regain that trust once it’s lost.  An important part of this is the ability to prove that no data has been lost.  The migration process would, ideally, be easily auditable.</p></li><li><p><b>Minimize downtime</b>.  We wanted a solution with less than a minute of downtime during the migration, and ideally with just a few seconds of delay.</p></li></ul><p>These two requirements meant that we had to be able to migrate data changes in near real-time, meaning we either needed to implement logical replication, or some custom method to capture changes, migrate them, and apply them in a table in a separate Postgres cluster.</p><p>We first looked at Postgres logical replication using <a href="https://github.com/2ndQuadrant/pglogical"><u>pgLogical</u></a>, but had concerns about its performance and our ability to audit its correctness.  Then some additional requirements emerged that made a pgLogical implementation of logical replication impossible:</p><ul><li><p><b>The ability to move data must be bidirectional.</b> We had to have the ability to switch back to <b>cfdb</b> without significant downtime in case of unforeseen problems with the new implementation. 
</p></li><li><p><b>Partition the </b><b><i>cf_rec</i></b><b> table in the new database.</b> This was a long-desired improvement and since most access to <i>cf_rec</i> is by zone_id, it was decided that <b>mod(zone_id, num_partitions)</b> would be the partition key.</p></li><li><p><b>Transferred data accessible from original database.  </b>In case we had functionality that still needed access to data, a foreign table pointing to <b>dnsdb</b> would be available in <b>cfdb</b>. This could be used as emergency access to avoid needing to roll back the entire migration for a single missed process.</p></li><li><p><b>Only allow writes in one database. </b> Applications should know where the primary database is, and should be blocked from writing to both databases at the same time.</p></li></ul>
    <div>
      <h2>Details about the tables being migrated</h2>
      <a href="#details-about-the-tables-being-migrated">
        
      </a>
    </div>
    <p>The primary table, <i>cf_rec</i>, stores DNS record information, and its rows are regularly inserted, updated, and deleted. At the time of the migration, this table had 1.7 billion records, and with several indexes took up 1.5 TB of disk. Typical daily usage would observe 3-5 million inserts, 1 million updates, and 3-5 million deletes.</p><p>The second table, <i>cf_archived_rec</i>, stores obsolete records from <i>cf_rec</i> — this table generally only has records inserted and is never updated or deleted.  As such, it would see roughly 3-5 million inserts per day, corresponding to the records deleted from <i>cf_rec</i>. At the time of the migration, this table had roughly 4.3 billion records.</p><p>Fortunately, neither table made use of database triggers or foreign keys, which meant that we could insert/update/delete records in either table without triggering changes or worrying about dependencies on other tables.</p><p>Ultimately, both of these tables are highly active and are the source of truth for many highly critical systems at Cloudflare.</p>
    <div>
      <h2>Designing the Change Data Capture and Transfer Service</h2>
      <a href="#designing-the-change-data-capture-and-transfer-service">
        
      </a>
    </div>
    <p>There were two main parts to this database migration:</p><ol><li><p><b>Initial copy:</b> Take all the data from <b>cfdb </b>and put it in <b>dnsdb.</b></p></li><li><p><b>Change copy:</b> Take all the changes in <b>cfdb </b>since the initial copy and update <b>dnsdb</b> to reflect them. This is the more involved part of the process.</p></li></ol><p>Normally, logical replication replays every insert, update, and delete on a copy of the data in the same transaction order, making a single-threaded pipeline.  We considered using a queue-based system but again, speed and auditability were both concerns as any queue would typically replay one change at a time.  We wanted to be able to apply large sets of changes, so that after an initial dump and restore, we could quickly catch up with the changed data. For the rest of the blog, we will only speak about <i>cf_rec</i> for simplicity, but the process for <i>cf_archived_rec</i> is the same.</p><p>What we decided on was a simple change capture table. Rows would be loaded into this capture table in real time by a database trigger, and a transfer service could then migrate and apply thousands of changed records to <b>dnsdb</b> in each batch. Lastly, we added some auditing logic on top to ensure that we could easily verify that all data was safely transferred without downtime.</p>
    <div>
      <h3>Basic model of change data capture </h3>
      <a href="#basic-model-of-change-data-capture">
        
      </a>
    </div>
    <p>For <i>cf_rec</i> to be migrated, we would create a change logging table, along with a trigger function and a  table trigger to capture the new state of the record after any insert/update/delete.  </p><p>The change logging table named <i>log_cf_rec</i> had the same columns as <i>cf_rec</i>, as well as four new columns:</p><ul><li><p><b>change_id</b>:  a sequence generated unique identifier of the record</p></li><li><p><b>action</b>: a single character indicating whether this record represents an [i]nsert, [u]pdate, or [d]elete</p></li><li><p><b>change_timestamp</b>: the date/time when the change record was created</p></li><li><p><b>change_user:</b> the database user that made the change.  </p></li></ul><p>A trigger was placed on the <i>cf_rec</i> table so that each insert/update would copy the new values of the record into the change table, and for deletes, create a 'D' record with the primary key value. </p><p>Here is an example of the change logging where we delete, re-insert, update, and finally select from the <i>log_cf_rec</i><b> </b>table. Note that the actual <i>cf_rec</i> and <i>log_cf_rec</i> tables have many more columns, but have been edited for simplicity.</p>
            <pre><code>dns_records=# DELETE FROM  cf_rec WHERE rec_id = 13;

dns_records=# SELECT * from log_cf_rec;
change_id | action | rec_id | zone_id | name
----------------------------------------------
1         | D      | 13     |         |   

dns_records=# INSERT INTO cf_rec VALUES(13,299,'cloudflare.example.com');  

dns_records=# UPDATE cf_rec SET name = 'test.example.com' WHERE rec_id = 13;

dns_records=# SELECT * from log_cf_rec;
change_id | action | rec_id | zone_id | name
----------------------------------------------
1         | D      | 13     |         |  
2         | I      | 13     | 299     | cloudflare.example.com
3         | U      | 13     | 299     | test.example.com </code></pre>
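            <p>The trigger itself is not shown in this post, but a simplified version could look like the following sketch (the column list is illustrative, and <code>change_id</code>, <code>change_timestamp</code>, and <code>change_user</code> are assumed to be filled by column defaults such as a sequence, <code>now()</code>, and <code>current_user</code>):</p>
            <pre><code>-- Simplified sketch, not the production definition
CREATE FUNCTION fn_log_cf_rec_change() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        -- Deletes only record the primary key value
        INSERT INTO log_cf_rec (action, rec_id) VALUES ('D', OLD.rec_id);
        RETURN OLD;
    ELSE
        INSERT INTO log_cf_rec (action, rec_id, zone_id, name)
        VALUES (CASE TG_OP WHEN 'INSERT' THEN 'I' ELSE 'U' END,
                NEW.rec_id, NEW.zone_id, NEW.name);
        RETURN NEW;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_log_cf_rec_change
AFTER INSERT OR UPDATE OR DELETE ON cf_rec
FOR EACH ROW EXECUTE FUNCTION fn_log_cf_rec_change();</code></pre>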
            <p>In addition to <i>log_cf_rec</i>, we also introduced 2 more tables in <b>cfdb </b>and 3 more tables in <b>dnsdb:</b></p><p><b>cfdb</b></p><ol><li><p><i>transferred_log_cf_rec</i>: Responsible for auditing the batches transferred to <b>dnsdb</b>.</p></li><li><p><i>log_change_action</i>:<i> </i>Responsible for summarizing the transfer size in order to compare with the <i>log_change_action </i>in <b>dnsdb.</b></p></li></ol><p><b>dnsdb</b></p><ol><li><p><i>migrate_log_cf_rec</i>:<i> </i>Responsible for collecting batch changes in <b>dnsdb</b>, which would later be applied to <i>cf_rec </i>in <b>dnsdb</b><i>.</i></p></li><li><p><i>applied_migrate_log_cf_rec</i>:<i> </i>Responsible for auditing the batches that had been successfully applied to cf_rec in <b>dnsdb.</b></p></li><li><p><i>log_change_action</i>:<i> </i>Responsible for summarizing the transfer size in order to compare with the <i>log_change_action </i>in <b>cfdb.</b></p></li></ol>
    <div>
      <h3>Initial copy</h3>
      <a href="#initial-copy">
        
      </a>
    </div>
    <p>With change logging in place, we were now ready to do the initial copy of the tables from <b>cfdb</b> to <b>dnsdb</b>. Because we were changing the structure of the tables in the destination database and because of network timeouts, we wanted to bring the data over in small pieces and validate that it was brought over accurately, rather than doing a single multi-hour copy or <a href="https://www.postgresql.org/docs/current/app-pgdump.html"><u>pg_dump</u></a>.  We also wanted to ensure a long-running read could not impact production and that the process could be paused and resumed at any time.  The basic data transfer was a simple psql COPY statement piped into another psql COPY statement.  No intermediate files were used.</p><p><code>psql_cfdb -c "COPY (SELECT * FROM cf_rec WHERE id BETWEEN n AND n+1000000) TO STDOUT" | </code></p><p><code>psql_dnsdb -c "COPY cf_rec FROM STDIN"</code></p><p>Prior to a batch being moved, the count of records to be moved was recorded in <b>cfdb</b>, and after each batch was moved, a count was recorded in <b>dnsdb</b> and compared to the count in <b>cfdb</b> to ensure that a network interruption or other unforeseen error did not cause data to be lost. The bash script to copy data looked like this, where we included files that could be touched to pause or end the copy (in case the copy caused load on production or there was an incident).  Once again, the code below has been heavily simplified.</p>
            <pre><code>#!/bin/bash
for i in "$@"; do
   # Allow user to pause the copy by creating a pause_copy file
   while [ -f pause_copy ]; do
      sleep 1
   done
   # Allow user to end the migration by creating an end_copy file
   if [ -f end_copy ]; then
      break
   fi
   # Copy a batch of records from cfdb to dnsdb
   # Get count of records from cfdb
   # Get count of records from dnsdb
   # Compare cfdb count with dnsdb count and alert if different
done
</code></pre>
            <p><sup><i>Bash copy script</i></sup></p>
    <div>
      <h3>Change copy</h3>
      <a href="#change-copy">
        
      </a>
    </div>
    <p>Once the initial copy was completed, we needed to update <b>dnsdb</b> with any changes that had occurred in <b>cfdb</b> since the start of the initial copy. To implement this change copy, we created a function <i>fn_log_change_transfer_log_cf_rec </i>that could be passed a <i>batch_id</i> and <i>batch_size</i>, and did 5 things, all of which were executed in a single database <a href="https://www.postgresql.org/docs/current/tutorial-transactions.html"><u>transaction</u></a>:</p><ol><li><p>Select a <i>batch_size</i> of records from <i>log_cf_rec</i> in <b>cfdb</b>.</p></li><li><p>Copy the batch to <i>transferred_log_cf_rec</i> in <b>cfdb </b>to mark it as transferred.</p></li><li><p>Delete the batch from <i>log_cf_rec</i>.</p></li><li><p>Write a summary of the action to the <i>log_change_action</i> table. This would later be used to compare transferred records with <b>dnsdb</b>.</p></li><li><p>Return the batch of records.</p></li></ol><p>We then took the returned batch of records and copied them to <i>migrate_log_cf_rec </i>in <b>dnsdb</b>. We used the same bash script as above, except this time, the copy command looked like this:</p><p><code>psql_cfdb -c "COPY (SELECT * FROM fn_log_change_transfer_log_cf_rec(&lt;batch_id&gt;, &lt;batch_size&gt;)) TO STDOUT" | </code></p><p><code>psql_dnsdb -c "COPY migrate_log_cf_rec FROM STDIN"</code></p>
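    <p>As a sketch, such a function can take the batch out of <i>log_cf_rec</i> atomically using a data-modifying CTE. The column lists and the exact shape of the audit tables below are our assumptions, not the production code:</p>
            <pre><code>-- Hypothetical sketch of the transfer function's shape
CREATE FUNCTION fn_log_change_transfer_log_cf_rec(p_batch_id bigint, p_batch_size int)
RETURNS SETOF log_cf_rec LANGUAGE plpgsql AS $$
BEGIN
    -- Steps 1 and 3: select a batch and delete it from log_cf_rec in one statement
    CREATE TEMP TABLE batch ON COMMIT DROP AS
    WITH moved AS (
        DELETE FROM log_cf_rec
        WHERE change_id IN (SELECT change_id FROM log_cf_rec
                            ORDER BY change_id LIMIT p_batch_size)
        RETURNING *
    )
    SELECT * FROM moved;

    -- Step 2: mark the batch as transferred for auditing
    INSERT INTO transferred_log_cf_rec SELECT p_batch_id, * FROM batch;

    -- Step 4: summarize per-action counts for cross-cluster comparison
    INSERT INTO log_change_action (batch_id, action, record_count)
    SELECT p_batch_id, action, count(*) FROM batch GROUP BY action;

    -- Step 5: return the batch so the caller can COPY it TO STDOUT
    RETURN QUERY SELECT * FROM batch;
END;
$$;</code></pre>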
    <div>
      <h3>Applying changes in the destination database</h3>
      <a href="#applying-changes-in-the-destination-database">
        
      </a>
    </div>
    <p>Now, with a batch of data in the <i>migrate_log_cf_rec </i>table, we called a newly created function <i>log_change_apply</i> to apply and audit the changes. Once again, this was all executed within a single database transaction. The function did the following:</p><ol><li><p>Move a batch from the <i>migrate_log_cf_rec</i> table to a new temporary table.</p></li><li><p>Write the counts for the batch_id to the <i>log_change_action</i> table.</p></li><li><p>Delete from the temporary table all but the latest record for a unique id (last action). For example, an insert followed by 30 updates would have a single record left, the final update. There is no need to apply all the intermediate updates.</p></li><li><p>Delete from <i>cf_rec</i> any record that has a corresponding change.</p></li><li><p>Insert any [i]nsert or [u]pdate records into <i>cf_rec</i>.</p></li><li><p>Copy the batch to <i>applied_migrate_log_cf_rec</i> for a full audit trail.</p></li></ol>
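    <p>Conceptually, steps 3 through 5 reduce to three statements once the batch sits in a temporary table. This is a sketch with illustrative column names, not the production function:</p>
            <pre><code>-- Step 3: keep only the latest change per rec_id
DELETE FROM batch b
USING batch newer
WHERE newer.rec_id = b.rec_id
  AND newer.change_id > b.change_id;

-- Step 4: remove the old version of every changed record
DELETE FROM cf_rec r USING batch b WHERE r.rec_id = b.rec_id;

-- Step 5: re-insert the final state for inserts and updates
-- (for deletes, the removal above is the whole job)
INSERT INTO cf_rec (rec_id, zone_id, name)
SELECT rec_id, zone_id, name FROM batch WHERE action IN ('I', 'U');</code></pre>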
    <div>
      <h3>Putting it all together</h3>
      <a href="#putting-it-all-together">
        
      </a>
    </div>
    <p>There were 4 distinct phases, each of which was part of a different database transaction:</p><ol><li><p>Call <i>fn_log_change_transfer_log_cf_rec </i>in <b>cfdb </b>to get a batch of records.</p></li><li><p>Copy the batch of records to <b>dnsdb.</b></p></li><li><p>Call <i>log_change_apply </i>in <b>dnsdb </b>to apply the batch of records.</p></li><li><p>Compare the <i>log_change_action</i> table in each respective database to ensure counts match.</p></li></ol>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2REIq71tc7M4jKPLZSJzS9/11f22f700300f2ad3a5ee5ca85a75480/Applying_changes_in_the_destination_database.png" />
          </figure><p>This process was run every 3 seconds for several weeks before the migration to ensure that we could keep <b>dnsdb</b> in sync with <b>cfdb</b>.</p>
    <div>
      <h2>Managing which database is live</h2>
      <a href="#managing-which-database-is-live">
        
      </a>
    </div>
    <p>The last major pre-migration task was the construction of the request locking system that would be used throughout the actual migration. The aim was to create a system that let the database communicate with the DNS Records API, so that the API could handle HTTP connections more gracefully. If done correctly, this could reduce downtime for DNS Records API users to nearly zero.</p><p>In order to facilitate this, a new table called <i>cf_migration_manager</i> was created. The table would be periodically polled by the DNS Records API, communicating two critical pieces of information:</p><ol><li><p><b>Which database was active.</b> Here we just used a simple A or B naming convention.</p></li><li><p><b>Whether the database was locked for writing.</b> If the database was locked for writing, the DNS Records API would hold HTTP requests until the lock was released by the database.</p></li></ol><p>Both pieces of information were controlled by a migration manager script.</p><p>The benefit of having migrated the 20+ internal services from direct database access to our internal DNS Records gRPC API is that we could control access to the database, ensuring that nothing wrote to it without going through the <i>cf_migration_manager</i>.</p>
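<p>A hypothetical sketch of the API-side polling loop (names and fields are illustrative; the real system polls the <i>cf_migration_manager</i> table):</p>

```python
import time
from collections import deque

def handle_write_request(fetch_state, do_write, request,
                         timeout=10.0, interval=0.05):
    """Hold a write request while the migration manager reports the
    database locked, then route it to whichever database (A or B) is active."""
    deadline = time.monotonic() + timeout
    while True:
        state = fetch_state()  # stands in for polling cf_migration_manager
        if not state["write_locked"]:
            return do_write(state["active_db"], request)
        if time.monotonic() > deadline:
            raise TimeoutError("database still locked for writes")
        time.sleep(interval)   # request is held, not rejected

# Simulated manager states: locked on A for two polls, then unlocked with B active.
states = deque([
    {"write_locked": True, "active_db": "A"},
    {"write_locked": True, "active_db": "A"},
    {"write_locked": False, "active_db": "B"},
])
result = handle_write_request(states.popleft, lambda db, req: (db, req),
                              "create example.com A 192.0.2.1")
```

<p>Holding rather than rejecting requests is what lets the cutover look like a brief latency spike to clients instead of a window of errors.</p>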
    <div>
      <h2>During the migration </h2>
      <a href="#during-the-migration">
        
      </a>
    </div>
    <p>Although we aimed to complete this migration in a matter of seconds, we announced a DNS maintenance window that could last a couple of hours just to be safe. Now that everything was set up, and both <b>cfdb</b> and <b>dnsdb</b> were roughly in sync, it was time to proceed with the migration. The steps were as follows:</p><ol><li><p>Lower the time between copies from 3s to 0.5s.</p></li><li><p>Lock <b>cfdb</b> for writes via <i>cf_migration_manager</i>. This would tell the DNS Records API to hold write connections.</p></li><li><p>Make <b>cfdb</b> read-only and migrate the last logged changes to <b>dnsdb</b>. </p></li><li><p>Enable writes to <b>dnsdb</b>. </p></li><li><p>Tell the DNS Records API via the <i>cf_migration_manager</i> that <b>dnsdb</b> is the new primary database and that write connections can proceed.</p></li></ol><p>Even though we needed to ensure that the last changes were copied to <b>dnsdb</b> before enabling writes, this entire process took no more than 2 seconds. During the migration we saw a spike in API latency as a result of the migration manager locking writes and then working through a backlog of queries, but latencies returned to normal after several minutes. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6agUpD8BQVxgDupBrwtTw3/38c96f91879c6539011866821ad6f11a/image3.png" />
          </figure><p><sup><i>DNS Records API latency and requests during the migration</i></sup></p><p>Unfortunately, due to the far-reaching impact that DNS has at Cloudflare, this was not the end of the migration. Three lesser-used services had slipped through our scan of services accessing DNS records via <b>cfdb</b>. Fortunately, the foreign table setup meant that we could very quickly fix any residual issues by simply changing the table name. </p>
    <div>
      <h2>Post-migration</h2>
      <a href="#post-migration">
        
      </a>
    </div>
    <p>Almost immediately, as expected, we saw a steep drop in usage across <b>cfdb</b>. This freed up a lot of resources for other services to take advantage of.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/Xfnbc9MZLwJB91ypItWsi/1eb21362893b31a1e3c846d1076a9f5b/image6.jpg" />
          </figure><p><sup><i><b>cfdb</b></i></sup><sup><i> usage dropped significantly after the migration period.</i></sup></p><p>Since the migration, the average number of <b>requests</b> per second to the DNS Records API has more than <b>doubled</b>. At the same time, our CPU usage across both <b>cfdb</b> and <b>dnsdb</b> has settled below 10%, as seen below, giving us room for spikes and future growth. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/39su35dkb5Pl8uwYfYjHLg/0eb26ced30b44efb71abb73830e01f3a/image2.png" />
          </figure>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5AdlLKXtD68QWCsMVLKnkt/9137beee9c941827eb57c53825ffe209/image4.png" />
          </figure><p><sup><i><b>cfdb</b></i></sup><sup><i> and </i></sup><sup><i><b>dnsdb</b></i></sup><sup><i> CPU usage now</i></sup></p><p>As a result of this improved capacity, our database-related incident rate dropped dramatically.</p><p>As for query latencies, our latency post-migration is slightly lower on average, with fewer sustained spikes above 500ms. However, the performance improvement is largely noticed during high load periods, when our database handles spikes without significant issues. Many of these spikes come as a result of clients making calls to collect a large amount of DNS records or making several changes to their zone in short bursts. Both of these actions are common use cases for large customers onboarding zones.</p><p>In addition to these improvements, the DNS team also has more granular control over <b>dnsdb</b> cluster-specific settings that can be tweaked for our needs rather than catering to all the other services. For example, we were able to make custom changes to replication lag limits to ensure that services using replicas were able to read with some amount of certainty that the data would exist in a consistent form. Measures like this reduce overall load on the primary because almost all read queries can now go to the replicas.</p><p>Although this migration was a resounding success, we are always working to improve our systems. As we grow, so do our customers, which means the need to scale never really ends. We have more exciting improvements on the roadmap, and we are looking forward to sharing more details in the future.</p><p>The DNS team at Cloudflare isn’t the only team solving challenging problems like the one above. If this sounds interesting to you, we have many more tech deep dives on our blog, and we are always looking for curious engineers to join our team — see open opportunities <a href="https://www.cloudflare.com/en-gb/careers/jobs/"><u>here</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[DNS]]></category>
            <category><![CDATA[API]]></category>
            <category><![CDATA[Database]]></category>
            <category><![CDATA[Kafka]]></category>
            <category><![CDATA[Postgres]]></category>
            <category><![CDATA[Tracing]]></category>
            <category><![CDATA[Quicksilver]]></category>
            <guid isPermaLink="false">24rozMdbFQ7jmUgRNMF4RU</guid>
            <dc:creator>Alex Fattouche</dc:creator>
            <dc:creator>Corey Horton</dc:creator>
        </item>
        <item>
            <title><![CDATA[Making zone management more efficient with batch DNS record updates]]></title>
            <link>https://blog.cloudflare.com/batched-dns-changes/</link>
            <pubDate>Mon, 23 Sep 2024 13:00:00 GMT</pubDate>
            <description><![CDATA[ In response to customer demand, we now support the ability to DELETE, PATCH, PUT and POST multiple DNS records in a single API call, enabling more efficient and reliable zone management.
 ]]></description>
            <content:encoded><![CDATA[ <p>Customers that use Cloudflare to manage their DNS often need to create a whole batch of records, enable <a href="https://developers.cloudflare.com/dns/manage-dns-records/reference/proxied-dns-records/"><u>proxying</u></a> on many records, update many records to point to a new target at the same time, or even delete all of their records. Historically, customers had to resort to bespoke scripts to make these changes, which came with their own set of issues. In response to customer demand, we are excited to announce support for batched API calls to the <a href="https://developers.cloudflare.com/dns/manage-dns-records/how-to/create-dns-records/"><u>DNS records API</u></a> starting today. This lets customers make large changes to their zones much more efficiently than before. Users can now combine all four of these <a href="https://en.wikipedia.org/wiki/HTTP#Request_methods"><u>HTTP methods</u></a> (POST, PUT, PATCH, and DELETE) in a single call, replacing what would previously have required many separate HTTP requests.</p>
    <div>
      <h2>Efficient zone management matters</h2>
      <a href="#efficient-zone-management-matters">
        
      </a>
    </div>
    <p><a href="https://www.cloudflare.com/en-gb/learning/dns/dns-records/"><u>DNS records</u></a> are an essential part of most web applications and websites, and they serve many different purposes. The most common use case for a DNS record is to have a hostname point to an <a href="https://en.wikipedia.org/wiki/IPv4"><u>IPv4</u></a> address; this is called an <a href="https://www.cloudflare.com/en-gb/learning/dns/dns-records/dns-a-record/"><u>A record</u></a>:</p><p><b>example.com</b> 59 IN A <b>198.51.100.0</b></p><p><b>blog.example.com</b> 59 IN A <b>198.51.100.1</b></p><p><b>ask.example.com</b> 59 IN A <b>198.51.100.2</b></p><p>In its most simple form, this enables Internet users to connect to websites without needing to memorize their IP addresses. </p><p>Often, our customers need to create a whole batch of records, enable <a href="https://developers.cloudflare.com/dns/manage-dns-records/reference/proxied-dns-records/"><u>proxying</u></a> on many records, update many records to point to a new target at the same time, or even delete all of their records. Unfortunately, for most of these cases, we were asking customers to write their own custom scripts or programs to do these tasks, a number of which are open source and have not been reviewed by us. These scripts are often used to avoid repeatedly making the same API calls by hand. This takes time, not only to develop the scripts, but also to simply execute all the API calls, and it can leave the zone in a bad state if some changes fail while others succeed.</p>
    <div>
      <h2>Introducing /batch</h2>
      <a href="#introducing-batch">
        
      </a>
    </div>
    <p>Starting today, everyone with a <a href="https://developers.cloudflare.com/dns/zone-setups/"><u>Cloudflare zone</u></a> has access to this endpoint: free tier customers can make up to 200 changes in one batch, and paid plans up to 3,500. We have successfully tested up to 100,000 changes in one call. The API is simple, expecting a POST request to the <a href="https://developers.cloudflare.com/api/operations/dns-records-for-a-zone-batch-dns-records"><u>new API endpoint</u></a> /dns_records/batch with a JSON object in the body in the format:</p>
            <pre><code>{
    deletes:[]Record
    patches:[]Record
    puts:[]Record
    posts:[]Record
}
</code></pre>
            <p>Each list of records []Record will follow the same requirements as the regular API, except that the record ID on deletes, patches, and puts will be required within the Record object itself. Here is a simple example:</p>
            <pre><code>{
    "deletes": [
        {
            "id": "143004ef463b464a504bde5a5be9f94a"
        },
        {
            "id": "165e9ef6f325460c9ca0eca6170a7a23"
        }
    ],
    "patches": [
        {
            "id": "16ac0161141a4e62a79c50e0341de5c6",
            "content": "192.0.2.45"
        },
        {
            "id": "6c929ea329514731bcd8384dd05e3a55",
            "name": "update.example.com",
            "proxied": true
        }
    ],
    "puts": [
        {
            "id": "ee93eec55e9e45f4ae3cb6941ffd6064",
            "content": "192.0.2.50",
            "name": "no-change.example.com",
            "proxied": false,
            "ttl": 1
        },
        {
            "id": "eab237b5a67e41319159660bc6cfd80b",
            "content": "192.0.2.45",
            "name": "no-change.example.com",
            "proxied": false,
            "ttl": 3000
        }
    ],
    "posts": [
        {
            "name": "@",
            "type": "A",
            "content": "192.0.2.45",
            "proxied": false,
            "ttl": 3000
        },
        {
            "name": "a.example.com",
            "type": "A",
            "content": "192.0.2.45",
            "proxied": true
        }
    ]
}</code></pre>
            <p>Our API will then parse this and execute these calls in the following order: </p><ol><li><p>deletes</p></li><li><p>patches</p></li><li><p>puts</p></li><li><p>posts</p></li></ol><p>Each of these respective lists will be executed in the order given. This ordering system is important because it removes the need for our clients to worry about conflicts, such as if they need to create a CNAME on the same hostname as a to-be-deleted A record, which is not allowed in <a href="https://datatracker.ietf.org/doc/html/rfc1912#section-2.4"><u>RFC 1912</u></a>. In the event that any of these individual actions fail, the entire API call will fail and return the first error it sees. The batch request will also be executed inside a single database <a href="https://en.wikipedia.org/wiki/Database_transaction"><u>transaction</u></a>, which will roll back in the event of failure.</p><p>After the batch request has been successfully executed in our database, we then propagate the changes to our edge via <a href="https://blog.cloudflare.com/introducing-quicksilver-configuration-distribution-at-internet-scale"><u>Quicksilver</u></a>, our distributed KV store. Each of the individual record changes inside the batch request is treated as a single key-value pair, and database transactions are not supported. As such, <b>we cannot guarantee that the propagation to our edge servers will be atomic</b>. For example, if replacing a <a href="https://developers.cloudflare.com/dns/manage-dns-records/how-to/subdomains-outside-cloudflare/"><u>delegation</u></a> with an A record, some resolvers may see the <a href="https://www.cloudflare.com/en-gb/learning/dns/dns-records/dns-ns-record/"><u>NS</u></a> record removed before the A record is added. </p><p>The response will follow the same format as the request. 
Patches and puts that result in no changes will be placed at the end of their respective lists.</p><p>We are also introducing some new changes to the Cloudflare dashboard, allowing users to select multiple records and subsequently:</p><ol><li><p>Delete all selected records</p></li><li><p>Change the proxy status of all selected records</p></li></ol>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1ZU7nvMlcH2L51IqJrS1zC/db7ac600e503a72bb0c25679d63394e7/BLOG-2495_2.png" />
          </figure><p>We plan to continue improving the dashboard to support more batch actions based on your feedback.</p>
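<p>The ordering and rollback behavior described earlier can be modeled with a small in-memory sketch (hypothetical: the real endpoint assigns record IDs itself and runs inside a single database transaction):</p>

```python
def apply_dns_batch(records, batch):
    """Apply a /batch payload in the documented order: deletes, patches,
    puts, posts. If any single action fails, restore the original state,
    mimicking the endpoint's rolled-back transaction."""
    snapshot = {rid: dict(rec) for rid, rec in records.items()}
    try:
        for rec in batch.get("deletes", []):
            del records[rec["id"]]                  # KeyError if the id is unknown
        for rec in batch.get("patches", []):        # partial update of named fields
            records[rec["id"]].update({k: v for k, v in rec.items() if k != "id"})
        for rec in batch.get("puts", []):           # full replacement by id
            records[rec["id"]] = {k: v for k, v in rec.items() if k != "id"}
        for i, rec in enumerate(batch.get("posts", [])):
            records[f"new-{i}"] = dict(rec)         # placeholder ids for new records
        return records
    except KeyError as err:
        records.clear()
        records.update(snapshot)                    # roll back on the first error
        raise ValueError(f"batch failed and was rolled back: {err}")
```

<p>Because deletes run first, replacing an A record with a CNAME on the same hostname succeeds within one batch, exactly the conflict case the ordering is designed to avoid.</p>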
    <div>
      <h2>The journey</h2>
      <a href="#the-journey">
        
      </a>
    </div>
    <p>Although on the surface this batch endpoint may seem like a fairly simple change, behind the scenes it is the culmination of a multi-year, multi-team effort. Over the past several years, we have been working hard to improve the DNS pipeline that takes our customers' records and pushes them to <a href="https://blog.cloudflare.com/introducing-quicksilver-configuration-distribution-at-internet-scale"><u>Quicksilver</u></a>, our distributed database. As part of this effort, we have been improving our <a href="https://developers.cloudflare.com/api/operations/dns-records-for-a-zone-list-dns-records"><u>DNS Records API</u></a> to reduce the overall latency. The DNS Records API is Cloudflare's most used external API, serving twice as many requests as any other API at peak. In addition, the DNS Records API supports over 20 internal services, many of which touch very important areas such as DNSSEC, TLS, Email, Tunnels, Workers, Spectrum, and R2 storage. Therefore, it was important to build something that scales. </p><p>To improve API performance, we first needed to understand the complexities of the entire stack. At Cloudflare, we use <a href="https://www.jaegertracing.io/"><u>Jaeger tracing</u></a> to debug our systems. It gives us granular insights into a sample of requests that are coming into our APIs. When looking at API request latency, the <a href="https://www.jaegertracing.io/docs/1.23/architecture/#span"><u>span</u></a> that stood out was the time spent on each individual database lookup. The latency here can vary anywhere from ~1ms to ~5ms. </p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/61f3sKGUs9oWMPT9P4au6R/a91d8291b626f4bab3ac1c69adf62a5d/BLOG-2495_3.png" />
          </figure>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3L3OaTb9cTKKKcIjCm1RLq/86ffd63116988025fd52105e316c5b5a/BLOG-2495_4.png" />
          </figure><p><sub><i>Jaeger trace showing variable database latency</i></sub></p><p>Given this variability in database query latency, we wanted to understand exactly what was going on within each DNS Records API request. When we first started on this journey, the breakdown of database lookups for each action was as follows:</p><table><tr><th><p><b>Action</b></p></th><th><p><b>Database Queries</b></p></th><th><p><b>Reason</b></p></th></tr><tr><td><p>POST</p></td><td><p>2 </p></td><td><p>One to write and one to read the new record.</p></td></tr><tr><td><p>PUT</p></td><td><p>3</p></td><td><p>One to collect, one to write, and one to read back the new record.</p></td></tr><tr><td><p>PATCH</p></td><td><p>3</p></td><td><p>One to collect, one to write, and one to read back the new record.</p></td></tr><tr><td><p>DELETE</p></td><td><p>2</p></td><td><p>One to read and one to delete.</p></td></tr></table><p>The reason we needed to read the newly created records on POST, PUT, and PATCH was because the record contains information filled in by the database which we cannot infer in the API. </p><p>Let’s imagine that a customer needed to edit 1,000 records. If each database lookup took 3ms to complete, that was 3ms * 3 lookups * 1,000 records = 9 seconds spent on database queries alone, not taking into account the round trip time to and from our API or any other processing latency. It’s clear that we needed to reduce the number of overall queries and ideally minimize per query latency variation. Let’s tackle the variation in latency first.</p><p>Each of these calls is not a simple INSERT, UPDATE, or DELETE, because we have functions wrapping these database calls for sanitization purposes. In order to understand the variable latency, we enlisted the help of <a href="https://www.postgresql.org/docs/current/auto-explain.html"><u>PostgreSQL’s “auto_explain”</u></a>. 
This module gives a breakdown of execution times for each statement without needing to EXPLAIN each one by hand. We used the following settings:</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2myvmIREh2Q9yl30HbRus/29f085d40ba7dde34e9a46c27e3c6ba2/BLOG-2495_5.png" />
          </figure><p>A handful of queries showed durations like the one below, which took an order of magnitude longer than other queries.</p>
          <figure>
          <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/557xg66x8OiHM6pcAG4svk/56157cd0e5b6d7fd47f0152798598729/BLOG-2495_6.png" />
          </figure><p>We noticed that in several locations we were doing queries like:</p><p><code>IF (EXISTS (SELECT id FROM table WHERE row_hash = __new_row_hash))</code></p><p>If you are trying to insert into very large zones, such queries could mean even longer database query times, potentially explaining the discrepancy between 1ms and 5ms in our tracing images above. Upon further investigation, we found that we already had a unique index on that exact hash. <a href="https://www.postgresql.org/docs/current/indexes-unique.html"><u>Unique indexes</u></a> in PostgreSQL enforce the uniqueness of one or more column values, which means we can safely remove those existence checks without risk of inserting duplicate rows.</p><p>The next task was to introduce database batching into our DNS Records API. In any API, external calls such as SQL queries are going to add substantial latency to the request. Database batching allows the DNS Records API to execute multiple SQL queries within a single network call, subsequently lowering the number of database round trips our system needs to make. </p><p>According to the table above, each database write was also followed by a read once the query had completed. This was needed to collect information like creation/modification timestamps and new IDs. To improve this, we tweaked our database functions to return the newly created DNS record itself, removing a full round trip to the database. 
Here is the updated table:</p><table><tr><th><p><b>Action</b></p></th><th><p><b>Database Queries</b></p></th><th><p><b>Reason</b></p></th></tr><tr><td><p>POST</p></td><td><p>1 </p></td><td><p>One to write</p></td></tr><tr><td><p>PUT</p></td><td><p>2</p></td><td><p>One to read, one to write.</p></td></tr><tr><td><p>PATCH</p></td><td><p>2</p></td><td><p>One to read, one to write.</p></td></tr><tr><td><p>DELETE</p></td><td><p>2</p></td><td><p>One to read, one to delete.</p></td></tr></table><p>We have room for improvement here, however we cannot easily reduce this further due to some restrictions around auditing and other sanitization logic.</p><p><b>Results:</b></p><table><tr><th><p><b>Action</b></p></th><th><p><b>Average database time before</b></p></th><th><p><b>Average database time after</b></p></th><th><p><b>Percentage Decrease</b></p></th></tr><tr><td><p>POST</p></td><td><p>3.38ms</p></td><td><p>0.967ms</p></td><td><p>71.4%</p></td></tr><tr><td><p>PUT</p></td><td><p>4.47ms</p></td><td><p>2.31ms</p></td><td><p>48.4%</p></td></tr><tr><td><p>PATCH</p></td><td><p>4.41ms</p></td><td><p>2.24ms</p></td><td><p>49.3%</p></td></tr><tr><td><p>DELETE</p></td><td><p>1.21ms</p></td><td><p>1.21ms</p></td><td><p>0%</p></td></tr></table><p>These are some pretty good improvements! Not only did we reduce the API latency, we also reduced the database query load, benefiting other systems as well.</p>
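<p>The pattern of dropping the existence check in favor of the unique index itself can be illustrated with SQLite (a sketch with an illustrative schema, not our actual tables):</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE record_hashes (
    id INTEGER PRIMARY KEY,
    row_hash TEXT UNIQUE          -- the unique index does the checking for us
)""")

def insert_hash(conn, row_hash):
    """Insert without a prior `IF EXISTS (SELECT ...)` query: a duplicate
    simply violates the unique index, saving a lookup on the happy path."""
    try:
        with conn:
            conn.execute("INSERT INTO record_hashes (row_hash) VALUES (?)",
                         (row_hash,))
        return True
    except sqlite3.IntegrityError:
        return False  # same outcome as the old existence check, one query fewer

first = insert_hash(conn, "1f2d3c")
second = insert_hash(conn, "1f2d3c")  # duplicate is rejected by the index
```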
    <div>
      <h2>Weren’t we talking about batching?</h2>
      <a href="#werent-we-talking-about-batching">
        
      </a>
    </div>
    <p>I previously mentioned that the /batch endpoint is fully atomic, making use of a single database transaction. However, a single transaction may still require multiple database network calls, and from the table above, that can add up to a significant amount of time when dealing with large batches. To optimize this, we are making use of <a href="https://pkg.go.dev/github.com/jackc/pgx/v4#Batch"><u>pgx/batch</u></a>, a Golang object that allows us to write and subsequently read multiple queries in a single network call. Here is a high-level view of how the batch endpoint works:</p><ol><li><p>Collect all the records for the PUTs, PATCHes and DELETEs.</p></li><li><p>Apply any per-record differences as requested by the PATCHes and PUTs.</p></li><li><p>Format the batch SQL query to include each of the actions.</p></li><li><p>Execute the batch SQL query in the database.</p></li><li><p>Parse each database response and return any errors if needed.</p></li><li><p>Audit each change.</p></li></ol><p>This takes at most two database calls per batch: one to fetch, and one to write/delete. If the batch contains only POSTs, this is further reduced to a single database call. Given all of this, we should expect to see a significant improvement in latency when making multiple changes, which we do when observing how these various endpoints perform: </p><p><i>Note: Each of these queries was run from multiple locations around the world and the median response times are shown here. The server responding to queries is located in Portland, Oregon, United States. Latencies are subject to change depending on geographical location.</i></p><p><b>Create only:</b></p><table><tr><th><p>
</p></th><th><p><b>10 Records</b></p></th><th><p><b>100 Records</b></p></th><th><p><b>1,000 Records</b></p></th><th><p><b>10,000 Records</b></p></th></tr><tr><td><p><b>Regular API</b></p></td><td><p>7.55s</p></td><td><p>74.23s</p></td><td><p>757.32s</p></td><td><p>7,877.14s</p></td></tr><tr><td><p><b>Batch API - Without database batching</b></p></td><td><p>0.85s</p></td><td><p>1.47s</p></td><td><p>4.32s</p></td><td><p>16.58s</p></td></tr><tr><td><p><b>Batch API - with database batching</b></p></td><td><p>0.67s</p></td><td><p>1.21s</p></td><td><p>3.09s</p></td><td><p>10.33s</p></td></tr></table><p><b>Delete only:</b></p><table><tr><th><p>
</p></th><th><p><b>10 Records</b></p></th><th><p><b>100 Records</b></p></th><th><p><b>1,000 Records</b></p></th><th><p><b>10,000 Records</b></p></th></tr><tr><td><p><b>Regular API</b></p></td><td><p>7.28s</p></td><td><p>67.35s</p></td><td><p>658.11s</p></td><td><p>7,471.30s</p></td></tr><tr><td><p><b>Batch API - without database batching</b></p></td><td><p>0.79s</p></td><td><p>1.32s</p></td><td><p>3.18s</p></td><td><p>17.49s</p></td></tr><tr><td><p><b>Batch API - with database batching</b></p></td><td><p>0.66s</p></td><td><p>0.78s</p></td><td><p>1.68s</p></td><td><p>7.73s</p></td></tr></table><p><b>Create/Update/Delete:</b></p><table><tr><th><p>
</p></th><th><p><b>10 Records</b></p></th><th><p><b>100 Records</b></p></th><th><p><b>1,000 Records</b></p></th><th><p><b>10,000 Records</b></p></th></tr><tr><td><p><b>Regular API</b></p></td><td><p>7.11s</p></td><td><p>72.41s</p></td><td><p>715.36s</p></td><td><p>7,298.17s</p></td></tr><tr><td><p><b>Batch API - without database batching</b></p></td><td><p>0.79s</p></td><td><p>1.36s</p></td><td><p>3.05s</p></td><td><p>18.27s</p></td></tr><tr><td><p><b>Batch API - with database batching</b></p></td><td><p>0.74s</p></td><td><p>1.06s</p></td><td><p>2.17s</p></td><td><p>8.48s</p></td></tr></table><p><b>Overall Average:</b></p><table><tr><th><p>
</p></th><th><p><b>10 Records</b></p></th><th><p><b>100 Records</b></p></th><th><p><b>1,000 Records</b></p></th><th><p><b>10,000 Records</b></p></th></tr><tr><td><p><b>Regular API</b></p></td><td><p>7.31s</p></td><td><p>71.33s</p></td><td><p>710.26s</p></td><td><p>7,548.87s</p></td></tr><tr><td><p><b>Batch API - without database batching</b></p></td><td><p>0.81s</p></td><td><p>1.38s</p></td><td><p>3.51s</p></td><td><p>17.44s</p></td></tr><tr><td><p><b>Batch API - with database batching</b></p></td><td><p>0.69s</p></td><td><p>1.02s</p></td><td><p>2.31s</p></td><td><p>8.85s</p></td></tr></table><p>We can see that on average, the new batching API is significantly faster than the regular API trying to do the same actions, and it’s also nearly twice as fast as the batching API without batched database calls. We can see that at 10,000 records, the batching API is a staggering 850x faster than the regular API. As mentioned above, these numbers are likely to change for a number of different reasons, but it’s clear that making several round trips to and from the API adds substantial latency, regardless of the region.</p>
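<p>The idea behind the pgx batch, queuing statements and sending them in one flush rather than one round trip each, can be sketched as follows (illustrative only; pgx does this at the PostgreSQL protocol level):</p>

```python
import sqlite3

class QueryBatch:
    """Queue SQL statements and execute them together in one flush,
    loosely modeling pgx's single-network-call batch."""
    def __init__(self):
        self._queue = []

    def add(self, sql, params=()):
        self._queue.append((sql, params))

    def flush(self, conn):
        results = []
        with conn:  # one transaction standing in for one network round trip
            for sql, params in self._queue:
                results.append(conn.execute(sql, params).fetchall())
        self._queue.clear()
        return results

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dns (name TEXT, content TEXT)")
batch = QueryBatch()
batch.add("INSERT INTO dns VALUES (?, ?)", ("a.example.com", "192.0.2.45"))
batch.add("INSERT INTO dns VALUES (?, ?)", ("b.example.com", "192.0.2.46"))
batch.add("SELECT name FROM dns ORDER BY name")
results = batch.flush(conn)
```

<p>The caller pays the round-trip cost once per flush instead of once per statement, which is where the latency savings in the tables above come from.</p>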
    <div>
      <h2>Batch overload</h2>
      <a href="#batch-overload">
        
      </a>
    </div>
    <p>Making our API faster is awesome, but we don’t operate in an isolated environment. Each of these records needs to be processed and pushed to <a href="https://blog.cloudflare.com/introducing-quicksilver-configuration-distribution-at-internet-scale"><u>Quicksilver</u></a>, our distributed database. If we have customers creating tens of thousands of records every 10 seconds, we need to be able to handle this downstream so that we don’t overwhelm our system. In a May 2022 blog post titled <a href="https://blog.cloudflare.com/dns-build-improvement"><i><u>How we improved DNS record build speed by more than 4,000x</u></i></a>, I noted<i> </i>that:</p><blockquote><p><i>We plan to introduce a batching system that will collect record changes into groups to minimize the number of queries we make to our database and Quicksilver.</i></p></blockquote><p>This task has since been completed, and our propagation pipeline is now able to batch thousands of record changes into a single database query which can then be published to Quicksilver in order to be propagated to our global network. </p>
    <div>
      <h2>Next steps</h2>
      <a href="#next-steps">
        
      </a>
    </div>
    <p>We have a few more improvements we plan to bring to the API. We also intend to improve the dashboard UI to make managing zones easier. <a href="https://research.rallyuxr.com/cloudflare/lp/cm0zu2ma7017j1al98l1m8a7n?channel=share&amp;studyId=cm0zu2ma4017h1al9byak79iw"><u>We would love to hear your feedback</u></a>, so please let us know what you think and if you have any suggestions for improvements.</p><p>For more details on how to use the new /batch API endpoint, head over to our <a href="https://developers.cloudflare.com/dns/manage-dns-records/how-to/batch-record-changes/"><u>developer documentation</u></a> and <a href="https://developers.cloudflare.com/api/operations/dns-records-for-a-zone-batch-dns-records"><u>API reference</u></a>.</p> ]]></content:encoded>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[DNS]]></category>
            <category><![CDATA[API]]></category>
            <category><![CDATA[Kafka]]></category>
            <category><![CDATA[Database]]></category>
            <guid isPermaLink="false">op0CI3wllMcGjptdRb2Ce</guid>
            <dc:creator>Alex Fattouche</dc:creator>
        </item>
        <item>
            <title><![CDATA[Building D1: a Global Database]]></title>
            <link>https://blog.cloudflare.com/building-d1-a-global-database/</link>
            <pubDate>Mon, 01 Apr 2024 13:00:41 GMT</pubDate>
            <description><![CDATA[ D1, Cloudflare’s SQL database, is now generally available.  ]]></description>
            <content:encoded><![CDATA[ <p></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/76hMKeBHewbCLm4XlVS4zL/92271c25576185cad1ab5e70e29ede58/image2-33.png" />
            
            </figure><p>Developers who build Worker applications focus on what they're creating, not the infrastructure required, and benefit from the global reach of <a href="https://www.cloudflare.com/network/">Cloudflare's network</a>. Many applications require persistent data, from personal projects to business-critical workloads. Workers offer various <a href="https://developers.cloudflare.com/workers/platform/storage-options/">database and storage options</a> tailored to developer needs, such as key-value and <a href="https://www.cloudflare.com/learning/cloud/what-is-object-storage/">object storage</a>.</p><p>Relational databases are the backbone of many applications today. <a href="https://developers.cloudflare.com/d1/">D1</a>, Cloudflare's relational database complement, is now generally available. Our journey from alpha in late 2022 to GA in April 2024 focused on enabling developers to build production workloads with the familiarity of relational data and SQL.</p>
    <div>
      <h3>What’s D1?</h3>
      <a href="#whats-d1">
        
      </a>
    </div>
    <p>D1 is Cloudflare's built-in, serverless relational database. For Worker applications, D1 offers SQL's expressiveness, leveraging SQLite's SQL dialect, and developer tooling integrations, including object-relational mappers (ORMs) like <a href="https://orm.drizzle.team/docs/connect-cloudflare-d1">Drizzle ORM</a>. D1 is accessible via <a href="https://developers.cloudflare.com/d1/build-with-d1/d1-client-api/">Workers</a> or an <a href="https://developers.cloudflare.com/api/operations/cloudflare-d1-create-database">HTTP API</a>.</p><p>Serverless means no provisioning, default disaster recovery with <a href="https://developers.cloudflare.com/d1/reference/time-travel/">Time Travel</a>, and <a href="https://developers.cloudflare.com/d1/platform/pricing/">usage-based pricing</a>. D1 includes a generous free tier that allows developers to experiment with D1 and then graduate those trials to production.</p>
    <div>
      <h3>How to make data global?</h3>
      <a href="#how-to-make-data-global">
        
      </a>
    </div>
    <p>D1 GA has focused on reliability and developer experience. Now, we plan on extending D1 to better support globally-distributed applications.</p><p>In the Workers model, an incoming request invokes serverless execution in the closest data center. A Worker application can scale globally with user requests. Application data, however, remains stored in centralized databases, and global user traffic must account for access round trips to data locations. For example, a D1 database today resides in a single location.</p><p>Workers support <a href="https://developers.cloudflare.com/workers/configuration/smart-placement">Smart Placement</a> to account for frequently accessed data locality. Smart Placement invokes a Worker closer to centralized backend services like databases to lower latency and improve application performance. We’ve addressed Workers placement in global applications, but need to solve data placement.</p><p>The question, then, is how can D1, as Cloudflare’s <a href="https://www.cloudflare.com/developer-platform/products/d1/">built-in database solution</a>, better support data placement for global applications? The answer is asynchronous read replication.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1I58tQFeSOcIyqrGfv9UnB/c1bc267b8cd2eb09332ae909429aeb5b/image4-30.png" />
            
            </figure>
    <div>
      <h3>What is asynchronous read replication?</h3>
      <a href="#what-is-asynchronous-read-replication">
        
      </a>
    </div>
    <p>In a server-based database management system, like Postgres, MySQL, SQL Server, or Oracle, a <b><i>read replica</i></b> is a separate database server that serves as a read-only, almost up-to-date copy of the primary database server. An administrator creates a read replica by starting a new server from a snapshot of the primary server and configuring the primary server to send updates asynchronously to the replica server. Since the updates are asynchronous, the read replica may be behind the current state of the primary server. The difference between the primary server and a replica is called <b><i>replica lag</i></b>. It's possible to have more than one read replica.</p><p>Asynchronous read replication is a time-proven solution for improving the performance of databases:</p><ul><li><p>It's possible to increase throughput by distributing load across multiple replicas.</p></li><li><p>It's possible to lower query latency when the replicas are close to the users making queries.</p></li></ul><p>Note that some database systems also offer synchronous replication. In a synchronous replicated system, writes must wait until all replicas have confirmed the write. Synchronous replicated systems can run only as fast as the slowest replica and come to a halt when a replica fails. If we’re trying to improve performance on a global scale, we want to avoid synchronous replication as much as possible!</p>
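<p>The trade-off between the two replication modes reduces to simple arithmetic. A minimal sketch (illustrative numbers, not measurements):</p>

```typescript
// Synchronous commit waits for every replica, so it runs at the speed of
// the slowest one; asynchronous commit only waits for the primary, and
// replicas catch up later (that delay is the replica lag).

function syncCommitMs(primaryMs: number, replicaMs: number[]): number {
  return Math.max(primaryMs, ...replicaMs); // gated on the slowest replica
}

function asyncCommitMs(primaryMs: number): number {
  return primaryMs; // replicas apply the update asynchronously
}

// Primary commits in 5 ms; three replicas confirm in 20 ms, 180 ms, 900 ms:
console.log(syncCommitMs(5, [20, 180, 900])); // 900
console.log(asyncCommitMs(5));                // 5
```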
    <div>
      <h3>Consistency models &amp; read replicas</h3>
      <a href="#consistency-models-read-replicas">
        
      </a>
    </div>
    <p>Most database systems provide <a href="https://jepsen.io/consistency/models/read-committed">read committed</a>, <a href="https://jepsen.io/consistency/models/snapshot-isolation">snapshot isolation</a>, or <a href="https://jepsen.io/consistency/models/serializable">serializable</a> consistency models, depending on their configuration. For example, Postgres <a href="https://jepsen.io/consistency/models/read-committed">defaults to read committed</a> but can be configured to use stronger modes. SQLite provides <a href="https://www.sqlite.org/draft/isolation.html">snapshot isolation in WAL mode</a>. Stronger modes like snapshot isolation or serializable are easier to program against because they limit the permitted system concurrency scenarios and the kind of concurrency race conditions the programmer has to worry about.</p><p>Read replicas are updated independently, so each replica's contents may differ at any moment. If all of your queries go to the same server, whether the primary or a read replica, your results should be consistent according to whatever <a href="https://jepsen.io/consistency">consistency model</a> your underlying database provides. If you're using a read replica, the results may just be a little old.</p><p>In a server-based database with read replicas, it's important to stick with the same server for all of the queries in a session. If you switch among different read replicas in the same session, you compromise the consistency model provided by your database, which may violate your assumptions about how the database acts and cause your application to return incorrect results!</p><p><b>Example:</b> suppose there are two replicas, A and B. Replica A lags the primary database by 100ms, and replica B lags the primary database by 2s. Suppose a user wishes to:</p><ol><li><p>Execute query 1</p><p>1a. Do some computation based on query 1 results</p></li><li><p>Execute query 2 based on the results of the computation in (1a)</p></li></ol><p>At time t=10s, query 1 goes to replica A and returns. Query 1 sees what the primary database looked like at t=9.9s. Suppose it takes 500ms to do the computation, so at t=10.5s, query 2 goes to replica B. Remember, replica B lags the primary database by 2s, so at t=10.5s, query 2 sees what the database looks like at t=8.5s. As far as the application is concerned, the results of query 2 look like the database has gone backwards in time!</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2R1p29j20c7szuRmY2Sjlp/52e4982c6c45e18c4d0c18835931b016/image3-34.png" />
            
            </figure><p>Formally, this is <a href="https://jepsen.io/consistency/models/read-committed">read committed</a> consistency since your queries will only see committed data, but there’s no other guarantee - not even that you can read your own writes. While read committed is a valid consistency model, it’s hard to reason about all of the possible race conditions the read committed model allows, making it difficult to write applications correctly.</p>
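<p>The arithmetic in the example can be checked directly; this small sketch reproduces it:</p>

```typescript
// A query at wall-clock time t against a replica lagging by `lagSeconds`
// observes the primary's state as of t - lagSeconds.
function observedStateTime(queryTimeSeconds: number, lagSeconds: number): number {
  return queryTimeSeconds - lagSeconds;
}

const q1 = observedStateTime(10.0, 0.1); // query 1 at t=10s on replica A: sees t=9.9s
const q2 = observedStateTime(10.5, 2.0); // query 2 at t=10.5s on replica B: sees t=8.5s

// Query 2 runs *later* but observes an *earlier* database state:
console.log(q2 < q1); // true -- the "gone backwards in time" anomaly
```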
    <div>
      <h3>D1’s consistency model &amp; read replicas</h3>
      <a href="#d1s-consistency-model-read-replicas">
        
      </a>
    </div>
    <p>By default, D1 provides the <a href="https://jepsen.io/consistency/models/snapshot-isolation">snapshot isolation</a> that SQLite provides.</p><p>Snapshot isolation is a familiar consistency model that most developers find easy to use. We implement this consistency model in D1 by ensuring at most one active copy of the D1 database and routing all HTTP requests to that single database. While ensuring that there's at most one active copy of the D1 database is a gnarly distributed systems problem, it's one that we’ve solved by building D1 using <a href="https://developers.cloudflare.com/durable-objects/">Durable Objects</a>. Durable Objects guarantee global uniqueness, so once we depend on Durable Objects, routing HTTP requests is easy: just send them to the D1 Durable Object.</p><p>This trick doesn't work if you have multiple active copies of the database since there's no 100% reliable way to look at a generic incoming HTTP request and route it to the same replica 100% of the time. Unfortunately, as we saw in the previous section's example, if we don't route related requests to the same replica 100% of the time, the best consistency model we can provide is read committed.</p><p>Given that it's impossible to route to a particular replica consistently, another approach is to route requests to any replica and ensure that the chosen replica responds to requests according to a consistency model that "makes sense" to the programmer. If we're willing to include a <a href="https://en.wikipedia.org/wiki/Lamport_timestamp">Lamport timestamp</a> in our requests, we can implement <a href="https://jepsen.io/consistency/models/sequential">sequential consistency</a> using any replica. The sequential consistency model has important properties like "<a href="https://jepsen.io/consistency/models/read-your-writes">read my own writes</a>" and "<a href="https://jepsen.io/consistency/models/writes-follow-reads">writes follow reads</a>," as well as a total ordering of writes. 
The total ordering of writes means that every replica will see transactions commit in the same order, which is exactly the behavior we want in a transactional system. Sequential consistency comes with the caveat that any individual entity in the system may be arbitrarily out of date, but that caveat is a feature for us because it allows us to consider replica lag when designing our APIs.</p><p>The idea is that if D1 gives applications a Lamport timestamp for every database query and those applications tell D1 the last Lamport timestamp they've seen, we can have each replica determine how to make queries work according to the sequential consistency model.</p><p>A robust, yet simple, way to implement sequential consistency with replicas is to:</p><ul><li><p>Associate a Lamport timestamp with every single request to the database. A monotonically increasing commit token works well for this.</p></li><li><p>Send all write queries to the primary database to ensure the total ordering of writes.</p></li><li><p>Send read queries to any replica, but have the replica delay servicing the query until the replica receives updates from the primary database that are later than the Lamport timestamp in the query.</p></li></ul><p>What's nice about this implementation is that it's fast in the common case where a read-heavy workload always goes to the same replica and will work even if requests get routed to different replicas.</p>
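<p>The three-step recipe above can be sketched in a few dozen lines. This is an illustrative simulation of the technique, not D1's implementation; for simplicity, the replica rejects a too-early read instead of delaying it:</p>

```typescript
// A primary stamps every committed write with a monotonically increasing
// commit token (our Lamport timestamp), giving a total order of writes.
type Token = number;

class Primary {
  private token: Token = 0;
  private log: { token: Token; value: string }[] = [];

  write(value: string): Token {
    this.token += 1; // monotonically increasing commit token
    this.log.push({ token: this.token, value });
    return this.token;
  }

  logSince(after: Token) {
    return this.log.filter((e) => e.token > after);
  }
}

class Replica {
  private applied: Token = 0;
  private values: string[] = [];

  // Asynchronous replication: apply whatever the primary has committed.
  sync(primary: Primary) {
    for (const e of primary.logSince(this.applied)) {
      this.values.push(e.value);
      this.applied = e.token;
    }
  }

  // Serve the read only once we've caught up to the session's token;
  // a real replica would delay the query rather than reject it.
  read(minToken: Token): string[] | "not yet" {
    return this.applied >= minToken ? [...this.values] : "not yet";
  }
}

const primary = new Primary();
const replica = new Replica();

const t1 = primary.write("order-1"); // session's last-seen commit token
console.log(replica.read(t1));       // "not yet": the replica still lags
replica.sync(primary);
console.log(replica.read(t1));       // ["order-1"]: now safe to serve
```

Because the replica compares its applied token against the session's token, a session can hop between replicas and still never observe the database going backwards in time.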
    <div>
      <h3><b><i>Sneak Preview:</i></b> bringing read replication to D1 with Sessions</h3>
      <a href="#sneak-preview-bringing-read-replication-to-d1-with-sessions">
        
      </a>
    </div>
    <p>To bring read replication to D1, we will expand the D1 API with a new concept: <b>Sessions</b>. A Session encapsulates all the queries representing one logical session for your application. For example, a Session might represent all requests coming from a particular web browser or all requests coming from a mobile app. If you use Sessions, your queries will use whatever copy of the D1 database makes the most sense for your request, be that the primary database or a nearby replica. D1's Sessions implementation will ensure sequential consistency for all queries in the Session.</p><p>Since the Sessions API changes D1's consistency model, developers must opt-in to the new API. Existing D1 API methods are unchanged and will still have the same snapshot isolation consistency model as before. However, only queries made using the new Sessions API will use replicas.</p><p>Here’s an example of the D1 Sessions API:</p>
            <pre><code>export default {
  async fetch(request: Request, env: Env) {
    // When we create a D1 Session, we can continue where we left off
    // from a previous Session if we have that Session's last commit
    // token.  This Worker will return the commit token back to the
    // browser, so that it can send it back on the next request to
    // continue the Session.
    //
    // If we don't have a commit token, make the first query in this
    // session an "unconditional" query that will use the state of the
    // database at whatever replica we land on.
    const token = request.headers.get('x-d1-token') ?? 'first-unconditional'
    const session = env.DB.withSession(token)

    // Use this Session for all our Workers' routes.
    const response = await handleRequest(request, session)

    if (response.status === 200) {
      // Set the token so we can continue the Session in another request.
      response.headers.set('x-d1-token', session.latestCommitToken)
    }
    return response
  }
}

async function handleRequest(request: Request, session: D1DatabaseSession) {
  const { pathname } = new URL(request.url)

  if (pathname === '/api/orders/list') {
    // This statement is a read query, so it will execute on any
    // replica that has a commit equal to or later than the `token`
    // we used to create the Session.
    const { results } = await session.prepare('SELECT * FROM Orders').all()

    return Response.json(results)
  } else if (pathname === '/api/orders/add') {
    const order = await request.json&lt;Order&gt;()

    // This statement is a write query, so D1 will send the query to
    // the primary, which always has the latest commit token.
    await session
      .prepare('INSERT INTO Orders VALUES (?, ?, ?)')
      .bind(order.orderName, order.customer, order.value)
      .run()

    // In order for the application to be correct, this SELECT
    // statement must see the results of the INSERT statement above.
    // The Session API keeps track of commit tokens for queries
    // within the session and will ensure that we won't execute this
    // query until whatever replica we're using has seen the results
    // of the INSERT.
    const { results } = await session
      .prepare('SELECT COUNT(*) FROM Orders')
      .all()

    return Response.json(results)
  }

  return new Response('Not found', { status: 404 })
}</code></pre>
            <p>D1’s implementation of Sessions makes use of commit tokens.  Commit tokens identify a particular committed query to the database.  Within a session, D1 will use commit tokens to ensure that queries are sequentially ordered.  In the example above, the D1 session ensures that the “SELECT COUNT(*)” query happens <i>after</i> the “INSERT” of the new order, <i>even if</i> we switch replicas between the awaits.  </p><p>There are several options for how to start a session in a Workers fetch handler.  <code>db.withSession(&lt;condition&gt;)</code> accepts these arguments:</p><table><colgroup><col></col><col></col></colgroup><tbody><tr><td><p><span><b><code>condition</code> argument</b></span></p></td><td><p><span><b>Behavior</b></span></p></td></tr><tr><td><p><span><code>&lt;commit_token&gt;</code></span></p></td><td><p><span>(1) starts Session as of given commit token</span></p><p><span>(2) subsequent queries have sequential consistency</span></p></td></tr><tr><td><p><span><code>first-unconditional</code></span></p></td><td><p><span>(1) if the first query is a read, read whatever the current replica has and use the commit token of that read as the basis for subsequent queries.  If the first query is a write, forward the query to the primary and use the commit token of the write as the basis for subsequent queries.</span></p><p><span>(2) subsequent queries have sequential consistency</span></p></td></tr><tr><td><p><span><code>first-primary</code></span></p></td><td><p><span>(1) runs first query, read or write, against the primary</span></p><p><span>(2) subsequent queries have sequential consistency</span></p></td></tr><tr><td><p><span><code>null</code> or missing argument</span></p></td><td><p><span>treated like <code>first-unconditional</code> </span></p></td></tr></tbody></table><p>It’s possible to have a session span multiple requests by “round-tripping” the commit token from the last query of the session and using it to start a new session.  
This enables individual user agents, like a web app or a mobile app, to make sure that all of the queries the user sees are sequentially consistent.</p><p>D1’s read replication will be built-in, will not incur extra usage or storage costs, and will require no replica configuration. Cloudflare will <a href="https://www.cloudflare.com/application-services/solutions/app-performance-monitoring/">monitor</a> an application’s D1 traffic and automatically create database replicas to spread user traffic across multiple servers in locations closer to users. Aligned with our serverless model, D1 developers shouldn’t worry about replica provisioning and management. Instead, developers should focus on designing applications for replication and data consistency tradeoffs.</p><p>We’re actively working on global read replication and realizing the above proposal (share feedback in the <a href="https://discord.cloudflare.com/">#d1 channel</a> on our Developer Discord). Until then, D1 GA includes several exciting new additions.</p>
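<p>One way to read the table of <code>condition</code> arguments is as a small dispatch on how the Session establishes its consistency baseline. This is an illustrative model of the documented behavior, not D1's internal code:</p>

```typescript
// Each variant mirrors a row of the condition table.
type SessionStart =
  | { kind: "from-token"; token: string } // resume at a known commit token
  | { kind: "first-unconditional" }       // first query fixes the baseline
  | { kind: "first-primary" };            // first query goes to the primary

function interpretCondition(condition: string | null | undefined): SessionStart {
  if (condition === "first-primary") return { kind: "first-primary" };
  if (condition == null || condition === "first-unconditional")
    return { kind: "first-unconditional" }; // null/missing behaves the same
  // Anything else is treated as a commit token from a previous session.
  return { kind: "from-token", token: condition };
}

console.log(interpretCondition(null).kind);            // "first-unconditional"
console.log(interpretCondition("first-primary").kind); // "first-primary"
console.log(interpretCondition("abc123").kind);        // "from-token"
```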
    <div>
      <h3>Check out D1 GA</h3>
      <a href="#check-out-d1-ga">
        
      </a>
    </div>
    <p>Since D1’s open beta in October 2023, we’ve focused on the reliability, scalability, and developer experience demanded of critical services. We’ve invested in several new features that allow developers to build and debug applications faster with D1.</p><p><b>Build bigger with larger databases</b></p><p>We’ve listened to developers who requested larger databases. D1 now supports up to 10 GB databases, with 50K databases on the Workers Paid plan. With D1’s horizontal scaleout, applications can model database-per-business-entity use cases. Since beta, new D1 databases process 40x more requests than D1 alpha databases in a given period.</p><p><b>Import &amp; export bulk data</b></p><p>Developers import and export data for multiple reasons:</p><ul><li><p>Database migration testing to/from different database systems</p></li><li><p>Data copies for local development or testing</p></li><li><p>Manual backups for custom requirements like compliance</p></li></ul><p>While you could execute SQL files against D1 before, we’re improving <code>wrangler d1 execute --file=&lt;filename&gt;</code> to ensure large imports are atomic operations, never leaving your database in a halfway state. <code>wrangler d1 execute</code> also now defaults to local-first to protect your remote production database.</p><p>To import our <a href="https://github.com/cloudflare/d1-northwind/tree/main">Northwind Traders</a> demo database, you can download the <a href="https://github.com/cloudflare/d1-northwind/blob/main/db/schema.sql">schema</a> &amp; <a href="https://github.com/cloudflare/d1-northwind/blob/main/db/data.sql">data</a> and execute the SQL files.</p>
            <pre><code>npx wrangler d1 create northwind-traders

# omit --remote to run on a local database for development
npx wrangler d1 execute northwind-traders --remote --file=./schema.sql

npx wrangler d1 execute northwind-traders --remote --file=./data.sql</code></pre>
            <p>D1 database data &amp; schema, schema-only, or data-only can be exported to a SQL file using:</p>
            <pre><code># database schema &amp; data
npx wrangler d1 export northwind-traders --remote --output=./database.sql

# single table schema &amp; data
npx wrangler d1 export northwind-traders --remote --table='Employee' --output=./table.sql

# database schema only
npx wrangler d1 export &lt;database_name&gt; --remote --output=./database-schema.sql --no-data=true</code></pre>
            <p><b>Debug query performance</b></p><p>Understanding SQL query performance and debugging slow queries is a crucial step for production workloads. We’ve added the experimental <a href="https://developers.cloudflare.com/d1/observability/metrics-analytics/#query-insights"><code>wrangler d1 insights</code></a> command to help developers analyze query performance metrics, which are also available via the <a href="https://developers.cloudflare.com/d1/observability/metrics-analytics/">GraphQL API</a>.</p>
            <pre><code># To find top 10 queries by average execution time:
npx wrangler d1 insights &lt;database_name&gt; --sort-type=avg --sort-by=time --count=10</code></pre>
            <p><b>Developer tooling</b></p><p>Various <a href="https://developers.cloudflare.com/d1/reference/community-projects">community developer projects</a> support D1. New additions include <a href="https://developers.cloudflare.com/d1/tutorials/d1-and-prisma-orm">Prisma ORM</a>, which supports Workers and D1 as of version 5.12.0.</p>
    <div>
      <h3>Next steps</h3>
      <a href="#next-steps">
        
      </a>
    </div>
    <p>The features available now with GA and our global read replication design are just the start of delivering the SQL database needs for developer applications. If you haven’t yet used D1, you can <a href="https://developers.cloudflare.com/d1/get-started/">get started</a> right now, visit D1’s <a href="https://developers.cloudflare.com/d1/">developer documentation</a> to spark some ideas, or <a href="https://discord.cloudflare.com/">join the #d1 channel</a> on our Developer Discord to talk to other D1 developers and our product engineering team.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2dTCMeWMaQjhBd1SM8hM6O/2cbe9ec1a7a4fb0c061afe0e1c0bf666/image1-35.png" />
            
            </figure><p></p> ]]></content:encoded>
            <category><![CDATA[Developer Week]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[D1]]></category>
            <category><![CDATA[Database]]></category>
            <guid isPermaLink="false">6y8LbpExPriYEVMgzCDp4B</guid>
            <dc:creator>Vy Ton</dc:creator>
            <dc:creator>Justin Mazzola Paluska</dc:creator>
        </item>
        <item>
            <title><![CDATA[Hyperdrive: making databases feel like they’re global]]></title>
            <link>https://blog.cloudflare.com/hyperdrive-making-regional-databases-feel-distributed/</link>
            <pubDate>Thu, 28 Sep 2023 13:02:00 GMT</pubDate>
            <description><![CDATA[ Hyperdrive makes accessing your existing databases from Cloudflare Workers, wherever they are running, hyper fast ]]></description>
            <content:encoded><![CDATA[ <p></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/38AL8kYOfUuUOlGD2p2Wfw/596490c5c841fb416154cbce56cc830b/image1-33.png" />
            
            </figure><p>Hyperdrive makes accessing your existing databases from Cloudflare Workers, wherever they are running, hyper fast. You connect Hyperdrive to your database, change one line of code to connect through Hyperdrive, and voilà: connections and queries get faster (and spoiler: <a href="https://developers.cloudflare.com/hyperdrive/">you can use it today</a>).</p><p>In a nutshell, Hyperdrive uses our global network to speed up queries to your existing databases, whether they’re in a legacy cloud provider or with <a href="https://www.cloudflare.com/developer-platform/products/d1/">your favorite serverless database provider;</a> dramatically reduces the <a href="https://www.cloudflare.com/learning/performance/glossary/what-is-latency/">latency</a> incurred from repeatedly setting up new database connections; and caches the most popular read queries against your database, often avoiding the need to go back to your database at all.</p><p>Without Hyperdrive, that core database — the one with your user profiles, product inventory, or running your critical web app — sitting in the us-east1 region of a legacy cloud provider is going to be really slow to access for users in Paris, Singapore and Dubai and slower than it should be for users in Los Angeles or Vancouver. With each round trip taking up to 200ms, it’s easy to burn up to a second (or more!) on the multiple round-trips needed just to set up a connection, before you’ve even made the query for your data. Hyperdrive is designed to fix this.</p><p>To demonstrate Hyperdrive’s performance, we built a <a href="https://hyperdrive-demo.pages.dev/">demo application</a> that makes back-to-back queries against the same database: both with Hyperdrive and without Hyperdrive (directly). 
The app selects a database in a neighboring continent: if you’re in Europe, it selects a database in the US — an all-too-common experience for many European Internet users — and if you’re in Africa, it selects a database in Europe (and so on). It returns raw results from a straightforward <code>SELECT</code> query, with no carefully selected averages or cherry-picked metrics.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3VWco8QZERMlkgBpiilOyA/99245c5a6ce8a208e7fc793121aeaef3/image2-25.png" />
            
            </figure><p><i>We</i> <a href="https://hyperdrive-demo.pages.dev/"><i>built a demo app</i></a> <i>that makes real queries to a PostgreSQL database, with and without Hyperdrive</i></p><p>Throughout internal testing, initial user reports and the multiple runs in our benchmark, Hyperdrive delivers a 17 - 25x performance improvement vs. going direct to the database for cached queries, and a 6 - 8x improvement for uncached queries and writes. The cached latency might not surprise you, but we think that being 6 - 8x faster on uncached queries changes “I can’t query a centralized database from Cloudflare Workers” to “where has this been all my life?!”. We’re also continuing to work on performance improvements: we’ve already identified additional latency savings, and we’ll be pushing those out in the coming weeks.</p><p>The best part? Developers with a Workers paid plan can <a href="https://developers.cloudflare.com/hyperdrive/">start using the Hyperdrive open beta immediately</a>: there are no waiting lists or special sign-up forms to navigate.</p>
    <div>
      <h3>Hyperdrive? Never heard of it?</h3>
      <a href="#hyperdrive-never-heard-of-it">
        
      </a>
    </div>
    <p>We’ve been working on Hyperdrive in secret for a short while, but allowing developers to connect to databases they already have — with their existing data, queries and tooling — has been something on our minds for quite some time.</p><p>In a modern distributed cloud environment like Workers, where compute is globally distributed (so it’s close to users) and functions are short-lived (so you’re billed no more than is needed), connecting to traditional databases has been both slow and unscalable. Slow because it takes upwards of seven round-trips (<a href="https://www.cloudflare.com/learning/ddos/glossary/tcp-ip/">TCP handshake</a>; <a href="https://www.cloudflare.com/learning/ssl/what-happens-in-a-tls-handshake/">TLS negotiation</a>; then auth) to establish the connection, and unscalable because databases like PostgreSQL have a <a href="https://www.postgresql.org/message-id/flat/31cc6df9-53fe-3cd9-af5b-ac0d801163f4%40iki.fi">high resource cost per connection</a>. Even just a couple of hundred connections to a database can consume non-negligible memory, separate from any memory needed for queries.</p><p>Our friends over at Neon (a popular serverless Postgres provider) wrote about this, and <a href="https://neon.tech/blog/serverless-driver-for-postgres">even released a WebSocket proxy and driver to reduce the</a> connection overhead, but are still fighting uphill in the snow: even with a custom driver, we’re down to 4 round-trips, each still potentially taking 50-200 milliseconds or more. When those connections are long-lived, that’s OK — it might happen once every few hours at best. But when they’re scoped to an individual function invocation, and are only useful for a few milliseconds to minutes at best — your code spends more time waiting. 
It’s effectively another kind of cold start: having to initiate a fresh connection to your database before making a query means that using a traditional database in a distributed or serverless environment is (to put it lightly) <i>really slow</i>.</p><p>To combat this, Hyperdrive does two things.</p><p>First, it maintains a set of regional database connection pools across Cloudflare’s network, so a Cloudflare Worker avoids making a fresh connection to a database on every request. Instead, the Worker can establish a connection to Hyperdrive (fast!), with Hyperdrive maintaining a pool of ready-to-go connections back to the database. Since a database can be anywhere from 30ms to (often) 300ms away over a <i>single</i> round-trip (let alone the seven or more you need for a new connection), having a pool of available connections dramatically reduces the latency issue that short-lived connections would otherwise suffer.</p><p>Second, it understands the difference between read (non-mutating) and write (mutating) queries and transactions, and can automatically cache your most popular read queries: which represent over 80% of most queries made to databases in typical web applications. That product listing page that tens of thousands of users visit every hour; open jobs on a major careers site; or even queries for config data that changes occasionally; a tremendous amount of what is queried does not change often, and caching it closer to where the user is querying it from can dramatically speed up access to that data for the next ten thousand users. Write queries, which can’t be safely cached, still get to benefit from both Hyperdrive’s connection pooling <i>and</i> Cloudflare’s <a href="https://www.cloudflare.com/network/">global network</a>: being able to take the fastest routes across the Internet across our backbone cuts down latency there, too.</p>
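<p>The connection-pooling win is easy to quantify with back-of-the-envelope arithmetic (a sketch with illustrative numbers):</p>

```typescript
// Cold path: setup round trips (TCP + TLS + auth) plus the query itself.
function coldQueryLatencyMs(roundTripMs: number, setupRoundTrips: number): number {
  return roundTripMs * (setupRoundTrips + 1);
}

// Pooled path: the connection already exists, so only the query trip remains.
function pooledQueryLatencyMs(roundTripMs: number): number {
  return roundTripMs;
}

// With 70 ms round trips and 6 setup round trips, as in this post's example:
console.log(coldQueryLatencyMs(70, 6)); // 490
console.log(pooledQueryLatencyMs(70));  // 70
```

And a cached read served near the user skips the trip to the origin database entirely, which is where the larger multiples come from.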
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2ZPuh9S7KNGNfcIOW1FSqF/07cd95b1d45dd66d7b13fcab51cf4189/image4-16.png" />
            
            </figure><p><i>Even if your database is on the other side of the country, 70ms x 6 round-trips is a lot of time for a user to be waiting for a query response.</i></p><p>Hyperdrive works not only with PostgreSQL databases (including <a href="https://neon.tech/">Neon</a>, Google Cloud SQL, AWS RDS, and <a href="https://www.timescale.com/">Timescale</a>), but also with PostgreSQL-compatible databases like <a href="https://materialize.com/">Materialize</a> (a powerful stream-processing database), <a href="https://www.cockroachlabs.com/">CockroachDB</a> (a major distributed database), Google Cloud’s <a href="https://cloud.google.com/alloydb">AlloyDB</a>, and AWS Aurora Postgres.</p><p>We’re also working on bringing support for MySQL, including providers like PlanetScale, by the end of the year, with more database engines planned in the future.</p>
    <div>
      <h3>The magic connection string</h3>
      <a href="#the-magic-connection-string">
        
      </a>
    </div>
    <p>One of the major design goals for Hyperdrive was the need for developers to keep using their existing drivers, query builders and ORM (Object-Relational Mapper) libraries. It wouldn’t have mattered how fast Hyperdrive was if we required you to migrate away from your favorite ORM and/or rewrite hundreds (or more) lines of code &amp; tests to benefit from Hyperdrive’s performance.</p><p>To achieve this, we worked with the maintainers of popular open-source drivers — including <a href="https://node-postgres.com/">node-postgres</a> and <a href="https://github.com/porsager/postgres">Postgres.js</a> — to help their libraries support <a href="/workers-tcp-socket-api-connect-databases/">Worker’s new TCP socket API</a>, which is going through the <a href="https://github.com/wintercg/proposal-sockets-api">standardization process</a> and which we expect to land in Node.js, Deno and Bun as well.</p><p>The humble database connection string is the shared language of database drivers, and typically takes on this format:</p>
            <pre><code>postgres://user:password@some.database.host.example.com:5432/postgres</code></pre>
            <p>The magic behind Hyperdrive is that you can start using it in your existing Workers applications, with your existing queries, just by swapping out your connection string for the one Hyperdrive generates instead.</p>
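<p>Because a connection string is an ordinary URL, the swap is a one-line configuration change. As a sketch (the helper name is ours for illustration; real drivers do this parsing internally), here is how the example string above decomposes using the standard URL parser:</p>

```typescript
// Decompose a Postgres connection string into the parts a driver needs.
// Illustrative helper only: drivers like node-postgres parse this themselves.
function parseConnectionString(conn: string) {
  const url = new URL(conn);
  return {
    user: url.username,
    host: url.hostname,
    port: Number(url.port || 5432),
    database: url.pathname.slice(1), // drop the leading "/"
  };
}

const parts = parseConnectionString(
  "postgres://user:password@some.database.host.example.com:5432/postgres"
);
console.log(parts.host); // "some.database.host.example.com"
console.log(parts.database); // "postgres"
```

Swapping in the Hyperdrive-generated string only changes what this parse returns; everything downstream of the driver stays untouched.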
    <div>
      <h3>Creating a Hyperdrive</h3>
      <a href="#creating-a-hyperdrive">
        
      </a>
    </div>
    <p>With an existing database ready to go — in this example, we’ll use a Postgres database from <a href="https://neon.tech/">Neon</a> — it takes less than a minute to get Hyperdrive running (yes, we timed it).</p><p>If you don’t have an existing Cloudflare Workers project, you can quickly create one:</p>
            <pre><code>$ npm create cloudflare@latest
# Call the application "hyperdrive-demo"
# Choose "Hello World Worker" as your template</code></pre>
            <p>From here, we just need the database connection string for our database and a quick <a href="https://developers.cloudflare.com/workers/wrangler/install-and-update/">wrangler command-line</a> invocation to have Hyperdrive connect to it.</p>
            <pre><code># Using wrangler v3.10.0 or above
wrangler hyperdrive create a-faster-database --connection-string="postgres://user:password@neon.tech:5432/neondb"

# This will return an ID: we'll use this in the next step</code></pre>
            <p>Add our Hyperdrive to the <a href="https://developers.cloudflare.com/workers/configuration/bindings/">wrangler.toml configuration</a> file for our Worker:</p>
            <pre><code>[[hyperdrive]]
binding = "HYPERDRIVE"
id = "cdb28782-0dfc-4aca-a445-a2c318fb26fd"</code></pre>
            <p>We can now write a <a href="https://developers.cloudflare.com/workers/">Worker</a> — or take an existing Worker script — and use Hyperdrive to speed up connections and queries to our existing database. We use <a href="https://node-postgres.com/">node-postgres</a> here, but we could just as easily use <a href="https://orm.drizzle.team/">Drizzle ORM</a>.</p>
            <pre><code>import { Client } from 'pg';

export interface Env {
	HYPERDRIVE: Hyperdrive;
}

export default {
	async fetch(request: Request, env: Env, ctx: ExecutionContext) {
		// Create a database client that connects to our database via Hyperdrive
		//
		// Hyperdrive generates a unique connection string you can pass to
		// supported drivers, including node-postgres, Postgres.js, and the many
		// ORMs and query builders that use these drivers.
		const client = new Client({ connectionString: env.HYPERDRIVE.connectionString });

		try {
			// Connect to our database
			await client.connect();

			// A very simple test query
			const result = await client.query({ text: 'SELECT * FROM pg_tables' });

			// Clean up the client after the response has been returned
			ctx.waitUntil(client.end());

			// Return our result rows as JSON
			return Response.json({ result: result });
		} catch (e) {
			console.log(e);
			return Response.json({ error: JSON.stringify(e) }, { status: 500 });
		}
	},
};</code></pre>
            <p>The code above is intentionally simple, but hopefully you can see the magic: our database driver gets a connection string from Hyperdrive, and is none the wiser. It doesn’t need to know anything about Hyperdrive; we don’t have to toss out our favorite query builder library, and we can immediately realize the speed benefits when making queries.</p><p>Connections are automatically pooled and kept warm, our most popular queries are cached, and our entire application gets faster.</p><p>We’ve also built out <a href="https://developers.cloudflare.com/hyperdrive/examples/">guides for every major database provider</a> to make it easy to get what you need from them (a connection string) into Hyperdrive.</p>
    <div>
      <h3>Going fast can’t be cheap, right?</h3>
      <a href="#going-fast-cant-be-cheap-right">
        
      </a>
    </div>
    <p>We think Hyperdrive is critical to accessing your existing databases when building on Cloudflare Workers: traditional databases were just never designed for a world where clients are globally distributed.</p><p><b>Hyperdrive’s connection pooling will always be free</b>, for both database protocols we support today and new database protocols we add in the future. Just like <a href="https://www.cloudflare.com/ddos/">DDoS protection</a> and our global <a href="https://www.cloudflare.com/application-services/products/cdn/">CDN</a>, we think access to Hyperdrive’s core feature is too useful to hold back.</p><p>During the open beta, Hyperdrive itself will not incur any charges for usage, regardless of how you use it. We’ll be announcing more details on how Hyperdrive will be priced closer to GA (early in 2024), with plenty of notice.</p>
    <div>
      <h3>Time to query</h3>
      <a href="#time-to-query">
        
      </a>
    </div>
    <p>So where to from here for Hyperdrive?</p><p>We’re planning on bringing Hyperdrive to GA in early 2024 — and we’re focused on landing more controls over how we cache &amp; automatically invalidate based on writes, detailed query and performance analytics (soon!), support for more database engines (including MySQL) as well as continuing to work on making it even faster.</p><p>We’re also working to enable private network connectivity via <a href="https://developers.cloudflare.com/magic-wan/">Magic WAN</a> and Cloudflare Tunnel, so that you can connect to databases that aren’t (or can’t be) exposed to the public Internet.</p><p>To connect Hyperdrive to your existing database, visit our <a href="https://developers.cloudflare.com/hyperdrive/">developer docs</a> — it takes less than a minute to create a Hyperdrive and update existing code to use it. Join the <i>#hyperdrive-beta</i> channel in our <a href="https://discord.cloudflare.com/">Developer Discord</a> to ask questions, surface bugs, and talk to our Product &amp; Engineering teams directly.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1BbZolVBokU7h0EedVawor/e98def78df39541f03a595ef2e083387/image3-31.png" />
            
            </figure><p></p> ]]></content:encoded>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Product News]]></category>
            <category><![CDATA[Database]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <guid isPermaLink="false">3buVHDU7WwOkwln1S36n02</guid>
            <dc:creator>Matt Silverlock</dc:creator>
            <dc:creator>Alex Robinson</dc:creator>
        </item>
        <item>
            <title><![CDATA[D1: open beta is here]]></title>
            <link>https://blog.cloudflare.com/d1-open-beta-is-here/</link>
            <pubDate>Thu, 28 Sep 2023 13:00:14 GMT</pubDate>
            <description><![CDATA[ D1 is now in open beta, and the theme is “scale”: with higher per-database storage limits and the ability to create more databases, we’re unlocking the ability for developers to build production-scale applications on D1 ]]></description>
            <content:encoded><![CDATA[ <p></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4sioTCCEWQ0hiLg5ZSCD46/c53658e14bc379bea56cd0f3fed1d42b/image1-37.png" />
            
            </figure><p><b>D1 is now in open beta</b>, and the theme is “scale”: with higher per-database storage limits <i>and</i> the ability to create more databases, we’re unlocking the ability for developers to build production-scale applications on D1. Any developers with an existing paid Workers plan don’t need to lift a finger to benefit: we’ve retroactively applied this to all existing D1 databases.</p><p>If you missed the <a href="/d1-turning-it-up-to-11/">last D1 update</a> back during Developer Week, the <a href="https://developers.cloudflare.com/d1/changelog/">multitude of updates in the changelog</a>, or are just new to D1 in general: read on.</p>
    <div>
      <h3>Remind me: D1? Databases?</h3>
      <a href="#remind-me-d1-databases">
        
      </a>
    </div>
    <p>D1 is our <a href="https://www.cloudflare.com/developer-platform/products/d1/">native serverless database</a>, which we launched into alpha in November last year: the queryable database complement to <a href="https://developers.cloudflare.com/kv/">Workers KV</a>, <a href="https://developers.cloudflare.com/durable-objects/">Durable Objects</a> and <a href="https://developers.cloudflare.com/r2/">R2</a>.</p><p>When we set out to build D1, we knew a few things for certain: it needed to be fast, it needed to be incredibly easy to create a database, and it needed to be SQL-based.</p><p>That last one was critical: so that developers could avoid learning another custom query language, and so that existing query builders, ORM (object relational mapper) libraries, and other tools could connect to D1 with minimal effort. From this, we’ve seen a huge number of projects build in support for D1: from support for D1 in the <a href="https://github.com/drizzle-team/drizzle-orm/blob/main/examples/cloudflare-d1/README.md">Drizzle ORM</a> and <a href="https://developers.cloudflare.com/d1/platform/community-projects/#d1-adapter-for-kysely-orm">Kysely</a>, to the <a href="https://t4stack.com/">T4 App</a>, a full-stack toolkit that uses D1 as its database.</p><p>We also knew that D1 couldn’t be the only way to query a database from Workers: for teams with existing databases and thousands of lines of SQL or existing ORM code, migrating across to D1 isn’t going to be an afternoon’s work. For those teams, we built <a href="/hyperdrive-making-regional-databases-feel-distributed/">Hyperdrive</a>, allowing you to connect to your existing databases and make them feel global. We think this gives teams flexibility: combine D1 and Workers for globally distributed apps, and use Hyperdrive for querying the databases you have in legacy clouds and just can’t get rid of overnight.</p>
    <div>
      <h3>Larger databases, and more of them</h3>
      <a href="#larger-databases-and-more-of-them">
        
      </a>
    </div>
    <p>This has been the biggest ask from the thousands of D1 users throughout the alpha: not just more databases, but also <i>bigger</i> databases.</p><p><b>Developers on the Workers paid plan will now be able to grow each database up to 2GB and create 50,000 databases (up from 500MB and 10). Yes, you read that right: 50,000 databases per account. This unlocks a whole raft of database-per-user use-cases and enables true isolation between customers, something that traditional relational database deployments can’t easily offer.</b></p><p>We’ll be continuing to work on unlocking even larger databases over the coming weeks and months: developers using the D1 beta will see automatic increases to these limits published on <a href="https://developers.cloudflare.com/d1/changelog/">D1’s public changelog</a>.</p><p>One of the biggest impediments to double-digit-gigabyte databases is performance: we want to ensure that a database can load in and be ready <i>really</i> quickly — cold starts of seconds (or more) just aren’t acceptable. A 10GB or 20GB database that takes 15 seconds before it can answer a query ends up being pretty frustrating to use.</p><p>Users on the <a href="https://www.cloudflare.com/plans/free/">Workers free plan</a> will keep the ten 500MB databases (<a href="https://developers.cloudflare.com/d1/changelog/#per-database-limit-now-500-mb">changelog</a>) forever: we want to give more developers the room to experiment with D1 and Workers before jumping in.</p>
    <div>
      <h3>Time Travel is here</h3>
      <a href="#time-travel-is-here">
        
      </a>
    </div>
    <p><a href="https://developers.cloudflare.com/d1/learning/time-travel/">Time Travel</a> allows you to roll your database back to a specific point in time: specifically, any minute in the last 30 days. And it’s enabled by default for every D1 database, doesn’t cost any more, and doesn’t count against your storage limit.</p><p>For those who have been keeping tabs: we originally announced Time Travel earlier this year, and made it <a href="https://developers.cloudflare.com/d1/changelog/#time-travel">available to all D1 users in July</a>. At its core, it’s deceptively simple: Time Travel introduces the concept of a “bookmark” to D1. A bookmark represents the state of a database at a specific point in time: effectively, a position in an append-only log of changes. Time Travel can take a timestamp and turn it into a bookmark, or accept a bookmark directly, allowing you to restore back to that point. Even better: restoring doesn’t prevent you from going back further.</p><p>We think Time Travel works best with an example, so let’s make a change to a database: one with an Order table that stores every order made against our e-commerce store:</p>
            <pre><code># To illustrate: we have 89,185 unique addresses in our order database. 
➜  wrangler d1 execute northwind --command "SELECT count(distinct ShipAddress) FROM [Order]" 
┌──────────┐
│ count(*) │
├──────────┤
│ 89185    │
└──────────┘</code></pre>
            <p>OK, great. Now what if we wanted to make a change to a specific set of orders: an address change or freight company change?</p>
            <pre><code># I think we might be forgetting something here...
➜  wrangler d1 execute northwind --command "UPDATE [Order] SET ShipAddress = 'Av. Veracruz 38, Roma Nte., Cuauhtémoc, 06700 Ciudad de México, CDMX, Mexico'"</code></pre>
            <p>Wait: we’ve made a mistake that many, many folks have before: we forgot the WHERE clause on our UPDATE query. Instead of updating a specific order Id, we’ve instead updated the ShipAddress for every order in our table.</p>
            <pre><code># Every order is now going to a wine bar in Mexico City. 
➜  wrangler d1 execute northwind --command "SELECT count(distinct ShipAddress) FROM [Order]" 
┌──────────┐
│ count(*) │
├──────────┤
│ 1        │
└──────────┘</code></pre>
            <p>Panic sets in. Did we remember to make a backup before we did this? How long ago was it? Did we turn on point-in-time recovery? It seemed potentially expensive at the time…</p><p>It’s OK. We’re using D1. We can Time Travel. It’s on by default: let’s fix this and travel back a few minutes.</p>
            <pre><code># Let's go back in time.
➜  wrangler d1 time-travel restore northwind --timestamp="2023-09-23T14:20:00Z"

🚧 Restoring database northwind from bookmark 0000000b-00000002-00004ca7-9f3dba64bda132e1c1706a4b9d44c3c9
✔ OK to proceed (y/N) … yes

⚡️ Time travel in progress...
✅ Database northwind restored back to bookmark 00000000-00000004-00004ca7-97a8857d35583887de16219c766c0785
↩️ To undo this operation, you can restore to the previous bookmark: 00000013-ffffffff-00004ca7-90b029f26ab5bd88843c55c87b26f497</code></pre>
            <p>Let's check if it worked:</p>
            <pre><code># Phew. We're good. 
➜  wrangler d1 execute northwind --command "SELECT count(distinct ShipAddress) FROM [Order]" 
┌──────────┐
│ count(*) │
├──────────┤
│ 89185    │
└──────────┘</code></pre>
            <p>We think that Time Travel becomes even more powerful when you have many smaller databases, too: the downside of any restore operation is reduced further and scoped to a single user or tenant.</p><p>This is also just the beginning for Time Travel: we’re working to support not only restoring a database, but also the ability to fork from and overwrite existing databases. If you can fork a database with a single command and/or test migrations and schema changes against real data, you can de-risk a lot of the traditional challenges that working with databases has historically implied.</p>
    <div>
      <h3>Row-based pricing</h3>
      <a href="#row-based-pricing">
        
      </a>
    </div>
    <p><a href="/d1-turning-it-up-to-11/#not-going-to-burn-a-hole-in-your-wallet">Back in May</a> we announced pricing for D1, to a lot of positive feedback around how much we’d included in our Free and Paid plans. In August, we published a new row-based model, replacing the prior byte-units, that makes it easier to predict and quantify your usage. Specifically, we moved to rows as it’s easier to reason about: if you’re writing a row, it doesn’t matter if it’s 1KB or 1MB. If your read query filters on an indexed column, you’ll see not only performance benefits, but cost savings too.</p><p>Here’s D1’s pricing — almost everything has stayed the same, with the added benefit of charging based on rows:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4053N3dvxuEp46TQG6xec9/74244f620374666d3b8fcbcf5d0016bb/Screenshot-2023-09-29-at-09.33.51.png" />
            
            </figure><p>D1’s pricing — you can find more details in <a href="https://developers.cloudflare.com/d1/platform/pricing/">D1’s public documentation</a>.</p><p>As before, D1 does not charge you for “database hours”, the number of databases, or point-in-time recovery (<a href="https://developers.cloudflare.com/d1/learning/time-travel/">Time Travel</a>) — just query D1 and pay for your reads, writes, and storage — that’s it.</p><p>We believe this not only makes D1 far more cost-efficient, but also makes it easier to manage multiple databases to isolate customer data or prod vs. staging: we don’t care <i>which</i> database you query. Manage your data how you like, separate your customer data, and avoid falling into the trap of “Billing Based Architecture”, where you build solely around how you’re charged, even if it’s not intuitive or what makes sense for your team.</p><p>To make it easier to both see how much a given query charges <i>and</i> when to <a href="https://developers.cloudflare.com/d1/learning/using-indexes/">optimize your queries with indexes</a>, D1 also returns the number of rows a query read or wrote (or both) so that you can understand what it’s costing you in both cents and speed.</p><p>For example, the following query filters over orders based on date:</p>
            <pre><code>SELECT * FROM [Order] WHERE ShippedDate &gt; '2016-01-22'

[
  {
    "results": [],
    "success": true,
    "meta": {
      "duration": 5.032,
      "size_after": 33067008,
      "rows_read": 16818,
      "rows_written": 0
    }
  }
]</code></pre>
            <p>The unindexed query above scans 16,818 rows. Even if we don’t optimize it, D1’s paid plan includes the first 25 billion rows read each month, meaning we could run this query 1.4 million times in a month before having to worry about extra costs.</p><p>But we can do better with an index:</p>
            <pre><code>CREATE INDEX IF NOT EXISTS idx_orders_date ON [Order](ShippedDate)</code></pre>
            <p>With the index created, let’s see how many rows our query needs to read now:</p>
            <pre><code>SELECT * FROM [Order] WHERE ShippedDate &gt; '2016-01-22'

[
  {
    "results": [],
    "success": true,
    "meta": {
      "duration": 3.793,
      "size_after": 33067008,
      "rows_read": 417,
      "rows_written": 0
    }
  }
]</code></pre>
            <p>The same query with an index on the ShippedDate column reads just 417 rows: not only is it faster (duration is in milliseconds!), but it costs us less: we could run this query 59 million times per month before we’d have to pay any more than what the $5 Workers plan gives us.</p><p>D1 also <a href="https://developers.cloudflare.com/d1/platform/metrics-analytics/#metrics">exposes row counts</a> via both the Cloudflare dashboard and our GraphQL analytics API: so not only can you look at this per-query when you’re tuning performance, but also break down query patterns across all of your databases.</p>
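<p>To double-check the arithmetic in this section, here is a small sketch. The row counts come from the two query results above, and the 25 billion figure is the monthly included rows read on the paid plan:</p>

```typescript
// Figures quoted in this section.
const includedRowsRead = 25_000_000_000; // rows read included per month
const unindexedRowsRead = 16_818; // rows_read before the index
const indexedRowsRead = 417; // rows_read after the index

// How many runs of a query fit inside the included allowance?
function includedQueryRuns(rowsReadPerQuery: number): number {
  return Math.floor(includedRowsRead / rowsReadPerQuery);
}

console.log(includedQueryRuns(unindexedRowsRead)); // 1486502 (~1.4 million runs)
console.log(includedQueryRuns(indexedRowsRead)); // 59952038 (~59 million runs)

// The index cuts rows read by roughly 40x: faster, and cheaper
// under row-based pricing.
console.log((unindexedRowsRead / indexedRowsRead).toFixed(1)); // "40.3"
```

The same two inputs (rows read per query and included rows) are all you need to budget any query under row-based pricing.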
    <div>
      <h3>D1 for Platforms</h3>
      <a href="#d1-for-platforms">
        
      </a>
    </div>
    <p>Throughout D1’s alpha period, we’ve both heard from and worked with teams who are excited about D1’s ability to scale out horizontally: the ability to deploy a database-per-customer (or user!) in order to keep data closer to where teams access it <i>and</i> more strongly isolate that data from their other users.</p><p>Teams building the next big thing on <a href="https://developers.cloudflare.com/cloudflare-for-platforms/workers-for-platforms/">Workers for Platforms</a> — think of it as “Functions as a Service, as a Service” — can use D1 to deploy a <b>database per user</b>, keeping each customer’s data strongly separated from the others.</p><p>For example, and as one of the early adopters of D1, <a href="https://twitter.com/roninapp">RONIN</a> is building an edge-first content &amp; data platform backed by a dedicated D1 database per customer, which allows customers to place data closer to users and provides each customer isolation from the queries of others.</p><p>Instead of spinning up and managing countless traditional database instances, RONIN uses D1 for Platforms to offer automatic infinite scalability at the edge. This allows RONIN to focus on providing an intuitive editing experience for your content.</p><p>When it comes to enabling “D1 for Platforms”, we’ve thought about this in a few ways from the very beginning:</p><ul><li><p><b>Support for 100,000+ databases for Workers for Platforms users — there’s no limit, but if we said “unlimited” you might not believe us — on top of the 50,000 databases per account that D1 already enables.</b></p></li><li><p>D1’s pricing: you don’t pay per-database or for “idle databases”.
If you have a range of users, from thousands of queries per second down to one or two every 10 minutes, you aren’t paying more for “database hours” on the less trafficked databases, or having to plan around spiky workloads across your user-base.</p></li><li><p>The ability to programmatically configure more databases via <a href="https://developers.cloudflare.com/api/operations/cloudflare-d1-create-database">D1’s HTTP API</a> <i>and</i> <a href="https://developers.cloudflare.com/api/operations/worker-script-patch-settings">attach them to your Worker</a> without re-deploying. There’s no “provisioning” delay, either: you create the database, and it’s immediately ready to query by you or your users.</p></li><li><p>Detailed <a href="https://developers.cloudflare.com/d1/platform/metrics-analytics/">per-database analytics</a>, so you can understand which databases are being used and how they’re being queried via D1’s GraphQL analytics API.</p></li></ul><p>If you’re building the next big platform on top of Workers &amp; want to use D1 at scale — whether you’re part of the <a href="https://www.cloudflare.com/lp/workers-launchpad/">Workers Launchpad program</a> or not — reach out.</p>
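<p>As a sketch of what that database-per-tenant flow looks like in practice, here is how you might construct (but not send) a “create database” call against the D1 HTTP API linked above. The endpoint path and payload follow that API reference, but treat the exact shape as illustrative; the tenant-naming scheme here is hypothetical:</p>

```typescript
// Build the pieces of a "create D1 database" API call for a given tenant.
// Endpoint shape per the D1 HTTP API docs linked above; a sketch, not a spec.
function createDatabaseCall(accountId: string, apiToken: string, tenantId: string) {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/d1/database`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      // One database per tenant: encode the tenant in the database name.
      body: JSON.stringify({ name: `tenant-${tenantId}` }),
    },
  };
}

const call = createDatabaseCall("{ACCOUNT_ID}", "{API_TOKEN}", "customer-42");
console.log(call.url);
// To actually create the database: await fetch(call.url, call.init)
```

Because there’s no provisioning delay, the new database is ready to query as soon as the call returns.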
    <div>
      <h3>What’s next for D1?</h3>
      <a href="#whats-next-for-d1">
        
      </a>
    </div>
    <p><b>We’re setting a clear goal: we want to make D1 “generally available” (GA) for production use-cases by early next year</b> <b>(Q1 2024)</b>. Although you can already use D1 without a waitlist or approval process, we understand that the GA label is an important one for many when it comes to a database (as do we).</p><p>Between now and GA, we’re working on some really key parts of the D1 vision, with a continued focus on reliability and performance.</p><p>One of the biggest remaining pieces of that vision is global read replication, which we <a href="/d1-turning-it-up-to-11/">wrote about earlier this year</a>. Importantly, replication will be free, won’t multiply your storage consumption, and will still enable session consistency (read-your-writes). Part of D1’s mission is about getting data closer to where users are, and we’re excited to land it.</p><p>We’re also working to expand <a href="https://developers.cloudflare.com/d1/learning/time-travel/">Time Travel</a>, D1’s built-in point-in-time recovery capabilities, so that you can branch and/or clone a database from a specific point-in-time on the fly.</p><p>We’ll also <b>be progressively opening up our limits around per-database storage, unlocking more storage per account, and the number of databases you can create over the rest of this year</b>, so keep an eye on the D1 <a href="https://developers.cloudflare.com/d1/changelog/">changelog</a> (or your inbox).</p><p>In the meantime, if you haven’t yet used D1, you can <a href="https://developers.cloudflare.com/d1/get-started/">get started</a> right now, visit D1’s <a href="https://developers.cloudflare.com/d1/">developer documentation</a> to spark some ideas, or <a href="https://discord.cloudflare.com/">join the #d1-beta channel</a> on our Developer Discord to talk to other D1 developers and our product-engineering team.</p>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Database]]></category>
            <category><![CDATA[D1]]></category>
            <guid isPermaLink="false">5I0knbF5YIn2PbvvOTa1q2</guid>
            <dc:creator>Matt Silverlock</dc:creator>
            <dc:creator>Ben Yule</dc:creator>
        </item>
        <item>
            <title><![CDATA[Workers AI: serverless GPU-powered inference on Cloudflare’s global network]]></title>
            <link>https://blog.cloudflare.com/workers-ai/</link>
            <pubDate>Wed, 27 Sep 2023 13:00:47 GMT</pubDate>
            <description><![CDATA[ We are excited to launch Workers AI - an AI inference as a service platform, empowering developers to run AI models with just a few lines of code, all powered by our global network of GPUs ]]></description>
            <content:encoded><![CDATA[ <p></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1kH38tclcLOGwYv40vTHNy/300956275074e73dd480a93898d43c08/image1-29.png" />
            
            </figure><p>If you're anywhere near the developer community, it's almost impossible to avoid the impact that AI’s recent advancements have had on the ecosystem. Whether you're using <a href="https://www.cloudflare.com/learning/ai/what-is-artificial-intelligence/">AI</a> in your workflow to improve productivity, or you’re shipping AI-based features to your users, it’s everywhere. The pace of AI improvement is extraordinary, and we’re super excited about the opportunities that lie ahead, but it's not enough.</p><p>Not too long ago, if you wanted to leverage the power of AI, you needed to know the ins and outs of <a href="https://www.cloudflare.com/learning/ai/what-is-machine-learning/">machine learning</a>, and be able to manage the infrastructure to power it.</p><p>As a developer platform with over one million active developers, we believe there is so much potential yet to be unlocked, so we’re changing the way AI is delivered to developers. Many of the current solutions, while powerful, are based on closed, proprietary models and don't address privacy needs that developers and users demand. Alternatively, the open source scene is exploding with powerful models, but they’re simply not accessible enough to every developer. Imagine being able to run a model, from your code, wherever it’s <a href="https://www.cloudflare.com/developer-platform/solutions/hosting/">hosted</a>, and never needing to find GPUs or deal with setting up the infrastructure to support it.</p><p>That's why we are excited to launch Workers AI - an AI inference as a service platform, empowering developers to run AI models with just a few lines of code, all powered by our global network of GPUs. It's open and accessible, serverless, privacy-focused, runs near your users, pay-as-you-go, and it's built from the ground up for a best-in-class developer experience.</p>
    <div>
      <h2>Workers AI - making inference <b>just work</b></h2>
      <a href="#workers-ai-making-inference-just-work">
        
      </a>
    </div>
    <p>We’re launching Workers AI to put AI inference in the hands of every developer, and to actually deliver on that goal, it should <b>just work</b> out of the box. How do we achieve that?</p><ul><li><p>At the core of everything, it runs on the right infrastructure - our world-class network of GPUs</p></li><li><p>We provide off-the-shelf models that run seamlessly on our infrastructure</p></li><li><p>Finally, we deliver it to the end developer in a way that’s delightful. A developer should be able to build their first Workers AI app in minutes, and say “Wow, that’s kinda magical!”.</p></li></ul><p>So what exactly is Workers AI? It’s another building block that we’re adding to our developer platform - one that helps developers run well-known AI models on serverless GPUs, all on Cloudflare’s trusted global network. As one of the latest additions to our developer platform, it works seamlessly with Workers + Pages, but to make it truly accessible, we’ve made it platform-agnostic, so it also works everywhere else, made available via a REST API.</p>
    <div>
      <h2>Models you know and love</h2>
      <a href="#models-you-know-and-love">
        
      </a>
    </div>
    <p>We’re launching with a curated set of popular, open source models that cover a wide range of inference tasks:</p><ul><li><p><b>Text generation (large language model):</b> meta/llama-2-7b-chat-int8</p></li><li><p><b>Automatic speech recognition (ASR):</b> openai/whisper</p></li><li><p><b>Translation:</b> meta/m2m100-1.2b</p></li><li><p><b>Text classification:</b> huggingface/distilbert-sst-2-int8</p></li><li><p><b>Image classification:</b> microsoft/resnet-50</p></li><li><p><b>Embeddings:</b> baai/bge-base-en-v1.5</p></li></ul><p>You can browse all available models in your Cloudflare dashboard, and soon you’ll be able to dive into logs and analytics on a per-model basis!</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3iLFApyCjCwTCEtV8QRhke/91793f5eaabe3c426cf5fb7f421f4508/image4-14.png" />
            
            </figure><p>This is just the start, and we’ve got big plans. After launch, we’ll continue to expand based on community feedback. Even more exciting - in an effort to take our catalog from zero to sixty, we’re announcing a partnership with Hugging Face, a leading AI community + hub. The partnership is multifaceted, and you can read more about it <a href="/best-place-region-earth-inference">here</a>, but soon you’ll be able to browse and run a subset of the Hugging Face catalog directly in Workers AI.</p>
    <div>
      <h2>Accessible to everyone</h2>
      <a href="#accessible-to-everyone">
        
      </a>
    </div>
    <p>Part of the mission of our developer platform is to provide <b>all</b> the building blocks that developers need to build the applications of their dreams. Having access to the right blocks is just one part of it — as a developer your job is to put them together into an application. Our goal is to make that as easy as possible.</p><p>To make sure you can use Workers AI easily regardless of entry point, we provide access two ways: via Workers or Pages, to make it easy to use within the Cloudflare ecosystem, and via a REST API, if you want to use Workers AI with your current stack.</p><p>Here’s a quick curl example that translates some text from English to French:</p>
            <pre><code>curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/meta/m2m100-1.2b \
-H "Authorization: Bearer {API_TOKEN}" \
-d '{ "text": "I'\''ll have an order of the moule frites", "target_lang": "french" }'</code></pre>
            <p>And here’s what the response looks like:</p>
            <pre><code>{
  "result": {
    "answer": "Je vais commander des moules frites"
  },
  "success": true,
  "errors":[],
  "messages":[]
}</code></pre>
            <p>Use it with any stack, anywhere: your favorite Jamstack framework, Python + Django/Flask, Node.js, Ruby on Rails. The possibilities are endless, and you can deploy today.</p>
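<p>For instance, the curl call above translates directly into TypeScript. A minimal sketch, where {ACCOUNT_ID} and {API_TOKEN} are placeholder stand-ins exactly as in the shell example:</p>

```typescript
// Build the same translation request as the curl example above.
const MODEL = "@cf/meta/m2m100-1.2b";

function translationCall(accountId: string, apiToken: string, text: string) {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${MODEL}`,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${apiToken}` },
      body: JSON.stringify({ text, target_lang: "french" }),
    },
  };
}

const aiCall = translationCall(
  "{ACCOUNT_ID}",
  "{API_TOKEN}",
  "I'll have an order of the moule frites"
);
console.log(aiCall.url);
// To send it: const res = await fetch(aiCall.url, aiCall.init);
// (await res.json()).result.answer then holds the translated text.
```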
    <div>
      <h2>Designed for developers</h2>
      <a href="#designed-for-developers">
        
      </a>
    </div>
    <p>Developer experience is really important to us. In fact, most of this post has been about just that. Making sure it works out of the box. Providing popular models that just work. Being accessible to all developers whether you build and deploy with Cloudflare or elsewhere. But it’s more than that - the experience should be frictionless, zero to production should be fast, and it should feel good along the way.</p><p>Let’s walk through another example to show just how easy it is to use! We’ll run Llama 2, a popular <a href="https://www.cloudflare.com/learning/ai/what-is-large-language-model/">large language model</a> open sourced by Meta, in a worker.</p><p>We’ll assume you have some of the basics already complete (Cloudflare account, Node, NPM, etc.), but if you don’t <a href="https://developers.cloudflare.com/workers-ai/get-started/local-dev-setup/">this guide</a> will get you properly set up!</p>
    <div>
      <h3>1. Create a Workers project</h3>
      <a href="#1-create-a-workers-project">
        
      </a>
    </div>
    <p>Create a new project named workers-ai by running:</p>
            <pre><code>$ npm create cloudflare@latest</code></pre>
            <p>When setting up your workers-ai worker, answer the setup questions as follows:</p><ul><li><p>Enter <b>workers-ai</b> for the app name</p></li><li><p>Choose <b>Hello World</b> script for the type of application</p></li><li><p>Select <b>yes</b> to using TypeScript</p></li><li><p>Select <b>yes</b> to using Git</p></li><li><p>Select <b>no</b> to deploying</p></li></ul><p>Lastly, navigate to your new app directory:</p>
            <pre><code>cd workers-ai</code></pre>
            
    <div>
      <h3>2. Connect Workers AI to your worker</h3>
      <a href="#2-connect-workers-ai-to-your-worker">
        
      </a>
    </div>
    <p>Create a Workers AI binding, which allows your worker to access the Workers AI service without having to manage an API key yourself.</p><p>To bind Workers AI to your worker, add the following to the end of your <b>wrangler.toml</b> file:</p>
            <pre><code>[ai]
binding = "AI" #available in your worker via env.AI</code></pre>
            <p>You can also bind Workers AI to a Pages Function. For more information, refer to <a href="https://developers.cloudflare.com/pages/platform/functions/bindings/#ai">Functions Bindings</a>.</p>
    <div>
      <h3>3. Install the Workers AI client library</h3>
      <a href="#3-install-the-workers-ai-client-library">
        
      </a>
    </div>
    
            <pre><code>npm install @cloudflare/ai</code></pre>
            
    <div>
      <h3>4. Run an inference task in your worker</h3>
      <a href="#4-run-an-inference-task-in-your-worker">
        
      </a>
    </div>
    <p>Update the <b>src/index.ts</b> with the following code:</p>
            <pre><code>import { Ai } from '@cloudflare/ai'
export default {
  async fetch(request, env) {
    const ai = new Ai(env.AI);
    const input = { prompt: "What's the origin of the phrase 'Hello, World'" };
    const output = await ai.run('@cf/meta/llama-2-7b-chat-int8', input );
    return new Response(JSON.stringify(output));
  },
};</code></pre>
            
    <div>
      <h3>5. Develop locally with Wrangler</h3>
      <a href="#5-develop-locally-with-wrangler">
        
      </a>
    </div>
    <p>While in your project directory, test Workers AI locally by running:</p>
            <pre><code>$ npx wrangler dev --remote</code></pre>
            <p><b>Note -</b> These models currently only run on Cloudflare’s network of GPUs (and not locally), so setting <code>--remote</code> above is a must, and you’ll be prompted to log in at this point.</p><p>Wrangler will give you a URL (most likely localhost:8787). Visit that URL, and you’ll see a response like this:</p>
            <pre><code>{
  "response": "Hello, World is a common phrase used to test the output of a computer program, particularly in the early stages of programming. The phrase \"Hello, World!\" is often the first program that a beginner learns to write, and it is included in many programming language tutorials and textbooks as a way to introduce basic programming concepts. The origin of the phrase \"Hello, World!\" as a programming test is unclear, but it is believed to have originated in the 1970s. One of the earliest known references to the phrase is in a 1976 book called \"The C Programming Language\" by Brian Kernighan and Dennis Ritchie, which is considered one of the most influential books on the development of the C programming language."
}</code></pre>
            
    <div>
      <h3>6. Deploy your worker</h3>
      <a href="#6-deploy-your-worker">
        
      </a>
    </div>
    <p>Finally, deploy your worker to make your project accessible on the Internet:</p>
            <pre><code>$ npx wrangler deploy
# Outputs: https://workers-ai.&lt;YOUR_SUBDOMAIN&gt;.workers.dev</code></pre>
            <p>And that’s it. You can literally go from zero to deployed AI in minutes. This is obviously a simple example, but it shows how easy it is to run Workers AI from any project.</p>
    <div>
      <h2>Privacy by default</h2>
      <a href="#privacy-by-default">
        
      </a>
    </div>
    <p>When Cloudflare was founded, our value proposition had three pillars: more secure, more reliable, and more performant. Over time, we’ve realized that a better Internet is also a more private Internet, and we want to play a role in building it.</p><p>That’s why Workers AI is private by default - we don’t train our models, LLM or otherwise, on your data or conversations, and our models don’t learn from your usage. You can feel confident using Workers AI in both personal and business settings, without having to worry about leaking your data. Other providers only offer this fundamental feature with their enterprise version. With us, it’s built in for everyone.</p><p>We’re also excited to support data localization in the future. To make this happen, we have an ambitious GPU rollout plan - we’re launching with seven sites today, roughly 100 by the end of 2023, and nearly everywhere by the end of 2024. Ultimately, this will empower developers to keep delivering killer AI features to their users, while staying compliant with their end users’ data localization requirements.</p>
    <div>
      <h2>The power of the platform</h2>
      <a href="#the-power-of-the-platform">
        
      </a>
    </div>
    
    <div>
      <h4>Vector database - Vectorize</h4>
      <a href="#vector-database-vectorize">
        
      </a>
    </div>
    <p>Workers AI is all about running inference, and making it really easy to do so, but sometimes inference is only part of the equation. Large language models are trained on a fixed set of data, based on a snapshot at a specific point in the past, and have no context on your business or use case. When you submit a prompt, information specific to you can increase the quality of results, making them more useful and relevant. That’s why we’re also launching Vectorize, our <a href="https://www.cloudflare.com/learning/ai/what-is-vector-database/">vector database</a> that’s designed to work seamlessly with Workers AI. Here’s a quick overview of how you might use Workers AI + Vectorize together.</p><p>Example: Use your data (knowledge base) to provide additional context to an LLM when a user is chatting with it.</p><ol><li><p><b>Generate initial embeddings:</b> run your data through Workers AI using an <a href="https://www.cloudflare.com/learning/ai/what-are-embeddings/">embedding model</a>. The output will be embeddings, which are numerical representations of those words.</p></li><li><p><b>Insert those embeddings into Vectorize:</b> this essentially seeds the vector database with your data, so we can later use it to retrieve embeddings that are similar to your users’ query.</p></li><li><p><b>Generate embedding from user question:</b> when a user submits a question to your AI app, first, take that question, and run it through Workers AI using an embedding model.</p></li><li><p><b>Get context from Vectorize:</b> use that embedding to query Vectorize. This should output embeddings that are similar to your user’s question.</p></li><li><p><b>Create a context-aware prompt:</b> now take the original text associated with those embeddings, and create a new prompt combining the text from the vector search, along with the original question.</p></li><li><p><b>Run prompt:</b> run this prompt through Workers AI using an LLM to get your final result.</p></li></ol>
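<p>The six steps above can be sketched as a single function. This is an illustrative outline, not a real API: the <code>embed</code>, <code>search</code>, and <code>generate</code> callbacks stand in for calls to a Workers AI embedding model, a Vectorize query, and an LLM, respectively.</p>

```typescript
// Sketch of the retrieval-augmented flow described above, with the
// model and database calls injected as callbacks (names are illustrative).

type Embed = (text: string) => Promise<number[]>;
// Returns the original texts associated with the topK closest vectors.
type Search = (vector: number[], topK: number) => Promise<string[]>;
type Generate = (prompt: string) => Promise<string>;

async function answerWithContext(
  question: string,
  embed: Embed,
  search: Search,
  generate: Generate
): Promise<string> {
  // Steps 3-4: embed the user question, then retrieve similar content
  const queryVector = await embed(question);
  const context = await search(queryVector, 3);
  // Step 5: combine the retrieved text with the original question
  const prompt = `Context:\n${context.join('\n')}\n\nQuestion: ${question}`;
  // Step 6: run the context-aware prompt through an LLM
  return generate(prompt);
}
```

<p>Steps 1-2 (embedding and inserting your knowledge base) happen once, up front, which is why they don’t appear in the per-request flow.</p>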
    <div>
      <h4>AI Gateway</h4>
      <a href="#ai-gateway">
        
      </a>
    </div>
    <p>That covers a more advanced use case. On the flip side, if you are running models elsewhere, but want to get more out of the experience, you can run those APIs through our AI Gateway to get features like caching, rate-limiting, analytics and logging. These features can be used to protect your endpoint, monitor and optimize costs, and also help with data loss prevention. Learn more about AI Gateway <a href="/announcing-ai-gateway">here</a>.</p>
    <div>
      <h2>Start building today</h2>
      <a href="#start-building-today">
        
      </a>
    </div>
    <p>Try it out for yourself, and let us know what you think. Today we’re launching Workers AI as an open Beta for all Workers plans - free or paid. That said, it’s super early, so…</p>
    <div>
      <h4>Warning - It’s an early beta</h4>
      <a href="#warning-its-an-early-beta">
        
      </a>
    </div>
    <p>Usage is <b>not currently recommended for production apps</b>, and limits + access are subject to change.</p>
    <div>
      <h4>Limits</h4>
      <a href="#limits">
        
      </a>
    </div>
    <p>We’re initially launching with limits on a per-model basis:</p><ul><li><p>@cf/meta/llama-2-7b-chat-int8: 50 reqs/min globally</p></li></ul><p>Check out our <a href="https://developers.cloudflare.com/workers-ai/platform/limits/">docs</a> for a full overview of our limits.</p>
    <div>
      <h4>Pricing</h4>
      <a href="#pricing">
        
      </a>
    </div>
    <p>What we released today is just a small preview to give you a taste of what’s coming (we simply couldn’t hold back), but we’re looking forward to putting the full-throttle version of Workers AI in your hands.</p><p>We realize that as you approach building something, you want to understand one thing: how much is this going to cost me? That’s especially true with AI, where costs can easily get out of hand. So we wanted to share the upcoming pricing of Workers AI with you.</p><p>While we won’t be billing on day one, we are announcing what we expect our pricing will look like.</p><p>Users will be able to choose from two ways to run Workers AI:</p><ul><li><p><b>Regular Twitch Neurons (RTN)</b> - running wherever there's capacity at $0.01 / 1k neurons</p></li><li><p><b>Fast Twitch Neurons (FTN)</b> - running at the nearest user location at $0.125 / 1k neurons</p></li></ul><p>You may be wondering — what’s a neuron?</p><p>Neurons are a way to measure AI output that always scales down to zero (if you get no usage, you will be charged for 0 neurons). To give you a sense of what you can accomplish with a thousand neurons: you can generate 130 LLM responses, run 830 image classifications, or create 1,250 embeddings.</p><p>Our goal is to help our customers pay only for what they use, and choose the pricing that best matches their use case, whether it’s price or latency that is top of mind.</p>
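<p>As a back-of-the-envelope sketch of the announced (not yet billed) rates, here’s a tiny cost estimator; the function and its name are ours, purely for illustration:</p>

```typescript
// Announced rates: $0.01 per 1k neurons (RTN), $0.125 per 1k neurons (FTN).
const PRICE_PER_1K_NEURONS = { rtn: 0.01, ftn: 0.125 } as const;

// Estimate the monthly cost in USD for a given neuron count and tier.
function estimateCost(neurons: number, tier: 'rtn' | 'ftn'): number {
  return (neurons / 1000) * PRICE_PER_1K_NEURONS[tier];
}

// Example: ~1k neurons covers roughly 130 LLM responses, so 1,300 responses
// is on the order of 10k neurons - roughly $0.10 on RTN, $1.25 on FTN.
```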
    <div>
      <h3>What’s on the roadmap?</h3>
      <a href="#whats-on-the-roadmap">
        
      </a>
    </div>
    <p>Workers AI is just getting started, and we want your feedback to help us make it great. That said, there are some exciting things on the roadmap.</p>
    <div>
      <h4>More models, please</h4>
      <a href="#more-models-please">
        
      </a>
    </div>
    <p>We're launching with a solid set of models that just work, but will continue to roll out new models based on your feedback. If there’s a particular model you'd love to see on Workers AI, pop into our <a href="https://discord.cloudflare.com/">Discord</a> and let us know!</p><p>In addition to that, we're also announcing a <a href="/best-place-region-earth-inference">partnership with Hugging Face</a>, and soon you'll be able to access and run a subset of the Hugging Face catalog directly from Workers AI.</p>
    <div>
      <h4>Analytics + observability</h4>
      <a href="#analytics-observability">
        
      </a>
    </div>
    <p>Up to this point, we’ve been hyper-focused on one thing - making it really easy for any developer to run powerful AI models in just a few lines of code. But that’s only one part of the story. Up next, we’ll be working on some analytics and <a href="https://www.cloudflare.com/learning/performance/what-is-observability/">observability</a> capabilities to give you insights into your usage + performance + spend on a per-model basis, plus the ability to dig into your logs if you want to do some exploring.</p>
    <div>
      <h4>A road to global GPU coverage</h4>
      <a href="#a-road-to-global-gpu-coverage">
        
      </a>
    </div>
    <p>Our goal is to be the best place to run inference on Region: Earth, so we're adding GPUs to our data centers as fast as we can.</p><p><b>We plan to be in 100 data centers by the end of this year</b></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5A8SGUOEAcs3sjNjv48yIh/bafbc77b256fef490d4357613b036603/image3-28.png" />
            
            </figure><p><b>And nearly everywhere by the end of 2024</b></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2rrL2H0dHYZ4hxOBq0X1pw/f38d122af92f789dc2b31d3bdea1ab06/unnamed-3.png" />
            
            </figure><p><b>We’re really excited to see you build</b> - head over to <a href="https://developers.cloudflare.com/workers-ai/">our docs</a> to get started.</p><p>If you need inspiration, want to share something you’re building, or have a question - pop into our <a href="https://discord.com/invite/cloudflaredev">Developer Discord</a>.</p> ]]></content:encoded>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Database]]></category>
            <category><![CDATA[Vectorize]]></category>
            <category><![CDATA[Developers]]></category>
            <guid isPermaLink="false">6jSrrIFC7yStZxCaqaM0c1</guid>
            <dc:creator>Phil Wittig</dc:creator>
            <dc:creator>Rita Kozlov</dc:creator>
            <dc:creator>Rebecca Weekly</dc:creator>
            <dc:creator>Celso Martinho</dc:creator>
            <dc:creator>Meaghan Choi</dc:creator>
        </item>
        <item>
            <title><![CDATA[Vectorize: a vector database for shipping AI-powered applications to production, fast]]></title>
            <link>https://blog.cloudflare.com/vectorize-vector-database-open-beta/</link>
            <pubDate>Wed, 27 Sep 2023 13:00:31 GMT</pubDate>
            <description><![CDATA[ Vectorize is our brand-new vector database offering, designed to let you build full-stack, AI-powered applications entirely on Cloudflare’s global network: and you can start building with it right away ]]></description>
            <content:encoded><![CDATA[ <p></p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4UEegJQ4EtPbnJwh7UQZcd/2221274c908415bce2e1eba81a115d90/image2-21.png" />
            
            </figure><p>Vectorize is our brand-new <a href="https://www.cloudflare.com/learning/ai/what-is-vector-database/">vector database</a> offering, designed to let you build full-stack, AI-powered applications entirely on Cloudflare’s global network: and you can start building with it right away. Vectorize is in open beta, and is available to any developer using <a href="https://workers.cloudflare.com/">Cloudflare Workers</a>.</p><p>You can use Vectorize with <a href="/workers-ai">Workers AI</a> to power semantic search, classification, recommendation and anomaly detection use-cases directly with Workers, improve the accuracy and context of answers from <a href="https://www.cloudflare.com/learning/ai/what-is-large-language-model/">LLMs (Large Language Models)</a>, and/or bring-your-own <a href="https://www.cloudflare.com/learning/ai/what-are-embeddings/">embeddings</a> from popular platforms, including OpenAI and Cohere.</p><p>Visit <a href="https://developers.cloudflare.com/vectorize/get-started/">Vectorize’s developer documentation</a> to get started, or read on if you want to better understand what vector databases do and how Vectorize is different.</p>
    <div>
      <h2>Why do I need a vector database?</h2>
      <a href="#why-do-i-need-a-vector-database">
        
      </a>
    </div>
    
    <div>
      <h3>Machine learning models can’t remember anything: only what they were trained on.</h3>
      <a href="#machine-learning-models-cant-remember-anything-only-what-they-were-trained-on">
        
      </a>
    </div>
    <p>Vector databases are designed to solve this, by capturing how an ML model represents data — including structured and unstructured text, images and audio — and storing it in a way that allows you to compare against <i>future</i> inputs. This allows us to leverage the power of existing machine-learning models and LLMs (Large Language Models) for content they haven’t been trained on, which, given the tremendous cost of training models, turns out to be extremely powerful.</p><p>To better illustrate why a vector database like Vectorize is useful, let’s pretend they don’t exist, and see how painful it is to give context to an ML model or LLM for a semantic search or recommendation task. Our goal is to understand what content is similar to our query and return it, based on our own dataset.</p><ol><li><p>Our user query comes in: they’re searching for “how to write to R2 from Cloudflare Workers”</p></li><li><p>We load up our entire documentation dataset — a thankfully “small” dataset at about 65,000 sentences, or 2.1 GB — and provide it alongside the query from our user. 
This allows the model to have the context it needs, based on our data.</p></li><li><p><b>We wait.</b></p></li><li><p><b>(A long time)</b></p></li><li><p>We get our similarity scores back, with the sentences most similar to the user’s query, and then work to map those back to URLs before we return our search results.</p></li></ol><p>… and then another query comes in, and we have to start this all over again.</p><p>In practice, this isn’t really possible: we can’t pass that much context in an API call (prompt) to most <a href="https://www.cloudflare.com/learning/ai/what-is-machine-learning/">machine learning models</a>, and even if we could, it’d take tremendous amounts of memory and time to process our dataset over and over again.</p><p>With a vector database, we don’t have to repeat step 2: we perform it once, or as our dataset updates, and use our vector database to provide a form of long-term memory for our machine learning model. Our workflow looks a little more like this:</p><ol><li><p>We load up our entire documentation dataset, run it through our model, and store the resulting vector embeddings in our vector database (just once).</p></li><li><p>For each user query (and only the query), we run it through the same model and retrieve a vector representation.</p></li><li><p>We query our vector database with that query vector, which returns the vectors closest to our query vector.</p></li></ol><p>If we look at these two flows side by side, we can quickly see how inefficient and impractical it is to use our own dataset with an existing model without a vector database:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6nVc2lsVxxlTjVWb5fF8Gn/df03f68c7792ece281f887608f0bad2f/image4-11.png" />
            
            </figure><p>Using a vector database to help machine learning models remember.</p><p>From this simple example, it’s probably starting to make some sense: but you might also be wondering why you need a vector database instead of just a regular database.</p><p>Vectors are the model’s representation of an input: how it maps that input to its internal structure, or “features”. Broadly, the more similar vectors are, the more similar the model believes those inputs to be based on how it extracts features from an input.</p><p>This is seemingly easy when we look at example vectors of only a handful of dimensions. But with real-world outputs, searching across 10,000 to 250,000 vectors, each potentially 1,536 dimensions wide, is non-trivial. This is where vector databases come in: to make search work at scale, vector databases use a specific class of algorithm, such as k-nearest neighbors (<a href="https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm">kNN</a>) or other approximate nearest neighbor (ANN) <a href="https://arxiv.org/abs/1603.09320">algorithms</a> to determine vector similarity.</p><p>And although vector databases are extremely useful when building <a href="https://www.cloudflare.com/learning/ai/what-is-artificial-intelligence/">AI</a> and machine learning powered applications, they’re not <i>only</i> useful in those use-cases: they can be used for a multitude of classification and anomaly detection tasks. Knowing whether a query input is similar — or potentially dissimilar — from other inputs can power content moderation (does this match known-bad content?) and security alerting (have I seen this before?) tasks as well.</p>
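<p>To make the notion of “similarity” concrete, here’s a minimal sketch of what a vector search computes under the hood: a brute-force cosine-similarity scan. Production vector databases replace the linear scan with approximate nearest neighbor index structures, but the score itself is the same idea. The function names are ours, purely for illustration.</p>

```typescript
// Cosine similarity between two equal-length vectors: the dot product
// divided by the product of their magnitudes. Scores range from 1
// (same direction) through 0 (unrelated) to -1 (opposite).
function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force "index": score every stored vector against the query and
// return the ids of the topK most similar ones.
function topK(
  query: number[],
  index: { id: string; values: number[] }[],
  k: number
): string[] {
  return index
    .map((v) => ({ id: v.id, score: cosine(query, v.values) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((m) => m.id);
}
```

<p>This linear scan is O(number of vectors × dimensions) per query, which is exactly why ANN algorithms exist once you’re searching hundreds of thousands of high-dimensional vectors.</p>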
    <div>
      <h2>Building a recommendation engine with vector search</h2>
      <a href="#building-a-recommendation-engine-with-vector-search">
        
      </a>
    </div>
    <p>We built Vectorize to be a powerful partner to <a href="https://developers.cloudflare.com/workers-ai/">Workers AI</a>: enabling you to run vector search tasks as close to users as possible, and without having to think about how to scale it for production.</p><p>We’re going to take a real-world example — building a (product) recommendation engine for an e-commerce store — and simplify a few things.</p><p>Our goal is to show a list of “relevant products” on each product listing page: a perfect use-case for vector search. Our input vectors in the example are placeholders, but in a real-world application we would generate them based on product descriptions and/or cart data by passing them through a sentence similarity model (such as Workers AI’s <a href="https://developers.cloudflare.com/workers-ai/models/embedding/">text embedding model</a>).</p><p>Each vector represents a product across our store, and we associate the URL of the product with it. We could also set the ID of each vector to the product ID: both approaches are valid. Our query — vector search — represents the product description and content for the product the user is currently viewing.</p><p>Let’s step through what this looks like in code: this example is pulled straight from our <a href="https://developers.cloudflare.com/vectorize/get-started/">developer documentation</a>:</p>
            <pre><code>export interface Env {
	// This makes our vector index methods available on env.TUTORIAL_INDEX.*
	// e.g. env.TUTORIAL_INDEX.insert() or .query()
	TUTORIAL_INDEX: VectorizeIndex;
}

// Sample vectors: 3 dimensions wide.
//
// Vectors from a machine-learning model are typically ~100 to 1536 dimensions
// wide (or wider still).
const sampleVectors: Array&lt;VectorizeVector&gt; = [
	{ id: '1', values: [32.4, 74.1, 3.2], metadata: { url: '/products/sku/13913913' } },
	{ id: '2', values: [15.1, 19.2, 15.8], metadata: { url: '/products/sku/10148191' } },
	{ id: '3', values: [0.16, 1.2, 3.8], metadata: { url: '/products/sku/97913813' } },
	{ id: '4', values: [75.1, 67.1, 29.9], metadata: { url: '/products/sku/418313' } },
	{ id: '5', values: [58.8, 6.7, 3.4], metadata: { url: '/products/sku/55519183' } },
];

export default {
	async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise&lt;Response&gt; {
		if (new URL(request.url).pathname !== '/') {
			return new Response('', { status: 404 });
		}
		// Insert some sample vectors into our index
		// In a real application, these vectors would be the output of a machine learning (ML) model,
		// such as Workers AI, OpenAI, or Cohere.
		let inserted = await env.TUTORIAL_INDEX.insert(sampleVectors);

		// Log the number of IDs we successfully inserted
		console.info(`inserted ${inserted.count} vectors into the index`);

		// In a real application, we would take a user query - e.g. "durable
		// objects" - and transform it into a vector embedding first.
		//
		// In our example, we're going to construct a simple vector that should
		// match vector id #5
		let queryVector: Array&lt;number&gt; = [54.8, 5.5, 3.1];

		// Query our index and return the three (topK = 3) most similar vector
		// IDs with their similarity score.
		//
		// By default, vector values are not returned, as in many cases the
		// vectorId and scores are sufficient to map the vector back to the
		// original content it represents.
		let matches = await env.TUTORIAL_INDEX.query(queryVector, { topK: 3, returnVectors: true });

		// We map over our results to find the most similar vector result.
		//
		// Since our index uses the 'cosine' distance metric, scores will range
		// from 1 to -1. A score of 1 means the vectors are identical; the
		// closer to 1, the more similar. A score of 0 means the vectors are
		// unrelated, and -1 means they are most dissimilar (opposite).
		// let closestScore = 0;
		// let mostSimilarId = '';
		// matches.matches.map((match) =&gt; {
		// 	if (match.score &gt; closestScore) {
		// 		closestScore = match.score;
		// 		mostSimilarId = match.vectorId;
		// 	}
		// });

		return Response.json({
			// This will return the closest vectors: we'll see that the vector
			// with id = 5 has the highest score (closest to 1.0) as the
			// distance between it and our query vector is the smallest.
			// Return the full set of matches so we can see the possible scores.
			matches: matches,
		});
	},
};</code></pre>
            <p>The code above is intentionally simple, but illustrates vector search at its core: we insert vectors into our database, and query it for vectors with the smallest distance to our query vector.</p><p>Here are the results, with the values included, so we can visually observe that our query vector <code>[54.8, 5.5, 3.1]</code> is similar to our highest scoring match: <code>[58.799, 6.699, 3.400]</code> returned from our search. This index uses <a href="https://en.wikipedia.org/wiki/Cosine_similarity">cosine</a> similarity to calculate the distance between vectors, which means that the closer the score is to 1, the more similar a match is to our query vector.</p>
            <pre><code>{
  "matches": {
    "count": 3,
    "matches": [
      {
        "score": 0.999909,
        "vectorId": "5",
        "vector": {
          "id": "5",
          "values": [
            58.79999923706055,
            6.699999809265137,
            3.4000000953674316
          ],
          "metadata": {
            "url": "/products/sku/55519183"
          }
        }
      },
      {
        "score": 0.789848,
        "vectorId": "4",
        "vector": {
          "id": "4",
          "values": [
            75.0999984741211,
            67.0999984741211,
            29.899999618530273
          ],
          "metadata": {
            "url": "/products/sku/418313"
          }
        }
      },
      {
        "score": 0.611976,
        "vectorId": "2",
        "vector": {
          "id": "2",
          "values": [
            15.100000381469727,
            19.200000762939453,
            15.800000190734863
          ],
          "metadata": {
            "url": "/products/sku/10148191"
          }
        }
      }
    ]
  }
}</code></pre>
            <p>In a real application, we could now quickly return product recommendation URLs based on the most similar products, sorting them by their score (highest to lowest), and increasing the topK value if we want to show more. The metadata stored alongside each vector could also embed a path to an <a href="https://developers.cloudflare.com/r2/">R2 object</a>, a UUID for a row in a <a href="https://www.cloudflare.com/developer-platform/products/d1/">D1 database</a>, or a key-value pair from <a href="https://developers.cloudflare.com/kv/">Workers KV</a>.</p>
    <div>
      <h3>Workers AI + Vectorize: full stack vector search on Cloudflare</h3>
      <a href="#workers-ai-vectorize-full-stack-vector-search-on-cloudflare">
        
      </a>
    </div>
    <p>In a real application, we need a machine learning model that can both generate vector embeddings from our original dataset (to seed our database) and <i>quickly</i> turn user queries into vector embeddings too. These need to be from the same model, as each model represents features differently.</p><p>Here’s a compact example building an entire end-to-end vector search pipeline on Cloudflare:</p>
            <pre><code>import { Ai } from '@cloudflare/ai';
export interface Env {
	TEXT_EMBEDDINGS: VectorizeIndex;
	AI: any;
}
interface EmbeddingResponse {
	shape: number[];
	data: number[][];
}

export default {
	async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise&lt;Response&gt; {
		const ai = new Ai(env.AI);
		let path = new URL(request.url).pathname;
		if (path.startsWith('/favicon')) {
			return new Response('', { status: 404 });
		}

		// We only need to generate vector embeddings once (or as our
		// data changes), not on every request
		if (path === '/insert') {
			// In a real-world application, we could read in content from R2 or
			// a SQL database (like D1) and pass it to Workers AI
			const stories = ['This is a story about an orange cloud', 'This is a story about a llama', 'This is a story about a hugging emoji'];
			const modelResp: EmbeddingResponse = await ai.run('@cf/baai/bge-base-en-v1.5', {
				text: stories,
			});

			// We need to convert the vector embeddings into a format Vectorize can accept.
			// Each vector needs an id, a value (the vector) and optional metadata.
			// In a real app, our ID would typically be bound to the ID of the source
			// document.
			let vectors: VectorizeVector[] = [];
			let id = 1;
			modelResp.data.forEach((vector) =&gt; {
				vectors.push({ id: `${id}`, values: vector });
				id++;
			});

			await env.TEXT_EMBEDDINGS.upsert(vectors);
		}

		// Our query: we expect this to match vector id: 1 in this simple example
		let userQuery = 'orange cloud';
		const queryVector: EmbeddingResponse = await ai.run('@cf/baai/bge-base-en-v1.5', {
			text: [userQuery],
		});

		let matches = await env.TEXT_EMBEDDINGS.query(queryVector.data[0], { topK: 1 });
		return Response.json({
			// We expect vector id: 1 to be our top match with a score of
			// ~0.896888444
			// We are using a cosine distance metric, where the closer to one,
			// the more similar.
			matches: matches,
		});
	},
};</code></pre>
            <p>The code above does four things:</p><ol><li><p>It passes the three sentences to Workers AI’s <a href="https://developers.cloudflare.com/workers-ai/models/embedding/">text embedding model</a> (<code>@cf/baai/bge-base-en-v1.5</code>) and retrieves their vector embeddings.</p></li><li><p>It inserts those vectors into our Vectorize index.</p></li><li><p>It takes the user query and transforms it into a vector embedding via the same Workers AI model.</p></li><li><p>It queries our Vectorize index for matches.</p></li></ol><p>This example might look “too” simple, but in a production application, we’d only have to change two things: just insert our vectors once (or periodically via <a href="https://developers.cloudflare.com/workers/configuration/cron-triggers/">Cron Triggers</a>), and replace our three example sentences with real data stored in R2, a D1 database, or another storage provider.</p><p>In fact, this is incredibly similar to how we run <a href="https://developers.cloudflare.com/workers/ai/">Cursor</a>, the AI assistant that can answer questions about Cloudflare Workers: we migrated Cursor to run on Workers AI and Vectorize. We generate text embeddings from our developer documentation using its built-in text embedding model, insert them into a Vectorize index, and transform user queries on the fly via that same model.</p>
    <div>
      <h2>BYO embeddings from your favorite AI API</h2>
      <a href="#byo-embeddings-from-your-favorite-ai-api">
        
      </a>
    </div>
    <p>Vectorize isn’t just limited to Workers AI, though: it’s a fully-fledged, standalone vector database.</p><p>If you’re already using <a href="https://platform.openai.com/docs/guides/embeddings">OpenAI’s Embedding API</a>, Cohere’s <a href="https://docs.cohere.com/reference/embed">multilingual model</a>, or any other embedding API, then you can easily bring-your-own (BYO) vectors to Vectorize.</p><p>It works just the same: generate your embeddings, insert them into Vectorize, and pass your queries through the model before you query your index. Vectorize includes a few shortcuts for some of the most popular embedding models.</p>
            <pre><code># Vectorize has ready-to-go presets that set the dimensions and distance metric for popular embeddings models
$ wrangler vectorize create openai-index-example --preset=openai-text-embedding-ada-002</code></pre>
            <p>This can be particularly useful if you already have an existing workflow around an existing embeddings API, and/or have validated a specific multimodal or multilingual embeddings model for your use-case.</p>
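<p>As a sketch of that BYO flow, here is how a response from an embeddings API could be reshaped into the <code>{ id, values }</code> records that Vectorize’s <code>upsert()</code> accepts. The <code>data[].embedding</code> shape below follows OpenAI’s embeddings response; the id scheme is just one choice made for this example:</p>

```javascript
// Reshape an embeddings API response (OpenAI's shape: { data: [{ embedding }] })
// into the { id, values } records Vectorize expects for insert()/upsert().
// The id scheme (prefix + position) is an arbitrary choice for this sketch.
function toVectorizeVectors(embeddingsResponse, idPrefix) {
  return embeddingsResponse.data.map((item, i) => ({
    id: `${idPrefix}-${i}`,
    values: item.embedding,
  }));
}

// In a Worker you would then call, e.g.:
//   await env.YOUR_INDEX.upsert(toVectorizeVectors(response, "doc"));
const example = { data: [{ embedding: [0.1, 0.2] }, { embedding: [0.3, 0.4] }] };
console.log(toVectorizeVectors(example, "doc"));
// → [{ id: "doc-0", values: [0.1, 0.2] }, { id: "doc-1", values: [0.3, 0.4] }]
```

<p>Querying works the same way: embed the user’s query with the same external model, then pass the resulting vector to your index’s <code>query()</code>.</p>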
    <div>
      <h2>Making the cost of AI predictable</h2>
      <a href="#making-the-cost-of-ai-predictable">
        
      </a>
    </div>
    <p>There’s a tremendous amount of excitement around AI and ML, but there’s also one big concern: that it’s too expensive to experiment with, and hard to predict at scale.</p><p>With Vectorize, we wanted to bring a simpler pricing model to vector databases. Have an idea for a proof-of-concept at work? That should fit into our free-tier limits. Scaling up and optimizing your embedding dimensions for performance vs. accuracy? It shouldn’t break the bank.</p><p>Importantly, Vectorize aims to be predictable: you don’t need to estimate CPU and memory consumption, which can be hard when you’re just starting out, and made even harder when trying to plan for your peak vs. off-peak hours in production for a brand new use-case. Instead, you’re charged based on the total number of vector dimensions you store, and the number of queries against them each month. It’s our job to take care of scaling up to meet your query patterns.</p><p>Here’s the pricing for Vectorize — and if you have a Workers paid plan now, Vectorize is entirely free to use until 2024:</p>
<table>
<thead>
  <tr>
    <th></th>
    <th><span>Workers Free (coming soon)</span></th>
    <th><span>Workers Paid ($5/month)</span></th>
  </tr>
</thead>
<tbody>
  <tr>
    <td><span>Queried vector dimensions included</span></td>
    <td><span>30M total queried dimensions / month</span></td>
    <td><span>50M total queried dimensions / month</span></td>
  </tr>
  <tr>
    <td><span>Stored vector dimensions included</span></td>
    <td><span>5M stored dimensions / month</span></td>
    <td><span>10M stored dimensions / month</span></td>
  </tr>
  <tr>
    <td><span>Additional cost </span></td>
    <td><span>$0.04 / 1M vector dimensions queried or stored</span></td>
    <td><span>$0.04 / 1M vector dimensions queried or stored</span></td>
  </tr>
</tbody>
</table><p>Pricing is based entirely on what you store and query: <code>(total vectors queried + stored) * dimensions_per_vector * price</code>. Query more? Easy to predict. Optimizing for smaller dimensions per vector to improve speed and reduce overall latency? Cost goes down. Have a few indexes for prototyping or experimenting with new use-cases? We don’t charge per-index.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/9i10jyPHmjy6FTjqtCD2S/8362250de55ae98d45068fc5d37dc7e4/image1-25.png" />
            
</figure><p><i>Create as many indexes as you need to prototype new ideas and/or separate production from dev.</i></p><p>As an example: if you load 10,000 Workers AI vectors (384 dimensions each) and make 5,000 queries against your index each day, it’d result in 49 million total vector dimensions queried and <i>still</i> fit into what we include in the Workers Paid plan ($5/month). Better still: we don’t delete your indexes due to inactivity.</p><p>Note that while this pricing isn’t final, we expect few changes going forward. We want to avoid the element of surprise: there’s nothing worse than starting to build on a platform and realizing the pricing is untenable <i>after</i> you’ve invested the time writing code, tests and learning the nuances of a technology.</p>
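<p>To make the pricing formula concrete, here is a small illustrative calculator for the Workers Paid plan, using the included amounts and the overage price from the table above (a sketch for estimation, not an official billing tool):</p>

```javascript
// Illustrative Vectorize overage estimate on the Workers Paid plan, using the
// table above: 50M queried + 10M stored dimensions included per month, then
// $0.04 per additional 1M dimensions.
function estimateMonthlyOverage({ storedVectors, dimensions, queriesPerMonth }) {
  const storedDims = storedVectors * dimensions;
  const queriedDims = queriesPerMonth * dimensions;
  const billableDims =
    Math.max(0, queriedDims - 50_000_000) +
    Math.max(0, storedDims - 10_000_000);
  return (billableDims / 1_000_000) * 0.04; // USD, on top of the $5/month plan
}

// 10,000 vectors at 384 dimensions with 100,000 queries/month:
// 3.84M stored and 38.4M queried dimensions, both within the included amounts.
console.log(estimateMonthlyOverage({ storedVectors: 10_000, dimensions: 384, queriesPerMonth: 100_000 })); // 0

// At 200,000 queries/month, 76.8M dimensions are queried: 26.8M over the
// included 50M, i.e. roughly 26.8 * $0.04 ≈ $1.07 of overage.
console.log(estimateMonthlyOverage({ storedVectors: 10_000, dimensions: 384, queriesPerMonth: 200_000 }));
```
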
    <div>
      <h2>Vectorize!</h2>
      <a href="#vectorize">
        
      </a>
    </div>
<p>Every Workers developer on a paid plan can start using Vectorize immediately: the open beta is available right now, and you can <a href="https://developers.cloudflare.com/vectorize/">visit our developer documentation to get started</a>.</p><p>This is also just the beginning of the vector database story for us at Cloudflare. Over the next few weeks and months, we intend to land a new query engine that should further improve query performance, support even larger indexes, introduce sub-index filtering, increase metadata limits, and add per-index analytics.</p><p>If you’re looking for inspiration on what to build, <a href="http://developers.cloudflare.com/vectorize/get-started/embeddings/">see the semantic search tutorial</a> that combines Workers AI and Vectorize for document search, running entirely on Cloudflare. Or see an example of <a href="https://developers.cloudflare.com/workers-ai/tutorials/build-a-retrieval-augmented-generation-ai/">how to combine OpenAI and Vectorize</a> to give an LLM more context and dramatically improve the accuracy of its answers.</p><p>And if you have questions for our product &amp; engineering teams about how to use Vectorize, or just want to bounce an idea off of other developers building on Workers AI, join the #vectorize and #workers-ai channels on our <a href="https://discord.cloudflare.com/">Developer Discord</a>.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/V5sZHDJiYORdAiY3o6K6U/cd72b9e7eb6715300ce2b1afe4b7b26a/image6-3.png" />
            
            </figure><p></p> ]]></content:encoded>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Vectorize]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[Database]]></category>
            <category><![CDATA[Product News]]></category>
            <category><![CDATA[Developers]]></category>
            <guid isPermaLink="false">5I4TqJNTxn1vCQd79HEUoZ</guid>
            <dc:creator>Matt Silverlock</dc:creator>
            <dc:creator>Jérôme Schneider</dc:creator>
        </item>
        <item>
            <title><![CDATA[Cloudflare Workers database integration with Upstash]]></title>
            <link>https://blog.cloudflare.com/cloudflare-workers-database-integration-with-upstash/</link>
            <pubDate>Wed, 02 Aug 2023 13:00:22 GMT</pubDate>
            <description><![CDATA[ Announcing the new Upstash database integrations for Workers. Now it is easier to use Upstash Redis, Kafka and QStash inside your Worker  ]]></description>
<content:encoded><![CDATA[ <p><i>This blog post references a feature which has updated documentation. For the latest reference content, visit </i><a href="https://developers.cloudflare.com/workers/databases/third-party-integrations/"><i>https://developers.cloudflare.com/workers/databases/third-party-integrations/</i></a></p><p>During <a href="https://www.cloudflare.com/developer-week/">Developer Week</a> we announced <a href="/announcing-database-integrations/">Database Integrations on Workers</a>, a new and seamless way to connect with some of the most popular databases. You select the provider, authorize through an OAuth2 flow, and automatically get the right configuration stored as encrypted environment variables on your Worker.</p><p>Today we are thrilled to announce that we have been working with Upstash to expand our integrations catalog. We are now offering three new integrations: Upstash Redis, Upstash Kafka and Upstash QStash. These integrations allow our customers to unlock new capabilities on Workers, providing them with a broader range of options to meet their specific requirements.</p>
    <div>
      <h3>Add the integration</h3>
      <a href="#add-the-integration">
        
      </a>
    </div>
<p>We are going to show the setup process using the Upstash Redis integration.</p><p>Select your Worker, go to the Settings tab, then select the Integrations tab to see all the available integrations.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4PgG63i9pFA5GtOuhGAAeE/5580ef72388faa48bb274d81edfd16ba/2.png" />
            
            </figure><p>After selecting the Upstash Redis integration we will get the following page.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4oL9KEz7NUDqw16aXrk2g0/2708bd58089fa1e8abc503bfd7074649/3.png" />
            
</figure><p>First, review and grant permissions so the integration can add secrets to your Worker. Second, connect to Upstash using the OAuth2 flow. Third, select the Redis database you want to use; the integration will then fetch the right information and generate the credentials. Finally, click “Add Integration” and you’re done! You can now use the credentials as environment variables in your Worker.</p>
    <div>
      <h3>Implementation example</h3>
      <a href="#implementation-example">
        
      </a>
    </div>
<p>In this example we are going to use the <a href="https://developers.cloudflare.com/fundamentals/get-started/reference/http-request-headers/#cf-ipcountry">CF-IPCountry</a> header to return a custom greeting message to visitors from Paraguay, the United States, Great Britain, and the Netherlands, while returning a generic message to visitors from other countries.</p><p>To begin, we load the custom greeting messages using Upstash’s online CLI tool.</p>
            <pre><code>➜ set PY "Mba'ẽichapa 🇵🇾"
OK
➜ set US "How are you? 🇺🇸"
OK
➜ set GB "How do you do? 🇬🇧"
OK
➜ set NL "Hoe gaat het met u? 🇳🇱"
OK</code></pre>
<p>We also need to install the <code>@upstash/redis</code> package in our Worker project before we upload the following code.</p>
<pre><code>import { Redis } from '@upstash/redis/cloudflare'
 
export default {
  async fetch(request, env, ctx) {
    // Cloudflare populates CF-IPCountry with the visitor's two-letter country code
    const country = request.headers.get("cf-ipcountry");
    // Build the client from the credentials the integration stored as secrets
    const redis = Redis.fromEnv(env);
    if (country) {
      // Greetings were keyed by country code via the Upstash CLI above
      const localizedMessage = await redis.get(country);
      if (localizedMessage) {
        return new Response(localizedMessage);
      }
    }
    // Generic greeting for countries without a stored message
    return new Response("👋👋 Hello there! 👋👋");
  },
};</code></pre>
<p>Just like that, we are returning a localized message from the Redis instance depending on the country the request originated from. There are also a couple of ways to improve performance: for write-heavy use cases, we can use <a href="/announcing-workers-smart-placement/">Smart Placement</a> with no replicas, so the Worker code is executed near the Redis instance provided by Upstash. Otherwise, creating a <a href="https://docs.upstash.com/redis/features/globaldatabase">Global Database</a> on Upstash with multiple read replicas across regions will help.</p>
    <div>
      <h3><a href="https://developers.cloudflare.com/workers/databases/native-integrations/upstash/">Try it now</a></h3>
      <a href="#">
        
      </a>
    </div>
    <p>Upstash Redis, Kafka and QStash are now available for all users! Stay tuned for more updates as we continue to expand our Database Integrations catalog.</p> ]]></content:encoded>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Kafka]]></category>
            <category><![CDATA[Database]]></category>
            <category><![CDATA[Internship Experience]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <guid isPermaLink="false">6PIdVuhR9PDMgFblDoqqfc</guid>
            <dc:creator>Joaquin Gimenez</dc:creator>
            <dc:creator>Shaun Persad</dc:creator>
        </item>
        <item>
            <title><![CDATA[Announcing database integrations: a few clicks to connect to Neon, PlanetScale and Supabase on Workers]]></title>
            <link>https://blog.cloudflare.com/announcing-database-integrations/</link>
            <pubDate>Tue, 16 May 2023 13:05:00 GMT</pubDate>
            <description><![CDATA[ Today we’re announcing Database Integrations  – making it seamless to connect to your database of choice on Workers.  ]]></description>
            <content:encoded><![CDATA[ <p></p><p><i>This blog post references a feature which has updated documentation. For the latest reference content, visit </i><a href="https://developers.cloudflare.com/workers/databases/third-party-integrations/"><i>https://developers.cloudflare.com/workers/databases/third-party-integrations/</i></a></p><p>One of the best feelings as a developer is seeing your idea come to life. You want to move fast and Cloudflare’s developer platform gives you the tools to take your applications from 0 to 100 within minutes.</p><p>One thing that we’ve heard slows developers down is the question: <i>“What databases can be used with Workers?”</i>. Developers stumble when it comes to things like finding the databases that Workers can connect to, the right library or driver that's compatible with Workers and translating boilerplate examples to something that can run on our developer platform.</p><p>Today we’re announcing Database Integrations  – making it seamless to connect to your database of choice on Workers. To start, we’ve added some of the most popular databases that support HTTP connections: <a href="https://neon.tech/">Neon</a>, <a href="https://planetscale.com/">PlanetScale</a> and <a href="https://supabase.com/">Supabase</a> with more (like Prisma, Fauna, MongoDB Atlas) to come!</p>
    <div>
      <h2>Focus more on code, less on config</h2>
      <a href="#focus-more-on-code-less-on-config">
        
      </a>
    </div>
<p>Our <a href="https://www.cloudflare.com/developer-platform/products/d1/">serverless SQL database</a>, D1, launched in open alpha last year, and we’re continuing to invest in making it production ready (stay tuned for an exciting update later this week!). We also recognize that there are plenty of flavours of databases, and we want developers to have the freedom to select what’s best for them and pair it with our powerful compute offering.</p><p>On the second day of Developer Week 2023, data is in the spotlight. We’re taking huge strides in making it possible and more performant to connect to databases from Workers (spoiler alert!):</p><ul><li><p><a href="/workers-tcp-socket-api-connect-databases">Announcing connect() — a new API for creating TCP sockets from Cloudflare Workers</a></p></li><li><p><a href="/announcing-workers-smart-placement/">Smart Placement speeds up applications by moving code close to your backend — no config needed</a></p></li></ul><p>Making it possible and performant is just the start: we also want to make connecting to databases painless. Databases have specific protocols, drivers, APIs and vendor-specific features that you need to understand in order to get up and running. With Database Integrations, we want to make this process foolproof.</p><p>Whether you’re working on your first project or your hundredth project, you should be able to connect to your database of choice with your eyes closed. With Database Integrations, you can spend less time focusing on configuration and more on doing what you love – building your applications!</p>
    <div>
      <h2>What does this experience look like?</h2>
      <a href="#what-does-this-experience-look-like">
        
      </a>
    </div>
    
    <div>
      <h3>Discoverability</h3>
      <a href="#discoverability">
        
      </a>
    </div>
<p>If you’re starting a project from scratch or want to connect Workers to an existing database, you want to know <i>“What are my options?”</i>.</p><p>Workers supports connections to a wide array of database providers over HTTP. With newly released <a href="/workers-tcp-socket-api-connect-databases">outbound TCP support</a>, the databases that you can connect to on Workers will only grow!</p><p>In the new “Integrations” tab, you’ll be able to view all the databases that we support and add the integration to your Worker directly from there. To start, we have support for Neon, PlanetScale and Supabase, with many more coming soon.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/72AInQBXAsWNkEcBr3DPeo/3c546a2d7e2cfdf403ffac91154f172a/image2-10.png" />
            
            </figure>
    <div>
      <h3>Authentication</h3>
      <a href="#authentication">
        
      </a>
    </div>
<p>You should never have to copy and paste your database credentials or other parts of the connection string.</p><p>Once you hit “Add Integration”, we take you through an OAuth2 flow that automatically fetches the right configuration from your database provider and adds it as encrypted environment variables to your Worker.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4DKr9USRBKD90foCkyPkLM/b9bedf4c535ed33c1b42bdc176f066c3/integration-flow.png" />
            
            </figure><p>Once you have credentials set up, check out our <a href="https://developers.cloudflare.com/workers/learning/integrations/databases/#native-database-integrations-beta">documentation</a> for examples on how to get started using the data platform’s client library. What’s more – we have templates coming that will allow you to get started even faster!</p><p>That’s it! With database integrations, you can connect your Worker with your database in just a few clicks. Head to your Worker &gt; Settings &gt; Integrations to try it out today.</p>
    <div>
      <h2>What’s next?</h2>
      <a href="#whats-next">
        
      </a>
    </div>
    <p>We’ve only just scratched the surface with Database Integrations and there’s a ton more coming soon!</p><p>While we’ll be continuing to add support for more popular data platforms we also know that it's impossible for us to keep up in a moving landscape. We’ve been working on an integrations platform so that any database provider can easily build their own integration with Workers. As a developer, this means that you can start tinkering with the next new database right away on Workers.</p><p>Additionally, we’re working on adding wrangler support, so you can create integrations directly from the CLI. We’ll also be adding support for account level environment variables in order for you to share integrations across the Workers in your account.</p><p>We’re really excited about the potential here and to see all the new creations from our developers! Be sure to join <a href="https://discord.gg/cloudflaredev">Cloudflare’s Developer Discord</a> and share your projects. Happy building!</p> ]]></content:encoded>
            <category><![CDATA[Developer Week]]></category>
            <category><![CDATA[Latin America]]></category>
            <category><![CDATA[SASE]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Serverless]]></category>
            <category><![CDATA[Database]]></category>
            <category><![CDATA[Internet Performance]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <category><![CDATA[Connectivity Cloud]]></category>
            <guid isPermaLink="false">bAEI0IEwgtOSNAWKw64Zl</guid>
            <dc:creator>Shaun Persad</dc:creator>
            <dc:creator>Emily Chen</dc:creator>
            <dc:creator>Tanushree Sharma</dc:creator>
        </item>
        <item>
            <title><![CDATA[Smart Placement speeds up applications by moving code close to your backend — no config needed]]></title>
            <link>https://blog.cloudflare.com/announcing-workers-smart-placement/</link>
            <pubDate>Tue, 16 May 2023 13:00:46 GMT</pubDate>
            <description><![CDATA[ Smart Placement automatically places your workloads in an optimal location that minimizes latency and speeds up your applications! ]]></description>
            <content:encoded><![CDATA[ 
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/68NfZl5AwLPrk573wmR7Id/e605ccc58d1612b69beef233229ae79f/image2-14.png" />
            
            </figure><p>We’ve all experienced the frustrations of slow loading websites, or an app that seems to stall when it needs to call an API for an update. Anything less than instant and your mind wanders to something else...</p><p>One way to make things fast is to bring resources as close to the user as possible — this is what Cloudflare has been doing with compute by running within milliseconds of most of the world’s population. But, counterintuitive as it may seem, sometimes bringing compute closer to the user can actually slow applications down. If your application needs to connect to APIs, databases or other resources that aren’t located near the end user then it can be more performant to run the application near the resources instead of the user.</p><p>So today we’re excited to announce Smart Placement for Workers and Pages Functions, making every interaction as fast as possible. With Smart Placement, Cloudflare is taking serverless computing to the Supercloud by moving compute resources to optimal locations in order to speed up applications. The best part – it’s completely automatic, without any extra input (like the dreaded “region”) needed.</p><p>Smart Placement is available now, in open beta, to all Workers and Pages customers!</p><p><a href="https://smart-placement-demo.pages.dev/">Check out our demo on how Smart Placement works!</a></p><p></p>
    <div>
      <h2>The serverless shift</h2>
      <a href="#the-serverless-shift">
        
      </a>
    </div>
    <p>Cloudflare’s anycast network is built to process requests <i>instantly and close to the user</i>. As a developer, that’s what makes Cloudflare Workers, our serverless compute offering so compelling. Competitors are bounded by “regions” while Workers run everywhere — hence we have one region: Earth. Requests handled entirely by Workers can be processed right then and there, without ever having to hit an origin server.</p><p>While this concept of serverless was originally considered to be for lightweight tasks, serverless computing has been seeing a shift in recent years. It’s being used to replace traditional architecture, which relies on origin servers and self-managed infrastructure, instead of simply augmenting it. We’re seeing more and more of these use cases with Workers and Pages users.</p>
    <div>
      <h3>Serverless needs state</h3>
      <a href="#serverless-needs-state">
        
      </a>
    </div>
    <p>With the shift to going serverless and building entire applications on Workers comes a need for data. Storing information about previous actions or events lets you build personalized, interactive applications. Say you need to create user profiles, store which page a user left off at, which SKUs a user has in their cart – all of these are mapped to data points used to maintain state. Backend services like relational databases, key-value stores, blob storage, and APIs all let you build stateful applications.</p>
    <div>
      <h3>Cloudflare compute + storage: a powerful duo</h3>
      <a href="#cloudflare-compute-storage-a-powerful-duo">
        
      </a>
    </div>
<p>We have our own growing suite of storage offerings: Workers KV, Durable Objects, D1, <a href="https://www.cloudflare.com/developer-platform/r2/">R2</a>. As we’re maturing our data products, we think deeply about their interactions with Workers so that you don’t have to! For example, another approach that has better performance in some cases is moving storage, rather than compute, close to users. If you’re using Durable Objects to create a real-time game, we could move the Durable Objects to minimize latency for all users.</p><p>Our goal for the future state is that you set <code>mode = "smart"</code> and we evaluate the optimal placement of all your resources with no additional configuration needed.</p>
    <div>
      <h3>Cloudflare compute + ${backendService}</h3>
      <a href="#cloudflare-compute-backendservice">
        
      </a>
    </div>
<p>Today, the primary use case for Smart Placement is when you’re using non-Cloudflare services like external databases or third party APIs for your applications.</p><p>Many backend services, whether they’re self-hosted or managed services, are centralized, meaning that data is stored and managed in a single location. Your users are global and Workers are global, but your backend is centralized.</p><p>If your code makes multiple requests to your backend services, those requests could cross the globe multiple times, taking a big toll on performance. Some services offer data replication and caching which help to improve performance, but also come with trade-offs like data consistency and higher costs that should be weighed against your use case.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6Xh0tIBnMeZp7z1mDeonqd/2d55124ff2c831d964eb36a98852a968/map.png" />
            
            </figure><p>The Cloudflare network is <a href="https://www.cloudflare.com/network/">~50ms from 95% of the world’s connected population</a>. Turning this on its head, we're also very close to your backend services.</p>
    <div>
      <h2>Application performance is user experience</h2>
      <a href="#application-performance-is-user-experience">
        
      </a>
    </div>
    <p>Let’s understand how moving compute close to your backend services could decrease application latency by walking through an example:</p><p>Say you have a user in Sydney, Australia who’s accessing an application running on Workers. This application makes three round trips to a database located in Frankfurt, Germany in order to serve the user’s request.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4EGjgjjCqADFBQGq5OXVFC/cfd95273a6e9a7afbe09ab27e3c90104/download--1--6.png" />
            
</figure><p>Intuitively, you can guess that the bottleneck is going to be the time it takes the Worker to perform multiple round trips to your database. Instead of the Worker being invoked close to the user, what if it were invoked in the data center closest to the database?</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4D4IOJIAlHuGYce5DqjGky/09d291252b813c1981cac9ef19cb1fd7/wAAAABJRU5ErkJggg__" />
            
            </figure><p>Let’s put this to the test.</p><p>We measured the request duration for a Worker without Smart Placement and compared it to one with Smart Placement enabled. For both tests, we sent 3,500 requests from Sydney to a Worker which does three round trips to an <a href="https://upstash.com/">Upstash</a> instance (free-tier) located in eu-central-1 (Frankfurt).</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/54nvh0FI2RcDh1gQ9ea4Tq/6a0ca6a7ea678cd965c79b25d7b90874/image3-7.png" />
            
            </figure><p>The results are clear! In this example, moving the Worker close to the backend <b>improved</b> <b>application performance by 4-8x</b>.</p>
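<p>A quick back-of-the-envelope model shows where an improvement like that comes from. The round-trip times below are assumptions for illustration, not the measured numbers from the benchmark above:</p>

```javascript
// Rough latency model of the Sydney-to-Frankfurt example. Both round-trip
// times are assumptions for this sketch: ~280ms user <-> backend across the
// globe, ~2ms when the Worker runs next to the database.
const rttUserToBackend = 280; // ms, Sydney <-> Frankfurt (assumed)
const rttLocal = 2;           // ms, Worker <-> co-located database (assumed)
const roundTrips = 3;         // database round trips per request

// Worker invoked near the user: every database round trip crosses the globe.
const nearUser = roundTrips * rttUserToBackend; // 3 * 280 = 840ms

// Worker invoked near the database: one long hop for the user's request,
// then fast local round trips to the database.
const nearBackend = rttUserToBackend + roundTrips * rttLocal; // 280 + 6 = 286ms

console.log(nearUser, nearBackend); // 840 286
```

<p>Under these assumed numbers the request is roughly 3x faster; the real speedup depends on actual network distances and how many round trips your code makes.</p>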
    <div>
      <h2>Network decisions shouldn’t be human decisions</h2>
      <a href="#network-decisions-shouldnt-be-human-decisions">
        
      </a>
    </div>
    <p>As a developer, you should focus on what you do best – building applications – without needing to worry about the network decisions that will make your application faster.</p><p>Cloudflare has a unique vantage point: our network gathers intelligence around the optimal paths between users, Cloudflare data centers and back-end servers – we have lots of experience in this area with <a href="/argo/">Argo Smart Routing</a>. Smart Placement takes these factors into consideration to automatically place your Worker in the best spot to minimize overall request duration.</p><p>So, how does Smart Placement work?</p><p>Smart Placement can be enabled on a per-Worker basis under the “Settings” tab or in your wrangler.toml file:</p>
            <pre><code>[placement]
mode = "smart"</code></pre>
            <p>Once you enable Smart Placement on your Worker or Pages Function, the Smart Placement algorithm analyzes fetch requests (also known as subrequests) that your Worker is making in real time. It then compares these to latency data aggregated by our network. If we detect that on average your Worker makes more than one subrequest to a back-end resource, then your Worker will automatically get invoked from the optimal data center!</p><p>There are some back-end services that, for good reason, are not considered by the Smart Placement algorithm:</p><ul><li><p>Globally distributed services: If the services that your Worker communicates with are geo-distributed in many regions, Smart Placement isn’t a good fit. We automatically rule these out of the Smart Placement optimization.</p></li><li><p>Analytics or logging services: Requests to analytics or logging services don’t need to be in the critical path of your application. <a href="https://developers.cloudflare.com/workers/runtime-apis/fetch-event/?ref=blog.cloudflare.com#waituntil"><code>waitUntil()</code></a> should be used so that the response back to users isn’t blocked when instrumenting your code. Since <code>waitUntil()</code> doesn’t impact the request duration from a user’s perspective, we automatically rule analytics/logging services out of the Smart Placement optimization.</p></li></ul><p>Refer to our <a href="https://developers.cloudflare.com/workers/platform/smart-placement/#supported-backends">documentation</a> for a list of services not considered by the Smart Placement algorithm.</p><p>Once Smart Placement kicks in, you’ll be able to see a new “Request Duration” tab on your Worker. We route 1% of requests without Smart Placement enabled so that you can see its impact on request duration.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2IicbhnuU143f8T6EJc9s4/2f24b79cc4b019123b1e7ed2df4cd02a/download-9.png" />
            
            </figure><p>And yes, it is really that easy!</p><p>Try out Smart Placement by checking out our <a href="https://smart-placement-demo.pages.dev/">demo</a> (it’s a lot of fun to play with!). To learn more, visit our <a href="https://developers.cloudflare.com/workers/platform/smart-placement/">developer documentation</a>.</p>
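<p>As a sketch of the <code>waitUntil()</code> pattern described above: the logging subrequest is dispatched after the response has been returned, so it never adds to the request duration the user sees. The logging URL below is a placeholder for this example:</p>

```javascript
// Minimal sketch: keep a logging call out of the critical path with
// ctx.waitUntil(). The logging endpoint URL is a placeholder.
const worker = {
  async fetch(request, env, ctx) {
    const response = new Response("Hello!");
    // Fire-and-forget: the runtime keeps the Worker alive until this promise
    // settles, but the user already has their response, so the subrequest
    // doesn't block it (and is ignored by the Smart Placement algorithm).
    ctx.waitUntil(
      fetch("https://logs.example.com/ingest", {
        method: "POST",
        body: JSON.stringify({ url: request.url, at: Date.now() }),
      }).catch(() => {}) // a failed log write should never affect the user
    );
    return response;
  },
};

export default worker;
```
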
    <div>
      <h2>What’s next for Smart Placement?</h2>
      <a href="#whats-next-for-smart-placement">
        
      </a>
    </div>
<p>We’re only getting started! We have lots of ideas on how we can improve Smart Placement:</p><ul><li><p>Support for calculating the optimal location when the application uses multiple back-ends</p></li><li><p>Fine-tuned placement (e.g. if your Worker uses multiple back-ends depending on the path, we’d calculate the optimal placement per path instead of per Worker)</p></li><li><p>Support for TCP-based connections</p></li></ul><p>We would like to hear from you! If you have feedback or feature requests, reach out through the <a href="https://discord.com/invite/cloudflaredev">Cloudflare Developer Discord</a>.</p>
    <div>
      <h3>Watch on Cloudflare TV</h3>
      <a href="#watch-on-cloudflare-tv">
        
      </a>
    </div>
    <div></div> ]]></content:encoded>
            <category><![CDATA[Developer Week]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Serverless]]></category>
            <category><![CDATA[Database]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <guid isPermaLink="false">7AkCd01DEJfXbjEGoK8Uzq</guid>
            <dc:creator>Michael Hart</dc:creator>
            <dc:creator>Serena Shah-Simpson</dc:creator>
            <dc:creator>Tanushree Sharma</dc:creator>
        </item>
        <item>
            <title><![CDATA[Announcing connect() — a new API for creating TCP sockets from Cloudflare Workers]]></title>
            <link>https://blog.cloudflare.com/workers-tcp-socket-api-connect-databases/</link>
            <pubDate>Tue, 16 May 2023 13:00:13 GMT</pubDate>
            <description><![CDATA[ Today, we are excited to announce a new API in Cloudflare Workers for creating outbound TCP sockets, making it possible to connect directly to databases and any TCP-based service from Workers ]]></description>
            <content:encoded><![CDATA[ 
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1CjlPkdLJUXlfgIKgq2Jvy/d2e17e3027c02f82e191007561640f79/image2-12.png" />
            
            </figure><p>Today, we are excited to announce a new API in Cloudflare Workers for creating outbound TCP sockets, making it possible to connect directly to any TCP-based service from Workers.</p><p>Standard protocols including <a href="https://www.cloudflare.com/learning/access-management/what-is-ssh/">SSH</a>, MQTT, SMTP, FTP, and IRC are all built on top of TCP. Most importantly, nearly all applications need to connect to databases, and most databases speak TCP. And while <a href="https://developers.cloudflare.com/d1/">Cloudflare D1</a> works seamlessly on Workers, and some <a href="https://developers.cloudflare.com/workers/learning/integrations/databases/">hosted database providers</a> allow connections over HTTP or WebSockets, the vast majority of databases, both relational (SQL) and document-oriented (NoSQL), require clients to connect by opening a direct TCP “socket”, an ongoing two-way connection that is used to send queries and receive data. Now, Workers provides an API for this, the first of many steps to come in allowing you to use any database or infrastructure you choose when building full-stack applications on Workers.</p><p>Database drivers, the client code used to connect to databases and execute queries, are already using this new API. <a href="https://github.com/brianc/node-postgres">pg</a>, the most widely used JavaScript database driver for PostgreSQL, works on Cloudflare Workers today, with more database drivers to come.</p><p>The TCP Socket API is available today to everyone. Get started by reading the <a href="https://developers.cloudflare.com/workers/runtime-apis/tcp-sockets">TCP Socket API docs</a>, or connect directly to any PostgreSQL database from your Worker by following <a href="https://developers.cloudflare.com/workers/databases/connect-to-postgres/">this guide</a>.</p>
    <div>
      <h2>First — what is a TCP Socket?</h2>
      <a href="#first-what-is-a-tcp-socket">
        
      </a>
    </div>
    <p><a href="https://www.cloudflare.com/learning/ddos/glossary/tcp-ip/">TCP (Transmission Control Protocol)</a> is a foundational networking protocol of the Internet. It is the underlying protocol that is used to make HTTP requests (prior to <a href="https://www.cloudflare.com/learning/performance/what-is-http3/">HTTP/3</a>, which uses <a href="https://cloudflare-quic.com/">QUIC</a>), to send email over <a href="https://www.cloudflare.com/learning/email-security/what-is-smtp/">SMTP</a>, to query databases using database-specific protocols like MySQL, and for many other application-layer protocols.</p><p>A TCP socket is a programming interface that represents a two-way communication connection between two applications that have both agreed to “speak” over TCP. One application (ex: a Cloudflare Worker) initiates an outbound TCP connection to another (ex: a database server) that is listening for inbound TCP connections. Connections are established by negotiating a three-way handshake, and after the handshake is complete, data can be sent bi-directionally.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6xxArl43DbexJUoRmw8JrG/0ad545bb25f002a4598d387aca491997/image1-30.png" />
            
            </figure><p>A socket is the programming interface for a single TCP connection — it has both a readable and writable “stream” of data, allowing applications to read and write data on an ongoing basis, as long as the connection remains open.</p>
    <div>
      <h2>connect() — A simpler socket API</h2>
      <a href="#connect-a-simpler-socket-api">
        
      </a>
    </div>
    <p>With Workers, we aim to support standard APIs that are supported across browsers and non-browser environments wherever possible, so that as many NPM packages as possible work on Workers without changes, and package authors don’t have to write runtime-specific code. But for TCP sockets, we faced a challenge — there was no clear shared standard across runtimes. Node.js provides the <a href="https://nodejs.org/api/net.html">net</a> and <a href="https://nodejs.org/api/tls.html">tls</a> APIs, but Deno implements a different API — <a href="https://deno.land/api@v1.33.1?s=Deno.connect">Deno.connect</a>. And web browsers do not provide a raw TCP socket API, though a <a href="https://github.com/WICG/direct-sockets/blob/main/docs/explainer.md">WICG proposal</a> does exist, and it is different from both Node.js and Deno.</p><p>We also considered how a TCP socket API could be designed to maximize performance and ergonomics in a serverless environment. Most networking APIs were designed well before serverless emerged, with the assumption that the developer’s application is also the server, responsible for directly configuring TLS options and credentials.</p><p>With this backdrop, we reached out to the community, with a focus on maintainers of database drivers, ORMs and other libraries that create outbound TCP connections. Using this feedback, we’ve tried to incorporate the best elements of existing APIs and proposals, and intend to contribute back to future standards, as part of the <a href="/introducing-the-wintercg/">Web-interoperable Runtimes Community Group (WinterCG)</a>.</p><p>The API we landed on is a simple function, <code>connect()</code>, imported from the new <code>cloudflare:sockets</code> module, that returns an instance of a Socket. Here’s a simple example showing it used to connect to a <a href="https://www.w3.org/People/Bos/PROSA/rep-protocols.html#gopher">Gopher</a> server. Gopher was one of the Internet’s early protocols that relied on TCP/IP, and still works today:</p>
            <pre><code>import { connect } from 'cloudflare:sockets';

export default {
  async fetch(req: Request) {
    const gopherAddr = "gopher.floodgap.com:70";
    const url = new URL(req.url);

    try {
      const socket = connect(gopherAddr);

      const writer = socket.writable.getWriter()
      const encoder = new TextEncoder();
      const encoded = encoder.encode(url.pathname + "\r\n");
      await writer.write(encoded);

      return new Response(socket.readable, { headers: { "Content-Type": "text/plain" } });
    } catch (error) {
      return new Response("Socket connection failed: " + error, { status: 500 });
    }
  }
};</code></pre>
            <p>We think this API design has many benefits that can be realized not just on Cloudflare, but in any serverless environment that adopts this design:</p>
            <pre><code>connect(address: SocketAddress | string, options?: SocketOptions): Socket

declare interface Socket {
  get readable(): ReadableStream;
  get writable(): WritableStream;
  get closed(): Promise&lt;void&gt;;
  close(): Promise&lt;void&gt;;
  startTls(): Socket;
}

declare interface SocketOptions {
  secureTransport?: string;
  allowHalfOpen: boolean;
}

declare interface SocketAddress {
  hostname: string;
  port: number;
}</code></pre>
            
    <div>
      <h3>Opportunistic TLS (StartTLS), without separate APIs</h3>
      <a href="#opportunistic-tls-starttls-without-separate-apis">
        
      </a>
    </div>
    <p>Opportunistic TLS, a pattern of creating an initial insecure connection, and then upgrading it to a secure one that uses TLS, remains common, particularly with database drivers. In Node.js, you must use the <a href="https://nodejs.org/api/net.html#class-netsocket">net</a> API to create the initial connection, and then use the <a href="https://nodejs.org/api/tls.html">tls</a> API to create a new, upgraded connection. In Deno, you pass the original socket to <a href="https://deno.land/api@v1.33.1?s=Deno.startTls">Deno.startTls()</a>, which creates a new, upgraded connection.</p><p>Drawing on a <a href="https://www.w3.org/TR/tcp-udp-sockets/#idl-def-TCPOptions">previous W3C proposal</a> for a TCP Socket API, we’ve simplified this by providing one API, that allows TLS to be enabled, allowed, or used when creating a socket, and exposes a simple method, startTls(), for upgrading a socket to use TLS.</p>
            <pre><code>// Create a new socket without TLS. secureTransport defaults to "off" if not specified.
const socket = connect("address:port", { secureTransport: "off" })

// Create a new socket, then upgrade it to use TLS.
// Once startTls() is called, only the newly created socket can be used.
const socket = connect("address:port", { secureTransport: "starttls" })
const secureSocket = socket.startTls();

// Create a new socket with TLS
const socket = connect("address:port", { secureTransport: "use" })</code></pre>
            
    <div>
      <h3>TLS configuration — a concern of host infrastructure, not application code</h3>
      <a href="#tls-configuration-a-concern-of-host-infrastructure-not-application-code">
        
      </a>
    </div>
    <p>Existing APIs for creating TCP sockets treat TLS as a library that you interact with in your application code. The <a href="https://nodejs.org/api/tls.html#tlscreatesecurecontextoptions">tls.createSecureContext()</a> API from Node.js has a plethora of advanced configuration options that are mostly environment specific. If you use custom certificates when connecting to a particular service, you likely use a different set of credentials and options in production, staging and development. Managing direct file paths to credentials across environments and swapping out .env files in production build steps are common pain points.</p><p>Host infrastructure is best positioned to manage this on your behalf, and similar to Workers support for <a href="/mtls-workers/">making subrequests using mTLS</a>, TLS configuration and credentials for the socket API will be managed via Wrangler, and a connect() function provided via a <a href="https://developers.cloudflare.com/workers/platform/bindings/">capability binding</a>. Currently, custom TLS credentials and configuration are not supported, but are coming soon.</p>
    <div>
      <h3>Start writing data immediately, before the TLS handshake finishes</h3>
      <a href="#start-writing-data-immediately-before-the-tls-handshake-finishes">
        
      </a>
    </div>
    <p>Because the connect() API synchronously returns a new socket, you can start writing to the socket immediately, without first waiting for the TCP handshake to complete. This means that once the handshake completes, data is already available to send immediately, and host platforms can make use of pipelining to optimize performance.</p>
    <div>
      <h2>connect() API + DB drivers = Connect directly to databases</h2>
      <a href="#connect-api-db-drivers-connect-directly-to-databases">
        
      </a>
    </div>
    <p>Many <a href="https://www.cloudflare.com/developer-platform/products/d1/">serverless databases</a> already work on Workers, allowing clients to connect over HTTP or over <a href="/neon-postgres-database-from-workers/">WebSockets</a>. But most databases don’t “speak” HTTP, including databases hosted on most cloud providers.</p><p>Databases each have their own “wire protocol”, and open-source database “drivers” that speak this protocol, sending and receiving data over a TCP socket. Developers rely on these drivers in their own code, as do database ORMs. Our goal is to make sure that you can use the same drivers and ORMs you might use in other runtimes and on other platforms on Workers.</p>
    <div>
      <h2>Try it now — connect to PostgreSQL from Workers</h2>
      <a href="#try-it-now-connect-to-postgresql-from-workers">
        
      </a>
    </div>
    <p>We’ve worked with the maintainers of <a href="https://www.npmjs.com/package/pg">pg</a>, one of the most popular database drivers in the JavaScript ecosystem, used by ORMs including <a href="https://sequelize.org/docs/v6/getting-started/">Sequelize</a> and <a href="https://knexjs.org/">knex.js</a>, to add support for connect().</p><p>You can try this right now. First, create a new Worker and install pg:</p>
            <pre><code>wrangler init
npm install --save pg</code></pre>
            <p>As of this writing, you’ll need to <a href="https://developers.cloudflare.com/workers/wrangler/configuration/#add-polyfills-using-wrangler">enable the node_compat</a> option in wrangler.toml:</p><p><b>wrangler.toml</b></p>
            <pre><code>name = "my-worker"
main = "src/index.ts"
compatibility_date = "2023-05-15"
node_compat = true</code></pre>
            <p>In just 20 lines of TypeScript, you can create a connection to a Postgres database, execute a query, return results in the response, and close the connection:</p><p><b>index.ts</b></p>
            <pre><code>import { Client } from "pg";

export interface Env {
  DB: string;
}

export default {
  async fetch(
    request: Request,
    env: Env,
    ctx: ExecutionContext
  ): Promise&lt;Response&gt; {
    const client = new Client(env.DB);
    await client.connect();
    const result = await client.query({
      text: "SELECT * from customers",
    });
    console.log(JSON.stringify(result.rows));
    const resp = Response.json(result.rows);
    // Close the database connection, but don't block returning the response
    ctx.waitUntil(client.end());
    return resp;
  },
};</code></pre>
            <p>To test this in local development, use the <code>--experimental-local</code> flag (instead of <code>--local</code>), which <a href="/miniflare-and-workerd/">uses the open-source Workers runtime</a>, ensuring that what you see locally mirrors behavior in production:</p>
            <pre><code>wrangler dev --experimental-local</code></pre>
            
    <div>
      <h2>What’s next for connecting to databases from Workers?</h2>
      <a href="#whats-next-for-connecting-to-databases-from-workers">
        
      </a>
    </div>
    <p>This is only the beginning. We’re aiming for the two popular MySQL drivers, <a href="https://github.com/mysqljs/mysql">mysql</a> and <a href="https://github.com/sidorares/node-mysql2">mysql2</a>, to work on Workers soon, with more to follow. If you work on a database driver or ORM, we’d love to help make your library work on Workers.</p><p>If you’ve worked more closely with database scaling and performance, you might have noticed that in the example above, a new connection is created for every request. This is one of the biggest current challenges of connecting to databases from serverless functions, across all platforms. With typical client connection pooling, you maintain a local pool of database connections that remain open. This approach of storing a reference to a connection or connection pool in global scope will not work, and is a poor fit for serverless. Managing individual pools of client connections on a per-isolate basis creates other headaches — when and how should connections be terminated? How can you limit the total number of concurrent connections across many isolates and locations?</p><p>Instead, we’re already working on simpler approaches to connection pooling for the most popular databases. We see a path to a future where you don’t have to think about or manage client connection pooling on your own. We’re also working on a brand new approach to making your database reads lightning fast.</p>
    <div>
      <h2>What’s next for sockets on Workers?</h2>
      <a href="#whats-next-for-sockets-on-workers">
        
      </a>
    </div>
    <p>Supporting outbound TCP connections is only one half of the story — we plan to support inbound TCP and UDP connections, as well as new emerging application protocols based on QUIC, so that you can build applications beyond HTTP with <a href="/introducing-socket-workers/">Socket Workers</a>.</p><p>Earlier today we also announced <a href="/announcing-workers-smart-placement">Smart Placement</a>, which improves performance by running any Worker that makes multiple HTTP requests to an origin as close to that origin as possible, reducing round-trip time. We’re working on making this work with Workers that open TCP connections, so that if your Worker connects to a database in Virginia and makes many queries over a TCP connection, each query is lightning fast and comes from the nearest location on <a href="https://www.cloudflare.com/network/">Cloudflare’s global network</a>.</p><p>We also plan to support custom certificates and other TLS configuration options in the coming months — tell us what is a must-have in order to connect to the services you need to connect to from Workers.</p>
    <div>
      <h2>Get started, and share your feedback</h2>
      <a href="#get-started-and-share-your-feedback">
        
      </a>
    </div>
    <p>The TCP Socket API is available today to everyone. Get started by reading the <a href="https://developers.cloudflare.com/workers/runtime-apis/tcp-sockets">TCP Socket API docs</a>, or connect directly to any PostgreSQL database from your Worker by following <a href="https://developers.cloudflare.com/workers/databases/connect-to-postgres/">this guide</a>.</p><p>We want to hear your feedback, what you’d like to see next, and more about what you’re building. Join the <a href="https://discord.cloudflare.com/">Cloudflare Developers Discord</a>.</p>
    <div>
      <h3>Watch on Cloudflare TV</h3>
      <a href="#watch-on-cloudflare-tv">
        
      </a>
    </div>
    <div></div><p></p> ]]></content:encoded>
            <category><![CDATA[Developer Week]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[TCP]]></category>
            <category><![CDATA[Database]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <guid isPermaLink="false">14RexUSLCzOVnpWl5DkGIq</guid>
            <dc:creator>Brendan Irvine-Broque</dc:creator>
            <dc:creator>Matt Silverlock</dc:creator>
        </item>
        <item>
            <title><![CDATA[UPDATE Supercloud SET status = 'open alpha' WHERE product = 'D1';]]></title>
            <link>https://blog.cloudflare.com/d1-open-alpha/</link>
            <pubDate>Wed, 16 Nov 2022 14:01:00 GMT</pubDate>
            <description><![CDATA[ As we continue down the road to making D1 production ready, it wouldn’t be “the Cloudflare way” unless we stopped for feedback first. D1 is now in Open Alpha! ]]></description>
            <content:encoded><![CDATA[ 
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7w9UvQOVgrNbxPrz1tOWJz/611cdc1253d0c6971709f5dddacc0811/image1-48.png" />
            
            </figure><p>In May 2022, we <a href="/introducing-d1/">announced</a> our quest to simplify databases – building them, maintaining them, integrating them. Our goal is to empower you with the tools to run a database that is <a href="https://www.cloudflare.com/developer-platform/products/d1/">powerful, scalable, with world-beating performance</a> without any hassle. And we first set our sights on reimagining the database development experience for every type of user – not just database experts.</p><p>Over the past couple of months, we’ve <a href="/whats-new-with-d1/">been working</a> to create just that, while learning some very important lessons along the way. As it turns out, building a global relational database product on top of Workers pushes the boundaries of the developer platform to their absolute limit, and often beyond them, but in a way that’s absolutely thrilling to us at Cloudflare. It means that while our progress might seem slow from outside, every improvement, bug fix or stress test helps lay down a path for <i>all</i> of our customers to build the world’s most <a href="/welcome-to-the-supercloud-and-developer-week-2022/">ambitious serverless application</a>.</p><p>However, as we continue down the road to making D1 production ready, it wouldn’t be “the Cloudflare way” unless we stopped for feedback first – even though it’s not <i>quite</i> finished yet. In the spirit of Developer Week, <b>there is no better time to introduce the D1 open alpha</b>!</p><p>An “open alpha” is a new concept for us. You'll likely hear the term “open beta” on various announcements at Cloudflare, and while it makes sense for many products here, it wasn’t quite right for D1. 
There are still some crucial pieces in active development and testing, so before we release the fully-formed D1 as a public beta for you to start building real-world apps with, we want to make sure everybody can start to get a feel for the product on their hobby apps or side-projects.</p>
    <div>
      <h2>What’s included in the alpha?</h2>
      <a href="#whats-included-in-the-alpha">
        
      </a>
    </div>
    <p>While a lot is still changing behind the scenes with D1, we’ve put a lot of thought into how you, as a developer, interact with it – even if you’re new to databases.</p>
    <div>
      <h3>Using the D1 dashboard</h3>
      <a href="#using-the-d1-dashboard">
        
      </a>
    </div>
    <p>In a few clicks you can get your D1 database up and running right from within your dashboard. In our D1 interface, you can create, maintain and view your database as you please. Changes made in the UI are instantly available to your Worker - no redeploy required!</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6vOzmnP9cvUYbJanSvprvl/b4a01d4edcc3dcada5a326e352b5f0e2/image2-30.png" />
            
            </figure>
    <div>
      <h3>Use Wrangler</h3>
      <a href="#use-wrangler">
        
      </a>
    </div>
    <p>If you’re looking to get your hands a little dirty, you can also work with your database using our Wrangler CLI. Create your database and begin adding your data manually, or bootstrap your database in one of two ways:</p><p><b>1.  Execute an SQL file</b></p>
            <pre><code>$ wrangler d1 execute my-database-name --file ./customers.sql</code></pre>
            <p>where your <code>.sql</code> file looks something like this:</p><p><b>customers.sql</b></p>
            <pre><code>DROP TABLE IF EXISTS Customers;
CREATE TABLE Customers (CustomerID INT, CompanyName TEXT, ContactName TEXT, PRIMARY KEY (`CustomerID`));
INSERT INTO Customers (CustomerID, CompanyName, ContactName) 
VALUES (1, 'Alfreds Futterkiste', 'Maria Anders'),(4, 'Around the Horn', 'Thomas Hardy'),(11, 'Bs Beverages', 'Victoria Ashworth'),(13, 'Bs Beverages', 'Random Name');</code></pre>
            <p><b>2. Create and run migrations</b></p><p>Migrations are a way to version your database changes. With D1, you can <a href="https://developers.cloudflare.com/d1/migrations/">create a migration</a> and then apply it to your database.</p><p>To create the migration, execute:</p>
            <pre><code>wrangler d1 migrations create &lt;my-database-name&gt; &lt;short description of migration&gt;</code></pre>
            <p>This will create an SQL file in a <code>migrations</code> folder where you can then go ahead and add your queries. Then apply the migrations to your database by executing:</p>
            <pre><code>wrangler d1 migrations apply &lt;my-database-name&gt;</code></pre>
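<p>The generated migration file is plain SQL. A minimal sketch of what a first migration might contain (the file name, table, and columns are illustrative, not generated output):</p>

```sql
-- 0001_create_customers.sql (illustrative name; Wrangler numbers the file for you)
CREATE TABLE IF NOT EXISTS Customers (
    CustomerID INT PRIMARY KEY,
    CompanyName TEXT,
    ContactName TEXT
);
```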
            
    <div>
      <h3>Access D1 from within your Worker</h3>
      <a href="#access-d1-from-within-your-worker">
        
      </a>
    </div>
    <p>You can attach your D1 to a Worker by adding the D1 binding to your <code>wrangler.toml</code> configuration file. Then interact with D1 by executing queries inside your Worker like so:</p>
            <pre><code>export default {
 async fetch(request, env) {
   const { pathname } = new URL(request.url);

   if (pathname === "/api/beverages") {
     const { results } = await env.DB.prepare(
       "SELECT * FROM Customers WHERE CompanyName = ?"
     )
       .bind("Bs Beverages")
       .all();
     return Response.json(results);
   }

   return new Response("Call /api/beverages to see Bs Beverages customers");
 },
};</code></pre>
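<p>The <code>env.DB</code> binding used above comes from your <code>wrangler.toml</code>. A minimal sketch, with placeholder names and IDs:</p>

```toml
name = "my-worker"
main = "src/index.js"
compatibility_date = "2022-11-16"

[[d1_databases]]
binding = "DB"                      # exposed as env.DB in your Worker
database_name = "my-database-name"
database_id = "<your-database-id>"
```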
            
    <div>
      <h3>Or access D1 from within your Pages Function</h3>
      <a href="#or-access-d1-from-within-your-pages-function">
        
      </a>
    </div>
    <p>In this Alpha launch, D1 also supports integration with <a href="https://pages.cloudflare.com/">Cloudflare Pages</a>! You can add a D1 binding inside the Pages dashboard, and write your queries inside a Pages Function to build a full-stack application! Check out the <a href="https://developers.cloudflare.com/pages/platform/functions/bindings/#d1-database">full documentation</a> to get started with Pages and D1.</p>
    <div>
      <h2>Community built tooling</h2>
      <a href="#community-built-tooling">
        
      </a>
    </div>
    <p>During our private alpha period, the excitement behind D1 led to some valuable contributions to the D1 ecosystem and developer experience by members of the community. Here are some of our favorite projects to date:</p>
    <div>
      <h3>d1-orm</h3>
      <a href="#d1-orm">
        
      </a>
    </div>
    <p>An Object Relational Mapping (ORM) is a way for you to query and manipulate data by using JavaScript. Created by a Cloudflare Discord Community Champion, the <code>d1-orm</code> seeks to provide a strictly typed experience while using D1:</p>
            <pre><code>const users = new Model(
    // table name, primary keys, indexes etc
    tableDefinition,
    // column types, default values, nullable etc
    columnDefinitions
)

// TS helper for typed queries
type User = Infer&lt;typeof users&gt;;

// ORM-style query builder
const user = await users.First({
    where: {
        id: 1,
    },
});</code></pre>
            <p>You can check out the <a href="https://docs.interactions.rest/d1-orm/">full documentation</a>, and provide feedback by making an issue on the <a href="https://github.com/Interactions-as-a-Service/d1-orm/issues">GitHub repository</a>.</p>
    <div>
      <h3>workers-qb</h3>
      <a href="#workers-qb">
        
      </a>
    </div>
    <p>This is a zero-dependency query builder that provides a simple standardized interface while keeping the benefits and speed of using raw queries over a traditional ORM. While not intended to provide ORM-like functionality, <code>workers-qb</code> makes it easier to interact with the database from code for direct SQL access:</p>
            <pre><code>const qb = new D1QB(env.DB)

const fetched = await qb.fetchOne({
  tableName: 'employees',
  fields: 'count(*) as count',
  where: {
    conditions: 'department = ?1',
    params: ['HQ'],
  },
})</code></pre>
            <p>You can read more about the query builder <a href="https://workers-qb.massadas.com/">here</a>.</p>
    <div>
      <h3>d1-console</h3>
      <a href="#d1-console">
        
      </a>
    </div>
    <p>Instead of running the <code>wrangler d1 execute</code> command in your terminal every time you want to interact with your database, you can interact with D1 from within the <code>d1-console</code>. Created by a Discord Community Champion, this gives the benefit of executing multi-line queries, obtaining command history, and viewing a cleanly formatted table output.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4QR9Tf5DXnp3brBVvlvgJq/7f5b5083198492190dfc9f24e4fb70e0/image3-23.png" />
            
            </figure><p>While this is a community project today, we plan to natively support a “D1 Console” in the future. For now, get started by checking out the <code>d1-console</code> package <a href="https://github.com/isaac-mcfadyen/d1-console">here</a>.</p>
    <div>
      <h3>D1 adapter for <a href="https://github.com/koskimas/kysely">Kysely</a></h3>
      <a href="#d1-adapter-for">
        
      </a>
    </div>
    <p>Kysely is a type-safe and autocompletion-friendly TypeScript SQL query builder. With this adapter you can interact with D1 using the familiar Kysely interface:</p>
            <pre><code>// Create Kysely instance with kysely-d1
const db = new Kysely&lt;Database&gt;({ 
  dialect: new D1Dialect({ database: env.DB })
});
    
// Read row from D1 table
const result = await db
  .selectFrom('kv')
  .selectAll()
  .where('key', '=', key)
  .executeTakeFirst();</code></pre>
            <p>Check out the project <a href="https://github.com/aidenwallis/kysely-d1">here</a>.</p>
    <div>
      <h2>What’s still in testing?</h2>
      <a href="#whats-still-in-testing">
        
      </a>
    </div>
    <p>The biggest pieces that have been disabled for this alpha release are replication and JavaScript transaction support. While we’ll be rolling out these changes gradually, we want to call out some limitations that exist today that we’re actively working on testing:</p><ul><li><p><b>Database location:</b> Each D1 database only runs a single instance. It’s created close to where you, as the developer, create the database, and does not currently move regions based on access patterns. Workers running elsewhere in the world will see higher latency as a result.</p></li><li><p><b>Concurrency limitations:</b> Under high load, read and write queries may be queued rather than triggering new replicas to be created. As a result, the performance &amp; throughput characteristics of the open alpha won’t be representative of the final product.</p></li><li><p><b>Availability limitations:</b> Backups will block access to the DB while they’re running. In most cases this should only be a second or two, and any requests that arrive during the backup will be queued.</p></li></ul><p>You can also check out a more detailed, up-to-date list on <a href="https://developers.cloudflare.com/d1/platform/limits/">D1 alpha Limitations</a>.</p>
    <div>
      <h2>Request for feedback</h2>
      <a href="#request-for-feedback">
        
      </a>
    </div>
    <p>While we can make all sorts of guesses and bets on the kind of databases you want to use D1 for, we are not the users – you are! We want developers from all backgrounds to preview the D1 tech at its early stages, and let us know where we need to improve to make it suitable for your production apps.</p><p>For general feedback about your experience and to interact with other folks in the alpha, join our <a href="https://discord.com/channels/595317990191398933/992060581832032316">#d1-open-alpha</a> channel in the <a href="https://discord.gg/cloudflaredev">Cloudflare Developers Discord</a>. We plan to make any important announcements and changes in this channel as well as on our <a href="https://discord.com/channels/595317990191398933/832698219824807956">monthly community calls</a>.</p><p>To file more specific feature requests (no matter how wacky) and report any bugs, create a thread in the <a href="https://community.cloudflare.com/c/developers/d1">Cloudflare Community forum</a> under the D1 category. We will be maintaining this forum as a way to plan for the months ahead!</p>
    <div>
      <h2>Get started</h2>
      <a href="#get-started">
        
      </a>
    </div>
    <p>Want to get started right away? Check out our <a href="https://developers.cloudflare.com/d1/">D1 documentation</a> to dive in today. <a href="https://github.com/cloudflare/d1-northwind">Build</a> our classic <a href="https://northwind.d1sql.com/">Northwind Traders demo</a> to explore the D1 experience and deploy your first D1 database!</p> ]]></content:encoded>
            <category><![CDATA[Developer Week]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Database]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Supercloud]]></category>
            <category><![CDATA[D1]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <guid isPermaLink="false">1rFO7pAwS1HGnsa6rhrIXa</guid>
            <dc:creator>Nevi Shah</dc:creator>
            <dc:creator>Glen Maddern</dc:creator>
            <dc:creator>Sven Sauleau</dc:creator>
        </item>
        <item>
            <title><![CDATA[D1: our quest to simplify databases]]></title>
            <link>https://blog.cloudflare.com/whats-new-with-d1/</link>
            <pubDate>Tue, 27 Sep 2022 13:00:00 GMT</pubDate>
            <description><![CDATA[ Get an inside look on the D1 experience today, what the team is currently working on and what’s coming up!  ]]></description>
            <content:encoded><![CDATA[ <p><i>This blog post references a feature which has updated documentation. For the latest reference content, visit </i><a href="https://developers.cloudflare.com/d1/best-practices/read-replication/"><i>D1 read replication documentation</i></a><i>.</i></p><p>When we announced D1 in May of this year, we knew it would be the start of something new: our first SQL database with Cloudflare Workers. Prior to D1, we announced storage options like KV (key-value store), Durable Objects (single-location, strongly consistent data storage) and <a href="https://www.cloudflare.com/learning/cloud/what-is-blob-storage/">R2 (blob storage)</a>. But the question always remained: “How can I store and query relational data with an easy API and without latency concerns?”</p><p>The long-awaited “Cloudflare Database” was the true missing piece to build your application <b>entirely</b> on Cloudflare’s global network, going from a blank canvas in VSCode to a full stack application in seconds. Compatible with the popular SQLite API, D1 empowers developers to build out their databases without getting bogged down by complexity and having to manage every underlying layer.</p><p>Since our launch announcement in May and private beta in June, we’ve made great strides in building out our vision of a <a href="https://www.cloudflare.com/developer-platform/products/d1/">serverless database</a>. With D1 still in <a href="https://www.cloudflare.com/lp/d1/">private beta</a> but an open beta on the horizon, we’re excited to show and tell our journey of building D1 and what’s to come.</p>
    <div>
      <h2>The D1 Experience</h2>
      <a href="#the-d1-experience">
        
      </a>
    </div>
    <p>We knew from Cloudflare Workers feedback that using Wrangler as the mechanism to create and deploy applications is loved and preferred by many. That’s why, when <a href="/10-things-i-love-about-wrangler/">Wrangler 2.0</a> was announced this past May alongside D1, we took advantage of the new and improved CLI for every part of the experience, from data creation to every update and iteration. Let’s take a quick look at how to get set up in a few easy steps.</p>
    <div>
      <h3>Create your database</h3>
      <a href="#create-your-database">
        
      </a>
    </div>
    <p>With the latest version of <a href="https://github.com/cloudflare/wrangler2">Wrangler</a> installed, you can create an initialized, empty database with a quick</p><p><code>npx wrangler d1 create my_database_name</code></p><p>to get your database up and running. Now it’s time to add your data.</p>
    <div>
      <h3>Bootstrap it</h3>
      <a href="#bootstrap-it">
        
      </a>
    </div>
    <p>It wouldn’t be the “Cloudflare way” if you had to sit through an agonizingly long process to get set up. So we made it easy and painless to bring your existing data from an old database and bootstrap your new D1 database. You can run</p><p><code>wrangler d1 execute my_database_name --file ./filename.sql</code></p><p>and pass through an existing SQLite .sql file of your choice. Your database is now ready for action.</p>
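<p>Since D1 is SQLite-compatible, you can sanity-check a seed file locally before importing it. Here is a quick sketch using Python’s built-in <code>sqlite3</code> module; the table and rows are hypothetical stand-ins for whatever your .sql file contains:</p>

```python
import sqlite3

# Hypothetical contents of a seed file like ./filename.sql. Because D1 is
# SQLite-compatible, statements a local SQLite accepts should also import.
SEED = """
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, active INTEGER);
INSERT INTO employees (name, active) VALUES ('Ada', 1), ('Grace', 0);
"""

# Run the whole script against a throwaway in-memory database.
db = sqlite3.connect(":memory:")
db.executescript(SEED)

# If this executes without error, the file is valid SQLite.
active_count = db.execute(
    "SELECT count(*) FROM employees WHERE active = 1"
).fetchone()[0]
```

<p>If <code>executescript</code> raises, you have a syntax problem to fix before running <code>wrangler d1 execute</code>.</p>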
    <div>
      <h3>Develop &amp; Test Locally</h3>
      <a href="#develop-test-locally">
        
      </a>
    </div>
    <p>With all the improvements we’ve made to Wrangler since version 2 launched <a href="/wrangler-v2-beta/">a few months ago</a>, we’re pleased to report that D1 has full remote &amp; local wrangler dev support:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7JRGM62yWrL3h7BKLhj5Jf/6d324ce4a2b19691ef4ec39095d2e43b/image2-43.png" />
            
            </figure><p>When running <code>wrangler dev --local --persist</code>, an SQLite file will be created inside <code>.wrangler/state</code>. You can then use a local GUI program for managing it, like SQLiteFlow (<a href="https://www.sqliteflow.com/">https://www.sqliteflow.com/</a>) or Beekeeper (<a href="https://www.beekeeperstudio.io/">https://www.beekeeperstudio.io/</a>).</p><p>Or you can simply use SQLite directly from the command line by running <code>sqlite3 .wrangler/state/d1/DB.sqlite3</code>:</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7wSVzsxnFxKpJDbF5pO4hs/89aab6231071b6cd8cc657a9fd2bd24b/image6-8.png" />
            
            </figure>
    <div>
      <h3>Automatic backups &amp; one-click restore</h3>
      <a href="#automatic-backups-one-click-restore">
        
      </a>
    </div>
    <p>No matter how much you test your changes, sometimes things don’t go according to plan. But with Wrangler you can create a backup of your data, view your list of backups, or restore your database from an existing backup. In fact, during the beta, we’re taking backups of your data every hour automatically and storing them in R2, so you will have the option to roll back if needed.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7BgC81NRBtxLJAl4Gf09oz/03084ab36894c484675f0ec7e58e9462/image1-53.png" />
            
            </figure><p>And the best part: if you want to use a production snapshot for local development or to reproduce a bug, simply copy it into the <code>.wrangler/state</code> directory and <code>wrangler dev --local --persist</code> will pick it up!</p><p>Let’s download a D1 backup to our local disk. It’s SQLite compatible.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4lGMre56aSosKozuHmITRD/51b282602897ed9af9d0813461f81732/image4-14.png" />
            
            </figure><p>Now let’s run our D1 worker locally, from the backup.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4t10c5k7VKcT2tF4CjW9Dw/eb5ba21817f6a38b1d1f450d6e2e2c3a/image5-16.png" />
            
            </figure>
    <div>
      <h3>Create and Manage from the dashboard</h3>
      <a href="#create-and-manage-from-the-dashboard">
        
      </a>
    </div>
    <p>However, we realize that CLIs are not everyone’s jam. In fact, we believe databases should be accessible to every kind of developer – even those without much database experience! D1 is available right from the Cloudflare dashboard giving you near total command parity with Wrangler in just a few clicks. Bootstrapping your database, creating tables, updating your database, viewing tables and triggering backups are all accessible right at your fingertips.</p>
            <figure>
            
            <img src="https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2mXkO7uRDs4lVvgjm8VC4r/c32f96a738980294dfea5db7f2ea8794/image3-32.png" />
            
            </figure><p>Changes made in the UI are instantly available to your Worker — no deploy required!</p><p>We’ve told you about some of the improvements we’ve landed since we first announced D1, but as always, we also wanted to give you a small taste (with some technical details) of what’s ahead. One really important functionality of a database is transactions — something D1 wouldn’t be complete without.</p>
    <div>
      <h2>Sneak peek: how we're bringing JavaScript transactions to D1</h2>
      <a href="#sneak-peek-how-were-bringing-javascript-transactions-to-d1">
        
      </a>
    </div>
    <p>With D1, we strive to present a dramatically simplified interface to creating and querying relational data, which for the most part is a good thing. But simplification occasionally introduces drawbacks, where a use-case is no longer easily supported without introducing some new concepts. D1 transactions are one example.</p>
    <div>
      <h3>Transactions are a unique challenge</h3>
      <a href="#transactions-are-a-unique-challenge">
        
      </a>
    </div>
    <p>You don't need to specify where a Cloudflare Worker or a D1 database runs: they simply run everywhere they need to. For Workers, that is as close as possible to the users that are hitting your site right this second. For D1 today, we don't try to run a copy in every location worldwide, but dynamically manage the number and location of read-only replicas based on how many queries your database is getting, and from where. However, queries that make changes to a database (which we generally call "writes" for short) all have to travel back to the single primary D1 instance to do their work, to ensure consistency.</p><p>But what if you need to do a series of updates at once? While you can send multiple SQL queries with <code>.batch()</code> (which does in fact use database transactions under the hood), it's likely that, at some point, you'll want to interleave database queries &amp; JS code in a single unit of work.</p><p>This is exactly what database transactions were invented for, but if you try running <code>BEGIN TRANSACTION</code> in D1 you'll get an error. Let's talk about why that is.</p><p><b>Why native transactions don't work</b></p><p>The problem arises from SQL statements and JavaScript code running in dramatically different places: your SQL executes inside your D1 database (primary for writes, nearest replica for reads), but your Worker is running near the user, which might be on the other side of the world. And because D1 is built on SQLite, only one write transaction can be open at once. Meaning that, if we permitted <code>BEGIN TRANSACTION</code>, any one Worker request, anywhere in the world, could effectively block your whole database! This is quite a dangerous thing to allow:</p><ul><li><p>A Worker could start a transaction then crash due to a software bug, without calling <code>ROLLBACK</code>. The primary would be blocked, waiting for more commands from a Worker that would never come (until, probably, some timeout).</p></li><li><p>Even without bugs or crashes, transactions that require multiple round-trips between JavaScript and SQL could end up blocking your whole system for multiple seconds, dramatically limiting how high an application built with Workers &amp; D1 could scale.</p></li></ul><p>But allowing a developer to define transactions that mix both SQL and JavaScript makes building applications with Workers &amp; D1 so much more flexible and powerful. We need a new solution (or, in our case, a new version of an old solution).</p><p><b>A way forward: stored procedures</b></p><p>Stored procedures are snippets of code that are uploaded to the database, to be executed directly next to the data. Which, at first blush, sounds exactly like what we want.</p><p>However, in practice, stored procedures in traditional databases are notoriously frustrating to work with, as anyone who's developed a system making heavy use of them will tell you:</p><ul><li><p>They're often written in a different language from the rest of your application: usually (a specific dialect of) SQL, or an embedded language like Tcl/Perl/Python. And while it's technically possible to write them in JavaScript (using an embedded V8 engine), they run in such a different environment from your application code that maintaining them still requires significant context-switching.</p></li><li><p>Having both application code and in-database code affects every part of the development lifecycle, from authoring and testing to deployment, rollbacks and debugging. But because stored procedures are usually introduced to solve a specific problem, not as a general-purpose application layer, they're often managed completely manually. You can end up with them being written once, added to the database, then never changed for fear of breaking something.</p></li></ul><p>With D1, we can do better.</p><p>The <i>point</i> of a stored procedure was to execute directly next to the data; uploading the code and executing it inside the database was simply a means to that end. Since we're already running Workers, a global JavaScript execution platform, can we use them to solve this problem?</p><p>It turns out, absolutely! But here we have a few options for exactly how to make it work, and we're working with our private beta users to find the right <a href="https://www.cloudflare.com/learning/security/api/what-is-an-api/">API</a>. In this section, I'd like to share with you our current leading proposal, and invite you all to give us your feedback.</p><p>When you connect a Worker project to a D1 database, you add a section like the following to your <code>wrangler.toml</code>:</p>
            <pre><code>[[ d1_databases ]]
# What binding name to use (e.g. env.DB):
binding = "DB"
# The name of the DB (used for wrangler d1 commands):
database_name = "my-d1-database"
# The D1's ID for deployment:
database_id = "48a4224e-...3b09"
# Which D1 to use for `wrangler dev`:
# (can be the same as the previous line)
preview_database_id = "48a4224e-...3b09"

# NEW: adding "procedures", pointing to a new JS file:
procedures = "./src/db/procedures.js"</code></pre>
            <p>That D1 Procedures file would contain the following (note the new <code>db.transaction()</code> API, which is only available within a file like this):</p>
            <pre><code>export default class Procedures {
  constructor(db, env, ctx) {
    this.db = db
  }

  // any methods you define here are available on env.DB.Procedures
  // inside your Worker
  async Checkout(cartId: number) {
    // Inside a Procedure, we have a new db.transaction() API
    const result = await this.db.transaction(async (txn) =&gt; {
      
      // Transaction has begun: we know the user can't add anything to
      // their cart while these actions are in progress.
      const [cart, user] = Helpers.loadCartAndUser(cartId)

      // We can update the DB first, knowing that if any of the later steps
      // fail, all these changes will be undone.
      await this.db
        .prepare(`UPDATE cart SET status = ?1 WHERE cart_id = ?2`)
        .bind('purchased', cartId)
        .run()
      const newBalance = user.balance - cart.total_cost
      await this.db
        .prepare(`UPDATE user SET balance = ?1 WHERE user_id = ?2`)
        // Note: the DB may have a CHECK to guarantee 'user.balance' can not
        // be negative. In that case, this statement may fail, an exception
        // will be thrown, and the transaction will be rolled back.
        .bind(newBalance, cart.user_id)
        .run()

      // Once all the DB changes have been applied, attempt the payment:
      const { ok, details } = await PaymentAPI.processPayment(
        user.payment_method_id,
        cart.total_cost
      )
      if (!ok) {
        // If we throw an Exception, the transaction will be rolled back
        // and result.error will be populated:
        // throw new PaymentFailedError(details)
        
        // Alternatively, we can do both of those steps explicitly
        await txn.rollback()
        // The transaction is rolled back, our DB is now as it was when we
        // started. We can either move on and try something new, or just exit.
        return { error: new PaymentFailedError(details) }
      }

      // This is implicitly called when the .transaction() block finishes,
      // but you can explicitly call it too (potentially committing multiple
      // times in a single db.transaction() block).
      await txn.commit()

      // Anything we return here will be returned by the 
      // db.transaction() block
      return {
        amount_charged: cart.total_cost,
        remaining_balance: newBalance,
      }
    })

    if (result.error) {
      // Our db.transaction block returned an error or threw an exception.
    }

    // We're still in the Procedure, but the Transaction is complete and
    // the DB is available for other writes. We can either do more work
    // here (start another transaction?) or return a response to our Worker.
    return result
  }
}</code></pre>
            <p>And in your Worker, your DB binding now has a “Procedures” property with your function names available:</p>
            <pre><code>const { error, amount_charged, remaining_balance } =
  await env.DB.Procedures.Checkout(params.cartId)

if (error) {
  // Something went wrong, `error` has details
} else {
  // Display `amount_charged` and `remaining_balance` to the user.
}</code></pre>
            <p>Multiple Procedures can be triggered at one time, but only one <code>db.transaction()</code> function can be active at once: any other write queries or transaction blocks will be queued, while all read queries continue to hit local replicas and run as normal. This API gives you the ability to ensure consistency when it’s essential, but with minimal impact on overall performance worldwide.</p>
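<p>The scheduling described above (one active writer, reads unaffected) can be illustrated with a toy model. This is a sketch of the general single-writer idea, not D1's actual implementation; since D1 is SQLite-based, Python makes for a compact stand-in:</p>

```python
import threading

# Toy model of single-writer scheduling (not D1's real scheduler):
# one write lock serializes writers, so concurrent writers effectively
# queue, while readers never take the lock and proceed immediately.
class SingleWriterDB:
    def __init__(self):
        self._write_lock = threading.Lock()
        self.balance = 100

    def write(self, delta):
        # Only one writer holds the lock at a time; the read-modify-write
        # below is therefore never interleaved with another writer's.
        with self._write_lock:
            current = self.balance
            self.balance = current + delta

    def read(self):
        # Reads don't touch the write lock, mirroring replica reads.
        return self.balance

db = SingleWriterDB()
writers = [threading.Thread(target=db.write, args=(-1,)) for _ in range(30)]
for t in writers:
    t.start()
for t in writers:
    t.join()
# All 30 decrements apply exactly once, despite running concurrently.
```

<p>Without the lock, two writers could read the same <code>balance</code> and lose an update; serializing writes is what makes the transaction block's guarantees possible.</p>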
    <div>
      <h3>Request for feedback</h3>
      <a href="#request-for-feedback">
        
      </a>
    </div>
    <p>As with all our products, feedback from our users drives the roadmap and development. While the D1 API is in beta testing today, we're still seeking feedback on the specifics. However, we’re pleased that it solves both the problems with transactions that are specific to D1 and the problems with stored procedures described earlier:</p><ul><li><p>Code is executing as close as possible to the database, removing network latency while a transaction is open.</p></li><li><p>Any exceptions or cancellations of a transaction cause an instant rollback—there is no way to accidentally leave one open and block the whole D1 instance.</p></li><li><p>The code is in the same language as the rest of your Worker code, in the exact same dialect (e.g. same TypeScript config as it's part of the same build).</p></li><li><p>It's deployed seamlessly as part of your Worker. If two Workers bind to the same D1 instance but define different procedures, they'll only see their own code. If you want to share code between projects or databases, extract a library as you would with any other shared code.</p></li><li><p>In local development and test, the procedure works just like it does in production, but without the network call, allowing seamless testing and debugging as if it were a local function.</p></li><li><p>Because procedures and the Worker that defines them are treated as a single unit, rolling back to an earlier version never causes a skew between the code in the database and the code in the Worker.</p></li></ul>
    <div>
      <h2>The D1 ecosystem: contributions from the community</h2>
      <a href="#the-d1-ecosystem-contributions-from-the-community">
        
      </a>
    </div>
    <p>We've told you about what we've been up to and what's ahead, but one of the unique things about this project is all the contributions from our users. One of our favorite parts of private betas is not only getting feedback and feature requests, but also seeing what ideas and projects come to fruition. While sometimes this means personal projects, with D1, we’re seeing some incredible contributions to the D1 ecosystem. Needless to say, the work on D1 hasn’t just been coming from within the D1 team, but also from the wider community and other developers at Cloudflare. Users have been showing off their D1 additions within our Discord private beta channel and giving others the opportunity to use them as well. We wanted to take a moment to highlight them.</p>
    <div>
      <h3>workers-qb</h3>
      <a href="#workers-qb">
        
      </a>
    </div>
    <p>Dealing with raw SQL syntax is powerful (and using the D1 .bind() API, safe against <a href="https://www.cloudflare.com/learning/security/threats/how-to-prevent-sql-injection/">SQL injections</a>) but it can be a little clumsy. On the other hand, most existing query builders assume direct access to the underlying DB, and so aren’t suitable to use with D1. So Cloudflare developer Gabriel Massadas designed a small, zero-dependency query builder called <code>workers-qb</code>:</p>
            <pre><code>import { D1QB } from 'workers-qb'
const qb = new D1QB(env.DB)

const fetched = await qb.fetchOne({
    tableName: "employees",
    fields: "count(*) as count",
    where: {
      conditions: "active = ?1",
      params: [true]
    },
})</code></pre>
            <p>Check out the project homepage for more information: <a href="https://workers-qb.massadas.com/">https://workers-qb.massadas.com/</a>.</p>
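<p>The core idea behind a builder like this can be sketched in a few lines. The snippet below is an illustration of the pattern, not workers-qb's actual implementation (the <code>fetch_one</code> helper and its parameters are hypothetical):</p>

```python
# Toy sketch of the query-builder idea: keep the SQL text and the bound
# values separate, so user input is never interpolated into the SQL and
# the driver (e.g. D1's .bind() API) receives placeholders plus params.
def fetch_one(table_name, fields, where_conditions, params):
    sql = f"SELECT {fields} FROM {table_name} WHERE {where_conditions} LIMIT 1"
    return sql, params

sql, params = fetch_one(
    "employees", "count(*) as count", "active = ?1", [True]
)
# sql is plain text with ?1 placeholders; params travel alongside it.
```

<p>Only trusted, structural pieces (table and column names) are ever formatted into the string; every value stays in <code>params</code>, which is what keeps the result safe to pass to a parameterized driver.</p>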
    <div>
      <h3>D1 console</h3>
      <a href="#d1-console">
        
      </a>
    </div>
    <p>While you can interact with D1 through both Wrangler and the dashboard, Cloudflare Community champion Isaac McFadyen created the very first D1 console, where you can quickly execute a series of queries right from your terminal. With the D1 console, you don’t need to spend time writing the various Wrangler commands we’ve created – just execute your queries.</p><p>This includes all the bells and whistles you would expect from a modern database console, including multiline input, command history, validation for things D1 may not yet support, and the ability to save your Cloudflare credentials for later use.</p><p>Check out the full project on <a href="https://github.com/isaac-mcfadyen/d1-console">GitHub</a> or <a href="https://www.npmjs.com/package/d1-console">NPM</a> for more information.</p>
    <div>
      <h3>Miniflare test Integration</h3>
      <a href="#miniflare-test-integration">
        
      </a>
    </div>
    <p>The <a href="https://miniflare.dev/">Miniflare project</a>, which powers Wrangler’s local development experience, also provides fully-fledged test environments for popular JavaScript test runners, <a href="https://miniflare.dev/testing/jest">Jest</a> and <a href="https://miniflare.dev/testing/vitest">Vitest</a>. With this comes the concept of <a href="https://miniflare.dev/testing/jest#isolated-storage"><i>Isolated Storage</i></a>, allowing each test to run independently, so that changes made in one don’t affect the others. Brendan Coll, creator of Miniflare, guided the D1 test implementation to give the same benefits:</p>
            <pre><code>import Worker from '../src/index.ts'
const { DB } = getMiniflareBindings();

beforeAll(async () =&gt; {
  // Your D1 starts completely empty, so first you must create tables
  // or restore from a schema.sql file.
  await DB.exec(`CREATE TABLE entries (id INTEGER PRIMARY KEY, value TEXT)`);
});

// Each describe block &amp; each test gets its own view of the data.
describe('with an empty DB', () =&gt; {
  it('should report 0 entries', async () =&gt; {
    await Worker.fetch(...)
  })
  it('should allow new entries', async () =&gt; {
    await Worker.fetch(...)
  })
})

// Use beforeAll &amp; beforeEach inside describe blocks to set up
// particular DB states for a set of tests.
describe('with two entries in the DB', () =&gt; {
  beforeEach(async () =&gt; {
    await DB.prepare(`INSERT INTO entries (value) VALUES (?), (?)`)
            .bind('aaa', 'bbb')
            .run()
  })
  // Now, all tests will run with a DB with those two values
  it('should report 2 entries', async () =&gt; {
    await Worker.fetch(...)
  })
  it('should not allow duplicate entries', async () =&gt; {
    await Worker.fetch(...)
  })
})</code></pre>
            <p>All the databases for tests are run in-memory, so these are lightning fast. And fast, reliable testing is a big part of building maintainable real-world apps, so we’re thrilled to extend that to D1.</p>
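<p>The isolated-storage idea itself is simple to demonstrate. The sketch below models it with fresh in-memory SQLite databases (D1's underlying engine); it illustrates the concept, not Miniflare's implementation:</p>

```python
import sqlite3

# Sketch of isolated storage: each "test" gets its own fresh in-memory
# SQLite database seeded from the same schema, so writes in one test
# can never leak into another.
SCHEMA = "CREATE TABLE entries (id INTEGER PRIMARY KEY, value TEXT)"

def fresh_db():
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    return db

# "Test" 1 inserts a row into its own database...
db1 = fresh_db()
db1.execute("INSERT INTO entries (value) VALUES (?)", ("aaa",))
rows_in_test_1 = db1.execute("SELECT count(*) FROM entries").fetchone()[0]

# ...but "test" 2 starts from a completely clean slate.
db2 = fresh_db()
rows_in_test_2 = db2.execute("SELECT count(*) FROM entries").fetchone()[0]
```

<p>Because each database lives only in memory and only for one test, there is no cleanup step to forget, which is what makes the pattern both fast and reliable.</p>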
    <div>
      <h2>Want access to the private beta?</h2>
      <a href="#want-access-to-the-private-beta">
        
      </a>
    </div>
    <p>Feeling inspired?</p><p>We love to see what our beta users build or want to build, especially when our products are at an early stage. As we march toward an open beta, we’ll be looking specifically for your feedback. We are slowly letting more folks into the beta, but if you haven’t received your “golden ticket” with access yet, sign up <a href="https://www.cloudflare.com/lp/d1/">here</a>! Once you’ve been invited in, you’ll receive an official welcome email.</p><p>As always, happy building!</p> ]]></content:encoded>
            <category><![CDATA[Birthday Week]]></category>
            <category><![CDATA[Serverless]]></category>
            <category><![CDATA[Database]]></category>
            <category><![CDATA[D1]]></category>
            <category><![CDATA[Cloudflare Workers]]></category>
            <category><![CDATA[Product News]]></category>
            <category><![CDATA[Developers]]></category>
            <category><![CDATA[Developer Platform]]></category>
            <guid isPermaLink="false">nODp0eoC5szCr7aW59sde</guid>
            <dc:creator>Nevi Shah</dc:creator>
            <dc:creator>Glen Maddern</dc:creator>
        </item>
    </channel>
</rss>