Subscribe to receive notifications of new posts:

Use Snowflake with R2 to extend your global data lake

2023-05-16

2 min read
Use Snowflake with R2 to extend your global data lake

R2 is the ideal object storage platform to build data lakes. It’s infinitely scalable, highly durable (eleven 9's of annual durability), and has no egress fees. Zero egress fees mean zero vendor lock-in. You are free to use the tools you want to get the maximum value from your data.

Today we’re excited to announce our partnership with Snowflake so that you can use Snowflake to query data stored in your R2 data lake and load data from R2 into Snowflake. Organizations use Snowflake's Data Cloud to unite siloed data, discover, and securely share data, and execute diverse analytic workloads across multiple clouds.

One challenge of loading data into Snowflake database tables and querying external data lakes is the cost of data transfer. If your data is coming from a different cloud or even different region within the same cloud, this typically means you are paying an additional tax for each byte going into Snowflake. Pairing R2 and Snowflake lets you focus on getting valuable insights from your data, without having to worry about egress fees piling up.

Getting started

Sign up for R2 and create an API token

If you haven’t already, you’ll need to sign up for R2 and create a bucket. You’ll also need to create R2 security credentials for Snowflake following the steps below.

Generate an R2 token

1. In the Cloudflare dashboard, select R2.

2. Select Manage R2 API Tokens on the right side of the dashboard.

3. Select Create API token.

4. Optionally select the pencil icon or R2 Token text to edit your API token name.

5. Under Permissions, select Edit.

6. Select Create API Token.

You’ll need the Secret Access Key and Access Key ID to create an external stage in Snowflake.

Creating external stages in Snowflake

In Snowflake, stages refer to the location of data files in object storage. To create an external stage, you’ll need your bucket name and R2 credentials. Find your Cloudflare account ID in the dashboard.

CREATE STAGE my_r2_stage
  URL = 's3compat://my_bucket/files/'
  ENDPOINT = 'cloudflare_account_id.r2.cloudflarestorage.com'
  CREDENTIALS = (AWS_KEY_ID = '1a2b3c...' AWS_SECRET_KEY = '4x5y6z...')

Note: You may need to contact your Snowflake account team to enable S3-compatible endpoints in Snowflake. Get more information here.

Loading data into Snowflake

To load data from your R2 data lake into Snowflake, use the COPY INTO command.

COPY INTO t1
  FROM @my_r2_stage/load/;

You can flip the table and external stage parameters in the example above to unload data from Snowflake into R2.

Querying data in R2 with Snowflake

You’ll first need to create an external table in Snowflake. Once you’ve done that you’ll be able to query your data stored in R2.

SELECT * FROM external_table;

For more information on how to use R2 and Snowflake together, refer to documentation here.

“Data is becoming increasingly the center of every application, process, and business metrics, and is the cornerstone of digital transformation. Working with partners like Cloudflare, we are unlocking value for joint customers around the world by helping save costs and helping maximize customers data investments,” – James Malone, Director of Product Management at Snowflake

Have any feedback?

We want to hear from you! If you have any feedback on the integration between Cloudflare R2 and Snowflake, please let us know by filling this form.

Be sure to check our Discord server to discuss everything R2!

Watch on Cloudflare TV

Cloudflare's connectivity cloud protects entire corporate networks, helps customers build Internet-scale applications efficiently, accelerates any website or Internet application, wards off DDoS attacks, keeps hackers at bay, and can help you on your journey to Zero Trust.

Visit 1.1.1.1 from any device to get started with our free app that makes your Internet faster and safer.

To learn more about our mission to help build a better Internet, start here. If you're looking for a new career direction, check out our open positions.
Developer WeekDevelopersConnectivity Cloud

Follow on X

Phillip Jones|@akaphill
Abhi Das|@abhidasone
Cloudflare|@cloudflare

Related posts

October 24, 2024 1:00 PM

Durable Objects aren't just durable, they're fast: a 10x speedup for Cloudflare Queues

Learn how we built Cloudflare Queues using our own Developer Platform and how it evolved to a geographically-distributed, horizontally-scalable architecture built on Durable Objects. Our new architecture supports over 10x more throughput and over 3x lower latency compared to the previous version....

October 09, 2024 1:00 PM

Improving platform resilience at Cloudflare through automation

We realized that we need a way to automatically heal our platform from an operations perspective, and designed and built a workflow orchestration platform to provide these self-healing capabilities across our global network. We explore how this has helped us to reduce the impact on our customers due to operational issues, and the rich variety of similar problems it has empowered us to solve....