Service Bindings are generally available, with efficient pricing

Today, we’re happy to unveil a new way to communicate between your Workers. In the spirit of baking more flexibility into our Developer Platform, our team has been hard at work building a new API for Worker-to-Worker communication: Service Bindings. Service Bindings allow your Workers to send requests to other Worker Services, from your code, without those requests going over the Internet. They open up a world of composability that was previously closed off by a difficult interface, and make it a lot easier for you to build complex applications on our developer platform.

Service Bindings allow teams to segment application logic across multiple Workers. By segmenting your logic, your teams can build with more confidence, deploying narrowly scoped changes to your applications instead of redeploying the whole application every time. Service Bindings give developers both composability and confidence. We’ve seen some excellent uses so far, and today we’ll walk through one of the more common examples. Alongside this functionality, we'll show you how Cloudflare’s cost efficiency will save you money.

Example: An API Gateway

Service Bindings allow you to easily expand the number of services running on a single request. Developers can now create a pipeline of Workers that call one another and create a complex series of compute blocks. The ability to separate and compose application logic together has opened Cloudflare Workers up to even more uses.

With Service Bindings, one of our customers has moved multiple services off of their legacy infrastructure by creating a gateway Worker that serves as the entry point of a request. This gateway Worker handles decision-making about request routing and quickly shifts requests to the appropriate services – be it their legacy application servers or their newly created Workers. As a result, several new teams were able to onboard, each managing their Worker independently. Large teams need a development ecosystem that allows for granular deployments, minimizing the scope of impact when a bad push to production occurs.

Let’s walk through a simple example of an API gateway Worker that handles routing and user authentication. We’ll build an application that takes in a user request and checks for authorization. If the user isn’t authorized, we block the request. If the user has valid credentials, we’ll fetch the user data. The application will also implement login and logout to change the user authentication state.

The api-gateway Worker handles routing and authentication checks for all the available endpoints

Here, the api-gateway Worker calls login and logout Workers for authentication to privileged endpoints like /getuser. The api-gateway Worker also checks each request for authorization via the auth Worker and allows valid requests to call the get-user Worker. The get-user Worker then makes an outbound network request to gather the required user information, and passes that data back to the client via our api-gateway Worker. The api-gateway Worker is therefore bound to four other Worker Services: auth, get-user, login, and logout.

The api-gateway Worker is bound to auth, get-user, login, and logout Workers via Service Bindings.

Let’s take a look at the code for the api-gateway Worker. We’ll see the routes /login, /logout, and /getuser are implemented on this API. For the /getuser route, the api-gateway Worker requires authorization via the auth Worker. Requests to any other endpoints will return a 404 HTTP status code.

export default {
  async fetch(request, environment) {
    const url = new URL(request.url);
    switch (url.pathname) {
      case '/login':
        return await environment.login.fetch(request);

      case '/logout':
        return await environment.logout.fetch(request);

      case '/getuser': {
        // Check the request's credentials via the auth Worker.
        // Clone the request so its body can still be read downstream.
        const authCheck = await environment.auth.fetch(request.clone());
        if (authCheck.status !== 200) { return authCheck; }
        // If the auth check passes, forward the request to the get-user Worker.
        return await environment.getuser.fetch(request);
      }
    }
    return new Response('Not Found.', { status: 404 });
  }
}

The code really is that simple. This separation of concerns allows your teams to work independently of one another, relying on each service to do what it’s supposed to do in production. It also lets you split your code by use case, so you can develop, test, and debug each piece more effectively.
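For completeness, here is a minimal sketch of what a bound service like the auth Worker might look like. The header name and the bearer-token check below are illustrative assumptions, not part of the example above – a real auth Worker would validate a session cookie or a signed token.

```javascript
// A hypothetical auth Worker. The "Bearer" prefix check is a placeholder;
// a real implementation would verify a session cookie or signed token.
async function handleAuth(request) {
  const header = request.headers.get('Authorization');
  if (header && header.startsWith('Bearer ')) {
    // Credentials present and well-formed: let the gateway proceed.
    return new Response('Authorized', { status: 200 });
  }
  // Missing or malformed credentials: the gateway returns this response directly.
  return new Response('Unauthorized', { status: 401 });
}

export default { fetch: handleAuth };
```

Because the api-gateway Worker returns the auth Worker’s response verbatim on failure, the 401 above is exactly what the client would see.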

But your next question might be, what am I charged for? Before we get into price, let’s first talk about where the compute execution is happening using our example above. A request to /getuser may look something like this, when looking across the request’s lifecycle:

A request lifetime graphic representing a sample application with multiple subroutines and a network request

The get-user Worker makes a network call to gather user information while the auth Worker executes entirely within the Workers runtime. Now that we understand what a single execution looks like, let’s talk about cost efficiency.

Cost efficiency that saves you money

Service Bindings are available for you to use starting today. They cost the same as any normal Worker; each invocation is charged as if it’s a request from the Internet – with one important difference. We’re removing the concept of “idle resources” across Workers. You will be charged a single billable duration across all Workers triggered by a single incoming request. This is possible because Cloudflare can share compute resources used by each request across your Workers and pass the resulting cost savings on to our customers.

Revisiting our example above, the api-gateway Worker may be waiting on other dependencies to perform some work, while it sits idle. When we say idle, we mean the time the api-gateway Worker is awaiting a response from the auth and get-user Workers – represented by the gray bars in the request lifetime graphic.

A request lifetime graphic representing a sample application with multiple subroutines and a network request

When using Service Bindings, you no longer have to pay for those “idle resources”. With the Workers model, customers can execute work on a single shared compute thread across multiple individual Services, for each and every request. Cloudflare will charge for the amount of time that thread is allocated to your Workers and the time your Workers are awaiting external dependencies. Cloudflare won’t double charge for any overlap.
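To make the difference concrete, here is a back-of-the-envelope comparison for a single /getuser request. The durations and the two billing formulas below are illustrative assumptions for this sketch, not Cloudflare’s actual pricing:

```javascript
// Hypothetical per-Worker timings for one /getuser request, in milliseconds.
const gatewayActive = 2;  // api-gateway's own CPU work
const authActive = 3;     // auth Worker's CPU work
const getUserActive = 5;  // get-user Worker's CPU work
const externalWait = 40;  // get-user awaiting its outbound network call

// Per-instance model: each Worker is billed for its full wall-clock time,
// including time spent idle while awaiting downstream services.
const gatewayWallClock = gatewayActive + authActive + getUserActive + externalWait;
const getUserWallClock = getUserActive + externalWait;
const perInstanceBill = gatewayWallClock + authActive + getUserWallClock; // 98 ms

// Flattened model: one billable duration for the whole request – active time
// across all Workers plus external waits, with overlap counted only once.
const flattenedBill = gatewayActive + authActive + getUserActive + externalWait; // 50 ms
```

In this toy example the flattened bill is roughly half the per-instance bill, and the gap grows the longer a Worker sits waiting on its dependencies.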

With the Workers model, resources are shared and you only pay a flattened duration bill

This is in stark contrast to classic serverless compute models (like Amazon Web Services’ Lambda), where resources are allocated on a per-instance basis, and as such cost is passed to the customer even when those resources are not actively being used. That extra charge is represented by the magenta portions of the request lifetime graphic below.

In a classic compute model, you’re potentially overpaying for resources that are not being used

Cloudflare is able to squash duration down to a single charge, since Cloudflare can share the compute resources between your services. We pass those cost savings on to our customers, so you can pay only for the work you need done, when you need it done, every time.

Getting Started

Excited to try Service Bindings? Head over to the Settings => Variables tab of your Worker, and click ‘Edit Variables’ under Service Bindings. You can then reference those bindings within your code and call fetch() on any one of them.
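If you prefer configuration as code, bindings can also be declared in your Worker’s wrangler.toml along these lines (the service names here are assumptions matching the example above – substitute the names of your own deployed Workers):

```toml
# Service Bindings for the api-gateway Worker.
# "binding" is the property name on the env object; "service" is the target Worker.
services = [
  { binding = "auth", service = "auth" },
  { binding = "getuser", service = "get-user" },
  { binding = "login", service = "login" },
  { binding = "logout", service = "logout" }
]
```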

We can’t wait to see what you build. Check us out on Discord to join the conversation.