Real World Serverless: The Video

We held our second Real World Serverless event in London last week and filmed the four talks about Serverless technology to share with you here, on the Cloudflare blog. Check out the recording, featuring Henry Heinemann, Sevki Hasirci, and Stephen Pinkerton from Cloudflare and Paddy Sherry from gambling.com Group.

For details of our other upcoming Real World Serverless events in Austin, Singapore, Sydney, and Melbourne, scroll to the bottom.

Video transcript:

Moderator: Okay, welcome to Cloudflare, everybody. I'm so pleased that you're here. This is the second event of our real world serverless event series. We had our first one in San Francisco just last week, and we were so excited to fly over here, both me and Jade and some other Cloudflare employees, and tap into this community here, because this is our second largest office.

We also frankly, love coming here to London to visit and engage with the developer community here. Let me hand this over to Jade and she'll speak a little bit about Cloudflare, and then we'll get started with speaker number one.

[applause]

Jade: Hello, everyone. Welcome to Cloudflare. It's on, okay cool. How many of you know what Cloudflare is? That's great. [laughs] I guess we can get started. We run about 10% of the internet as measured by requests, with data centers in 154 places worldwide. Very recently, we launched a serverless platform called Cloudflare Workers, which allows you to write code that runs in Cloudflare data centers.

You'll be hearing about various things related to that, about practical real-world concerns and best practices when deploying serverless applications. You'll also be hearing from someone who worked on the integration with the Serverless Framework, all today. Without further ado, who's speaking first, by the way? Henry. Henry, come on up. Henry is our first speaker today.

[applause]

Henry: Cool, let me bring this up. Okay, we can get started. A little bit about myself, real quick. I work at Cloudflare which is as Jade just said, not a given. I work on our go-to-market strategy team, basically trying to make sure that our new products such as Cloudflare Workers actually are a success to our business. In that position, I have a unique view on both our engineering side as well as our sales side.

This talk is going to be quite high level, not very technical as opposed to the following talks. To those who are quite new to Serverless and the entire concept of running stuff and functions in the cloud, this is hopefully going to be interesting for you. To those of you who already know all of this, I'm going to try and maybe bring up a few ideas that you potentially haven't thought about yet that I've been exposed to in the last couple of weeks.

Real quick, we have an intro and very brief market overview, like apart from Cloudflare who is doing this kind of stuff and then some opportunities in the Serverless space. We're going to talk a lot about buzzwords. I came up with another one after seeing this tiny comic down here in Latin, because why the hell not? Basically, this just means a man made cloud, as in a literal cloud that's been built by humans. Because that's what I'm going to be talking about.

I hope you can all see this. If you think about how you traditionally run software on the Internet, you run it on a server, like a proper machine that's somewhere in a closet or a data center, which is called on-premise. One of the companies that I just took out here is Dell. They manufacture those servers.

You can buy a server. It's a huge upfront cost, and you have to maintain it. You probably have some networking team that has to plug all the cables in, make sure there's electricity, put on the software, basically do literally everything. If your company scales, you have to buy more of them, and if your company has to scale down, you somehow have to get rid of them or keep paying for them. The cost is per machine, importantly. Some smart people thought, "Okay, that's really not ideal. That's not a good way of running a business."

What if you're a startup and you can't really afford thousands of dollars for an actual server? They came up with infrastructure as a service. People literally give you their infrastructure as a service. So that's what DigitalOcean does, and several other companies as well, and you just go to those companies and you spin up a virtual machine and you're good to go. You can run your software there, but you still have to maintain your operating system, you still have to install stuff to make sure your program runs, web servers, everything. There's still a lot of stuff you have to maintain.

The good thing, and we'll see it here this is billed by the hour. You no longer have to worry about having a huge up front cost, but you can actually just scale this by adding a new instance for your virtual machine or adding a new server virtually. That's where Heroku comes in.

If this is a bit much for you and you don't want to maintain your operating system and things like that, you can just use a platform as a service. That's what Heroku does. You don't actually have to install anything anymore. You just use that platform and you run your code there. To those of you who are familiar with serverless and running functions in the cloud, this might already look a little bit like serverless, but it's not quite there yet. The important distinction is that this is still billed by the hour. Even if nobody uses your application and it's just sitting there being idle, you're still paying for the stuff. That's where functions come in, or functions as a service.

I took out Amazon here. Again, I don't think that these are the only people who do this stuff. There are a lot of people who do this; I just took the people who are currently dominating the space or who were the first ones to do it. The important thing with functions is that you're now paying for this per request. So you don't have to think about how much RAM the machine is consuming or about any kind of CPU stuff and things like this. You literally just write a function and it does something. It's a unit of application logic, if that makes sense.

It's a tiny piece of the logic of your application which represents this function, and then you can have lots of functions which represent your overall architecture in the end. Of course, these can also interact with each other. I'm not saying that on-premise is evil or that everyone should use DigitalOcean and nobody should buy Dell servers. Sometimes you might have to use one or the other, and sometimes you might even use all four. It really depends on your specific scenario.
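To make the billing difference between the hourly and per-request models concrete, here is a minimal sketch. The prices and the names `HourlyPlan`, `PerRequestPlan`, `vm` and `perRequest` are hypothetical, chosen only to keep the arithmetic easy to follow; they are not any vendor's actual rates.

```typescript
// Hypothetical figures throughout: not real vendor pricing.
const HOURS_PER_MONTH = 730; // average number of hours in a month

interface HourlyPlan {
  pricePerHour: number; // price of one instance for one hour
  instances: number;    // instances kept running, busy or idle
}

interface PerRequestPlan {
  pricePerMillionRequests: number;
}

// Hourly billing: you pay for every hour an instance exists, even when idle.
function hourlyMonthlyCost(plan: HourlyPlan): number {
  return plan.pricePerHour * plan.instances * HOURS_PER_MONTH;
}

// Per-request billing: cost follows traffic and is zero when nothing is called.
function perRequestMonthlyCost(plan: PerRequestPlan, requests: number): number {
  return (requests / 1e6) * plan.pricePerMillionRequests;
}

// Monthly request count at which both plans cost the same.
function breakEvenRequests(hourly: HourlyPlan, perReq: PerRequestPlan): number {
  return (hourlyMonthlyCost(hourly) / perReq.pricePerMillionRequests) * 1e6;
}

const vm: HourlyPlan = { pricePerHour: 0.25, instances: 1 };
const perRequest: PerRequestPlan = { pricePerMillionRequests: 0.5 };

console.log("VM per month:", hourlyMonthlyCost(vm));
console.log("Functions, idle month:", perRequestMonthlyCost(perRequest, 0));
console.log("Break-even requests:", breakEvenRequests(vm, perRequest));
```

Under these assumed prices, an application that sits idle costs nothing on the per-request plan, while the hourly plan charges the same amount regardless of traffic. Which model is cheaper depends on where real traffic falls relative to the break-even point.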

One final one that I just wanted to include here: there's obviously also SaaS, which just abstracts away everything there is. You don't even have to build any software anymore, like the software stack; you literally just click a button and you have your software. It's finished, you don't have to write any code. It's like the ultimate level of abstracting your stack, basically.

Again, here you don't pay per request but you pay per user. That's just as a side note. Now, where does Cloudflare come in? We have Cloudflare at the end, and, well, the buzzwords. There you go. Basically, I'm just going to reiterate this again real quick. We started with the Dell servers on the left, the on-premise ones; then we went to virtual machines with DigitalOcean; and then I talked about Amazon and AWS.

If you are familiar with AWS Lambda, that's serverless. The idea is basically that you just have your function which runs in an Amazon data center, and that's where Cloudflare comes in with another buzzword and an important distinction between serverless and originless. If you run your function on AWS Lambda, it runs in a very specific location somewhere in a data center which you can choose. You can, for example, say this function runs in London. Which then also means that every time somebody requests this function to run, that request will be sent to London and back, which takes potentially a lot of time, right, and we can't change the speed of light. We at Cloudflare thought we would take the serverless model and basically put it into every one of our Cloudflare edge nodes. Every location where we have a data center can run these functions.

We bring the serverless part where we take all of the infrastructure abstraction away and stuff and we add one more thing that you don't have to worry about, which is you don't have to worry about location anymore. You deploy your code and it runs everywhere in the world instantly.

Now we have some more buzzwords down here, courtesy of Cisco. Fog computing was a marketing term invented by Cisco, mostly related to IoT devices and if you delve into this whole serverless space a bit more, you're going to hear a lot about IoT.

We at Cloudflare don't do that much related to IoT and I wanted to avoid it on purpose first, because there's so much more that you can do with serverless but anyway, fog computing was invented, so to speak, or invented by marketing people anyway, in the context of IoT devices. Where Cisco basically said that if you have your IoT devices that send lots of data into your cloud for processing, that takes a lot of time. It makes more sense to do some processing on the edge, so to speak, before it gets to the cloud computer and that's what they refer to as fog computing.

You can see that all of these terms are more or less interchangeable, but the important part to remember is that we're trying to bring the computation as close as possible to the end user or to the end device. That's what functions do, and we at Cloudflare typically use the edge to refer to our nodes, every data center we have. Other people sometimes refer to the edge as your actual device. You can run code on your phone, obviously, but that just depends on the way you look at it.

Again, why are we doing this, why do we care if we are close to the end user? I just brought up this example again because we were talking about IoT. If you have a smart speaker, like an Alexa, and you ask Alexa about the weather in London, I'm sure maybe they just have a default response and it says the weather in London is always terrible, but typically, you would ask Alexa, how's the weather in London, and then Alexa would send some kind of request to a cloud server somewhere, that server would then probably send another request to an API, get the weather, return it to the user, and they will know that it's raining in London.

Basically, the closer we have this computational part to the user, the faster they will hear how the weather is in London. It makes sense to have this API request which asks for the weather come from somewhere in the same location as this device. If somebody is hosting the cloud application which asks for the weather in San Francisco but the user is in London, it makes absolutely no sense to send that request to San Francisco, ask for the weather there, then return it back to London. With Cloudflare Workers, or many other platforms, you can easily implement something like this.
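A back-of-the-envelope sketch of why that detour hurts, assuming light in optical fibre covers roughly 200 km per millisecond (about two thirds of its speed in a vacuum). The distances and the names `minRoundTripMs`, `londonToSanFranciscoKm` and `londonToNearbyEdgeKm` are illustrative assumptions, and the result is a physical lower bound rather than a measured latency.

```typescript
// Light in optical fibre travels at roughly 200,000 km/s, i.e. 200 km per ms.
const FIBRE_KM_PER_MS = 200;

// Lower bound on round-trip time: there and back at fibre speed. Routing
// detours, queuing and processing time are ignored and can only add to it.
function minRoundTripMs(distanceKm: number): number {
  return (2 * distanceKm) / FIBRE_KM_PER_MS;
}

// Approximate distances, assumed for illustration.
const londonToSanFranciscoKm = 8600; // rough great-circle distance
const londonToNearbyEdgeKm = 50;     // hypothetical nearby data center

console.log("London <-> San Francisco:", minRoundTripMs(londonToSanFranciscoKm), "ms");
console.log("London <-> nearby edge:", minRoundTripMs(londonToNearbyEdgeKm), "ms");
```

Even before any server does work, the transatlantic round trip in this sketch costs tens of milliseconds per hop, and the smart speaker example involves more than one hop.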

Actually, regardless of where you are, the code will always return quickly to the user. We reduce the round-trip time. At the same time, we also reduce cost. If you think back to our Heroku or DigitalOcean example, if we have a smart speaker in our home, we're probably not constantly talking to it, or at least if we are, that would be weird. We really only want to have an infrastructure cost every time we actually have a request going to that device. Again, that's something that cloud functions solve.

The same thing that I am talking about here for IoT devices applies to websites as well. Every time I request something on my phone, it sends a request somewhere to a server, and I want that request to be fast, and I only want to pay for that request from an infrastructure-provider perspective when it actually comes in, and not constantly.

If you're completely new to this space, here's a list of some of the major players. As you can see, AWS is dominating the space, which is largely because they've been around since 2014. They were kind of the first ones to do functions as a service, and also they do a lot of other things. AWS has a whole bunch of things that integrate with their function as a service platform, building a pretty nice and holistic serverless environment.

The important thing to notice here though is, we're comparing apples to oranges. If you think back to what I explained previously about the difference between serverless and originless, you'll realize that actually, Cloudflare Workers is running everywhere. AWS Lambda is only running in one single location. That's not to say that Cloudflare Workers is better and that AWS Lambda is worse; that's not the case. There are just different use cases and scenarios in which each of these solutions would make more sense than the other. We'll see how this develops, but it's just something to keep in mind when you develop your own applications.

Now something even more abstract, I hope you can read this. Basically, we have an infrastructural concern, and we have a multi-cloud concern. If you move your entire application stack to let's say Cloudflare Workers, you're going to expect it to never go offline of course, but that's the same thing you expect from every other vendor that you work with. However, inevitably, at some point, you may or may not have an outage. Anything could happen. As you can see and as you might already all know, people implement secondary CDNs, they implement Secondary DNS, what people haven't really thought about yet is how to implement secondary edge computes or secondary functions as a service. What happens if your serverless platform goes offline?

I thought about this and one solution would be to just put an additional serverless platform in front of it, which then routes the requests to, let's say, AWS Lambda and Google Cloud Functions. Then again, you have a third vendor, so you're effectively solving your lock-in by locking yourself in even further. That's just an interesting challenge I want to leave you with. If you come up with a great solution, it might be a good business to start at some point. Regardless, even if you don't try to solve the whole secondary edge compute problem, Workers, and in general these functions as a service, are a great way of implementing a multi-cloud strategy, because you can effectively, on every request that you get to your website, route your traffic to a different cloud provider according to different criteria.

If you have your content on, let's say, Google Cloud, and let's say AWS, you can decide on the fly which one is cheaper, at this very point in time, and then you route your traffic to AWS because it's five cents less than Google at this time. Then maybe an hour later or if the request actually comes from a different region in the world, you might want to route it to the other data center and so on. You can implement some pretty interesting vendor strategies using serverless computing. Even more opportunities.
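The per-request routing decision described here can be sketched as a small pure function. Everything in it is an assumption for illustration: the `Origin` shape, the region labels, the prices, and the example URLs are hypothetical, and in practice the current price would have to come from some pricing source the talk does not specify.

```typescript
type Region = "eu" | "us" | "apac";

interface Origin {
  name: string;       // hypothetical label for a cloud provider deployment
  url: string;        // where the content is hosted
  region: Region;     // region this origin serves best
  centsPerGb: number; // current price, assumed to be looked up elsewhere
}

// Prefer origins in the visitor's region; among those, take the cheapest.
// If no origin serves that region, fall back to the cheapest one overall.
function pickOrigin(origins: Origin[], visitorRegion: Region): Origin {
  if (origins.length === 0) {
    throw new Error("no origins configured");
  }
  const local = origins.filter((o) => o.region === visitorRegion);
  const candidates = local.length > 0 ? local : origins;
  return candidates.reduce((best, o) => (o.centsPerGb < best.centsPerGb ? o : best));
}

const origins: Origin[] = [
  { name: "cloud-a-eu", url: "https://a.example.com", region: "eu", centsPerGb: 9 },
  { name: "cloud-b-eu", url: "https://b.example.com", region: "eu", centsPerGb: 4 },
  { name: "cloud-a-us", url: "https://a-us.example.com", region: "us", centsPerGb: 7 },
];

console.log(pickOrigin(origins, "eu").name);
console.log(pickOrigin(origins, "us").name);
```

A function at the edge would call something like `pickOrigin` on each incoming request and forward the request to the chosen URL. Because the decision is recomputed per request, a price change an hour later changes the routing without a redeploy.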

To those of you who've already had some experience with serverless code that you wrote, you probably did something in these two regions. You probably did something that had a fairly minor impact. It may have been a temporary fix, it may have been a permanent fix, but you probably didn't put the future of your business and you didn't bet that on the serverless platform that you were using. Maybe you fixed a typo on your site because you were too lazy to SSH into your server and to do it properly, but then the next day you would actually log in and fix a typo. That's a temporary fix with a minor impact.

Another example is maybe your business name is misspelled in some places on your website, like capitalized in a weird way, but this is happening across the entire marketing site, so you can't just quickly fix that everywhere. You just put in a Worker, very minor impact, where we just capitalize the F. It doesn't change the word, but we'll leave it in permanently because why not? What I'm increasingly starting to see is people using serverless technology in major impact scenarios. One interesting way of using serverless is to actually enable your cloud migration or to patch your legacy infrastructure temporarily. Maybe you have an on-premise system and you're not happy with it, maybe you had some security vulnerabilities, and you already planned to migrate to, let's say, DigitalOcean in the future. Well, you're not doing it immediately. So in the meantime, you can just use serverless to patch all of your leaks and keep your website running for, let's say, half a year. Still, this is pretty critical stuff, so if it breaks, your business would be in trouble.
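The capitalization fix mentioned at the start of this passage could look roughly like the following. This is a minimal sketch of only the text-rewriting step; the brand name "AcmeFoods" and the function name `fixBrandName` are hypothetical, and wiring it into a function that fetches the page from the origin and returns the rewritten body is left out.

```typescript
// Replace every occurrence of a wrongly capitalized brand name in a page body.
// The match is exact and case-sensitive, so other text is left untouched.
function fixBrandName(html: string, wrong: string, right: string): string {
  return html.split(wrong).join(right);
}

// Hypothetical page fragment with the brand name capitalized inconsistently.
const page = "<h1>Acmefoods</h1><p>Welcome to Acmefoods.</p>";

console.log(fixBrandName(page, "Acmefoods", "AcmeFoods"));
```

Using `split` and `join` instead of a regular expression avoids having to escape special characters in the brand name. The trade-off is that it cannot skip matches inside URLs or attributes, which a permanent fix would probably need to consider.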

Also, if you're trusting your business reputation or even revenue to a serverless platform, you probably want to use a continuous integration and deployment tool to do so. You don't want to log into some IDE or some web portal and maybe make a mistake somewhere and be unable to roll back; you want to have some additional control here.

Actually, Cloudflare recently started supporting the serverless framework with Workers. So that's a step in that direction.

Then, this is extremely rare, but I think we're going to start seeing this happen a lot more in the future.

It's people who basically more or less build the entire application using serverless. For example, they would have their entire marketing site, so the entire public-facing part of their website, in an originless architecture. The site does not have a server anymore that it runs on; it completely lives on, for example, Cloudflare's app service.

Also mentioned IoT integration here, for example, if you build a new smart speaker and you implement it with serverless, obviously if that breaks, that would be a pretty major impact on your business.

I also saw an interesting case study by AWS Lambda on Netflix. So Netflix does some of their media encoding using serverless technology. You all know Netflix if something like this breaks, it has a major impact on the business. At Cloudflare we see more and more people with a similar setup, whereby everything they do hinges on the fact that our workers are delivering their content fast and securely.

What I find interesting is that this is becoming a skill. People put this on their LinkedIn profiles. I'd almost bet that somebody here may be an expert on serverless, and maybe put something like that on their profile. Obviously, not only do people have this in their skill set, but also in their job titles. If somebody here in this audience has this skill, I would love to meet them. If that happened by coincidence, that would be great, but the point is basically, this is a job description. This is no longer just an abstract thing that people talk about, and I intentionally started with the buzzwords. It looks like something really abstract that nobody really actually knows what it is, but people make a living doing this. I got the inspiration to search for this in the first place because we at Cloudflare saw this job posting where somebody literally searched for somebody who's proficient at writing AWS Lambda code and Cloudflare Workers code. This is a skill set that you may want to develop in the future if you don't already have it.

With that, I'm going to leave you to the actual experts who know how to write this stuff and stop talking about it in a high level way. Because it is, in fact, not magic, it works, and we at Cloudflare are already doing it. Thank you so much.

[applause]

Moderator: Okay, all the way from San Francisco, Stephen Pinkerton. [applause]

Stephen: Everyone, thanks for being here. Thank you, Andrew. Heads up I'm very jet lagged, we're going to try to get through this together. I'm going to talk a little about real world serverless. I'm a product manager at Cloudflare. I work on some products adjacent to workers. Some things in the makes, some things in the works and something we actually shipped last week that I'll talk about in a moment.

Before this job, I worked in a couple of different engineering capacities in embedded software and distributed systems.

If you have a Monzo card, I worked at Monzo. We'll talk about what is serverless, why it might be useful for you for side projects or for your business, and then how you can get started using it and where we see serverless right now and where we see it going in the future.

What is it? Serverless is really a way to build applications and write code in a way that doesn't need to be concerned about the underlying infrastructure on which it runs. You really get to express what your product does in the most concise way possible without worrying about how you're actually going to deliver that experience to customers.

We have a serverless product. We've built an integration with the open source Serverless Framework, which is a really convenient way to write platform-independent code that you can deploy to different cloud providers. Our serverless offering, Cloudflare Workers, is now integrated with it. It's a really convenient way to deploy code and manage configuration within version control, which was previously impossible. You can manage your entire application with a team and deploy it really easily, and it makes testing much easier. It's very cool. I recommend checking it out.

How many people here use the Serverless open source framework? That's cool, okay. Awesome, thank you for coming. Okay. Why does serverless make sense? We'll take a step back and talk about the history of computing, or the history of getting paged. A while ago, depending on your needs as someone who maybe writes code, you used to write applications that would run on your own hardware. You'd buy computers, you'd put them in a data center or in a room, or you'd co-locate these computers in someone else's data center, and you'd have to worry about networking, you'd have to worry about configuration management, how you deploy code, how you secure all this networking. It was all very complicated.

If something went wrong, if a cat or a person tripped over a networking cable, you got paged and it was your fault and you had to fix it. Now, we live in the cloud era. I'm sorry for saying cloud era, but we live in an era now where applications run isolated from each other. You can run these applications in a way where you maybe rent time on serverless providers; you maybe rent time in terms of microseconds or milliseconds if you're renting specifically CPU time on a serverless provider.

All of this is really analogous to you need somewhere to live. You can build your own house, you can buy a pre-built house, you can rent an apartment, you can buy an apartment, you can stay in a hotel and it really depends on what your needs are. If you need somewhere to live every night of the year or if you just need somewhere to stay for a couple of nights. Your requirements are different and the same applies for application.

A lot of people have different opinions about where the Internet is going, and we really see it as: how do you get code running as close to your customers as possible? One way that people are thinking this might be possible is some sort of mobile edge, where your code is actually running in a data center or on servers at cell towers that are within maybe a couple of miles of where people's handheld devices are.

It really makes you ask the question of, like, what business are you in? This really comes down to how do you add value to your customers? What do your customers care about, and how do you add value to them? Do you add value to your customers with technologies like this? Do your customers care that you run Nginx? Or that you have a really cool Redis setup, or that you use microservices and etcd and Linkerd, some sort of crazy infrastructure, or do they care about the experiences that you're delivering to them?

It really makes you ask that question and realize that your customer doesn't care about these things, and maybe you shouldn't either; that you could free up your time and your resources by instead architecting your applications in a way that doesn't need to be concerned with these.

If you work in tech or you work in a technical capacity, you can just boil your job description down to something like this, where you're trying to deliver some experience to some person in some part of the world really fast. So, why does this matter? You might have heard of the famous AWS statistic that 100 milliseconds of latency is 1% of revenue. That's an important number to remember, because maybe it makes sense to focus on that and not on how cool your infrastructure is. It's basically who you can pass the pager to, who you want to be accountable. Do you want someone who's an expert at managing servers to be accountable for running your servers? Or do you want to do it yourself in addition to delivering an experience for your customers?

Something interesting I hope you noticed about this slide is that most of the world doesn't see the internet through windows that look like this. Devices used around the world generally don't look like an iPhone or Mac, people use all sorts of devices and so the experience you deliver to them is really what matters.

Serverless is a really powerful tool for expressing your business logic to your customers. It really just lets you focus on the product that you're going to deliver. This really means that you should focus on building products and delivering value and less on infrastructure. What can you do to do the minimal amount of work to provide the most value to your customers? Basically, where is your focus? This really applies to someone building a personal project or if you're working at a large company.

As an engineer, you should be afraid of a couple of things. As an engineer you should probably be afraid of code, and you should be afraid of infrastructure, because everything that can break will break, and everything is a liability. This goes back to who do you pass the pager to, to solve these problems for you. Because I would much rather have a larger company manage servers, configuration management, and networking than someone like me. I'd much rather not figure out how to co-locate a computer in a data center near my house. I don't think that really makes sense. Who do you pass the pager to for these problems? It really comes down to paying someone else to solve hard problems for you so that you can focus on your customers.

Of course, this comes with an asterisk. Serverless isn't a one-size-fits-all solution. With engineering, everything is complex, and you probably have existing applications, so you can't go serverless tomorrow. There are cases when it makes sense, given what you should be delivering to your customers, to build your own patching layers and manage your own infrastructure.

We'll talk about the current state of serverless, how you can get started with it, and then where we see serverless going. First-gen serverless is really an adaptation of the current model of computing, where you maybe rent compute time by the hour; you're leasing CPU time on someone else's servers. It really comes down to standardizing a couple of pieces of technology that all of that relies on.

Serverless relies on containers that are running your application and running web servers. One problem that has come up a lot, and you've probably heard about, is the cold-start problem. That's really a result of web servers not being optimized for spinning up really quickly to deliver requests on demand.

It also relies on the model of regional deployment, where you have to pick where your application is going to be deployed, and distributing your application is a very difficult concern for someone who is a developer and someone who's managing data. All in all, the current generation of serverless is really, really powerful, and it's let people focus on the real problems that matter to their customers.

Right now you can go write code and deploy it and let people interact with it without worrying about public configuration, without worrying about networking or a lot of very hard problems that are time-consuming and people have solved many many times.

It's gotten us really far but it really relies on the previous generation of computing. Something that goes along with this previous generation is just complex billing, that many of you may have experienced if you use serverless technologies.

Now, tell me what this number is. This is the number of ways in which you can be billed for using AWS Lambda. You may have heard that serverless can be expensive depending on what cloud provider you're using and your use case. It can also be complex. You're running a business and you need to know how much it's going to cost to run your application, and a lot of these services are usage-based, so you don't get a bill until the end of the month. If your application or your project gets on the front page of Hacker News, if you experience spikes in traffic, you want to be able to predict what you're going to be paying for this.

I'm sure you've seen the blog posts before about people who don't expect the insane bills that they might get from a cloud provider. You see all these sorts of problems with the current model, although it's gotten us very far.

These are kind of the requirements that we see for the next generation of serverless: that you shouldn't be tied down to a region. Deploying your code should be fast and it should be global. You shouldn't worry about where your code is deployed. It should be accessible to your customers at low latency, without variance in latency as well, something that you may see now with web servers needing to be restarted as you get more requests coming in. Your billing should be predictable. Your billing, your latency, and how your application behaves should all be predictable.

With that in mind, I'll talk a little bit about Workers. Workers is our answer to serverless, and it's architected in a very interesting way, where we looked at the model of web browsers and how people run back-end code right now. We asked ourselves if things are really being done the right way. This blank here is really like how your application might run right now, where you run it behind a web server like Apache or Nginx and someone accesses it in Chrome.

There's some problems with this. It's like Apache and Nginx aren't designed to be started up on demand to deliver requests in high volume. They're very good at delivering requests in high volume but they're not optimized for this cold-start problem. People have been making very impressive strides on solving this problem, but the whole architecture is not on your side.

We looked at the way that web browsers run code and we thought that that might be an interesting way to let you run applications. A web browser executes JavaScript extremely quickly. As soon as you download JavaScript, you can start executing it, and in some browsers and JavaScript runtime implementations, you're executing JavaScript before you've even downloaded all of it.

What if we take existing technology and standards and let you run code that you would normally think about running in the browser on the server? There is a Service Workers API; I won't get into that. If you're interested in learning more about the technology behind this, look up Kenton Varda's talk on YouTube. He is the architect behind all this.

Essentially there are some assumptions you can make about code that you run in a web browser, where if you have scripts running in two tabs, JavaScript in one tab can't modify the state of JavaScript running in the other tab, and that's a really powerful form of isolation that you get. It also means that tabs are very lightweight. That's a lightweight concept that we're all aware of: it's much faster to open a new tab in a browser than it is to open up Chrome, a new instance of Chrome, every time you want to go to another website.

The latter is really analogous to how web servers act right now. You spin up new processes, and you spin up a new process with containers and everything. We really want to think of things as threads, as tabs that your code is running in, with the code in each tab running in isolation from the others. That's what we actually did with Workers. We took the V8 runtime, we wrote some code on top of it, and we put it across our 150 data centers around the world. You can take JavaScript that you would think about running in the browser and actually run it everywhere. The benefits you get are crazy.

Probably the most significant one, if you're concerned about using JavaScript, is that you're not going to experience the extreme variance in latency that you may be seeing right now with serverless applications, where you hit an endpoint that's running a serverless application and it might respond right away, or it may need to go provision a new container for you, start up Nginx, and start up your application.

There are benefits to this model, but you're going to be paying for it in latency. It's interesting because the ideal of serverless is really that you should be paying for what you use. This really isn't what's happening. If you have a high-demand serverless application with a lot of requests, you might have noticed this, or heard of other people doing it: you have a cron job that consistently makes requests to your serverless application to keep it awake and to keep your cloud provider from removing some of those containers and scaling your operation back. You're no longer actually paying for just what you use; you're paying an extra amount just to keep your application awake.

You may have seen some other breakdowns as well that people have done where they've modeled latency in their application. It turns out that optimizing latency in a serverless application isn't the same as what you might do in a traditional application. All of a sudden you're optimizing the way your code is run on a serverless provider, but the whole point of using a serverless framework would be to not worry about these problems and to let someone who's more qualified solve them for you.

Another very interesting thing about what we've been doing is that you get to treat this entire network as one thing: you deploy your code across all of these data centers around the world, and you no longer have to think about it as some big distributed application or distributed network. You can think about it as a single computer running your code, and you can reduce it further: it's a single function running your code.

The fundamental unit here is an event that happens in your application. I would also recommend Kenton Varda's talks; he goes into a lot of detail about this. It's a completely new way to write code that we've seen our customers do very powerful things with. On top of this, if you deploy serverless applications, you may complain that deployments take a long time, and some cloud providers may take 30 minutes to globally deploy your application, even if you're using a regional model where you need to specify what regions your code is going to. You shouldn't be concerned about what region your code is being deployed to. A deploy should be global and it should be fast, your code should start up fast, your code should scale quickly as your request volume increases, and pricing should be predictable.

You should have a solid idea of the variables that are being used to charge you for your application. This is all stuff that we've learned from talking to our customers. We've seen them use serverless applications for different things and come to us with questions about our other offerings, but also about Workers: how do I make serverless work for me?

People see the benefit of it, but sometimes the first generation of it is just complex, and it can be expensive. Something interesting that we've achieved with this, and the data really speaks here, is that the cold startup time for the average Worker is about five milliseconds, and that's for any workload. You could deploy a Worker, get on Hacker News in a minute, and you'd still scale in a predictable, fast way.

I'm going to put in a plug for this integration that we made with the Serverless Framework. I definitely recommend checking it out, especially if you use Workers already. It's a really easy way to manage your code and store everything in version control. There's some great documentation on it. I highly recommend it. I just want to give a shout-out to the engineers Avery and Norvik at Cloudflare who worked on this. It's something we're really proud of, and we actually did this talk last week in San Francisco as well, in collaboration with the Serverless team in San Francisco. It's been really fun to work on.

Sevki: Hi, I'm Sevki. I'm a software engineer at Cloudflare, and if you haven't had enough of this, we're going to talk about APIs and how that relates to edge computing. I have a not-so-secret motive for giving this talk: I want to change your mindset about how you think, how everyone thinks, APIs should work.

Not to go over this over and over again, but this is what servers used to look like when I was growing up. It was a server in the back room or the utility closet where the cleaning supplies were. Then we went to this, which I believe Google and Amazon just built so they can put it on Reddit's Cable Porn subreddit. It makes for an amazing picture.

You don't really care where your server is running; it's just one of those machines. I think the CTO of Netflix at some point, when they were asking him, "Why are you using SSDs?", famously said, "I'm not running SSDs, I only care about performance, Amazon is running the SSDs. Whenever one fails, they swap it out, they put it back in. I don't care."

The future of computing looks even more abstracted away from us, where we don't really have to care what operating system we're running on. We just care about the fact that we're running some sort of JavaScript code or Python code or something like that. We are abstracting away all the operating system stuff that's related to it. If someone said your Lambda code or your Worker script is running on OpenBSD instead of Linux, would you care? I probably would not.

That's the place we want to go. We don't want to think about, "Okay, this is the kernel patch that I'm running for the operating system I'm currently running," or the vulnerabilities and whatnot. We really want to only care about the code that we're deploying and nothing else.

In the end, not to belabor the point, but we went from a server in the utility closet to a server running in us-east-1b to no computers, but it's still one location that it runs in. I really want you to think about why code running on Cloudflare Workers, being very global, is interesting.

When we say edge computing, the edge refers to the edge of the cloud, so it is the closest computing units that are available to your users, with the exception of the ones that they're looking at when they're running into each other at the station. There is a really, really important thing for you to think about: what is the latency when someone refreshes their Twitter feed? What happens then? What happens when they're swiping on Tinder? How fast is it?

Matthew, our CEO (I was going to say he was very famous, but apparently not that famous), said, "We're not there yet, but what we want to do is get to within 10 milliseconds of 99% of the global population." Anywhere you might have a user, anywhere you might have eyeballs that you want to attract, where latency matters, for 99% of them we want to be within 10 milliseconds. It is really, really important for us that we are closer to your users than to your origin servers.

Now, we sort of think about all this interaction between our users and our servers, but in all honesty, it’s probably a little bit more like this. We probably want to be here. We don't want to be close to your server, we want to be close to your user. That really matters to us, because we really care about performance.

One of our colleagues, Zack, made a comparison of performance between Workers, Lambda and Lambda@Edge, and this is the architect of Workers channelling Rita, who is the program manager for Workers. We still have this mindset of: what are things that are cacheable? What are things that we want to put behind Cloudflare or a CDN? What are things we don't want to put behind the CDN? We probably do something like this in our CI/CD pipelines: bundle JavaScript, images, CSS files. We put hashes after them. We know that they're cached, whatever.

When we're downloading them, we know the correct version that we're downloading. There are libraries upon libraries built for this exact reason. We can cache these things. We don't really think about API calls as being cached. Like Jade mentioned earlier, all these things like machine learning and whatnot are becoming very, very popular. We're not really thinking about the cost: "Okay, how much is it going to cost me to make another translation request to Watson? How much is it going to cost me if I make a request to Google's image recognition service?" We're not really thinking about those things, but more and more people are actually looking for them. At the edge, before someone uploads something, we really want to be able to do abuse detection. We want to be able to figure out if a particular image that is being uploaded to our website is copyrighted material, so we can stop it before it hits and becomes an extra cost for our support staff, who have to deal with the abuse claim, deal with the takedown notice, and so on and so forth.

We don't really think about those things as being cacheable, but we think they are. We think authenticated pages and admin tools should be put behind Cloudflare. We certainly do this internally. All our API calls are behind Cloudflare. All our admin tools and whatnot we put behind Cloudflare, and there are reasons for this.

Take geo-restricted content: when the GDPR thing hit, how many US-based websites just went offline for the entire European region? They're using things like Cloudflare and whatnot to go, "Okay, the origin of this request is such-and-such country, we're going to block it." We're now starting to think about these things for serverless endpoints as well.

You don't really have to think about Workers as being an independent platform on its own. It can be complementary to all these things that already exist, like Google Cloud, Azure Functions, Lambda, IBM's Watson, or hosted services like that. This is one of the curses of working for Cloudflare: the number of data centers updates so often that we have to make these maps interactive so we can download the list of PoPs. And actually, this doesn't really look right. Does anybody have? Yes, I think we should make those orange. He said, setting up his first demo. What is our official orange for Workers? Oh, sorry.

I should probably know this by now. Let's check if that's right. That looks about right. Let's see how fast that script actually deploys. That's how fast Worker scripts deploy. No need to applaud for that. That is the global map of how many PoPs, points of presence, we have, and literally just as I click save, it goes out and uploads everywhere. It is really, really fast, and it is extremely gratifying to be able to push code and have it deployed in seconds. I think our current upper limit for how long it will take is about 30 seconds, but I've never seen anything take that long. I think it's just something we say to cover ourselves, but it really doesn't take that long.

What does all this have to do with GraphQL and this entire thing I was telling you about? Certainly, Henry and Stephen talked about this as well: we really, really want these experiences to be as performant as they can be. If I go back to this: from Cloudflare's edge to your origin, we can do a lot of things, like Argo Tunnel and smart routing and whatnot. Because we have this global network, we can route your requests from the orange cloud to your origin servers very, very fast. What we really want to be able to do is get this distance as short as possible as well, because this really matters. If the round trip doesn't have to go all the way to your origin servers, that's a win. That's a win because your server is going to have less load, and this person's battery is not going to die because they're waiting on an HTTP connection to close, and many of those add up. We really, really want to be able to be fast.

One of the ways to do this, which some folks at Facebook figured out some time ago, I think around 2015, is by batching a bunch of these calls, so you don't have to open a new socket for every single call you make and then incur the cost of doing a TCP handshake, a TLS handshake, and whatnot. You just complete all of those in one go, and that is called GraphQL.

Here is what GraphQL looks like. This is GraphiQL, the GraphQL editor that the GraphQL team built. Let's actually try to write a query. I'm going to put this microphone down for a bit. I hope it doesn't make a huge thunk sound. Notice as I'm typing, and this is one of the great things about GraphQL, that you get autocomplete, because your entire API is defined in the schema.

It also gives you hints about what you need to put in and what not to put in. This is a GraphQL server that I wrote on top of our 1.1.1.1 APIs, which resolves DNS queries and returns us the data for them.

We're going to put the name in; let's keep that in the company: Cloudflare. Then we're going to put the type in. Autocomplete, thank you very much. Then we'll do quad-A, and then we get all the fields that we want, filled in automatically. Now, this is great, but what is also really great about this pattern is that you can shape your data without changing the server-side code. That gives you the ability to iterate quickly. If you're writing a mobile application, you don't have to raise a ticket to get the backend folks to change the shape of the resulting data. You can just change it yourself. So if I wanted to maybe get rid of the TTL and get rid of the name, let's get rid of the type as well, let's run that again. Works.

What if I wanted to have multiple of these? Sorry, it's hard typing with one hand. Let's do IBM here. Now I'm also getting errors as I'm typing. The reason I'm getting an error here is that it says there's a conflict for resolve, because we already have something called resolve. As you can see on this side, this is the field that we return in our data object. What we're going to do is call this one IBM, and then we're going to call this one CF. Do that.

Is this an error? Not really, IBM doesn't have a quad-A record. They're not on IPv6 yet. Shame on them. If I do IPv4, go old school, that comes back. Thank you. One of the interesting things, as you can see, is that these are separate. I'm going to go into the code in a bit. These are separate API queries that we're bunching together. We're doing them on the server side. Now the great thing about this is, I am going to go into the console. Let's look at our network.
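The aliasing from the demo can be sketched as a small helper that builds one GraphQL document out of several lookups. This is only a minimal sketch: the `resolve` field and its `name` and `type` arguments come from the talk, while the selected fields (`name`, `type`, `TTL`, `data`), the `Lookup` type, and the `buildAliasedQuery` helper are assumptions made for illustration.

```typescript
// Hypothetical helper: one aliased `resolve` call per lookup, all in one query.
interface Lookup {
  alias: string; // e.g. "cf" or "ibm"; must be unique within the query
  name: string;  // domain name to resolve
  type: string;  // DNS record type, e.g. "A" or "AAAA"
}

function buildAliasedQuery(lookups: Lookup[]): string {
  // Each alias renames the `resolve` field so the results do not conflict.
  const fields = lookups.map(
    (l) =>
      `  ${l.alias}: resolve(name: ${JSON.stringify(l.name)}, type: ${l.type}) { name type TTL data }`
  );
  return `{\n${fields.join("\n")}\n}`;
}

const query = buildAliasedQuery([
  { alias: "cf", name: "cloudflare.com", type: "AAAA" },
  { alias: "ibm", name: "ibm.com", type: "A" },
]);
```

Both lookups travel in a single request body, which is what lets the server resolve them together instead of the client opening a connection per call.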

Everyone okay? Right. As we can see, we're doing two different resolutions on our end and we're getting 47, 12, 24, 17, 22 milliseconds for not that small of a query. If we wanted to add more, let's call this MS. Good thing Microsoft has a quad-A record for IPv6 right now. Let's see. No. Microsoft. LG. Or let's look at the MX records. Interesting. Yes, outlook.com.

If we do this over and over again, you will see that all these things are resolving fairly quickly: 15 milliseconds, 19 milliseconds. If I keep going, it's probably going to even out somewhere around 10 to 20 milliseconds. The reason for that is we're caching these very, very aggressively on the Worker. Let's actually jump into the code that does that. Can everyone read this? All right. This is our starting point. I'm just going to go through a few points, but all this code is available on GitHub. You can go and play with it. I encourage you to, because if you clone the repo, in three simple commands you can build (when I say build, I mean bundle your Worker script into a single file) and deploy to cloudflareworkers.com. If any of you are familiar with Repl.it or the Go Playground or something like that, this is our playground for Workers. You can go and write scripts. You don't have to sign up. You don't have to do anything. Just go in, put your code in, and it works.

In the repo, there are three-- I'll just show you the repo. I'll show you the repo later. All right. Let's do this first.

Basically, the first thing we want to do is decode the request that we get. If you go and look at the GraphQL documentation online, you will see that most of it is set up for something like Express, where you need to set up an HTTP server. With Cloudflare Workers you just get the request. We actually register an event listener, which I haven't included on this slide, but in the full code this will all make sense. We handle the GraphQL request: the handler receives an event, which is a fetch event, and we decode the query from it. This is basically very simple boilerplate code that you need to have in a GraphQL implementation on Cloudflare, but I just wanted to put it here to show you that it literally takes a hundred-something lines of code to have a fully functional GraphQL server in Workers. When GraphQL first came out, I really, really wanted to get into it: just set up a server, see how it works. I have never found anything as simple as Cloudflare Workers to actually get started with GraphQL.

We basically do some housekeeping. Because these are all Worker scripts and use the standard web APIs, we basically get a body and decode it, literally by turning UTF-8 bytes into a string. Then we do this one simple thing, which uses the only external library that we have in this. It is the GraphQL resolver.
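The housekeeping step might look roughly like the sketch below, which uses only the standard `TextDecoder` and `JSON.parse`. The `extractQuery` name and the `GraphQLRequestBody` shape are assumptions for illustration, not the code from the talk; in a Worker the bytes would presumably come from the request body of the fetch event.

```typescript
// Assumed shape of the JSON body a GraphQL client POSTs; only `query` is needed here.
interface GraphQLRequestBody {
  query: string;
  variables?: Record<string, unknown>;
}

// Decode UTF-8 bytes into text, parse the JSON, and pull out the query string.
function extractQuery(body: Uint8Array): string {
  const text = new TextDecoder("utf-8").decode(body);
  const parsed = JSON.parse(text) as Partial<GraphQLRequestBody>;
  if (typeof parsed.query !== "string") {
    throw new Error("request body has no GraphQL query string");
  }
  return parsed.query;
}
```

The returned string is what would then be handed to the GraphQL library together with the schema.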

Let's look at the schema first and what a GraphQL schema looks like. These are the bits that were giving me autocomplete, the enums and whatnot. As you can see, these are the DNS record types that we have defined. If I go back here, close this bad boy, open this up, and look at the query, I see resolve, name: String, type: RecordType, and Answer. All of these are documented and commented, all in this code. We give this schema to GraphQL, and GraphQL knows how to make an introspection query out of it, and the introspection queries are also in GraphQL. I don't know if Facebook has a public GraphQL API, but you can certainly go to github.com and look at their GraphQL API. It is defined in GraphQL, and you can literally query through it to write tools and whatnot that you may want to use.

This is really the only query that we have in this particular GraphQL code, which is resolve, and it has the exact same signature as the function that we had. It basically says it needs a name, which should be a non-nullable type, and it will return to you an array of answers.
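A schema along the lines described could be written in GraphQL's schema language and kept in the script as a string. What follows is an assumed reconstruction, not the schema from the repo: the `resolve` query, its non-nullable `name`, the `RecordType` enum, and the list of `Answer` results come from the talk, while the particular enum values and `Answer` fields are guesses.

```typescript
// Assumed reconstruction of the schema described in the talk, in GraphQL SDL.
const schemaSDL = `
  enum RecordType {
    A
    AAAA
    MX
  }

  type Answer {
    name: String
    type: Int
    TTL: Int
    data: String
  }

  type Query {
    # name is non-nullable; the result is a list of answers
    resolve(name: String!, type: RecordType): [Answer]
  }
`;
```

Because the whole API is declared this way, an editor can offer autocomplete and flag conflicts while a query is still being typed.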

The GQL.query bit is what we just decoded: from the JSON object that gets posted to our server, we just take out the query bit, and that is the GraphQL query that we send over here. It's a string that comes out of GQL.query right here. Let's look at the new root. What are we doing here? This is really the interesting bit of GraphQL on Workers, or GraphQL anywhere, really. We're passing in this event, and we'll get into it in a bit.

Here we're basically creating a root object. This root object has a constructor, and we're passing the event into it. The signature for this function is basically the schema that you have to give it and the query that you have to give it, and then the root is optional. It takes the schema, it takes the query, and it traverses through the query and figures out which fields of the object that you're passing into it to send back to you. If you take fields out, like we did in this bit, that works; and if I put name back in, then when it's going through the query it's basically going to say, "Hey, I want this to be there as well," and that works.

Into this function we're passing an object. This could be a very simple, plain old JavaScript object, but we're passing in a class instance, and the reason for that will become clear in a bit. This bit is the only field. Instead of having this be an async function, we could have just had a string literal that is returned from it, and that would have worked. Or it could have been an array of answer objects, and that would have worked. We could have basically hardcoded it. What we're doing instead is returning a promise for what is to come, and that is going to enable us to take all these queries that we have, batch them, and parallelize them. As you can see, this resolve has a one-to-one mapping with the query as well. The arguments object that we have here is going to have a name and a type.

Let's go back to the event thing. Why are we doing this? When we're constructing this root object, what we're doing is setting the resolvers field, and that resolvers field is a DataLoader. This is one of the patterns that the Facebook folks came up with, and they use it very heavily in their backend services. As you can see, it takes an object called keys; we don't know what it is, but we're pretty sure those are keys. It has a batch resolver: DataLoader does batch resolving by default and doesn't do single resolves, and there's a very good reason for that as well.

Let's go into that DataLoader thing and see what it does, and more specifically why we're passing this event into the batch resolver. In the batch resolver, we're just going to take our keys, iterate through them, and resolve them one by one by passing the IDs.
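To make the batching idea concrete, here is a hand-rolled sketch of a loader that collects every key requested in the same tick and hands them to one batch function. It is not the DataLoader library itself, and `TinyLoader`, `BatchFn`, and `Pending` are names invented for this illustration; the real library adds caching and many more options.

```typescript
// A batch function receives all queued keys and returns values in the same order.
type BatchFn<K, V> = (keys: K[]) => Promise<V[]>;

interface Pending<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (reason: unknown) => void;
}

class TinyLoader<K, V> {
  private queue: Pending<K, V>[] = [];
  private batchFn: BatchFn<K, V>;

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  // Each call returns a promise right away; the actual work is deferred.
  load(key: K): Promise<V> {
    return new Promise<V>((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      if (this.queue.length === 1) {
        // First key of a new batch: flush once the current synchronous code is done.
        Promise.resolve().then(() => this.flush());
      }
    });
  }

  private flush(): void {
    const batch = this.queue;
    this.queue = [];
    this.batchFn(batch.map((p) => p.key)).then(
      (values) => batch.forEach((p, i) => p.resolve(values[i])),
      (err) => batch.forEach((p) => p.reject(err))
    );
  }
}
```

Because each resolver only returns a promise, the GraphQL executor can visit every aliased field first, and all of their keys end up in a single call to the batch function.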

This is literally the only thing that actually does any queries outside. What we're doing here is creating a new request. Again, this is all on the Mozilla web API docs page; I think everyone has converged on using MDN as the source of truth for the web APIs. We're saying this is an application/dns-json request that we're making. We're making it to cloudflare-dns.com, and we put the name and the type in as query strings for this request. This next bit was introduced two days ago. This is the very, very shiny new Cache API that we have, and Rita tweeted about this event. She has a post on the blog about how to use this. It's really, really cool. This is the bit that we've been talking about, where we want you to think about, "Do I really have to make this request to IBM Watson or Google's machine learning service and get it back? If I have to do it again, then I have to pay 5 cents or however much you have to pay for it." No, you don't have to.
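The outgoing DNS request described above could be assembled roughly like this. It is a sketch under assumptions: the `/dns-query` path on cloudflare-dns.com and the `application/dns-json` accept header reflect Cloudflare's public DNS-over-HTTPS JSON endpoint as commonly documented, and `buildDnsQueryUrl` and `dnsJsonHeaders` are hypothetical names rather than the talk's code.

```typescript
// Build the URL for a DNS JSON lookup; name and type go in the query string.
function buildDnsQueryUrl(name: string, type: string): string {
  const url = new URL("https://cloudflare-dns.com/dns-query");
  url.searchParams.set("name", name);
  url.searchParams.set("type", type);
  return url.toString();
}

// Header that asks the resolver for a JSON answer.
const dnsJsonHeaders = { accept: "application/dns-json" };
```

A Worker would then pass this URL and header to `fetch`, and the resulting request is also what gets used as the key when looking in the cache.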

What you do is open up your cache, which is local to whatever PoP you're connected to, and then you match it with the request. This is why I'm constructing a new request up here. I go, "If the response is empty, I want to fetch this." This is also why I've been propagating the event all the way down to here, where I call event.waitUntil on the cache put.

What this basically says to the Worker script is, "Hey, you can start writing." If I've received all my fetches, and maybe six of these requests are batched together, a couple of them cached and a couple of them not, you can start streaming the response back as soon as you have everything you want ready. I'm going to keep this script working so it can cache all the responses that you've received. The next time someone comes in, these are all cached. We wait for the response and then we return it. That's pretty much all you have to do to get GraphQL on Workers going.
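The match, fetch, and deferred put sequence can be sketched with small stand-in interfaces. `CacheLike` and `EventLike` here are simplified assumptions that work on strings; the real Workers Cache API operates on Request and Response objects, and `cachedLookup` is a hypothetical name.

```typescript
// Simplified stand-ins for the cache and the fetch event (assumptions, not the real API).
interface CacheLike {
  match(key: string): Promise<string | undefined>;
  put(key: string, value: string): Promise<void>;
}

interface EventLike {
  // Lets work continue after the response has started going back to the user.
  waitUntil(work: Promise<unknown>): void;
}

async function cachedLookup(
  key: string,
  cache: CacheLike,
  event: EventLike,
  fetchOrigin: (key: string) => Promise<string>
): Promise<string> {
  const hit = await cache.match(key);
  if (hit !== undefined) {
    return hit; // served from the cache, no outside request
  }
  const fresh = await fetchOrigin(key);
  // Do not make the user wait for the write; just keep the script alive until it finishes.
  event.waitUntil(cache.put(key, fresh));
  return fresh;
}
```

The point of handing the put to `waitUntil` rather than awaiting it is that the response is not held up by the cache write.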

There are a couple of other things that we do. Because the keys that we were passing into DataLoader were objects and not single strings, what we're doing is basically concatenating two strings to say, "This is the domain name that I'm getting; this is going to be the cache key." Then we're basically saying, "I'm going to use a simple Map object that I have here as my in-memory cache." Then we stringify the response, and that's our query result. That is literally all the code you need to get GraphQL working on Cloudflare Workers. That's me. The demo that I just gave is online here. The code is available on GitHub if you want to go and check it out. Like I promised, literally all you have to do is npm install, npm run build, npm run preview. Getting this going on the Cloudflare Workers playground literally takes less time than my entire talk. That's it. Any questions? Also, this is where the documentation lives. If you want to, you can email me at jake@cloudflare.com, and I'm on Twitter and GitHub as well.
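The reason for building a string key is that a `Map` compares object keys by identity, so two separate `{ name, type }` objects with the same contents would never hit the same entry. A minimal sketch, where `DnsKey`, `cacheKey`, `memo`, and `remember` are hypothetical names and the `/` separator is an arbitrary choice:

```typescript
interface DnsKey {
  name: string;
  type: string;
}

// Two lookups share a cache entry only if both the name and the type match.
function cacheKey(key: DnsKey): string {
  return `${key.name}/${key.type}`;
}

// Simple in-memory cache keyed by that string.
const memo = new Map<string, string>();

function remember(key: DnsKey, compute: (key: DnsKey) => string): string {
  const id = cacheKey(key);
  const cached = memo.get(id);
  if (cached !== undefined) {
    return cached;
  }
  const value = compute(key);
  memo.set(id, value);
  return value;
}
```

An in-memory map like this only lives as long as the running script instance, which is why it complements rather than replaces the cache at the PoP.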

Moderator: I'll now hand it over to our final speaker who has flown here from Madrid to speak not necessarily on our behalf, but his own behalf about his interesting workers use cases out in the real world. Paddy Sherry is a workers expert. We wanted him to come here and speak this evening. Thank you for coming here, Paddy.

[applause]

Paddy Sherry: Hello. So briefly about me. I work for Gambling.com Group. I'm a lead developer there. It doesn't have the quick switch. Anyway, we operate in the online gambling industry, and we do performance marketing. What that means is, we create lots of websites that offer reviews of online casinos, and present unbiased reviews, so people can come on and see which casinos are the best, and choose one based on their preferences. What we build on a daily basis are a global network of multilingual websites. We currently have 55 websites online. Another one just went online today, and they're based all around the world. The US, Australia, Europe, we have most continents, there's a website we're running there. We run everything through Cloudflare. Just get on to the tech stack.

These are some of our sites: gambling.com, bookies.com, casinosource.co.uk. Our tech stack is entirely static websites served by the Cloudflare edge locations. There's no processing done on our servers when a user requests a page; everything is served by Cloudflare at their edge immediately. It basically means that we have an extremely fast collection of websites and we have minimal server cost, because Cloudflare is serving everything for us. Here's just a little graphic of basically what happens.

Obviously, the first request needs to get to a server so Cloudflare can cache it, but after that every single user, no matter where they are, is getting served the website from a location close to them. It just means our platform is-- I don't want to say infinitely scalable, but it's pretty robust.

The benefit of building our system this way is that it's fast. One of the number one things we're concerned with is speed, getting the website loading fast for users, so we factor that into every technical decision we make, and the primary benefit is that we have extremely fast websites.

They are also secure. Given all of the Cloudflare security features, we really don't have to worry too much about that side. They are served from the edge, so there's nothing getting through to our server; we just let Cloudflare handle all of that. The combination of static HTML and speed is very SEO friendly, and that means we have somewhat of a head start in trying to rank highly in Google because of the technology choices we make.

We do have some limitations, though, and the primary one is that it's static. Everything is the same for every user. When someone visits the homepage of a site, it doesn't matter where they are coming from; they all see the same content. Until now we've been doing okay with that, but about one year ago we decided that we needed to start making these things dynamic and personalized in order to stay ahead of the competition. That's the primary problem that I faced as lead developer: trying to find a way to make all our static websites dynamic. Initially, I felt we were going to have to partly rebuild our platform and move to server-rendered sites and have servers all over the world, or use AWS to try and do it efficiently.

For a while we were in a bit of a deep period of research about how we were going to do this, and then we heard about Workers. We were a Cloudflare customer and we follow the blog, and Workers was mentioned and straight away caught our attention. We applied to be part of the beta, and as soon as it came out we started using it and experimenting with it, and very quickly, within a couple of hours, we could see this was going to solve some problems for us. Given that, we set about trying to do things with Workers.

The first one was geo-targeting. As I said, our websites rank very well in Google, but usually that is the English version of the site. One of our websites is gambling.com, and the homepage is tailored for a UK audience, so it features UK casinos. Let's say some guy from the US lands on gambling.com: that content is of no benefit to him, because the prices are in pounds, not dollars. Let's say someone from Italy lands on the site: he may not even be able to read it, because it's not in Italian and the prices are in pounds instead of euros. We were losing a lot of traffic that was coming to the sites because the content wasn't tailored to what those users need, so we tried to solve this with Workers.

The way we did it is this: Cloudflare gives you the option of detecting the country of the request, so we just created a simple Worker that checks the incoming country. It's a two-letter code. If we have a local version of the site, then we offer the user a redirect if they want it. Let me give you a little example of what it looks like. On the left is the page, and you can see it's entirely English. The currency is pounds. If a guy comes in from Italy, that's of no benefit, but once we deployed Workers, we were able to show a banner at the top of the page which, for anyone that can't read it, basically says we have a local version of the site, would you like to go there?
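The decision the Worker makes can be sketched as a pure function from the two-letter country code to an optional local site. The site map, the example.com URLs, and the `localVersionFor` name are hypothetical; in a Worker the country code would be read from the request, for example from the `CF-IPCountry` header that Cloudflare adds.

```typescript
// Hypothetical map from two-letter country code to a local version of the site.
const localSites = new Map<string, string>([
  ["IT", "https://example.com/it/"],
  ["US", "https://example.com/us/"],
]);

// Return the local site to offer in a banner, or null when no banner is needed.
function localVersionFor(country: string | null, defaultCountry: string): string | null {
  if (country === null) {
    return null; // country unknown: leave the page as it is
  }
  const code = country.toUpperCase();
  if (code === defaultCountry) {
    return null; // visitor is already on the version meant for them
  }
  return localSites.get(code) ?? null;
}
```

Returning a URL to offer, rather than redirecting outright, matches the banner approach described: the user chooses whether to switch.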

So now, when someone comes from another country and lands on our site, we can show them a message and offer them the option of going to a more relevant version for them. The benefits of this are that we're not losing traffic and our bounce rate is not going up, because people are actually staying on the site when they come, and also users are getting what they want, so we're providing a better user experience via Workers.

The second use case is restricting access to content. I know some companies have very high requirements when it comes to security and protecting their systems, but sometimes military-grade security isn't really required, and I'll give an example. I submitted a guest post on the Cloudflare blog, and while that was still in draft it was available on the internet. It was cloudflare.com/p/some random string. The content was nowhere near finished, there were spelling mistakes and everything, but people were able to access it before it was ready. Some people within our company actually started linking to that article and promoting it because they saw that we got mentioned on the Cloudflare blog, and it was nowhere near ready. It was just open to the public, and that's a situation we occasionally have ourselves.

Let's say, for example, we have a guest author who contributes an article and maybe we want to just put it online for that guest author to approve before we make it available to the public. What you can do with Workers is you can just say: if there's a parameter in the URL, allow access to the content. If it's not there, block them. Here's what it looks like. Without Workers, everyone can access the page. With Workers, we can block people unless they add ?preview=true onto the end of the URL. It's just a simple way of firstly preventing people seeing content before they should, and secondly preventing it being crawled by Google before we want it to be crawlable. The benefits are that it was extremely simple to implement and it took 10 minutes. It's really easy for other people to understand how they need to manipulate the URL to get to the page.
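The preview check could be sketched roughly as follows. The `?preview=true` parameter comes from the talk; the `/drafts/` path prefix, the 404 status, and the function names are assumptions made for illustration. As the speaker notes, this is a convenience gate rather than real security.

```javascript
// Hypothetical path prefix for content that is not yet public.
const DRAFT_PREFIX = "/drafts/";

// True when the URL points at draft content and lacks ?preview=true.
function shouldBlock(urlString) {
  const url = new URL(urlString);
  const isDraft = url.pathname.startsWith(DRAFT_PREFIX);
  const hasPreview = url.searchParams.get("preview") === "true";
  return isDraft && !hasPreview;
}

async function handlePreviewRequest(request) {
  if (shouldBlock(request.url)) {
    // Assumption: a 404 with a noindex hint keeps visitors and crawlers out.
    return new Response("Not found", {
      status: 404,
      headers: { "X-Robots-Tag": "noindex" },
    });
  }
  return fetch(request);
}

// Only register the handler where a Worker-style runtime provides it.
if (typeof addEventListener === "function") {
  addEventListener("fetch", (event) => {
    event.respondWith(handlePreviewRequest(event.request));
  });
}
```

With this in place, `https://example.com/drafts/guest-post` would be blocked, while `https://example.com/drafts/guest-post?preview=true` would pass through to the origin.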

The third use case is A/B testing. We like to experiment with lots of things like changing layouts, and we use A/B testing tools to do that. There are some very good ones out there, but the problem is there's always a JavaScript snippet that you add to the page. Then, once the user loads the website, for a brief moment they can see the original version and then the variant will snap into place. I think the correct term is flicker. It's a really bad experience and something we try hard to prevent, but we still need to A/B test to get the insights that it will provide.
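One way a Worker could avoid that flicker is to choose the variant at the edge and rewrite the HTML before it reaches the browser. The sketch below is an illustration under assumptions, not the speaker's implementation: the `ab_variant` cookie, the 50/50 split, and the `cta` class swap are all hypothetical.

```javascript
// Hypothetical cookie that remembers which version a visitor was given.
const COOKIE_NAME = "ab_variant";
const COOKIE_PATTERN = new RegExp(
  "(?:^|;\\s*)" + COOKIE_NAME + "=(control|variant)(?:;|$)"
);

// Read a previously assigned variant from the Cookie header, if any.
function readVariant(cookieHeader) {
  const match = COOKIE_PATTERN.exec(cookieHeader || "");
  return match ? match[1] : null;
}

// Keep returning visitors in their group; split new visitors 50/50.
function assignVariant(cookieHeader, random = Math.random) {
  return readVariant(cookieHeader) || (random() < 0.5 ? "control" : "variant");
}

// Hypothetical change under test: restyle the call-to-action button.
function applyVariant(html, variant) {
  if (variant !== "variant") return html;
  return html.replace('class="cta"', 'class="cta cta--green"');
}

async function handleAbRequest(request) {
  const variant = assignVariant(request.headers.get("Cookie"));
  const response = await fetch(request);
  const type = response.headers.get("Content-Type") || "";
  if (!type.includes("text/html")) return response;

  const html = applyVariant(await response.text(), variant);
  const headers = new Headers(response.headers);
  headers.delete("Content-Length"); // the body length may have changed
  headers.append("Set-Cookie", `${COOKIE_NAME}=${variant}; Path=/`);
  return new Response(html, {
    status: response.status,
    statusText: response.statusText,
    headers,
  });
}

// Only register the handler where a Worker-style runtime provides it.
if (typeof addEventListener === "function") {
  addEventListener("fetch", (event) => {
    event.respondWith(handleAbRequest(event.request));
  });
}
```

Because the page already contains the variant when it arrives, nothing has to change in the browser after load.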

Without Workers, users would see something like this. They would see the original and then the variant would snap into place, but with Workers we can detect the response coming in, manipulate the page, load the variant and send it all back to the user with the variant already in place. There's no snap, and the user is not trying to click on something that is going to disappear afterwards. It's just a much better experience. We still get all the benefits of A/B testing and the data that it gathers for us.

Then the final one, which is not something we're doing in production yet, but it's something we're working on. We have this website called footballscores.com. As you can imagine, it shows live football scores from around the world, and it's currently running as a traditional server-rendered site. We don't like that because it's not scalable and we really like doing static websites. With a static website, we can't show live scores. We're finding a way to make this a static website that is dynamic using Workers. When a user loads the page, the Worker script will go off to an API, get the score data and bring it back into the page. When the page loads, the user will see what's on the right: the scores already loaded.

Now, of course, we could do this without Workers by firing a JavaScript request after the page loads, but then users would see this loading indicator and then the content would snap in. As I said, we don't like things changing after the page loads. With Workers we will be able to fetch data from an API, inject it into the page, and users will have it loading incredibly fast because it's a static site and it's served from an edge location close to them.
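The "fetch from an API and inject into the static page" idea could be sketched like this. Everything specific here is an assumption: the API URL, the JSON shape (an array of objects with `home`, `away`, `homeGoals`, `awayGoals`), and the `<!-- live-scores -->` placeholder in the static HTML.

```javascript
// Hypothetical scores endpoint and placeholder marker in the static page.
const SCORES_API = "https://api.example.com/live-scores";
const PLACEHOLDER = "<!-- live-scores -->";

// Escape text before placing it inside HTML.
function escapeHtml(text) {
  const entities = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Render the assumed JSON shape as a simple list.
function renderScores(matches) {
  const rows = matches.map(
    (m) =>
      `<li>${escapeHtml(m.home)} ${Number(m.homeGoals)} - ` +
      `${Number(m.awayGoals)} ${escapeHtml(m.away)}</li>`
  );
  return `<ul class="scores">${rows.join("")}</ul>`;
}

// Swap the placeholder for rendered scores; no placeholder means no change.
function injectScores(html, matches) {
  return html.replace(PLACEHOLDER, () => renderScores(matches));
}

async function handleScoresRequest(request) {
  // Fetch the static page and the score data in parallel.
  const [page, api] = await Promise.all([fetch(request), fetch(SCORES_API)]);
  const type = page.headers.get("Content-Type") || "";
  // If the API fails or this is not HTML, serve the static page untouched.
  if (!api.ok || !type.includes("text/html")) return page;

  const [html, matches] = await Promise.all([page.text(), api.json()]);
  const headers = new Headers(page.headers);
  headers.delete("Content-Length"); // the body length has changed
  return new Response(injectScores(html, matches), {
    status: page.status,
    statusText: page.statusText,
    headers,
  });
}

// Only register the handler where a Worker-style runtime provides it.
if (typeof addEventListener === "function") {
  addEventListener("fetch", (event) => {
    event.respondWith(handleScoresRequest(event.request));
  });
}
```

Falling back to the untouched static page when the API call fails keeps the site available even if the score feed is down.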

Those are our use cases. The roadblock that Workers removed for us is that it allowed us to make our static sites dynamic. We avoided having to undertake a major architecture change to server-rendered sites or some other technology, and it solved our problem a lot quicker than we expected. That's not to say we didn't look at alternatives. The first one we looked at was Lambda. We investigated that. We also looked at Netlify, which is a platform for creating static sites that integrates closely with Lambda, and then the Google and Microsoft offerings.

What we found was that the implementation of Workers was incredibly simple. In the back end, in the Cloudflare dashboard, before they integrated with the Serverless framework, we could just click Workers, launch, and you get a little window. You could be coding in a couple of seconds, finish the Worker in 10 minutes and deploy it to your production website with no problems. Now, obviously with that flexibility comes great responsibility, because you could easily leave a character out of place and take the site down. We have to be very careful with it, but it does give us the ability to do things very fast. That's why we went for Workers over any of the other offerings out there.

As I said, it's easy to implement on top of our architecture, with no additional cost, something like $5 a month. Nothing too major, but there's also the speed. From the Cloudflare blog, we can see that it's faster than Lambda, it's faster than the others. Speed is, as I said, one of the number one things we're concerned with. If there's any possible way to shave a millisecond, we will do that and we'll choose the right tool for the job.

Given Workers are a recent development, they're not perfect, and there are some things that we think would really help them develop and become more of a mainstream technology. One is access to more Worker scripts within the back end. Right now, we can only access one. If we could access multiple, it would be great. Having all of the code for a site in one file makes it hard to navigate.

More documentation will be good when that comes. Recipes, so just snippets of code that you can use to do regular things without having to write it all from scratch. Integrations with other tools, for example, an integration with Google Firebase, would be great. If we want to integrate with an API, we have to do that all manually in JavaScript. It would be cool if, for some common tools that people want to integrate with, we could just do that with the click of a button. Also, we would really love to see databases on the edge. With any of our sites, if they need to get data, the Worker is still going to have to send a request back to an origin server, which could be on the other side of the world, so the benefit of Workers is somewhat lost. If we could get the database closer to the user without having to have database servers all over the world, I think we would have our sites as fast as they could possibly get. Those are some things that we think would really help, along with some extra logging so we can see exactly what's going on.

That's it.

[END OF AUDIO]