Multiplayer Doom on Cloudflare Workers

2021-05-18

10 min read

There are halls and corridors in Cloudflare engineering, dangerous places for innocent wanderers, filled with wild project ideas, experiments that we should do, and extremely convincing proponents. A couple of months ago, John Graham-Cumming, our CTO, bumped into me in one of those places and asked: "What if we ported Doom multiplayer to work with our edge network?". He fatally nerd-sniped me.

Aside by John: I nerd-sniped him because I wanted to show how Cloudflare Workers and Durable Objects are a new architectural paradigm where, rather than choosing between two places to write code (the client, the browser or app, and the server, perhaps in a cloud provider availability zone), there’s a third way: put code on the edge.

Writing code that runs on a client (such as JavaScript that runs in a browser or a native app on a phone) has advantages. Because the code runs close to the end-user, it can be highly interactive: there’s almost no latency since it’s literally running on the device. But client-side code has security problems: it’s literally in the hands of the end-user and thus can be reverse engineered or modified. And client-side code can be slow to update, as it depends on the end-user. If the client is actually an IoT device, updates can be even slower or may essentially never happen.

On the other hand, code running on a server is quick to update and secure. You can keep secrets in that code, and modify it at will. But it tends to be far from the end-user which limits interactivity because of latency.

The edge offers advantages from both worlds: the code is secure and can contain secrets, since it runs in a controlled environment and not on the end-user device; the code can be updated quickly, since it doesn’t depend on the end-user; and the edge is close to end-users, meaning low latency and the opportunity for high interactivity.

Games are fun, interactive and tend to stress systems. Doom Multiplayer was the perfect example because it demonstrates the power of WebAssembly to run code in the browser and the power of Durable Objects to provide the backend for multiplayer games with low latency.

So while you’re enjoying multiplayer Doom running in the browser and coordinated by Durable Objects and Cloudflare Workers, try to keep in mind this new architecture for applications. And we’ve open sourced the code for all of it: Doom running in Wasm, the associated website and the message routing code that runs on the edge using Cloudflare Workers and Durable Objects.

Back to Celso...

Doom is considered the first popularized FPS (first-person shooter) game of all time. When id Software launched it in 1993, it boasted 3D graphics, spatial navigation, networked multiplayer, and an open format (WAD) to define levels, sprites, and all sorts of game modifications, also known as Doom modding. It became an instant revolution and sparked both a new game industry and a global community of gamers and Doom fans that lasts up until today.

If you want to read more about this period and how the "Two Johns" changed everything in gaming back in the 90s, I highly recommend you get a copy of the book Masters of Doom.

But Doom has gone beyond the gaming community. In 1997, John Carmack, still at id Software, open-sourced the original code and presented it as "a useful base to experiment and build on." Little did he know. Since then, his masterpiece has been ported to run on pretty much anything that has a CPU in it, including a pregnancy test and a spectrum analyser. Running Doom is effectively the new “Hello, World” in computing.

A few weeks ago, we introduced a major feature in our Workers edge computing platform: WebSockets support. Not only that, but we improved Durable Objects to seamlessly interface with WebSocket connections and provide state and storage capabilities. This opened a large door to start implementing real-time, interactive applications that can communicate over persistent data connections, all on top of Workers.

So, if a developer wants to run their next hit, say a real-time multiplayer game, using nothing but their app and Workers, without any servers or traditional infrastructure, can they? Let’s prove they can.

Here's Doom Multiplayer running on top of Cloudflare Workers showcasing this scenario…

https://silentspacemarine.com/

...and read on to find out how we did it.

Keeping it simple

We wanted this demo to be as simple as possible, to run in the browser, end to end, with no downloads or native binaries required. This meant one first task: porting Doom to compile to WebAssembly, an open standard that defines a portable binary-code format for executable programs and enables high-performance applications in multiple environments, including web pages on modern browsers.

We looked around for other Wasm Doom ports that included the network layer but didn't find any. It was time to get our hands dirty and do it ourselves. How hard could it be?

Enter the world of Emscripten, “a complete compiler toolchain to WebAssembly, using LLVM” that provides Web support for popular portable APIs such as SDL, effectively allowing ports of complex applications such as Doom.

We also found Chocolate Doom, a modern, community-driven, well-maintained port that aims to reproduce the original DOS version of Doom and has a few bonuses, like networked multiplayer support, and a decent, readable, and modular codebase. Perfect match.

Long story short, we got Emscripten to compile Chocolate Doom a few days later. It felt like magic when we saw Doom running in a browser window for the first time. The magic is WebAssembly.

It cost us a few head bangs on the wall, but we learned a lot in the process. Here are a few things I think you’ll find interesting:

No main() loop

Unlike many C/C++ graphical apps, such as games, code running on a web page can't sit in an infinite loop. The browser environment is event-driven: once an event is processed, the code needs to return control to the browser so that the next event gets its turn.

Emscripten provides a function to solve this problem. With emscripten_set_main_loop(), you can have your infinite loop running at, say, 60 frames per second, but the other browser functions can still run between iterations without hanging the page.

In d_main.c

while (1)
{
	D_RunFrame();
}

Became:

emscripten_set_main_loop(D_RunFrame, 0, 0);
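To make the idea concrete, here is a minimal JavaScript sketch of what handing the loop over to the browser amounts to. It is an illustration, not Emscripten's implementation: startMainLoop, runFrame, and the injectable schedule parameter are hypothetical names, and we assume that in a browser schedule would be requestAnimationFrame.

```javascript
// Hypothetical sketch: run one frame per scheduler callback instead of
// spinning in while (1), so control returns to the browser between frames.
function startMainLoop(runFrame, schedule = globalThis.requestAnimationFrame) {
  let running = true;

  function tick() {
    if (!running) return;
    runFrame();      // one iteration of the former infinite loop
    schedule(tick);  // ask to be called again, then return control
  }

  schedule(tick);

  // The returned function stops the loop before the next frame.
  return () => { running = false; };
}
```

Each frame does its work and then returns, so input events, rendering, and WebSocket callbacks all get a chance to run between iterations.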

No UDP

The original MS-DOS Doom used the IPX networking protocol from Novell NetWare. Later versions and current ports migrated to UDP (though some still maintain compatibility with IPX, because why not?). There's a good reason why action games use UDP (instead of TCP) for their multiplayer protocols. UDP is simple, fast, and non-blocking. However, you need to deal with packet loss or out-of-order packets at the application level. You're basically trading guarantees and reliability for responsiveness, low latency, and speed.

Emscripten has some wrappers to emulate POSIX TCP sockets over WebSockets, but not UDP. Also, we don't want UDP; we want to use WebSockets, which in turn use TCP.

It became evident that we'd have to write a new Chocolate Doom network module to use TCP and WebSockets instead of UDP while working to mitigate the problems mentioned above.

Enter the net_websockets.c driver

We had to deal with a few challenges to make the WebSockets module work the way we wanted. The first was the network topology:

On the left, let’s imagine there are four players in the original LAN topology: the server (the one who starts the game) and the other three players who connect to the server via UDP. If you wanted to play Doom over the Internet, you'd have to make sure all players had public IP addresses and specific firewall and forwarding rules configured. Chocolate Doom even has some hole punching code in it. We don't want to deal with any of this.

On the right, in our case, we want everything to be Web-based and zero-config, so all of the four players connect to a single WebSockets server that acts as a message router between the group.

We won’t have IP addresses anymore, just a WebSocket connection through which we need to tunnel our UDP-based point-to-point protocol. To emulate the standard behaviour, we create a fake IP at startup, just a random uint32_t number, and we add ourselves, along with the other clients we discover in the WebSocket protocol and have to talk to, to an internal routing table.

// Generate UID for this instance - it will be used as the websocketsID
srand((unsigned int)time(NULL));
instanceUID = rand() % 0xfffe;

// addresses table
static int addrs_index = 0;
net_addr_t addrs[MAX_QUEUE_SIZE];
uint32_t ips[MAX_QUEUE_SIZE];

// add a new fake ip to the routing table
addrs[addrs_index].refcount = 1;
addrs[addrs_index].module = &net_websockets_module;
ips[addrs_index] = ip;
addrs[addrs_index].handle = &ips[addrs_index];
return (&addrs[addrs_index++]);

Next, we wanted to avoid blocking Doom at all costs, so we created an intermediate queue between the asynchronous WebSockets layer and the internal Doom routines to act as a buffer and alleviate the fact that, indeed, we'll be using TCP. This worked nicely.

// WebSockets packet queue
typedef struct {
    net_packet_t *packets[MAX_QUEUE_SIZE];
    uint32_t froms[MAX_QUEUE_SIZE];
    int head, tail;
} packet_queue_t;

// Pushes one packet into the queue
static void WebsocketsQueuePush(packet_queue_t *queue, net_packet_t *packet, uint32_t from);

// Pops one packet from the queue
static ws_packet_t *WebsocketsQueuePop(packet_queue_t *queue);

This queue is then fed every time we receive a new WebSockets packet, asynchronously, outside the game main loop and logic.

WebSocketMessage(int t, const WebSocketMessageEvent *e, void *data) {
    ...
    packet = NET_NewPacket(e->numBytes - 4);
    memcpy(packet->data, &e->data[4], e->numBytes - 4);
    ...
    WebsocketsQueuePush(&client_queue, packet, ip);
}

emscripten_websocket_set_onmessage_callback(websocket, (void *)45, WebSocketMessage);

And then, when the main game loop asks for newly received packets as part of its “main loop”, we’re now able to instantly return everything we see in our queue, close to how UDP would behave.

NET_Websockets_RecvPacket(net_addr_t **addr, net_packet_t **packet)
{
    ws_packet_t *popped;

    if (InitWebSockets() == false) return false;

    popped = WebsocketsQueuePop(&client_queue);

    if (popped != NULL) {
        *packet = popped->packet;
        *addr = FindAddressByIp((*(uint32_t *)(popped->from)));
        return true;
    }

    return false;
}
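The same producer/consumer pattern can be sketched in JavaScript. This is a simplified illustration rather than the driver's actual logic: PacketQueue and its methods are hypothetical names, and dropping packets when the buffer is full is our assumption, chosen because it resembles how UDP would behave.

```javascript
// Hypothetical ring buffer mirroring the head/tail queue described above.
class PacketQueue {
  constructor(capacity) {
    this.slots = new Array(capacity);
    this.head = 0; // next slot to write
    this.tail = 0; // next slot to read
    this.size = 0;
  }

  // Called from the WebSocket message callback; never blocks.
  push(packet, from) {
    if (this.size === this.slots.length) return false; // full: drop (assumption)
    this.slots[this.head] = { packet, from };
    this.head = (this.head + 1) % this.slots.length;
    this.size++;
    return true;
  }

  // Called from the game loop; returns null immediately when empty.
  pop() {
    if (this.size === 0) return null;
    const entry = this.slots[this.tail];
    this.slots[this.tail] = undefined;
    this.tail = (this.tail + 1) % this.slots.length;
    this.size--;
    return entry;
  }
}
```

The key property is that neither side ever waits: the network callback only writes, the game loop only reads, and an empty queue simply means "nothing new this frame".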

Finally, we use NET_Websockets_SendPacket() and emscripten_websocket_send_binary() to send the traffic out. The messages are sent through the WebSocket using a simple envelope that contains the “To” and the “From” fake IPs, in that order, 4 bytes each (UInt32 little-endian), followed by the original packet from the Doom protocol.

NET_Websockets_SendPacket(net_addr_t *addr, net_packet_t *packet)
{
    char *wspacket = malloc(packet->len + 8);
    to_ip = (*(uint32_t *)(addr->handle));
    memcpy(&wspacket[0], &to_ip, 4);       // to
    memcpy(&wspacket[4], &instanceUID, 4); // from
    memcpy(&wspacket[8], packet->data, packet->len);
    emscripten_websocket_send_binary(websocket, wspacket, packet->len + 8);
    free(wspacket);
}
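For readers more at home in JavaScript, the envelope can be sketched with a DataView. The layout follows the C code above ("To" at offset 0, "From" at offset 4, payload from offset 8); the function names packEnvelope and unpackEnvelope are hypothetical.

```javascript
// Envelope layout: [to: uint32 LE][from: uint32 LE][original Doom packet]
const HEADER_BYTES = 8;

// payload is a Uint8Array (or array of bytes) holding the Doom packet.
function packEnvelope(to, from, payload) {
  const out = new Uint8Array(HEADER_BYTES + payload.length);
  const view = new DataView(out.buffer);
  view.setUint32(0, to, true);   // true = little-endian
  view.setUint32(4, from, true);
  out.set(payload, HEADER_BYTES);
  return out;
}

// bytes is a Uint8Array containing a full envelope.
function unpackEnvelope(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return {
    to: view.getUint32(0, true),
    from: view.getUint32(4, true),
    payload: bytes.subarray(HEADER_BYTES),
  };
}
```

Passing the endianness explicitly keeps the encoding independent of the platform the code happens to run on.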

The message router in Cloudflare Workers, explained below, then keeps its own table of connections and uses the address headers to, well, route the incoming messages to the corresponding client.

We also added a few options to the original Chocolate Doom code base, like choosing your name from the command line, and support for multiple keys for the same action.

The open-sourced code and instructions to compile and run Wasm Doom locally can be found in this repo.

The Message Router

The message router runs on Cloudflare's edge and handles the incoming and outgoing messages between the clients and the server. It has the following requirements:

  • Accept WebSocket connections

  • Build a routing table that maps a connection to a “From” address

  • Receive and parse the incoming messages

  • Broadcast the messages to the corresponding clients

  • Handle some REST APIs to create and validate Doom rooms (game sessions)

We're using Cloudflare Workers’ WebSockets support to handle the client connections and Durable Objects to provide state across them for the message router.

WebSockets are persistent connections between the client and, in this case, Cloudflare's edge. Inside the connection, both ends can exchange data fast, whenever they want, without additional overhead, which makes them ideal for real-time interactive applications like chats, sports results, stock information, or gaming.

A Durable Object is simply a class in our Worker code that contains data and methods to access and manipulate that data and other logic. You interact with your Durable Object through the well-known Fetch API, just like any regular Worker.

Let's go through some code:

This is the default Worker export. All it does is hand things over to handleApiRequest():

export default {
    ...
    return handleApiRequest(url.pathname, request, env)
    ...
}

Each multiplayer game session creates a unique Id, or "room", which corresponds to its own Durable Object instance. This Id is used in the URL when the user invites friends to their new game.

handleApiRequest() validates the room Id in the URL and forwards the API request or WebSocket connection to the corresponding Durable Object.

There's a small performance trick in here. There are two ways you can ask Workers to generate a unique Durable Object Id. One is by using newUniqueId(), which is very fast; another one is to use idFromName(), which derives the Id from your arbitrary input string. The latter can be convenient for some use-cases, but it requires our platform to do a global network lookup, which can be expensive time-wise. We're using newUniqueId(), and we construct our room Id with it (Thanks, Kenton Varda).

async function handleApiRequest(path, request, env) {
  ...
  switch (parts[1]) {
    case 'ws':
    case 'room':
      room = await checkRoom(parts[2], env)
      if (room) {
        let id = env.router.idFromString(room)
        let routerObject = env.router.get(id)
        return routerObject.fetch(path, request)
      }
      // invalid room: stop here instead of falling through to 'newroom'
      break
    case 'newroom':
      room = await createRoom(env)
      return jsonReply({ room: room }, 200)
  }
}

Here’s how we create a new room Id:

async function createRoom(env) {
  const room = env.router.newUniqueId().toString()
  const digest = await crypto.subtle.digest({ name: 'SHA-256' },
    new TextEncoder().encode(room + env.DOOM_KEY),
  )
  const hash = Array.from(new Uint8Array(digest))
  const hex = hash.slice(0, 4).map(b => b.toString(16)
    .padStart(2, '0'))
    .join('')
  return `${room}-${hex}`
}
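The post does not show checkRoom(), so the following is only a guess at its counterpart: split the room Id at the last dash, recompute the hash from the first part plus the secret, and compare the 8-character tag. All three function names are hypothetical, and our reading that the tag exists to reject made-up room Ids cheaply is an assumption.

```javascript
// Hypothetical helpers sketching how a room Id from createRoom() could be
// validated. Not taken from the actual repository.

// Split "<id>-<tag>" at the last dash; returns null if malformed.
function splitRoomId(room) {
  const dash = room.lastIndexOf('-');
  if (dash <= 0 || dash === room.length - 1) return null;
  return { id: room.slice(0, dash), tag: room.slice(dash + 1) };
}

// First four digest bytes as eight lowercase hex characters.
function tagFromDigest(digest) {
  return Array.from(new Uint8Array(digest).slice(0, 4))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
}

// Resolves to the Durable Object Id string if the tag matches, else null.
// Assumes a runtime with the Web Crypto API available as a global.
async function checkRoomSketch(room, key) {
  const parts = splitRoomId(room);
  if (!parts) return null;
  const digest = await crypto.subtle.digest(
    'SHA-256',
    new TextEncoder().encode(parts.id + key),
  );
  return tagFromDigest(digest) === parts.tag ? parts.id : null;
}
```

With a check like this, a request for a room Id that was never issued can be turned away in the Worker itself, before any Durable Object is looked up.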

Finally, our Router takes over and handles the WebSockets messages. The webSocket.addEventListener('message') handler runs on every received message and does a few things.

First it decodes the packet:

let data = msg.data
let from = new Uint32Array(data.slice(4, 8))[0]
let to = new Uint32Array(data.slice(0, 4))[0]

Then, it adds every new client to a routing table, a list of WebSocket objects (open connections), and corresponding "From" Ids (see our Doom protocol packet scheme above).

// if it's a new client, add it to the table of clients
if (this.sessions.map(c => c.from).indexOf(from) == -1) {
    let session = { ws: webSocket, from: from }
    this.sessions.push(session)
}

And finally, it sends the message to its destination, the "To" client.

// send this packet to the corresponding client
let i = this.sessions.map(c => c.from).indexOf(to)
if (i != -1) this.sessions[i].ws.send(data.slice(4))
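Put together, the three steps look roughly like the sketch below. It is a variation for illustration, not the code in the repository: RouterSketch and handleMessage are hypothetical names, a Map replaces the array scan, and the addresses are read with an explicit little-endian DataView.

```javascript
// Hypothetical router keyed by "From" address.
class RouterSketch {
  constructor() {
    // from (uint32) -> socket-like object exposing send()
    this.sessions = new Map();
  }

  // data is an ArrayBuffer: [to: uint32 LE][from: uint32 LE][payload]
  handleMessage(socket, data) {
    if (data.byteLength < 8) return false; // too short to carry both addresses
    const view = new DataView(data);
    const to = view.getUint32(0, true);
    const from = view.getUint32(4, true);

    // if it's a new client, add it to the table of clients
    if (!this.sessions.has(from)) this.sessions.set(from, socket);

    const target = this.sessions.get(to);
    if (!target) return false;   // destination has not been seen yet
    target.send(data.slice(4));  // strip "to": receiver gets [from][payload]
    return true;
  }
}
```

Note that the router forwards everything after the first four bytes, which is why the receiving side of the Doom driver skips only a 4-byte "From" header before the payload.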

You can check the full source code and setup instructions for the message router in our GitHub repo here. We also provide a simplified Node.js implementation to facilitate local development.

The Website

The cherry on top of our little pet project is the website, the thing that glues everything together. We will, of course, be using Cloudflare Pages for this.

The website needs to do a few things:

  • Run Wasm Doom on the page and adjust its input arguments according to the user context (multiplayer, initiator or friend, deathmatch or cooperative, solo play, etc.)

  • Have great UX and the most straightforward possible game setup and game join sequences.

  • Interact with Doom and provide feedback to the user.

  • Have that retro feeling 90s dark look.

Running Wasm Doom

Emscripten Wasm binaries use Module, a global JavaScript object, as their environment configuration. When the app starts, it looks at the Module's definitions and applies them during the different execution stages.

Module allows you to define things like the input arguments (like the ones you'd pass to a command-line app), code to run before the Wasm starts, the function to handle stdout, and many other parameters.

For instance, we use preRun to make sure we load two files, doom1.wad and default.cfg, into Emscripten’s virtual file system before Doom starts. Without them, the app wouldn’t start.

preRun: () => {
    Module.FS.createPreloadedFile("", "doom1.wad", "doom1.wad");
    Module.FS.createPreloadedFile("", "default.cfg", "default.cfg");
}

You can see our website’s JavaScript code and how we configure Module and interact with Wasm Doom and our Message Router APIs here.

One other neat trick is to show important messages coming from the game while it's running, using a scrolling ticker just below the game canvas. There are other, more sophisticated ways to interface between the web page and the Wasm process, but in this case we’re simply parsing stdout messages using a purpose-built, handcrafted text protocol.

This is what the user sees:

This is our stdout-based protocol:

doom: 1, failed to connect to websockets server
doom: 2, connected to %s
doom: 3, we're out of client addresses

And this is all we have to do in Doom’s C source code:

net_websockets.c: printf("doom: 2, connected to %s\n", attr.url);
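On the JavaScript side, consuming this protocol can be as small as the sketch below. The line format comes from the messages above; parseDoomLine, makePrintHandler, and showInTicker are hypothetical names, and we assume the handler is attached to Emscripten's Module.print hook, which receives each stdout line.

```javascript
// Hypothetical parser for lines of the form "doom: <code>, <message>".
function parseDoomLine(line) {
  const match = /^doom: (\d+), (.*)$/.exec(line);
  if (!match) return null; // ordinary game output, not a protocol message
  return { code: Number(match[1]), message: match[2] };
}

// Builds a stdout handler that forwards protocol messages to a
// caller-supplied showInTicker function and ignores everything else.
function makePrintHandler(showInTicker) {
  return (line) => {
    const parsed = parseDoomLine(line);
    if (parsed) showInTicker(parsed);
  };
}
```

A page could then set print: makePrintHandler(updateTicker) in its Module definition, where updateTicker is whatever function updates the ticker element.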

Game setup

The game setup is as easy as it can be. You go to silentspacemarine.com, start a multiplayer game, get a unique game permalink, share it with your friends online, wait for them to join, and start playing. The person who starts the game becomes "the server" in the network, but we handle that for you; there's no need to worry about technical details.

If you receive one of these permalink invites, all you need to do is click on it, choose your name, wait for your host to start the game, and that's it.

The source and Worker Pages wrangler configuration for the silentspacemarine.com website can be found in this repo.

Conclusion

You can play our demo at https://silentspacemarine.com/.

Our Wasm Doom with WebSockets implementation runs in the browser, supports mouse, sound and fullscreen (press F during the game), works on the desktop and mobile phones (using virtual gamepads) and supports real-time networked multiplayer (up to four players) on top of the Cloudflare edge.

There is certainly room for improvement in this project. For instance, we don't handle reconnects or disconnects very well if players drop off abruptly (e.g., by closing their web browser window). We're also not using storage in our Durable Object to persist a session over an unlikely but possible isolate termination.

Doom's original protocol (1994) is not well suited to fast-paced multiplayer games over the modern Internet either. Every client receives a full copy of all the input (keyboard, mouse) from all the other clients, and the game only advances when everyone has received the commands from all the other players in the group. As a result, the game's playability depends on the player with the slowest connection.
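The lockstep behaviour described above can be illustrated with a toy model. This is a deliberately simplified sketch, not Doom's actual netcode: LockstepSketch, receive, and tryAdvance are hypothetical names, and "tic" stands for one simulation step.

```javascript
// Hypothetical lockstep tracker: a tic runs only once every player's input
// for that tic has arrived.
class LockstepSketch {
  constructor(playerIds) {
    this.playerIds = playerIds;
    this.tic = 0;
    this.inputs = new Map(); // tic number -> Map(playerId -> input)
  }

  // Record one player's input for a given tic.
  receive(playerId, tic, input) {
    if (!this.inputs.has(tic)) this.inputs.set(tic, new Map());
    this.inputs.get(tic).set(playerId, input);
  }

  // Returns the inputs for the current tic and advances, or null if any
  // player's input is still missing.
  tryAdvance() {
    const current = this.inputs.get(this.tic);
    if (!current || !this.playerIds.every(id => current.has(id))) return null;
    this.inputs.delete(this.tic);
    this.tic++;
    return current;
  }
}
```

As long as one player's input for the current tic has not arrived, tryAdvance() keeps returning null for everyone, which is exactly why the slowest connection sets the pace.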

Modern FPS gaming protocols solve these and other challenges by predicting future movements (games are mostly deterministic), using compression, sending only the deltas, and other techniques. You can read more about this in “The DOOM III Network Architecture”.

However, as John put it, we wanted to showcase a paradigm shift and the opportunity to build real-time networked applications, now made possible because new and powerful technologies and features are available on our edge computing platform.

Having said this, the source code for Wasm Doom, the Message Router, and the website is all open-sourced and available on our GitHub, and you can start playing with it. Comments, issues, or even PRs are welcome. We tried to package everything together so that you can run things locally or deploy your own version on Cloudflare.

Edge Doom (or Durable Objects Offer Multiplayer, as we like to call it) was a fun project to put together. We hope you found this tech write-up interesting and inspiring to do other things with our edge computing components, and we can't wait to see what you come up with.

Celso Martinho|@celso