Introducing lua-capnproto: better serialization in Lua

by Jiale Zhi.

When we need to transfer data from one program to another, either within a machine or from one data center to another, some form of serialization is needed. Serialization converts data stored in memory into a form that can be sent across a network or between processes and then converted back into data a program can use directly.

At CloudFlare, we have data centers all over the world. When transferring data from one data center to another, we need a very efficient way of serializing data, saving us both time and network bandwidth.

We've looked at a few serialization projects. JSON is one popular serialization format, some of our Go programs use gob, and we've made use of Protocol Buffers in the past. But lately we've been using a new serialization protocol called Cap'n Proto.

Cap'n Proto attracted us because of its very high performance compared to other serialization projects. It looks a little like a better version of Protocol Buffers, and the author of Cap'n Proto, Kenton, was the primary author of Protocol Buffers version 2.

At CloudFlare, we use NGINX in conjunction with Lua for front-line web serving, proxying and traffic filtering. We need to serialize our data in Lua and transport it across the Internet. But unfortunately, there was no Lua module for Cap'n Proto. So, we decided to write lua-capnproto and release it as yet another CloudFlare open source contribution.

lua-capnproto provides very fast data serialization and a really easy to use API. Here I'll show you how to use lua-capnproto to do serialization and deserialization.

Install lua-capnproto

To install lua-capnproto, you need to install Cap'n Proto, LuaJIT 2.1 and luarocks first.

Then you can install lua-capnproto using the following commands:

git clone https://github.com/cloudflare/lua-capnproto.git
cd lua-capnproto
sudo luarocks make

To test whether lua-capnproto was installed successfully, you can use the capnp compiler to generate a Lua version of one of the Cap'n Proto examples as follows:

capnp compile -olua proto/example.capnp

If everything goes well, you should see no errors and a file named example_capnp.lua generated under the proto directory.

Write a Cap’n Proto definition

Here's a sample Cap’n Proto definition that would be stored in a file called AddressBook.capnp:

    @0xdbb9ad1f14bf0b36;  # unique file ID, generated by capnp id

    struct Person {
      id @0 :UInt32;
      name @1 :Text;
      email @2 :Text;
      phones @3 :List(PhoneNumber);

      struct PhoneNumber {
        number @0 :Text;
        type @1 :Type;

        enum Type {
          mobile @0;
          home @1;
          work @2;
        }
      }

      employment :union {
        unemployed @4 :Void;
        employer @5 :Text;
        school @6 :Text;
        selfEmployed @7 :Void;
        # We assume that a person is only one of these.
      }
    }

    struct AddressBook {
      people @0 :List(Person);
    }

We have a root structure named AddressBook containing a list named people whose members are also structures. We are going to serialize an AddressBook structure and then read the structure back from the serialized data. For more details about Cap'n Proto definitions, see the Cap'n Proto documentation.

Prepare your data

Preparing data is pretty simple. All you need to do is to generate a Lua table corresponding to the root structure. The following list gives rules to help you write this table. On the left is a Cap'n Proto data type, on the right is its corresponding Lua data type.

  • struct -> Lua hash table
  • list -> Lua array table
  • bool -> Lua boolean
  • int8/16/32 or uint8/16/32 or float32/64 -> Lua number
  • int64/uint64 -> LuaJIT 64bit number
  • enum -> Lua string
  • void -> Lua string "Void"
  • group -> Lua hash table (the same as struct)
  • union -> Lua hash table which has only one value set
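The union rule is the easiest one to get wrong, because a Lua table will happily hold several members at once. As a minimal sketch, the check below counts the fields set in a table meant to represent a union. The helpers count_fields and is_valid_union are hypothetical names used for illustration; they are not part of lua-capnproto.

```lua
-- Hypothetical helpers (not part of lua-capnproto): verify that a table
-- standing in for a Cap'n Proto union has exactly one member set.
local function count_fields(t)
    local n = 0
    for _ in pairs(t) do
        n = n + 1
    end
    return n
end

local function is_valid_union(t)
    return type(t) == "table" and count_fields(t) == 1
end

-- Exactly one member set: fine for the employment union.
print(is_valid_union({ school = "MIT" }))                          --> true
print(is_valid_union({ unemployed = "Void" }))                     --> true
-- Two members set: ambiguous, so the check rejects it.
print(is_valid_union({ school = "MIT", employer = "CloudFlare" })) --> false
```

Running a check like this on your data before serializing makes an ambiguous union fail loudly instead of silently picking one member.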

A few notes:

Lua's number type represents real (double-precision floating-point) numbers and Lua has no integer type, so by default you can't store a 64-bit integer in a number without losing precision. LuaJIT has an extension which supports 64-bit integers: you add ULL or LL to the end of the number (ULL for unsigned integers, LL for signed). So, if you need to serialize a 64-bit integer, remember to append ULL or LL to your number.

For example:

id @0 :Int64;                   ->   id = 12345678901234LL
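To see why the suffix matters, the short sketch below shows where a plain Lua number stops being able to represent every integer. It uses only ordinary numbers, so it runs in stock Lua as well as LuaJIT; the LL form appears only in a comment because it is LuaJIT-specific syntax.

```lua
-- A Lua number is a double, which can represent every integer exactly only
-- up to 2^53. Beyond that, neighbouring integers collapse into one value.
local limit = 2^53                 -- 9007199254740992

print(limit + 1 == limit)          --> true: the + 1 is lost to rounding
print(limit - 1 == limit)          --> false: still exact below 2^53

-- Under LuaJIT, a literal written with a suffix (for example
-- 9007199254740993LL) becomes a 64-bit integer value instead of a double,
-- so every digit is preserved. That syntax is a LuaJIT extension and would
-- not parse in stock Lua, which is why it is kept out of the code here.
```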

Enum values are automatically converted from strings to their numeric values, so you don't need to do it yourself. By default, enum names are converted to uppercase with underscores. You can change this behavior using annotations. The lua-capnproto documentation has more details.

Here is an example:

type @0 :Type;
enum Type {
    binaryAddr @0;
    textAddr @1;
}                               ->    type = "TEXT_ADDR"
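The default naming rule can be approximated in a few lines of Lua. The helper below, enum_name, is a hypothetical illustration and not lua-capnproto's own code; it assumes simple camelCase names, and the library may treat edge cases such as digits or consecutive capitals differently.

```lua
-- Hypothetical helper, not taken from lua-capnproto: approximate the default
-- enum naming rule by inserting an underscore at each lowercase-to-uppercase
-- boundary and then upper-casing the whole name.
local function enum_name(camel)
    -- gsub returns the new string and a match count; the outer parentheses
    -- keep only the string.
    local with_underscores = (camel:gsub("(%l)(%u)", "%1_%2"))
    return with_underscores:upper()
end

print(enum_name("textAddr"))       --> TEXT_ADDR
print(enum_name("mobile"))         --> MOBILE
print(enum_name("selfEmployed"))   --> SELF_EMPLOYED
```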

void is a special type in Cap’n Proto. For simplicity, we just use a string "Void" to represent void (actually, when serializing, any value other than nil will work, but we use "Void" for consistency).

A sample data table looks like this:

    local data = {
        people = {
            {
                id = 123,
                name = "Alice",
                email = "alice@example.com",
                phones = {
                    {
                        number = "555-1212",
                        ["type"] = "MOBILE",
                    },
                },
                employment = {
                    school = "MIT",
                },
            },
            {
                id = 456,
                name = "Bob",
                email = "bob@example.com",
                phones = {
                    {
                        number = "555-4567",
                        ["type"] = "HOME",
                    },
                    {
                        number = "555-7654",
                        ["type"] = "WORK",
                    },
                },
                employment = {
                    unemployed = "Void",
                },
            },
        }
    }

Compile and run

Now let's compile the Cap'n Proto file:

capnp compile -olua AddressBook.capnp

You shouldn't see any errors, and a file named AddressBook_capnp.lua should be generated in the current directory.

To use this file, we only need to write a file named main.lua (or whatever name you desire), and get all the required modules ready.

    local addressBook = require "AddressBook_capnp"
    local capnp = require "capnp"

Now we can start to serialize using our already prepared data.

    local bin = addressBook.AddressBook.serialize(data)

That’s it. Just one line of code and all the serialization work is done. The variable bin (a Lua string) now holds the serialized binary data. You can write this string to a file or send it over the network.
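Because the serialized data is an ordinary Lua string, saving it takes only the standard io library. The sketch below uses two hypothetical helpers, write_binary and read_binary, and a stand-in payload in place of the real serialized string so that it runs without lua-capnproto installed.

```lua
-- Hypothetical helpers: write a binary string to disk and read it back.
-- The "wb"/"rb" modes matter on Windows, where text mode would alter bytes.
local function write_binary(path, bytes)
    local f, err = io.open(path, "wb")
    if not f then
        return nil, err
    end
    f:write(bytes)
    f:close()
    return true
end

local function read_binary(path)
    local f, err = io.open(path, "rb")
    if not f then
        return nil, err
    end
    local bytes = f:read("*a")
    f:close()
    return bytes
end

-- Stand-in for the serialized string; a real program would pass bin here.
-- Embedded zero bytes are fine because Lua strings are length-counted.
local payload = "\0\1\2capnp\0\255"
local path = os.tmpname()
assert(write_binary(path, payload))
local restored = read_binary(path)
print(restored == payload)
os.remove(path)
```

The string read back can then be handed to the parse function in the same way as a freshly serialized one.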

What about deserialization? It's as easy as serialization.

    local decoded = addressBook.AddressBook.parse(bin)

The variable decoded now holds a table which is identical to data. You can find the complete code here. Note that you need LuaJIT to run the code.
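Two separate Lua tables never compare equal with ==, even when their contents match, so checking a round trip needs a recursive comparison. The deep_equal function below is a hypothetical helper for that purpose, shown here on stand-in tables. Whether a real decoded table matches field for field depends on how the library fills in defaults for unset fields, which this sketch does not verify.

```lua
-- Hypothetical helper: compare two values recursively, treating tables as
-- equal when they have the same keys and deeply equal values.
local function deep_equal(a, b)
    if type(a) ~= "table" or type(b) ~= "table" then
        return a == b
    end
    -- Every key in a must match the corresponding key in b.
    for k, v in pairs(a) do
        if not deep_equal(v, b[k]) then
            return false
        end
    end
    -- b must not carry extra keys that a lacks.
    for k in pairs(b) do
        if a[k] == nil then
            return false
        end
    end
    return true
end

-- Stand-ins for data and decoded; a real check would pass those two tables.
local original = { people = { { name = "Alice", phones = { { number = "555-1212" } } } } }
local copy     = { people = { { name = "Alice", phones = { { number = "555-1212" } } } } }

print(original == copy)            --> false: different table objects
print(deep_equal(original, copy))  --> true: same contents
```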

Performance

If you are happy with the API, here is even better news. We chose Cap'n Proto because of its impressively high performance. So when writing lua-capnproto, I also made every effort to make it fast.

In one project we switched to lua-capnproto from lua-cjson (a quite fast JSON library for Lua) for serialization. So let's see how fast lua-capnproto is compared to lua-cjson.

[Figure: performance of lua-capnproto compared to lua-cjson]

You can also run benchmark.lua yourself (included in the source code) to find out how fast lua-capnproto is compared to lua-cjson on your machine.
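If you would rather time your own data than the bundled benchmark, a small harness built on os.clock is enough for a rough comparison. The sketch below is not the project's benchmark.lua; time_it and workload are hypothetical names, and the workload is a stand-in so the snippet runs without either library installed.

```lua
-- Hypothetical timing helper: run fn n times and return the CPU seconds used.
local function time_it(fn, n)
    local start = os.clock()
    for _ = 1, n do
        fn()
    end
    return os.clock() - start
end

-- Stand-in workload. In a real comparison this would be a call such as
-- addressBook.AddressBook.serialize(data), or a JSON encode of the same table.
local function workload()
    local parts = {}
    for i = 1, 100 do
        parts[i] = tostring(i)
    end
    return table.concat(parts, ",")
end

local elapsed = time_it(workload, 1000)
print(string.format("1000 runs took %.4f s of CPU time", elapsed))
```

Timing both libraries with the same table and the same iteration count keeps the comparison fair, and repeating the run a few times helps smooth out JIT warm-up.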

The future

We are already using lua-capnproto in production at CloudFlare and it has been running very well for the past month. But lua-capnproto is still a very young project. Some features are missing and there's a lot of work to do in the future. We will continue to make lua-capnproto more stable and more reliable, and would be happy to receive contributions from the open source community.
