When we launched our InterPlanetary File System (IPFS) gateway last year we were blown away by the positive reception. Countless people gave us valuable suggestions for improvement and made open-source contributions to make serving content through our gateway easy (many captured in our developer docs). Since then, our gateway has grown to regularly handle over a thousand requests per second, and has become the primary access point for several IPFS websites.
We’re committed to helping grow IPFS and have taken what we have learned since our initial release to improve our gateway. So far, we’ve done the following:
Automatic Cache Purge
One of the ways we tried to improve the performance of our gateway when we initially set it up was by setting really high cache TTLs. After all, content on IPFS is largely meant to be static. The complaint we heard though, was that site owners were frustrated at wait times upwards of several hours for changes to their website to propagate.
The way an IPFS gateway knows what content to serve when it receives a request for a given domain is by looking up the value of a TXT record associated with the domain – the DNSLink record. The value of this TXT record is the hash of the entire site, which changes if any one bit of the website changes. So we wrote a Worker script that makes a DNS-over-HTTPS query to 18.104.22.168 and bypasses cache if it sees that the DNSLink record of a domain is different from when the content was originally cached.
Checking DNS gives the illusion of a much lower cache TTL and usually adds less than 5ms to a request, whereas revalidating the cache with a request to the origin could take anywhere from 30ms to 300ms. And as an additional usability bonus, the 22.214.171.124 cache automatically purges when Cloudflare customers change their DNS records. Customers who don’t manage their DNS records with us can purge their cache using this tool.
Beta Testing for Orange-to-Orange
Our gateway was originally based on a feature called SSL for SaaS. This tweaks the way our edge works to allow anyone, Cloudflare customers or not, to CNAME their own domain to a target domain on our network, and have us send traffic we see for their domain to the target domain’s origin. SSL for SaaS keeps valid certificates for these domains in the Cloudflare database (hence the name), and applies the target domain’s configuration to these requests (for example, enforcing Page Rules) before they reach the origin.
The great thing about SSL for SaaS is that it doesn’t require being on the Cloudflare network. New people can start serving their websites through our gateway with their existing DNS provider, instead of migrating everything over. All Cloudflare settings are inherited from the target domain. This is a huge convenience, but also means that the source domain can’t customize their settings even if they do migrate.
This can be improved by an experimental feature called Orange-to-Orange (O2O) from the Cloudflare Edge team. O2O allows one zone on Cloudflare to CNAME to another zone, and apply the settings of both zones in layers. For example, cloudflare-ipfs.com has Always Use HTTPS turned off for various reasons, which means that every site served through our gateway also does. O2O allows site owners to override this setting by enabling Always Use HTTPS just for their website, if they know it’s okay, as well as adding custom Page Rules and Worker scripts to embed all sorts of complicated logic.
If you are on an Enterprise plan and would like to try this out on your domain, please reach out to. your account team with this request and we'll enable it for you in the coming weeks.
To host an application on IPFS it’s pretty much essential to have a custom domain for your app. We discussed all the reasons for this in our post, End-to-End Integrity with IPFS – essentially saying that because browsers only sandbox websites at the domain-level, serving an app directly from a gateway’s URL is not secure because another (malicious) app could steal its data.
Having a custom domain gives apps a secure place to keep user data, but also makes it possible for whoever controls the DNS for the domain to change a website’s content without warning. To provide both a secure context to apps as well as eternal immutability, Cloudflare set up a subdomain-based gateway at cf-ipfs.com.
cf-ipfs.com doesn’t respond to requests to the root domain, only at subdomains, where it interprets the subdomain as the hash of the content to serve. This means a request to https://<hash>.cf-ipfs.com is the equivalent of going to https://cloudflare-ipfs.com/ipfs/<hash>. The only technicality is that because domain names are case-insensitive, the hash must be re-encoded from Base58 to Base32. Luckily, the standard IPFS client provides a utility for this!
As an example, we’ll take the classic Wikipedia mirror on IPFS:
First, we convert the hash,
QmXoyp...6uco to base32:
$ ipfs cid base32 QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq
which tells us we can go here instead:
The main downside of the subdomain approach is that for clients without Encrypted SNI support, the hash is leaked to the network as part of the TLS handshake. This can be bad for privacy and enable network-level censorship.
Enabling Session Affinity
Loading a website usually requires fetching more than one asset from a backend server, and more often than not, “more than one” is more like “more than a dozen.” When that website is being loaded over IPFS, it dramatically improves performance when the IPFS node can make one connection and re-use it for all assets.
Behind the curtain, we run several IPFS nodes to reduce the likelihood of an outage and improve throughput. Unfortunately, with the way it was originally setup, each request for a different asset on a website would likely go to a different IPFS node and all those connections would have to be made again.
We fixed this by replacing the original backend load balancer with our own Load Balancing product that supports Session Affinity and automatically directs requests from the same user to the same IPFS node, minimizing redundant network requests.
Connecting with Pinata
And finally, we’ve configured our IPFS nodes to maintain a persistent connection to the nodes run by Pinata, a company that helps people pin content to the IPFS network. Having a persistent connection significantly improves the performance and reliability of requests to our gateway, for content on their network. Pinata has written their own blog post, which you can find here, that describes how to upload a website to IPFS and keep it online with a combination of Cloudflare and Pinata.
As always, we look forward to seeing what the community builds on top of our work, and hearing about how else Cloudflare can improve the Internet.