RPKI - The required cryptographic upgrade to BGP routing

2018-09-19

13 min read

We have talked about the BGP Internet routing protocol before. We have talked about how we build a more resilient network and how we can see outages at a country-level via BGP. We have even talked about the network community that is vital to the operation of the global Internet.

Today we need to talk about why existing operational practices for BGP routing and filtering have to significantly improve in order to finally stop route leaks and hijacks, which are sadly pervasive in today’s Internet routing world. In fact, the subtle art of running a BGP network and the various tools (both online and within your network's subsystems) that are vital to making the Internet routing world a safe and reliable place to operate need to improve.

Internet routing security, along with the operational expertise behind it, must improve globally.

photo by Marco Verch by/2.0

Nothing specific triggered today’s writing except the fact that Cloudflare has decided that it's high time we took a leadership role to finally secure BGP routing. We believe that each and every network needs to change its mindset towards BGP security on both a day-to-day and a long-term basis.

It's time to stop BGP route leaks and hijacks by deploying operationally-excellent RPKI!

Cloudflare commits to RPKI

Resource Public Key Infrastructure (RPKI) is a cryptographic method of signing records that associate a BGP route announcement with the correct originating AS number. RPKI is defined in RFC6480 (An Infrastructure to Support Secure Internet Routing). Cloudflare commits to RPKI.

Because any route can be originated and announced by any random network, independent of its rights to announce that route, there needs to be an out-of-band method to help BGP manage which network can announce which route. That system exists today. It's part of the IRR (Internet Routing Registry) system. Many registries exist: some run by networks, some by RIRs (Regional Internet Registries), and the granddaddy of IRRs, Merit's RADB service. This service provides a collective method to allow one network to filter another network's routes.

This works somewhat. An invalid announcement is normally squashed near-instantly as the route crosses an ASN boundary because one network is meant to filter the other network (based on rules created from the IRR database). This of course doesn’t happen perfectly - in fact, far from it. Route leaks or route hijacks happen more often than they should, a fact that is well documented. Here are the highlights:

  • 1997 - AS7007 mistakenly (re)announces 72,000+ routes (becomes the poster-child for route filtering).

  • 2008 - ISP in Pakistan accidentally announces IP routes for YouTube while blackholing the video service internally on its own network.

  • 2017 - Russian ISP leaks 36 prefixes for payments services owned by Mastercard, Visa, and major banks.

  • 2018 - BGP hijack of Amazon DNS to steal crypto currency.

That’s just a partial list! Each route leak or hijack exposes a lack of route filtering by the network that peers or transits the offending network.

RPKI comes into the picture because the existing IRR system lacks any form of cryptographic signing for its data. In fact, today the IRR databases contain plenty of invalid data (both stale data and typo’ed data). There's very little control over the creation of invalid data.

Implementing RPKI is just the first step in better BGP route security because RPKI only secures the route origin; it doesn't secure the path. (Sadly the same is true for IRR data.) When we want to secure the path, we are going to need something else, but that comes later.

The RPKI TL;DR

BGP routing isn’t secure. Its main hope, RPKI, uses a certificate system that’s akin to secure web browsing (or at least its early days). While secure web browsing has moved on, is far more secure and is somewhat the default these days, the state of BGP route validation has not moved forward. To secure BGP routing, all networks would need to embrace RPKI (and more). Cloudflare proposes to plot a course to improve BGP routing security globally by setting an example and implementing best practices, installing operationally excellent software and promoting its RPKI effort worldwide. RPKI is one of our focuses in 2018 and beyond!

The simplest introduction to BGP possible

BGP isn’t simple. BGP on the global Internet doubly so. This fact should not deter either the casual reader or the seasoned network engineer. What is important is to draw a limit around what is worth knowing and to discard all the minor items that make up the very complex world of BGP networking. In fact, operating a BGP-enabled network connected to a telco or ISP isn’t that complicated. It turns out that in the world of BGP, security is an afterthought.

Let's begin.

I’m going to pick a hypothetical example: the configuration of a single university within a country that operates an NREN (National Research & Education Network) for all its universities. This is not uncommon. The university in this case is connected via a single telecommunications link and (using BGP terminology) has a single upstream. The NREN provides all the connectivity to the local and global Internet for its country's universities, along with connectivity to other NRENs in other countries.

We start with some basics. BGP is about numbers. First off is a unique number called the Autonomous System Number or ASN. This number comes from a range of numbers that are managed by the RIRs (Regional Internet Registries). For example, Cloudflare has the AS number 13335 allocated for its network. ASNs were just 16-bit numbers, but are now 32-bit numbers (because the internet grew to the point of running out of the 65,536 or 2^16 initial allocation). For our university, we will use 65099 as our example ASN. This is from the reserved block of ASNs and used here for documentation reasons only.
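To make the ASN number ranges concrete, here is a minimal Python sketch. The helper names (`fits_in_16_bits`, `classify_asn`) are invented for this illustration. The ranges are the private-use block from RFC 6996 and the documentation block from RFC 5398, neither of which is spelled out in the text above, so treat them as background rather than something this post defines. Any ASN outside those two blocks is simply labelled "other".

```python
# Hypothetical helpers for illustration only; not part of any BGP library.
PRIVATE_USE = [(64512, 65534), (4200000000, 4294967294)]  # RFC 6996
DOCUMENTATION = [(64496, 64511), (65536, 65551)]          # RFC 5398


def fits_in_16_bits(asn: int) -> bool:
    """True if the ASN fits in the original 16-bit (2^16 = 65,536) space."""
    return 0 <= asn < 2**16


def classify_asn(asn: int) -> str:
    """Label an ASN as 'documentation', 'private-use' or 'other'."""
    if not 0 <= asn < 2**32:
        raise ValueError(f"{asn} is outside the 32-bit ASN space")
    if any(low <= asn <= high for low, high in DOCUMENTATION):
        return "documentation"
    if any(low <= asn <= high for low, high in PRIVATE_USE):
        return "private-use"
    return "other"


for asn in (13335, 65099):
    print(asn, fits_in_16_bits(asn), classify_asn(asn))
```

Under these ranges, the university's example ASN 65099 falls in the private-use block, which is why it is safe to use here: it should never appear on the global Internet.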

The second number is the IP addresses allocated to the university. Most readers are familiar with IP addresses; however, in the BGP world we use IP blocks called CIDRs (Classless Inter-Domain Routing). A CIDR block is a range of IP addresses that are sequential and bounded on binary boundaries. Within Cloudflare, we have quite a few IP blocks allocated by the RIRs. For our example, we will assume the university has two blocks allocated: 10.0.0.0/8 and 2001:db8::/32. Both of these are private or documentation addresses, and later on you’ll see them show up again in a different manner when we talk about filtering.
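If the idea of a block "bounded on binary boundaries" feels abstract, the following short sketch uses Python's standard `ipaddress` module to show what the two example blocks actually contain. The helper names `describe` and `is_on_binary_boundary` are made up for this illustration.

```python
import ipaddress


def describe(block: str) -> dict:
    """Return the first address, last address and size of a CIDR block."""
    net = ipaddress.ip_network(block)
    return {
        "first": str(net.network_address),
        "last": str(net.broadcast_address),
        "size": net.num_addresses,
    }


def is_on_binary_boundary(block: str) -> bool:
    """True if the address part has no bits set beyond the prefix length."""
    try:
        ipaddress.ip_network(block, strict=True)
        return True
    except ValueError:
        return False


print(describe("10.0.0.0/8"))      # 2**24 sequential IPv4 addresses
print(describe("2001:db8::/32"))   # 2**96 sequential IPv6 addresses
print(is_on_binary_boundary("10.0.0.0/8"))   # aligned
print(is_on_binary_boundary("10.1.0.0/8"))   # not aligned: bits set past /8
```

A /8 fixes the first 8 bits of the address and leaves the remaining 24 free, so the block always starts and ends on a power-of-two boundary.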

This is enough for us to get this university ready to connect to the NREN. Or maybe not.

Ready to connect

Hold on a second - there’s paperwork to fill in. Not actual paper, but close enough. While the Internet is built on the concept of permissionless innovation, there are still good practices that need to be adhered to.

Before you can announce a route via your BGP speaking router, you need to set up either an IRR route object or an RPKI ROA (or both).

Internet Routing Registries

The IRR (Internet Routing Registries) is used to record a route that will be announced on the Internet and associate it with the ASN that will announce it. In this example we will use the private or documentation ranges of 10.0.0.0/8 and 2001:db8::/32 along with ASN 65099. The simplest IRR routing record looks like this:

route:	10.0.0.0/8
origin:	AS65099

In reality, we need a lot more to make it fully functional and we need a place to upload this routing record. You could use your RIR to host your IRR data, or you could use global services like RADB or ALTDB. You can also use your transit provider in some cases. Once you have an account set up on one of these services, you will be ready to upload these routing records (how you upload them is very specific to the IRR chosen).

route:	10.0.0.0/8
descr:	University of Blogging
descr:	Anytown, USA
origin:	AS65099
mnt-by:	MNT-UNIVERSITY
notify:	[email protected]
changed:	[email protected] 20180101
source:	RADB

That last line reflects where you store your IRR routing record.
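Because a route object is just a list of "attribute: value" lines, it is easy to read with a few lines of code. The sketch below is a deliberately small parser written for this post. The function names are hypothetical, the contact attributes are left out of the sample, and real IRR tooling handles many more cases.

```python
SAMPLE_ROUTE_OBJECT = """\
route:   10.0.0.0/8
descr:   University of Blogging
descr:   Anytown, USA
origin:  AS65099
mnt-by:  MNT-UNIVERSITY
source:  RADB
"""


def parse_rpsl_object(text: str) -> list:
    """Parse one RPSL object into an ordered list of (attribute, value) pairs."""
    pairs = []
    for line in text.splitlines():
        if not line.strip() or line[0] in "#%":
            continue  # skip blank lines and comments
        if line[0] in " \t+" and pairs:
            # Continuation line: append to the previous attribute's value.
            name, value = pairs[-1]
            pairs[-1] = (name, f"{value} {line[1:].strip()}".strip())
            continue
        name, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            pairs.append((name.strip().lower(), value.strip()))
    return pairs


def route_and_origin(pairs: list) -> tuple:
    """Return (prefix, origin ASN as int) from a parsed route/route6 object."""
    attrs = dict(pairs)  # repeated attributes such as descr keep the last value
    prefix = attrs.get("route") or attrs["route6"]
    return prefix, int(attrs["origin"].upper().removeprefix("AS"))


print(route_and_origin(parse_rpsl_object(SAMPLE_ROUTE_OBJECT)))
```

Splitting on the first colon only matters for IPv6: a `route6:` line such as `route6: 2001:db8::/32` contains colons inside the value as well. (`str.removeprefix` needs Python 3.9 or later.)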

IRR for your ASN

Just like your IP network blocks, it's also good to place a record for your ASN in the IRR. When your networking gets more complex, this will definitely be needed. It doesn’t hurt to add it now.

aut-num:    AS65099
as-name:    UNIVERSITY-OF-BLOGGING-AS
descr:      University of Blogging
descr:      Anytown, USA
mnt-by:     MNT-UNIVERSITY
notify:     [email protected]
changed:    [email protected] 20180101
source:     RADB

You can check for their existence using the classic command line whois command (or the RADB website).

One last item needs to be completed; but not by you.

Your ASN needs to be placed in the as-set of your upstream ISP (or service provider). The entry in there will provide the rest of the global Internet an indication that your ASN is allowed to be routed via your upstream (the NREN in this case). If all goes well, something like this will show up in the IRRs.

as-set:     AS-NREN
descr:      NREN of country XX
members:    ...
members:    AS65099
members:    ...
mnt-by:     MNT-NREN
notify:     [email protected]
changed:    [email protected] 20180101
source:     RADB

The members area of this as-set provides a list of the ASNs that are announced by the upstream (including its own ASN). We have not defined the upstream's ASN yet, so let's pretend it is ASN 65001 (like 65099, this ASN comes from a reserved range and is used here for documentation purposes only).
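In practice an as-set can list other as-sets as members, so anyone consuming it has to expand it into a flat list of ASNs. Here is a rough sketch of that expansion. The nested set `AS-NREN-CUSTOMER-X` and ASN 65010 are invented for the example, the in-memory dictionary stands in for real IRR lookups, and hierarchical set names (such as `AS65001:AS-CUSTOMERS`) are not handled.

```python
# Hypothetical IRR data: as-set name -> list of members (ASNs or other as-sets).
AS_SETS = {
    "AS-NREN": ["AS65001", "AS65099", "AS-NREN-CUSTOMER-X"],
    "AS-NREN-CUSTOMER-X": ["AS65010", "AS-NREN"],  # note the loop back to AS-NREN
}


def expand_as_set(name: str, as_sets: dict, _seen=None) -> set:
    """Recursively expand an as-set into a set of integer ASNs."""
    seen = set() if _seen is None else _seen
    if name in seen:
        return set()  # already visited: guards against circular references
    seen.add(name)
    asns = set()
    for member in as_sets.get(name, []):
        if member.upper().startswith("AS-"):
            asns |= expand_as_set(member, as_sets, seen)  # nested as-set
        else:
            asns.add(int(member[2:]))  # plain "AS65099" style member
    return asns


members = expand_as_set("AS-NREN", AS_SETS)
print(sorted(members))
print(65099 in members)  # is the university allowed via this upstream?
```

The loop guard is not just defensive programming; circular and stale references are exactly the kind of messy data that turns up in real IRR databases.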

Getting the university online

BGP (like everything in networking) needs some configuration setup. This configuration would exist on a network router at the edge of your network, or whatever device is being used to connect the local network to the upstream (the NREN). We are using a very simple router config here to show the minimum configuration needed. Your configuration language could be different.

router bgp 65099
neighbor 192.168.0.2 remote-as 65001
neighbor 192.168.0.2 prefix-list as65001-listen in
neighbor 192.168.0.2 route-map as65001-listen in

This is a very trivial example (it’s missing a complete filter configuration that’s normally required). The key point is that the router doesn’t contain any code or language regarding the IRR entries shown above. That’s because the IRR entries are out-of-band. They exist outside of the BGP protocol. In other words, it takes more than just configuring a BGP session in order to actually connect to the global Internet.

The key filtering comes into play on the upstream (the NREN in this example). It’s the job of that network to confirm everything heard from its customer.
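To give a feel for what "confirming everything heard from its customer" means, here is a minimal sketch of an upstream turning IRR route objects into an exact-match filter. This is an illustration under simplifying assumptions, not how any particular router or filter-generation tool works: the function names are hypothetical, the IRR data is hard-coded, and real filters often make their own choices about more-specific prefixes.

```python
import ipaddress

# (prefix, origin ASN) pairs as they might be extracted from IRR route objects.
IRR_ROUTES = [("10.0.0.0/8", 65099), ("2001:db8::/32", 65099)]
CUSTOMER_ASNS = {65099}  # e.g. the expanded members of the customer's as-set


def build_irr_filter(route_objects, allowed_asns) -> set:
    """Keep only route objects whose origin is an ASN we expect from this customer."""
    return {
        (ipaddress.ip_network(prefix), asn)
        for prefix, asn in route_objects
        if asn in allowed_asns
    }


def accept(prefix: str, origin_asn: int, irr_filter: set) -> bool:
    """Accept an announcement only if the exact prefix/origin pair is registered."""
    return (ipaddress.ip_network(prefix), origin_asn) in irr_filter


irr_filter = build_irr_filter(IRR_ROUTES, CUSTOMER_ASNS)
print(accept("10.0.0.0/8", 65099, irr_filter))    # registered pair
print(accept("10.0.0.0/8", 65010, irr_filter))    # wrong origin
print(accept("192.0.2.0/24", 65099, irr_filter))  # unregistered prefix
```

Notice that the filter is only as good as the IRR data behind it, which is the weakness the rest of this post is about.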

RPKI vs IRR - why is it so important?

Two global databases are being discussed today: IRR and RPKI. While IRR is clearly in use today, it’s not the primary focus herein. However, it’s the de facto bridging option for route filtering today.

As stated above, Internet Routing Registries (IRRs) have a very loose security model. This has been known for a long time. Records exist within IRRs that are both clearly wrong and/or are clearly missing. There’s no cryptographic signing of records. There are multiple suppliers of IRR data; some better than others. IRR still has some proponents that want to clean up its operational data (including the author of this blog). Efforts like IRRD4 (by Job Snijders @ NTT) could help clean-up IRR usage. IRR is not the main focus herein.

Resource Public Key Infrastructure (RPKI) is a cryptographic method of signing records that associate a route with an originating AS number. Presently the five RIRs (AFRINIC, APNIC, ARIN, LACNIC & RIPE) provide a method for members to take an IP/ASN pair and sign a ROA (Route Origin Authorization) record. The ROA record is what we need to focus on.

Once a route is signed, the resulting ROA can propagate to anyone who wants to use the data to filter routing or to monitor it, as ROAs are public. A ROA is a digitally signed object that makes use of RFC3852 Cryptographic Message Syntax (CMS) as a standard encapsulation format. In fact, ROAs are signed and validated using X.509 certificates as defined in RFC5280 (Internet X.509 Public Key Infrastructure Certificate) and RFC3779 (X.509 Extensions for IP Addresses and AS Identifiers).

As the ROA is a digitally signed object, it provides a means of verifying that an IP address block holder has authorized an AS (Autonomous System) to originate routes to one or more prefixes within the address block. The RPKI system provides an attestation method for BGP routing.

define attestation: ... the action of bearing witness ... something which bears witness, confirms or authenticates

The existence of routing information (an IP block plus the matching ASN) within a valid certificate (i.e. something that can be validated against the RIRs authoritative data cryptographically) is the missing part of the BGP security system and something that the IRR system can't provide. You really know who should be doing what with a BGP route.
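To show how a validated ROA translates into a filtering decision, here is a small sketch of route origin validation along the lines of RFC 6811, which classifies a route as valid, invalid or not-found. It assumes the cryptographic checks have already been done and only the resulting prefix/ASN data remains. The `Roa` type, the function name and the maximum-length values are assumptions for the example (a ROA carries a maximum prefix length per RFC 6482, which this post does not otherwise discuss).

```python
import ipaddress
from typing import NamedTuple


class Roa(NamedTuple):
    prefix: str       # address block the holder has signed
    max_length: int   # longest prefix length the holder allows to be announced
    asn: int          # AS authorized to originate the route


# Hypothetical, already-validated ROA data for the university.
ROAS = [Roa("10.0.0.0/8", 8, 65099), Roa("2001:db8::/32", 48, 65099)]


def validate_origin(route_prefix: str, origin_asn: int, roas) -> str:
    """Return 'valid', 'invalid' or 'not-found' for a route and its origin AS."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        if roa_net.version != route.version or not route.subnet_of(roa_net):
            continue  # this ROA says nothing about the route
        covered = True
        if route.prefixlen <= roa.max_length and roa.asn == origin_asn:
            return "valid"
    # Covered by at least one ROA but matched by none: someone else's space.
    return "invalid" if covered else "not-found"


print(validate_origin("10.0.0.0/8", 65099, ROAS))    # matches the ROA
print(validate_origin("10.0.0.0/8", 65001, ROAS))    # wrong origin AS
print(validate_origin("10.1.0.0/16", 65099, ROAS))   # more specific than allowed
print(validate_origin("192.0.2.0/24", 65099, ROAS))  # no ROA covers it
```

The three-way result matters operationally: "not-found" routes are usually still accepted while RPKI coverage is incomplete, whereas "invalid" routes are the ones a network can confidently drop.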

Where are the certificates if they are not in the BGP protocol?

Good question. As we said above, the routing databases are outside of the BGP protocol. Both IRR and RPKI use third-party entities to hold the database information. The difference is that with RPKI the same entity that allocated or assigned a numeric resource (like an IP address or ASN) also holds the CA (Certificate Authority) used to validate the ROA records.

In the RPKI world, the CAs at the top of the tree are called TAs, or Trust Anchors. If you are familiar with the web security model, then you are familiar with what a TA is.

Who could operate a TA?

Today the five RIRs are the TAs for RPKI. This makes sense. Only the RIRs know who is an owner of IP space (and ASNs). The present day RPKI systems operate in conjunction with existing RIR login credentials. Once you can log in to a portal and control your IP allocations and ASN allocations, you can also create, edit, modify, and delete RPKI data in the form of ROAs. This is the basis of how RPKI separates itself from the IRR. You can only sign your own resources. You can’t just randomly create data. If you lose your RIR allocation, then you lose the RPKI data.

From a policy point of view, there are some interesting issues that become apparent pretty quickly. First off, an ISP with an allocation needs to keep its RIR membership up to date (i.e. pay its dues). Second, it needs to be aware that the RIR and the ISP could be legal entities based in different countries and hence international law plays a role in any dispute between the ISP and RIR or in fact any third party that gets involved in an IP address dispute. This has been a concern within the RIPE (Europe, the Middle East and parts of Central Asia) region as RIPE is based in The Netherlands. Similarly, ARIN (North America and parts of the Caribbean) is a US entity.

Which RIR for which IP address?

Presently, because of the large amount of IP address transfers occurring between some RIR regions, the RIRs changed their TA root certificates so that each RIR includes every available IP address (0.0.0.0/0 & ::/0) and every available AS number (0-4,294,967,295). IP numeric space and ASN numeric space are well defined as follows:

IPv4: 0.0.0.0 - 255.255.255.255
IPv6: 0000:0000:0000:0000:0000:0000:0000:0000 - ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff
ASN:  1 - 4,294,967,295 (AS 0 is unused)

IANA (Internet Assigned Numbers Authority) holds the master list for this space and divvies it up among the five RIRs as allocations or assignments. The IPv4 and IPv6 assignments can be seen here and here. ASNs can be found here. For example, here’s an abbreviated overview of how IPv6 space is allocated to various RIRs.

Prefix          Designation  Date        WHOIS              Status
2001:0000::/23  IANA         1999-07-01  whois.iana.org     ALLOCATED
2001:0200::/23  APNIC        1999-07-01  whois.apnic.net    ALLOCATED
2001:0400::/23  ARIN         1999-07-01  whois.arin.net     ALLOCATED
...
2001:1200::/23  LACNIC       2002-11-01  whois.lacnic.net   ALLOCATED
...
2001:4200::/23  AFRINIC      2004-06-01  whois.afrinic.net  ALLOCATED
...
2002:0000::/16  6to4         2001-02-01                     ALLOCATED
...
2a00:0000::/12  RIPE NCC     2006-10-03  whois.ripe.net     ALLOCATED
2c00:0000::/12  AFRINIC      2006-10-03  whois.afrinic.net  ALLOCATED
...

As stated above, each RIR holds a root key (a TA, or Trust Anchor) that gives it the ability to create signed records below its root. Below that TA there is a certificate that covers the exact space allocated or assigned to the specific RIR. This allows the TA to be somewhat static (or stable) and the RIR to update the underlying records as needed.
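One way to picture this hierarchy is as a chain in which every certificate may only list resources that its parent also holds. The sketch below checks that property for IP prefixes only. It is a simplification written for this post: the function names are hypothetical, the member allocation `2a00:db8::/32` is invented, and it does not handle a child block that is only covered by the union of several parent blocks.

```python
import ipaddress


def covers(parent_resources, child_resources) -> bool:
    """True if every child prefix sits inside at least one parent prefix."""
    parents = [ipaddress.ip_network(p) for p in parent_resources]
    for text in child_resources:
        child = ipaddress.ip_network(text)
        if not any(
            child.version == p.version and child.subnet_of(p) for p in parents
        ):
            return False
    return True


def chain_is_consistent(chain) -> bool:
    """Check each certificate's resources against the one directly above it."""
    return all(covers(upper, lower) for upper, lower in zip(chain, chain[1:]))


chain = [
    ["0.0.0.0/0", "::/0"],  # RIR trust anchor covering all address space
    ["2a00::/12"],          # certificate for the space allocated to the RIR
    ["2a00:db8::/32"],      # hypothetical member allocation
]
print(chain_is_consistent(chain))
```

This containment rule is what stops a resource holder from signing data about address space it was never given.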

Who is implementing RPKI today?

Sadly, not enough people or networks. While each RIR is supporting RPKI for its members, the toolset for successfully operating a network with RPKI-enabled route filtering is still very limited.

It turns out that IXPs (Internet Exchange Points) have started to realize that filtering using RPKI is a valid option for their route-servers.

In addition, a handful of networks are also participating in both signing IP routes and verifying IP routes via RPKI. This isn’t quite enough to secure the global Internet yet.

Then there's the Dutch!

In early September, the NLNOG technical meeting featured a non-trivial number of RPKI-related talks. It seems that local Dutch operators and software developers are taking RPKI seriously and it’s possible that The Netherlands may contain some of the more forward-thinking RPKI networks around. Read more here.

Mutually Agreed Norms for Routing Security (MANRS)

The Internet Society (Cloudflare is a strong supporter of this organization) has pushed an initiative called MANRS (Mutually Agreed Norms for Routing Security) in order to convince the network operator community to implement routing security. It focuses on Filtering, Anti-spoofing, Coordination, and Global Validation. The Internet Society is doing a good job in educating networks on the importance of better routing security. While they do educate networks about various aspects of running a healthy BGP environment, it's not an effort that creates any of the required new technologies. MANRS simply promotes best practices, which is a good start and something Cloudflare can collaborate on. That all said, we think it’s simply too polite an effort, as it doesn’t have enough teeth to quickly change how networks behave.

Cloudflare also wants to move the BGP community further along the RPKI path. Our operational efforts can, and should, coexist with The Internet Society’s MANRS initiative; however, we're focusing on operationally viable solutions that help move the global network community much further along.

How is RPKI deployed in a real operational network?

As network operators don’t want to run cryptographic software on the control plane of a router (or even have RPKI data anywhere near the control plane), the normal deployment is to pair routers with a server.

The server runs all the RPKI code (including the crypto processing of the TA, the certificate tree, and the ROAs). The server, running a validator, reduces the signed ROAs to a simple list of validated IP prefix and origin AS pairs and sends that list to the router across a communications path. When the router sees a new route, it checks the route's origin AS and IP prefix against that list, and the resulting answer drives the filtering of that BGP route. This lightweight protocol is defined in RFC6810, then updated later to include some BGPsec support in RFC8210 (The Resource Public Key Infrastructure (RPKI) to Router Protocol). It is nicknamed “RTR”.
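To show just how lightweight RTR is, here is a sketch that packs and unpacks a single IPv4 Prefix message using the 20-byte layout described in RFC 6810 (protocol version 0, PDU type 4). It only illustrates the wire format; the function names are invented for this post and it is in no way a working RTR client or server.

```python
import ipaddress
import struct

# version, type, reserved, length, flags, prefix len, max len, zero, prefix, ASN
IPV4_PREFIX_PDU = struct.Struct("!BBHIBBBB4sI")  # network byte order, 20 bytes
PDU_TYPE_IPV4_PREFIX = 4


def pack_ipv4_prefix(prefix: str, max_length: int, asn: int, announce=True) -> bytes:
    """Encode one validated prefix/origin pair as an RFC 6810 IPv4 Prefix PDU."""
    net = ipaddress.IPv4Network(prefix)
    flags = 1 if announce else 0  # lowest bit: 1 = announce, 0 = withdraw
    return IPV4_PREFIX_PDU.pack(
        0, PDU_TYPE_IPV4_PREFIX, 0, IPV4_PREFIX_PDU.size,
        flags, net.prefixlen, max_length, 0,
        net.network_address.packed, asn,
    )


def unpack_ipv4_prefix(data: bytes) -> dict:
    """Decode an IPv4 Prefix PDU back into its fields."""
    (_version, pdu_type, _reserved, length,
     flags, prefix_len, max_len, _zero, raw, asn) = IPV4_PREFIX_PDU.unpack(data)
    if pdu_type != PDU_TYPE_IPV4_PREFIX or length != IPV4_PREFIX_PDU.size:
        raise ValueError("not an IPv4 Prefix PDU")
    return {
        "prefix": f"{ipaddress.IPv4Address(raw)}/{prefix_len}",
        "max_length": max_len,
        "asn": asn,
        "announce": bool(flags & 1),
    }


pdu = pack_ipv4_prefix("10.0.0.0/8", 8, 65099)
print(len(pdu), unpack_ipv4_prefix(pdu))
```

There is no certificate or signature anywhere in those 20 bytes, which is the whole point: the expensive cryptography stays on the server and the router only handles small fixed-size records.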

Present implementations include https://github.com/rtrlib/rtrlib (in ‘C’) and NIST’s package https://www.nist.gov/services-resources/software/bgp-secure-routing-extension-bgp-srx-prototype which is based on quagga; hence not usable in production.

Operationally, neither is fully usable within production environments. The RIPE validator https://github.com/RIPE-NCC/rpki-validator-3 (written in Java) can produce filter sets similar to IRR tools and seems to be the most prevalent tool for the limited number of RPKI setups found in networks today. More recently, the NLnet Labs research group released a Rust-based RPKI validator called Routinator 3000.

The industry still needs some more operationally-focused software!

Can everyone participate in RPKI routing filtering?

Yes. No. Maybe. Ask your lawyers.

For many years there’s been a solid discussion about the role of the RIRs as holders of the private key of the CA at the top of their tree. Five trees. IANA was meant to run a single root above them (similar to how DNSSEC works with one key held at the DNS root - or dot), but that didn’t happen for many reasons, including the fact that IANA/ICANN was essentially reporting to the US government back when this was all being set up. The RIR setup has stuck and at this point no one expects IANA to ever hold a single root certificate. It’s all historic now and not worth rehashing here.

This is not a major operational issue; however it does have some slight consequences. While having five roots could be considered a messy setup, it actually matches the web space CA model.

Some RIR regions have special issues. ARIN (in North America and portions of the Caribbean) has a TA and ROAs, but wants full indemnification should the data be wrong or used incorrectly. In the RIPE region (Europe, ME & Russia), the members voted down full support for RPKI because they didn’t want to have a Dutch entity (RIPE NCC) hold a certificate for a non-Dutch entity and have a letter from a Dutch LEA shut down a network by forcing that certificate to be invalidated. Read their respective terms of service:

The legal issues aren’t the focus of this blog entry, but it will be obvious later when implementing RPKI why the legal issues become an impediment to successful global RPKI deployment.

IRR - legacy or bridging solution?

Everyone assumes that IRR will ultimately go away; however, that’s a long, long way out. There are efforts underway to make IRR data cleaner and, in some cases, to (finally) link the underlying RPKI and IRR data together. They are very similar data, but with different security models.

This blog post was written with RPKI as the go-forward methodology in mind and hence does not need to address all the subtle issues around IRR brokenness. It would be a whole fresh blog post to address the legacy issues within IRR. That said, it’s clear that RPKI isn’t today a complete substitute for all the IRR data (and RPSL/RPSLng data) that exists today. The good news is that there’s work within the IETF and drafts in-flight to cover that. RPKI is a good protocol to base route filtering on and Cloudflare will be rolling out full support for RPKI enabled filtering and route announcements within its global network.

If you look back at the examples above of the university and its NREN, then realize that in the RPKI world the same information is being stored; however, the validity and attestation of the data increases n-fold once RPKI becomes the universal mechanism of choice.

Cloudflare wants to see this happen and will push for RPKI to become mainstream within the BGP world. We want to squash the existence of BGP route leaks and hijacks forever!

Next steps

Read the RPKI and BGP: securing our part of the Internet blog entry to follow what we are doing on the technical side for Cloudflare’s RPKI implementation.

Subscribe to the blog for daily updates on our announcements.

Martin J Levy|@mahtin