The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday

A recap on what happened Monday

On Monday we wrote about a painful Internet wide route leak. We wrote that this should never have happened because Verizon should never have forwarded those routes to the rest of the Internet. That blog entry came out around 19:58 UTC, just over seven hours after the route leak finished (which will we see below was around 12:39 UTC). Today we will dive into the archived routing data and analyze it. The format of the code below is meant to use simple shell commands so that any reader can follow along and, more importantly, do their own investigations on the routing tables.

This was a very public BGP route leak event. It was both reported online via many news outlets and the event’s BGP data was reported via social media as it was happening. Andree Toonk tweeted a quick list of 2,400 ASNs that were affected.

Quick dumps through the data, showing about 2400 ASns (networks) affected. Cloudflare being hit the hardest. Top 20 of affected ASns below pic.twitter.com/9J7uvyasw2
— Andree Toonk (@atoonk) June 24, 2019

This blog contains a large number of acronyms and those are explained at the end of the blog.

Using RIPE NCC archived data

The RIPE NCC operates a very useful archive of BGP routing. It runs collectors globally and provides an API for querying the data. More can be seen at https://stat.ripe.net/. In the world of BGP all routing is public (within the ability of anyone collecting data to have enough collections points). The archived data is very valuable for research and that’s what we will do in this blog. The site can create some very useful data visualizations.

Dumping the RIPEstat data for this event

Presently, the RIPEstat data gets ingested around eight to twelve hours after real-time. It’s not meant to be a real-time service. The data can be queried in many ways, including a full web interface and an API. We are using the API to extract the data in a JSON format.

We are going to focus only on the Cloudflare routes that were leaked. Many other ASNs were leaked (see the tweet above); however, we want to deal with a finite data set and focus on what happened to Cloudflare’s routing. All the commands below can be run with ease on many systems. Both the scripts and the raw data file are now available on GitHub. The following was done on MacBook Pro running macOS Mojave.

First we collect 24 hours of route announcements and AS-PATH data that RIPEstat sees coming from AS13335 (Cloudflare).

$ # Collect 24 hours of data - more than enough
$ ASN="AS13335"
$ START="2019-06-24T00:00:00"
$ END="2019-06-25T00:00:00"
$ ARGS="resource=${ASN}&starttime=${START}&endtime=${END}"
$ URL="https://stat.ripe.net/data/bgp-updates/data.json?${ARGS}"
$ # Fetch the data from RIPEstat
$ curl -sS "${URL}" | jq . > 13335-routes.json
$ ls -l 13335-routes.json
-rw-r--r--  1 martin  staff  339363899 Jun 25 08:47 13335-routes.json
$

That’s 340MB of data - which seems like a lot, but it contains plenty of white space and plenty of data we just don’t need. Our second task is to reduce this raw data down to just the required data - that’s timestamps, actual routes, and AS-PATH. The third item will be very useful. Note we are using jq, which can be installed on macOS with the brew package manager.

$ # Extract just the times, routes, and AS-PATH
$ jq -rc '.data.updates[]|.timestamp,.attrs.target_prefix,.attrs.path' < 13335-routes.json | paste - - - > 13335-listing-a.txt
$ wc -l 13335-listing-a.txt
691318 13335-listing-a.txt
$

We are down to just below seven hundred thousand routing events, however, that’s not a leak, that’s everything that includes Cloudflare’s ASN (the number 13335 above). For that we need to go back to Monday’s blog and realize it was AS396531 (Allegheny Technologies) that showed up with 701 (Verizon) in the leak. Now we reduce the data further:

$ # Extract the route leak 701,396531
$ # AS701 is Verizon and AS396531 is Allegheny Technologies
$ egrep '701,396531' < 13335-listing-a.txt > 13335-listing-b.txt
$ wc -l 13335-listing-b.txt
204568 13335-listing-b.txt
$

At 204 thousand data points, we are looking better. It’s still a lot of data because BGP can be very chatty if topology is changing. A route leak will cause exactly that. Now let’s see how many routes were affected:

$ # Extract the actual routes affected by the route leak
$ cut -f2 < 13335-listing-b.txt | sort -V -u > 13335-listing-c.txt
$ wc -l 13335-listing-c.txt
101 13335-listing-c.txt
$

It’s a much smaller number. We now have a listing of at least 101 routes that were leaked via Verizon. This may not be the full list because route collectors like RIPEstat don’t have direct feeds from Verizon, so this data is a blended view with Verizon’s path and other paths. We can see that if we look at the AS-PATH in the above files. Please note that I had a typo in this script when this blog was first published and only 20 routes showed up because the -n vs -V option was used on sort. Now the list is correct with 101 affected routes. Please see this short article from stackoverflow to see the issue.

Here’s a partial listing of affected routes.

$ cat 13335-listing-c.txt
8.39.214.0/24
8.42.245.0/24
8.44.58.0/24
...
104.16.80.0/21
104.17.168.0/21
104.18.32.0/21
104.19.168.0/21
104.20.64.0/21
104.22.8.0/21
104.23.128.0/21
104.24.112.0/21
104.25.144.0/21
104.26.0.0/21
104.27.160.0/21
104.28.16.0/21
104.31.0.0/21
141.101.120.0/23
162.159.224.0/21
172.68.60.0/22
172.69.116.0/22
...
$

This is an interesting list, as some of these routes do not originate from Cloudflare’s network, however, they show up with AS13335 (our ASN) as the originator. For example, the 104.26.0.0/21 route is not announced from our network, but we do announce 104.26.0.0/20 (which covers that route). More importantly, we have an IRR (Internet Routing Registries) route object plus an RPKI ROA for that block. Here’s the IRR object:

route:          104.26.0.0/20
origin:         AS13335
source:         ARIN

And here’s the RPKI ROA. This ROA has Max Length set to 20, so no smaller route should be accepted.

Prefix:       104.26.0.0/20
Max Length:   /20
ASN:          13335
Trust Anchor: ARIN
Validity:     Thu, 02 Aug 2018 04:00:00 GMT - Sat, 31 Jul 2027 04:00:00 GMT
Emitted:      Thu, 02 Aug 2018 21:45:37 GMT
Name:         535ad55d-dd30-40f9-8434-c17fc413aa99
Key:          4a75b5de16143adbeaa987d6d91e0519106d086e
Parent Key:   a6e7a6b44019cf4e388766d940677599d0c492dc
Path:         rsync://rpki.arin.net/repository/arin-rpki-ta/5e4a23ea-...

The Max Length field in an ROA says what the minimum size of an acceptable announcement is. The fact that this is a /20 route with a /20 Max Length says that a /21 (or /22 or /23 or /24) within this IP space isn’t allowed. Looking further at the route list above we get the following listing:

Route Seen            Cloudflare IRR & ROA    ROA Max Length
104.16.80.0/21    ->  104.16.80.0/20          /20
104.17.168.0/21   ->  104.17.160.0/20         /20
104.18.32.0/21    ->  104.18.32.0/20          /20
104.19.168.0/21   ->  104.19.160.0/20         /20
104.20.64.0/21    ->  104.20.64.0/20          /20
104.22.8.0/21     ->  104.22.0.0/20           /20
104.23.128.0/21   ->  104.23.128.0/20         /20
104.24.112.0/21   ->  104.24.112.0/20         /20
104.25.144.0/21   ->  104.25.144.0/20         /20
104.26.0.0/21     ->  104.26.0.0/20           /20
104.27.160.0/21   ->  104.27.160.0/20         /20
104.28.16.0/21    ->  104.28.16.0/20          /20
104.31.0.0/21     ->  104.31.0.0/20           /20

So how did all these /21’s show up? That’s where we dive into the world of BGP route optimization systems and their propensity to synthesize routes that should not exist. If those routes leak (and it’s very clear after this week that they can), all hell breaks loose. That can be compounded when not one, but two ISPs allow invalid routes to be propagated outside their autonomous network. We will explore the AS-PATH further down this blog.

More than 20 years ago, RFC1997 added the concept of communities to BGP. Communities are a way of tagging or grouping route advertisements. Communities are often used to label routes so that specific handling policies can be applied. RFC1997 includes a small number of universal well-known communities. One of these is the NO_EXPORT community, which has the following specification:

    All routes received carrying a communities attribute
    containing this value MUST NOT be advertised outside a BGP
    confederation boundary (a stand-alone autonomous system that
    is not part of a confederation should be considered a
    confederation itself).

The use of the NO_EXPORT community is very common within BGP enabled networks and is a community tag that would have helped alleviate this route leak immensely.

How BGP route optimization systems work (or don’t work in this case) can be a subject for a whole other blog entry.

Timing of the route leak

As we saved away the timestamps in the JSON file and in the text files, we can confirm the time for every route in the route leak by looking at the first and the last timestamp of a route in the data. We saved data from 00:00:00 UTC until 00:00:00 the next day, so we know we have covered the period of the route leak. We write a script that checks the first and last entry for every route and report the information sorted by start time:

$ # Extract the timing of the route leak
$ while read cidr
do
  echo $cidr
  fgrep $cidr < 13335-listing-b.txt | head -1 | cut -f1
  fgrep $cidr < 13335-listing-b.txt | tail -1 | cut -f1
done < 13335-listing-c.txt |\
paste - - - | sort -k2,3 | column -t | sed -e 's/2019-06-24T//g'
...
104.25.144.0/21   10:34:25  12:38:54
104.22.8.0/21     10:34:27  12:29:39
104.20.64.0/21    10:34:27  12:30:00
104.23.128.0/21   10:34:27  12:30:34
141.101.120.0/23  10:34:27  12:30:39
162.159.224.0/21  10:34:27  12:30:39
104.18.32.0/21    10:34:29  12:30:34
104.24.112.0/21   10:34:29  12:30:34
104.27.160.0/21   10:34:29  12:30:34
104.28.16.0/21    10:34:29  12:30:34
104.31.0.0/21     10:34:29  12:30:34
8.39.214.0/24     10:34:31  12:19:24
104.26.0.0/21     10:34:36  12:29:53
172.68.60.0/22    10:34:38  12:19:24
172.69.116.0/22   10:34:38  12:19:24
8.44.58.0/24      10:34:38  12:19:24
8.42.245.0/24     11:52:49  11:53:19
104.17.168.0/21   12:00:13  12:29:34
104.16.80.0/21    12:00:13  12:30:00
104.19.168.0/21   12:09:39  12:29:34
...
$

Now we know the times. The route leak started at 10:34:25 UTC (just before lunchtime London time) on 2019-06-24 and ended at 12:38:54 UTC. That’s a hair over two hours. Here’s that same time data in a graphical form showing the near-instant start of the event and the duration of each route leaked:

We can also go back to RIPEstat and look at the activity graph for Cloudflare’s AS13335 network:

Clearly between 10:30 UTC and 12:40 UTC there’s a lot of route activity - far more than normal.

Note that as we mentioned above, RIPEstat doesn’t get a full view of Verizon’s network routing and hence some of the propagated routes won’t show up.

Drilling down on the AS-PATH part of the data

Having the routes is useful, but now we want to look at the paths of these leaked routes to see which ASNs are involved. We knew the offending ASNs during the route leak on Monday. Now we want to dig deeper using the archived data. This allows us to see the extent and reach of this route leak.

$ # Extract the AS-PATH of the route leak
$ # Use the list of routes to extract the full AS-PATH
$ # Merge the results together to show an amalgamation of paths.
$ # We know (luckily) the last few ASNs in the AS-PATH are consistent
$ cut -f3 < 13335-listing-b.txt | tr -d '[\[\]]' |\
awk '{
  n=split($0, a, ",");
  printf "%50s\n",
    a[n-5] "_" a[n-4] "_" a[n-3] "_" a[n-2] "_" a[n-1] "_" a[n];
}' | sort -u
   174_701_396531_33154_3356_13335
   2497_701_396531_33154_174_13335
   577_701_396531_33154_3356_13335
   6939_701_396531_33154_174_13335
  1239_701_396531_33154_3356_13335
  1273_701_396531_33154_3356_13335
  1280_701_396531_33154_3356_13335
  2497_701_396531_33154_3356_13335
  2516_701_396531_33154_3356_13335
  3320_701_396531_33154_3356_13335
  3491_701_396531_33154_3356_13335
  4134_701_396531_33154_3356_13335
  4637_701_396531_33154_3356_13335
  6453_701_396531_33154_3356_13335
  6461_701_396531_33154_3356_13335
  6762_701_396531_33154_3356_13335
  6830_701_396531_33154_3356_13335
  6939_701_396531_33154_3356_13335
  7738_701_396531_33154_3356_13335
 12956_701_396531_33154_3356_13335
 17639_701_396531_33154_3356_13335
 23148_701_396531_33154_3356_13335
$

This script clearly shows the AS-PATH of the leaked routes. It’s very consistent. Reading from the back of the line to the front, we have 13335 (Cloudflare), 3356 or 174 (Level3/CenturyLink or Cogent - both tier 1 transit providers for Cloudflare). So far, so good. Then we have 33154 (DQE Communications) and 396531 (Allegheny Technologies Inc) which is still technically not a leak, but trending that way. The reason why we can state this is still technically not a leak is because we don’t know the relationship between those two ASs. It’s possible they have a mutual transit agreement between them. That’s up to them.

Back to the AS-PATH’s. Still reading leftwards, we see 701 (Verizon), which is very very bad and clear evidence of a leak. It’s a leak for two reasons. First, this matches the path when a transit is leaking a non-customer route from a customer. This Verizon customer does not have 13335 (Cloudflare) listed as a customer. Second, the route contains within its path a tier 1 ASN. This is the point where a route leak should have been absolutely squashed by filtering on the customer BGP session. Beyond this point there be dragons.

And dragons there be! Everything above is about how Verizon filtered (or didn’t filter) its customer. What follows the 701 (i.e the number to the left of it) is the peers or customers of Verizon that have accepted these leaked routes. They are mainly other tier 1 networks of Verizon in this list: 174 (Cogent), 1239 (Sprint), 1273 (Vodafone), 3320 (DTAG), 3491 (PCCW), 6461 (Zayo), 6762 (Telecom Italia), etc.

What’s missing from that list are three networks worthy of mentioning - 1299 (Telia), 2914 (NTT), and 7018 (AT&T). All three implement a very simple AS-PATH filter which saved the day for their network. They do not allow one tier 1 ISP to send them a route which has another tier 1 further down the path. That’s because when that happens, it’s officially a leak as each tier 1 is fully connected to all other tier 1’s (which is part of the definition of a tier 1 network). The topology of the Internet’s global BGP routing tables simply states that if you see another tier 1 in the path, then it’s a bad route and it should be filtered away.

Additionally we know that 7018 (AT&T) operates a network which drops RPKI invalids. Because Cloudflare routes are RPKI signed, this also means that AT&T would have dropped these routes when it receives them from Verizon. This shows a clear win for RPKI (and for AT&T when you see their bandwidth graph below)!

BREAKING - AT&T / AS 7018 is now rejecting RPKI Invalid BGP announcements they receive from their peering partners. This is big news for routing security! If AT&T can do it - you can do it! :-) https://t.co/FhjmWByagO
— Job Snijders (@JobSnijders) February 11, 2019

That all said, keep in mind we are still talking about routes that Cloudflare didn't announce. They all came from the route optimizer.

What should 701 Verizon network accept from their customer 396531?

This is a great question to ask. Normally we would look at the IRR (Internet Routing Registries) to see what policy an ASN wants for it’s routes.

$ whois -h whois.radb.net AS396531 ; the Verizon customer
%  No entries found for the selected source(s).
$ whois -h whois.radb.net AS33154  ; the downstream of that customer
%  No entries found for the selected source(s).
$

That’s enough to say that we should not be seeing this ASN anywhere on the Internet, however, we should go further into checking this. As we know the ASN of the network, we can search for any routes that are listed for that ASN. We find one:

$ whois -h whois.radb.net ' -i origin AS396531' | egrep '^route|^origin|^mnt-by|^source'
route:          192.92.159.0/24
origin:         AS396531
mnt-by:         MNT-DCNSL
source:         ARIN
$

More importantly, now we have a maintainer (the owner of the routing registry entries). We can see what else is there for that network and we are specifically looking for this:

$ whois -h whois.radb.net ' -i mnt-by -T as-set MNT-DCNSL' | egrep '^as-set|^members|^mnt-by|^source'
as-set:         AS-DQECUST
members:        AS4130, AS5050, AS11199, AS11360, AS12017, AS14088, AS14162,
                AS14740, AS15327, AS16821, AS18891, AS19749, AS20326,
                AS21764, AS26059, AS26257, AS26461, AS27223, AS30168,
                AS32634, AS33039, AS33154, AS33345, AS33358, AS33504,
                AS33726, AS40549, AS40794, AS54552, AS54559, AS54822,
                AS393456, AS395440, AS396531, AS15204, AS54119, AS62984,
                AS13659, AS54934, AS18572, AS397284
mnt-by:         MNT-DCNSL
source:         ARIN
$

This object is important. It lists all the downstream ASNs that this network is expected to announce to the world. It does not contain Cloudflare’s ASN (or any of the leaked ASNs). Clearly this as-set was not used for any BGP filtering.

Just for completeness the same exercise can be done for the other ASN (the downstream of the customer of Verizon). In this case, we just searched for the maintainer object (as there are plenty of route and route6 objects listed).

$ whois -h whois.radb.net ' -i origin AS33154' | egrep '^mnt-by' | sort -u
mnt-by:         MNT-DCNSL
mnt-by:     MAINT-AS3257
mnt-by:     MAINT-AS5050
$

None of these maintainers are directly related to 33154 (DQE Communications). They have been created by other parties and hence they become a dead-end in that search.

It’s worth doing a secondary search to see if any as-set object exists with 33154 or 396531 included. We turned to the most excellent IRR Explorer website run by NLNOG. It provides deep insight into the routing registry data. We did a simple search for 33154 using http://irrexplorer.nlnog.net/search/33154 and we found these as-set objects.

It’s interesting to see this ASN listed in other as-set’s but none are important or related to Monday’s route-leak. Next we looked at 396531

This shows that there’s nowhere else we need to check. AS-DQECUST is the as-set macro that controls (or should control) filtering for any transit provider of their network.

The summary of all the investigation is a solid statement that no Cloudflare routes or ASNs are listed anywhere within the routing registry data for the customer of Verizon. As there were 2,300 ASNs listed in the tweet above, we can conclusively state no filtering was in place and hence this route leak went on its way unabated.

IPv6? Where is the IPv6 route leak?

In what could be considered the only plus from Monday’s route leak, we can confirm that there was no route leak within IPv6 space. Why?

It turns out that 396531 (Allegheny Technologies Inc) is a network without IPv6 enabled. Normally you would hear Cloudflare chastise anyone that’s yet to enable IPv6, however, in this case we are quite happy that one of the two protocol families survived. IPv6 was stable during this route leak, which now can be called an IPv4-only route leak.

Yet that’s not really the whole story. Let’s look at the percentage of traffic Cloudflare sends Verizon that’s IPv6 (vs IPv4). Normally the IPv4/IPv6 percentage holds steady.

This uptick in IPv6 traffic could be the direct result of Happy Eyeballs on mobile handsets picking a working IPv6 path into Cloudflare vs a non-working IPv4 path. Happy Eyeballs is meant to protect against IPv6 failure, however in this case it’s doing a wonderful job in protecting from an IPv4 failure. Yet we have to be careful with this graph because after further thought and investigation the percentage only increased because IPv4 reduces. Sometimes graphs can be misinterpreted, yet Happy Eyeballs still did a good job even as end users were being affected.

Happy Eyeballs, described in RFC8305, is a mechanism where a client (lets say a mobile device) tries to connect to a website both on IPv4 and IPv6 simultaneously. IPv6 is sometimes given a head-start. The theory is that, should a failure exist on one of the paths (sadly IPv6 is the norm), then IPv4 will save the day. Monday was a day of opposites for Verizon.

In fact, enabling IPv6 for mobile users is the one area where we can praise the Verizon network this week (or at least the Verizon mobile network), unlike the residential Verizon networks where IPv6 is almost non-existent.

Using bandwidth graphs to confirm routing leaks and stability.

As we have already stated,Verizon had impacted their own users/customers. Let’s start with their bandwidth graph:

The red line is 24 June 2019 (00:00 UTC to 00:00 UTC the next day). The gray lines are previous days for comparison. This graph includes both Verizon fixed-line services like FiOS along with mobile.

The AT&T graph is quite different.

There’s no perturbation. This, along with some direct confirmation, shows that 7018 (AT&T) was not affected. This is an important point.

Going back and looking at a third tier 1 network, we can see 6762 (Telecom Italia) affected by this route leak and yet Cloudflare has a direct interconnect with them.

We will be asking Telecom Italia to improve their route filtering as we now have this data.

Future work that could have helped on Monday

The IETF is doing work in the area of BGP path protection within the Secure Inter-Domain Routing Operations Working Group (sidrops) area. The charter of this IETF group is:

The SIDR Operations Working Group (sidrops) develops guidelines for the operation of SIDR-aware networks, and provides operational guidance on how to deploy and operate SIDR technologies in existing and new networks.

One new effort from this group should be called out to show how important the issue of route leaks like today’s event is. The draft document from Alexander Azimov et al named draft-ietf-sidrops-aspa-profile (ASPA stands for Autonomous System Provider Authorization) extends the RPKI data structures to handle BGP path information. This in ongoing work and Cloudflare and other companies are clearly interested in seeing it progress further.

However, as we said in Monday’s blog and something we should reiterate again and again: Cloudflare encourages all network operators to deploy RPKI now!

Acronyms used in the blog

API - Application Programming Interface
AS-PATH - The list of ASNs that a routes has traversed so far
ASN - Autonomous System Number - A unique number assigned for each network on the Internet
BGP - Border Gateway Protocol (version 4) - the core routing protocol for the Internet
IETF - Internet Engineering Task Force - an open standards organization
IPv4 - Internet Protocol version 4
IPv6 - Internet Protocol version 6
IRR - Internet Routing Registries - a database of Internet route objects
ISP - Internet Service Provider
JSON - JavaScript Object Notation - a lightweight data-interchange format
RFC - Request For Comment - published by the IETF
RIPE NCC - Réseaux IP Européens Network Coordination Centre - a regional Internet registry
ROA - Route Origin Authorization - a cryptographically signed attestation of a BGP route announcement
RPKI - Resource Public Key Infrastructure - a public key infrastructure framework for routing information
Tier 1 - A network that has no default route and peers with all other tier 1's
UTC - Coordinated Universal Time - a time standard for clocks and time
"there be dragons" - a mistype, as it was meant to be "here be dragons" which means dangerous or unexplored territories

The Cloudflare Blog