TLD glue sticks around too long

by Marek Majkowski.

Recent headline-grabbing DDoS attacks provoked heated debates in the DNS community. Everyone has strong opinions on how to harden DNS to avoid downtime in the future. Is it better to use a single DNS provider or multiple? What DNS TTL values are best? Does DNSSEC make you more or less exposed?

These are valid questions worth serious discussion, but tuning your own DNS server settings is not the full story. Together, as a community, we need to harden the DNS protocol itself and prepare it to withstand the toughest DDoS attacks the future will surely bring. In this blog post I'll point out an obscure feature in the core DNS protocol. It is not practical to use this "hidden" feature for DDoS mitigation now, but with a small tweak it could become extremely useful. The feature goes unused not because of protocol problems but because of apathy among the DNS Top Level Domain (TLD) operators. If it were working, it would reduce DDoS recovery time for DNS servers under attack.

The feature in question is: DNS TLD glue records. More specifically DNS TLD glue records with custom TTL values.

DNS glue is one of the least understood quirks in the DNS protocol. Allow me to explain why I think reducing glue TTL is a good idea.

But first: what is glue anyway?

DNS Glue

DNS glue is a solution to "the chicken or the egg problem" that is inherent in DNS. It's easiest to explain it with a concrete example.

Imagine you want to resolve the cloudflare.net domain. To do that, you ask your local recursive DNS server. But that only moves the question: what does the recursor do?

For simplicity let's make a couple of assumptions:

  • Our recursor doesn't have any data cached for cloudflare.net.
  • However, it does know that the .net TLD is handled by a number of nameservers, among them a.gtld-servers.net which has the IP address 192.5.6.30.
  • We ignore the first steps and start our investigation by looking at the recursor when it queries the .net nameserver.

To resolve cloudflare.net the recursor needs to figure out which nameservers host the cloudflare.net data - or in DNS speak: which nameservers are authoritative for that zone?

To do so, the recursor asks the .net nameserver. Let's assume we know that one of these is 192.5.6.30. The recursor will launch a query which we can simulate with this dig command:

$ dig cloudflare.net @192.5.6.30
[ output truncated for brevity ]
;; AUTHORITY SECTION:
cloudflare.net.         172800  IN      NS      ns1.cloudflare.net.

;; ADDITIONAL SECTION:
[ skipped for now ]

We politely asked one of the .net nameservers: where can I find cloudflare.net? The answer is: I don't know, but I know who to ask! Go talk to ns1.cloudflare.net, it knows all about the cloudflare.net zone!

This is called "a delegation". .net told us to go away and ask ns1.cloudflare.net instead.

Hold on, but where is ns1.cloudflare.net? What is its IP address? If we asked the .net nameserver, it would tell us the same thing - go and talk to ns1.cloudflare.net!

As you can see, this is a chicken-and-egg problem. To resolve cloudflare.net we need to resolve ns1.cloudflare.net. But to resolve ns1.cloudflare.net we need to ask the cloudflare.net nameserver, which is ns1.cloudflare.net itself, and so on.

The glue

This is where DNS glue comes in. I lied a bit in the previous terminal output: the address of ns1.cloudflare.net is available in the response given by the .net nameserver. This time allow me to show the relevant "ADDITIONAL" section of the answer:

$ dig cloudflare.net @192.5.6.30
[ output truncated for brevity ]
;; AUTHORITY SECTION:
cloudflare.net.         172800  IN      NS      ns1.cloudflare.net.

;; ADDITIONAL SECTION:
ns1.cloudflare.net.     172800  IN      A       173.245.59.31  

To break the resolution loop we need the second bit of data in the answer: the ADDITIONAL SECTION. Here the .net server says: by the way, in case you wondered where ns1.cloudflare.net is, it's 173.245.59.31.

This is DNS glue. Conceptually it's a pretty weird invention. We are asking the authoritative nameservers of the .net zone for the resolution of cloudflare.net. In response we get not only the delegation information but also the address of the delegated nameserver. Think about it - it's as if a part of the cloudflare.net zone was handled by the .net TLD zone!
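To make the loop and its fix concrete, here is a toy model in Python. It is only a sketch: the record data is copied from the dig output above, while the function names and the dictionary layout are hypothetical and do not reflect how any real recursor is implemented.

```python
# Toy model of a delegation response. Record data comes from the dig output
# above; the function names and data layout are hypothetical.

def net_tld_response(qname, include_glue=True):
    """Mimic what a .net nameserver returns for names under cloudflare.net."""
    response = {
        "authority": [("cloudflare.net.", "NS", "ns1.cloudflare.net.")],
        "additional": [],
    }
    if include_glue:
        response["additional"].append(
            ("ns1.cloudflare.net.", "A", "173.245.59.31")
        )
    return response


def find_nameserver_ip(qname, include_glue=True, max_steps=5):
    """Follow the delegation until we learn the nameserver IP, or give up."""
    target = qname
    for _ in range(max_steps):
        response = net_tld_response(target, include_glue)
        ns_name = response["authority"][0][2]
        # Glue: the nameserver address shipped alongside the delegation.
        for name, rtype, value in response["additional"]:
            if name == ns_name and rtype == "A":
                return value
        # No glue: we would have to resolve the nameserver name itself,
        # which leads straight back to the same delegation.
        target = ns_name
    return None  # still stuck in the loop after max_steps attempts


print(find_nameserver_ip("cloudflare.net."))                      # 173.245.59.31
print(find_nameserver_ip("cloudflare.net.", include_glue=False))  # None
```

Without the ADDITIONAL record the lookup never terminates on its own, which is why the sketch needs a step limit.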

How far can this go? Can there be arbitrary resolutions stuck in the ADDITIONAL SECTION? Will this work?

$ dig cloudflare.net @192.5.6.30
[ output truncated for brevity ]
;; AUTHORITY SECTION:
cloudflare.net.         172800  IN      NS      ns1.cloudflare.net.

;; ADDITIONAL SECTION:
ns1.cloudflare.net.     172800  IN      A       173.245.59.31  
www.google.com.         172800  IN      A       1.2.3.4

The fun story is: it used to "work" and confuse recursors. This is precisely what the Kashpureff attack did in 1997.

This is a good old-school DNS cache injection, or cache poisoning, attack. The recursor logic of interpreting DNS glue answers is pretty twisted. The details are poorly understood, and vary with every implementation. Conceptually the barrier between a valid glue record and cache injection is very thin. This is being actively discussed by the DNS gurus, see draft-fujiwara-dnsop-resolver-update-00 and draft-weaver-dnsext-comprehensive-resolver-00.
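One common line of defence is often called a bailiwick check: the recursor only accepts ADDITIONAL records whose names fall under the zone of the server it queried. The Python sketch below illustrates the idea under simplifying assumptions. The helper names are hypothetical, and real resolvers apply considerably more elaborate rules than this.

```python
def in_bailiwick(name, zone):
    """Return True if `name` equals `zone` or is a subdomain of it."""
    name = name.lower().rstrip(".")
    zone = zone.lower().rstrip(".")
    if zone == "":
        return True  # the root zone contains every name
    return name == zone or name.endswith("." + zone)


def filter_additional(queried_zone, additional):
    """Keep only the records a server for `queried_zone` may vouch for."""
    return [rec for rec in additional if in_bailiwick(rec[0], queried_zone)]


# Records from the hypothetical poisoned response shown above.
additional = [
    ("ns1.cloudflare.net.", 172800, "A", "173.245.59.31"),
    ("www.google.com.", 172800, "A", "1.2.3.4"),
]

for record in filter_additional("net.", additional):
    print(record)  # only the ns1.cloudflare.net. record survives
```

Matching on ".net" with the leading dot, rather than on a bare suffix, keeps a name like "evilnet." from passing as part of the .net zone.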

What's the problem?

We've shown what DNS glue is, how it works, and why it is needed in the DNS protocol. Frankly speaking, DNS glue is a pretty ingenious solution to a real problem.

Let me now explain the problem. Let's take a look at the glue answer again:

;; ADDITIONAL SECTION:
ns1.cloudflare.net.     172800  IN      A       173.245.59.31  

The problem is the TTL value. Here, you can see the TTL of that record is 172800 seconds = 48 hours. In normal situations a domain owner, in this case my colleague managing the cloudflare.net domain, has a way to configure the TTL of the records in the zone. But 48 hours is not the value we intended to use! If you ask a cloudflare.net authoritative nameserver for this record you get a different TTL that's much shorter:

$ dig ns1.cloudflare.net @173.245.59.31
ns1.cloudflare.net.    900 IN  A   173.245.59.31  

You can see that the authoritative nameserver claims this record is valid for only 900 seconds = 15 minutes, not 48 hours!
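The gap between the two answers is easy to quantify. The short Python sketch below parses the two record lines exactly as they appear in the dig output above and compares their TTLs. The parse_record helper is a hypothetical name, and it assumes the simple five-column layout shown here rather than the full dig output format.

```python
def parse_record(line):
    """Split a dig-style record line into its five whitespace-separated fields."""
    name, ttl, rclass, rtype, rdata = line.split(None, 4)
    return {
        "name": name,
        "ttl": int(ttl),
        "class": rclass,
        "type": rtype,
        "data": rdata.strip(),
    }


# Glue as served by the .net TLD, and the same record from our own nameserver.
glue = parse_record("ns1.cloudflare.net.     172800  IN      A       173.245.59.31")
auth = parse_record("ns1.cloudflare.net.    900 IN  A   173.245.59.31")

print(glue["ttl"] / 3600)          # 48.0 (hours)
print(auth["ttl"] / 60)            # 15.0 (minutes)
print(glue["ttl"] // auth["ttl"])  # 192
```

Both servers return the same address, but the glue copy may be cached 192 times longer than the value we chose.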

Where does this discrepancy come from?

The glue records are usually managed in some kind of panel exposed by the registrar. This is fine; in the end, we inject part of the cloudflare.net namespace into the .net zone. But here's the problem: while there is a way to set the glue IP address, there is no way to configure the TTL. The glue TTL is hardcoded to 48 hours by the TLD operators.

I strongly believe this is way too long and hurts aggressive DDoS mitigation techniques.

Scattering

Had that DNS glue TTL been smaller, it would be possible to rotate the nameserver IPs during an attack. In fact, at Cloudflare we use this technique at the HTTP layer all the time.

During significant attacks we have the ability to promptly move customer traffic between IP addresses by changing the DNS resolution of our customer orange-clouded domains (those we proxy). This allows us to shift legitimate traffic off attacked IP addresses, and deploy aggressive DDoS mitigations on them. In extreme cases we can BGP null route the targeted IPs with little customer impact. Internally we call this technique "scattering".

"Scattering" on the HTTP layer is very effective against L3 attacks. It is also possible to do scattering with no impact to customers, because we serve DNS records with low DNS TTL values.

But "scattering" could also be done on the DNS authoritative layer! During heavy L3 attacks against one of our DNS servers we'd love to move legitimate traffic off that attacked IP address.

"Scattering" on the DNS authoritative layer could be a powerful mitigation technique. This would work well against attacks in which packets from a botnet hit authoritative servers directly (as opposed to being reflected by legitimate DNS recursors). Unfortunately, it is impossible to do this "DNS auth scattering" because we don't have the power to adjust the TLD glue TTL values. With the TTL stuck at 48 hours, changing the nameserver IP addresses dynamically is not an option.

I believe this should be fixed.
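To get a feel for why the TTL matters for scattering, consider a deliberately idealized model: every recursor honours the full TTL, and the moments at which recursors cached the old record are spread uniformly over one TTL window. Under those assumptions (which are mine, not measurements), the share of caches still pointing at the old IP after a change falls linearly to zero at the TTL. Real recursors may cap TTLs or refresh earlier, so treat the Python sketch below as a rough upper bound rather than a prediction.

```python
def stale_fraction(elapsed_s, ttl_s):
    """Share of caches still holding the old record after `elapsed_s` seconds.

    Assumes cache-fill times are spread uniformly over one TTL window and
    that every cache keeps the record for exactly `ttl_s` seconds.
    """
    if ttl_s <= 0:
        return 0.0
    return max(0.0, 1.0 - elapsed_s / ttl_s)


GLUE_TTL = 172800  # 48 hours, as hardcoded by the TLD today
SHORT_TTL = 900    # 15 minutes, the TTL our own nameservers serve

for minutes in (5, 15, 60):
    elapsed = minutes * 60
    print(
        f"{minutes:>3} min after the change: "
        f"{stale_fraction(elapsed, GLUE_TTL):.1%} stale with 48h glue, "
        f"{stale_fraction(elapsed, SHORT_TTL):.1%} stale with 15min glue"
    )
```

In this model, a 15-minute TTL drains the attacked IP completely within 15 minutes, while 48-hour glue leaves nearly all caches pointing at it an hour into the attack.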

Counter arguments

While I strongly believe that short DNS TTLs are a good thing, others disagree.

An often-raised point is that short TTLs increase the load on DNS servers. This is certainly true, but as pointed out in this OARC presentation by the .nl operators, the impact is minimal. DNS servers must be heavily overprovisioned anyway to deal with attacks. In fact, the .nl operators have been serving a 1-hour glue TTL since the beginning of 2016 without issues.

In this blog post, Bozhidar Bozhanov argues that short TTLs in general are undesirable.

What matters is that the glue TTL should be configurable.

Testing things

It's hard to prove the effectiveness of the "DNS auth scattering" technique since glue TTL is hardcoded at the lengthy 48 hours, but we tried to check it anyway. For a test we added a glue record and measured how long it took to pick up its share of the traffic.

We performed the experiment on the cloudflare.com domain. Here is a chart of traffic levels to two Cloudflare nameservers with glue already present, ns3 and ns6, and a new one we just added glue for, ns6-bis.

We added glue at 2200 UTC one day. It is nicely visible that the traffic on this IP address gradually increased as the caches on recursors worldwide expired. The traffic seems to have reached levels comparable with other glue nameservers at about 0600 to 0800 the next day - around 8 hours later.

There is at least an 8-hour delay before a big chunk of DNS resolvers pick up a new glued IP. The maximum time for the full switch is, of course, 48 hours.

Closing thoughts

We must use every possible technique in order to make the Internet's DNS infrastructure more resilient against DDoS attacks. We may need to improve the core DNS protocol (aggressive NSEC caching), tune the defaults (advocate the use of low TTLs) and share advanced mitigation techniques (scattering).

In this article, I explained what DNS glue is, and why I believe that DNS TLD glue TTL values hardcoded at 48 hours are not helping with DDoS mitigation. I hope this article will serve as a call to action for relevant TLD operators. I believe the ability to adjust DNS glue TTLs is a simple yet effective way to make DNS infrastructure more reliable.
