
TLD glue sticks around too long



Recent headline-grabbing DDoS attacks provoked heated debates in the DNS community. Everyone has strong opinions on how to harden DNS to avoid downtime in the future. Is it better to use a single DNS provider or multiple? What DNS TTL values are best? Does DNSSEC make you more or less exposed?

CC BY 2.0 image by Leticia Chamorro

These are valid questions worth serious discussion, but tuning your own DNS server settings is not the full story. Together, as a community, we need to harden the DNS protocol itself. We need to prepare it to withstand the toughest DDoS attacks the future will surely bring. In this blog post I'll point out an obscure feature in the core DNS protocol. It is not practical to use this "hidden" feature for DDoS mitigation now, but with a small tweak it could become extremely useful. The feature is unused today not because of protocol problems, but because of apathy among the DNS Top Level Domain (TLD) operators. If it were working, it would reduce DDoS recovery time for the DNS servers under attack.

The feature in question is: DNS TLD glue records. More specifically DNS TLD glue records with custom TTL values.

DNS glue is one of the least understood quirks in the DNS protocol. Allow me to explain why I think reducing glue TTL is a good idea.

But first: what is glue anyway?

CC BY 2.0 image by Frankieleon

DNS Glue

DNS glue is a solution to "the chicken or the egg problem" that is inherent in DNS. It's easiest to explain it with a concrete example.

Imagine you want to resolve the domain. For that you ask your local recursive DNS server for the resolution. OK, but that doesn't answer the question, what does the resolver do?

For simplicity let's make a couple of assumptions:

  • Our recursor doesn't have any data cached for
  • However, it does know that the .net TLD is handled by a number of nameservers, among them which has the IP address
  • We ignore the first steps and start our investigation by looking at the recursor when it queries the .net nameserver.

To resolve the recursor needs to figure out which nameservers host the data - or in DNS speak: which nameservers are authoritative for that zone?

To do so, the recursor asks the .net nameserver. Let's assume we know that one of these is The recursor will launch a query which we can simulate with this dig command:

$ dig @
[ output truncated for brevity ]
;; AUTHORITY SECTION:         172800  IN      NS

[ skipped for now ]

We politely asked one of the .net nameservers: where can I find The answer is: I don't know, but I know who to ask! Go talk to, it knows all about the zone!

This is called "a delegation". .net told us to go away and ask instead.

Hold on, but where is What is its IP address? If we asked the .net nameserver, it would tell us the same thing - go and talk to!

As you can see, here is a chicken and egg problem. To resolve we need to resolve To resolve we need to resolve, and so on.

CC BY 2.0 image by Sam-Cat

The glue

This is where DNS glue comes in. I lied a bit in the previous terminal output: the resolution of is available in the response given by the .net nameserver. This time allow me to show the relevant "ADDITIONAL" section of the answer:

$ dig @
[ output truncated for brevity ]
;; AUTHORITY SECTION:         172800  IN      NS

;; ADDITIONAL SECTION:     172800  IN      A

To break the resolution loop we need the second bit of data in the answer - the ADDITIONAL SECTION. Here the .net server says: by the way, in case you wondered where is, it's

This is DNS glue. Conceptually it's a pretty weird invention. We are asking the authoritative nameservers of the .net zone for the resolution of In response we get not only the delegation information but also an address of the server. Think about it - it's as if a part of the zone was handled by the .net TLD zone!
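To make the relationship between the two sections concrete, here is a minimal Python sketch that picks the glue out of a referral. It assumes the sections have already been captured as dig-style text lines. The names `example.net.` and `ns1.example.net.` and the address `192.0.2.53` are hypothetical placeholders taken from the reserved documentation ranges, not values from the real response above, and `parse_records` and `find_glue` are illustrative helpers rather than part of any DNS library.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str
    ttl: int
    rtype: str
    data: str


def parse_records(lines):
    """Parse dig-style 'name ttl IN type data' lines into Record tuples."""
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) != 5 or fields[2] != "IN":
            continue  # skip anything that is not a simple record line
        name, ttl, _cls, rtype, data = fields
        records.append(Record(name.lower(), int(ttl), rtype, data.lower()))
    return records


def find_glue(authority, additional):
    """Return address records in ADDITIONAL whose owner is an NS target in AUTHORITY."""
    ns_targets = {r.data for r in authority if r.rtype == "NS"}
    return [
        r for r in additional
        if r.rtype in ("A", "AAAA") and r.name in ns_targets
    ]


# Hypothetical referral built from reserved documentation names and addresses.
authority = parse_records(["example.net. 172800 IN NS ns1.example.net."])
additional = parse_records(["ns1.example.net. 172800 IN A 192.0.2.53"])
glue = find_glue(authority, additional)

for record in glue:
    print(f"glue: {record.name} -> {record.data} (TTL {record.ttl}s)")
```

The point of the sketch is the join: a record in the ADDITIONAL section counts as glue here only because its owner name is the target of an NS record in the AUTHORITY section.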

How far can this go? Can there be arbitrary resolutions stuck in the ADDITIONAL SECTION? Will this work?

$ dig @
[ output truncated for brevity ]
;; AUTHORITY SECTION:         172800  IN      NS

;; ADDITIONAL SECTION:     172800  IN      A          172800  IN      A

The fun story is: it used to "work" and confuse recursors. This is precisely what the Kashpureff attack did in 1997.

This is a good old school DNS cache injection or cache poisoning attack. The recursor logic of interpreting DNS glue answers is pretty twisted. The details are poorly understood, and vary with every implementation. Conceptually the barrier between a valid glue record and cache injection is very thin. This is being actively discussed by the DNS gurus, see draft-fujiwara-dnsop-resolver-update-00 and draft-weaver-dnsext-comprehensive-resolver-00.
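As an illustration of how thin that barrier is, the following Python sketch applies one deliberately strict filter to the ADDITIONAL section: an address record is kept only if its owner name is one of the NS targets from the delegation and also sits inside the delegated zone. This is an assumption made for the example, not a description of how any particular recursor behaves, and the function names and all hostnames and addresses are hypothetical.

```python
def is_subdomain(name, zone):
    """True if `name` equals `zone` or sits below it (case-insensitive)."""
    name = name.lower().rstrip(".")
    zone = zone.lower().rstrip(".")
    return name == zone or name.endswith("." + zone)


def acceptable_glue(zone, ns_targets, additional):
    """Filter (owner, address) pairs from an ADDITIONAL section.

    A pair is kept only when the owner is a listed NS target AND lies
    inside the delegated zone. Everything else is dropped as unrelated.
    """
    targets = {t.lower().rstrip(".") for t in ns_targets}
    kept = []
    for owner, address in additional:
        key = owner.lower().rstrip(".")
        if key in targets and is_subdomain(key, zone):
            kept.append((owner, address))
    return kept


# Hypothetical referral: one legitimate glue record, one injected record.
additional = [
    ("ns1.example.net.", "192.0.2.53"),
    ("www.victim.example.", "203.0.113.7"),
]
kept = acceptable_glue("example.net.", ["ns1.example.net."], additional)
print(kept)
```

With this rule the unrelated `www.victim.example.` record is discarded. A recursor that skipped such a check and cached every record in the section would be open to the kind of injection described above.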

What's the problem?

We've shown what DNS glue is, how it works, and why it is needed in the DNS protocol. Frankly speaking, DNS glue is a pretty ingenious solution to solve a real struggle.

Let me now explain the problem. Let's take a look at the glue answer again:

;; ADDITIONAL SECTION:     172800  IN      A

The problem is the TTL value. Here, you can see the TTL of that record is 172800 seconds = 48 hours. In normal situations a domain owner, in this case my colleague managing the domain, has a way to configure this value in a glue record. But 48 hours is not the value we intended to use! If you ask an authoritative nameserver for this record you get a different TTL that's much shorter:

$ dig @	900	IN	A

You can see that the authoritative nameserver claims this record is valid for only 900 seconds = 15 minutes, not 48 hours!
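The size of the gap is easy to check. This small Python snippet only restates the two TTLs quoted above in friendlier units. The constant names are chosen for this example.

```python
GLUE_TTL = 172800  # seconds, TTL on the glue record served by the .net nameserver
AUTH_TTL = 900     # seconds, TTL served by the authoritative nameserver

glue_hours = GLUE_TTL / 3600    # 48.0
auth_minutes = AUTH_TTL / 60    # 15.0
ratio = GLUE_TTL // AUTH_TTL    # 192

print(f"glue TTL: {glue_hours:.0f} hours")
print(f"authoritative TTL: {auth_minutes:.0f} minutes")
print(f"the glue is cached {ratio}x longer than intended")
```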

Where does this discrepancy come from?

The glue records are usually managed in some kind of panel exposed by the registrar. This is fine; in the end, we inject part of the namespace into the .net zone. But here's the problem: while there is a way to set the glue IP address, there is no way to configure the TTL. The glue TTL is hardcoded to 48 hours by the TLD operators.

I strongly believe this is way too long and hurts aggressive DDoS mitigation techniques.


Had that DNS glue TTL been smaller, it would be possible to rotate the nameserver IPs during an attack. In fact, at Cloudflare we use this technique at the HTTP layer all the time.

During significant attacks we have the ability to promptly move customer traffic between IP addresses by changing the DNS resolution of our customer orange-clouded domains (those we proxy). This allows us to shift legitimate traffic off attacked IP addresses, and deploy aggressive DDoS mitigations on them. In extreme cases we can BGP null route the targeted IPs with little customer impact. Internally we call this technique "scattering".

"Scattering" on the HTTP layer is very effective against L3 attacks. It is also possible to do scattering with no impact to customers, because we serve DNS records with low DNS TTL values.

But "scattering" could also be done on the DNS authoritative layer! During heavy L3 attacks against one of our DNS servers we'd love to move legitimate traffic off that attacked IP address.

"Scattering" on the DNS authoritative layer could be a powerful mitigation technique. It would work well against attacks where packets from a botnet hit authoritative servers directly (as opposed to being reflected by legitimate DNS recursors). Unfortunately, it is impossible to do this "DNS auth scattering" because we don't have the power to adjust the TLD glue TTL values. With the TTL stuck at 48 hours, changing the nameserver IP addresses dynamically is not an option.

I believe this should be fixed.
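To get a feel for why the TTL matters for scattering, consider a rough back-of-the-envelope model in Python. It rests on a strong simplifying assumption that is not taken from any measurement: every recursor cached the old glue at a moment spread uniformly over one TTL window, and honors the TTL exactly. Under that assumption the fraction of recursors that have dropped the old IP grows linearly with time. The function name `fraction_moved` is made up for this sketch.

```python
def fraction_moved(elapsed_seconds, ttl_seconds):
    """Fraction of caches whose old glue record has expired.

    Idealized model: cache-fill times are spread uniformly over one TTL
    window, so expiry is linear in elapsed time and complete after one TTL.
    """
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return min(max(elapsed_seconds / ttl_seconds, 0.0), 1.0)


FIFTEEN_MINUTES = 900

for label, ttl in (("48 hour glue TTL", 172800), ("15 minute glue TTL", 900)):
    share = fraction_moved(FIFTEEN_MINUTES, ttl)
    print(f"{label}: {share:.1%} of caches refreshed after 15 minutes")
```

In this model, fifteen minutes after a nameserver IP change about half a percent of caches have moved under a 48 hour TTL, while all of them have moved under a 15 minute TTL. Real recursors behave less neatly (many cap TTLs or evict entries early), so treat the numbers as an upper bound on staleness, not a prediction.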

Counter arguments

While I strongly believe that short DNS TTLs are a good thing, others disagree.

An often-raised point is that short TTLs increase the load on DNS servers. This is certainly true, but as pointed out in this OARC presentation by the .nl operators, the impact is minimal. DNS servers must be heavily overprovisioned anyway to deal with attacks. In fact, the .nl operators have been serving a 1 hour glue TTL since the beginning of 2016 without issues.

In this blog post, Bozhidar Bozhanov argues that short TTLs in general are undesirable.

What matters is that the glue TTL should be configurable.

Testing things

It's hard to prove the effectiveness of the "DNS auth scattering" technique since glue TTL is hardcoded at the lengthy 48 hours, but we tried to check it anyway. For a test we added a glue record and measured how long it took to pick up its share of the traffic.

We performed the experiment on the domain. Here is a chart of traffic levels to two Cloudflare nameservers with glue already present, ns3 and ns6, and a new one we just added glue for: ns6-bis.

We added glue at 2200 UTC one day. It is nicely visible that the traffic on this IP address gradually increased as the caches on recursors worldwide expired. The traffic seems to have reached levels comparable with other glue nameservers at about 1600 to 1800 the next day - around 8 hours later.

There is at least an 8 hour delay before a big chunk of DNS resolvers pick up a newly glued IP. The maximum time for the full switch is, of course, 48 hours.

Closing thoughts

We must use every possible technique in order to make the Internet's DNS infrastructure more resilient against DDoS attacks. We may need to improve the core DNS protocol (aggressive NSEC caching), tune the defaults (advocate the use of low TTLs) and share advanced mitigation techniques (scattering).

In this article, I explained what DNS glue is, and why I believe that DNS TLD glue TTL values hardcoded at 48 hours are not helping with DDoS mitigation. I hope this article will serve as a call to action for relevant TLD operators. I believe the ability to adjust DNS glue TTLs is a simple yet effective way to make DNS infrastructure more reliable.


Marek Majkowski|@majek04
