A few days ago, my colleague Marek sent an email about a DDoS attack against one of our DNS servers that we'd been blocking with our BPF rules. He noticed that there seemed to be a strange correlation between the TTL field in the IP header and the IPv4 source address.
The source address was being spoofed, as usual, and apparently chosen randomly, but something else was going on. He offered a bottle of Scotch to the first person to come up with a satisfactory solution.
Here's what some of the packets looked like:
$ tcpdump -ni eth0 -c 10 "ip[8]=40 and udp and port 53"
1.181.207.7.46337 > x.x.x.x.53: 65098+
1.178.97.141.45569 > x.x.x.x.53: 65101+
1.248.136.142.63489 > x.x.x.x.53: 65031+
1.207.241.195.52993 > x.x.x.x.53: 65072+
$ tcpdump -ni eth0 -c 10 "ip[8]=41 and udp and port 53"
2.10.30.2.2562 > x.x.x.x.53: 65013+
2.4.9.36.1026 > x.x.x.x.53: 65019+
2.98.1.99.25090 > x.x.x.x.53: 64925+
2.109.69.229.27906 > x.x.x.x.53: 64914+
$ tcpdump -ni eth0 -c 10 "ip[8]=42 and udp and port 53"
4.72.42.184.18436 > x.x.x.x.53: 64439+
4.240.78.0.61444 > x.x.x.x.53: 64271+
5.73.44.84.18693 > x.x.x.x.53: 64182+
4.69.99.10.17668 > x.x.x.x.53: 64442+
I've removed the destination IP address, but left in the spoofed source IP and port, and the DNS ID number (the number with the plus after it). Each capture filters on ip[8], the TTL byte of the IP header, so the three groups show TTLs of 40, 41, and 42 respectively.
Stop reading here if you want to go figure this out for yourself.
Into the hex
I couldn't resist Marek's challenge, so I did what I always do with anything involving packets: I went straight for the hex. Since Marek hadn't given me a pcap, I manually converted the first few to hex.
The reason hex is useful is that bytes, words, dwords, and qwords are the real stuff of computing and looking at decimal obscures data.
Taking the first three (one for each TTL), I saw this pattern:
1.181.207.7.46337
2.10.30.2.2562
4.72.42.184.18436
Converted to hex they are
01.b5.cf.07.b501
02.0a.1e.02.0a02
04.48.2a.b8.4804
It's immediately obvious that the 'random' source port is the first two bytes of the random IP source address reversed: 01.b5.cf.07 has source port b501.
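The manual conversion is easy to reproduce. Here is a minimal Python sketch, assuming the dotted `a.b.c.d.port` form that tcpdump prints; the helper names `to_hex` and `port_from_ip` are made up for illustration and are not part of any tool used in the investigation.

```python
def to_hex(endpoint: str) -> str:
    """Render tcpdump's 'a.b.c.d.port' as 'aa.bb.cc.dd.pppp' in hex."""
    *octets, port = (int(part) for part in endpoint.split("."))
    return ".".join(f"{octet:02x}" for octet in octets) + f".{port:04x}"


def port_from_ip(ip: str) -> int:
    """Source port predicted by swapping the first two address bytes."""
    first, second = (int(part) for part in ip.split(".")[:2])
    return (second << 8) | first


for endpoint in ("1.181.207.7.46337", "2.10.30.2.2562", "4.72.42.184.18436"):
    ip, _, port = endpoint.rpartition(".")
    # The observed port should equal the byte-swapped address prefix.
    print(to_hex(endpoint), port_from_ip(ip) == int(port))
```

For the three packets above, the predicted port matches the observed one in every case.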
A little bit of tinkering revealed a relationship between the TTL and the first byte of the IP address.
TTL = (first byte >> 1) + 40
This relationship was confirmed by a later packet that had a TTL of 150 and a source IP of 220.255.141.181. The shift right also explained why the same TTL was used when the first byte differed by one (see the third group above for an example: 4.x.x.x and 5.x.x.x have the same TTL).
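As a quick check of the TTL relationship, here is a small Python sketch. The function name `ttl_from_ip` is hypothetical. The shift is parenthesized explicitly because in Python `+` binds more tightly than `>>`, and the shift has to happen before 40 is added.

```python
def ttl_from_ip(ip: str) -> int:
    """TTL predicted from the first byte of the spoofed source address."""
    first_byte = int(ip.split(".")[0])
    # Shift first, then add the constant offset of 40.
    return (first_byte >> 1) + 40


# Source addresses taken from the captures and the later packet described above.
for ip in ("1.181.207.7", "2.10.30.2", "4.72.42.184", "5.73.44.84", "220.255.141.181"):
    print(ip, ttl_from_ip(ip))
```

The 4.x.x.x and 5.x.x.x addresses both map to 42 because the shift discards the lowest bit of the first byte.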
I then spotted that the DNS ID field was also related to the IP address.
1.181.207.7.46337 > x.x.x.x.53: 65098+
Converted to hex:
01.b5.cf.07.b501 > x.x.x.x.35: fe4a+
It's probably not obvious at first glance (unless you're a hex-level hacker), but fe4a is the one's complement of 01b5. So the DNS ID field was simply the one's complement of the first two bytes of the source IP.
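The one's complement is just a bitwise NOT limited to 16 bits. Here is a minimal Python sketch, with the hypothetical helper name `dns_id_from_ip`. The mask is needed because Python integers are unbounded, so `~` on its own would produce a negative number.

```python
def dns_id_from_ip(ip: str) -> int:
    """DNS ID predicted as the 16-bit one's complement of the first two address bytes."""
    first, second = (int(part) for part in ip.split(".")[:2])
    prefix = (first << 8) | second   # 1.181.x.x -> 0x01b5
    return ~prefix & 0xFFFF          # flip every bit, keep the low 16


dns_id = dns_id_from_ip("1.181.207.7")
print(f"{dns_id} = 0x{dns_id:04x}")
```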
Finally, Marek gave me a pcap, and I had one more relationship to find. The ID value in the IPv4 header was also related to the random source IP address—in fact, it was just the first two bytes of the IP address.
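Putting the four relationships together, every other field can be derived from the source address alone. The sketch below is only an illustration: `Predicted` and `predict` are made-up names, and the IP ID is computed but not checked here because the pcap values are not reproduced in this post.

```python
from typing import NamedTuple


class Predicted(NamedTuple):
    ttl: int
    src_port: int
    dns_id: int
    ip_id: int


def predict(src_ip: str) -> Predicted:
    """Derive the attack tool's header fields from the spoofed source IP."""
    b0, b1 = (int(part) for part in src_ip.split(".")[:2])
    prefix = (b0 << 8) | b1
    return Predicted(
        ttl=(b0 >> 1) + 40,        # first byte shifted right, plus 40
        src_port=(b1 << 8) | b0,   # first two bytes, swapped
        dns_id=~prefix & 0xFFFF,   # 16-bit one's complement of the prefix
        ip_id=prefix,              # first two bytes, unchanged
    )


# (TTL, source IP, source port, DNS ID) copied from the tcpdump output above.
SAMPLES = [
    (40, "1.178.97.141", 45569, 65101),
    (41, "2.109.69.229", 27906, 64914),
    (42, "4.240.78.0", 61444, 64271),
    (42, "5.73.44.84", 18693, 64182),
]

for ttl, ip, port, dns_id in SAMPLES:
    p = predict(ip)
    print(ip, (p.ttl, p.src_port, p.dns_id) == (ttl, port, dns_id))
```

A check like this could, in principle, be turned into a filter: a packet whose TTL, source port, and IDs all agree with its own source address is very unlikely to be legitimate traffic.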
Mystery
One mystery remains (and there's a CloudFlare T-shirt for the person with the most convincing explanation): How are the random source IPs being generated?
The answer might be boring (i.e. it might just be reading bytes from /dev/random), but given the author's love of relationships between fields, perhaps there's something else going on. We've seen IP addresses get reused, which leads us to think that there's something to discover here.
Here's a sequence of actual source IPs seen:
218.254.187.151
8.187.160.236
123.73.134.186
68.133.199.20
205.26.91.155
169.235.56.120
96.160.119.221
44.226.72.236
205.26.91.155
140.206.27.92
70.62.151.0
161.98.197.249
We have no real guarantee that the attacker generated them in this order, but perhaps there's an interesting way these are being generated.
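As a starting point for anyone attempting this, here is a small Python sketch that finds the reuse in the sequence above. The helper name `repeated` is hypothetical, and the positions are zero-based indexes into the list as printed here, which may not reflect the order in which the attacker generated the addresses.

```python
from collections import defaultdict

# Source IPs in the order listed above.
SOURCE_IPS = [
    "218.254.187.151", "8.187.160.236", "123.73.134.186", "68.133.199.20",
    "205.26.91.155", "169.235.56.120", "96.160.119.221", "44.226.72.236",
    "205.26.91.155", "140.206.27.92", "70.62.151.0", "161.98.197.249",
]


def repeated(ips):
    """Map each address seen more than once to the positions where it occurs."""
    positions = defaultdict(list)
    for index, ip in enumerate(ips):
        positions[ip].append(index)
    return {ip: seen for ip, seen in positions.items() if len(seen) > 1}


print(repeated(SOURCE_IPS))  # {'205.26.91.155': [4, 8]}
```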
Answers in comments if you manage to figure it out!