In a previous blog post we discussed how we use the TPROXY iptables module to power Cloudflare Spectrum. With TPROXY we solved a major technical issue on the server side, and we thought we might find another use for it on the client side of our product.
This is an Addressograph. Source: Wikipedia
When building an application level proxy, the first consideration is always about retaining real client source IP addresses. Some protocols make it easy, e.g. HTTP has a defined X-Forwarded-For header[1], but there isn't a similar thing for generic TCP tunnels.
Others have faced this problem before us, and have devised three general solutions:
(1) Ignore the client IP
For certain applications it may be okay to ignore the real client IP address. For example, sometimes the client needs to identify itself with a username and password anyway, so the source IP doesn't really matter. In general, it's not a good practice because...
(2) Nonstandard TCP header
A second method was developed by Akamai: the client IP is saved inside a custom option in the TCP header in the SYN packet. Early implementations of this method weren't conforming to any standards, e.g. using option field 28, but recently RFC7974 was ratified for this option. We don't support this method for a number of reasons:
The space in TCP headers is very limited. It's insufficient to store the full 128 bits of client IPv6 addresses, especially with 15%+ of Cloudflare’s traffic being IPv6.
No software or hardware supports RFC7974 yet.
It's surprisingly hard to add support for RFC7974 in real world applications. One option is to patch the operating system and override the getpeername(2) and accept4(2) syscalls; another is to use getsockopt(TCP_SAVED_SYN) to extract the client IP from the SYN packet in the userspace application. Neither technique is simple.
(3) Use the PROXY protocol
Finally, there is the last method. HAProxy developers, faced with this problem, developed the "PROXY protocol". The premise of this protocol is to prepend client metadata in front of the original data stream. For example, this string could be sent to the origin server in front of proxied data:
PROXY TCP4 192.0.2.123 104.16.112.25 19235 80\r\n
As you can see, the PROXY protocol is rather trivial to implement, and is generally sufficient for most use cases. However, it requires application support. The PROXY protocol (v1) is supported by Cloudflare Spectrum, and we highly encourage using it over other methods of keeping client source IP addresses.
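To make the wire format concrete, here is a minimal Python sketch of building and parsing a version 1 header for TCP over IPv4 or IPv6. The names build_proxy_v1, parse_proxy_v1 and ProxyHeader are made up for this illustration; they are not part of HAProxy, Spectrum or mmproxy. The parser also skips the UNKNOWN form and most of the validation a production implementation would need.

```python
import ipaddress
from typing import NamedTuple, Tuple


class ProxyHeader(NamedTuple):
    family: str      # "TCP4" or "TCP6"
    src_ip: str      # real client address
    dst_ip: str      # address the client connected to
    src_port: int
    dst_port: int


def build_proxy_v1(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Return the header line a proxy sends before the proxied data."""
    version = ipaddress.ip_address(src_ip).version
    if ipaddress.ip_address(dst_ip).version != version:
        raise ValueError("source and destination must share an address family")
    family = "TCP4" if version == 4 else "TCP6"
    line = f"PROXY {family} {src_ip} {dst_ip} {src_port} {dst_port}\r\n"
    return line.encode("ascii")


def parse_proxy_v1(data: bytes) -> Tuple[ProxyHeader, bytes]:
    """Split received bytes into (header, remaining payload)."""
    line, sep, rest = data.partition(b"\r\n")
    if not sep:
        raise ValueError("incomplete header: no CRLF seen yet")
    fields = line.decode("ascii").split(" ")
    if len(fields) != 6 or fields[0] != "PROXY" or fields[1] not in ("TCP4", "TCP6"):
        raise ValueError("not a PROXY v1 TCP header")
    _, family, src_ip, dst_ip, src_port, dst_port = fields
    # ip_address() raises ValueError on malformed addresses.
    ipaddress.ip_address(src_ip)
    ipaddress.ip_address(dst_ip)
    return ProxyHeader(family, src_ip, dst_ip, int(src_port), int(dst_port)), rest
```

Calling build_proxy_v1("192.0.2.123", "104.16.112.25", 19235, 80) yields the example line shown above. On the receiving side, parse_proxy_v1 hands back the client address and whatever application bytes arrived after the header.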
Mmproxy to the rescue
But sometimes adding PROXY protocol support to the application isn't an option. This can be the case when the application isn’t open source, or when it's hard to edit. A good example is "sshd" - it doesn't support PROXY protocol and adding the support would be far from trivial. For such applications it may just be impossible to use any application level load balancer whatsoever. This is very unfortunate.
Fortunately we think we found a workaround.
Allow me to present mmproxy, a PROXY protocol gateway. mmproxy listens for remote connections coming from an application level load balancer, like Spectrum. It then reads a PROXY protocol header, opens a localhost connection to the target application, and duly proxies data in and out.
Such a proxy wouldn't be too useful if not for one feature: the localhost connection from mmproxy to the target application is sent with a real client source IP.
That's right, mmproxy spoofs the client IP address. From the application’s point of view, this spoofed connection, coming through Spectrum and mmproxy, is indistinguishable from a real one, connecting directly to the application.
This technique requires some Linux routing trickery. The mmproxy daemon will walk you through the necessary details, but here are the important bits:
mmproxy works only on Linux.
Since it forwards traffic over the loopback interface, it must be run on the same machine as the target application.
It requires kernel 2.6.28 or newer.
It guides the user to add four iptables firewall rules, and four iproute2 routing rules, covering both IPv4 and IPv6.
For IPv4, mmproxy requires the route_localnet sysctl to be set.
For IPv6, it needs a working IPv6 configuration. A working ping6 cloudflare.com is a prerequisite.
mmproxy needs root or CAP_NET_RAW permissions to set the IP_TRANSPARENT socket option. Once started, it jails itself with seccomp-bpf for a bit of added security.
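Two of the prerequisites above, the kernel version and the route_localnet sysctl, are easy to check before starting. The following Python sketch is only an illustration: the helper names kernel_at_least and route_localnet_enabled are invented here, and eth0 is an assumed interface name. mmproxy runs its own self checks on startup, so this is not how the daemon itself does it.

```python
import platform
import re
from pathlib import Path


def kernel_at_least(release: str, minimum=(2, 6, 28)) -> bool:
    """Compare a release string such as '5.15.0-91-generic' with a minimum."""
    numbers = [int(n) for n in re.findall(r"\d+", release.split("-")[0])][:3]
    numbers += [0] * (3 - len(numbers))   # '6.1' is treated as (6, 1, 0)
    return tuple(numbers) >= tuple(minimum)


def route_localnet_enabled(interface: str) -> bool:
    """Read the per-interface route_localnet sysctl; False if it is missing."""
    path = Path("/proc/sys/net/ipv4/conf") / interface / "route_localnet"
    try:
        return path.read_text().strip() == "1"
    except OSError:
        return False


if __name__ == "__main__":
    print("kernel ok:", kernel_at_least(platform.release()))
    # "eth0" is an assumption; use your default outbound interface.
    print("route_localnet on eth0:", route_localnet_enabled("eth0"))
```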
How to run mmproxy
To run mmproxy, first download the source and compile it:
git clone https://github.com/cloudflare/mmproxy.git --recursive
cd mmproxy
make
Please report any issues on GitHub.
Then set up the needed configuration:
sudo iptables -t mangle -I PREROUTING -m mark --mark 123 -j CONNMARK --save-mark
sudo iptables -t mangle -I OUTPUT -m connmark --mark 123 -j CONNMARK --restore-mark
sudo ip rule add fwmark 123 lookup 100
sudo ip route add local 0.0.0.0/0 dev lo table 100
sudo ip6tables -t mangle -I PREROUTING -m mark --mark 123 -j CONNMARK --save-mark
sudo ip6tables -t mangle -I OUTPUT -m connmark --mark 123 -j CONNMARK --restore-mark
sudo ip -6 rule add fwmark 123 lookup 100
sudo ip -6 route add local ::/0 dev lo table 100
You will also need route_localnet to be set on your default outbound interface, for example for eth0:
echo 1 | sudo tee /proc/sys/net/ipv4/conf/eth0/route_localnet
Finally, verify your IPv6 connectivity:
$ ping6 cloudflare.com
PING cloudflare.com(2400:cb00:2048:1::c629:d6a2) 56 data bytes
64 bytes from 2400:cb00:2048:1::c629:d6a2: icmp_seq=1 ttl=61 time=0.650 ms
Now you are ready to run mmproxy. For example, forwarding localhost SSH would look like this:
$ sudo ./mmproxy --allowed-subnets ./cloudflare-ip-ranges.txt \
-l 0.0.0.0:2222 \
-4 127.0.0.1:22 -6 '[::1]:22'
root@ubuntu:~# ./mmproxy -a cloudflare-ip-ranges.txt -l 0.0.0.0:2222 -4 127.0.0.1:22 -6 [::1]:22
[ ] Remember to set the reverse routing rules correctly:
iptables -t mangle -I PREROUTING -m mark --mark 123 -m comment --comment mmproxy -j CONNMARK --save-mark # [+] VERIFIED
iptables -t mangle -I OUTPUT -m connmark --mark 123 -m comment --comment mmproxy -j CONNMARK --restore-mark # [+] VERIFIED
ip6tables -t mangle -I PREROUTING -m mark --mark 123 -m comment --comment mmproxy -j CONNMARK --save-mark # [+] VERIFIED
ip6tables -t mangle -I OUTPUT -m connmark --mark 123 -m comment --comment mmproxy -j CONNMARK --restore-mark # [+] VERIFIED
ip rule add fwmark 123 lookup 100 # [+] VERIFIED
ip route add local 0.0.0.0/0 dev lo table 100 # [+] VERIFIED
ip -6 rule add fwmark 123 lookup 100 # [+] VERIFIED
ip -6 route add local ::/0 dev lo table 100 # [+] VERIFIED
[+] OK. Routing to 127.0.0.1 points to a local machine.
[+] OK. Target server 127.0.0.1:22 is up and reachable using conventional connection.
[+] OK. Target server 127.0.0.1:22 is up and reachable using spoofed connection.
[+] OK. Routing to ::1 points to a local machine.
[+] OK. Target server [::1]:22 is up and reachable using conventional connection.
[+] OK. Target server [::1]:22 is up and reachable using spoofed connection.
[+] Listening on 0.0.0.0:2222
On startup, mmproxy performs a number of self checks. Since we prepared the necessary routing and firewall rules, each rule is reported with a "VERIFIED" mark. It's important to confirm that all of these checks pass.
We're almost ready to go! The last step is to create a Spectrum application that sends PROXY protocol traffic to mmproxy, port 2222. Here is an example configuration[2]:
With Spectrum we are forwarding TCP/22 on domain "ssh.example.org", to our origin at 192.0.2.1, port 2222. We’ve enabled the PROXY protocol toggle.
mmproxy in action
Now we can see if it works. My testing VPS has IP address 79.1.2.3. Let's see if the whole setup behaves:
vps$ nc ssh.example.org 22
SSH-2.0-OpenSSH_7.2p2 Ubuntu-4ubuntu2.1
Hurray, this worked! The "ssh.example.org" on port 22 is indeed tunneled over Spectrum. Let's look at the mmproxy logs:
[+] 172.68.136.1:32654 connected, proxy protocol source 79.1.2.3:0,
local destination 127.0.0.1:22
The log confirmed what happened - Cloudflare IP 172.68.136.1 has connected, advertised client IP 79.1.2.3 over the PROXY protocol, and established a spoofed connection to 127.0.0.1:22. The ssh daemon logs show:
$ tail /var/log/auth.log
Apr 15 14:39:09 ubuntu sshd[7703]: Did not receive identification
string from 79.1.2.3
Hurray! All works! sshd recorded the real client IP address, and with mmproxy’s help it never saw that the traffic was actually flowing through Cloudflare Spectrum.
Under the hood
Under the hood mmproxy relies on two hacks.
The first hack is about setting source IP on outgoing connections. We are using the well known bind-before-connect technique to do this.
Normally, it's only possible to set a valid source IP that is actually handled by the local machine. We can override this by using the IP_TRANSPARENT socket option. With it set, we can select an arbitrary source IP address before establishing a legitimate connection handled by the kernel. For example, we can have a localhost socket between, say, 8.8.8.8 and 127.0.0.1, even though 8.8.8.8 may not be explicitly assigned to our server.
It's worth saying that IP_TRANSPARENT was not created for this use case. This socket option was specifically added to support the TPROXY module.
The second hack is about routing. Normally, response packets coming from the application are routed to the Internet - via a default gateway. We must prevent that from happening, and instead direct these packets towards the loopback interface. To achieve this, we rely on CONNMARK and an additional routing table selected by fwmark. mmproxy sets a MARK value of 123 (by default) on packets it sends, which is preserved at the CONNMARK layer, and restored for the return packets. Then we route the packets with MARK == 123 to a specific routing table (number 100 by default), which force-routes everything back to the loopback interface. We do this by totally abusing the AnyIP trick and assigning 0.0.0.0/0 to "local" - meaning that the entire internet shall be treated as belonging to our machine.
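The socket side of both hacks can be sketched in a few lines of Python. This is an illustration under stated assumptions, not mmproxy's actual code: the function name spoofed_connect is invented, using SO_MARK to set the packet mark is an assumption about how the mark is applied, and the numeric fallbacks are the Linux values of the constants for Python builds that don't export them. It needs root or the right capabilities, and the connection only completes once the iptables and ip rules from the setup section are in place.

```python
import ipaddress
import socket

# Linux constants, with fallbacks in case this Python build does not export them.
IP_TRANSPARENT = getattr(socket, "IP_TRANSPARENT", 19)
IPV6_TRANSPARENT = 75                      # from <linux/in6.h>
SO_MARK = getattr(socket, "SO_MARK", 36)   # value on most architectures


def spoofed_connect(client_ip, target, client_port=0, mark=123, timeout=5.0):
    """Connect to `target` using `client_ip` as the source address.

    Raises ValueError for a malformed address and OSError (for example
    PermissionError) when the process lacks the required privileges.
    """
    is_v6 = ipaddress.ip_address(client_ip).version == 6
    family = socket.AF_INET6 if is_v6 else socket.AF_INET
    sock = socket.socket(family, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        # Hack 1: allow binding to an address this machine does not own.
        if is_v6:
            sock.setsockopt(socket.IPPROTO_IPV6, IPV6_TRANSPARENT, 1)
        else:
            sock.setsockopt(socket.IPPROTO_IP, IP_TRANSPARENT, 1)
        # Hack 2: mark outgoing packets so CONNMARK and the fwmark rule
        # can steer the replies back to the loopback interface.
        sock.setsockopt(socket.SOL_SOCKET, SO_MARK, mark)
        # Bind before connect: port 0 lets the kernel pick a source port.
        sock.bind((client_ip, client_port))
        sock.connect(target)
        return sock
    except OSError:
        sock.close()
        raise
```

With the routing rules installed, spoofed_connect("8.8.8.8", ("127.0.0.1", 22)) would give the local sshd a connection that appears to come from 8.8.8.8. Without the rules the reply packets leave via the default gateway and the connect call times out.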
Summary
mmproxy is not the only tool that uses this IP spoofing technique to preserve real client IP addresses. One example is OpenBSD's relayd "transparent" mode. Another is the pen load balancer. Compared to mmproxy, these tools look heavyweight and require more complex routing.
mmproxy is the first daemon to do just one thing: unwrap the PROXY protocol and spoof the client IP address on locally running connections going to the application process. While it requires some firewall and routing setup, it's small enough to make an mmproxy deployment acceptable in many situations.
We hope that mmproxy, while a gigantic hack, could help some of our customers with onboarding onto Cloudflare Spectrum.
However, frankly speaking - we don't know. mmproxy should be treated as a great experiment. If you find it useful, let us know! If you find a problem, please report it! We are looking for feedback. If our users find the mmproxy approach useful, we will repackage it and release it as an easier to use tool.
Does low level socket work sound interesting? Join our world famous team in London, Austin, San Francisco, Champaign and our elite office in Warsaw, Poland.