In a previous blog post we discussed how we use the TPROXY iptables module to power Cloudflare Spectrum. With TPROXY we solved a major technical issue on the server side, and we thought we might find another use for it on the client side of our product.
This is an Addressograph. Source: Wikipedia
When building an application level proxy, the first consideration is always about retaining real client source IP addresses. Some protocols make it easy, e.g. HTTP has a defined
X-Forwarded-For header, but there isn't a similar thing for generic TCP tunnels.
Others have faced this problem before us, and have devised three general solutions:
(1) Ignore the client IP
For certain applications it may be okay to ignore the real client IP address. For example, sometimes the client needs to identify itself with a username and password anyway, so the source IP doesn't really matter. In general, it's not a good practice because...
(2) Nonstandard TCP header
A second method was developed by Akamai: the client IP is saved inside a custom option in the TCP header in the SYN packet. Early implementations of this method weren't conforming to any standards, e.g. using option field 28, but recently RFC7974 was ratified for this option. We don't support this method for a number of reasons:
The space in TCP headers is very limited. It's insufficient to store the full 128 bits of client IPv6 addresses, especially with 15%+ of Cloudflare’s traffic being IPv6.
No software or hardware supports RFC7974 yet.
It's surprisingly hard to add support for RFC7974 in real world applications. One option is to patch the operating system and override the accept4(2) syscalls, another is to use getsockopt(TCP_SAVED_SYN) to extract the client IP from a SYN packet in the userspace application. Neither technique is simple.
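To give a feel for why the userspace route is fiddly, here is a rough sketch of the option-parsing half of the job. It assumes IPv4, assumes the client IP travels as a 4-byte payload in option kind 28 (the pre-RFC number mentioned above), and assumes the listening socket had TCP_SAVE_SYN enabled before the connection arrived. The function names are ours, made up for illustration.

```python
import socket

# Linux values from <linux/tcp.h>, defined by hand in case the Python
# build does not export them.
TCP_SAVE_SYN = 27    # set to 1 on the listening socket before accept()
TCP_SAVED_SYN = 28   # read on the accepted socket to fetch the SYN headers

CLIENT_IP_OPTION_KIND = 28  # assumption: the non-standard option number


def find_tcp_option(headers: bytes, kind: int):
    """Return the payload of TCP option `kind` from raw IPv4+TCP headers, or None."""
    ip_header_len = (headers[0] & 0x0F) * 4      # IHL is in 32-bit words
    tcp = headers[ip_header_len:]
    tcp_header_len = (tcp[12] >> 4) * 4          # data offset, also in words
    options = tcp[20:tcp_header_len]
    i = 0
    while i < len(options):
        opt = options[i]
        if opt == 0:                             # End of Option List
            break
        if opt == 1:                             # No-Operation, single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                                # malformed option, stop parsing
        if opt == kind:
            return options[i + 2:i + length]
        i += length
    return None


def client_ip_from_saved_syn(conn: socket.socket):
    """Hypothetical helper: pull an IPv4 client address out of the saved SYN."""
    syn = conn.getsockopt(socket.IPPROTO_TCP, TCP_SAVED_SYN, 256)
    payload = find_tcp_option(syn, CLIENT_IP_OPTION_KIND)
    if payload is not None and len(payload) == 4:
        return socket.inet_ntoa(payload)
    return None
```

Even this small sketch ignores IPv6 and the RFC7974 option layout, which is part of the reason we didn't go down this path.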
(3) Use the PROXY protocol
Finally, there is the last method. HAProxy developers, faced with this problem, developed the "PROXY protocol". The premise of this protocol is to prepend client metadata in front of the original data stream. For example, this string could be sent to the origin server in front of proxied data:
PROXY TCP4 192.0.2.123 22.214.171.124 19235 80\r\n
As you can see, the PROXY protocol is rather trivial to implement, and is generally sufficient for most use cases. However, it requires application support. The PROXY protocol (v1) is supported by Cloudflare Spectrum, and we highly encourage using it over other methods of keeping client source IP addresses.
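To back up the "rather trivial" claim, here is a minimal sketch of a v1 header parser. The function name and the addresses in the usage example are our own (documentation ranges, not real Cloudflare or client addresses), and this is an illustration rather than the parser any particular server uses.

```python
import ipaddress


def parse_proxy_v1(line: bytes):
    """Parse a PROXY protocol v1 header line.

    Returns (src_ip, src_port, dst_ip, dst_port), or None for "PROXY UNKNOWN".
    Raises ValueError on malformed input.
    """
    if not line.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    fields = line[:-2].decode("ascii").split(" ")
    if len(fields) < 2 or fields[0] != "PROXY":
        raise ValueError("missing PROXY signature")
    if fields[1] == "UNKNOWN":
        return None                       # the sender has no client info
    if len(fields) != 6 or fields[1] not in ("TCP4", "TCP6"):
        raise ValueError("malformed PROXY header")
    version = 4 if fields[1] == "TCP4" else 6
    src = ipaddress.ip_address(fields[2])
    dst = ipaddress.ip_address(fields[3])
    if src.version != version or dst.version != version:
        raise ValueError("address family mismatch")
    src_port, dst_port = int(fields[4]), int(fields[5])
    if not (0 <= src_port <= 65535 and 0 <= dst_port <= 65535):
        raise ValueError("port out of range")
    return str(src), src_port, str(dst), dst_port


# Field order is: source IP, destination IP, source port, destination port.
header = b"PROXY TCP4 192.0.2.123 198.51.100.7 19235 80\r\n"
print(parse_proxy_v1(header))
```

The hard part is not the parsing, it is getting the application to expect this line in the first place.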
Mmproxy to the rescue
But sometimes adding PROXY protocol support to the application isn't an option. This can be the case when the application isn’t open source, or when it's hard to edit. A good example is "sshd" - it doesn't support PROXY protocol and adding the support would be far from trivial. For such applications it may just be impossible to use any application level load balancer whatsoever. This is very unfortunate.
Fortunately we think we found a workaround.
Allow me to present
mmproxy, a PROXY protocol gateway.
mmproxy listens for remote connections coming from an application level load balancer, like Spectrum. It then reads a PROXY protocol header, opens a localhost connection to the target application, and duly proxies data in and out.
Such a proxy wouldn't be too useful if not for one feature—the localhost connection from
mmproxy to the target application is sent with a real client source IP.
mmproxy spoofs the client IP address. From the application’s point of view, this spoofed connection, coming through Spectrum and
mmproxy, is indistinguishable from a real one, connecting directly to the application.
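One detail a gateway like this has to get right is reading the PROXY header without swallowing any of the application data that follows it. A simple (if slow) way to sketch that is to read one byte at a time until the CRLF. The helper name below is hypothetical; the 107-byte cap is the maximum v1 line length from the PROXY protocol specification.

```python
import socket

MAX_V1_HEADER = 107  # longest v1 header line allowed by the spec, CRLF included


def read_proxy_line(conn: socket.socket) -> bytes:
    """Read exactly one CRLF-terminated header line, leaving the payload unread."""
    line = bytearray()
    while not line.endswith(b"\r\n"):
        if len(line) >= MAX_V1_HEADER:
            raise ValueError("PROXY header too long")
        byte = conn.recv(1)
        if not byte:
            raise ConnectionError("peer closed before sending a full header")
        line += byte
    return bytes(line)


# Usage: simulate a load balancer sending a header followed by client data.
lb_side, gateway_side = socket.socketpair()
lb_side.sendall(b"PROXY TCP4 192.0.2.123 198.51.100.7 19235 80\r\nSSH-2.0-test\r\n")
print(read_proxy_line(gateway_side))   # just the header line
print(gateway_side.recv(64))           # the untouched client payload
lb_side.close()
gateway_side.close()
```

Everything after the header can then be relayed to the target application byte for byte.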
This technique requires some Linux routing trickery. The mmproxy daemon will walk you through the necessary details, but here are the important bits:
- mmproxy works only on Linux.
- Since it forwards traffic over the loopback interface, it must be run on the same machine as the target application.
- It requires kernel 2.6.28 or newer.
- It guides the user to add four iptables firewall rules, and four iproute2 routing rules, covering both IPv4 and IPv6.
- For IPv4, it needs the route_localnet sysctl to be set.
- For IPv6, it needs a working IPv6 configuration. A working ping6 cloudflare.com is a prerequisite.
- mmproxy needs root or CAP_NET_RAW permissions to set the IP_TRANSPARENT socket option. Once started, it jails itself with seccomp-bpf for a bit of added security.
How to run mmproxy
To run mmproxy, first download the source and compile it:
git clone https://github.com/cloudflare/mmproxy.git --recursive
cd mmproxy
make
Please report any issues on GitHub.
Then set up the needed configuration:
sudo iptables -t mangle -I PREROUTING -m mark --mark 123 -j CONNMARK --save-mark
sudo iptables -t mangle -I OUTPUT -m connmark --mark 123 -j CONNMARK --restore-mark
sudo ip rule add fwmark 123 lookup 100
sudo ip route add local 0.0.0.0/0 dev lo table 100
sudo ip6tables -t mangle -I PREROUTING -m mark --mark 123 -j CONNMARK --save-mark
sudo ip6tables -t mangle -I OUTPUT -m connmark --mark 123 -j CONNMARK --restore-mark
sudo ip -6 rule add fwmark 123 lookup 100
sudo ip -6 route add local ::/0 dev lo table 100
You will also need route_localnet to be set on your default outbound interface, for example for eth0:
echo 1 | sudo tee /proc/sys/net/ipv4/conf/eth0/route_localnet
Finally, verify your IPv6 connectivity:
$ ping6 cloudflare.com
PING cloudflare.com(2400:cb00:2048:1::c629:d6a2) 56 data bytes
64 bytes from 2400:cb00:2048:1::c629:d6a2: icmp_seq=1 ttl=61 time=0.650 ms
Now, you are ready to run
mmproxy. For example, forwarding localhost SSH would look like this:
$ sudo ./mmproxy --allowed-subnets ./cloudflare-ip-ranges.txt \
    -l 0.0.0.0:2222 \
    -4 127.0.0.1:22 -6 '[::1]:22'
[email protected]:~# ./mmproxy -a cloudflare-ip-ranges.txt -l 0.0.0.0:2222 -4 127.0.0.1:22 -6 [::1]:22
[ ] Remember to set the reverse routing rules correctly:
iptables -t mangle -I PREROUTING -m mark --mark 123 -m comment --comment mmproxy -j CONNMARK --save-mark # [+] VERIFIED
iptables -t mangle -I OUTPUT -m connmark --mark 123 -m comment --comment mmproxy -j CONNMARK --restore-mark # [+] VERIFIED
ip6tables -t mangle -I PREROUTING -m mark --mark 123 -m comment --comment mmproxy -j CONNMARK --save-mark # [+] VERIFIED
ip6tables -t mangle -I OUTPUT -m connmark --mark 123 -m comment --comment mmproxy -j CONNMARK --restore-mark # [+] VERIFIED
ip rule add fwmark 123 lookup 100 # [+] VERIFIED
ip route add local 0.0.0.0/0 dev lo table 100 # [+] VERIFIED
ip -6 rule add fwmark 123 lookup 100 # [+] VERIFIED
ip -6 route add local ::/0 dev lo table 100 # [+] VERIFIED
[+] OK. Routing to 127.0.0.1 points to a local machine.
[+] OK. Target server 127.0.0.1:22 is up and reachable using conventional connection.
[+] OK. Target server 127.0.0.1:22 is up and reachable using spoofed connection.
[+] OK. Routing to ::1 points to a local machine.
[+] OK. Target server [::1]:22 is up and reachable using conventional connection.
[+] OK. Target server [::1]:22 is up and reachable using spoofed connection.
[+] Listening on 0.0.0.0:2222
mmproxy performs a number of self checks. Since we prepared the necessary routing and firewall rules, its self check passes with a "VERIFIED" mark. It's important to confirm these pass.
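The --allowed-subnets option matters because anyone who can reach the listening port could otherwise claim any client IP they like in the PROXY header. Here is a rough sketch of what such a check could look like. We are assuming a file with one CIDR range per line and optional "#" comments; the actual file format and the function names are not taken from mmproxy, and the ranges below are documentation prefixes rather than real Cloudflare ranges.

```python
import ipaddress


def load_subnets(lines):
    """Parse CIDR strings, skipping blank lines and '#' comments (assumed format)."""
    nets = []
    for raw in lines:
        entry = raw.split("#", 1)[0].strip()
        if entry:
            nets.append(ipaddress.ip_network(entry, strict=False))
    return nets


def is_allowed(peer_ip: str, subnets) -> bool:
    """True if the connecting peer falls inside one of the trusted ranges."""
    addr = ipaddress.ip_address(peer_ip)
    return any(addr.version == net.version and addr in net for net in subnets)


# Usage with placeholder ranges standing in for the load balancer's addresses.
trusted = load_subnets(["192.0.2.0/24", "# IPv6 range", "2001:db8::/32"])
print(is_allowed("192.0.2.55", trusted))
print(is_allowed("198.51.100.1", trusted))
```

Only connections from a trusted range should have their PROXY header believed.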
We're almost ready to go! The last step is to create a Spectrum application that sends PROXY protocol traffic to
mmproxy, port 2222. Here is an example configuration:
With Spectrum we are forwarding TCP/22 on domain "ssh.example.org", to our origin at 192.0.2.1, port 2222. We’ve enabled the PROXY protocol toggle.
mmproxy in action
Now we can see if it works. My testing VPS has IP address 126.96.36.199. Let's see if the whole setup behaves:
vps$ nc ssh.example.org 22 SSH-2.0-OpenSSH_7.2p2 Ubuntu-4ubuntu2.1
Hurray, this worked! The "ssh.example.org" on port 22 is indeed tunneled over Spectrum. Let's see the mmproxy logs:
[+] 188.8.131.52:32654 connected, proxy protocol source 184.108.40.206:0, local destination 127.0.0.1:22
The log confirmed what happened - Cloudflare IP 220.127.116.11 has connected, advertised client IP 18.104.22.168 over the PROXY protocol, and established a spoofed connection to 127.0.0.1:22. The ssh daemon logs show:
$ tail /var/log/auth.log Apr 15 14:39:09 ubuntu sshd: Did not receive identification string from 22.214.171.124
Hurray! It all works! sshd recorded the real client IP address and, thanks to mmproxy, never saw that the traffic was actually flowing through Cloudflare Spectrum.
Under the hood
Under the hood
mmproxy relies on two hacks.
The first hack is about setting source IP on outgoing connections. We are using the well known bind-before-connect technique to do this.
Normally, it's only possible to set a valid source IP that is actually handled by a local machine. We can override this by using the
IP_TRANSPARENT socket option. With it set, we can select arbitrary source IP addresses before establishing a legitimate connection handled by the kernel. For example, we can have a localhost socket between, say, 126.96.36.199 and 127.0.0.1, even though that first address may not be explicitly assigned to our server.
It's worth saying that
IP_TRANSPARENT was not created for this use case. This socket option was added specifically to support the TPROXY module.
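Here is roughly what bind-before-connect with a transparent socket looks like. This is a sketch under a few assumptions: the numeric constants are the Linux values, the packet mark is applied with SO_MARK (the text only says a MARK of 123 is set), and the function names are ours. It needs root or the relevant capabilities, and the return path only works once the routing rules are in place.

```python
import socket

# Linux values, used as fallbacks when the socket module does not export them.
IP_TRANSPARENT = getattr(socket, "IP_TRANSPARENT", 19)
IPV6_TRANSPARENT = 75                      # from <linux/in6.h>
SO_MARK = getattr(socket, "SO_MARK", 36)


def transparent_sockopt(family):
    """Return the (level, option) pair that enables transparent mode."""
    if family == socket.AF_INET:
        return socket.IPPROTO_IP, IP_TRANSPARENT
    if family == socket.AF_INET6:
        return socket.IPPROTO_IPV6, IPV6_TRANSPARENT
    raise ValueError("unsupported address family")


def spoofed_connect(client, target, mark=123):
    """Connect to `target` (ip, port) with `client` (ip, port) as the source."""
    family = socket.AF_INET6 if ":" in client[0] else socket.AF_INET
    sock = socket.socket(family, socket.SOCK_STREAM)
    try:
        level, option = transparent_sockopt(family)
        sock.setsockopt(level, option, 1)                   # allow a non-local source IP
        sock.setsockopt(socket.SOL_SOCKET, SO_MARK, mark)   # tag packets for CONNMARK
        sock.bind(client)                                   # bind before connect
        sock.connect(target)
        return sock
    except OSError:
        sock.close()
        raise


# Example call (privileged, hence commented out):
# conn = spoofed_connect(("192.0.2.123", 0), ("127.0.0.1", 22))
```

Binding to port 0 lets the kernel pick a free source port for the spoofed address.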
The second hack is about routing. Normally, response packets coming from the application are routed to the Internet - via a default gateway. We must prevent that from happening, and instead direct these packets towards the loopback interface. To achieve this, we rely on
CONNMARK and an additional routing table selected by fwmark.
mmproxy sets a MARK value of 123 (by default) on packets it sends, which is preserved at the
CONNMARK layer, and restored for the return packets. Then we route the packets with MARK == 123 to a specific routing table (number 100 by default), which force-routes everything back to the loopback interface. We do this by totally abusing the AnyIP trick and assigning 0.0.0.0/0 to "local" - meaning that the entire internet shall be treated as belonging to our machine.
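Since the mark and the table number are only defaults, it can help to see how the two values thread through all eight rules. The little generator below is a hypothetical helper of ours, not part of mmproxy; it simply reproduces the commands from the setup section for a chosen mark and table.

```python
def routing_commands(mark: int = 123, table: int = 100):
    """Return the eight setup commands for a given packet mark and routing table."""
    cmds = []
    for ipt, ip, everything in (("iptables", "ip", "0.0.0.0/0"),
                                ("ip6tables", "ip -6", "::/0")):
        cmds += [
            # Copy the packet mark onto the connection when mmproxy's packets arrive.
            f"{ipt} -t mangle -I PREROUTING -m mark --mark {mark} -j CONNMARK --save-mark",
            # Put the mark back on the application's reply packets.
            f"{ipt} -t mangle -I OUTPUT -m connmark --mark {mark} -j CONNMARK --restore-mark",
            # Marked packets consult the dedicated routing table...
            f"{ip} rule add fwmark {mark} lookup {table}",
            # ...which treats every address as local, so replies stay on loopback.
            f"{ip} route add local {everything} dev lo table {table}",
        ]
    return cmds


for command in routing_commands():
    print("sudo " + command)
```

The same mark has to appear in the firewall rules, the ip rule and the daemon itself, otherwise replies leak out via the default gateway.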
mmproxy is not the only tool that uses this IP spoofing technique to preserve real client IP addresses. One example is OpenBSD's
relayd "transparent" mode. Another is the
pen load balancer. Compared to
mmproxy, these tools look heavyweight and require more complex routing.
mmproxy is the first daemon to do just one thing: unwrap the PROXY protocol and spoof the client IP address on locally running connections going to the application process. While it requires some firewall and routing setup, it's small enough to make an
mmproxy deployment acceptable in many situations.
We hope that
mmproxy, while a gigantic hack, could help some of our customers with onboarding onto Cloudflare Spectrum.
However, frankly speaking - we don't know.
mmproxy should be treated as a great experiment. If you find it useful, let us know! If you find a problem, please report it!
We are looking for feedback. If our users find the mmproxy approach useful, we will repackage it and release it as an easier-to-use tool.
Does doing low level socket work sound interesting? Join our world famous team in London, Austin, San Francisco, Champaign and our elite office in Warsaw, Poland.