Below is what we thought as of 12:27pm UTC. To verify our belief we crowd sourced the investigation. It turns out we were wrong. While it takes effort, it is possible to extract private SSL keys. The challenge was solved by Software Engineer Fedor Indutny and Ilkka Mattila at NCSC-FI roughly 9 hours after the challenge was first published. Fedor sent 2.5 million requests over the course of the day and Ilkka sent around 100K requests. Our recommendation based on this finding is that everyone reissue and revoke their private keys. CloudFlare has accelerated this effort on behalf of the customers whose SSL keys we manage. You can read more here.
The widely-used open source library OpenSSL revealed on Monday it had a major bug, now known as “heartbleed". By sending a specially crafted packet to a vulnerable server running an unpatched version of OpenSSL, an attacker can get up to 64kB of the server’s working memory. This is the result of a classic implementation bug known as a Buffer over-read
There has been speculation that this vulnerability could expose server certificate private keys, making those sites vulnerable to impersonation. This would be the disaster scenario, requiring virtually every service to reissue and revoke its SSL certificates. Note that simply reissuing certificates is not enough, you must revoke them as well.
Unfortunately, the certificate revocation process is far from perfect and was never built for revocation at mass scale. If every site revoked its certificates, it would impose a significant burden and performance penalty on the Internet. At CloudFlare scale the reissuance and revocation process could break the CA infrastructure. So, we’ve spent a significant amount of time talking to our CA partners in order to ensure that we can safely and successfully revoke and reissue our customers' certificates.
While the vulnerability seems likely to put private key data at risk, to date there have been no verified reports of actual private keys being exposed. At CloudFlare, we received early warning of the Heartbleed vulnerability and patched our systems 12 days ago. We’ve spent much of the time running extensive tests to figure out what can be exposed via Heartbleed and, specifically, to understand if private SSL key data was at risk.
Here’s the good news: after extensive testing on our software stack, we have been unable to successfully use Heartbleed on a vulnerable server to retrieve any private key data. Note that is not the same as saying it is impossible to use Heartbleed to get private keys. We do not yet feel comfortable saying that. However, if it is possible, it is at a minimum very hard. And, we have reason to believe based on the data structures used by OpenSSL and the modified version of NGINX that we use, that it may in fact be impossible.
To get more eyes on the problem, we have created a site so the world can challenge this hypothesis:
This site was created by CloudFlare engineers to be intentionally vulnerable to heartbleed. It is not running behind CloudFlare’s network. We encourage everyone to attempt to get the private key from this website. If someone is able to steal the private key from this site using heartbleed, we will post the full details here.
While we believe it is unlikely that private key data was exposed, we are proceeding with an abundance of caution. We’ve begun the process of reissuing and revoking the keys CloudFlare manages on behalf of our customers. In order to ensure that we don’t overburden the certificate authority resources, we are staging this process. We expect that it will be complete by early next week.
In the meantime, we’re hopeful we can get more assurance that SSL keys are safe through our crowd-sourced effort to hack them. To get everyone started, we wanted to outline the process we’ve embarked on to date in order to attempt to hack them.
A heartbeat is a message that is sent to the server just so the server can send it back. This lets a client know that the server is still connected and listening. The heartbleed bug was a mistake in the implementation of the response to a heartbeat message.
Here is the offending code
p = &s->s3->rrec.data
hbtype = *p++;
pl = p;
buffer = OPENSSL_malloc(1 + 2 + payload + padding);
bp = buffer;
memcpy(bp, pl, payload);
The incoming message is stored in a structure called
rrec, which contains the incoming request data. The code reads the type (finding out that it's a heartbeat) from the first byte, then reads the next two bytes which indicate the length of the heartbeat payload. In a valid heartbeat request, this length matches the length of the payload sent in the heartbeat request.
The major problem (and cause of heartbleed) is that the code does not check that this length is the actual length sent in the heartbeat request, allowing the request to ask for more data than it should be able to retrieve. The code then copies the amount of data indicated by the length from the incoming message to the outgoing message. If the length is longer than the incoming message, the software just keeps copying data past the end of the message. Since the length variable is 16 bits, you can request up to 65,535 bytes from memory. The data that lives past the end of the incoming message is from a kind of no-man’s land that the program should not be accessing and may contain data left behind from other parts of OpenSSL.
When processing a request that contains a longer length than the request payload, some of this unknown data is copied into the response and sent back to the client. This extra data can contain sensitive information like session cookies and passwords, as we describe in the next section.
The fix for this bug is simple: check that the length of the message actually matches the length of the incoming request. If it is too long, return nothing. That’s exactly what the OpenSSL patch does.
Malloc and the Heap
So what sort of data can live past the end of the request? The technical answer is “heap data,” but the more realistic answer is that it’s platform dependent.
On most computer systems, each process has its own set of working memory. Typically this is split into two data structures: the stack and the heap. This is the case on Linux, the operating system that CloudFlare runs on its servers.
The memory address with the highest value is where the stack data lives. This includes local working variables and non-persistent data storage for running a program. The lowest portion of the address space typically contains the program’s code, followed by static data needed by the program. Right above that is the heap, where all dynamically allocated data lives.
Managing data on the heap is done with the library calls
malloc (used to get memory) and
free (used to give it back when no longer needed). When you call
malloc, the program picks some unused space in the heap area and returns the address of the first part of it to you. Your program is then able to store data at that location. When you call
free, memory space is marked as unused. In most cases, the data that was stored in that space is just left there unmodified.
Every new allocation needs some unused space from the heap. Typically this is chosen to be at the lowest possible address that has enough room for the new allocation. A heap typically grows upwards; later allocations get higher addresses. If a block of data is allocated early it gets a low address and later allocations will get higher addresses, unless a big early block is freed.
This is of direct relevance because both the incoming message request (
s->s3->rrec.data) and the certificate private key are allocated on the heap with
malloc. The exploit reads data from the address of the incoming message. For previous requests that were allocated and freed, their data (including passwords and cookies) may still be in memory. If they are stored less than 65,536 bytes higher in the address space than the current request, the details can be revealed to an attacker.
Requests come and go, recycling memory at around the top of the heap. This makes extracting previous request data very likely from this attack. This is a important in understanding what you can and cannot get at using the vulnerability. Previous requests could contain password data, cookies or other exploitable data. Private keys are a different story; due to the way the heap is structured. The good news is this means that it is much less likely private SSL keys would be exposed.
Read up, not down
In NGINX, the keys are loaded immediately when the process is started, which puts the keys very low in the memory space. This makes it unlikely that incoming requests will be allocated with a lower address space. We tested this experimentally.
We modified our test version of NGINX to print out the location in memory of each request (
s->s3->rrec.data), whenever there was an incoming heartbeat. We compared this to the location in memory where the private key is stored and found that we could never get a request to be at a lower address than our private keys regardless of the number of requests we sent. Since the exploit only reads higher addresses, it could not be used to obtain private keys.
Here is a video of what searching for private keys looks like:
If NGINX is reloaded, it starts a new process and loads the keys right away, putting them at a low address. Getting a request to be allocated even lower in the memory space than the early-loaded keys is very unlikely.
We not only checked the location of the private keys, we wrote a tool to repeatedly extract extra data and write the results to file for analysis. We searched through gigabytes of these responses for private key information but did not find any. The most interesting things we found related to certificates were the occasional copy of the public certificate (from a previous output buffer) and some NGINX configuration data. However, the private keys were nowhere to be found.
To get an idea of what is happening inside the heap used by OpenSSL inside NGINX we wrote another tool to create a graphic showing the location of private keys (red pixels), memory that has never been used (black), memory that has been used but it now sitting idle because of a call to free (blue) and memory that is in use (green).
This picture shows the state of the heap memory (from left to right) immediately after NGINX has loaded and has yet to serve a request, after a single request, after two requests and after millions of requests. As described above the critical thing to note is that when the first request is made new memory is allocated far beyond the place where the private key is stored. Each 2x2 pixel square represents a single byte of memory; each row is 256 bytes.
Eagle-eyed readers will have noticed a block of memory that was allocated at a lower memory location than the private key. That's true. We looked into it and it is not being used to store the heartbleed (or other) TLS packet data. And it is much more than 64k away from the private key.
What can you get?
We said above that it's possible to get sensitive data from HTTP and TLS requests that the server has handled, even if the private key looks inaccessible.
Here, for example, is a dump showing some HTTP headers from a previous request to a running NGINX server. These headers would have been transmitted securely over HTTPS but Heartbleed means that an attacker can read them. That’s a big problem because the headers might contain login credentials or a cookie.
And here’s a copy of the public part of a certificate (as would be sent as part of the TLS/SSL handshake) sitting in memory and readable. Since it’s public this is not in itself dangerous -- by design, you can get the public key of a website even without the vulnerability and doing so does not create risk due to the nature of public/private key cryptography.
We have not fully ruled out the possibility, albeit slim, that some early elements of the heap get reused when NGINX is restarted. In theory, the old memory of the previous process might be available to a newly restarted NGINX. However, after extensive testing, we have not been able to reproduce this situation with an NGINX server on Linux. If a private key is available, it is most likely only available on the first request after restart. After that the chance that the memory is still available is extremely low.
There have been reports of a private key being stolen from Apache servers, but only on the first request. This fits with our hypothesis that restarting a server may cause the key to be revealed briefly. Apache also creates some special data structures in order to load private keys that are encrypted with a passphrase which may make it more likely for private keys to appear in the vulnerable portion of the stack.
At CloudFlare we do not restart our NGINX instances very often, so the likelihood that an attacker had hit our server with this exploit on the first request after restart is extremely low. Even if they did, the likelihood of seeing private key material on that request is very low. Moreover, NGINX, which is what CloudFlare’s system is based on, does not create the same special structures for HTTPS processing, making it less likely keys would ever appear in a vulnerable portion of the stack.
We think the stealing private keys on most NGINX servers is at least extremely hard and, likely, impossible. Even with Apache, which we think may be slightly more vulnerable, and we do not use at CloudFlare, we believe the likelihood of private SSL keys being revealed with the Heartbleed vulnerability is very low. That’s about the only good news of the last week.
We want others to test our results so we created the Heartbleed Challenge. Aristotle struggled with the problem of disproving the existence of something that doesn’t exist. You can’t prove the negative, so through experimental results we will never be absolutely sure there’s not a condition we haven’t tested. However, the more eyes we get on the problem, the more confident we will be that, in spite of a number of other ways the Heartbleed vulnerability was extremely bad, we may have gotten lucky and been spared the worst of the potential consequences.
That said, we’re proceeding assuming the worst. With respect to private keys held by CloudFlare, we patched the vulnerability before the public had knowledge of the vulnerability, making it unlikely that attackers were able to obtain private keys. Still, to be safe, as outlined at the beginning of this post, we are executing on a plan to reissue and revoke potentially affected certificates, including the cloudflare.com certificate.
Vulnerabilities like this one are challenging because people have imperfect information about the risks they pose. It is important that the community works together to identify the real risks and work towards a safer Internet. We’ll monitor the results on the Heartbleed Challenge and immediately publicize results that challenge any of the above. I will be giving a webinar about this topic next week with updates.
You can register for that here.