
Why secure systems require random numbers

2013-09-13


If you've been following recent news about technical spying by the US National Security Agency and the UK's Government Communications Headquarters, you may have come across a claim that the NSA was involved in weakening a random number generator. The obvious question to ask is: why mess with random number generation?

The answer is rather simple: good random numbers are fundamental to almost all secure computer systems. Without them everything from Second World War ciphers like Lorenz to the SSL your browser uses to secure web traffic is in serious trouble.

To understand why, and the threat that bad random numbers pose, it's necessary to understand a little about random numbers themselves (such as "what is a good random number anyway?") and how they are used in secure systems.

A Hacker News Hack

As an example of how random numbers go wrong I'll begin with a hack of the popular programming and technology web site Hacker News.

Hacker News

Four years ago I mentioned on the site that its random number generator was vulnerable to being used to attack the site. Not long after, and entirely independently, another contributor to the site actually carried out the attack with the permission of the site owner.

Here's how it worked. When you log into a web site you are typically assigned a unique ID for that session (the period you are logged in). That unique ID needs to be unique to you and not guessable by someone else. If someone else can guess it they can impersonate you.

In the case of Hacker News, the unique ID is a string of random characters such as lBGn0tWMcx7380gZyrUO9B. Each logged in user has a different string and the strings should be very, very difficult to guess or figure out.

Pseudo-randomness

The IDs are generated internally using a pseudo-random number generator. That's a mathematical function that can be called repeatedly to get apparently random numbers. I say apparently because, as the great mathematician John von Neumann said: "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin." The computer scientist Donald Knuth tells a story of inventing a pseudo-random number generator himself only to be shocked at how poor it was.

Although pseudo-random number generators can generate a sequence of apparently random numbers they have weaknesses.

The Three

von Neumann used a simple pseudo-random number generator called the middle square that works as follows. You start with some four-digit number (called a seed) and square it. You write the square as an eight-digit number, adding leading zeros if necessary, and take the four middle digits as your random number. You then square that number to get the next random number, and so on.

For example, if you chose 4181 as a seed the sequence 4807, 1072, 1491, 2230, 9729, ... would be generated as follows:

 Random number        Its square    Middle digits
 4181                 17480761      4807
 4807                 23107249      1072
 1072                 01149184      1491
 1491                 02223081      2230
 2230                 04972900      9729
 9729                 94653441      6534
 and so on
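The table above can be reproduced with a few lines of Python. This is a minimal sketch of the middle square method as described here; the function name `middle_square` and its parameters are illustrative choices, not anything von Neumann specified.

```python
def middle_square(seed, count, digits=4):
    """Return `count` numbers produced by the middle square method."""
    values = []
    current = seed
    for _ in range(count):
        # Zero-pad the square to twice the digit width so that the
        # "middle" digits are well defined (1149184 becomes 01149184).
        squared = str(current * current).zfill(2 * digits)
        start = digits // 2
        current = int(squared[start:start + digits])
        values.append(current)
    return values


print(middle_square(4181, 6))  # [4807, 1072, 1491, 2230, 9729, 6534]
```

Anyone who sees a single output can run the same function to obtain every later output, which is exactly why the method is unsuitable for security purposes.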

This particular pseudo-random number generator has long since been replaced by better ones such as the Mersenne Twister whose output is harder to predict. The middle square method is trivial to predict: the next number it generates is entirely determined by the number it last produced. The Mersenne Twister on the other hand is much harder to predict because it has a large internal state that it uses to produce random numbers, and each output reveals only a small part of that state.

In the world of cryptography there are cryptographically secure pseudo-random number generators which are designed to be unpredictable no matter how many random numbers you ask them to generate. (The Mersenne Twister isn't cryptographically secure because it can be predicted if enough of the random numbers it generates are observed.)
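To make the Mersenne Twister's weakness concrete, here is a sketch that uses Python's built-in `random` module, which is based on the MT19937 variant of the Mersenne Twister. It assumes the observer sees 624 consecutive, complete 32-bit outputs; the helper names are made up for this example. Each output is a reversibly scrambled ("tempered") word of internal state, so undoing the scrambling on 624 outputs rebuilds the whole state.

```python
import random


def undo_right_shift_xor(value, shift):
    """Invert y = x ^ (x >> shift) for a 32-bit x."""
    x = value
    for _ in range(32 // shift + 1):
        x = value ^ (x >> shift)
    return x


def undo_left_shift_xor(value, shift, mask):
    """Invert y = x ^ ((x << shift) & mask) for a 32-bit x."""
    x = value
    for _ in range(32 // shift + 1):
        x = value ^ ((x << shift) & mask)
    return x & 0xFFFFFFFF


def untemper(output):
    """Recover the internal state word behind one 32-bit MT19937 output."""
    y = undo_right_shift_xor(output, 18)
    y = undo_left_shift_xor(y, 15, 0xEFC60000)
    y = undo_left_shift_xor(y, 7, 0x9D2C5680)
    return undo_right_shift_xor(y, 11)


def clone_generator(outputs):
    """Build a generator that continues the sequence after 624 observed outputs."""
    # The trailing 624 is the position index: the state is fully consumed.
    state = tuple(untemper(o) for o in outputs) + (624,)
    clone = random.Random()
    clone.setstate((3, state, None))
    return clone


victim = random.Random(2013)                        # seed unknown to the observer
observed = [victim.getrandbits(32) for _ in range(624)]
clone = clone_generator(observed)

predicted = [clone.getrandbits(32) for _ in range(5)]
actual = [victim.getrandbits(32) for _ in range(5)]
print(predicted == actual)
```

The observer never learns the seed, yet from that point on can predict every number the victim generator will produce.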

For secure systems it's vital that the random number generator be unpredictable.

Starting With A Seed

And all pseudo-random number generators need to start somewhere; they need to be seeded and that's where Hacker News failed. The random number generator was seeded with the time in milliseconds when the Hacker News software was last started. By some careful work, the attacker managed to make Hacker News crash and could then predict when it restarted within a window of about one minute. From that he was able to predict the unique IDs assigned to users as they logged in and could, therefore, impersonate them. (Similar random number problems enabled one group of people to cheat at online poker.)

The full details of how the Hacker News Hack worked are here. The attack worked because once Hacker News crashed the attacker would wait for it to start and note the current time. Amusingly, the Hacker News server was willing to give out that information. The attacker then had 60 seconds' worth of possible seeds (60,000 seeds since the seed was in milliseconds).

So, the attacker would log in and look at their own unique ID. It had been generated by random numbers inside Hacker News's software. He then tried out each of the 60,000 seeds and ran the random number generation algorithm used by Hacker News until he found a match with his own unique ID. That told him which seed had been used, and it let him keep generating further unique IDs by generating the same sequence of random numbers that Hacker News was using. From that he could predict the unique IDs given out to users as they logged in and he could then impersonate them.
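The search over 60,000 seeds can be sketched in Python. This is an illustration only: Python's `random` module and the hypothetical `make_session_id` function stand in for the actual Hacker News code, the timestamps are arbitrary, and the sketch assumes the attacker's ID was the first one generated after the restart.

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits


def make_session_id(rng, length=22):
    """Hypothetical stand-in for the site's session ID generator."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def recover_seed(observed_id, window_start_ms, window_ms=60_000):
    """Try every millisecond in the window until one reproduces the observed ID."""
    for seed in range(window_start_ms, window_start_ms + window_ms):
        if make_session_id(random.Random(seed)) == observed_id:
            return seed
    return None


# Simulated server: seeded with its (secret) start time in milliseconds.
window_start_ms = 1_379_030_400_000          # arbitrary start of the one-minute window
secret_seed = window_start_ms + 12_345
server_rng = random.Random(secret_seed)
attacker_id = make_session_id(server_rng)    # the attacker logs in first
victim_id = make_session_id(server_rng)      # then another user logs in

# Attacker: recover the seed, then replay the generator.
found_seed = recover_seed(attacker_id, window_start_ms)
attacker_rng = random.Random(found_seed)
make_session_id(attacker_rng)                # skip the attacker's own ID
predicted_victim_id = make_session_id(attacker_rng)

print(found_seed == secret_seed, predicted_victim_id == victim_id)
```

A search space of 60,000 candidates takes on the order of a second to cover on a modern machine, so a one-minute uncertainty about the start time offers no real protection.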

The Hacker News code was changed to use the Linux /dev/urandom source of random numbers which means that today unique IDs are generated with a good random number generator and without the weak seed previously used.
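For comparison, here is how session IDs might be generated from the operating system's random source in Python. It is a sketch, not the Hacker News fix itself: the `secrets` module draws on the same kind of OS source as /dev/urandom, and the function name `new_session_id` is an illustrative choice.

```python
import secrets


def new_session_id():
    """Return a 22-character session ID built from 128 bits of OS randomness."""
    # 16 random bytes encode to 22 URL-safe base64 characters.
    return secrets.token_urlsafe(16)


print(new_session_id())
```

There is no application-level seed to guess here: each ID comes straight from the kernel's generator, which is seeded and maintained by the operating system.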

So, there are two ways in which pseudo-random number generation can fail: the seed could be bad or the algorithm itself could be weak and predictable.

Random Numbers Everywhere

The Hacker News example isn't about cryptography itself, but random numbers are vital to cryptographic schemes. For example, any HTTPS session starts as follows:

  1. The web browser sends information to the server about which version of SSL it wants to use and other information.

  2. The web server replies with similar information about SSL versions and its SSL certificate.

  3. The web browser checks that the certificate is valid. If it is, it generates a random 'pre-master secret' that will be used to secure the connection.

After that further exchanges occur all based on the randomly chosen pre-master secret. It needs to be unpredictable for the connection to be secure.
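As a concrete illustration of step 3, the secret the browser generates (called the pre-master secret in the TLS specifications) has a fixed layout when RSA key exchange is used in TLS 1.2: two bytes giving the client's protocol version followed by 46 random bytes. The sketch below builds one in Python; the function name is hypothetical and this is not how any particular browser implements it.

```python
import secrets

TLS_1_2 = bytes([3, 3])  # protocol version bytes for TLS 1.2


def new_pre_master_secret(client_version=TLS_1_2):
    """Return a 48-byte secret: 2 version bytes plus 46 unpredictable bytes."""
    return client_version + secrets.token_bytes(46)


pre_master_secret = new_pre_master_secret()
print(len(pre_master_secret))  # 48
```

All of the secrecy rests on those 46 bytes. If they came from a predictable generator, an eavesdropper who could reproduce them would be able to derive the same session keys as the browser and server.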

Here's part of how a computer using WiFi establishes a secure connection to an access point using the popular WPA2 protocol:

  1. The access point generates a random nonce and sends it to the computer.

  2. The computer generates a random nonce and sends it to the access point.

The access point and the computer continue on from there using those random nonce values to secure the connection.
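What "continue on from there" means can be sketched as follows. In WPA2 both sides mix the two nonces, their hardware addresses and a shared master key into a session key using an HMAC-SHA1 based function. This is a simplified sketch of that derivation as I understand it from the 802.11i standard; the addresses and the master key below are made-up placeholders, and real implementations do considerably more.

```python
import hashlib
import hmac
import secrets


def prf(key, label, data, length):
    """Expand `key` into `length` bytes by chaining HMAC-SHA1 over a counter."""
    output = b""
    counter = 0
    while len(output) < length:
        block = label + b"\x00" + data + bytes([counter])
        output += hmac.new(key, block, hashlib.sha1).digest()
        counter += 1
    return output[:length]


def derive_session_key(master_key, ap_mac, client_mac, ap_nonce, client_nonce):
    """Derive a 48-byte pairwise key from the master key, addresses and nonces."""
    data = (min(ap_mac, client_mac) + max(ap_mac, client_mac)
            + min(ap_nonce, client_nonce) + max(ap_nonce, client_nonce))
    return prf(master_key, b"Pairwise key expansion", data, 48)


# Placeholder values for illustration only.
master_key = secrets.token_bytes(32)           # normally derived from the passphrase
ap_mac = bytes.fromhex("02005e100001")         # hypothetical access point address
client_mac = bytes.fromhex("02005e100002")     # hypothetical computer address

ap_nonce = secrets.token_bytes(32)             # step 1: access point's nonce
client_nonce = secrets.token_bytes(32)         # step 2: computer's nonce

ap_key = derive_session_key(master_key, ap_mac, client_mac, ap_nonce, client_nonce)
client_key = derive_session_key(master_key, ap_mac, client_mac, ap_nonce, client_nonce)
print(ap_key == client_key, len(ap_key))
```

Because the nonces are fresh for every connection, each session gets a different key even though the master key stays the same. That only holds if the nonces are not predictable or repeated.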

Similarly, random numbers turn up when logging into web sites (and other systems), creating secure connections to servers using SSH, holding Skype video chats, sending encrypted email and more.

Soviet one-time pad

And the Achilles' heel of the only completely secure cryptosystem, the one-time pad, is that the pad itself must be completely randomly generated. Any predictability or non-uniformity in the random numbers used can lead to breaking of a one-time pad. (The other problem with one-time pads is reuse: they must be used only once.)
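A one-time pad is simple enough to sketch in a few lines of Python. The example below XORs a message with a random pad of the same length, and then shows why reuse is dangerous: XORing two ciphertexts made with the same pad cancels the pad out entirely. The messages and the helper name `xor_bytes` are made up for illustration.

```python
import secrets


def xor_bytes(a, b):
    """XOR two equal-length byte strings together."""
    return bytes(x ^ y for x, y in zip(a, b))


message = b"ATTACK AT DAWN"
pad = secrets.token_bytes(len(message))      # as long as the message, used once

ciphertext = xor_bytes(message, pad)
decrypted = xor_bytes(ciphertext, pad)
print(decrypted == message)

# Reusing the pad for a second message of the same length:
second_message = b"RETREAT AT SIX"
second_ciphertext = xor_bytes(second_message, pad)

# The pad drops out, leaving the XOR of the two plaintexts.
leak = xor_bytes(ciphertext, second_ciphertext)
print(leak == xor_bytes(message, second_message))
```

An eavesdropper who holds both ciphertexts learns the XOR of the two messages without ever seeing the pad, and that is often enough to recover both.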

CloudFlare's Random Number Source

At CloudFlare we need lots of random numbers for cryptographic purposes: we need them to secure SSL connections and Railgun, to generate public/private key pairs, and to run authentication systems. They are an important part of forward secrecy which we've rolled out for all our customers.

We currently obtain most of our random numbers from either OpenSSL's random number generation system or from the Linux kernel. Both seed their random number generators from a variety of sources to make them as unpredictable as possible. Sources include things like network data, or the seek time of disks. But we think we can improve on them by adding some truly random data into the system, and, as a result, improve security for our customers.

The sky above the port was the color of television, tuned to a dead channel

We've embarked on a project to further improve our random numbers by providing a source of truly random numbers that don't come from a mathematical process. That can be done using things like radioactive decay, the motion of fluids, atmospheric noise, or other chaos.

We'll be posting details of the new system when it's online.
