[dsfjdssdfsd] What has gone wrong with RNGs in practice

Nick Mathewson <nickm@torproject.org> Fri, 15 November 2013 16:36 UTC

Return-Path: <nick.a.mathewson@gmail.com>
X-Original-To: dsfjdssdfsd@ietfa.amsl.com
Delivered-To: dsfjdssdfsd@ietfa.amsl.com
Received: from localhost (localhost []) by ietfa.amsl.com (Postfix) with ESMTP id 35D3621F9B4B for <dsfjdssdfsd@ietfa.amsl.com>; Fri, 15 Nov 2013 08:36:39 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 0.99
X-Spam-Status: No, score=0.99 tagged_above=-999 required=5 tests=[BAYES_50=0.001, FM_FORGED_GMAIL=0.622, NO_RELAYS=-0.001, URI_HEX=0.368]
Received: from mail.ietf.org ([]) by localhost (ietfa.amsl.com []) (amavisd-new, port 10024) with ESMTP id k0IFwt4PMtDx for <dsfjdssdfsd@ietfa.amsl.com>; Fri, 15 Nov 2013 08:36:38 -0800 (PST)
Received: from mail-la0-x22d.google.com (mail-la0-x22d.google.com [IPv6:2a00:1450:4010:c03::22d]) by ietfa.amsl.com (Postfix) with ESMTP id 8C76521F9E3F for <dsfjdssdfsd@ietf.org>; Fri, 15 Nov 2013 08:36:16 -0800 (PST)
Received: by mail-la0-f45.google.com with SMTP id eh20so2959151lab.32 for <dsfjdssdfsd@ietf.org>; Fri, 15 Nov 2013 08:36:15 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:date:message-id:subject:from:to:content-type; bh=2FfAXRCIY3OPkk0EbIy8WAxzIhtzwhhCYevzkTH6E1U=; b=Ih4hABs8IBoyZz68M2WYltKgURTjMoXMi2j4TAT5u8tLufHAdI9ojRwrp407fQUW4k gI2Rk3x0KOXiv5k7ad0OQ+493HWvMCw+TeYBS1J2kJUVdhwbGHLtVsoUT2tMAM2fK6M9 LLPxQyam0ARFfx3EnD/CU2FjLAhxc65Q7/+TTFYwj6yvCc48U313JBWgnth9JacaiMcx 5e1GBNNV9nzvRG/ex9+vIly7Uyh78Edt+q/xgRUvQKVhjoYvn+oun2hch2lPwpBal9OP 8+JNVjqZmEb3ICElRI4M8yK2BroDdTrRUV6j1fN+iTbuM0ifcAqgcQjMu43GxoaX4uDM iYrw==
MIME-Version: 1.0
X-Received: by with SMTP id be3mr318182lab.67.1384533374858; Fri, 15 Nov 2013 08:36:14 -0800 (PST)
Sender: nick.a.mathewson@gmail.com
Received: by with HTTP; Fri, 15 Nov 2013 08:36:14 -0800 (PST)
Date: Fri, 15 Nov 2013 11:36:14 -0500
X-Google-Sender-Auth: rs9Sb6I2TCTGCFJspTgZCpTt1oI
Message-ID: <CAKDKvuw6EovCf6sv5SsY=NBqYKWxc1K9yV4TZjNFbHXBrt6rDQ@mail.gmail.com>
From: Nick Mathewson <nickm@torproject.org>
To: dsfjdssdfsd@ietf.org
Content-Type: text/plain; charset="ISO-8859-1"
Subject: [dsfjdssdfsd] What has gone wrong with RNGs in practice
X-BeenThere: dsfjdssdfsd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "The dsfjdssdfsd list provides a venue for discussion of randomness in IETF protocols, for example related to updating RFC 4086." <dsfjdssdfsd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/dsfjdssdfsd>, <mailto:dsfjdssdfsd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/dsfjdssdfsd>
List-Post: <mailto:dsfjdssdfsd@ietf.org>
List-Help: <mailto:dsfjdssdfsd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/dsfjdssdfsd>, <mailto:dsfjdssdfsd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 15 Nov 2013 16:36:42 -0000


I thought it might be helpful to enumerate all the RNG failures I'm
aware of.  There are probably more, and my focus here has been on
failures that have actually led to working attacks.  See also CWE-330
[0].  Please let me know what I've missed.

== What has gone wrong in practice and led to actual working attacks:

A. Not actually using randomness at all for something that needs some
or all of the properties of a random bitstring.

Example: Sony's implementation of ECDSA failed to actually change the
k value between signatures; they just had a constant.[1]

Example: The SSL CBC weakness that enabled the BEAST attack where,
instead of using a new IV for each record, SSL and older TLS versions
would use the last record's last block as the IV for the next record.

B. Relying on an RNG output for something that doesn't need to be random.

Example: Most ECDSA implementations rely on having a strong RNG at
signing time. In fact, there are more robust ways to generate an
unpredictable value -- for example, by a cryptographic hash of the
message, the public key, and a secret value generated at key
generation time.  See the references in the  "Pseudorandom generation
or r" section of the Ed25519 paper [2]. Arguably, this is what went
wrong with the Android bitcoin problem.  (See item G below.)

C. The cryptographic function that is produces the RNG output or
updates its state is periodic, predictable, or otherwise
cryptologically inadequate.

Example: Traditional arc4random() implementations: the original RC4
cipher is not only insecure versus modern cryptanalysis [3]; it has
actual biases in its output.

Example: The core function Dual_EC_DBRG is distinguishable from a
random bitstring. Also, it has a  backdoor that could not be more
obvious if it were a drive-through car wash.

D. Using a noncryptographic PRNG when a stronger function is required
for security.

Example: When reviewing an inexperienced hobbyist's first cryptography
software, it's even money you'll find them using (say) Mersenne
Twister to generate a keystream.

Example: It has been well known for ages that initial TCP sequence
numbers need to be unpredictable to prevent various spoofing attacks.
Even so, people still used PRNGs with insufficient unpredictability,
such as linear congruental generators. The summary at [4] has a list
of CVEs for inadequate PRNGs.

E. Using an RNG that has been exposed before it can be adequately seeded.

Example: Various embedded hardware devices generate private keys on
their initial startup. Unfortunately, they tend to do this before
they've been able to collect any entropy from sources like HD timings
(what's an HD?) network timings (what's a network?) and so forth. This
has been one of the causes of a rash of predictable keys, as explored
in [5].

F. Disabling some or all of the seed inputs to an RNG.

Example: In the Debian OpenSSL PRNG bug [6], maintainers thought that
they were disabling a feature that added uninitialized RAM to the PRNG
state. Instead, they had accidentally disabled *all* inputs to the
PRNG state. (Except for the PID.)

Example: One of the reasons that Linux kernels entropy was sometimes
low around startup was (If I understand correctly!) that important
drivers or subsystems often avoided adding timing information to the
kernel entropy pool, for fear it would slow them down. More recent
Linux kernels, (again, IIUC) take the responsibility away from the
drivers, and use clever implementation hacks to ensure that no
performance penalty is felt.

G. Using the same RNG state to generate two bitstreams.

Example: The SecureRandom structure in Android was initialized once,
and then used as the initial SecureRandom for multiple processes. This
led to a large compromise in Android bitcoin wallets. [7]

Example: Python had the same problem [8]: the same RNG state was used
after fork.

H. Falling back to insecure sources.

Example: Too many supposedly strong RNG libraries have had a fallback
to random() if /dev/urandom can't be read. See for example the
discussion of Apache Harmony in [9].

I. Implementation errors.

Example: Too many to list. See for example the discussion of Apache
Harmony in [9], where the implementors thought they had a 128 bit
seed, but they only had a 64-bit seed.

== What has gone wrong, and is probably bad, but has not (AFAICT) led
to any publicized attacks yet:

These may well be getting exploited.  If you've got documentation on
that, please let me know.

AA. Backtracking from compromised state
BB. Prediction from compromised state

It's not necessary that reading an RNG's current state should enable a
attacker to learn all previous and future outputs of that RNG.
Resisting backtracking attacks is easy. Resisting prediction attacks
is a little harder, but permits a not too unattractive
performance/security tradeoff.

CC. Theoretically unsound pool constructions

See for example the recent paper on the Linux /dev/*random devices'
constructions [10].

DD. Exclusive reliance on hard-to-audit hardware or software

<Insert RDRAND paranoia here. Insert CryptGenRandom paranoia here.>

EE. Bad entropy estimation

Numerous RNGs rely on each entropy inputs being acccompanied by an
estimate of how many bits of entropy each contains. Historically,
these entropy estimates have been pretty bogus, but I'm not aware of
any attack arising out of that.

== Some antipaterns in randomness

(Here I am being perhaps too opinionated.)

* Exposing under-seeded RNGs.

If people can forget to seed an RNG, somebody will.  If an RNG can be
drained before it has been seeded, it will be.

* Strong, fast: pick one.

In many cases, the programming environment has provided a strong RNG,
but implementors have (erroneously) avoided it either on performance
grounds.  This is a shame, since there's no reason a strong userspace
PRNG can't give performance on par with many of the weak userspace
PRNGs people use today. (Shameless plug: [11])

* Strong, reliable: pick one.

In some cases, programmers avoid the safer option because they can't
rely on it being available, and so use one that might not be as well
seeded. (For example, many programmers avoid /dev/random for fear that
it might block, while using /dev/urandom, which is on some platforms
more likely to return predictable bits near initial system startup.)

* Underdocumented, underexplained randomness requirements.

Before you sniff too loudly at Sony's mistake in [1]: Pretend that you
are a programmer in a hurry looking at FIPS 186-2, or your favorite
(early) standards-body description of DSA. How well does it explain
the importance of making 'k' completely unpredictable for each
message, and how well does it explain the consequences for failing to
do so?

* Falling back to insecure sources

"I couldn't open /dev/urandom. I guess that srandom(getpid() +
malloc()) is the best I can do."

* Entropy purity taboos

This is the converse of DD above. Some implementors refuse to
incorporate entropy sources they don't trust into an entropy pool,
under the apparent belief that combining a good entropy source with an
evil entropy source will produce contaminated entropy.  In reality, a
good PRNG will survive hostile sources just fine, so long as a
sufficient threshold of the entropy inputs are good.

* Inadequate testing

This is a tremendous problem. Unless the RNG is broken, black-box
testing is usually inadequate: an unseeded RNG's output, or an
underseeded RNG's output, or a broken RNG's output, will all look.
Testing an RNG requires, at minimum; verification that it produces the
correct expected outputs given an expected seed, and that it responds
correctly to addition of more fixed seeds.

[0] http://cwe.mitre.org/data/definitions/330.html
[1] http://www.bbc.co.uk/news/technology-12116051
[2] http://ed25519.cr.yp.to/ed25519-20110926.pdf
[3] http://www.isg.rhul.ac.uk/tls/
[4] http://www.cert.org/advisories/CA-2001-09.html
[5] https://factorable.net/paper.html
[6] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-0166
[7] http://bitcoin.org/en/alert/2013-08-11-android
[8] http://bugs.python.org/issue18747
[9] http://hgi.rub.de/media/nds/veroeffentlichungen/2013/03/25/paper_2.pdf
[10] http://eprint.iacr.org/2013/338.pdf
[11] https://github.com/nmathewson/libottery

hoping this was useful,
Nick Mathewson