### [Cfrg] Internal collisions

"D. J. Bernstein" <djb@cr.yp.to> Sun, 26 July 2015 19:43 UTC

Return-Path: <djb-dsn2-1406711340.7506@cr.yp.to>

X-Original-To: cfrg@ietfa.amsl.com

Delivered-To: cfrg@ietfa.amsl.com

Received: from localhost (ietfa.amsl.com [127.0.0.1])
by ietfa.amsl.com (Postfix) with ESMTP id 75F061A8863
for <cfrg@ietfa.amsl.com>; Sun, 26 Jul 2015 12:43:20 -0700 (PDT)

X-Virus-Scanned: amavisd-new at amsl.com

X-Spam-Flag: NO

X-Spam-Score: 2.896

X-Spam-Level: **

X-Spam-Status: No, score=2.896 tagged_above=-999 required=5
tests=[BAYES_50=0.8, HELO_EQ_NL=0.55, HOST_EQ_NL=1.545,
RCVD_IN_DNSWL_NONE=-0.0001, UNPARSEABLE_RELAY=0.001] autolearn=no

Received: from mail.ietf.org ([4.31.198.44])
by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024)
with ESMTP id byTumKHtHh5L for <cfrg@ietfa.amsl.com>;
Sun, 26 Jul 2015 12:43:18 -0700 (PDT)

Received: from calvin.win.tue.nl (calvin.win.tue.nl [131.155.70.11])
by ietfa.amsl.com (Postfix) with SMTP id 221E21A8852
for <cfrg@irtf.org>; Sun, 26 Jul 2015 12:43:17 -0700 (PDT)

Received: (qmail 6579 invoked by uid 1017); 26 Jul 2015 19:43:37 -0000

Received: from unknown (unknown)
by unknown with QMTP; 26 Jul 2015 19:43:37 -0000

Received: (qmail 14875 invoked by uid 1000); 26 Jul 2015 19:43:06 -0000

Date: 26 Jul 2015 19:43:06 -0000

Message-ID: <20150726194306.14873.qmail@cr.yp.to>

From: "D. J. Bernstein" <djb@cr.yp.to>

To: cfrg@irtf.org

Mail-Followup-To: cfrg@irtf.org

MIME-Version: 1.0

Content-Type: text/plain; charset=utf-8

Content-Disposition: inline

Content-Transfer-Encoding: 8bit

Archived-At: <http://mailarchive.ietf.org/arch/msg/cfrg/E-q_6ABdKfPU5XG2N6IVAs_sNhA>

Subject: [Cfrg] Internal collisions

X-BeenThere: cfrg@irtf.org

X-Mailman-Version: 2.1.15

Precedence: list

List-Id: Crypto Forum Research Group <cfrg.irtf.org>

List-Unsubscribe: <http://www.irtf.org/mailman/options/cfrg>,
<mailto:cfrg-request@irtf.org?subject=unsubscribe>

List-Archive: <https://mailarchive.ietf.org/arch/browse/cfrg/>

List-Post: <mailto:cfrg@irtf.org>

List-Help: <mailto:cfrg-request@irtf.org?subject=help>

List-Subscribe: <http://www.irtf.org/mailman/listinfo/cfrg>,
<mailto:cfrg-request@irtf.org?subject=subscribe>

X-List-Received-Date: Sun, 26 Jul 2015 19:43:20 -0000

This message elaborates on Tanja's comment about "internal collisions" at the CFRG meeting. The quick summary is that Schnorr's H(R,M) adds real protection against the possibility of collisions in H, while the alternative H(M,R) doesn't, even when H is a wide-pipe hash function. As a concrete example, take H to be 24 rounds of SHA-512/256 or SHA-384. Both of these functions have * outputs that are too big for brute-force collision attacks, and * even wider pipes. However, internally these functions process messages the same way as SHA-512 (starting from different IVs and truncating the output), and so you can break these hash functions inside H(M,R) signatures as follows: * Start from 2008 Sanadhya--Sarkar, which showed how to very quickly find distinct 1024-bit blocks B,B' that produce colliding 512-bit states after 24 rounds starting from the initial SHA-512 state. * Notice that the Sanadhya--Sarkar method also works for (e.g.) the initial SHA-384 state. I'm not saying that these particular blocks B,B' have a noticeable chance of producing collisions for other starting states. This would be an impressive break of the SHA-512 data flow, a "differential attack without message modification", a technique that has trouble breaking MD5. What I'm saying is that, _given_ any starting state h (e.g., the initial SHA-384 state or the initial SHA-512/256 state), the method _computes_ distinct 1024-bit blocks B,B' that produce colliding 512-bit states after 24 rounds starting from h. * Notice that 24 rounds of SHA-384 now collide on any inputs of the form (B,X) and (B',X), since the states after B and B' are the same---this is what's called an "internal collision". * Find an X such that (B,X) is an innocent message that the signer is willing to sign while (B',X) is a dangerous message. Obtain the signer's signature (R,S) on the innocent message (B,X). * Forge the message (B',X) with the same signature (R,S). This is a valid signature since H(B',X,R) = H(B,X,R). What went wrong here is that H(M,R) isn't resilient to _internal_ collisions in H. It wasn't useful to watch for collisions in H as a canary for potential security problems in H(M,R)---the victim died a moment after the canary did. H(M,R) for wide-pipe H has the weak property of being resilient to mere _output_ collisions---for example, if someone manages to find an H collision by brute force, then there's no reason to think that appending R to the inputs will also produce a collision---but brute force isn't the real threat against any reasonably large hash output. The way that cryptanalysts have found collisions much more quickly than brute force * in MD5, and * in 76 out of 80 steps of the SHA-1 compression function, and * in 38 out of 64 steps of the SHA-256 compression function, and * in 4 out of 24 rounds of SHA3-256, is by finding ways that small input differences can stay on a path of small differences through the entire hash computation---it's hard to imagine how differences would have a noticeable chance of disappearing otherwise! Output truncation has very little effect on this type of attack: it can remove a few conditions from the end of the computation, but most of the attack cost comes from many more conditions needed to control diffusion in many earlier rounds of the computation. (Probably even more than 24 rounds of SHA-384 etc. are feasible to break. There has been considerable progress since 2008, and the latest collision attacks break 38 out of 80 rounds of the SHA-512 compression function, although breaking 38 rounds of the SHA-512 _hash_ function will require hooking the SHA-512 initial state to the two initial states used in this compression-function collision.) For comparison, what happens to this attack if H(M,R) is replaced by Schnorr's H(R,M)? This time R immediately affects the hash processing, and this change in data flow creates serious obstacles for the attacker. Here's what happens: * _Given_ R, the attacker computes two extensions (R,M) and (R,M') that produce an internal collision. * But wait---why is this collision useful? To forge M' as a substitute for M, the attacker needs a signature involving H(R,M). * So the attacker asks the signer to sign M. * The signer now generates a signature of M---but this signature involves a new R' that the attacker has never seen before. There's no reason to think that H(R',M) and H(R',M') will collide when R' is chosen randomly. Asking for this to happen is asking for something much stronger than a collision---as above, this would be an impressive break of the SHA-512 data flow. The attacker really wants to see R before generating M and M', but the signer won't generate R without seeing M. So the attacker starts by computing just one message M, then sees R from the signer, and then tries to compute a second message M' with H(R,M') = H(R,M). The usual techniques for trying to find preimages much more quickly than brute force are again to take M' close to M and hope that the differences stay under control. However, the fact that M can't change during the second computation makes this preimage search much more difficult than a collision search: * The attacker can't simply use the best difference M'-M: if this M' doesn't work, then the attacker has to move on to the second-best difference, then the third-best, etc. In collision attacks the attacker was free to keep trying the best difference with many choices of M. * The attacker can't apply "message modification", working backwards from as many of the path conditions as possible to a subset of M that's feasible to enumerate. The attacker can of course try to modify M during the first computation so that the path conditions are more likely to be satisfied, but this has to be done in a way that gets past the obstacle of R being subsequently randomized. In short, the fact that H(R,M) is resilient to _internal_ collisions reflects a real gap in difficulty between finding H collisions and breaking H(R,M). Anyone who disputes this should be able to demonstrate feasible attacks on H(R,M) using MD5, using 24 rounds of SHA-384, and using 4 rounds of SHA3-256, all of which are broken for H(M,R). ---Dan

- [Cfrg] Internal collisions D. J. Bernstein
- Re: [Cfrg] Internal collisions Dan Brown
- Re: [Cfrg] Internal collisions David Jacobson
- Re: [Cfrg] Internal collisions Ilari Liusvaara
- Re: [Cfrg] Internal collisions Dan Brown
- Re: [Cfrg] Internal collisions Paterson, Kenny
- Re: [Cfrg] Internal collisions D. J. Bernstein
- Re: [Cfrg] Internal collisions Dan Brown
- Re: [Cfrg] Internal collisions D. J. Bernstein
- Re: [Cfrg] Internal collisions Dan Brown
- Re: [Cfrg] Internal collisions Bryan Ford
- Re: [Cfrg] Internal collisions D. J. Bernstein