[Cfrg] key as message prefix => multi-key security

"D. J. Bernstein" <djb@cr.yp.to> Thu, 17 September 2015 22:47 UTC

Return-Path: <djb-dsn2-1406711340.7506@cr.yp.to>
X-Original-To: cfrg@ietfa.amsl.com
Delivered-To: cfrg@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id D7FA41A1EEE for <cfrg@ietfa.amsl.com>; Thu, 17 Sep 2015 15:47:46 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 4.096
X-Spam-Level: ****
X-Spam-Status: No, score=4.096 tagged_above=-999 required=5 tests=[BAYES_50=0.8, HELO_EQ_NL=0.55, HOST_EQ_NL=1.545, J_CHICKENPOX_111=0.6, J_CHICKENPOX_31=0.6, RCVD_IN_DNSWL_NONE=-0.0001, UNPARSEABLE_RELAY=0.001] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id vFW-cwYuN29l for <cfrg@ietfa.amsl.com>; Thu, 17 Sep 2015 15:47:43 -0700 (PDT)
Received: from calvin.win.tue.nl (calvin.win.tue.nl [131.155.70.11]) by ietfa.amsl.com (Postfix) with SMTP id 6F5961A1DBE for <cfrg@irtf.org>; Thu, 17 Sep 2015 15:47:42 -0700 (PDT)
Received: (qmail 2275 invoked by uid 1017); 17 Sep 2015 22:48:00 -0000
Received: from unknown (unknown) by unknown with QMTP; 17 Sep 2015 22:48:00 -0000
Received: (qmail 29741 invoked by uid 1000); 17 Sep 2015 22:47:29 -0000
Date: Thu, 17 Sep 2015 22:47:29 -0000
Message-ID: <20150917224729.29739.qmail@cr.yp.to>
From: "D. J. Bernstein" <djb@cr.yp.to>
To: cfrg@irtf.org
Mail-Followup-To: cfrg@irtf.org
In-Reply-To: <55E99B7C.6020509@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Archived-At: <http://mailarchive.ietf.org/arch/msg/cfrg/44gJyZlZ7-myJqWkChhpEF1KE9M>
Subject: [Cfrg] key as message prefix => multi-key security
X-BeenThere: cfrg@mail.ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.mail.ietf.org>
List-Unsubscribe: <https://mail.ietf.org/mailman/options/cfrg>, <mailto:cfrg-request@mail.ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cfrg/>
List-Post: <mailto:cfrg@mail.ietf.org>
List-Help: <mailto:cfrg-request@mail.ietf.org?subject=help>
List-Subscribe: <https://mail.ietf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@mail.ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 17 Sep 2015 22:47:47 -0000

This message continues the security analysis of inserting the public key
A into the hash in Schnorr's signature system: e.g., H(R,A,M) instead of
Schnorr's classic H(R,M). The 'brown' and 'ladd' proposals omit the key
from the hash but the other proposals include it.

The Ed25519 paper says that including the key is an inexpensive way "to
alleviate concerns that several public keys could be attacked
simultaneously". Here's the background regarding these concerns:

   * Most of the literature on signature-system security focuses on the
     problem of forging messages under _one_ public key.

   * The real-world attacker actually sees _many_ public keys, and is
     probably happy if he solves the problem of forging messages under
     any one of those keys. (For example, he simply wants to take over
     _somebody's_ account to gain computer power, relay more attacks,
     etc.) This isn't the same problem.

   * The concern is that switching to the multi-key problem could give
     the attacker an advantage, compared to the original problem. There
     _is_ an analogous advantage in attacking secret-key protocols---
     which is why I recommend 256-bit secret keys.

   * It's easy to prove, for any signature system, that attacking a
     billion keys at once can't give the attacker an advantage of more
     than a factor of a billion. But, yikes, a billion is a big number!

I hope it's clear that the word "concern" does _not_ mean that there are
any known multi-key attacks better than single-key attacks on any of the
signature schemes under consideration. What it means is that one can
reasonably be worried about the _possibility_ of multi-key attacks, and
about the lack of study of this possibility.

I'm happy to announce that there's now a theorem guaranteeing that key
insertion completely eliminates this concern:

   probability of breaking any 1 of N signature keys using H(R,A,M)
   <= probability of breaking a targeted signature key using H(R,M).

In other words, if you believe that the classic H(R,M) system in the
single-key case is secure, then you have to believe that the H(R,A,M)
system is secure for any number of keys. The proof is a simple reduction
that takes any attack against multi-key H(R,A,M) and converts it into a
single-key attack against H(R,M); the single-key attack has practically
identical performance and success probability to the multi-key attack.  

The proof relies critically on the insertion of the public key A into
the hash. There's a huge obstacle to proving a similar theorem without 
the public key, as I'll explain below. The best theorem we know without
hashing A is the generic

   probability of breaking any 1 of N signature keys using H(R,M)
   <= N*probability of breaking a targeted signature key using H(R,M)

which still leaves the concern unaddressed. To summarize:

   * Inserting the public key eliminates the multi-key concerns.
   * Omitting the public key doesn't.

I should emphasize that the theorem is for a quite traditional form of
Schnorr's system, and hasn't yet been combined with modern features such
as derandomization, key derivation, cofactors, etc.; many of these
features obviously complicate the "provable security" picture. However,
even with its limited scope, the theorem---together with the inability
to prove the same theorem with the key omitted---justifies the idea that
inserting the key into the hash is the conservative thing to do.

The proof idea appeared many years ago, but unfortunately with a serious
mistake, leading to various incorrect conclusions, both by the original
authors and by subsequent readers. This week I've conferred with and
heard back from various people, including two of the original authors,
and everyone seems to agree with what I'm saying. The rest of this
message discusses the history, the mistake, and the proof details.

Rene Struik writes:
> one does not need to include the public key in the signing process,
> since security in the multi-user setting is roughly the same as in the
> single-user setting (see [1], [2]).

Reference #1 doesn't quantify security levels, and doesn't have any
proof techniques relevant to the concerns under discussion.

Reference #2 _does_ quantify security levels, and specifically claims to
have proven that "security does not decrease with the number of users"
for Schnorr signatures. This claim will have to be retracted. The proof
is wrong, and there's no hope of proving the claimed theorem.

The overall proof idea in reference #2 is good, _except_ that the proof
skips past a critical step (namely, proving that the alleged forgery is
on a message that wasn't legitimately signed); the huge obstacle sinks
the proof at exactly this point. Inserting the public key into the hash
dodges the obstacle, and makes the whole proof idea work, producing the
new theorem, which _does_ eliminate the concern for multi-key H(R,A,M).

End of summaries. I'll now dive into the proof details, with the goal of
allowing public verification of what I said above.

"Classic scheme", parametrized by a standard group and a standard base
point B: The signer, given a message M, produces as signature a uniform
random R, independent of everything else, and the unique S satisfying
SB = R+H(R,M)A, where A is the public key.

Let's consider single-key attacks against the classic scheme. An attack
receives as input

   * a key A and
   * an oracle that, given M, produces a uniform random R, independent
     of everything else, and the unique S satisfying SB = R + H(R,M)A.

An attack produces as output a message M and an alleged signature (R,S).
Success means that

   * SB = R+H(R,M)A, i.e., (R,S) is a signature of M under A; and
   * M was not an oracle query, i.e., key A did not legitimately sign
     message M.

The starting assumption is that such attacks have been adequately
studied, and that the classic scheme is (for appropriate distributions
of A) secure against these attacks: i.e., any attack with a feasible
cost has a negligible probability of success.

"Generalized scheme", parametrized by a standard group, a standard base
point B, and a standard function "tag": The signer, given M, produces as
signature a uniform random R, independent of everything else, and the
unique S satisfying SB = R+H(R,tag(A),M)A, where A is the public key.
I'm implicitly assuming that R and tag(A) are both self-delimiting, so
we can recover R and tag(A) and M from the concatenation (R,tag(A),M).

If tag(A) is defined as the empty string then the generalized scheme is
the same as the classic scheme. If tag(A) is defined as A then the
generalized scheme includes the whole public key in the hash.

Let's consider N-key attacks against the generalized scheme. An attack
receives as input

   * keys A_1,A_2,...,A_N and
   * an oracle that, given a key T in {A_1,A_2,...,A_N} and a message M,
     produces a uniform random R, independent of everything else, and
     the unique S satisfying SB = R+H(R,tag(T),M)T.

An attack produces as output a public key T, a message M, and an alleged
signature (R,S). Success means that

   * T is in {A_1,A_2,...,A_N}, i.e., T is one of the specified keys;
   * SB = R+H(R,tag(T),M)T, i.e., (R,S) is a signature of M under T; and
   * (T,M) was not an oracle query, i.e., key T did not legitimately
     sign message M.

What I'll do is convert any N-key attack X against the generalized
scheme into a single-key attack ReRandomize(X) against the classic
scheme. The goal is to ensure that ReRandomize(X) and X have practically
identical costs and the same identical success probabilities. We'll see
that this works if the tag function is injective. We'll also see what
the obstacle is if multiple keys collide under the tag function, for
example if tag(A) = empty.

Here's what ReRandomize(X) does.

Generate uniform random r_1,r_2,...,r_N, and convert the input key A
into N independent uniform random keys A+r_1 B, A+r_2 B, ..., A+r_N B.
Run X on these N keys. Whenever X asks for a signature of M under T:

   * Look up T in the list of A+r_1 B, A+r_2 B, ..., A+r_N B to find an
     i such that T = A+r_i B; if none exists, reject the query.

   * Relay the concatenation (tag(T),M) to the input oracle. By
     hypothesis the input oracle produces a uniform random R,
     independent of everything else, and the unique S satisfying
     SB = R + H(R,tag(T),M)A.

   * Compute S' = S+r_i H(R,tag(T),M). Then S'B = R + H(R,tag(T),M)T;
     i.e., (R,S') is a signature of M under T in the generalized scheme.

   * Return (R,S') to X. Notice that this is exactly the behavior of a
     normal signer who knows the secret key for T in the generalized
     scheme: namely, returning a uniform random R, independent of
     everything else, and a solution to the right equation.

This is exactly a run of X against N independent uniform random keys and
an appropriate oracle for those keys. Eventually X produces as output
T,M,R,S. As above find i such that T = A+r_i B (aborting if there is no
such i), and compute S' = S - r_i H(R,tag(T),M). Output the
concatenation (tag(T),M) and the pair (R,S').

_If_ tag is injective, and X is successful against the generalized
scheme, then this algorithm ReRandomize(X) is successful against the
classic scheme. Indeed, recall the definition of success for X:

   * First, T is in {A_1,A_2,...,A_N}. This means that there _will_ be an
     i such that T = A+r_i B.

   * Second, SB = R+H(R,tag(T),M)T. This implies S'B = R+H(R,tag(T),M)A;
     i.e., (R,S') is a signature of (tag(T),M) under A.

   * Third, (T,M) wasn't an oracle query from X. What I want to conclude
     at this point is that (tag(T),M) wasn't legitimately signed by A,
     i.e., wasn't an oracle query from ReRandomize(X). This is where the
     injectivity of tag(T) matters.

     The queries from ReRandomize(X) are exactly (tag(T'),M') for the
     queries (T',M') from X such that T' is in {A+r_1 B,...,A+R_N B}.
     I'm using the notation T',M' here to avoid confusion with the T,M
     output by X.

     Suppose the output (tag(T),M) was a query from ReRandomize(X). Then
     (tag(T),M) = (tag(T'),M') for some query (T',M') from X. Now tag is
     self-delimiting, so tag(T) = tag(T') and M = M'; tag is injective,
     so T = T'; so (T,M) = (T',M'); so (T,M) _was_ a query from X,
     contradiction. Hence (tag(T),M) wasn't a query from ReRandomize(X).

That's the end of the proof.

What happens if we drop the injectivity requirement for the tag
function? The argument still works if the tag function is injective _on
the generated keys_. If the tag length is, say, 160 bits, and there are
2^60 keys, then it's reasonable to conjecture that the collision chance
is around 2^-40, which is acceptable as an attack probability. Maybe one
could even prove something about the collision chance of 160-bit tags
for 256-bit curves. But this approach obviously breaks down for short
tag lengths, and in particular for empty tags.

(I don't think there's a real reason to consider any intermediate tag
lengths between empty tags and complete tags. There's negligible impact
on hashing cost. The extreme "I need to minimize secret-key storage and
can't afford to recompute the public key on demand" argument is an
argument for an empty tag, not an argument for a short tag.)

Even worse, if tags collide, the conclusion that we're trying to prove
is simply wrong. As an extreme example, take empty tags, and consider a
very slow 2-user attack X that

   * asks the first key for a signature of the message "yes",

   * applies any ECDLP algorithm (however long this takes) to compute
     the discrete logarithm of the second key, and

   * applies the normal signing procedure to forge a signature of the
     message "yes" under the second key.

This attack X has success probability almost exactly 1: the only way it
fails is in the extremely rare case that the two keys happen to be
identical. However, the attack ReRandomize(X) has success probability
exactly 0: it outputs a signature of "yes", but that's not a forgery,
since earlier it asked for a signature of "yes".

Could ReRandomize be modified somehow to work for empty tags? The
obstacles are formidable:

   * The signature scheme hashes messages, and I can't imagine how the
     reduction could modify hash inputs, so the reduction will have to
     produce a forgery on the same message "yes" forged by X.

   * For this to be a forgery, the reduction has to retroactively avoid
     asking for a signature on "yes". How can the reduction predict that
     "yes" will be the message to be forged?

   * Even if the reduction does somehow predict this, how will it answer
     X's signing query for "yes" from the first user? The reduction
     can't simply skip that query---maybe X actually does some
     interesting computation on the signature result, influencing its
     final output.

The original paper doesn't try to address any of these obstacles. It
writes down ReRandomize(X) for the case of empty tags (minor difference:
it takes r_1=0, assuming that A is uniformly distributed), and correctly
observes that success in X implies that the output of ReRandomize(X) is
a valid signature. It then leaps to the conclusion that the output is a
"valid forgery", without checking whether the message was signed before.

---Dan