[Cfrg] Does the Curve25519 Montgomery ladder always work?

"D. J. Bernstein" <djb@cr.yp.to> Sun, 31 August 2014 08:42 UTC

Return-Path: <djb-dsn2-1406711340.7506@cr.yp.to>
X-Original-To: cfrg@ietfa.amsl.com
Delivered-To: cfrg@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 7C46E1A888D for <cfrg@ietfa.amsl.com>; Sun, 31 Aug 2014 01:42:35 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 0.101
X-Spam-Level:
X-Spam-Status: No, score=0.101 tagged_above=-999 required=5 tests=[BAYES_50=0.8, RCVD_IN_DNSWL_LOW=-0.7, UNPARSEABLE_RELAY=0.001] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 6-sz0fGbF3BC for <cfrg@ietfa.amsl.com>; Sun, 31 Aug 2014 01:42:33 -0700 (PDT)
Received: from mace.cs.uic.edu (mace.cs.uic.edu [131.193.32.224]) by ietfa.amsl.com (Postfix) with SMTP id CD83C1A06C1 for <cfrg@irtf.org>; Sun, 31 Aug 2014 01:42:32 -0700 (PDT)
Received: (qmail 9370 invoked by uid 1011); 31 Aug 2014 08:42:31 -0000
Received: from unknown (unknown) by unknown with QMTP; 31 Aug 2014 08:42:31 -0000
Received: (qmail 11622 invoked by uid 1001); 31 Aug 2014 08:42:28 -0000
Date: 31 Aug 2014 08:42:28 -0000
Message-ID: <20140831084228.11620.qmail@cr.yp.to>
From: "D. J. Bernstein" <djb@cr.yp.to>
To: cfrg@irtf.org
Mail-Followup-To: cfrg@irtf.org
Archived-At: http://mailarchive.ietf.org/arch/msg/cfrg/pt2bt3fGQbNF8qdEcorp-rJSJrc
Subject: [Cfrg] Does the Curve25519 Montgomery ladder always work?
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <http://www.irtf.org/mailman/options/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <http://www.irtf.org/mail-archive/web/cfrg/>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Sun, 31 Aug 2014 08:42:35 -0000

The Montgomery ladder is a remarkably simple, fast, constant-time
algorithm for single-scalar multiplication on Curve25519. (Everything
I'm saying applies more broadly to any Montgomery curve y^2 = x^3+Ax^2+x
where A^2-4 is non-square, i.e., where there is a unique point of order
2.) Here's the Montgomery-ladder code that I displayed in Toronto:

   x2,z2,x3,z3 = 1,0,x1,1
   for i in reversed(range(255)):
     bit = 1 & (n >> i)
     x2,x3 = cswap(x2,x3,bit)
     z2,z3 = cswap(z2,z3,bit)
     x3,z3 = ((x2*x3-z2*z3)^2,x1*(x2*z3-z2*x3)^2)
     x2,z2 = ((x2^2-z2^2)^2,4*x2*z2*(x2^2+A*x2*z2+z2^2))
     x2,x3 = cswap(x2,x3,bit)
     z2,z3 = cswap(z2,z3,bit)
   return x2*z2^(p-2)

I commented in my talk that this code _always_ works. The input scalar n
can be _any_ integer between 0 and 2^255-1; the 255-bit limit here is
specified by the i loop. The input coordinate x1 can be _any_ integer.
One can always view x1 modulo p as X0(Q) for exactly two points Q on
the curve over F_{p^2}. The Curve25519 paper proves that this code
always maps Q to nQ, i.e., that the output is X0(nQ) for both points Q.

(The function X0 is the same as the usual Montgomery X-coordinate,
except that it maps the point at infinity to 0. The two points Q having
X0(Q)=0 are the point at infinity and (0,0), the unique point of order
2; these are on the curve over F_p and are simultaneously on the twist
over F_p. When x is nonzero in F_p, the two points Q having X0(Q)=x are
(x,+-sqrt(x^3+Ax^2+x)); these are both on the curve over F_p if
x^3+Ax^2+x is a square in F_p, and are both on the twist otherwise.)

Microsoft has issued various statements claiming that this code does not
always work. For example, Patrick Longa has claimed that "when computing
the Montgomery ladder in constant time there is a potential failure case
if the scalar matches 0 (mod r) or 0 (mod r’), where r and r’ are the
prime orders of the subgroups generated on the Montgomery curve and its
quadratic twist, respectively."

Executive summary of this message: The Microsoft claims are incorrect.
There are no failure cases. The Montgomery ladder always works for these
curves, always computing scalar multiplication correctly, exactly as
stated in the Curve25519 paper. I've carefully reviewed the underlying
Microsoft arguments and will dispose of them here; I'll also comment on
a tangential high-bit issue.

The critical Microsoft error is in http://eprint.iacr.org/2014/130.pdf,
in the middle of page 17, which claims that "there are some scalar/point
combinations which could produce exceptions." The paragraph goes on to
specifically consider multiplying

   * a point of prime order r' on the twist by
   * a scalar that's a multiple of r'.

The simple fact is that the result of this multiplication is 0, just
like multiplying a point of order r by the scalar r, or multiplying any
point by 0. This is not an exception: it is exactly what the definition
of scalar multiplication means for these inputs, and it is exactly what
any working implementation will produce. The code works correctly in
this case, as it does in all other cases.

The paragraph also claims that 37 is a weak key, because an attacker who
sees you output 37P given P can see that your scalar was 37 (modulo the
order of P). One could of course make the same claim about any scalar---
for example, the Microsoft paragraph actually makes the same argument
for r' rather than 37---demonstrating that the claim has nothing to do
with security. Relevant quote from the Ed25519 paper:

   _Weak keys._ Forgeries are trivial if A is a known multiple of B. For
   example, an attacker who knows that A = 37B can ... As an even more
   extreme example, an attacker who knows that A = 0B can ... We could
   declare that 0B and 37B are "broken" by these two "attacks" and that
   users must check for, and reject, these "weak keys"; but the same
   confused logic would require rejecting _all_ keys in _all_
   cryptosystems, and would have no relevance to the standard definition
   of signature security.

Declaring a scalar to be "weak" does not mean that it is actually weak.
More to the point, even if Microsoft can come up with a definition of
"leak" to fit its claim that a DH output of 0 would "leak information to
an attacker", there's no justification for the claim that this output is
an "exception" to the scalar-multiplication routine.

The Microsoft document goes on to refer to this correct example of
scalar multiplication as a "problem" with the Montgomery ladder, and to
claim that this "problem" is "overcome" by the choice of scalars in what
I'm now suggesting to call X25519, the widely deployed DH function
recommended in the Curve25519 paper.

It's true that X25519 restricts the choice of scalars, by always setting
the high bit and clearing the bottom 3 bits. But this has nothing to do
with imaginary exceptions in this scalar-multiplication method: it's
designed to protect implementors who accidentally end up with other
scalar-multiplication methods.

For example, as I mentioned in Toronto, many sources present a
variable-size Montgomery ladder that takes time proportional to the
position of the high bit of the scalar. This is still correct but it
takes variable time---a variation that disappears when the high-bit
position is prespecified. As a fancier example, there are many common
short-Weierstrass scalar-multiplication methods for which the X25519
choices avoid the point at infinity, rescuing an implementor who (1)
uses those methods but (2) screws up the handling of infinity.  

Slide 31 of http://cr.yp.to/talks/2005.09.20/slides.pdf mentions that
setting this high bit "avoids infinity and avoids timing attacks". This
was six years before http://eprint.iacr.org/2011/232.pdf extracted keys
from exactly this timing leak in OpenSSL.

Setting this high bit obviously restricts the range of scalars, but the
security impact of such restrictions has been thoroughly analyzed in the
literature. Range restrictions on _signature nonces_ are troublesome;
the simplest way to avoid all trouble is to generate many extra bits
(e.g., Ed25519 uses 512 bits). Small range restrictions on _secret keys_
are completely different---they're protected by an ECC feature called
"random self-reducibility". For example, if there's a subroutine to
compute discrete logs known to be between 2^254 and 2^255-1, then the
following algorithm will compute _any_ 255-bit discrete log:

   0. Do the following two steps in a random order.
   1. Try the subroutine on the target point Q.
   2. Try the subroutine on Q + 2^254 B, where B is the generator.

This algorithm takes either 1 or 2 iterations of the subroutine, on
average 1.5 iterations, no matter what Q is. The hardness of computing
discrete logs for arbitrary curve points thus implies the hardness of
computing discrete logs known to have the 254th bit set, except for a
time factor limited to 1.5. Similar comments apply to lower-probability
subroutines, variable-time subroutines, etc.: the hardness is provably
identical within a probability factor of 2. For known attacks the factor
is much closer to 1 ("kangaroo" attacks have worse constants than "rho"
attacks), and for X25519 the gap provably disappears since X25519 hides
the sign of y, but in any case the gap is guaranteed to be very small.

The Microsoft paper states that "less than half" of the multiples of B
are "possible outputs" of scalar multiplication using the restricted
scalar range. This is true but deceptive---it suggests that there is a
security question without acknowledging that the answer to the question
is already known. The Microsoft paper doesn't cite any of the literature
on this topic, and also doesn't acknowledge the relevant comments in the
original Curve25519 paper.

---Dan