Re: [Cfrg] Elliptic Curves - curve form and coordinate systems

"D. J. Bernstein" <djb@cr.yp.to> Mon, 16 March 2015 00:23 UTC

Return-Path: <djb-dsn2-1406711340.7506@cr.yp.to>
X-Original-To: cfrg@ietfa.amsl.com
Delivered-To: cfrg@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 09A0C1A1BF2 for <cfrg@ietfa.amsl.com>; Sun, 15 Mar 2015 17:23:10 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 2.897
X-Spam-Level: **
X-Spam-Status: No, score=2.897 tagged_above=-999 required=5 tests=[BAYES_50=0.8, HELO_EQ_NL=0.55, HOST_EQ_NL=1.545, LOTS_OF_MONEY=0.001, UNPARSEABLE_RELAY=0.001] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Nztnrm7GmxLr for <cfrg@ietfa.amsl.com>; Sun, 15 Mar 2015 17:23:07 -0700 (PDT)
Received: from calvin.win.tue.nl (calvin.win.tue.nl [131.155.70.11]) by ietfa.amsl.com (Postfix) with SMTP id B4C051A1BFE for <cfrg@irtf.org>; Sun, 15 Mar 2015 17:23:06 -0700 (PDT)
Received: (qmail 18306 invoked by uid 1017); 16 Mar 2015 00:23:24 -0000
Received: from unknown (unknown) by unknown with QMTP; 16 Mar 2015 00:23:24 -0000
Received: (qmail 28857 invoked by uid 1000); 16 Mar 2015 00:22:55 -0000
Date: Mon, 16 Mar 2015 00:22:55 -0000
Message-ID: <20150316002255.28855.qmail@cr.yp.to>
From: "D. J. Bernstein" <djb@cr.yp.to>
To: cfrg@irtf.org
Mail-Followup-To: cfrg@irtf.org
In-Reply-To: <5501E6A5.5040608@brainhub.org> <A6F30412-8E0A-4D8D-9F26-580307B46874@shiftleft.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Archived-At: <http://mailarchive.ietf.org/arch/msg/cfrg/512dkohjs027ZCqlcY7_CYQDf6Y>
Subject: Re: [Cfrg] Elliptic Curves - curve form and coordinate systems
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <http://www.irtf.org/mailman/options/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <http://www.irtf.org/mail-archive/web/cfrg/>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Mon, 16 Mar 2015 00:23:10 -0000

One of the OpenSSL security fixes this year was correcting a bug in
point validation. The bug is estimated to allow more than 2^192 bogus
(x,y) pairs for NIST P-256, and presumably a few of those have order
below 2^66. Finding these medium-order points is an open problem for the
moment, but anyone who _can_ find them can use an active invalid-point
attack to break Microsoft's two-hour ECDH keys with <2^70 computations,
and thus to decrypt all other connections made during that two-hour
period. (The computations don't have to finish within the two hours.
Maybe this work can even be shared across multiple ECDH keys; I'd have
to look at other protocol details to figure this out.)

How many years did it take for this bug to be found in a very widely
used cryptographic library? Who's reviewing all the other NIST P-256
implementations to see whether they even _try_ to validate points, never
mind whether they get it right?

With the Montgomery ladder, the curve is reused in every step, and a bug
that corrupts computations will simply move to different points on the
same curve or its twist. Next-generation curves such as Curve25519
guarantee that _none_ of those points have dangerous orders. The fault
attack in

   https://www.di.ens.fr/~fouque/pub/fdtc08.pdf

worked only because it targeted old-fashioned curves that weren't
twist-secure, such as NIST P-224. With proper choices (twist-secure
curve, and scalar bits set accordingly) Montgomery-ladder ECDH doesn't
need _any_ input validation.

In particular, input 0 is handled correctly without any special tests,
contrary to Rene's comments and earlier comments from Microsoft. See
https://www.ietf.org/mail-archive/web/cfrg/current/msg05004.html.

Here's another interesting data point regarding code simplicity. In
flipping through Microsoft's "constant-time" ECCLib_v1.1, I noticed

   temp = (scalar[0] & val1) - val2; // ki = (k mod 2^w)/2^(w-1)
   *digits = (int)temp;
   digits++;
   res = scalar[0] - temp;           // k = (k-ki)/2^(w-1)
   borrow = (temp > 0) & (scalar[0] < (unsigned int)temp);

inside the fixed_window_recode() function. The scalar is, apparently,
supplied by an attacker, and I would expect the compiler in most cases
to turn the comparisons into variable-time branches, leaking scalar bits
through timing---clearly violating the "constant-time" promise. With the
Montgomery ladder, the scalar bits are used in a much simpler way,
minimizing the number of opportunities to screw up the implementation.

Andrey Jivsov writes:
> A sample of the interest in unified format can be found here:

You misunderstand. There are two different types of ECDH keygen code
available for X25519:

   * _simple_ key generators that use the Montgomery ladder;
   * _fast_ key generators that use precomputed Edwards points and
     convert the output to Montgomery format.

Both of these approaches produce identical Montgomery x-coordinates on
the wire. The fact that the fast implementations use Edwards internally
doesn't mean that all implementations want to do this (for most people
simplicity is more important than this extra burst of keygen speed) and
certainly doesn't mean that it's a good idea to use anything other than
Montgomery x on the wire for ECDH.

  [ send an extra coordinate v along with the Montgomery x ]
> An implementation that uses a Montgomery ladder simply ignores v

Well, yes, this is what any sensible ECDH receiver will do, so why waste
the network traffic? Meanwhile you're forcing the ECDH key generator to
compute v in the first place---a significant increase in the complexity
of the simplest (Montgomery ladder) keygen software.

> implement both signatures and ECDH

Switching from a "unified" format to Montgomery x for ECDH usually makes
this type of software _simpler_. The underlying issue is that there are
actually three different scalar-multiplication operations of interest:

   * double-scalar multiplication (signature verification);
   * fixed-base-point single-scalar multiplication (signing/keygen); and
   * variable-base-point single-scalar multiplication (ECDH).

For example, in Microsoft's library these are ECC_DBLMUL_TE,
ECC_MUL_FIXED_TE, and ECC_MUL_TE. Ripping all the ECDH-specific
functions out of the library (ECC_MUL_TE, fixed_window_recode, etc.),
and replacing them with a Montgomery ladder, would _simplify_ the code.

Furthermore, even if you can find a signature+ECDH library where a
"unified" format _reduces_ complexity rather than _increasing_ it, the
percentage saved will obviously be quite small---whereas sticking to
Montgomery x for ECDH saves a _much_ larger percentage of a pure ECDH
implementation.

> a single set of test vectors, code reuse, and the benefits of sharing
> pre-computed values

Every function needs its own test vectors in any case. Pre-computed
values are shared in any case. As for code reuse, see above.

> no (measurable) performance penalty

The extra network traffic is a measurable performance penalty, and is
more important than the CPU time saved for decompression inside
signature verification. See below.

Michael Hamburg writes:
  [ compression ]
> saves time and energy on radio links

Yes, obviously.

> but costs time and energy on wired ones

No.

Time: Curve25519 decompression takes roughly 5 microseconds on a typical
server CPU core, but this is less than the time that the core needs to
receive 256 bits from the Internet---unless you've bought a single
quad-core server to handle more than 200Mbps of Internet traffic; this
doesn't make economic sense.

Energy: 10 USD of electricity will pay for about 10 watt-years of this
125-watt quad-core CPU, i.e., a month of CPU time, i.e., 2000000000000
decompressions, saving 64 terabytes of Internet traffic---a month of
continuous traffic at 200Mbps, as above. Some ISPs do sell noticeable
fractions of 200Mbps for 10 USD/month, but that's because they know that
most users won't use the network anywhere near continuously.

> or when the point is cached

No. You pay _once_ to decompress a point the first time you use it, and
after that you cache the decompressed version. (I'm assuming that it's
cheaper for you to store the extra bytes than to recompute them; if not,
you should cache the compressed version.) Local caching doesn't affect
the performance impact of compression on the wire.

---Dan