Re: [Cfrg] Curve selection revisited

Watson Ladd <> Sat, 26 July 2014 03:40 UTC

Return-Path: <>
Received: from localhost ( []) by (Postfix) with ESMTP id 445FC1A0272 for <>; Fri, 25 Jul 2014 20:40:44 -0700 (PDT)
X-Virus-Scanned: amavisd-new at
X-Spam-Flag: NO
X-Spam-Score: -2
X-Spam-Status: No, score=-2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, SPF_PASS=-0.001] autolearn=ham
Received: from ([]) by localhost ( []) (amavisd-new, port 10024) with ESMTP id oNFM-e2H_YzD for <>; Fri, 25 Jul 2014 20:40:41 -0700 (PDT)
Received: from ( [IPv6:2607:f8b0:4002:c01::233]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by (Postfix) with ESMTPS id BC4761A0270 for <>; Fri, 25 Jul 2014 20:40:41 -0700 (PDT)
Received: by with SMTP id f73so3519273yha.38 for <>; Fri, 25 Jul 2014 20:40:41 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=E7h4yGJuNo4sY1soC6ZA8eXfBRNZ4J+Bde8wjhyxfoY=; b=G9CPexKrpPSY6ZvkaiavYn3edHxPoc/EDXAG60QLty6bY2sTC2nhRHgIsYMA9k6V9q pXcuEt6vMSX1UqPtb9/Q2mGoAsAp8q3c0muMFwre5TJKgY9GtS76BMdQiYV/YsFzJLP/ 27zhsrEA0IzXstZck8tao+ovE1v2bSy2dlEnas7YqzTZyH0U18AIsqQYsFmD9k4JwvW+ k9hkYvGDseQ+2tgtA2jKCl44WQN4ztE9QGOuksdsyfswnz/trrU1iSaCDggFPw0jGrfs KADlaRclV/taG5unlCcX0W8gsy+cFGjpc+EBkr2FjdSvAhju+dZN3PvJIV+Qa9MNSgn1 ciwQ==
MIME-Version: 1.0
X-Received: by with SMTP id g23mr28290469yhc.15.1406346040916; Fri, 25 Jul 2014 20:40:40 -0700 (PDT)
Received: by with HTTP; Fri, 25 Jul 2014 20:40:40 -0700 (PDT)
In-Reply-To: <>
References: <> <> <> <> <>
Date: Fri, 25 Jul 2014 20:40:40 -0700
Message-ID: <>
From: Watson Ladd <>
To: Robert Ransom <>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Cc: "" <>
Subject: Re: [Cfrg] Curve selection revisited
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Crypto Forum Research Group <>
List-Unsubscribe: <>, <>
List-Archive: <>
List-Post: <>
List-Help: <>
List-Subscribe: <>, <>
X-List-Received-Date: Sat, 26 Jul 2014 03:40:44 -0000

Thanks for the much more detailed analysis: I'll have to revise my opinion.
On Fri, Jul 25, 2014 at 7:25 PM, Robert Ransom <> wrote:
> On 7/25/14, Watson Ladd <> wrote:
>> So I'd like to explain a bit about how implementations work, and why
>> performance will not help us decide between Curve25519 in twisted
>> Edwards form and the NUMS curve of similar size in twisted Edwards
>> form.
>> The first question is how to do the field arithmetic. All the primes
>> being considered are of the form 2^s-c, so either one of saturated
>> arithmetic
>> or unsaturated will work. Unsaturated is more portable: C cannot
>> access the carry flag. It offers more freedom in radix, and lets carry
>> chains be vectorized. Karatsuba can be used or not, which depends on
>> the saturated/unsaturated choice. But the exact values of s and c
>> don't matter as much.
> The exact value of c determines whether a portable or vectorized
> implementation can delay carries until after reduction.  For example,
> after curve25519-donna performs a multiply or squaring, each of 19
> 64-bit limbs has absolute value less than 14*2^54.  When
> freduce_degree reduces a field element to 10 limbs without carrying,
> the result has absolute value less than (c+1)*14*2^54 = 280*2^54, or a
> bit more than 2^62.  In the Microsoft field 2^255 - 765, or even 2^256
> - 189, this would not be possible.
> Carries between limbs can be performed more efficiently than in
> curve25519-donna -- curve25519-donna goes to extra effort to keep
> limbs signed.  In the Montgomery ladder, this optimization does not
> add any extra carry steps, because the Montgomery-ladder formulas
> never perform more than two add/subtract operations between a
> multiply/square operation.  With Edwards-curve operations, this
> optimization may require an additional carry step in the doubling
> formula.
> The exact value of s determines whether carries can be vectorized
> within a coordinate-field element.  For example, 2^251 - 9 can be
> represented using one 26-bit limb (in the least significant position)
> and 9 25-bit limbs, but this interferes with the vectorization
> strategy described in the neoncrypto paper (by Dr. Bernstein et al.).
> Since c=9 is so small, this can be worked around by representing
> coordinates modulo 2^252 - 18 instead, but this trick is rarely
> possible without running into the bound on c.

True: since square roots in F_p with p=5(8) are easy, one of the rationals
for the NUMS curves is just wrong, and we can live with smaller c.
>> The second question and third question what coordinates and what
>> addition formulas to use when, and what algorithm to use. All twisted
>> Edwards curves expose the same possibilities here: large precomputed
>> tables with mixed coordinate addition vs. smaller tables for variable
>> base, or a conversion to some other isogenous curve for variable base
>> exponentiation.
> Affine Montgomery-form points can be unpacked to and packed from
> projective (optionally twisted) Edwards-form points at a trivial extra
> cost over unpacking from or packing to a compressed affine
> Edwards-form point.  Variable-base scalar-multiplication operations
> which use Edwards form internally are not slowed significantly by
> having a projective rather than affine input (if any operations rely
> on their input being affine, they must occur during table setup), and
> they are not slowed at all if they double and/or apply isogenies to
> clear the cofactor group before operating on an ‘incomplete’ curve
> (e.g. a=-1 in a field in which -1 is a non-square).

I think I follow all the way up to the last phrase: the incompleteness
means some
additions might blow up, complicating the analysis, and I think a=-1
only gives the twist
if -1 is a nonsquare. Otherwise it is a curve isogenous to the
original one, under a 1-isogeny.
I've not thought about it that much.

> As you should know, fixed-base scalar-multiplication operations which
> use Edwards form internally are not slowed at all by producing
> Montgomery-form output.
> Affine Edwards-form points cannot be unpacked to affine
> Montgomery-form points for use with the Montgomery ladder without an
> extra inversion at the beginning of a scalar multiplication, and using
> a projective input in the Montgomery ladder is always slower than
> inversion by coordinate exponentiation.
> Since the Montgomery ladder remains the simplest implementation
> strategy, and the performance benefit to more complex implementations
> from using an Edwards-form point format is small at most, I still see
> no reason to use an Edwards-form point format.

Signatures: the y-encoding strategies I've seen are too clever by
half, so keys for
ECIES and signatures look very different. There is no security issue
with reuse and
there are papers to that effect, or so I'm told.
That said adding Edwards support or Montgomery support to an implementation
of the other partner in the isogenous pair is easy: we can live with
two point formats.

(there is an additional question of small parameters: a Montgomery
curve with small
parameter is birationally equivalent to a twisted Edwards curve of
small parameter, but
the converse is not (obviously) true. Most generated curves seem to be
in Edwards form
>> If we are willing to use Montgomery form (which the TLS WG initially
>> had strong support for in Curve25519)
> Has the TLS WG changed its support for using the widely implemented,
> de-facto standard point format for Curve25519?  If so, when was this
> decision made?

Quite true: perhaps the leadership would like to explain why
"curve25519 has no security
issues" was insufficient for the adoption of curve25519 in TLS for
ECDHE, given substantial
implementor support and existing drafts.

>> one caches ephemeral DH
>> parameters so the cost of fixed-base exponentiation is amortized
>> across connections. OpenSSL does this anyway, and this affects
>> technique
>> slightly.
> This is good for performance even if you use a table to optimize
> fixed-base scalar multiplications.

Very true: the NUMS performance results are thus dead wrong in ECDHE costs,
which are two variable-base multiplications plus a very amortized
fixed-base multiplication, not
the fixed+variable base they used.

>> But to be clear, a fast implementation for one prime and
>> curve of a certain size and shape will likely be adaptable to a
>> different curve
>> over a similar prime with the same shape.
> This is simply not true.  Curve operations can often be reused, and
> some other utility routines can easily be modularized in a way that
> supports other coordinate fields with similar properties (e.g.
> q=3mod4), but primes which make portable or vectorized coordinate
> operation implementations efficient are too rare for there to be any
> pairs of ‘similar prime’s.

I think within 10% or 5% is likely. Certainly not outside 50%.
>> Performance data will not distinguish between s=255 and s=256.
> It will distinguish between Curve25519 and the Microsoft curves in the
> implementations (either portable, open-source C, or heavily-optimized
> machine code) which are actually used.

Well, absent hard numbers I'm not sure we can say how much it distinguishes.
> Robert Ransom

Watson ladd