Re: [Cfrg] Curve selection revisited

Robert Ransom <rransom.8774@gmail.com> Sat, 26 July 2014 02:25 UTC

Return-Path: <rransom.8774@gmail.com>
X-Original-To: cfrg@ietfa.amsl.com
Delivered-To: cfrg@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id D63961A01E5 for <cfrg@ietfa.amsl.com>; Fri, 25 Jul 2014 19:25:56 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.75
X-Spam-Level:
X-Spam-Status: No, score=-1.75 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, SPF_PASS=-0.001] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id cRmR8KMXYvp3 for <cfrg@ietfa.amsl.com>; Fri, 25 Jul 2014 19:25:55 -0700 (PDT)
Received: from mail-qg0-x22a.google.com (mail-qg0-x22a.google.com [IPv6:2607:f8b0:400d:c04::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id A70981A01D5 for <cfrg@irtf.org>; Fri, 25 Jul 2014 19:25:55 -0700 (PDT)
Received: by mail-qg0-f42.google.com with SMTP id j5so6036906qga.1 for <cfrg@irtf.org>; Fri, 25 Jul 2014 19:25:54 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=zs21VYF2/O9gJI620pb+jO0xPdulIe8y2Sjm41cDB9o=; b=dOSKZHOnAikmeHk8NnqtmpCXGqqoB9xiUCdwSgBuTcE9cj0dWrXvLy+QNpE8l4jEUb iNBHCli2hqPVc9+b6ApUpY562OtBt2iAHtVVto9/Mkb6Cp0pdHLTovgjQ7qmGEdV0IhP XrDu4mHzdlDOZthCp/PKruTRnLmFdNW8NqyH482zo81xfNK7jWfOAiD53WaII6V2c9Kg Gvjq2XjNc/JvOGz0frS13Nwsb5yOp1H6hQ/2iOVnVmWztScfKWAwWHaaAP5ndeIwFDig r5c2dbBEK6uAb6W1DUierqJGD40F6mn44JSh9KqdkC/4LpvwYoEtmy4CsBo630+3gKZ2 SxQw==
MIME-Version: 1.0
X-Received: by 10.224.134.201 with SMTP id k9mr33883918qat.59.1406341554734; Fri, 25 Jul 2014 19:25:54 -0700 (PDT)
Received: by 10.140.98.233 with HTTP; Fri, 25 Jul 2014 19:25:54 -0700 (PDT)
In-Reply-To: <CACsn0ckqFigWoH2+OOEHSd2VWPp8y6=m8H5OsFRyjXmjK7+m4w@mail.gmail.com>
References: <CA+Vbu7xroa68=HOZtbf=oz7kK2EeUv_z1okpnjxHPR0ZtHD5cA@mail.gmail.com> <CFF7E184.28E9F%kenny.paterson@rhul.ac.uk> <53D2781B.8030605@sbcglobal.net> <CACsn0ckqFigWoH2+OOEHSd2VWPp8y6=m8H5OsFRyjXmjK7+m4w@mail.gmail.com>
Date: Fri, 25 Jul 2014 19:25:54 -0700
Message-ID: <CABqy+srxMNuG0AaQd0SaegHvZWgbW762EQq+iAHL_fbu6sOJJQ@mail.gmail.com>
From: Robert Ransom <rransom.8774@gmail.com>
To: Watson Ladd <watsonbladd@gmail.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: http://mailarchive.ietf.org/arch/msg/cfrg/Wvsgkn1nuKPf5Gal8WCII7a_AMM
Cc: "cfrg@irtf.org" <cfrg@irtf.org>
Subject: Re: [Cfrg] Curve selection revisited
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <http://www.irtf.org/mailman/options/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <http://www.irtf.org/mail-archive/web/cfrg/>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Sat, 26 Jul 2014 02:25:57 -0000

On 7/25/14, Watson Ladd <watsonbladd@gmail.com> wrote:

> So I'd like to explain a bit about how implementations work, and why
> performance will not help us decide between Curve25519 in twisted
> Edwards form and the NUMS curve of similar size in twisted Edwards
> form.
>
> The first question is how to do the field arithmetic. All the primes
> being considered are of the form 2^s-c, so either one of saturated
> arithmetic
> or unsaturated will work. Unsaturated is more portable: C cannot
> access the carry flag. It offers more freedom in radix, and lets carry
> chains be vectorized. Karatsuba can be used or not, which depends on
> the saturated/unsaturated choice. But the exact values of s and c
> don't matter as much.

The exact value of c determines whether a portable or vectorized
implementation can delay carries until after reduction.  For example,
after curve25519-donna performs a multiply or squaring, each of 19
64-bit limbs has absolute value less than 14*2^54.  When
freduce_degree reduces a field element to 10 limbs without carrying,
the result has absolute value less than (c+1)*14*2^54 = 280*2^54, or a
bit more than 2^62.  In the Microsoft field 2^255 - 765, or even 2^256
- 189, this would not be possible.

Carries between limbs can be performed more efficiently than in
curve25519-donna -- curve25519-donna goes to extra effort to keep
limbs signed.  In the Montgomery ladder, this optimization does not
add any extra carry steps, because the Montgomery-ladder formulas
never perform more than two add/subtract operations between a
multiply/square operation.  With Edwards-curve operations, this
optimization may require an additional carry step in the doubling
formula.

The exact value of s determines whether carries can be vectorized
within a coordinate-field element.  For example, 2^251 - 9 can be
represented using one 26-bit limb (in the least significant position)
and 9 25-bit limbs, but this interferes with the vectorization
strategy described in the neoncrypto paper (by Dr. Bernstein et al.).
Since c=9 is so small, this can be worked around by representing
coordinates modulo 2^252 - 18 instead, but this trick is rarely
possible without running into the bound on c.


> The second question and third question what coordinates and what
> addition formulas to use when, and what algorithm to use. All twisted
> Edwards curves expose the same possibilities here: large precomputed
> tables with mixed coordinate addition vs. smaller tables for variable
> base, or a conversion to some other isogenous curve for variable base
> exponentiation.

Affine Montgomery-form points can be unpacked to and packed from
projective (optionally twisted) Edwards-form points at a trivial extra
cost over unpacking from or packing to a compressed affine
Edwards-form point.  Variable-base scalar-multiplication operations
which use Edwards form internally are not slowed significantly by
having a projective rather than affine input (if any operations rely
on their input being affine, they must occur during table setup), and
they are not slowed at all if they double and/or apply isogenies to
clear the cofactor group before operating on an ‘incomplete’ curve
(e.g. a=-1 in a field in which -1 is a non-square).

As you should know, fixed-base scalar-multiplication operations which
use Edwards form internally are not slowed at all by producing
Montgomery-form output.

Affine Edwards-form points cannot be unpacked to affine
Montgomery-form points for use with the Montgomery ladder without an
extra inversion at the beginning of a scalar multiplication, and using
a projective input in the Montgomery ladder is always slower than
inversion by coordinate exponentiation.

Since the Montgomery ladder remains the simplest implementation
strategy, and the performance benefit to more complex implementations
from using an Edwards-form point format is small at most, I still see
no reason to use an Edwards-form point format.


> If we are willing to use Montgomery form (which the TLS WG initially
> had strong support for in Curve25519)

Has the TLS WG changed its support for using the widely implemented,
de-facto standard point format for Curve25519?  If so, when was this
decision made?

> one caches ephemeral DH
> parameters so the cost of fixed-base exponentiation is amortized
> across connections. OpenSSL does this anyway, and this affects
> technique
> slightly.

This is good for performance even if you use a table to optimize
fixed-base scalar multiplications.

> But to be clear, a fast implementation for one prime and
> curve of a certain size and shape will likely be adaptable to a
> different curve
> over a similar prime with the same shape.

This is simply not true.  Curve operations can often be reused, and
some other utility routines can easily be modularized in a way that
supports other coordinate fields with similar properties (e.g.
q=3mod4), but primes which make portable or vectorized coordinate
operation implementations efficient are too rare for there to be any
pairs of ‘similar prime’s.


> Performance data will not distinguish between s=255 and s=256.

It will distinguish between Curve25519 and the Microsoft curves in the
implementations (either portable, open-source C, or heavily-optimized
machine code) which are actually used.


Robert Ransom