Re: [Cfrg] On relative performance of Edwards v.s. Montgomery Curve25519, variable base

Watson Ladd <watsonbladd@gmail.com> Mon, 26 January 2015 04:16 UTC

Return-Path: <watsonbladd@gmail.com>
X-Original-To: cfrg@ietfa.amsl.com
Delivered-To: cfrg@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 8D8B61A1BD4 for <cfrg@ietfa.amsl.com>; Sun, 25 Jan 2015 20:16:43 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.4
X-Spam-Level:
X-Spam-Status: No, score=-1.4 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, J_CHICKENPOX_22=0.6, SPF_PASS=-0.001] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id u_Ce8ruCO57d for <cfrg@ietfa.amsl.com>; Sun, 25 Jan 2015 20:16:41 -0800 (PST)
Received: from mail-yk0-x233.google.com (mail-yk0-x233.google.com [IPv6:2607:f8b0:4002:c07::233]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 55B271A1B85 for <cfrg@irtf.org>; Sun, 25 Jan 2015 20:16:41 -0800 (PST)
Received: by mail-yk0-f179.google.com with SMTP id 142so3006355ykq.10 for <cfrg@irtf.org>; Sun, 25 Jan 2015 20:16:40 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=PumQ8sbTgT6wrUkE+KtBmYNPLYfo00nVTy1Dmq4YOek=; b=uE+ltrRPxg/JS9aZrGdBIie1896l55PUv/mA6n77Zuca8qwvDac07+VxnhyA8RretB Of8lcXtBzzNJTPJyp49eyS5YJDikJy9vdeIPweqKIHw6mWkBfLQU0z4BduP4FSPVp3Pd zOkPlFaoQv1p87vz4IJZ+10NyowBtNC49dA72p2pR2ZA9Fmwu7D0B7G4w6mA0gwtz0HX lNqbSV4rJPdsbb7Gds95qLVHULNyq35iygGByvXwcNvvLHKcLpOsfOasgOA/odm2tLuy EnWYNJqh5dXz1x5eTQ1hbXBfynwDfTJ1nV89X6qqgE+sJhrpyThWEJnV+p8YGl00SY+E hFPQ==
MIME-Version: 1.0
X-Received: by 10.170.112.215 with SMTP id e206mr9396713ykb.126.1422245800616; Sun, 25 Jan 2015 20:16:40 -0800 (PST)
Received: by 10.170.115.77 with HTTP; Sun, 25 Jan 2015 20:16:40 -0800 (PST)
In-Reply-To: <54C56190.2060407@brainhub.org>
References: <54AA4AB9.70505@brainhub.org> <54B315CA.6040900@brainhub.org> <88805D27-3B08-421D-B62A-2FC61AC5851A@shiftleft.org> <CACsn0c=qxBXCkr7hCtzgY9U+5_N8hY=jShU7g=hUbqkrUMYxNw@mail.gmail.com> <3C94ED57-5089-4A6D-9CC6-2DCD452C7BCF@shiftleft.org> <CACsn0ck6q9nxioS7q66MkB6M+YmaGj=Nmqop1LQ-DuG0q78GaQ@mail.gmail.com> <54BCAE24.6020408@brainhub.org> <CACsn0ckb6t_bsAocYnBhiqRkaJ3HExF4QNDf83riQqc=uHdQ0A@mail.gmail.com> <54BF6B8C.2020707@brainhub.org> <CACsn0cn9vT8QnaKhZx5LQ1FSjRyagOzHDnW9Ub=xvV93E4EBJw@mail.gmail.com> <54C56190.2060407@brainhub.org>
Date: Sun, 25 Jan 2015 20:16:40 -0800
Message-ID: <CACsn0cnG=0N7nSAC4ZVbAAFA64pXeTw==pANT0xgud9g9uGV+w@mail.gmail.com>
From: Watson Ladd <watsonbladd@gmail.com>
To: Andrey Jivsov <crypto@brainhub.org>
Content-Type: text/plain; charset=UTF-8
Archived-At: <http://mailarchive.ietf.org/arch/msg/cfrg/b-iV7oiKR-5UAW7LpsOrbJmpGdc>
Cc: "cfrg@irtf.org" <cfrg@irtf.org>
Subject: Re: [Cfrg] On relative performance of Edwards v.s. Montgomery Curve25519, variable base
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <http://www.irtf.org/mailman/options/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <http://www.irtf.org/mail-archive/web/cfrg/>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Mon, 26 Jan 2015 04:16:43 -0000

On Sun, Jan 25, 2015 at 1:35 PM, Andrey Jivsov <crypto@brainhub.org> wrote:
> On 01/21/2015 05:39 PM, Watson Ladd wrote:
>>
>> On Wed, Jan 21, 2015 at 1:04 AM, Andrey Jivsov <crypto@brainhub.org>
>> wrote:
>>>
>>> On 01/19/2015 04:20 PM, Watson Ladd wrote:
>>>>
>>>> And once again, the Montgomery ladder is extremely small in codesize,
>>>> one the field operations are implemented. Or is there some other
>>>> benefit I don't understand you are thinking of?
>>>
>>>
>>> The benefits of using extended twisted Edwards coordinates would be:
>>>
>>> * An order of magnitude faster key generation (this is a part of
>>> signature
>>> generation or ECDH ephemeral key generation)
>>> * Ability to add points (needed for signatures and many other protocols)
>>> * The same code can do what Montgomery ladder does (variable case
>>> scalarmult) at the same speed.
>>>
>>> It plausible that a library that needs more then ECDH variable base
>>> scalarmult would implement the above operations without the Montgomery
>>> ladder.
>>>
>>> However, if there there is a penalty to recover 'y', that unified
>>> implementation is less likely to happen.
>>
>> And the cost here is what? Note that your first bullet point is false:
>> the comb method on twisted Edwards can deliver a point on a Montgomery
>> curve, with minimal change in performance. I agree we would be using
>> twisted Edwards for signatures and protocols that require addition
>> (although Mike Hamburg has some ideas on safer point formats, based on
>> Jacobi quartics). But that doesn't mean we should use the same method
>> for ECDH: there are security advantages which your proposal doesn't
>> have.
>>
>> Once again this got discussed extensively over the summer.
>
>
> You are proposing to use a Montgomery ladder for second flight of ECDH
> (variable base). There is a benefit of simplicity of code here and
> protection against timing attacks for free, but not much of performance.
> This makes sense.
>
> The first flight of ECDH (fixed-base) is too slow with same implementation
> used for variable base (or any key generation), and so you propose to use a
> discussed over the summer idea to have another type of Montgomery ladder
> (which I couldn't locate). It won't be as simple as the first Montgomery
> ladder and it will probably be more complex as a simple ~20 additions of
> points from a pre-computed read-only table.

Let's look at actual performance numbers, and consider what the
algorithm actually is.

First performance http://bench.cr.yp.to/results-dh.html The Curve25519
implementation presented there outperforms a rather optimized P256
implementation even if we just use the Montgomery ladder for fixed
based exponentiation.

But we can do better via precomputation: what is the algorithm that
lets us do this? Well, we can compute the multiplication in Edwards
form and evaluate the isomorphism or isogeny to Montgomery form. This
was already spelled out January 6th, in Mike Hamburg's email.

Ah, but what if I need more speed? Well, Edwards form isn't faster for
variable time scalarmult. If you want the absolute fastest speed,
Kummer surfaces of genus 2 or GLV/GLS curves are standing by to take
your call.

>
> This, however, still doesn't solve the problem of digital signatures and
> SPAKE, which need to add points.

So we have a different point format for them: again this is what is
being done in SSH right now. Why is this impossible?

>
> For example, in https://tools.ietf.org/html/draft-ladd-spake2-01 each peer
> does 1 fixed-base, 2 variable-base, 2 addition. I would do this with a
> single implementation of a curve formula (e.g. using extended twisted
> Edward).

You seem to be conflating curves, coordinates, and computations, and
then managed to ignore standard speedups Of course, the SPAKE2
protocol needs to preserve the sign of points, and so cannot work with
x-coordinate only Montgomery. But what is the actual cost of SPAKE2?
Firstly, the additions are minimal cost. Secondly, xG+yP is best
computed by an algorithm due to Strauss, and is much closer to 1.5
variable-base multiplications.

When using the comb method, the comb contains affine points, and the
formulas for mixed addition are significantly faster than general
addition. One could of course forgo this increased performance, but
its not clear why one would. Similarly, double-base scalarmult
requires a complete addition law, while for single base one can get by
with an incomplete one that is potentially faster. All these
optimizations are generally implemented and deployed today.

>
> IMO a library like OpenSSL that anticipates to implement most of these
> algorithms *may* consider a single implementation that can add points with
> no performance loss.

But why is that implementation choice so valuable as to justify using
something other than x-coordinate Montgomery? It's not codesize,
except in the most constrained environments.

Sincerely,
Watson Ladd