Re: [Cfrg] On relative performance of Edwards v.s. Montgomery Curve25519, variable base

Andrey Jivsov <crypto@brainhub.org> Thu, 29 January 2015 07:20 UTC

Return-Path: <crypto@brainhub.org>
X-Original-To: cfrg@ietfa.amsl.com
Delivered-To: cfrg@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 171501A8FD2 for <cfrg@ietfa.amsl.com>; Wed, 28 Jan 2015 23:20:52 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.301
X-Spam-Level:
X-Spam-Status: No, score=-1.301 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, J_CHICKENPOX_22=0.6, SPF_PASS=-0.001] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id kuOhvCyQDxMQ for <cfrg@ietfa.amsl.com>; Wed, 28 Jan 2015 23:20:49 -0800 (PST)
Received: from resqmta-po-03v.sys.comcast.net (resqmta-po-03v.sys.comcast.net [IPv6:2001:558:fe16:19:96:114:154:162]) (using TLSv1.2 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id D417D1A0063 for <cfrg@irtf.org>; Wed, 28 Jan 2015 23:20:48 -0800 (PST)
Received: from resomta-po-09v.sys.comcast.net ([96.114.154.233]) by resqmta-po-03v.sys.comcast.net with comcast id ljLo1p00152QWKC01jLoa6; Thu, 29 Jan 2015 07:20:48 +0000
Received: from [192.168.1.2] ([71.202.164.227]) by resomta-po-09v.sys.comcast.net with comcast id ljLn1p0064uhcbK01jLnqs; Thu, 29 Jan 2015 07:20:48 +0000
Message-ID: <54C9DF4F.4010909@brainhub.org>
Date: Wed, 28 Jan 2015 23:20:47 -0800
From: Andrey Jivsov <crypto@brainhub.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0
MIME-Version: 1.0
To: Watson Ladd <watsonbladd@gmail.com>
References: <54AA4AB9.70505@brainhub.org> <54B315CA.6040900@brainhub.org> <88805D27-3B08-421D-B62A-2FC61AC5851A@shiftleft.org> <CACsn0c=qxBXCkr7hCtzgY9U+5_N8hY=jShU7g=hUbqkrUMYxNw@mail.gmail.com> <3C94ED57-5089-4A6D-9CC6-2DCD452C7BCF@shiftleft.org> <CACsn0ck6q9nxioS7q66MkB6M+YmaGj=Nmqop1LQ-DuG0q78GaQ@mail.gmail.com> <54BCAE24.6020408@brainhub.org> <CACsn0ckb6t_bsAocYnBhiqRkaJ3HExF4QNDf83riQqc=uHdQ0A@mail.gmail.com> <54BF6B8C.2020707@brainhub.org> <CACsn0cn9vT8QnaKhZx5LQ1FSjRyagOzHDnW9Ub=xvV93E4EBJw@mail.gmail.com> <54C56190.2060407@brainhub.org> <CACsn0cnG=0N7nSAC4ZVbAAFA64pXeTw==pANT0xgud9g9uGV+w@mail.gmail.com>
In-Reply-To: <CACsn0cnG=0N7nSAC4ZVbAAFA64pXeTw==pANT0xgud9g9uGV+w@mail.gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20140121; t=1422516048; bh=sLextKb/T18tovUEEeuXVChmIhpUF4U5NAAHbJuuF24=; h=Received:Received:Message-ID:Date:From:MIME-Version:To:Subject: Content-Type; b=dJ2Vl4I7mrUmvdid6ySTNmMarteVgN2UsX6ziZGbUw/h2OD8+ydwFyI0Etnt3MMXQ VNw/stZu3f4t2K4odiahVpAXa68/YbqKCgD+C8Sck87tP1F8e5qeOxoXQlhylqR9Mq QE6xA9S0FMyU1Iw5PakgscSjIw4pofGjZ3id4FJ4PUXmpRiLulvgzaYMSM9UvOuD6Y 5DMZykYY0cK8IAMxnYnGu9AhzFswiuZQvcHkBUKiTcH0ihypvzkk/uZIRMLtL9rKky 4MuU+CLkIc5ytY9BsOSJMOWBkWdfokKd3oXWz3a7TV9VKO32pAfLckrrDTTYM08cov pxgPsNHzC6RGA==
Archived-At: <http://mailarchive.ietf.org/arch/msg/cfrg/NyfP1nVFEusNTzNm7GEwbTutfbY>
Cc: "cfrg@irtf.org" <cfrg@irtf.org>
Subject: Re: [Cfrg] On relative performance of Edwards v.s. Montgomery Curve25519, variable base
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <http://www.irtf.org/mailman/options/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <http://www.irtf.org/mail-archive/web/cfrg/>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Thu, 29 Jan 2015 07:20:52 -0000

On 01/25/2015 08:16 PM, Watson Ladd wrote:
> On Sun, Jan 25, 2015 at 1:35 PM, Andrey Jivsov <crypto@brainhub.org> wrote:
>> On 01/21/2015 05:39 PM, Watson Ladd wrote:
>>>
>>> On Wed, Jan 21, 2015 at 1:04 AM, Andrey Jivsov <crypto@brainhub.org>
>>> wrote:
>>>>
>>>> On 01/19/2015 04:20 PM, Watson Ladd wrote:
>>>>>
>>>>> And once again, the Montgomery ladder is extremely small in codesize,
>>>>> one the field operations are implemented. Or is there some other
>>>>> benefit I don't understand you are thinking of?
>>>>
>>>>
>>>> The benefits of using extended twisted Edwards coordinates would be:
>>>>
>>>> * An order of magnitude faster key generation (this is a part of
>>>> signature
>>>> generation or ECDH ephemeral key generation)
>>>> * Ability to add points (needed for signatures and many other protocols)
>>>> * The same code can do what Montgomery ladder does (variable case
>>>> scalarmult) at the same speed.
>>>>
>>>> It plausible that a library that needs more then ECDH variable base
>>>> scalarmult would implement the above operations without the Montgomery
>>>> ladder.
>>>>
>>>> However, if there there is a penalty to recover 'y', that unified
>>>> implementation is less likely to happen.
>>>
>>> And the cost here is what? Note that your first bullet point is false:
>>> the comb method on twisted Edwards can deliver a point on a Montgomery
>>> curve, with minimal change in performance. I agree we would be using
>>> twisted Edwards for signatures and protocols that require addition
>>> (although Mike Hamburg has some ideas on safer point formats, based on
>>> Jacobi quartics). But that doesn't mean we should use the same method
>>> for ECDH: there are security advantages which your proposal doesn't
>>> have.
>>>
>>> Once again this got discussed extensively over the summer.
>>
>>
>> You are proposing to use a Montgomery ladder for second flight of ECDH
>> (variable base). There is a benefit of simplicity of code here and
>> protection against timing attacks for free, but not much of performance.
>> This makes sense.
>>
>> The first flight of ECDH (fixed-base) is too slow with same implementation
>> used for variable base (or any key generation), and so you propose to use a
>> discussed over the summer idea to have another type of Montgomery ladder
>> (which I couldn't locate). It won't be as simple as the first Montgomery
>> ladder and it will probably be more complex as a simple ~20 additions of
>> points from a pre-computed read-only table.
>
> Let's look at actual performance numbers, and consider what the
> algorithm actually is.
>
> First performance http://bench.cr.yp.to/results-dh.html The Curve25519
> implementation presented there outperforms a rather optimized P256
> implementation even if we just use the Montgomery ladder for fixed
> based exponentiation.

Yes, but P256 is not a subject this this thread.

( For P256 v.s. Curve25519+Montgomery I timed that in
https://www.ietf.org/mail-archive/web/cfrg/current/msg05211.html and 
http://www.ietf.org/mail-archive/web/cfrg/current/msg05251.html : up to 
65% faster )

> But we can do better via precomputation: what is the algorithm that
> lets us do this? Well, we can compute the multiplication in Edwards
> form and evaluate the isomorphism or isogeny to Montgomery form. This
> was already spelled out January 6th, in Mike Hamburg's email.

I think this will have a ~10% performance penalty due to y missing.

There was no free way to recover y noted. E.g., if we start with 
Montgomery Z!=1, as Mike Hamburg observed, it is cheaper to perform an 
inverse first than to use ( X, Z!=1 ) in the Montgomery ladder formula 
(because squaring is faster than multiplication), and so on.

>
> Ah, but what if I need more speed? Well, Edwards form isn't faster for
> variable time scalarmult. If you want the absolute fastest speed,
> Kummer surfaces of genus 2 or GLV/GLS curves are standing by to take
> your call.

I wonder if this will be worth the benefit (in a single thread).

>
>>
>> This, however, still doesn't solve the problem of digital signatures and
>> SPAKE, which need to add points.
>
> So we have a different point format for them: again this is what is
> being done in SSH right now. Why is this impossible?

There are benefits to have single implementation (that can add points).

However, regarding the wire format, it may be desirable for an 
encryption key to sign (think OpenPGP top-key only, proof of key 
possession, etc). Ignoring the "existing implementation practice", 
defining the Montgomery (x,y) as the point representation allows to do 
signatures, or encryption, and have single implementation, without 
performance penalty to recover y.

>
>>
>> For example, in https://tools.ietf.org/html/draft-ladd-spake2-01 each peer
>> does 1 fixed-base, 2 variable-base, 2 addition. I would do this with a
>> single implementation of a curve formula (e.g. using extended twisted
>> Edward).
>
> You seem to be conflating curves, coordinates, and computations, and
> then managed to ignore standard speedups Of course, the SPAKE2
> protocol needs to preserve the sign of points, and so cannot work with
> x-coordinate only Montgomery. But what is the actual cost of SPAKE2?
> Firstly, the additions are minimal cost. Secondly, xG+yP is best
> computed by an algorithm due to Strauss, and is much closer to 1.5
> variable-base multiplications.

Hm, 1.5 variable base multiplication doesn't seem appealing here, 
because xG is a factor of 10 faster than yP (for a reasonable 
precomputed table for ~256 bit F(p)) with EC formula that can add points).

>
> When using the comb method, the comb contains affine points, and the
> formulas for mixed addition are significantly faster than general
> addition. One could of course forgo this increased performance, but
> its not clear why one would. Similarly, double-base scalarmult
> requires a complete addition law, while for single base one can get by
> with an incomplete one that is potentially faster. All these
> optimizations are generally implemented and deployed today.
>
>>
>> IMO a library like OpenSSL that anticipates to implement most of these
>> algorithms *may* consider a single implementation that can add points with
>> no performance loss.
>
> But why is that implementation choice so valuable as to justify using
> something other than x-coordinate Montgomery? It's not codesize,
> except in the most constrained environments.

It's system simplicity. One then needs to implement, tune, and debug a 
single implementation.

(Single) Montgomery ladder is an excellent implementation strategy for 
many applications. If these applications' needs are served by the 
Montgomery ladder -- they should use a Montgomery ladder.


> Sincerely,
> Watson Ladd
>