Re: [Cfrg] On relative performance of Edwards v.s. Montgomery Curve25519, variable base

Watson Ladd <watsonbladd@gmail.com> Mon, 12 January 2015 15:28 UTC

Return-Path: <watsonbladd@gmail.com>
X-Original-To: cfrg@ietfa.amsl.com
Delivered-To: cfrg@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 841811ABC0F for <cfrg@ietfa.amsl.com>; Mon, 12 Jan 2015 07:28:16 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2
X-Spam-Level:
X-Spam-Status: No, score=-2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, SPF_PASS=-0.001] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id k0kRuqTuw_BM for <cfrg@ietfa.amsl.com>; Mon, 12 Jan 2015 07:28:12 -0800 (PST)
Received: from mail-yk0-x233.google.com (mail-yk0-x233.google.com [IPv6:2607:f8b0:4002:c07::233]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 10DC81AC3A0 for <cfrg@irtf.org>; Mon, 12 Jan 2015 07:28:12 -0800 (PST)
Received: by mail-yk0-f179.google.com with SMTP id 19so9441443ykq.10 for <cfrg@irtf.org>; Mon, 12 Jan 2015 07:28:11 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=iTpwh9REkXXYzsRRmM+3aF70KrnuV3EO8Sijvaqbs8U=; b=B1L6W/g85uEOcbNMxVWQET3GaSsNd8VehrZYTo4gUGgGbjHt2baLsGnVHA1S1MBWVS 5irGAGSJIGXvv9O+rpvhQe4yBy3z2Ld3D76IXpx5ScsW+xT96ExRX2UCF2TLU9zMNnFl BqBVmrWaDzXF85R9M8/NFek4Osql8wcCr1T9SCbJ0pBs66wYNl15Estu1EdgiV0VjYwq 2V1zyykeF/FmmrnT7cP25UgOsybO/Ozteni4K1CEaw14dvaUswd2SpdZRUAds7kn388D I74oNMDC98n0WKPm/+4tKTP6WpCt8sP7pgCLXsmaoWGs0MbF0HMwVD/C1QHHldPzre/d M0Ig==
MIME-Version: 1.0
X-Received: by 10.236.11.45 with SMTP id 33mr22909344yhw.4.1421076491027; Mon, 12 Jan 2015 07:28:11 -0800 (PST)
Received: by 10.170.207.6 with HTTP; Mon, 12 Jan 2015 07:28:10 -0800 (PST)
In-Reply-To: <88805D27-3B08-421D-B62A-2FC61AC5851A@shiftleft.org>
References: <54AA4AB9.70505@brainhub.org> <54B315CA.6040900@brainhub.org> <88805D27-3B08-421D-B62A-2FC61AC5851A@shiftleft.org>
Date: Mon, 12 Jan 2015 07:28:10 -0800
Message-ID: <CACsn0c=qxBXCkr7hCtzgY9U+5_N8hY=jShU7g=hUbqkrUMYxNw@mail.gmail.com>
From: Watson Ladd <watsonbladd@gmail.com>
To: Michael Hamburg <mike@shiftleft.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <http://mailarchive.ietf.org/arch/msg/cfrg/GUDHsWIbNblqyF0NbD-KWQZFVuo>
Cc: "cfrg@irtf.org" <cfrg@irtf.org>
Subject: Re: [Cfrg] On relative performance of Edwards v.s. Montgomery Curve25519, variable base
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <http://www.irtf.org/mailman/options/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <http://www.irtf.org/mail-archive/web/cfrg/>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Mon, 12 Jan 2015 15:28:16 -0000

On Sun, Jan 11, 2015 at 6:52 PM, Michael Hamburg <mike@shiftleft.org> wrote:
> Thanks for your analysis!  However, please note that with lazy extended homogeneous coordinates on a twisted Edwards curve with a=-1, the cost of an Edwards addition to an accumulator is 8M and not 10M+1S, and the cost of a doubling is still 3M+4S, perhaps with an additional 1M to convert from this form to a suitable one for the window table.
>
> Furthermore, the multiplication by a constant in the Montgomery curve is not negligible, but neither is the cost of a constant-time lookup in the Edwards window table.
>
> Cheers,
> — Mike

Yes, this is the fastest coordinate system. Inverted Edwards, which I
used for my calculation (but with a stupidly slow inversion routine)
speeds doubling but not additions significantly, compared to extended
homogeneous coordinates.

The table is filled with a cost of 8M for additions and 4M+4S for
doublings so 2^(w-2)*8M+2^(w-2)*(4M+4S). Then we proceed for
ceil(255/w)-1steps, each step consisting of (w-1) 3M+4S doublings, and
1 4M+4S doubling. Ergo the total cost becomes
2^(w-2)*(12M+4S)+(ceil(255/w)-1)*(w*(3M+4S)+1M).

Evaluating we get w=3 as the best choice, with 864 multiplications and
1016 squarings. Of course, it's possible I made a typo in the above.
Adding in the inversion at the end, gives 875*M+1261*S. The total
savings over Montgomery ladder with M=0.8S are about 20%.

But we're neglecting the square root, which seems to be the point of
contention. A square root is essentially an exponentiation to 2^252-2.
Writing down an obvious addition chain I get 376 squares and 14
multiplications. It's very possible, indeed likely, that this can be
tightened. If not, then the total work for this approach with
Montgomery x only is 889*M+1637*S, which with M=0.8S is 4% less.

Of course, the implementation described above is considerably more
complex than the montgomery ladder, and has to validate its inputs in
ways that are frequently forgotten. The addition formulas used aren't
marked as complete, only strongly unified, which introduces occasional
corner cases. If you actually care about the top performance possible
for variable-base multiplication without sacrificing security, then
try Kummer surfaces.

Sincerely,
Watson Ladd

>
>> On Jan 11, 2015, at 4:31 PM, Andrey Jivsov <crypto@brainhub.org> wrote:
>>
>> Counting multiplications (M) and squares (S) in F(p), ECC formulas use at least 5% more operations v.s. Montgomery.
>>
>> The lowest non-Montgomery count is reached with twisted Edwards, signed window, w=5, which yields 6%/5.5% more M/S.
>>
>> I pasted my notes below.
>>
>> It's possible that my initial results in this thread are due to the difference in F(p) operations between the two implementations.
>>
>> See also http://www.ietf.org/mail-archive/web/cfrg/current/msg05857.html
>>
>> Thanks to Mike and Watson for pointers and comments.
>>
>>
>> -----------------------------------------------------------------
>> Montgomery method:
>> -----------------------------------------------------------------
>>
>> From http://hyperelliptic.org/EFD/g1p/auto-montgom-xz.html (8.2M for differential addition and doubling) with Z1=1: 5M+4S.
>>
>> 256*(5*M+4*S)=1275*M + 1020*S
>> Inverse: 11*M + 245*S
>>
>> (also checked curve25519-donna-c64.c)
>>
>> Total: 1285*M + 1265*S
>>
>> -----------------------------------------------------------------
>> t.Edwards window method:
>> -----------------------------------------------------------------
>>
>> Write the windows size as w and n-bit E(F(p)) curve (e.g. n=255). 2^(w-1) signed elements need to be calculated.
>>
>> The half of 2^(w-1) is filled with doubles, half with additions
>>
>> 2^(w-2) additions, 2^(w-2) doubles, z!=1
>>
>> addition: 10M+1S
>> double: 3M+4S
>> from http://hyperelliptic.org/EFD/g1p/auto-twisted-projective.html
>>
>> Window filling:
>> 2^(w-2)*(10*M+S) + 2^(w-2)*(3*M+4*S)
>>
>> For windowed scalar multiplication there are w(ceil(n/w)-1) doubles and ceil(255/w)-1 additions
>>
>> w*(ceil(n/w)-1)*(3*M+4*S)+(ceil(n/w)-1)*(10*M+S)
>>
>> The formula is 2^(w-2)*(10*M+S) + 2^(w-2)*(3*M+4*S) + w*(ceil(n/w)-1)*(3*M+4*S)+(ceil(n/w)-1)*(10*M+S)
>>
>> w=3: 1622*M + 1102*S
>> w=4: 1438*M + 1091*S
>> w=5: 1354*M + 1090*S
>> w=6: 1384*M + 1130*S
>>
>> Inverse: 11*M + 245*S
>>
>> Total (w=5) with inverse:
>> 1354*M + 1090*S +11*M + 245*S = 1365*M + 1335*S
>>
>> So we have
>>
>> Montgomery (M):     1285*M + 1265*S
>> t.Edwards w=5 (tE): 1365*M + 1335*S
>> ( tE needs 6% more M, 5.5% more S )
>>
>> Unifying S+M, this is:
>>
>> for S=M cost:
>> M: 2550*M
>> tE: 2700*M
>>
>> for S=0.625M cost:
>> M: 2075M
>> E: 2200M
>>
>> Watson's estimate is:
>> Montgomery: 2491M
>> t.Edwards w=5: 2714M
>>
>> -----------------------------------------------------------------
>> "Jacobian coordinates with a4=0 for short Weierstrass curves", window method
>> -----------------------------------------------------------------
>> http://hyperelliptic.org/EFD/g1p/auto-shortw-jacobian-0.html
>>
>> doubling: 2M+5S
>> addition:  11M+5S
>>
>> 2^(w-2)*(11*M+5*S) + 2^(w-2)*(2*M+5*S) + w*(ceil(n/w)-1)*(2*M+5*S)+(ceil(n/w)-1)*(11*M+5*S) + 11*M + 245*S
>>
>> w=5: 1165*M + 1825*S
>>
>> for S=M cost
>> 2990*M
>>
>> for S=0.625M cost
>> 2305*M
>>
>> -----------------------------------------------------------------
>> variable curve size n, window method
>> -----------------------------------------------------------------
>>
>> Hand-waive (extrapolate from Curve25519) the inverse as
>> 0.95*n*S+0.05*n*M
>>
>> Montgomery:
>> (n+1)*(5*M+4*S)+0.95*n*S+0.05*n*M
>>
>> t.Edwards:
>> F=2^(w-2)*(10*M+S) + 2^(w-2)*(3*M+4*S) + w*(ceil(n/w)-1)*(3*M+4*S)+(ceil(n/w)-1)*(10*M+S)+ 0.95*n*S+0.05*n*M
>>
>> n=384:
>> M: 1944*M + 1904*S
>> w=5:
>> tE: 2023*M + 2000*S
>> w=6:
>> tE: 1991*M + 2019*S
>>
>> n=255, w=5  (compare to the actual, above):
>> M: 1292*M + 1266*S
>> tE: 1366*M + 1332*S
>>
>> n=400:
>> M: 2025*M + 1984*S
>> w=5:
>> tE: 2099*M + 2079*S
>> w=6:
>> tE: 2076*M + 2110.*S
>> w=7:
>> tE: 2203*M + 2193*S
>>
>> n=512:
>> M: 2590*M + 2538*S
>> w=5:
>> tE: 2679*M + 2668*S
>> w=6:
>> tE: 2613*M + 2691*S
>>
>> -----------------------------------------------------------------
>> Comb method
>> -----------------------------------------------------------------
>> The method is described here: http://eprint.iacr.org/2005/222.pdf, but always add O:
>>
>> precomp: (w-1)*ceil(n/w)(3M+4S) + ((2^w)-w-1)*(10M+1S)
>> mult: (ceil(n/w)-1)*(3M+4S+10M+1S)
>> inverse (n=255): 11*M + 245*S
>>
>> total: (w-1)*ceil(n/w)*(3*M+4*S) + ((2^(w-1))-w-1)*(10*M+S) + (ceil(n/w)-1)*(3*M+4*S+10*M+S)+11*M + 245*S
>>
>> n=255:
>> w=3:
>> 1653*M + 1349*S
>> w=4:
>> 1516*M + 1339*S
>> w=5:
>> 1533*M + 1337*S
>> w=6:
>> 1772*M + 1372*S
>>
>> _______________________________________________
>> Cfrg mailing list
>> Cfrg@irtf.org
>> http://www.irtf.org/mailman/listinfo/cfrg
>
> _______________________________________________
> Cfrg mailing list
> Cfrg@irtf.org
> http://www.irtf.org/mailman/listinfo/cfrg



-- 
"Those who would give up Essential Liberty to purchase a little
Temporary Safety deserve neither  Liberty nor Safety."
-- Benjamin Franklin