Re: [Cfrg] On relative performance of Edwards v.s. Montgomery Curve25519, variable base

Andrey Jivsov <crypto@brainhub.org> Mon, 12 January 2015 00:31 UTC

Return-Path: <crypto@brainhub.org>
X-Original-To: cfrg@ietfa.amsl.com
Delivered-To: cfrg@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id BD6871A8866 for <cfrg@ietfa.amsl.com>; Sun, 11 Jan 2015 16:31:11 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -0.002
X-Spam-Level:
X-Spam-Status: No, score=-0.002 tagged_above=-999 required=5 tests=[BAYES_20=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, SPF_PASS=-0.001] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id xua-h2TVRx1G for <cfrg@ietfa.amsl.com>; Sun, 11 Jan 2015 16:31:09 -0800 (PST)
Received: from resqmta-ch2-08v.sys.comcast.net (resqmta-ch2-08v.sys.comcast.net [IPv6:2001:558:fe21:29:69:252:207:40]) (using TLSv1.2 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 0BB981A884B for <cfrg@irtf.org>; Sun, 11 Jan 2015 16:31:08 -0800 (PST)
Received: from resomta-ch2-17v.sys.comcast.net ([69.252.207.113]) by resqmta-ch2-08v.sys.comcast.net with comcast id eoWC1p0032TL4Rh01oX79C; Mon, 12 Jan 2015 00:31:07 +0000
Received: from [192.168.1.2] ([71.202.164.227]) by resomta-ch2-17v.sys.comcast.net with comcast id eoX61p00Z4uhcbK01oX71n; Mon, 12 Jan 2015 00:31:07 +0000
Message-ID: <54B315CA.6040900@brainhub.org>
Date: Sun, 11 Jan 2015 16:31:06 -0800
From: Andrey Jivsov <crypto@brainhub.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0
MIME-Version: 1.0
To: cfrg@irtf.org
References: <54AA4AB9.70505@brainhub.org>
In-Reply-To: <54AA4AB9.70505@brainhub.org>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20140121; t=1421022667; bh=iiMemkafbx5LiK6yBLTss22g38WAEcdfP6kyhVNlOnQ=; h=Received:Received:Message-ID:Date:From:MIME-Version:To:Subject: Content-Type; b=UX1X3f+ZaXf8ayiwcyXb4wCNJ0kP+jINlfq/h0cOurZYo2q7uZLafdGsFt8APEZly UDbyKlXrLHeRroJnztFPjCe6XpAVDrXeGDe/U4Ot1qhWu3AN5QUCnkKo07YBJGooxC +mCiEnWq7d8fB2Gdr87ggAiSSYTK+/hT2OUoEl4bxtHagLaaKcMDr5Itp6azuFpevl fC7HGCmVBS3RVq+fwXIJuyHaxr2qpouOsm1DkDxdJnke+11HCAubyh0MaNKSTFx5vz owGvmMCRv/yl332It+ZwH8Y7/2SLVfLKuOiMQ39qkUDeG0VmfVTIOORnLojUyDkNLu M/BzOxvn12hdw==
Archived-At: <http://mailarchive.ietf.org/arch/msg/cfrg/Z2eoBjLUQNXhYIz-CfQTvwHbtYY>
Subject: Re: [Cfrg] On relative performance of Edwards v.s. Montgomery Curve25519, variable base
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <http://www.irtf.org/mailman/options/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <http://www.irtf.org/mail-archive/web/cfrg/>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Mon, 12 Jan 2015 00:31:12 -0000

Counting multiplications (M) and squares (S) in F(p), ECC formulas use 
at least 5% more operations v.s. Montgomery.

The lowest non-Montgomery count is reached with twisted Edwards, signed 
window, w=5, which yields 6%/5.5% more M/S.

I pasted my notes below.

It's possible that my initial results in this thread are due to the 
difference in F(p) operations between the two implementations.

See also http://www.ietf.org/mail-archive/web/cfrg/current/msg05857.html

Thanks to Mike and Watson for pointers and comments.


-----------------------------------------------------------------
Montgomery method:
-----------------------------------------------------------------

 From http://hyperelliptic.org/EFD/g1p/auto-montgom-xz.html (8.2M for 
differential addition and doubling) with Z1=1: 5M+4S.

256*(5*M+4*S)=1275*M + 1020*S
Inverse: 11*M + 245*S

(also checked curve25519-donna-c64.c)

Total: 1285*M + 1265*S

-----------------------------------------------------------------
t.Edwards window method:
-----------------------------------------------------------------

Write the windows size as w and n-bit E(F(p)) curve (e.g. n=255). 
2^(w-1) signed elements need to be calculated.

The half of 2^(w-1) is filled with doubles, half with additions

2^(w-2) additions, 2^(w-2) doubles, z!=1

addition: 10M+1S
double: 3M+4S
from http://hyperelliptic.org/EFD/g1p/auto-twisted-projective.html

Window filling:
2^(w-2)*(10*M+S) + 2^(w-2)*(3*M+4*S)

For windowed scalar multiplication there are w(ceil(n/w)-1) doubles and 
ceil(255/w)-1 additions

w*(ceil(n/w)-1)*(3*M+4*S)+(ceil(n/w)-1)*(10*M+S)

The formula is 2^(w-2)*(10*M+S) + 2^(w-2)*(3*M+4*S) + 
w*(ceil(n/w)-1)*(3*M+4*S)+(ceil(n/w)-1)*(10*M+S)

w=3: 1622*M + 1102*S
w=4: 1438*M + 1091*S
w=5: 1354*M + 1090*S
w=6: 1384*M + 1130*S

Inverse: 11*M + 245*S

Total (w=5) with inverse:
1354*M + 1090*S +11*M + 245*S = 1365*M + 1335*S

So we have

Montgomery (M):     1285*M + 1265*S
t.Edwards w=5 (tE): 1365*M + 1335*S
( tE needs 6% more M, 5.5% more S )

Unifying S+M, this is:

for S=M cost:
M: 2550*M
tE: 2700*M

for S=0.625M cost:
M: 2075M
E: 2200M

Watson's estimate is:
Montgomery: 2491M
t.Edwards w=5: 2714M

-----------------------------------------------------------------
"Jacobian coordinates with a4=0 for short Weierstrass curves", window method
-----------------------------------------------------------------
http://hyperelliptic.org/EFD/g1p/auto-shortw-jacobian-0.html

doubling: 2M+5S
addition:  11M+5S

2^(w-2)*(11*M+5*S) + 2^(w-2)*(2*M+5*S) + 
w*(ceil(n/w)-1)*(2*M+5*S)+(ceil(n/w)-1)*(11*M+5*S) + 11*M + 245*S

w=5: 1165*M + 1825*S

for S=M cost
2990*M

for S=0.625M cost
2305*M

-----------------------------------------------------------------
variable curve size n, window method
-----------------------------------------------------------------

Hand-waive (extrapolate from Curve25519) the inverse as
0.95*n*S+0.05*n*M

Montgomery:
(n+1)*(5*M+4*S)+0.95*n*S+0.05*n*M

t.Edwards:
F=2^(w-2)*(10*M+S) + 2^(w-2)*(3*M+4*S) + 
w*(ceil(n/w)-1)*(3*M+4*S)+(ceil(n/w)-1)*(10*M+S)+ 0.95*n*S+0.05*n*M

n=384:
M: 1944*M + 1904*S
w=5:
tE: 2023*M + 2000*S
w=6:
tE: 1991*M + 2019*S

n=255, w=5  (compare to the actual, above):
M: 1292*M + 1266*S
tE: 1366*M + 1332*S

n=400:
M: 2025*M + 1984*S
w=5:
tE: 2099*M + 2079*S
w=6:
tE: 2076*M + 2110.*S
w=7:
tE: 2203*M + 2193*S

n=512:
M: 2590*M + 2538*S
w=5:
tE: 2679*M + 2668*S
w=6:
tE: 2613*M + 2691*S

-----------------------------------------------------------------
Comb method
-----------------------------------------------------------------
The method is described here: http://eprint.iacr.org/2005/222.pdf, but 
always add O:

precomp: (w-1)*ceil(n/w)(3M+4S) + ((2^w)-w-1)*(10M+1S)
mult: (ceil(n/w)-1)*(3M+4S+10M+1S)
inverse (n=255): 11*M + 245*S

total: (w-1)*ceil(n/w)*(3*M+4*S) + ((2^(w-1))-w-1)*(10*M+S) + 
(ceil(n/w)-1)*(3*M+4*S+10*M+S)+11*M + 245*S

n=255:
w=3:
1653*M + 1349*S
w=4:
1516*M + 1339*S
w=5:
1533*M + 1337*S
w=6:
1772*M + 1372*S