[Cfrg] On relative performance of Edwards v.s. Montgomery Curve25519, variable base

Andrey Jivsov <crypto@brainhub.org> Mon, 05 January 2015 08:26 UTC

Archived-At: http://mailarchive.ietf.org/arch/msg/cfrg/9KoGU8f98JLxLfVVprG-2itbp_o

I timed the EdDSA using the code recommended in 

To my surprise, signature verification and variable-base ECDH perform
about the same on a (non-Haswell) i5-3550 CPU @ 3.30GHz (the same CPU
I used in prior reports).

git clone https://github.com/brainhub/curve25519-donna.git
$ make speed-curve25519-donna-c64 && ./speed-curve25519-donna-c64
71 us, 14063.8 op/s, 234115 cycles/op

git clone https://github.com/floodyberry/ed25519-donna && cd ed25519-donna
$ gcc ed25519.c -m64 -O3 -c  && gcc test.c ed25519.o -l crypto -o _ && ./_
61965 ticks/public key generation
65268 ticks/signature
223425 ticks/signature verification
60219 ticks/curve25519 basepoint scalarmult
failed to generate expected result
got : 
diff: f8,85,c3,8a, 

110063 ticks/verification (batch)

(I commented out exit(1) in test.c due to tests failing and added 
"batch" above).

I then changed ge25519_double_scalarmult_vartime, which computes
"[s1]p1 + [s2]basepoint", to compute only [s1]p1, making sure that the
piece that takes window values from the precomputed static const table
in ge25519_double_scalarmult_vartime is disabled.
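
To illustrate why dropping the basepoint term removes only additions and
leaves the doublings in place, here is a minimal sketch of an interleaved
double scalar multiplication. It is not the ed25519-donna code: it uses a
plain bit-by-bit scan instead of sliding windows, and a toy group (integers
modulo N under addition) instead of the Edwards curve. The names N, add,
double and double_scalarmult are hypothetical.

```python
# Toy group: integers modulo N under addition, so [k]P is simply k*P mod N.
# This stands in for the Edwards group only to make operation counts visible.
N = 1009

def add(a, b):
    return (a + b) % N

def double(a):
    return (2 * a) % N

def double_scalarmult(s1, p1, s2, base):
    """Compute [s1]p1 + [s2]base with one shared chain of doublings."""
    acc = 0  # identity element of the toy group
    counts = {"double": 0, "add": 0}
    nbits = max(s1.bit_length(), s2.bit_length())
    for i in reversed(range(nbits)):
        acc = double(acc)
        counts["double"] += 1
        if (s1 >> i) & 1:
            acc = add(acc, p1)
            counts["add"] += 1
        if (s2 >> i) & 1:
            acc = add(acc, base)
            counts["add"] += 1
    return acc, counts

# Full "[s1]p1 + [s2]base" versus "[s1]p1" only (s2 set to zero).
both, both_counts = double_scalarmult(45, 5, 26, 7)
only_p1, only_p1_counts = double_scalarmult(45, 5, 0, 7)
print(both_counts, only_p1_counts)
```

In this toy run the doubling count is the same in both calls; only the
additions contributed by the second scalar disappear.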

This resulted in:
202080 ticks/signature verification

We have ECDH performance of 234115 cycles/op, and the relevant portion
of signature verification at 202080 ticks.

This shows that, for p ~= 2^256, Edwards variable base scalar
multiplication is about 15% faster than the Montgomery ladder.
The table for the window values is only 40*4*2^3=1280 bytes. This is
with 64-bit C code in both cases.
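
The arithmetic behind these figures can be checked with a short sketch.
It assumes that "ticks" and "cycles/op" are the same unit, since the two
numbers are compared directly above. The reading of 40*4*2^3 as 40 bytes
per field element, 4 field elements per table entry and 2^3 entries is
also an assumption. All variable names are hypothetical.

```python
ladder_cycles = 234115   # curve25519-donna-c64, cycles/op (from the run above)
edwards_ticks = 202080   # ed25519-donna verification with only [s1]p1

# How much more work per operation the ladder does, relative to Edwards.
ladder_overhead = ladder_cycles / edwards_ticks - 1.0

# Fraction of the ladder's time that the Edwards code saves.
edwards_saving = 1.0 - edwards_ticks / ladder_cycles

# Window table size in bytes, under the assumed reading of the factors.
bytes_per_field_element = 40
field_elements_per_entry = 4
entries = 2 ** 3
table_bytes = bytes_per_field_element * field_elements_per_entry * entries

print(f"ladder overhead: {ladder_overhead:.1%}")
print(f"edwards saving:  {edwards_saving:.1%}")
print(f"table size:      {table_bytes} bytes")
```

Depending on which number is taken as the baseline, the gap is a little
under 14% or a little under 16%, which is consistent with "about 15%".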

I was expecting the Montgomery ladder to be faster, which raises the
question: how much faster should we expect the Montgomery ladder to be
for p ~= 2^256 on x86 or other architectures? I don't see it on x86 in
my quick tests.
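
For comparison, the structure of a Montgomery ladder can be sketched in
a toy group (integers modulo N under addition). This is only an
illustration of the fixed "one addition plus one doubling per bit"
pattern. A real Curve25519 ladder works on x-coordinates with
differential addition and a constant-time conditional swap, none of
which is modeled here. The names N and ladder are hypothetical.

```python
N = 1009  # toy modulus; [k]P is k*P mod N in this stand-in group

def ladder(k, p, nbits):
    """Montgomery ladder over the toy group; requires k < 2**nbits."""
    r0, r1 = 0, p % N          # invariant: r1 - r0 == p (mod N)
    counts = {"double": 0, "add": 0}
    for i in reversed(range(nbits)):
        if (k >> i) & 1:
            r0 = (r0 + r1) % N
            r1 = (2 * r1) % N
        else:
            r1 = (r0 + r1) % N
            r0 = (2 * r0) % N
        # Every bit costs exactly one addition and one doubling,
        # regardless of the bit's value.
        counts["add"] += 1
        counts["double"] += 1
    return r0, counts

result_a, counts_a = ladder(45, 5, 8)
result_b, counts_b = ladder(3, 5, 8)
print(result_a, counts_a, result_b, counts_b)
```

The operation count depends only on the bit length, not on the scalar,
so any advantage of the ladder has to come from cheaper per-step
formulas rather than from doing fewer additions.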

Also, I saw a remark in 
https://www.ietf.org/mail-archive/web/cfrg/current/msg05712.html "At the 
128-bit security level the ladder can be faster. At the > 200-bit+ 
security levels the ladder is slower."