[Cfrg] Timing of libsodium, curve25519-donna, MSR ECCLib, and openssl-master

Andrey Jivsov <crypto@brainhub.org> Sun, 17 August 2014 01:10 UTC

I timed libsodium, curve25519-donna, MSR ECCLib, and openssl-master.

In all cases I made minor changes to the source code to measure and 
report the timing. I made sure to time the variable base scalar 
multiplication. I also timed the fixed base multiplication and the 
table precomputation (the latter is only needed for MSR ECCLib).

Results are reported in operations per second. I used default compile 
options.
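
As an illustration of how an operations-per-second figure of this kind can 
be obtained, here is a minimal C sketch. It is not the code used for the 
numbers in this message. The names op_fn, ops_per_second, copy_ctx and 
copy32 are made up for the example, and the 32-byte memcpy only stands in 
for the operation under test. It assumes a POSIX system (clock_gettime) 
and GCC or Clang (for the inline asm barrier).

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical callback type: one call performs one operation. */
typedef void (*op_fn)(void *ctx);

/* Monotonic wall-clock time in seconds. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Run op(ctx) in batches of 1000 until at least min_seconds have
 * elapsed, then return the observed rate in operations per second. */
static double ops_per_second(op_fn op, void *ctx, double min_seconds)
{
    uint64_t count = 0;
    double start = now_seconds();
    double elapsed;

    do {
        for (int i = 0; i < 1000; i++)
            op(ctx);
        count += 1000;
        elapsed = now_seconds() - start;
    } while (elapsed < min_seconds);

    return (double)count / elapsed;
}

/* Example operation: copy 32 bytes (a stand-in for a scalar mult). */
struct copy_ctx {
    unsigned char src[32];
    unsigned char dst[32];
};

static void copy32(void *ctx)
{
    struct copy_ctx *c = (struct copy_ctx *)ctx;
    memcpy(c->dst, c->src, sizeof c->dst);
    /* Compiler barrier (GCC/Clang syntax) so the copy is not elided. */
    __asm__ __volatile__("" : : "r"(c->dst) : "memory");
}
```

Calling ops_per_second(copy32, &ctx, 1.0) measures over roughly one second 
of wall-clock time. Replacing copy32 with a wrapper around a scalar 
multiplication gives the corresponding rate for that operation.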

MSR ECCLib was slightly faster in variable base operations. It uses 
assembler code.

Interestingly, the MSR ECCLib Weierstrass a=-3 curves are only about 10% 
slower than curve25519-donna. At the same time, all pseudo-Mersenne prime 
curves are ~5 times faster than NIST P-256 (this is better than the factor 
of 2 back-of-envelope difference in modp multiplication performance).

The factor of 2+ improvement for fixed base calculation in MSR ECCLib is 
impressive. Note, however, the significant penalty that the precomputation 
step adds. If the precomputation is included in the timing, we could do 
~50% more EDH agreements with NIST P-256.
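
The cost of the precomputation can be worked out from the individual rates 
reported further down. When steps run one after another, their 
per-operation times (1/rate) add up. The sketch below applies this to the 
Weierstrass figures (1284.03, 35370 and 14047.9 op/sec) and gives about 
1139 op/sec, in the same range as the measured 1056.86 op/sec. The second 
function estimates how many operations are needed before the table pays 
for itself. It assumes that a fixed base multiplication without the table 
would cost about the same as a variable base one, which was not measured 
here. Under that assumption the Weierstrass table breaks even after about 
19 operations. Function names are made up for the example.

```c
#include <stddef.h>

/* Rate of a sequence that runs every step once per iteration:
 * the per-operation times (1/rate) are summed. */
static double combined_rate(const double *rates, size_t n)
{
    double seconds = 0.0;
    for (size_t i = 0; i < n; i++)
        seconds += 1.0 / rates[i];
    return 1.0 / seconds;
}

/* Smallest number of fixed base operations for which
 *   precomp + n * fixed <= n * variable   (all in seconds).
 * ASSUMPTION: without the table, a fixed base operation costs the
 * same as a variable base one. Returns 0 if the table never pays off. */
static unsigned long break_even_ops(double precomp_rate,
                                    double fixed_rate,
                                    double variable_rate)
{
    double saved = 1.0 / variable_rate - 1.0 / fixed_rate;
    if (saved <= 0.0)
        return 0;

    double x = (1.0 / precomp_rate) / saved;
    unsigned long n = (unsigned long)x;
    if ((double)n < x)
        n++;            /* round up without needing libm */
    return n;
}
```

For example, combined_rate applied to {1284.03, 35370.0, 14047.9} 
reproduces the estimate above, and break_even_ops(1284.03, 35370.0, 
14047.9) returns 19.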

CPU: Intel(R) Core(TM) i5-3550 CPU @ 3.30GHz, no AVX2. Fedora Core 20 64 

modified tests in libsodium/test/default to take the timing:
crypto_scalarmult_curve25519_base: 15620.2 op/s
crypto_scalarmult_curve25519: 15602.8 op/s

make ./speed-curve25519-donna-c64 && ./speed-curve25519-donna-c64
63 us, 15722.1 op/s
(also modified to check variable base vs. generator 9 -- no difference)

OpenSSL 1.0.1e-fips 11 Feb 2013:
openssl speed ecdhp256 (ECDH_compute_key)
  256 bit ecdh (nistp256)   0.0003s/op   3245.4 op/s
and from git://git.openssl.org/openssl.git:
  256 bit ecdh (nistp256)   0.0003s/op   3406.7 op/s


In the function that prints "Crypto operations: Weierstrass a=-3 over 
with variable base (ecdh_secret_agreement_Jac256) 14047.9 op/sec
with fixed base (ecdh_keygen_Jac256) 35370 op/sec
table precomp (ecdh_generator_table_Jac256) 1284.03 op/sec
table precomp+keygen+variable base 1056.86 op/sec
"ECDH(E) runs in [...] 328926 cycles"

In the function that prints "Crypto operations: twisted Edwards a=-1 
over GF(2^256-189)"
with variable base (ecdh_secret_agreement_Ted256): 17482 op/sec
with fixed base (ecdh_keygen_Ted256): 45762.9 op/sec
table precomp (ecdh_generator_table_Ted256) 1346.98 op/sec
table precomp+keygen+variable base 1195.89 op/sec
"ECDH(E) runs in [...] 261385 cycles"

memcpy of 32 bytes: 595968511 op/sec, see attached code
(i.e. memcpy count / crypto_scalarmult_curve25519 count = 38042)
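
The "ECDH(E) runs in [...] cycles" lines printed by MSR ECCLib can be 
cross-checked against the op/sec figures. The sketch below makes two 
assumptions that are not confirmed by the library output: that the CPU 
runs at its nominal 3.30 GHz, and that one ECDH(E) run is one fixed base 
key generation followed by one variable base agreement. It takes 35370 
and 45762.9 op/sec as the fixed base rates for the two curves. Under 
those assumptions, 328926 and 261385 cycles correspond to about 10033 and 
12625 op/sec, and combining the keygen and agreement rates gives about 
10055 and 12650 op/sec, so the two sets of numbers agree to well under 
1%. Function names are made up for the example.

```c
/* Operations per second implied by a cycle count at a given clock. */
static double ops_from_cycles(double cycles_per_op, double clock_hz)
{
    return clock_hz / cycles_per_op;
}

/* Rate of two steps run back to back: per-operation times add. */
static double two_step_rate(double rate_a, double rate_b)
{
    return 1.0 / (1.0 / rate_a + 1.0 / rate_b);
}

/* Relative difference between two rates, as a fraction of the first. */
static double rel_diff(double a, double b)
{
    double d = a - b;
    if (d < 0.0)
        d = -d;
    return d / a;
}
```

For the Weierstrass curve, rel_diff(ops_from_cycles(328926.0, 3.30e9), 
two_step_rate(35370.0, 14047.9)) comes out below 0.01.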