Re: [Cfrg] Elliptic Curves

Rene,

Dan’s technical point is that except in the most network-glutted environments (maybe a datacenter LAN), it’s essentially always cheaper in both time and energy and money to compute v on the recipient’s side instead of sending it, if you aren’t using the Montgomery ladder.  Furthermore it doesn’t add much complexity to compute on the recipient’s side.  And if you are using the Montgomery ladder — which saves complexity instead of adding it — then sending the extra coordinate is an utter waste.

Dan,

I’m not convinced that the Montgomery ladder is automatically safe from fault attacks on a twist-secure curve.  The ladder’s internal state has an invariant, that P+Q=R.  If that condition is violated, you could end up somewhere different.  The ladder does use the curve constant at every step, but I’m not sure that automatically gives immunity.  But it does make the attacker’s job harder.

— Mike

> On Mar 15, 2015, at 9:09 PM, Rene Struik <rstruik.ext@gmail.com> wrote:
> 
> Hi Dan:
> 
> It seems you confuse the sender and recipient side here: on the sender's side, the incremental cost of computing an affine point representation from an x-coordinate-only one is quite insignificant, whereas, indeed, on the recipient's side, this incremental cost is quite high. Hence, the entire discussion.
> 
> Rene
> 
> 
> [excerpt of email Dan Bernstein as of Sun March 15, 2015, 8.22pm EDT]
> 
> Meanwhile you're forcing the ECDH key generator to
> compute v in the first place---a significant increase in the complexity
> of the simplest (Montgomery ladder) keygen software.
> 
> 
> On 3/15/2015 8:22 PM, D. J. Bernstein wrote:
>> One of the OpenSSL security fixes this year was correcting a bug in
>> point validation. The bug is estimated to allow more than 2^192 bogus
>> (x,y) pairs for NIST P-256, and presumably a few of those have order
>> below 2^66. Finding these medium-order points is an open problem for the
>> moment, but anyone who _can_ find them can use an active invalid-point
>> attack to break Microsoft's two-hour ECDH keys with <2^70 computations,
>> and thus to decrypt all other connections made during that two-hour
>> period. (The computations don't have to finish within the two hours.
>> Maybe this work can even be shared across multiple ECDH keys; I'd have
>> to look at other protocol details to figure this out.)
>> 
>> How many years did it take for this bug to be found in a very widely
>> used cryptographic library? Who's reviewing all the other NIST P-256
>> implementations to see whether they even _try_ to validate points, never
>> mind whether they get it right?
>> 
>> With the Montgomery ladder, the curve is reused in every step, and a bug
>> that corrupts computations will simply move to different points on the
>> same curve or its twist. Next-generation curves such as Curve25519
>> guarantee that _none_ of those points have dangerous orders. The fault
>> attack in
>> 
>>    https://www.di.ens.fr/~fouque/pub/fdtc08.pdf
>> 
>> worked only because it targeted old-fashioned curves that weren't
>> twist-secure, such as NIST P-224. With proper choices (twist-secure
>> curve, and scalar bits set accordingly) Montgomery-ladder ECDH doesn't
>> need _any_ input validation.
>> 
>> In particular, input 0 is handled correctly without any special tests,
>> contrary to Rene's comments and earlier comments from Microsoft. See
>> https://www.ietf.org/mail-archive/web/cfrg/current/msg05004.html.
>> 
>> Here's another interesting data point regarding code simplicity. In
>> flipping through Microsoft's "constant-time" ECCLib_v1.1, I noticed
>> 
>>    temp = (scalar[0] & val1) - val2; // ki = (k mod 2^w)/2^(w-1)
>>    *digits = (int)temp;
>>    digits++;
>>    res = scalar[0] - temp;           // k = (k-ki)/2^(w-1)
>>    borrow = (temp > 0) & (scalar[0] < (unsigned int)temp);
>> 
>> inside the fixed_window_recode() function. The scalar is, apparently,
>> supplied by an attacker, and I would expect the compiler in most cases
>> to turn the comparisons into variable-time branches, leaking scalar bits
>> through timing---clearly violating the "constant-time" promise. With the
>> Montgomery ladder, the scalar bits are used in a much simpler way,
>> minimizing the number of opportunities to screw up the implementation.
>> 
>> Andrey Jivsov writes:
>>> A sample of the interest in unified format can be found here:
>> You misunderstand. There are two different types of ECDH keygen code
>> available for X25519:
>> 
>>    * _simple_ key generators that use the Montgomery ladder;
>>    * _fast_ key generators that use precomputed Edwards points and
>>      convert the output to Montgomery format.
>> 
>> Both of these approaches produce identical Montgomery x-coordinates on
>> the wire. The fact that the fast implementations use Edwards internally
>> doesn't mean that all implementations want to do this (for most people
>> simplicity is more important than this extra burst of keygen speed) and
>> certainly doesn't mean that it's a good idea to use anything other than
>> Montgomery x on the wire for ECDH.
>> 
>>   [ send an extra coordinate v along with the Montgomery x ]
>>> An implementation that uses a Montgomery ladder simply ignores v
>> Well, yes, this is what any sensible ECDH receiver will do, so why waste
>> the network traffic? Meanwhile you're forcing the ECDH key generator to
>> compute v in the first place---a significant increase in the complexity
>> of the simplest (Montgomery ladder) keygen software.
>> 
>>> implement both signatures and ECDH
>> Switching from a "unified" format to Montgomery x for ECDH usually makes
>> this type of software _simpler_. The underlying issue is that there are
>> actually three different scalar-multiplication operations of interest:
>> 
>>    * double-scalar multiplication (signature verification);
>>    * fixed-base-point single-scalar multiplication (signing/keygen); and
>>    * variable-base-point single-scalar multiplication (ECDH).
>> 
>> For example, in Microsoft's library these are ECC_DBLMUL_TE,
>> ECC_MUL_FIXED_TE, and ECC_MUL_TE. Ripping all the ECDH-specific
>> functions out of the library (ECC_MUL_TE, fixed_window_recode, etc.),
>> and replacing them with a Montgomery ladder, would _simplify_ the code.
>> 
>> Furthermore, even if you can find a signature+ECDH library where a
>> "unified" format _reduces_ complexity rather than _increasing_ it, the
>> percentage saved will obviously be quite small---whereas sticking to
>> Montgomery x for ECDH saves a _much_ larger percentage of a pure ECDH
>> implementation.
>> 
>>> a single set of test vectors, code reuse, and the benefits of sharing
>>> pre-computed values
>> Every function needs its own test vectors in any case. Pre-computed
>> values are shared in any case. As for code reuse, see above.
>> 
>>> no (measurable) performance penalty
>> The extra network traffic is a measurable performance penalty, and is
>> more important than the CPU time saved for decompression inside
>> signature verification. See below.
>> 
>> Michael Hamburg writes:
>>   [ compression ]
>>> saves time and energy on radio links
>> Yes, obviously.
>> 
>>> but costs time and energy on wired ones
>> No.
>> 
>> Time: Curve25519 decompression takes roughly 5 microseconds on a typical
>> server CPU core, but this is less than the time that the core needs to
>> receive 256 bits from the Internet---unless you've bought a single
>> quad-core server to handle more than 200Mbps of Internet traffic; this
>> doesn't make economic sense.
>> 
>> Energy: 10 USD of electricity will pay for about 10 watt-years of this
>> 125-watt quad-core CPU, i.e., a month of CPU time, i.e., 2000000000000
>> decompressions, saving 64 terabytes of Internet traffic---a month of
>> continuous traffic at 200Mbps, as above. Some ISPs do sell noticeable
>> fractions of 200Mbps for 10 USD/month, but that's because they know that
>> most users won't use the network anywhere near continuously.
>> 
>>> or when the point is cached
>> No. You pay _once_ to decompress a point the first time you use it, and
>> after that you cache the decompressed version. (I'm assuming that it's
>> cheaper for you to store the extra bytes than to recompute them; if not,
>> you should cache the compressed version.) Local caching doesn't affect
>> the performance impact of compression on the wire.
>> 
>> ---Dan
>> 
>> _______________________________________________
>> Cfrg mailing list
>> Cfrg@irtf.org
>> http://www.irtf.org/mailman/listinfo/cfrg
> 
> 
> -- 
> email: rstruik.ext@gmail.com | Skype: rstruik
> cell: +1 (647) 867-5658 | US: +1 (415) 690-7363
> 
> _______________________________________________
> Cfrg mailing list
> Cfrg@irtf.org
> http://www.irtf.org/mailman/listinfo/cfrg

Re: [Cfrg] Elliptic Curves - curve form and coordinate systems