Re: [Cfrg] normative references

On Wed, Jan 15, 2014 at 12:50 PM, Paul Lambert <paul@marvell.com> wrote:
>
>
>  ⨳|-----Original Message-----
>  ⨳|From: Watson Ladd [mailto:watsonbladd@gmail.com]
>  ⨳|Sent: Wednesday, January 15, 2014 12:05 PM
>  ⨳|To: Paul Lambert
>  ⨳|Cc: David McGrew; Yaron Sheffer; cfrg@irtf.org
>  ⨳|Subject: Re: [Cfrg] normative references
>  ⨳|
>  ⨳|On Jan 15, 2014 10:02 AM, "Paul Lambert" <paul@marvell.com> wrote:
>  ⨳|>
>  ⨳|>
>  ⨳|>
>  ⨳|> On 1/15/14, 4:54 AM, "David McGrew" <mcgrew@cisco.com> wrote:
>  ⨳|>
>  ⨳|> >On 01/15/2014 03:25 AM, Yaron Sheffer wrote:
>  ⨳|> >> Hi David,
>  ⨳|> >>
>  ⨳|> >> Somewhere down the Safe Curves thread you mention that you expect
>  ⨳|a
>  ⨳|> >> CFRG draft to contain "stable normative references" to
>  ⨳|definitions
>  ⨳|> >> of crypto mechanisms. I share this wish in part, in particular I
>  ⨳|> >> would appreciate "stable" references (e.g., to academic papers).
>  ⨳|> >> However many/most crypto publications are not aimed at
>  ⨳|> >> implementors. Often they are not well-specified enough to
>  ⨳|> >> implement. And we don't want to wait a few years for NIST to
>  ⨳|create
>  ⨳|> >> normative versions. So I would actually expect the new CFRG
>  ⨳|document to become the "normative"
>  ⨳|> >> reference.
>  ⨳|> >
>  ⨳|> >This is a good point, Yaron.
>  ⨳|> Yes!!!
>  ⨳|>
>  ⨳|> I¹ve just been through the web sites and many/most references and
>  ⨳|find
>  ⨳|> it extremely frustrating to find any clear and consistent
>  ⨳|> implementation details.
>  ⨳|
>  ⨳|(sidenote: I got mojobake in this email. Check your encodings are set
>  ⨳|correctly. And please, can we ban rich text from this list?)
>  ⨳|
>  ⨳|I think it is important to remember that normative references specify
>  ⨳|what, not how. How would be informative, as it doesn't affect
>  ⨳|compliance.
>  ⨳|
>  ⨳|>
>  ⨳|> Example issues include:
>  ⨳|>  - changes in notation and terminology
>  ⨳|>    E.g. For Edwards point addition , is it:
>  ⨳|>         y3 = (y1*y2   +    x1*x2) * inv(1-d*x1*x2*y1*y2)
>  ⨳|>       or is it:
>  ⨳|>         y3 = (y1*y2   -    x1*x2) * inv(1-d*x1*x2*y1*y2)
>  ⨳|>    Š It depends on the way the curve class is named, it¹s actually
>  ⨳|>         y3 = (y1*y2  +  a*x1*x2) * inv(1-d*x1*x2*y1*y2)
>  ⨳|>         and usually a=1
>  ⨳|
>  ⨳|>
>  ⨳|>  - unclear guidelines on important security topics
>  ⨳|>     E.g.
>  ⨳|>       ed25519 has restrictions on the secret key structure for ECDH
>  ⨳|>       Does this extend to other curves?
>  ⨳|>
>  ⨳|>  - references describe the poofs, but hand wave the actual way
>  ⨳|>    you would send and receive and process as data
>  ⨳|>    E.g.  Hand waving of when and if to use projective coordinates
>  ⨳|
>  ⨳|Affine of course: the same is true for Weierstrass form. One uses
>  ⨳|projective form for intermediate calculations.
>  ⨳|I'll make this clear in the draft if it isn't already. The security
>  ⨳|issues cofactors present I will be sure to discuss: I thought they
>  ⨳|were well known because over F_p^{\times} you always have a cofactor.
>  ⨳|
>  ⨳|>
>  ⨳|>  - Lack of clarity on edge conditions (small subgroups, special
>  ⨳|> points)
>  ⨳|
>  ⨳|This is important and I will definitely try to address it.
>  ⨳|
>  ⨳|>
>  ⨳|>  - Unclear algorithm selection guidelines
>  ⨳|>    E.g. Scalar multiplication and point addition
>  ⨳|>      - fastest algorithms no longer have all the Œsafe¹ properties
>  ⨳|>
>  ⨳|>  - unclear implications of use of the cofactor (since it is not =1)
>  ⨳|
>  ⨳|I'll explain the group structure and the necessary changes to DH.
>  ⨳|For algorithms, you can get away with the fast ones if you are careful
>  ⨳|to show that the problems won't matter. For instance if I have a point
>  ⨳|that isn't of small order, I can multiply using the 2008 algorithms if
>  ⨳|I take care never to add two points that might be the same in my
>  ⨳|multiplication algorithm.
>  ⨳|
>  ⨳|The safe method is still pretty fast. If you use an unsafe method,
>  ⨳|it's your responsibility to check that it will work, and not mine.
>  ⨳|
>  ⨳|>
>  ⨳|>  - no guidelines on use of different EC algorithms with the curve
>  ⨳|types
>  ⨳|>    E.g.  ECDSA with Montgomery curve using just x-coordinate math
>  ⨳|???
>  ⨳|>     Do we use cofactor DH since  cofactor >1?
>  ⨳|
>  ⨳|I'm not sure what you mean by cofactor DH, but yes, the implication of
>  ⨳|the cofactor is you need to represent your private keys as coercive,
>  ⨳|forcing points into the large subgroup, or you leak a few bits.
>  ⨳|(Basically the same from a security standpoint).
> NIST defined ...
>
> [ ⨳]
>  ⨳|
>  ⨳|>
>  ⨳|>  - Terms in the references are used in papers in an inconsistent
>  ⨳|manner
>  ⨳|>    With respect to an implementation
>  ⨳|>    E.g. ³Twist²
>  ⨳|
>  ⨳|I'm going to avoid discussing math on the quadratic or cubic twist of
>  ⨳|a curve, because it isn't necessary for implementions.
>  ⨳|Just plug in the formulas and the algorithms, and it will all work.
>  ⨳|
>  ⨳|>
>  ⨳|>  - math is represented inconsistently and imprecisely for an
>  ⨳|implementation
>  ⨳|>     E.g.  + versus * for   ECDH
>  ⨳|>           ^  versus **  on curve equationss
>  ⨳|>           Œmod¹  imply all math is in field versus as an operation
>  ⨳|at
>  ⨳|> a specific operation
>  ⨳|
>  ⨳|The notation issue won't get fixed anytime soon. I'll be consistent in
>  ⨳|the draft, but that means being inconsistent with half of everyone
>  ⨳|else. The operations issue is the sort of thing where a little bit of
>  ⨳|reading comprehension goes a long way: spelling out what exactly +
>  ⨳|means at any given time is annoying, especially when it is always
>  ⨳|obvious.
> On plus ... no need to define every operator, I was referring to
> to EC notation conventions of point addition versus multiplication
> e.g.  s*G   versus G**s
> I'm partial to the notation based on scalar multiplies of points for DH

Of course I'm going to use additive notation for an abelian group! The
multiplicative
notation is a legacy from F_{p}^{\times} crypto where it really was
multiplication.

>
>  ⨳|
>  ⨳|>
>  ⨳|>  - no test vectors or example calculations readily available
>  ⨳|
>  ⨳|This cannot be said about Curve25519: "Cryptography in NaCL" contains
>  ⨳|a Python test program together with an example DH exchange, but in
>  ⨳|little-endian.
> Badly represented. Vectors could be extracted from the C code examples.
>
>  ⨳|
>  ⨳|For the other curves I agree this is an issue. I'll do a black-box
>  ⨳|interop test with you and other interested people before we publish.
>  ⨳|
>  ⨳|>
>  ⨳|> It appears that much of the interest and ability to use some of
>  ⨳|these
>  ⨳|> curves comes from their availability in a few specific C libraries.
>  ⨳|> These might be used as a reference, but are curve specific and it¹s
>  ⨳|> unclear how to extract generic guidelines for more than the
>  ⨳|implemented curve.
>  ⨳|
>  ⨳|> There is a significant body of work on implementing Weierstrass
>  ⨳|curves
>  ⨳|> (ANSI, IEEE, NIST, NSA, etc.).  Edwards, Twisted Edwards and
>  ⨳|> Montgomery need the same level of documentation to progress.
>  ⨳|
>  ⨳|Or the same level of C library support. I'd much rather have everyone
>  ⨳|copy a well-made library then write their own bugs we have to deal
>  ⨳|with down the line/that the SIGINT people exploit. But yes, better
>  ⨳|documentation about how to implement is necessary.
> We're in the documentation business here - not code libraries. Plus we're
> supposed to have multiple interoperable different implementations.

Do you really think everyone implementing P256 does it themselves? Or do they
take example code from the Internet?

>
>  ⨳|
>  ⨳|>
>  ⨳|> So Š. lot¹s of issues, but they could be solved by a clear
>  ⨳|specification.
>  ⨳|>
>  ⨳|> Initial Recommendations:
>  ⨳|>  1) Always use the most general form of any of the representations
>  ⨳|>      Replace any algorithms for Edwards with Twisted Edwards (more
>  ⨳|general)
>  ⨳|>      E.g.  Curve as:
>  ⨳|>           (a*y**2+x**2) % p == (1 + d*x**2*y**2) % p
>  ⨳|>            and
>  ⨳|>             y3 = (y1**2  +  a*x1**2) * inv(1-d*x1*x2*y1*y2)
>  ⨳|>
>  ⨳|>      One question here Š I also see alternate form with Œc'
>  ⨳|>             (y**2+x**2) % p == c*(1 + d*x**2*y**2) % p
>  ⨳|>
>  ⨳|>                 or
>  ⨳|>             (y**2+x**2) % p == c**4*(1 + d*x**2*y**2) % p
>  ⨳|>
>  ⨳|>          Not sure if any advantages versus Œa¹ or both Œa¹ and Œc'
>  ⨳|>             (a*y**2+x**2) % p == c*(1 + d*x**2*y**2) % p
>  ⨳|
>  ⨳|If we aren't specifying any twisted Edwards curves, why would we
>  ⨳|define the formulas?
>  ⨳|I suppose for use with Montgomery form, ala Ransom's suggestion.
> [ ⨳]
> Yes ... and why did you leave out Ed25519?       It's being used and should at least be discussed/reviewed/

It's the same as Curve25519, and not in safecurves. Let's be real
clear: isogenies don't change anything substantial.
If you want to discuss it separately, go ahead: plenty of people have
been discussing alternatives on this list already.

I'm also going to explain the Montgomery ladder and double-and-add on
Edwards and that's it. RFC 6090 doesn't talk about
radix-k either, nor does it explain Shamir's trick, or fun with the
Brier-Joyce ladder, or w-NAF, etc.
>
>
>  ⨳|
>  ⨳|>
>  ⨳|>   2) reuse where possible terminology and variables from Weierstrass
>  ⨳|ECC
>  ⨳|>      E.g.  P for prime defining Fp,  n for order, h for cofactor
>  ⨳|>          This would allow all classes of curves to viewed equally in
>  ⨳|>          an implementation catalog
>  ⨳|>           E.g.
>  ⨳|>
>  ⨳|>         curveId = 'curve25519'
>  ⨳|>         strength = 127
>  ⨳|>         oid = None
>  ⨳|>         p  = 2**255-19
>  ⨳|>         a  = 486662
>  ⨳|>         xG = 9
>  ⨳|>         yG =
>  ⨳|>
>  ⨳|1478161944758954479102059356840998688726460613461647528896488183775558
>  ⨳|> 62374
>  ⨳|> 01
>  ⨳|>         n  = 2**252 + 27742317777372353535851937790883648493
>  ⨳|>         h  = 8
>  ⨳|
>  ⨳|Strength is the base 2 log of the operations to break, right?
> Yes

I still don't see why this form is much better than a paragraph
listing these things.

>  ⨳|Also, different things should be written differently.
> [ ⨳]
>
> On the Edwards curves with cofactors >1 I believe the strength estimate may be a little less
> than simply taking the bit length of the size / 2
>
> Advice on this small adjustment would be appreciated.

Size of the group generated by g, not the size of the curve.

>
>
>  ⨳|
>  ⨳|>     Also Š long numbers should be consistently represented hex
>  ⨳|versus
>  ⨳|> decimal
>  ⨳|
>  ⨳|Why hex? Most math papers use decimal, and more importantly GP/PARI
>  ⨳|doesn't accept hex.
>  ⨳|MAGMA, PARI and Sage are going to be used for test vector validation.
> [ ⨳]
> By you maybe ...
>
> Hex fits better in an RFC ...  decimal would be fine also.

Six of one, half-dozen of the other.

>
>  ⨳|
>  ⨳|>     Note also - Weierstrass Œa¹ is not the same as Edwards Œa¹, but
>  ⨳|>     can be written around Š and we should keep Edwards equations
>  ⨳|consistent
>  ⨳|>     where possible with literature.
>  ⨳|
>  ⨳|When we define DH exchange on a group yes. With setting the parameters
>  ⨳|like you did we're tempting fate.
> Not really ... just another area for clear specification, eg:
>
> Suggested RFC text:
>
>     The ECC definitions include three classes of curves defined over the field
>     of integers modulo a prime p (Fp).  Each class of curve has an equation and
>     assoicated parameters unique to the class type.  Three classes of curves
>     are described:
>
>       SmallWeierstrassCurveFp       y**2 = x**3 + a*x + b  mod p
>       EdwardsCurveFp         x**2 + y**2 = c*(1 + d*x**2 * y**2)  mod p
>       MontgomeryCurveFp             y**2 = x**3 + a*x**2 + x  mod p
>
>     The defintions include for each curve type:
>
>       curveId - An ASCII string used to identify the curve parameters
>       strength - an integer providing the estimated cryptographic strength
>                  of the curve as a power of 2
>       oid - a list of integers that correspond to the associated ASN.1 object
>             identifier. The oid is listed as 'None' if there is no assigned oid
>       p - The prime modulus of the group (Fp). Often p is a Mersenne prime
>           to facilitate more efficient modular inversion
>       xG - the x-coordinate of a generator point G for the curve
>       yG - the y-coordinate of a generator point G for the curve
>       n  - the order of the generator point G
>       seed - included for reference when available
>       h - the cofactor of the curve
>
>     Each curve type has the following unique parameters
>
>       Small Wierstrass      y**2 = x**3 + a*x + b  mod p
>         a - often set to p-3 for efficiency
>         b - elected for the security properties of the curve shape
>         z - used by 'twisted' Brainpool curves for isogenous transform
>             of untwisted curve to a curve with a = p-3
>
>       Montgomery            y**2 = x**3 + c*x**2 + x  mod p
>         a - selected for the security properties of the curve shape
>
>       Edwards               x**2 + y**2 = c*(1 + d*x**2 * y**2) mod p
>         c - typically set to 1 so the equation is reduced to:
>                             x**2 + y**2 = 1 + d*x**2 * y**2  mod p
>         d - selected for the security properties of the curve shape
>
> Hummm .. could add Twisted Edwards to the above

But none of the curves in safecurves are presented in twisted Edwards form,
I'm not going to rehash short Weierstrass here, (RFC 6090 already exists).

Anyway, I think this conversation would be far more productive when I
finish the revisions and put it before
the group again. You can follow on github in between.

>
>  ⨳|
>  ⨳|>
>  ⨳|>
>  ⨳|>   3) Use math representation in RFCs that maps directly to a
>  ⨳|computer
>  ⨳|>      language notation - Python would be a good choice for
>  ⨳|>      readability.
>  ⨳|>      E.g. Twisted Edwards point addition definition
>  ⨳|>
>  ⨳|>        x3 = ((x1*y2+x2*y1) * inv(1+d*x1*x2*y1*y2)) % p
>  ⨳|>        y3 = ((y1*y2-a*x1*x2) * inv(1-d*x1*x2*y1*y2)) % p
>  ⨳|
>  ⨳|The math notation I am using maps directly to that of most CAS
> [ ⨳]
> No - for example:
> " On Montgomery curves, curves of the form y^2=x^3+Ax^2+x, the typical"
> is Ax a variable 'Ax' or A*x ... this is sloppy non-parseable notation

>  ⨳|systems. Anyway, at some point I will be done with the major revision,
>  ⨳|and kick it around for comments.
> [ ⨳]
> CAS???
> I prefer running code useable for a protocol implementation.

So I can just copy the relevant parts of tweetnacl.c and be done with it?

>
> Paul
>
>  ⨳|
>  ⨳|>
>  ⨳|>    4) curve names need consolidation and should clearly indicate
>  ⨳|curve
>  ⨳|> type
>  ⨳|
>  ⨳|Blame Tanja. I really do not like the idea of changing existing names
>  ⨳|especially when referring people to papers with them. SEC/NIST has
>  ⨳|already caused enough confusing as it is.
> [ ⨳]
> No need to blame, just make the change but  include reference to original author and curve name.
> NIST did not add confusion ... just branded the results.
> CFRG branding would be valuable.  More curves will happen and  a consistent notation is a good thing.

What do you propose for the name scheme? E,M with prime in the expc
format is my best proposal, but this might not be unique all the time.
>
> Paul
>
>  ⨳|
>  ⨳|>
>  ⨳|>
>  ⨳|>
>  ⨳|>
>  ⨳|> Paul
>  ⨳|>
>  ⨳|>
>  ⨳|>
>  ⨳|> >
>  ⨳|> >David
>  ⨳|> >
>  ⨳|> >>
>  ⨳|> >> Thanks,
>  ⨳|> >>     Yaron
>  ⨳|> >>
>  ⨳|> >> _______________________________________________
>  ⨳|> >> Cfrg mailing list
>  ⨳|> >> Cfrg@irtf.org
>  ⨳|> >> http://www.irtf.org/mailman/listinfo/cfrg
>  ⨳|> >>
>  ⨳|> >
>  ⨳|> >_______________________________________________
>  ⨳|> >Cfrg mailing list
>  ⨳|> >Cfrg@irtf.org
>  ⨳|> >http://www.irtf.org/mailman/listinfo/cfrg
>  ⨳|>
>  ⨳|> _______________________________________________
>  ⨳|> Cfrg mailing list
>  ⨳|> Cfrg@irtf.org
>  ⨳|> http://www.irtf.org/mailman/listinfo/cfrg

-- 
"Those who would give up Essential Liberty to purchase a little
Temporary Safety deserve neither  Liberty nor Safety."
-- Benjamin Franklin