Re: [Cfrg] Adoption request: draft-hdevalence-cfrg-ristretto

"Riad S. Wahby" <rsw@jfet.org> Sat, 27 July 2019 02:16 UTC

Date: Fri, 26 Jul 2019 19:16:25 -0700
From: "Riad S. Wahby" <rsw@jfet.org>
To: Henry de Valence <ietf@hdevalence.ca>
Cc: Filippo Valsorda <filippo@ml.filippo.io>, draft-hdevalence-cfrg-ristretto@ietf.org, cfrg@irtf.org
Message-ID: <20190727021625.qam6cxl3t66yyipf@positron.jfet.org>
References: <20190724212030.ddcswlg5uxm3muzo@positron.jfet.org> <CAPC=aNVCV2cn62rhQsu+RsJsdjt2Dqqw_rqooLsuc8J5v9s3kQ@mail.gmail.com> <20190725004633.l5k7toldcgy7uthb@positron.jfet.org> <a391f8d5-c4c4-4650-9392-860864543198@www.fastmail.com> <20190725015259.betglszxmwpgg7q7@positron.jfet.org> <3dde344b-40d2-4ea8-84c9-7ddec039afe3@www.fastmail.com> <20190725034958.bp6nnhqedjwmyzng@positron.jfet.org> <20190725143626.aetja4eeyzyimcne@positron.jfet.org> <CA+jiKjNsH6ROj9j-R_+AuRDEMSDq=Ee7hg_z63P_8CTjyE0r2g@mail.gmail.com> <CA+jiKjOPT_JpWULgQ6kJ2q1yqP-iymAms80NCcDf_mHE7mi3qg@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
In-Reply-To: <CA+jiKjOPT_JpWULgQ6kJ2q1yqP-iymAms80NCcDf_mHE7mi3qg@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cfrg/w91AdGVdLWccafiPub862MObk5c>
Subject: Re: [Cfrg] Adoption request: draft-hdevalence-cfrg-ristretto
Precedence: list

Hi Henry,

I really appreciate that you took the time to read my emails and
give a point-by-point response.

From your email, it's obvious that the document is clear to you.
This is great! since that's obviously a prerequisite for writing a
good document. My purpose in reviewing your document was to try and
give you the perspective of a reader who is familiar with the area
but does not already have the meaning of your document committed
to memory. So while I appreciate the in-depth response and your
attempts to clarify, probably we can agree that explaining what you
meant to write on this mailing list doesn't scale to the (hopefully)
many readers who will come after me.

This is closely related to the "curse of knowledge": it's very hard
to know whether something you've written is clear or not, because of
course it's clear to *you*. That's why we have reviewers: because this
is (the?) one aspect of the document for which ignorance is a virtue.

...and I am very virtuous :)

As one high-level comment: many of your responses are dedicated to
carefully maintaining the fiction that there exist curves other than
those isomorphic to Curve25519 which could be useful for implementing
ristretto255. To me, this idea is plainly unhelpful. The reality is
that, to first order, *every* ristretto255 implementation will use
edwards25519 or curve25519 internally. The document should be written
to acknowledge that reality, and to assist those implementations.
Implementors who know enough to get fancy don't need your help.

To that end, there are several spots below where one might be tempted
to respond "ah, but that's only true if the internal representation
is isomorphic to Curve25519." Yes! and so it is (true).

Another high-level comment: many of the explanations you've given
about why things work or don't work are, frankly, quite hand-wavy.
This relates to but is not identical to the "fiction" issue. As
a specific example, (far) below I've quoted your line about how the
hash I suggested does not give valid internal representations. I think
everyone on this list would appreciate, when you make such a claim,
if you would back it up with a technical justification. I'm happy
to dig into the details on my own, but at some point it goes from
"reviewing your document" to "rederiving your document," and that
just wastes time and makes me less useful as a virtuous reviewer :)

Finally, lest it be lost in the back-and-forth soup below: the primary
reason I'd like to see Ristretto incorporate hash-to-curve by reference
is that it means that there's one source of truth for implementors
of conforming mappings to elliptic curves, which in turn means that
one Elligator2 implementation suffices for any application involving
Curve25519 (including, per the above, Ristretto). To me, this is far
more important than an extraneous HKDF invocation here or there. To
be clear: the hash-to-curve document can change to accommodate this.

Some further inline comments below.

Henry de Valence <ietf@hdevalence.ca> wrote:
> That email suggests using "Curve25519" to refer to the underlying
> elliptic curve, "X25519" and "Ed25519" to refer to particular
> DH and signature schemes using that curve, and names for various
> coordinate systems found in the academic literature.  Our draft
> follows this terminology.

Your draft is a proposal to create an IETF document; as such, it
certainly seems sensible to write it using the same terminology that
other IETF documents do. That means, in this case, citing RFC7748
and using its terminology.

At a guess, your target audience is not going to trawl the CFRG
mailing list for djb's posts explaining your document.

> > Is it edwards25519 point addition?
> 
> Yes, as described in Section 3.3, quoted above.

I'd go back and check that quote again. It certainly does not say
anything about edwards25519 point addition.

With apologies for repeating myself: you know what you mean; your
readers do not, because they do not assimilate every word of the
document in a single pass---especially words several pages away.

> Riad S. Wahby <rsw@jfet.org> wrote:
> > In particular, there is no group isomorphic to Curve25519 with order l,
> > because Curve25519 has order 8 * l.
> 
> The requirement is that the "isomorphic group" provided by an
> implementation is isomorphic to the ristretto255 group, not to
> Curve25519, so this is not an obstruction.

In that case, it's definitionally true. I'd strongly recommend
removing it, as it does not help the reader to understand what's
going on and---as we've now seen---can have the opposite effect.

> > Here is the problem: Sections 3.2 and 3.3 *do not* provide those
> > operations. For example, the document does not indicate how to perform
> > point addition, because nowhere in the document is it made clear that
> > the output of DECODE as defined in Section 3 is an edwards25519 point.
> 
> I think this is made clear throughout the draft for all the reasons
> mentioned above (modulo adding an explicit note (A) that Curve25519
> refers to the abstract elliptic curve).

See comments above re: curse of knowledge.

> Regarding the proposal, I have the following comments:
> 
> > 0. Define the term "internal representation" up front. Make clear
> >    that it's a restricted version of an elliptic curve point that
> >    is defined over some underlying curve, and that all operations on
> >    intenal representations not explicitly allowed by the text of the
> >    document are prohibited.
> 
> This is already defined up front in Section 2, "Notation and
> Conventions Used In This Document".

This certainly seems like the right place for it, but please compare
the content of my proposed definition to the content (read: the
words on the screen) of Section 2. To my eyes, anyway, the latter
is incomplete.

> > 1. Make the Section 3 implementations of ENCODE, DECODE, EQUALS, and
> >    FROM_UNIFORM_BYTES normative, and clarify that the inputs to ENCODE
> >    and EQUALS and the outputs of DECODE and FROM_UNIFORM_BYTES are
> >    edwards25519 points that are "internal representations."
> 
> I don't understand how these implementations can be normative (an
> implementation MUST implement them) without making a normative choice
> of curve to use internally

Apologies that I was not clear. My feedback is that the document
*should* make this choice, up to isomorphisms. Otherwise it is not
specific enough to help implementors do their job.

To say it another way: the document should enshrine Curve25519
(the abstract curve) as *the* correct internal representation.
To do that, the functions should be written in terms of some concrete
instantiation---Edwards seems reasonable---and should detail precisely
the ways in which implementors may inject isomorphisms in order to
arrive at the curve they want to operate on internally.

This was the content of my suggestions #1, #3, and #6. Again, sorry
that I was unclear the first time.

> > 1.1. Clarify in the above that inputs to ENCODE and EQUALS MUST be
> > internal representations; any other inputs are prohibited.  Also
> > clarify that the only way to get an internal representation is from
> > DECODE, FROM_UNIFORM_BYTES, or from a permitted operation applied to
> > an existing internal representation.
> 
> We can repeat the definitions of ENCODE and EQUALS from Section 1 in
> Section 3, and repeat the requirements from Section 4 and Section 6 in
> Section 1 to make this even more clear, but I think these requirements
> are already described fairly explicitly.

I'm not asking for repetition, I'm asking for clarification.
Specifically, my request is that "internal representation" should be
completely defined, in one place in the document, in terms of (and
with references to concrete definitions of) the operations that are
allowed and disallowed.

By the way, the clarity problem I'm alluding to is perfectly
illustrated by the mishmash of section numbers you had to list above.

> > 3. Explicitly allow mapping any internal representation to any curve
> >    isomorphic to edwards25519, and state that the output of any such
> >    mapping remains an internal representation and thus subject to
> >    all of these requirements.
> 
> The header of Section 3 already allows an implementor to use any curve
> model they want internally, as long as it provides an isomorphic group
> with the same byte encoding.

The point of this comment is that it does so only implicitly, and it
does not give sufficient explanation *how* this should be done.

> >    (List the edwards25519/curve25519 isomorphism as an example,
> >    and direct the reader to the birational map in RFC7748.)
> 
> I would rather direct the reader (and list it as an informative
> reference (D)) to the Ristretto notes linked in the document, which
> explain in detail which other curves can be used internally and the
> particular role that the birational map plays in the Ristretto
> construction (see comments on 5 & 6 below).

This suggestion was grounded in an (unstated) assumption, namely,
that many readers will be using this document to add ristretto255
support to an existing Curve25519 implementation. If this is true,
then readers of this document will likely be familiar with RFC7748
already. Those readers will appreciate if you anchor this document
to another one with which they're familiar.

I see no issue with *also* referring interested readers to the details
at https://ristretto.group, or to informative implementations, etc.
But those should not be required reading.

> >    I'd also get rid of the text in the Section 3 header that talks
> >    about groups of order l, since edwards25519 is not such a group
> >    and thus no group isomorphic to it is either.
> 
> The Section 3 header talks about groups of order l because
> ristretto255 is such a group, and a compatible implementation provides
> a group isomorphic to the one described in the document.

This is too high-level a statement, possibly for the whole document
but certainly for Section 3. And as I observed above, this is the
definition of a compatible implementation, not an explanation of how
to build one.

> > 4. Explicitly allow addition, subtraction / inversion, and
> >    multiplication on any point that is an internal representation.
> >    These operations MUST be isomorphic to addition, subtraction /
> >    inversion, and multiplication on edwards25519.
> 
> These operations are already explicitly allowed, and the group
> operations are already required to match the ones specified in the
> test vectors.

I understand that Section 3.3 is trying to say this. Again, this is
clear *to you*. I am suggesting a way of making it clear to everyone
else. This suggestion is related to what I said above about defining
"internal representations" up front in terms of what's allowed and
what's not.

> Since ristretto255 is not isomorphic to Curve25519 for the reason you
> mentioned above -- they're different groups, one of which MAY be
> implemented using the other -- I don't think it makes sense to define
> anything in terms of isomorphism to Curve25519.

Section 3 appears to disagree with you.

Again: the point is to explicitly define what *concrete* operations
are allowed on internal representations.

> > 5. Explicitly state that any internal representation that is the
> >    result of a sequence of mapping operations MUST be converted back
> >    to edwards25519 internal representation via the inverse sequence
> >    of mappings before applying the ENCODE or EQUALS function.
> 
> > 6. Explain that implementors MAY compose (e.g.)  DECODE/mapping,
> >    FROM_UNIFORM_BYTES/mapping, reverse-mapping/ENCODE,
> >    reverse-mapping/EQUALS, and similar into monolithic functions.
> >    In other words, the idea isn't to dictate the structure of the code,
> >    only the underlying algebraic properties.
> 
> I think these are already covered by the fact that their
> implementations are not normative (cf. your suggestion 1), and the
> Section 3 header.

Yes, of course they're covered, and yes, of course this is clear to
you. But no, unfortunately, the document as it stands does not make
this clear to the reader.

Put another way: your response makes clear that the reader has to
extract this information from a synthesis of several paragraphs of
text. In fact, it seems from your response ("I think...") that even
you---the author of the document!---are not certain whether these
are covered or not.

> As explained in the Decaf paper, as well as the
> https://ristretto.group website referenced in Section 1
[snip, with apologies]

The document needs to be written clearly enough to stand alone as a
reference for implementing Ristretto.

Pace Watson, I don't care if it justifies the algebra---citations for
that are better---but if a reader has to go to https://ristretto.group
to make sense of this document, that's a failure.

> Instead, the draft, as written, has two goals:
> 
> 1. To provide test vectors for the ristretto255 encoding, decoding,
> group-operations, and from-uniform-bytes functions (Appendix A), which
> provide sufficient rigidity that passing them enforces compatibility
> between implementations, regardless of which curve is used internally;
> 
> 2. To provide concrete formulas aimed at allowing implementors to add
> support for ristretto255 to an existing Ed25519 implementation.  I
> think that the speed with which a conformant ristretto255
> implementation was added to libsodium following publication of the
> draft is illustrative of its success with respect to this goal.

I think you're missing a goal, namely, to allow a competent implementor
generally familiar with the area to create a de novo implementation of
Ristretto using this document, any referenced RFCs, and a reasonable
set of standard reference books.

In particular, readers should not need to consult your website or
your code in order to get things running.

> 1. edwards25519-SHA256-EDELL2-RO is not compatible with ristretto255, as
> it does not produce valid internal representatives;

Whoops, this is true! Sorry :)

But as far as I can tell, only because of the cofactor clearing
step. Elligator to edwards25519 and a multiplication by 2 (rather than
8) gives [2]E, whereupon torque-and-quotient gives a valid encoding.
So it would appear that this is easily fixed.

If that's not right, could you please clarify precisely the issue?

> 2. even if it did, ristretto255 implementations are not required to
> use Curve25519 internally, so the hash-to-group method cannot be
> defined in terms of Curve25519;

Given the current definition of FROM_UNIFORM_BYTES, you have to admit
that this is pretty hilarious as written.

In any case, it's false. As an example, the Decaf map is to a Jacobi
quartic isogenous to Curve25519, but another completely valid way of
defining the map would be via a map to some concrete instantiation
of Curve25519 followed by a map to the quartic. Something slightly
more complicated but analogous works for Ristretto.

The point is, the above is not a real objection; it's another piece
of fiction-scaffolding supporting the claim that Curve25519 isn't
the internal representation.

> 3. ristretto255 already provides a hash-to-group method, which is defined
> so that implementations remain wire-compatible even when changing the
> curve used internally, as noted in the Ristretto website and the Decaf
> paper;

Taken together, my suggestions (modulo the cofactor clearing issue)
also give a map that is consistent independent of the internal
representation, so this is not an argument against what I've suggested.

> 4. requiring use of a specific hash function is a layering violation,
> as the definition of an elliptic curve should not depend on use of a
> particular symmetric primitive.

This is a fair criticism.

(But of course, specifying hash-to-curve alongside a curve is also
a layering violation, since there are plenty of mappings to choose
from, just like there are plenty of cryptographic hash functions to
choose from.)

The solution is not, I think, to redefine Elligator in the Ristretto
document when it's already available in the h2c doc. Instead, the
right solution is probably to define a family of hash-to-curve suites,
parameterized in H, that use Elligator2 to map to edwards25519,
and map the result to [2]E.

Happy to chat more about this if there are details that seem wrong
or if we can make life easier for you in h2c, of course.

> Why should it be forced to launder its domain-separated hash output
> through a second HKDF construction?

It sounds so dramatic when you say it this way :)

But the reality is, it makes no difference to the performance, and it
(extract-and-expand) is far more misuse resistant. To me, the benefit
(preventing mistakes) outweighs the cost (~none).

Anyway, to me the above is missing the point. As I said (far) above,
forcing an implementor to build yet another hash-to-curve construction
just for ristretto255 when they could have have used the one they
already wrote for edwards25519 is a failure. Between "invoke HKDF one
more time before doing a slew of curve ops" and "implement another
tricky hash-to-curve algorithm," I know which one seems worse to me.

> it seems like the cleaner layering would be for the hash-to-curve
> draft to incorporate this draft by reference

Hey, that's a plausible solution, too! I'd be curious to hear what
other people think. I'm in favor of whatever is cleanest and clearest.

-=rsw

[Cfrg] Adoption request: draft-hdevalence-cfrg-ri… Filippo Valsorda
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Jeff Burdges
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Filippo Valsorda
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Paterson Kenneth
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Jeff Burdges
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Riad S. Wahby
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Jack Grigg
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Jeff Burdges
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Riad S. Wahby
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Riad S. Wahby
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Filippo Valsorda
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Filippo Valsorda
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Riad S. Wahby
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Riad S. Wahby
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Filippo Valsorda
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Filippo Valsorda
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Riad S. Wahby
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Watson Ladd
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Riad S. Wahby
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Riad S. Wahby
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Henry de Valence
[Cfrg] draft-hdevalence-cfrg-ristretto and draft-… Filippo Valsorda
Re: [Cfrg] Adoption request: draft-hdevalence-cfr… Riad S. Wahby