Re: [CFRG] draft-irtf-cfrg-vrf09 feedback

Leonid Reyzin <reyzin@cs.bu.edu> Wed, 26 January 2022 14:33 UTC

MIME-Version: 1.0
References: <CAH2-gx2RW+4ODs2KwPapv+HGWgXxo-Lz-LNk=f7Xev7BPn3zQA@mail.gmail.com>
In-Reply-To: <CAH2-gx2RW+4ODs2KwPapv+HGWgXxo-Lz-LNk=f7Xev7BPn3zQA@mail.gmail.com>
From: Leonid Reyzin <reyzin@cs.bu.edu>
Date: Wed, 26 Jan 2022 09:33:19 -0500
Message-ID: <CAHZ6D0vtc65QRa6CuoQF+dJDuTQuDZgsDTfToxgH07Au3Stcew@mail.gmail.com>
To: CFRG <cfrg@irtf.org>
Cc: Brian Chen <brian.chen@zoom.us>, Antonio Marcedone <antonio.marcedone@zoom.us>
Content-Type: multipart/alternative; boundary="000000000000c4a40705d67d16f0"
Archived-At: <https://mailarchive.ietf.org/arch/msg/cfrg/TkuoYhs3osNWqe2z-NySrzCf1Qk>
Subject: Re: [CFRG] draft-irtf-cfrg-vrf09 feedback
Precedence: list

Dear Antonio and Brian,

Thank you for your thorough reading of the document and detailed and
thoughtful feedback. Responses below.

> Hello CFRG,
>
> Here is some additional feedback on draft-irtf-cfrg-vrf-09.
>
> In general we think the draft is well-written and helpful to implement
the VRF functionality. We support its publication as an RFC, but have some
concrete suggestions that we feel would clarify the specification, increase
interoperability, and reduce the potential for bad parameter choices.
>
> Technical comments:
>
> - Lack of well-defined RSA-FDH ciphersuites: The specification (and the
references quoted) do not seem to recommend a specific hash function, or a
specific RSA modulus length. Therefore a compliant implementation might
accept weak keys (that are too short), or there might be confusion around
other parameters. We suggest adding a section specifying RSA ciphersuites,
similarly to how it is done for ECVRF. In particular, defining ciphersuites
with a fixed size for the RSA modulus and a specific hash function would
reduce ambiguity and help to quickly determine whether different
implementations of this standard can interoperate. To directly parallel
Section 5, we could add a section 4.4 called “RSA-FDH-VRF Ciphersuites”,
and add a reference to this section in the “Fixed options” of Section 4.

The case of RSA-FDH-VRF is quite different from the case of ECVRF, where
many parameters must be chosen to work together. In particular, there is no
reason to fix specific key lenghts, as typical RSA implementations can
handle variable-length keys, and security levels are often chosen based on
the importance of the key. We therefore decided to include three
ciphersuites, using SHA-256, SHA-384, and SHA-512 hash functions, but not
specify key lengths. See [this commit](
https://github.com/cfrg/draft-irtf-cfrg-vrf/commit/2e414c74d5d3f96448e7c2f3179ce8843d906585
)

> - ECVRF PK Validation: The PK validation procedure for ECVRF seems like a
likely thing for implementers to forget to do, especially given the way it
is described in the standard. Even if a library claimed to implement
ECVRF-P256-SHA256-TAI exactly according to the spec, it would be unclear if
the implementation performs this check or not, which may cause confusion
and security issues in cases where this check is important. We suggest
mandating that compliant implementations MUST perform this check in all
cases, as opposed to only when full uniqueness and full collision
resistance are deemed necessary. Another option might be to have
ciphersuites that turn this on explicitly, and others which don’t.
>
> Adding an additional test vector including a bad public key that the
implementation should reject would ensure implementations correctly cover
this case.

Validation is not needed in some applications and may be very expensive in
curves with a large cofactor (such as pairing-friendly curves). As noticed
by Chris Peikert and Jiyayu Xu in their last call response, key validation
is not needed for full uniqueness; it's needed only for full
collision-resistance, which is not a property required for many
applications. We do not wish to mandate a potentially expensive and
unnecessary step. Making key validation part of the ciphersuite also seems
imprudent, because different ciphersuites produce incompatible outputs,
while there are scenarios when the need to validate depends on the
verifier. We have, however, edited the draft to make key validation an
explicit input to the Verification requirement. The changes are spread over
two commits [one was overly complex](
https://github.com/cfrg/draft-irtf-cfrg-vrf/commit/a739d8c9c65867371c45dcd177ebc9812a8381d1)
and the [second one simplified it](
https://github.com/cfrg/draft-irtf-cfrg-vrf/commit/966557c6c3b4d3cc35b1a81e0e33bc901f984355)
Test vectors don't seem to be right the vehicle here, since all public keys
that are bad are explicitly specified in the second algorithm of 5.4.5.

>
> Minor technical comments:
>
> - The suggestion of increasing the security parameter for applications
concerned about tightness of cryptographic reduction cannot really apply to
ECVRF, as the curves are fixed by the spec. If you decide to accept our
previous suggestions and define parameters for the RSA based version, the
same would apply there too. In that case, maybe clarify that this is more
of a forward-looking suggestion for future versions of the spec?

Clarified, and also added a quantitative explanation [in this commit](
https://github.com/cfrg/draft-irtf-cfrg-vrf/commit/524a4fb97a07c55599e5a0d80379491eccd88e74
)

> - Section 4: Should the size of the modulus n (or a bound on it) be a
“Fixed option” like hLen?

As discussed above, we do not feel we should be getting into the business
of RSA key sizes. (Getting into EC key sizes is necessary for ciphersuites,
so we have no choice.)

> - Domain separation and notation: Section 7.7 implies that constants like
one_string and two_string that are often used as inputs to the hash
functions are for domain separation. In the RSA case, domain separation is
trivial to check, as these different octets are at the beginning of inputs
to both MGF1 and proof_to_hash. The only thing we would recommend is
perhaps naming the constants in a way that makes this intent self-evident,
such as defining TAG_PROOFTOHASH=0x01 and TAG_MGF1=0x02.
> On the other hand, in the ECVRF case things are a bit more complex, as
these octets appear at different positions in different hash functions
(sometimes they are in second position, sometimes in the last), and in some
cases aren’t even there (and we rely on the hash inputs being only known to
honest parties). We wonder if perhaps using a distinct octet (defined as a
constant with a descriptive name) as the first octet of the input to each
hash function would make both intent and reasoning easier.

We are, unfortunately, stuck with certain compatibility constraints with
other standards and drafts that have forced us to have the somewhat more
cumbersome approach to domain separation than we wanted. We explain domain
separation rationale in Section 7.9. We have named the constants as you
suggest. See [this commit](
https://github.com/cfrg/draft-irtf-cfrg-vrf/commit/dd39d85d3d9aa4ea4e5d58735dac7a0064c8f5d6
).

> - Section 5.4.1.1: Suggest mentioning that arbitrary_string_to_point is
allowed to return “INVALID” when it’s first defined, e.g., by inserting ‘or
“INVALID” ’ after “conversion of an arbitrary octet string to an EC point.”
On a first reading, the current definition seems to suggest that it already
maps every octet string to a valid EC point. The fact that there are both a
string_to_point and an arbitrary_string_to_point functions that seem to
serve similar purposes is a bit confusing: as others have noted, the name
doesn’t convey the difference.

We changed the name to interpret_hash_value_as_a_point and made clear it's
allowed to return INVALID [in this commit](
https://github.com/cfrg/draft-irtf-cfrg-vrf/commit/3e3c18195d869780108bfab977618aa9f93f1f3a
).

> - Section 5.4.1.1, step 6: The loop is slightly suspicious in that it
seems potentially unbounded; ctr looks like it can just keep going past a
single octet, causing int_to_string to fail internally in a way that is not
obviously handled. Suggest one sentence of clarification such as “The loop
in step 6 errors if ctr reaches 256, but for all ciphersuites in Section
5.5, this is overwhelmingly unlikely.”, which may be helpful for
implementers.

Added a note [in this commit](
https://github.com/cfrg/draft-irtf-cfrg-vrf/commit/3e3c18195d869780108bfab977618aa9f93f1f3a
).

>
> Minor editorial comments:
>
> - Abstract: The text talks about “several VRF constructions”. But then
“One VRF uses RSA and the other VRF uses...”. It could say “Some VRFs are
based on RSA and others on Elliptic Curves (EC).”

Fixed [in this commit](
https://github.com/cfrg/draft-irtf-cfrg-vrf/commit/58ed2ae724af435ff210c8a0b6c0c45d9397434b
)

> - Implicit definition of the Key Generation algorithm. Section 2
explicitly defines a syntax for the VRF_hash, VRF_prove, VRF_proof_to_hash,
…, but only implicitly defines the need for a key generation algorithm
(there is no VRF_keygen or similar). We think that explicitly mentioning
this algorithm (with its inputs and outputs both in the EC case and the RSA
case) would improve the clarity of the document, even if its actual
implementation is deferred to some other specification.

Key generation is not specified as an explicit algorithm in RFC 8017, and
we are trying to follow this example for the RSA-FDH-VRF. Key generation
for the ECVRF outputs slightly different versions of the key depending on
the ciphersuite. The text in section 5 makes it clear that it's specified
in the ciphersuites.

> - Section 3.4: Suggest elaborating slightly on what “certain VRF
applications” are, in addition to the provided references. A concrete
proposal: “(such as preventing bias when selecting participants in some
consensus protocols as in [GHMVZ17] and [DGKR18])”.

Fixed [in this commit](
https://github.com/cfrg/draft-irtf-cfrg-vrf/commit/58ed2ae724af435ff210c8a0b6c0c45d9397434b
)

> - Section 4, under Primitives: Add a short parenthetical sentence about
what MGF1 does to match the other primitives above it, for example: “(given
a seed, which is an octet string, and a length, deterministically produce a
pseudorandom octet string of the specified length)”

Fixed [in this commit](
https://github.com/cfrg/draft-irtf-cfrg-vrf/commit/58ed2ae724af435ff210c8a0b6c0c45d9397434b
)

> - Section 4.1: The RSAFDHVRF_prove algorithm explicitly uses the RSA
modulus n and its length k. Although these can be recomputed/extracted from
K as in RFC8017, these are not formally inputs to the algorithm and so this
makes the description hard to follow. Maybe we could add a step to
explicitly extract n and k from K (as in Step 1 of ECVRF_prove), or add n
as an explicit input.

Fixed [in this commit](
https://github.com/cfrg/draft-irtf-cfrg-vrf/commit/58ed2ae724af435ff210c8a0b6c0c45d9397434b
)

> - The concatenation operator || is listed as notation in section 4 and as
a primitive in section 5. In general the two sections are not organized
consistently in terms of notation. We realize that this is to use a similar
notation to other unrelated RFCs, but contents in the two sections do not
flow organically.

Indeed, there is a choice to be internally consistent or to be consistent
to prior work that the readers are more likely to be familiar with. We
found more value in the first choice. Also, the two VRFs have very
different parameters and notation, and we felt that organizing it in a way
that is most logical for each VRF is better than artificially striving for
parallel structure.

> - The & operator in step 5 of the algorithm to validate keys in ed25519
(Section 5.6.1) is not defined, as well as the notation string[int] to
indicate individual octets in a string. Given that * and || are defined, we
suggest defining those too.

We focused on defining notation that is nonstandard or ambiguous (ECs can
be written multiplicatively or additively; ^ can mean XOR or
exponentiation; || can mean "or" or concatenation). The bitwise "and" and
string indexing are standard enough that we didn't feel it was necessary to
define them.

> - Section 3.4: The text says “Additionally, the ECVRF is believed to also
satisfy this property ...” with respect to the random-oracle-like
unpredictability property. The language doesn’t make it clear if this is
just a conjecture or if a proof of this (in some formal model) exists in
the literature. A more concrete reference would help.

Fixed [in this commit](
https://github.com/cfrg/draft-irtf-cfrg-vrf/commit/58ed2ae724af435ff210c8a0b6c0c45d9397434b
)

> - Section 5.4.2.2 implicitly assumes that Hash has at least a 64 bytes
output. Indeed, in RFC8032 Hash is fixed to be SHA512. If the intent is to
use the same steps as the Signature algorithm from RFC8032, perhaps it
would be worth explicitly fixing Hash here as well.

Added "To use this algorithm, hLen MUST be at least 64."


> Typos:
> - Section 5.4.1.1: Right before its description,
ECVRF_hash_to_try_and_increment -> ECVRF_hash_to_curve_try_and_increment

Fixed [in this commit](
https://github.com/cfrg/draft-irtf-cfrg-vrf/commit/3e3c18195d869780108bfab977618aa9f93f1f3a
).

> - Sec 7.7: “inputs all the hash functions used” -> “inputs to all the
hash functions used“

Fixed

> Note that we did not check the test vectors.
>
> Sincerely yours in cryptography,
> Antonio Marcedone and Brian Chen
> Zoom Video Communications
> Note: these are our own views and do not reflect those of our employer

[CFRG] draft-irtf-cfrg-vrf09 feedback Antonio Marcedone
Re: [CFRG] draft-irtf-cfrg-vrf09 feedback Leonid Reyzin