Re: [Cfrg] RGLC on draft-irtf-cfrg-hash-to-curve-10

Greg Hudson <ghudson@mit.edu> Sat, 17 October 2020 10:10 UTC

To: "Stanislav V. Smyshlyaev" <smyshsv@gmail.com>, CFRG <cfrg@irtf.org>
Cc: cfrg-chairs@ietf.org
References: <CAMr0u6=-rzVW_tsmmifPu-7FA9DaZ1z83_akp4pkTjHRDGUHiA@mail.gmail.com>
Message-ID: <c43ee53d-56ae-d8ef-0703-4840aeaac959@mit.edu>
Date: Sat, 17 Oct 2020 06:10:24 -0400
In-Reply-To: <CAMr0u6=-rzVW_tsmmifPu-7FA9DaZ1z83_akp4pkTjHRDGUHiA@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cfrg/o1y2Ue3wY_w2_IQEUEtPi92wifg>

I have some concerns over the use of the terms "random" and "uniform" in
this document.

The term "random" is sometimes used to describe the output of
deterministic functions.  For instance, section 5 says "it first hashes
the input byte string to produce a uniformly random byte string", and
section 5.2 says that an alternative hash_to_field must output field
elements that are "uniformly random except with bias at most 2^-k".
hash_to_field is a deterministic function; it introduces no randomness.

The term "uniform" describes a probability distribution where all values
have equal probability.  There is an assumption in the document that
applying a hash function to an input string produces an output that is
somehow uniform or uniformly random, without discussing the input as
being selected from any particular probability distribution.

Perhaps there is a definition of "uniform" which makes this language
more rigorous, but the document does not define the term or provide a
reference, so I can only read it as having the meaning from probability.

I understand that the document wants to specify functions
indistinguishable from random oracles, and therefore needs to be
concerned with not introducing bias beyond what is intrinsic in the
input distribution.  (An example of bias intrinsic to the input
distribution would be hashing passwords to curve elements, where 90% of
the passwords are "mypassword".  One output element will necessarily
have 90% probability, no matter how random-looking that single element
might be.)  Unfortunately, I don't know how to describe that rigorously,
so I can't suggest specific wording changes.
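
To make the password example concrete, here is a minimal sketch.  It
rests on assumptions that are not in the draft: SHA-256 read as an integer
stands in for the hash-to-curve output (toy_hash is a hypothetical
helper), and the password list is made up to match the 90% figure.

```python
import hashlib
from collections import Counter


def toy_hash(msg: bytes) -> int:
    """Hypothetical stand-in for a hash-to-curve output: SHA-256 as an integer."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big")


# Made-up input distribution: 90 of 100 users chose "mypassword".
passwords = [b"mypassword"] * 90 + [b"password-%d" % i for i in range(10)]

# The function is deterministic: the same input always gives the same output.
deterministic = toy_hash(b"mypassword") == toy_hash(b"mypassword")

# Tally how often each output value occurs over the input distribution.
outputs = Counter(toy_hash(p) for p in passwords)
top_output, top_count = outputs.most_common(1)[0]
top_share = top_count / len(passwords)

print("deterministic:", deterministic)
print("distinct outputs:", len(outputs))
print("share of most common output:", top_share)
```

The most common output carries the same probability mass as the most
common input, however random-looking its bits are.  Any "uniform" claim
about the output can therefore only hold relative to an assumed
distribution on the inputs.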