Re: [CFRG] hash_to_field requires implementing a new 25519 field op

Loup Vaillant-David <> Wed, 21 July 2021 21:43 UTC

From: Loup Vaillant-David <>
To: "Riad S. Wahby" <>, Filippo Valsorda <>
Date: Wed, 21 Jul 2021 23:42:49 +0200
In-Reply-To: <20210721191123.i3f33p3qvkwxlbtl@kaon.local>
References: <> <20210721191123.i3f33p3qvkwxlbtl@kaon.local>
Subject: Re: [CFRG] hash_to_field requires implementing a new 25519 field op


> But (as you know) in general it is not safe to reduce ceil(log2(p))
> bits mod p, since this can give significant bias for general p. The
> goal in the draft is to specify a hash-to-field method that is safe
> for any p without requiring any parameter except the security level
> in bits, k, and which gives bias roughly 2^-k. Adding parameters is
> dangerous in the sense that incorrectly choosing parameters gives a
> hash-to-field function that looks OK but is subtly broken.

The three primes I know of are pretty clear-cut. Two are clearly safe,
the third clearly is not.

1. 2^255 - 2^4 - 2^1 - 1               (Curve25519)
2. 2^448 - 2^224 - 1                   (Curve448)
3. 2^256 - 2^224 + 2^192 + 2^96 - 1    (P-256)

The first  has a bias of about 2^-(255-4)   == 2^-251. => Safe
The second has a bias of about 2^-(448-224) == 2^-224. => Safe
The third  has a bias of about 2^-(256-224) == 2^-32.  => Unsafe
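These figures are easy to check: reducing a b-bit uniform value mod p
over-represents part of the range by roughly (2^b mod p) / 2^b. A
quick sketch (the function name is mine, not from any spec):

```python
import math

def approx_bias_log2(b, p):
    """Rough log2 of the bias from reducing b uniform bits mod p."""
    return math.log2(2**b % p) - b

p25519 = 2**255 - 19
p448   = 2**448 - 2**224 - 1
p256   = 2**256 - 2**224 + 2**192 + 2**96 - 1

print(round(approx_bias_log2(255, p25519)))  # -251
print(round(approx_bias_log2(448, p448)))    # -224
print(round(approx_bias_log2(256, p256)))    # -32
```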

So, Curve25519 and Curve448 have biases well below 2^-128, and as such
are virtually undetectable (you need about 2^128 samples, which you are
not going to get). P-256's bias however can be detected with as little
as a few billion samples, and as such is easily detected in realistic
settings.

Yes, having yet another parameter is a bummer, but as long as we don't
use primes with a bias close to 2^-128, choosing an acceptable
threshold (like 2^-128) should not be too much of a problem.

> In my mind the second is the much more convincing argument, because
> every EC-relevant field library that I'm aware of implements a wide
> reduction already, inside the multiplication routine (which gives a
> p^2-size intermediate result).
> So this
> > Curve25519 implementations will all have to implement and expose
> > a new field operation (like which will be only
> > used for h2c, meaning it will not be well tested or maintained,
> > or h2c implementations will have to roll their own, probably
> > in variable time.
> isn't really true. If one exposes the ceil(log2(p^2))-bit reduction
> that's already implemented for multiplication, then there's no need
> for extra code, and maybe even no need for extra tests.

Wait a minute, that reduction is not *nearly* as large as you make it
out to be.  Let's take a look at Curve25519's ref10 implementation in
SUPERCOP. Since the result of the multiplication (before carry
propagation) is stored in 10 signed 64-bit integers, you might think
the number they represent easily exceeds 2^512, or even approaches
2^640.


In fact, the number those wide limbs represent does not even exceed
2^293. That's because the limbs overlap: even though each limb is 63
bits wide, they are only 25 or 26 bits apart.
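That bound is easy to sanity-check, assuming each pre-carry limb stays
below 2^62 in magnitude (ref10's actual per-limb bounds are tighter
still; the 2^62 figure here is my assumption, not ref10's):

```python
# Bit positions of ref10's ten limbs (alternating 26/25-bit radix):
positions = [0, 26, 51, 77, 102, 128, 153, 179, 204, 230]

# Even with every limb at magnitude just under 2^62, the value the
# limbs represent together stays under 2^293:
max_value = sum((2**62 - 1) << p for p in positions)
assert max_value < 2**293
print(max_value.bit_length())  # 293
```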

How, then, can we compute the multiplication *at all*, you may ask?
I've explained it elsewhere; search for "Cheating at modular
arithmetic".

The trick is to perform part of the modular reduction during the
multiplication itself, taking advantage of the fact that we're dealing
with a pseudo-Mersenne prime. See, when you work modulo 2^255-19,
multiplying by 2^255 is the same as multiplying by 19. So you can just
shift numbers right by 255 bits, and multiply by 19 to compensate.
That's why the field multiplication has those *19 all over the place:
instead of spreading the partial products over twenty limbs, we fold
the high ones directly into the ten least significant limbs, and
multiply by 19 to compensate.
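The congruence behind the folding is easy to demonstrate with
big-integer arithmetic (the function name below is mine, for
illustration only):

```python
p = 2**255 - 19

# Split x at bit 255: x = hi * 2^255 + lo.  Since 2^255 ≡ 19 (mod p),
# the high part can be folded down by multiplying it by 19:
def fold_once(x):
    hi, lo = x >> 255, x & ((1 << 255) - 1)
    return 19 * hi + lo  # congruent to x mod p, but far smaller

x = 2**510 - 1            # e.g. a product of two 255-bit numbers
assert fold_once(x) % p == x % p
print(fold_once(x).bit_length())  # 260
```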

Long story short, your trick won't work. We will need to implement a
separate reduction routine.


And before someone is tempted to point out that EdDSA involves the
reduction of a 512-bit value modulo the prime order of the curve,
which we could reuse, hold your horses: while TweetNaCl and Monocypher
use a fairly generic algorithm that can easily be adapted by changing
a single constant, ref10 uses a highly optimised reduction where the
value of the order of the curve is deeply embedded in the algorithm.
I personally could not even make sense of it, so I don't think it can
be adapted.

Moreover, keep in mind that 2^255-19 is a pseudo-Mersenne prime, and
as such admits a much faster reduction algorithm: just shift the most
significant bits to the right, multiply by 19, and add the result to
the least significant bits. You'll get a full reduction in a couple of
iterations (fewer if you're clever). Still, implementing this is not
for Arthur Random Programmer. I personally would test it quite
extensively, and write a proof of correctness on top.
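The shift-multiply-add loop can be sketched in a few lines of Python.
This is a variable-time sketch for testing only: a real implementation
must run in constant time over fixed-size limbs, which big-integer
arithmetic does not give you.

```python
p = 2**255 - 19

def reduce_wide(x):
    """Reduce a wide (e.g. 512-bit) integer mod 2^255 - 19.

    Variable-time sketch, for illustration and testing only.
    """
    while x >> 255:                       # fold: 2^255 ≡ 19 (mod p)
        x = 19 * (x >> 255) + (x & ((1 << 255) - 1))
    if x >= p:                            # at most one subtraction left
        x -= p
    return x

x = 2**512 - 1
assert reduce_wide(x) == x % p
```

Note the loop settles after a couple of folds (a 512-bit input drops
below 2^262 after one fold, and below roughly 2^255 after the next),
which matches the "couple iterations" claim above.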

> Would it partially allay your concerns if the hash-to-field section
> explicitly pointed out that it's possible to use the multiplication
> routine (or, well, half of it) for the wide modular reduction?

Since to my knowledge it is *not* possible, it would actually make me
very worried: either I missed something big (and I want to know), or
the RFC writers did (and it needs to be pointed out).