From nobody Wed Jul 21 16:10:54 2021
Return-Path:
X-Original-To: cfrg@ietfa.amsl.com
Delivered-To: cfrg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1])
by ietfa.amsl.com (Postfix) with ESMTP id 8450F3A2E11
for ; Wed, 21 Jul 2021 16:10:52 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.898
X-Spam-Level:
X-Spam-Status: No, score=-1.898 tagged_above=-999 required=5
tests=[BAYES_00=-1.9, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001,
SPF_NONE=0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44])
by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024)
with ESMTP id 5KFj2V4Nb61x for ;
Wed, 21 Jul 2021 16:10:47 -0700 (PDT)
Received: from relay8-d.mail.gandi.net (relay8-d.mail.gandi.net
[217.70.183.201])
(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))
(No client certificate requested)
by ietfa.amsl.com (Postfix) with ESMTPS id C892B3A2E0E
for ; Wed, 21 Jul 2021 16:10:45 -0700 (PDT)
Received: (Authenticated sender: loup@loup-vaillant.fr)
by relay8-d.mail.gandi.net (Postfix) with ESMTPSA id 554051BF204;
Wed, 21 Jul 2021 23:10:40 +0000 (UTC)
Message-ID: <05222110505718e6aa884adb78e185f17ea40ac2.camel@loup-vaillant.fr>
From: Loup Vaillant-David
To: Rene Struik , "Riad S. Wahby" ,
Filippo Valsorda
Cc: cfrg@irtf.org
Date: Thu, 22 Jul 2021 01:10:39 +0200
In-Reply-To:
References:
<20210721191123.i3f33p3qvkwxlbtl@kaon.local>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.28.5-0ubuntu0.18.04.2
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Archived-At:
Subject: Re: [CFRG] hash_to_field requires implementing a new 25519 field op
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Crypto Forum Research Group
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 21 Jul 2021 23:10:53 -0000
Hi Rene,
Something similar could indeed work (untested code below):
void reduce(uint8_t r[32], uint8_t w[64])
{
fe low, high;
fe_frombytes(low , w);
fe_frombytes(high, w + 32);
fe_mul_small(high, high, 19);
fe_add (low , low, high);
fe_tobytes(r, low);
// maybe wipe low and high
}
There's a little problem though: this reduction ignores *two* bits from
the input, one of which is right in the middle. As much as I love how
it makes the implementation easy, you might hate how it makes the
specification weird.
Or I could use the same trick to read the two missing bits, and have a
full, clean, wide reduction. Let's see (untested code):
void reduce(uint8_t r[32], uint8_t w[64])
{
fe low, high;
fe_frombytes(low , w);
fe_frombytes(high, w + 32);
fe_mul_small(high, high, 38);
fe_add (low , low, high);
low[0] += (w[31] >> 7) * 19;
low[0] += (w[63] >> 7) * 19
* 38; // may be omited
fe_tobytes(r, low);
// maybe wipe low and high
}
What do you know, that's much less ugly than I anticipated. Nice.
Slower than a dedicated reduction, but if I'm being honest the
performance difference probably never matters.
Thanks for the tip,
Loup.
On Wed, 2021-07-21 at 18:22 -0400, Rene Struik wrote:
> Hi Loup, Filippo:
>
> I do not see the modular reduction implementation issue:
>
> if x=x0+x1*2^m and N=2^m-alpha, x (mod N) = (x0 +alpha*x1) (mod N) =
> ZNadd(x0,ZNmul(alpha,x1)), where ZNadd and ZNmul are operators on
> the
> set ZN=[0..N-1). This may not be the most efficient, but using two
> invocations of a binary operator to implement a single modular
> reduction
> should work for "Arthur Random Programmer" (whoever that new [to me]
> creature is). For p=2^255-19, one could take m=255 and and alpha=19
> (i.e., N=p) or m=256 and alpha=38 (i.e., N=2*p), depending on libary
> details. The above also works for NIST prime curves, Brainpool
> curves,
> Gost curves, secp256k1 (all with negligible relative incremental
> cost).
>
> I do not think a specification needs to spell out all implementation
> details, as long as it is plausible that effective implementations
> are
> possible.
>
> This all being said, the suitability of curve maps (including
> detailed
> steps) depends on the application(s) one wishes to cater to. {I do
> not
> recall having seen such discussion, e.g., related to password
> schemes
> and pairing-based applications.}
>
> Rene
>
> On 2021-07-21 5:42 p.m., Loup Vaillant-David wrote:
> > Hi,
> >
> >
> > > But (as you know) in general it is not safe to reduce
> > > ceil(log2(p))
> > > bits mod p, since this can give significant bias for general p.
> > > The
> > > goal in the draft is to specify a hash-to-field method that is
> > > safe
> > > for any p without requiring any parameter except the security
> > > level
> > > in bits, k, and which gives bias roughly 2^-k. Adding parameters
> > > is
> > > dangerous in the sense that incorrectly choosing parameters gives
> > > a
> > > hash-to-field function that looks OK but is subtly broken.
> >
> > The three primes I know of are pretty clear cut. Two are clearly
> > safe,
> > the other clearly is not.
> >
> > 1. 2^255 - 2^4 - 2^1 - 1 (Curve25519)
> > 2. 2^448 - 2^224 - 1 (Curve448)
> > 3. 2^256 - 2^224 + 2^96 + 2^192 - 1 (P-256)
> >
> > The first has a bias of about 2^-(255-4) == 2^-252. => Safe
> > The second has a bias of about 2^-(448-224) == 2^-224. => Safe
> > The third has a bias of about 2^-(256-224) == 2^-32. => Unsafe
> >
> > So, Curve25519 and Curve448 have biases well below 2^-128, and as
> > such
> > are virtually undetectable (you need about 2^128 samples, which you
> > are
> > not going to get). P-256's bias however can be detected with as
> > little
> > as a few billion samples, and as such is easily detected in
> > realistic
> > settings.
> >
> > Yes, having yet another parameter is a bummer, but as long as we
> > don't
> > use primes with a bias close to 2^-128, choosing an acceptable
> > threshold (like 2^-128) should not be too much of a problem.
> >
> >
> > > In my mind the second is the much more convincing argument,
> > > because
> > > every EC-relevant field library that I'm aware of implements a
> > > wide
> > > reduction already, inside the multiplication routine (which gives
> > > a
> > > p^2-size intermediate result).
> > >
> > > So this
> > >
> > > > Curve25519 implementations will all have to implement and
> > > > expose
> > > > a new field operation (like https://git.io/JlU6L) which will be
> > > > only
> > > > used for h2c, meaning it will not be well tested or maintained,
> > > > or h2c implementations will have to roll their own, probably
> > > > in variable time.
> > >
> > > isn't really true. If one exposes the ceil(log2(p^2)) bit
> > > reduction
> > > that's already implemented for multiplication, then there's no
> > > need
> > > for extra code, and maybe even no need for extra tests.
> >
> > Wait a minute, that reduction is not *nearly* as large as you make
> > it
> > out to be. Let's take a look at Curve25519's ref10 implementation
> > on
> > SUPERCOP. You would think that because the result of the
> > multiplication
> > (before carry propagation) is stored in 10 64-bit signed integers,
> > the
> > number they represent easily exceeds 2^512, or even approaches
> > 2^640.
> >
> > Nope.
> >
> > In fact, that number those wide limbs represent does not even
> > exceed
> > 2^293. That's because the limbs overlap: even though they're 63-bit
> > wide, they're only 25 or 26 bits apart.
> >
> > How then can we compute the multiplication *at all*, would you say?
> > I've explained it on
> > https://loup-vaillant.fr/tutorials/poly1305-design
> > Search for "Cheating at modular arithmetic".
> >
> > The trick is to perform part of the modular reduction by taking
> > advantage of the fact that you're dealing with a pseudo Mersene
> > prime.
> > See, when you work modulo 2^255-19, multiplying 2^255 is the same
> > as
> > multiplying by 19. So you can just shift numbers to the right 255
> > bits,
> > and multiply by 19 to compensate. That's why field multiplication
> > has
> > those *19 all over the place: instead of representing the number
> > over
> > twenty limbs, we just shift directly to one of the ten least
> > significant limbs, and multiply by 19 to compensate.
> >
> > Long story short, your trick won't work. We will need to implement
> > a
> > separate reduction routine.
> >
> > ---
> >
> > And before someone is tempted to point out that EdDSA involves the
> > reduction of a 512-bit modulo the prime order of the curve that we
> > could reuse, hold your horses: while TweetNaCl and Monocypher uses
> > a
> > fairly generic algorithm that can easily be adapted by changing a
> > single constant, I've seen ref10 use a highly optimised reduction
> > where
> > the value of the order of the curve is deeply embedded in the
> > algorithm. I personally could not even make sense of it, so I don't
> > think it can be adapted.
> >
> > Moreover, keep in mind that 2^255-19 is a pseudo Mersene prime, and
> > as
> > such can use a much faster reduction algorithm: just shift the most
> > significant bits to the right, multiply by 19, and add it to the
> > least
> > significant bits. You'll get a full reduction in a couple
> > iterations
> > (less if you're clever). Still, implementing this is not for Arthur
> > Random Programmer. I personally would test it quite extensively,
> > and
> > write a proof of correctness on top.
> >
> >
> > > Would it partially allay your concerns if the hash-to-field
> > > section
> > > explicitly pointed out that it's possible to use the
> > > multiplication
> > > routine (or, well, half of it) for the wide modular reduction?
> >
> > Since to my knowledge it is *not* possible, it would actually make
> > me
> > very worried: either I missed something big (and I want to know),
> > or
> > the RFC writers did (and it needs to be pointed out).
> >
> > Loup.
> >
> >
> > _______________________________________________
> > CFRG mailing list
> > CFRG@irtf.org
> > https://www.irtf.org/mailman/listinfo/cfrg
>
>