Re: [CFRG] Small subgroup question for draft-irtf-cfrg-hash-to-curve

Mike Hamburg <mike@shiftleft.org> Sat, 10 April 2021 20:20 UTC

Return-Path: <mike@shiftleft.org>
X-Original-To: cfrg@ietfa.amsl.com
Delivered-To: cfrg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 4AC4D3A1A17 for <cfrg@ietfa.amsl.com>; Sat, 10 Apr 2021 13:20:59 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.305
X-Spam-Level:
X-Spam-Status: No, score=-1.305 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, HTML_MESSAGE=0.001, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=no autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (1024-bit key) header.d=shiftleft.org
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id zdUC5lDWt_1f for <cfrg@ietfa.amsl.com>; Sat, 10 Apr 2021 13:20:54 -0700 (PDT)
Received: from doomsayer.shiftleft.org (unknown [54.219.126.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id B3E2E3A1A18 for <cfrg@irtf.org>; Sat, 10 Apr 2021 13:20:54 -0700 (PDT)
Received: from [192.168.7.53] (unknown [198.207.18.242]) (Authenticated sender: mike) by doomsayer.shiftleft.org (Postfix) with ESMTPSA id C6384BB869; Sat, 10 Apr 2021 20:20:51 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=shiftleft.org; s=sldo; t=1618086053; bh=7HY/gB727bd077t0EX5e9kPfCqRjHe2VJyYsNkcaddk=; h=From:Subject:Date:In-Reply-To:Cc:To:References:From; b=XVlWEXeNSTPXSBC0cVgcGZ7T0bD/RCnI9h9GTcujI6Vg7dIsEtPMFC0ap5iP5kt0U f79s+reRdepzjL1fka5mepfp8Y1wRqL2XunqYsmafvMZL8ULEpMt1++Zg20yw7OWrr Y1yUp0C8IlST8kLBS5ckGLQyPQg2gz2Sl6oznJ3Q=
From: Mike Hamburg <mike@shiftleft.org>
Message-Id: <22A2A668-3190-4671-AD93-9916D0AE640A@shiftleft.org>
Content-Type: multipart/alternative; boundary="Apple-Mail=_4555501A-22F2-4C8B-83C0-FAE37C1291FD"
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.60.0.2.21\))
Date: Sat, 10 Apr 2021 17:20:50 -0300
In-Reply-To: <b4a305d5-5ebe-e52f-5b1a-6bdd9a9892d6@gmail.com>
Cc: Hugo Krawczyk <hugo@ee.technion.ac.il>, CFRG <cfrg@irtf.org>, "Hao, Feng" <Feng.Hao=40warwick.ac.uk@dmarc.ietf.org>
To: Rene Struik <rstruik.ext@gmail.com>
References: <e270e62d-941d-0a87-7dc9-cf80f73b5aeb@jacaranda.org> <d0778523-5f5d-4327-b795-279918c1899c@www.fastmail.com> <CAMr0u6=PBX1W5zQFmpxKQ=ViUXN9QK00BREL4M0=2HOkaXaiZw@mail.gmail.com> <VI1SPR01MB03573585C37B871D200ECC23D6739@VI1SPR01MB0357.eurprd01.prod.exchangelabs.com> <trinity-f323065e-9f30-48fd-9ead-0865e8f877eb-1618002469856@3c-app-webde-bap03> <VI1SPR01MB035772443E4DA3206E4CD4D3D6739@VI1SPR01MB0357.eurprd01.prod.exchangelabs.com> <7944D4F1-81F8-44FC-95D1-45D47733B385@shiftleft.org> <VI1SPR01MB03574E592790FD59C1ACEB84D6729@VI1SPR01MB0357.eurprd01.prod.exchangelabs.com> <20210410151254.7ze5pt4lpvblhk3f@muon> <CADi0yUNo7o07qM2Qw8yd_eVw_-cM-9wNy3CrLw_Pif79oD_+Og@mail.gmail.com> <f9265449-6921-7b3a-7b02-5580c8bf1b75@gmail.com> <529FE732-1982-4E1F-B0A3-4CE715E4444C@shiftleft.org> <b4a305d5-5ebe-e52f-5b1a-6bdd9a9892d6@gmail.com>
X-Mailer: Apple Mail (2.3654.60.0.2.21)
Archived-At: <https://mailarchive.ietf.org/arch/msg/cfrg/GGXLny2K-JyI4U1ruay2VtWJRzk>
Subject: Re: [CFRG] Small subgroup question for draft-irtf-cfrg-hash-to-curve
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <https://www.irtf.org/mailman/options/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cfrg/>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <https://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Sat, 10 Apr 2021 20:20:59 -0000

Hi Rene,

But again, there is an attack if the adversary can find hash outputs of known dlog, even if that dlog isn’t 0.

Have a nice outing,
— Mike

> On Apr 10, 2021, at 5:05 PM, Rene Struik <rstruik.ext@gmail.com> wrote:
> 
> Hi Mike:
> 
> Arguments as to why something is implausible would not even be necessary if one could simply rule out specific corner cases by design. This is, to my understanding, the point Feng Hao tried to bring across (and a reason for me to weigh-in).
> 
> Rene
> {heading back outside due to unseasonably warm weather of 22 oC right now in Toronto}
> 
> On 2021-04-10 3:09 p.m., Mike Hamburg wrote:
>> Hi Rene,
>> 
>> I’m not sure about specific PAKEs, but typically a password would be hashed through a salted KDF construction like Argon2 or PBKDF2, and some protocols include a nonce as well.  For honest users, this means that (unless the hash is truly terrible, i.e. fails to roughly preserve entropy) the identity will never be encountered, even if the hash or map-to-curve is subtly backdoored.
>> 
>> The argument breaks down if an active adversary can find a salt/nonce such that hash(pw, salt) maps to the identity point for many passwords pw.  This is unlikely to happen even if the hash is broken, because the salt is typically short.  But even so, I’m not sure this is unique to the identity element: at least if the client confirms first, then the adversary just needs to find a salt such that dlog(hash(pw1, salt), hash(pw2, salt)) is known — or better yet, such that dlog relations are known for a large set of passwords — in order to break SPEKE or CPace's security.
>> 
>> 
>> 
>> Again, I’m sympathetic to the idea that returning the identity complicates error handling.  At least for some protocols, we have to forbid sending it to keep the protocol contributive (or for other reasons), so the identity would cause an error, which might be the only possible error in some code paths.  We could instead specify a hash_to_curve that’s uniform over non-identity elements.  This could be achieved by eg computing (1 + (hash1 mod p-1)) * cofactor * map_to_curve(hash2), at least if you can keep low-order elements out of map_to_curve(h2).  If a low-order element is an image, you could use (1 + (hash1 mod p-1)) * cofactor * (map_to_curve(hash2) + Q), where Q is the precomputed lexicographically least point such that no low-order point is an image.
>> 
>> This is slower, but if you’re going to scalarmul again anyway, then you could fold the multiplies together so that the leading term costs basically nothing.  Or, since most protocols don’t actually need uniformity, just do the analysis and drop the leading term.
>> 
>> But I don’t think this is convincing.  The counterargument is: uniform maps to G are already analyzed, already implemented, simpler, and preferable in some cases (again, at least for PAK), and the security implications are pretty much negligible, and are likely to remain negligible even if the hash is broken.  So it’s just not worth re-doing at the map to curve level.  It’s easier to just say: if in your protocol you don’t want the identity to come out of hash_to_curve, then detect it and replace it with the generator.  Or again, drop uniformity and use cofactor * (map_to_curve + Q).
>> 
>> Cheers,
>> — Mike
>> 
>>> On Apr 10, 2021, at 2:51 PM, Rene Struik <rstruik.ext@gmail.com <mailto:rstruik.ext@gmail.com>> wrote:
>>> 
>>> Hi Hugo:
>>> 
>>> It would be of interest what the impact of a "broken" hash function in the hash to curve construction on the security of password based schemes would be. In the case of mappings that always yield a high-order point, the impact may not immediately be obvious, while with mappings that do not preclude low-order points this might show more readily, also to offline observers. This may be an interesting addition to security proof papers that now focus mostly on idealized functionality (such as random oracles).
>>> 
>>> As an aside, as I noted in my email of yesterday [1],  mappings that fix a particular image point for a specific input are trivial if the construction itself has "free variables" one could fix after the fact (e.g., pick delta such that delta*H("Hugo123")^2=t0), thereby yielding an immediate attack, irrespective of hash function functionality. See specific comment a) in [1].
>>> 
>>> All in all, it seems that guaranteeing that curve mappings would never yield low-order points seems prudent, esp. if it comes at roughly the same computational cost as ones that do not give those assurances.
>>> 
>>> Best regards, Rene
>>> 
>>> Ref: [1] https://mailarchive.ietf.org/arch/msg/cfrg/lAhuAaEBIgaxhf3Q81jOFjXQLz0/ <https://mailarchive.ietf.org/arch/msg/cfrg/lAhuAaEBIgaxhf3Q81jOFjXQLz0/>
>>> 
>>> On 2021-04-10 1:27 p.m., Hugo Krawczyk wrote:
>>>> Feng, you say:
>>>> 
>>>> > On the other hand, from the perspective of a higher layer protocol (say CPace, OPAQUE and PAK), it’s simply impossible to handle the exception. As soon as map-to-curve hits the small subgroup, the password in a PAKE system will be compromised. Therefore, the above warning is self-defeating and not meaningful.
>>>> 
>>>> If I understand correctly, you are saying that in the case of password protocols, the unlikely event of (a correctly designed, correctly implemented) hash-to-curve mapping some value to the identity has irrecoverable consequences that are specific to the PAKE setting.
>>>> 
>>>> I wanted to comment that in the case of OPAQUE, you could check during password registration that a user's password maps to the identity and ask to choose a new password (we are used to websites rejecting some passwords). However, when that happens, the website should immediately (*) sound an alarm to be heard across the universe. You would have found a preimage of the identity under a RO-modeled hash function. Either you are observing an event with probability, say, 2^{-256}, or you are observing a hugely more probable event: Someone broke the one-wayness of the hash function. STOP USING IT IMMEDIATELY FOR ANY PURPOSE.
>>>> 
>>>> (The same alarm should go off if it just happens in a run of any other protocol.)
>>>> 
>>>> (*) Of course, you should first check that you really have a preimage of the identity under the hash - the most probable event to produce such a result is an implementation error.
>>>> 
>>>> Hugo
>>>> 
>>>> PS: When you have an analysis that assumes a uniform distribution and then decide to deviate from it as a way to "enhance" the design, you may be introducing subtle weaknesses. An historic example (irrelevant to the cases we are discussing here but illustrating the principle) is Enigma's avoidance  of encrypting letters to themselves - something Turing was fast to exploit
>>>> https://en.wikipedia.org/wiki/Cryptanalysis_of_the_Enigma#Security_properties <https://en.wikipedia.org/wiki/Cryptanalysis_of_the_Enigma#Security_properties>
>>>> 
>>>> On Sat, Apr 10, 2021 at 11:13 AM <rsw@cs.stanford.edu <mailto:rsw@cs.stanford.edu>> wrote:
>>>> Hello Feng,
>>>> 
>>>> "Hao, Feng" <Feng.Hao=40warwick.ac.uk@dmarc.ietf.org <mailto:40warwick.ac.uk@dmarc.ietf.org>> wrote:
>>>> > Rsw also gave a similar example of having all zeros for the hash.
>>>> > Let me clarify that we are not – and shouldn’t be - concerned with
>>>> > any of such cases since the values are uniformly distributed within
>>>> > their respective range.
>>>> 
>>>> Right. And the argument is precisely the same for hash-to-curve!
>>>> 
>>>> Let me be perfectly clear: the property that hash_to_curve gives
>>>> is that the output is a uniformly* distributed point in the (big)
>>>> prime-order subgroup of the target elliptic curve.
>>>> 
>>>> At the risk of seeming didactic (in which case, apologies): the
>>>> identity element is indeed an element of the target group G.
>>>> 
>>>> Put another way: fix a generator g of group G of prime order q. Then,
>>>> hash_to_curve returns g^r in G, for r sampled uniformly* at random
>>>> in 0 <= r < q. Under the assumption that discrete log is hard in G,
>>>> hash_to_curve does not reveal r. Under the preimage and collision
>>>> resistance of the underlying hash function, one cannot choose any
>>>> particular r or find two inputs that hash to the same r.
>>>> 
>>>> I hope this helps clarify the security properties, and why focus
>>>> on low-order points at intermediate steps of the computation is not
>>>> relevant to the security of hash_to_curve as specified.
>>>> 
>>>> * uniformly except for some statistical distance less than 2^-100.
>>>> 
>>>> Regards,
>>>> 
>>>> -=rsw
>>>> 
>>>> _______________________________________________
>>>> CFRG mailing list
>>>> CFRG@irtf.org <mailto:CFRG@irtf.org>
>>>> https://www.irtf.org/mailman/listinfo/cfrg <https://www.irtf.org/mailman/listinfo/cfrg>
>>>> 
>>>> 
>>>> _______________________________________________
>>>> CFRG mailing list
>>>> CFRG@irtf.org <mailto:CFRG@irtf.org>
>>>> https://www.irtf.org/mailman/listinfo/cfrg <https://www.irtf.org/mailman/listinfo/cfrg>
>>> 
>>> -- 
>>> email: rstruik.ext@gmail.com <mailto:rstruik.ext@gmail.com> | Skype: rstruik
>>> cell: +1 (647) 867-5658 | US: +1 (415) 287-3867
>>> _______________________________________________
>>> CFRG mailing list
>>> CFRG@irtf.org <mailto:CFRG@irtf.org>
>>> https://www.irtf.org/mailman/listinfo/cfrg <https://www.irtf.org/mailman/listinfo/cfrg>
>> 
> 
> -- 
> email: rstruik.ext@gmail.com <mailto:rstruik.ext@gmail.com> | Skype: rstruik
> cell: +1 (647) 867-5658 | US: +1 (415) 287-3867