Re: [CFRG] Small subgroup question for draft-irtf-cfrg-hash-to-curve

Mike Hamburg <mike@shiftleft.org> Fri, 09 April 2021 16:59 UTC

Return-Path: <mike@shiftleft.org>
X-Original-To: cfrg@ietfa.amsl.com
Delivered-To: cfrg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id C9D823A27B1 for <cfrg@ietfa.amsl.com>; Fri, 9 Apr 2021 09:59:41 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.305
X-Spam-Level:
X-Spam-Status: No, score=-1.305 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, HTML_MESSAGE=0.001, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=no autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (1024-bit key) header.d=shiftleft.org
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id SMAY9jn1o_F1 for <cfrg@ietfa.amsl.com>; Fri, 9 Apr 2021 09:59:37 -0700 (PDT)
Received: from doomsayer.shiftleft.org (unknown [54.219.126.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 024833A27AB for <cfrg@irtf.org>; Fri, 9 Apr 2021 09:59:36 -0700 (PDT)
Received: from [192.168.7.53] (unknown [198.207.18.242]) (Authenticated sender: mike) by doomsayer.shiftleft.org (Postfix) with ESMTPSA id DBCBABB868; Fri, 9 Apr 2021 16:59:32 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=shiftleft.org; s=sldo; t=1617987573; bh=6WB630ssC94LjfXCDP0rsG7XxlgYEEtQprV1TVnp7sI=; h=From:Subject:Date:In-Reply-To:Cc:To:References:From; b=FIq+td1Zw73OVN23COMGuA7tLsY52Apk5/SWKm5JPq0B+9G4SeZXW6K9ORomK74EO gG+zAF4qc3aRp8n9SlGh/ueY9rmjzex7+bADEUxTG1ZJbtiXKl92ptW7TRPZCfxXrM bQW71FdLTytxSeIiSfX9K1SvJid9VSYYTIgHuKnk=
From: Mike Hamburg <mike@shiftleft.org>
Message-Id: <6F4F0566-3465-4C9C-8993-1B3FDFDDD792@shiftleft.org>
Content-Type: multipart/alternative; boundary="Apple-Mail=_296C8CE4-0276-4EDC-97D5-4F702EEB83C4"
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.60.0.2.21\))
Date: Fri, 09 Apr 2021 13:59:30 -0300
In-Reply-To: <VI1SPR01MB0357AE729116A79C8DF70516D6739@VI1SPR01MB0357.eurprd01.prod.exchangelabs.com>
Cc: Loup Vaillant-David <loup@loup-vaillant.fr>, CFRG <cfrg@irtf.org>
To: "Hao, Feng" <Feng.Hao@warwick.ac.uk>
References: <e270e62d-941d-0a87-7dc9-cf80f73b5aeb@jacaranda.org> <d0778523-5f5d-4327-b795-279918c1899c@www.fastmail.com> <CAMr0u6=PBX1W5zQFmpxKQ=ViUXN9QK00BREL4M0=2HOkaXaiZw@mail.gmail.com> <VI1SPR01MB03573585C37B871D200ECC23D6739@VI1SPR01MB0357.eurprd01.prod.exchangelabs.com> <4590aaa512acf5a482c9890ebe48f1760e5831a5.camel@loup-vaillant.fr> <F9593D27-3244-470E-89BE-85215B2DC9E7@shiftleft.org> <VI1SPR01MB0357AE729116A79C8DF70516D6739@VI1SPR01MB0357.eurprd01.prod.exchangelabs.com>
X-Mailer: Apple Mail (2.3654.60.0.2.21)
Archived-At: <https://mailarchive.ietf.org/arch/msg/cfrg/KhxYzE5RX-VyNe7Ma0lm5rziTaY>
Subject: Re: [CFRG] Small subgroup question for draft-irtf-cfrg-hash-to-curve
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <https://www.irtf.org/mailman/options/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cfrg/>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <https://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Fri, 09 Apr 2021 16:59:42 -0000

Hello Feng,

> On Apr 9, 2021, at 1:18 PM, Hao, Feng <Feng.Hao@warwick.ac.uk> wrote:
> 
> Hi Loup and Mike,
>  
> Thanks for the comments. As your arguments have something in common, so I will reply together.
>  
> First, let me re-iterate my two main questions:
>  
> Do we agree we need an algorithm specification be “complete”, covering all usual and edge cases? In the current hash-to-curve draft, there is a failure mode that is non-recoverable. I’m concerned about this as I haven’t come across something like this before. My question is not on the probability, but the “completeness” of an algorithm specification.

An algorithm specification should be complete, though many algorithms do have paths that end in failure.  The hash-to-curve specification is, as far as I’m aware, complete.  It does not end in failure if it returns the identity point.  That point is simply the correct output for certain inputs.

I don’t know if the same holds for OPAQUE or CPace: for all I know, they may have specification holes and/or end in failure in that case.

> If a protocol is provably secure under some assumption, do we agree the assumption should be “precise”? There is a clear discrepancy in what’s assumed in the CPace, OPAQUE and maybe other protocols and what’s actually provided by the concrete construction. Either should the assumption in the security proof be modified to fit the real construction, or the construction modified to fit the actual assumption in the proof.

Ideally assumptions should be precise, though in practice they are not: they assume that an adversary cannot feasibly solve a certain mathematical problem with more than negligible probability, where “feasibly” and “more than negligible” are a bit fuzzy.  That said, the mathematical problem being solved should be precise.

Security proofs, which might more accurately be called security arguments, usually take the following form:
*   Given an algorithm A which breaks the scheme’s security goals in some model with probability epsilon,
*   We can create an algorithm B, running in “not much more time or other resources” than A
**   (or some other bound; “not much more time" is not precise but pinning it down exactly is a huge pain for usually not much gain),
*   Which solves an assumed-hard math problem with some probability delta
*   Where delta is bounded from below based on epsilon and perhaps some other factors **   (e.g. the number of times the adversary calls the hash function, or the number of users of the system).

If hitting the point at infinity affects the probability of B’s success — in some cases it will, and in some (e.g. PAK) it won’t — then the proof must account for this in its bound.  However, if the adversary can’t cause this to happen, then the term from this effect will be on the order of q/p, where q is the amount of work that honest parties do, and p is the order of the group.  Such terms are typically much less than the epsilon for a reasonable attack strategy, so they won’t significantly worsen the bounds in the security proof.

> Both of you mention that the probability of falling into a small subgroup in common elliptic curves is negligible, and hence should be ignored. However, we should be aware of the function creep once the map-to-curve functions are deployed at scale. For example, if a user already has an input value in F_p, does he still have to go through the hash_to_field step (which is complex on its own)? A conscious implementer will quite likely skip this step. But that will give more freedom to an attacker to engineer the input such that the output falls into a small subgroup.

Hash_to_field is an essential part of hash_to_curve.  These protocols are often completely insecure if you skip that step and allow the attacker to choose the inputs, small subgroup or no.  Also, if an implementer skips this step, then their implementation won’t work: or rather, it won’t interoperate with anyone else’s.

> The negligible probability in your analysis will no longer hold. For a safe implementation, precluding small-group points should be built into the map-to-curve function itself (in constant time of course). I know that’s hard, but that’s exactly why the hash-to-curve function has remained so tricky to get right in PAKE since early 2000s.
>  
> Cheers,
> Feng

We could always say something like: if the output would be the identity then instead output the generator.  But again, I’m not aware of a case where this meaningfully affects the security, especially because even finding an input that maps to the identity likely constitutes an attack on the hash function.

Furthermore, the goal of hash-to-curve is usually to find a point of unknown discrete log.  If the attacker can guess the discrete log of the hash image, or if honest parties choose a point whose discrete log the adversary somehow knows — which happens with the same probability 1/q as finding the identity point — then in most cases the adversary wins.  So again, changing the identity to some other point often doesn’t change anything.

Overall, while I like to remove rough edges, I’m not sure this is a rougher edge than a check that changes the identity point to some other point.

Cheers,
— Mike