Re: [CFRG] Small subgroup question for draft-irtf-cfrg-hash-to-curve

Rene Struik <> Fri, 09 April 2021 18:50 UTC

Return-Path: <>
Received: from localhost (localhost []) by (Postfix) with ESMTP id 445753A2ADB for <>; Fri, 9 Apr 2021 11:50:00 -0700 (PDT)
X-Virus-Scanned: amavisd-new at
X-Spam-Flag: NO
X-Spam-Score: -2.097
X-Spam-Status: No, score=-2.097 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, NICE_REPLY_A=-0.001, RCVD_IN_DNSWL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: (amavisd-new); dkim=pass (2048-bit key)
Received: from ([]) by localhost ( []) (amavisd-new, port 10024) with ESMTP id kN1YdbEx1IDd for <>; Fri, 9 Apr 2021 11:49:54 -0700 (PDT)
Received: from ( [IPv6:2607:f8b0:4864:20::733]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by (Postfix) with ESMTPS id A19593A2AD8 for <>; Fri, 9 Apr 2021 11:49:54 -0700 (PDT)
Received: by with SMTP id g15so6850418qkl.4 for <>; Fri, 09 Apr 2021 11:49:54 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;; s=20161025; h=to:cc:references:from:subject:message-id:date:user-agent :mime-version:in-reply-to:content-language; bh=9iD4l15fUo2RUOALG+vJIt+qoVio+CoWbOy15ImWiN0=; b=pDj4tpcXRvtWXA0prCZ3f73wNs4kj8Gwvm8a7v+lIBpHZmfRVYiIv+/h6voaC/8spg gHSUg6CC/4JjMIW/tYixInZtz60SQ/cJDj/h+ShOKC4TiCI4Z0HiY3PKfSbV2HEAxE8R 0+TSxfxSP0D2fPwJ6qb/VeL3V19ubnHiFXlrLQ2soJWrLrD2UIPuY9dsyS/SKAxxbjoO Rh8l9NwvX15ySTeICeH7cQS8Lz8T9lsohwP55NLdSGfaAvnbA3KOG1WSf2ECxxnPyQ5w ZwVkzeJMKnVd2gNK7yVm/Ixo8Vnmxg5DHzBCfyxstHZBfRArOfuH/x2HiHc/9wEF06cp 852w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;; s=20161025; h=x-gm-message-state:to:cc:references:from:subject:message-id:date :user-agent:mime-version:in-reply-to:content-language; bh=9iD4l15fUo2RUOALG+vJIt+qoVio+CoWbOy15ImWiN0=; b=mORAUwiaMQuUTWfMm2zQ2jl1qpyJeiboiHcd5rfU3coPMvqNAgr2JVUMCXYG7+krRr lwcpqqYjrdl88PK3N33SPEQq6yk1G4cCQCHGF4+IaUg1lRDfdf0dgvAI2lg8YyHuVZnG kIw0iUDCem0aFlVBb7Pwm61l3g7lLD6AiI8Sm6S02GUKZYbKHFmkr7FPO2D2Q9luHfQI KejiJ1GCSo4QL8GafIw53lEMEEYLFKusS/jkernNqfoI0NVhqeI1pXwwW55zw8RrEdOC 6S67jXOwk38PXynk8j/y2JRkigNtCegDwpqJh2hS/o0jaMvOk9cr2IqMRrsiPDpYIYUI pyEQ==
X-Gm-Message-State: AOAM531QBdA/0KSCSsakhRTkCA6zPyBLWtKYGx2qt3Y8uIpQrk/0boX2 CZQ4FFY+jCW/nLNoYA1223C2mtF7UH0=
X-Google-Smtp-Source: ABdhPJy3DRq12GCd7Bardjm2XMUpAQxdaf1Z4I+eEuzcD6tgPMcOvcQ/4p+D+0fqvrmKHZnctjLvWA==
X-Received: by 2002:a37:8744:: with SMTP id j65mr2885407qkd.304.1617994192574; Fri, 09 Apr 2021 11:49:52 -0700 (PDT)
Received: from ?IPv6:2607:fea8:8a0:1397:adbb:7488:638:712e? ([2607:fea8:8a0:1397:adbb:7488:638:712e]) by with ESMTPSA id t24sm2360326qto.23.2021. (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 09 Apr 2021 11:49:52 -0700 (PDT)
To: "Hao, Feng" <>
Cc: CFRG <>
References: <> <> <> <> <> <> <> <>
From: Rene Struik <>
Message-ID: <>
Date: Fri, 9 Apr 2021 14:49:49 -0400
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.9.0
MIME-Version: 1.0
In-Reply-To: <>
Content-Type: multipart/alternative; boundary="------------1402388D883A3F662370125E"
Content-Language: en-US
Archived-At: <>
Subject: Re: [CFRG] Small subgroup question for draft-irtf-cfrg-hash-to-curve
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Crypto Forum Research Group <>
List-Unsubscribe: <>, <>
List-Archive: <>
List-Post: <>
List-Help: <>
List-Subscribe: <>, <>
X-List-Received-Date: Fri, 09 Apr 2021 18:50:00 -0000

Hi Feng et al:

I think one should not bank on "rare" exceptions being rare, since 
history of cryptographic constructs is full of corner cases that can be 
exploited to pull the rug. Fortunately, it is not hard to design the 
curve mapping so as to avoid low-order points, by design -- at least 
when the co-factor h is small (in fact, I looked at this two years ago). 
Here, for this "single" map this can be done without a check of inputs 
or outputs and at very minor cost. For the "summation" map, this may 
require a simple (rare) output check.

I do think one should bear in mind that some of the mappings P( ) have a 
rich structure, e.g., with Montgomery curves and if mapping the pair of 
field elements (t1,t2):=(delta*u1^2, delta*u2^2), where delta is a fixed 
non-square element of GF(p), to R:=P(t1)+P(t2), this yields the identity 
element of the curve if delta*u1*u2=1 [1] or if u1+u2=0. Combining this 
with the hash function (u1,u2):=(H(m1),H(m2)), this condition occurs if 

delta*H(m1)*H(m2)=1 (mod p) or H(m1)+H(m2)=0 (mod p). While this is 
unlikely to occur if the hash function H can be modeled as a random 
functions and while m1 and m2 are derived from message m here, it would 
be hard to argue that practical instantiations of hash functions have 
been studied that well with these properties in mind. So, perhaps, one 
should look beyond modeling things abstractly as "ROM", here.

Two specific comments:
a) if delta above is a "free variable", one can trivially compute the 
pre-image t of a low-order point and realize delta*H(m)^2=t by picking 
delta after the fact. With the hash to curve draft, delta is called Z 
and is not required to be fixed, so this attack would indeed be possible.
b) as Mike Hamburg suggested, "hash to field is an essential element of 
hash to curve". It is somewhat odd that draft-irtf-cfrg-cpace-01 picks 
and chooses some elements of the constructions in 
draft-irtf-cfrg-hash-to-curve-10, where it picks another "hash to field" 
function and ignores the "multiplication by the co-factor" in the end. 
If the hash-to-curve draft were that practical, this deviation should 
not have been necessary. It occurs to me that harmonization potential 
should be further explored. {In my mind, the co-factor multiplication 
can be safely omitted, since could be added by an application that needs 
this. The other complexities seem due to squeezing in both extension 
field and prime field constructions in the same draft: are we sure it 
satisfies both use cases for password-based protocols and for pairings?}

Best regards, Rene

[1] this ignores rules to determine whether the output is a point or its 

On 2021-04-09 12:59 p.m., Mike Hamburg wrote:
> Hello Feng,
>> On Apr 9, 2021, at 1:18 PM, Hao, Feng < 
>> <>> wrote:
>> Hi Loup and Mike,
>> Thanks for the comments. As your arguments have something in common, 
>> so I will reply together.
>> First, let me re-iterate my two main questions:
>>  1. Do we agree we need an algorithm specification be “complete”,
>>     covering all usual and edge cases? In the current hash-to-curve
>>     draft, there is a failure mode that is non-recoverable. I’m
>>     concerned about this as I haven’t come across something like this
>>     before. My question is not on the probability, but the
>>     “completeness” of an algorithm specification.
> An algorithm specification should be complete, though many algorithms 
> do have paths that end in failure.  The hash-to-curve specification 
> is, as far as I’m aware, complete.  It does not end in failure if it 
> returns the identity point.  That point is simply the correct output 
> for certain inputs.
> I don’t know if the same holds for OPAQUE or CPace: for all I know, 
> they may have specification holes and/or end in failure in that case.
>> 1.
>>  2. If a protocol is provably secure under some assumption, do we
>>     agree the assumption should be “precise”? There is 
a clear
>>     discrepancy in what’s assumed in the CPace, OPAQUE and maybe
>>     other protocols and what’s actually provided by the concrete
>>     construction. Either should the assumption in the security proof
>>     be modified to fit the real construction, or the construction
>>     modified to fit the actual assumption in the proof.
> Ideally assumptions should be precise, though in practice they are 
> not: they assume that an adversary cannot feasibly solve a certain 
> mathematical problem with more than negligible probability, where 
> “feasibly” and “more than negligible” are a bit fuzzy.  That said, the 
> mathematical problem being solved should be precise.
> Security proofs, which might more accurately be called security 
> arguments, usually take the following form:
> *   Given an algorithm A which breaks the scheme’s security goals in 
> some model with probability epsilon,
> *   We can create an algorithm B, running in “not much more time or 
> other resources” than A
> **   (or some other bound; “not much more time" is not precise but 
> pinning it down exactly is a huge pain for usually not much gain),
> *   Which solves an assumed-hard math problem with some probability delta
> *   Where delta is bounded from below based on epsilon and perhaps 

> some other factors **   (e.g. the number of times the adversary calls 
> the hash function, or the number of users of the system).
> If hitting the point at infinity affects the probability of B’s 

> success — in some cases it will, and in some (e.g. PAK) it won’t — 
> then the proof must account for this in its bound.  However, if the 
> adversary can’t cause this to happen, then the term from this effect 
> will be on the order of q/p, where q is the amount of work that honest 
> parties do, and p is the order of the group.  Such terms are typically 
> much less than the epsilon for a reasonable attack strategy, so they 
> won’t significantly worsen the bounds in the security proof.
>> Both of you mention that the probability of falling into a small 
>> subgroup in common elliptic curves is negligible, and hence should be 
>> ignored. However, we should be aware of the function creep once the 
>> map-to-curve functions are deployed at scale. For example, if a user 
>> already has an input value in F_p, does he still have to go through 
>> the hash_to_field step (which is complex on its own)? A conscious 
>> implementer will quite likely skip this step. But that will give more 
>> freedom to an attacker to engineer the input such that the output 
>> falls into a small subgroup.
> Hash_to_field is an essential part of hash_to_curve.  These protocols 
> are often completely insecure if you skip that step and allow the 
> attacker to choose the inputs, small subgroup or no.  Also, if an 
> implementer skips this step, then their implementation won’t work: or 
> rather, it won’t interoperate with anyone else’s.
>> The negligible probability in your analysis will no longer hold. For 
>> a safe implementation, precluding small-group points should be built 
>> into the map-to-curve function itself (in constant time of course). I 
>> know that’s hard, but that’s exactly why the hash-to-curve function 
>> has remained so tricky to get right in PAKE since early 2000s.
>> Cheers,
>> Feng
> We could always say something like: if the output would be the 
> identity then instead output the generator.  But again, I’m not aware 
> of a case where this meaningfully affects the security, especially 
> because even finding an input that maps to the identity likely 
> constitutes an attack on the hash function.
> Furthermore, the goal of hash-to-curve is usually to find a point of 
> unknown discrete log.  If the attacker can guess the discrete log of 
> the hash image, or if honest parties choose a point whose discrete log 
> the adversary somehow knows — which happens with the same probability 
> 1/q as finding the identity point — then in most cases the adversary 
> wins.  So again, changing the identity to some other point often 
> doesn’t change anything.
> Overall, while I like to remove rough edges, I’m not sure this is a 
> rougher edge than a check that changes the identity point to some 
> other point.
> Cheers,
> — Mike
> _______________________________________________
> CFRG mailing list

email: | Skype: rstruik
cell: +1 (647) 867-5658 | US: +1 (415) 287-3867