[CFRG] Prehashing in EdDSA vs ML-KEM

Phillip Hallam-Baker <phill@hallambaker.com> Thu, 29 August 2024 16:55 UTC

From: Phillip Hallam-Baker <phill@hallambaker.com>
Date: Thu, 29 Aug 2024 12:55:15 -0400
Message-ID: <CAMm+LwjEgPEVViySRY1HRe2EOHSAvQsfSetvpJko7ORzcDX2BQ@mail.gmail.com>
To: IETF SAAG <saag@ietf.org>, IRTF CFRG <cfrg@irtf.org>
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cfrg/EOd17zNEERUjuZ0IF0S4pFC67lA>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cfrg>

There is a lot of confusion among implementers of ML-DSA about the
difference between the 'pure' and 'prehash' versions. Having worked on my
implementation, I am now convinced we just got it wrong in Ed25519 and
Ed448, which do things differently.

The protocol concern here is binding the message digest to the signature.
Butler Lampson and Ron Rivest have both told me this is essential, and I
consider it essential too. Why would we even have an argument over whether
a digest substitution attack is a concern? It just is.

As a sidebar, let me point out that semantic substitution is also a
legitimate concern, and that might well mean that we end up making some
different choices in protocol designs.

As a protocol designer, I see two use cases of interest. One is where I
get to pick the content digest without restriction, and the other is where
the choice is made for me because the content is already digested at the
point I am generating the signature. Often this is because it is being
digested for multiple purposes. In the DARE construction I use in the Mesh,
I digest the content of every envelope with SHA-2-512 or SHA-3-512, and I
use that digest both to construct the Merkle tree over the message sequence
and as an input to the signature.


The objective of the prehash version of ML-DSA (HashML-DSA) is to prevent a
digest substitution attack by binding the digest algorithm to the message
digest. What it is doing in effect is creating a manifest, prefixing a flag
saying 'manifest', and signing that with ML-DSA-Internal.
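
As a rough illustration of that manifest-like framing, here is a sketch of
the byte string that gets handed to the internal signing function, going by
the layout in FIPS 204: a one-byte domain separator (0 for pure, 1 for
prehash), a one-byte context length, the context, and then either the raw
message or the DER-encoded digest OID followed by the digest. The function
names are made up for illustration, the OID shown is the one for SHA-512,
and no actual ML-DSA signing is performed.

```python
import hashlib

# DER encoding of the SHA-512 OID (2.16.840.1.101.3.4.2.3).
SHA512_OID = bytes.fromhex("0609608648016503040203")


def pure_framing(message: bytes, ctx: bytes = b"") -> bytes:
    """Hypothetical helper: input to the internal signer for 'pure' signing."""
    if len(ctx) > 255:
        raise ValueError("context must be at most 255 bytes")
    # 0x00 marks 'pure'; the message itself follows the context.
    return b"\x00" + bytes([len(ctx)]) + ctx + message


def prehash_framing(digest: bytes, ctx: bytes = b"") -> bytes:
    """Hypothetical helper: input to the internal signer for 'prehash' signing."""
    if len(ctx) > 255:
        raise ValueError("context must be at most 255 bytes")
    if len(digest) != 64:
        raise ValueError("expected a 64-byte SHA-512 digest")
    # 0x01 marks 'prehash'; the OID names the digest algorithm, so the
    # algorithm is covered by the signature along with the digest value.
    return b"\x01" + bytes([len(ctx)]) + ctx + SHA512_OID + digest


# Content that was already digested for some other purpose.
existing_digest = hashlib.sha512(b"envelope content").digest()
to_sign = prehash_framing(existing_digest, ctx=b"example")
```

The point of the leading byte is that no prehash input can ever collide
with a pure input, and the point of the OID is that a SHA-512 digest cannot
be reinterpreted as a digest under some other algorithm.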

The objective of ML-DSA PURE is to allow direct access to ML-DSA-Internal
to sign messages when the content hasn't been prehashed already. People
have suggested quite a few purposes for this but the only purpose I see a
good argument for is to create *more expressive* manifest formats.


For example, let us say we are signing a JPG. We probably want to say that
we are signing a JPG, and the principled way to do that in JOSE or COSE
would be to build a manifest with the content digest, content digest
algorithm and content type.
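
A minimal sketch of such a manifest, assuming made-up field names and plain
JSON with sorted keys as a stand-in for a proper canonical encoding (this
is not the JOSE or COSE wire format):

```python
import hashlib
import json


def build_manifest(content: bytes, content_type: str,
                   digest_alg: str = "sha512") -> bytes:
    """Hypothetical helper: returns the bytes that would be signed 'pure'."""
    digest = hashlib.new(digest_alg, content).hexdigest()
    manifest = {
        "content_type": content_type,  # e.g. "image/jpeg"
        "digest_alg": digest_alg,      # names the algorithm explicitly
        "digest": digest,              # hex digest of the content
    }
    # Sorted keys and fixed separators give a deterministic serialization,
    # so signer and verifier compute the same bytes.
    return json.dumps(manifest, sort_keys=True,
                      separators=(",", ":")).encode("utf-8")


# The first three bytes of a JPEG file stand in for the real content.
manifest_bytes = build_manifest(b"\xff\xd8\xff", "image/jpeg")
```

Because the content type and digest algorithm sit inside the signed bytes,
changing either one changes what is signed.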

[Yes, I am aware that you could use context for this, but a much better use
for context is to declare your manifest format.]

In the DARE construction, I am just updating the code so the signature is
over the envelope content digest AND the Merkle tree head. So one signature
does both.


OK so what went wrong in Ed25519 and Ed448? The problem was that we were
focused on the security of the signature and NOT on how it fits into other
protocols. As a result, we don't actually have a functional scheme for
signing content that was already digested with a specific hash.

What seems to have happened in practice is that instead of accepting the
hash choices picked by EdDSA, the implementers have done hash-then-sign in
the exact same manner as for RSA and passed the result in to be signed by
an arbitrary choice of the prehash or pure variant.
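
To make the gap concrete, here is a small sketch of the difference between
passing a bare digest to the signer and passing a digest that carries its
algorithm with it. Both function names and the length-prefixed label
format are invented for illustration, and no EdDSA signing is performed:

```python
import hashlib


def naive_signing_input(content: bytes, digest_alg: str) -> bytes:
    """Hypothetical: only the raw digest bytes are passed to the signer."""
    return hashlib.new(digest_alg, content).digest()


def bound_signing_input(content: bytes, digest_alg: str) -> bytes:
    """Hypothetical: the digest is prefixed with a label naming its algorithm."""
    label = digest_alg.encode("ascii")
    # Length-prefix the label so the boundary with the digest is unambiguous.
    return bytes([len(label)]) + label + hashlib.new(digest_alg, content).digest()


content = b"some content"

# Both are 64 opaque bytes; nothing in the signed data says which
# algorithm produced them, so that has to be conveyed out of band.
naive_sha2 = naive_signing_input(content, "sha512")
naive_sha3 = naive_signing_input(content, "sha3_512")

# Here the algorithm name is part of what gets signed.
bound_sha2 = bound_signing_input(content, "sha512")
bound_sha3 = bound_signing_input(content, "sha3_512")
```

With the naive form, the verifier has to be told which digest algorithm to
use by something the signature does not cover.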

So it appears that we probably have protocols whose use of EdDSA is
effectively broken. There is no viable attack right now, but this is
something we should fix.

Certificates are probably the exception because we care enough about those
to make sure we get them right.


One approach is to update the RFC so that it matches the functionality of
ML-DSA. I think we should do that regardless simply to ease deployment of
both. We are going to be having this issue come up again and again.

A new revision of the RFC gives us an opportunity to tell people the
correct and safe way to deal with content that has already been hashed with
a specific digest. But the problem there is that EdDSA has already made its
way into a lot of crypto APIs, and it will take years for any update to
make its way out.

So we need to look at using manifests for all content signatures. And that
is actually the better way to go because we can address multiple
substitution attacks, not just the one least likely to be attempted. And we
can do things like binding in notary and encryption witness values and
doing all sorts of other useful stuff.