[openpgp] Re: text vs. binary in an OpenPGP "Signed Message"

Justus Winter <justus@sequoia-pgp.org> Mon, 03 March 2025 09:49 UTC

From: Justus Winter <justus@sequoia-pgp.org>
To: Daniel Kahn Gillmor <dkg@fifthhorseman.net>, openpgp@ietf.org
In-Reply-To: <871pvq4yhe.fsf@fifthhorseman.net>
References: <871pvq4yhe.fsf@fifthhorseman.net>
Date: Mon, 03 Mar 2025 10:49:30 +0100
Message-ID: <87o6yiigut.fsf@europ.lan>
Archived-At: <https://mailarchive.ietf.org/arch/msg/openpgp/ugmuQ1XkSK-t5wkv67ryXYmdeFE>

Hi :)

Daniel Kahn Gillmor <dkg@fifthhorseman.net> writes:

> So, we still have four different valid kinds of "Signed Message" with a
> single signature:
>
> - (a) OPS0 LITb SIG0
> - (b) OPS1 LITt SIG1
> - (c) OPS0 LITt SIG0
> - (d) OPS1 LITb SIG1
>
> Every implementation i've tested appears to agree how to verify (a) and
> (b).  I've yet to find an implementation that generates (c).  But i've
> found multiple implementations that produce (d):
>
>     https://gitlab.com/sequoia-pgp/sequoia-sop/-/issues/46 and
>     https://github.com/pgpainless/pgpainless/issues/465, so far…
>
> There are at least two different interpretations about what to do with
> (d): GnuPG attempts to verify the unmodified bytestream within the LITb,
> without converting the line-endings to CRLF, while every other
> implementation i've tried (including at least RNP, Sequoia, PGPainless,
> rpgp, and gosop) appears to try to convert the bytestream to CRLF before
> verification.
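
To make the disagreement about (d) concrete, here is a minimal Python
sketch. It assumes the notation above means a text signature (type 1)
over a literal data packet marked binary. The function names and the
choice of SHA-256 are illustrative only, and the sketch hashes just the
document bytes; a real OpenPGP signature hash also covers signature
trailer data, which is omitted here. Bare CR bytes are left alone, which
is a further assumption.

```python
import hashlib


def digest_as_is(literal: bytes) -> str:
    """Hash the literal data bytes exactly as stored (first reading of (d))."""
    return hashlib.sha256(literal).hexdigest()


def digest_crlf(literal: bytes) -> str:
    """Hash a copy in which every line ending is CRLF (second reading of (d))."""
    # Collapse existing CRLF to LF first so it is not doubled, then expand.
    canonical = literal.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    return hashlib.sha256(canonical).hexdigest()


lf_only = b"hello\nworld\n"
crlf = b"hello\r\nworld\r\n"

# With LF-only content the two readings hash different byte streams ...
print(digest_as_is(lf_only) == digest_crlf(lf_only))
# ... while content that already uses CRLF hashes identically under both.
print(digest_as_is(crlf) == digest_crlf(crlf))
```

So the two readings only disagree when the literal data contains line
endings that are not already CRLF.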

Just to shed some light on what we do:

- We don't use the literal data's format specifier for anything.  It
  defaults to binary, and you can set it to anything when producing PGP
  data.  We parse it, and you can query it on consumption, but nothing
  in Sequoia will act on the format specifier.

- Notably, we make no attempt to validate whether the message is UTF-8
  encoded, nor do we do any kind of encoding or line-ending conversion.
  For us, literal data is a stream of bytes, and that is what we'll
  hand down to consumers.

- In fact, we consider the literal data metadata a bug in the
  specification: it is neither protected by signatures, nor does it have
  documented semantics.

- When we compute or verify a text signature, we transform the data
  stream on the fly for hashing purposes only; the downstream consumer
  gets the data as is.

In short, if you use Sequoia, the bytes that go in come out again,
unvalidated and unchanged.
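
The "transform for hashing, pass the data through as is" behaviour can
be sketched as follows. This is not Sequoia's code or API: it is a
minimal Python illustration, the name `tee_text_hash` is made up, and it
assumes that canonicalization means turning bare LF into CRLF while
leaving existing CRLF and bare CR untouched. The signature trailer that
a real implementation also hashes is left out.

```python
import hashlib
from typing import Iterable, Iterator


def tee_text_hash(chunks: Iterable[bytes], digest) -> Iterator[bytes]:
    """Yield every chunk unchanged; feed a CRLF-normalized copy to `digest`."""
    prev_cr = False  # True if the last byte seen was CR (may span chunks)
    for chunk in chunks:
        normalized = bytearray()
        for byte in chunk:
            if byte == 0x0A and not prev_cr:
                # Bare LF: hash it as CRLF.
                normalized += b"\r\n"
            else:
                # Anything else, including the LF of an existing CRLF.
                normalized.append(byte)
            prev_cr = byte == 0x0D
        digest.update(bytes(normalized))
        yield chunk  # the consumer sees the original bytes


h = hashlib.sha256()
# The CRLF pair is deliberately split across two chunks.
received = b"".join(tee_text_hash([b"a\nb\r", b"\nc\n"], h))
```

Here `received` holds the original bytes, while `h` has been fed the
CRLF form. Tracking `prev_cr` across chunks keeps a CRLF pair that
straddles a chunk boundary from being expanded twice.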

> So i guess my questions for the WG are:
>
> - Is it a bug if a signer produces (c) or (d)?

No.

> - Should a verifier reject (c) or (d) automatically as malformed?

No.

> - If not, should a verifier that encounters (d) attempt to apply CRLF
>   line endings to the LITb?

No.

> What do you think?

The literal data packet's header fields are a misfeature, and should be
treated as such.  Implementations should produce literal data packets
with "inert" values: by default, Sequoia uses the binary format, a
zero-length file name, and a timestamp of 0.  Implementations should not
act on the values of these fields on consumption.  The next revision of
OpenPGP should reflect this more clearly.
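
As a rough illustration of what "inert" values look like on the wire,
here is a small Python sketch of a literal data packet body (format
octet, file name length and name, four-octet timestamp, then the data).
It covers only the body, not the packet header or length encoding, and
the helper names are hypothetical rather than taken from any library.

```python
import struct
from typing import Tuple


def inert_literal_body(data: bytes) -> bytes:
    """Build a literal data packet body with inert metadata."""
    fmt = b"b"       # format octet: binary
    filename = b""   # zero-length file name
    timestamp = 0    # timestamp 0
    return (fmt + bytes([len(filename)]) + filename
            + struct.pack(">I", timestamp) + data)


def split_literal_body(body: bytes) -> Tuple[Tuple[bytes, bytes, int], bytes]:
    """Parse the metadata so it can be queried, but return the data as is."""
    fmt = body[0:1]
    name_len = body[1]
    name = body[2:2 + name_len]
    (timestamp,) = struct.unpack(">I", body[2 + name_len:6 + name_len])
    payload = body[6 + name_len:]
    # The metadata is reported to the caller and otherwise ignored:
    # no decoding, validation, or line-ending conversion of the payload.
    return (fmt, name, timestamp), payload


body = inert_literal_body(b"hi\n")
meta, payload = split_literal_body(body)
```

The parser makes the fields available but never branches on them, so a
packet marked as text yields exactly the same payload bytes as one
marked as binary.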

Best,
Justus