[openpgp] Re: I-D list for Open Specification for Pretty Good Privacy notification: Changes to draft-gallagher-openpgp-code-point-exhaustion

Andrew Gallagher <andrewg@andrewg.com> Wed, 19 March 2025 10:00 UTC

From: Andrew Gallagher <andrewg@andrewg.com>
In-Reply-To: <87tt7p8n9d.fsf@europ.lan>
Date: Wed, 19 Mar 2025 09:59:30 +0000
Message-Id: <D4D840C5-CD2F-47D8-8101-DF3F950B3ECB@andrewg.com>
References: <174231559348.277.2581535826712330509@dt-celery-57d64c6895-fcmg2> <B321DC63-56E0-44C2-96AA-D60205C148B2@andrewg.com> <87tt7p8n9d.fsf@europ.lan>
To: Justus Winter <justus@sequoia-pgp.org>
CC: IETF OpenPGP <openpgp@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/openpgp/Vkp8jzimlRLs78fjioxHI0cBWGI>

On 19 Mar 2025, at 07:58, Justus Winter <justus@sequoia-pgp.org> wrote:
> 
> I'm not a fan of this.  First, we discussed code point exhaustion while
> working on RFC9580, and decided that it is not a concern.  We are
> nowhere near exhausting any code point space, not even the relatively
> tiny packet tag space.

Sure, but that decision assumed we would be allocating code points one at a time. If we agree to reserve ranges for certain kinds of algorithms (such as persistent symmetric), then that assumption may not be as safe as we previously thought.

> Then, you couldn't use multi-byte code points in any existing packet
> version, because that would turn what every software on this planet
> expects to be a fixed-size field into a variable-sized field, and
> failing to understand the scheme leads to catastrophic loss of parser
> synchronization (maybe with security implications).

The use of octets >=128 for the extension scheme ensures that parser desynchronisation always results in the emission of an unallocated code point. In most cases the remainder of the packet cannot be parsed, even in principle, if the code point is unknown; in the remaining cases the errors are minor and have no security implications. I checked the wire format of every packet and subpacket that contains a code point, and the results are listed in section 5 of the document. If I have missed anything, please let me know!
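
To make the desynchronisation argument concrete, here is a minimal sketch. It is not the encoding from the draft: the registry contents, the three-octet value, and the function names are all hypothetical. The only assumption carried over is that every octet used by the extension scheme is >=128 and that no one-octet code point is allocated in that range. Under that assumption, a legacy parser that reads a single fixed-size octet reports "unallocated" whichever octet of the extended encoding it lands on.

```python
# Hypothetical illustration only: the registry contents and the three-octet
# value below are assumptions, not taken from the draft.

RESERVED_MIN = 0x80  # assumption: octets >= 128 are reserved for the extension scheme

# Hypothetical one-octet registry: every allocated code point is below 128.
ALLOCATED = {1: "algo-a", 2: "algo-b", 9: "algo-c"}


def legacy_read_code_point(buf: bytes, offset: int) -> str:
    """Read one fixed-size (single-octet) code point, as a legacy parser would."""
    octet = buf[offset]
    return ALLOCATED.get(octet, "unallocated")


def is_extension_octet(octet: int) -> bool:
    """True if the octet falls in the range reserved for the extension scheme."""
    return octet >= RESERVED_MIN


# A hypothetical multi-octet code point in which every octet is >= 128.
extended = bytes([0x81, 0x92, 0xA3])

# Wherever a legacy parser starts reading, it sees an unallocated code point.
results = [legacy_read_code_point(extended, i) for i in range(len(extended))]
```

The point of the sketch is only that a legacy parser can never mistake part of an extended code point for an allocated one-octet value; what it then does with the rest of the packet is the per-packet question covered in section 5 of the document.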

> Therefore, multi-byte code points can only be used in newer packet
> versions.  But, if we design new packet versions, we can just make them
> use a two-byte field for the code point in question, and say that the
> new algorithms must only be used with the new packet version.

This would be a significant breaking change though, and the old and new versions would have to coexist for an extended transition period. A backwards-compatible extension would be more complex to implement, but would be transparent to the end user. And OpenPGP does not have a good track record for managing breaking changes… ;-)

> You bring up the comparison with UTF-8.  For text, we are interested in
> storing it efficiently, and are okay with a complex encoding.  I don't
> think this holds for our code points.  And even if storage efficiency
> were a concern, a two-byte code point, or even a four-byte code point,
> compares very favorably with OIDs.

I don’t believe storage efficiency is a notable property of UTF-8 when used for natural language; the key properties of UTF-8 are backwards compatibility with ASCII and self-synchronisation. For many non-Latin scripts (CJK, for example), UTF-16 is the more compact encoding.
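
As a rough illustration of those UTF-8 properties (the sample strings and helper names here are my own, chosen only for the example), the sketch below compares encoded sizes and shows how a reader that lands in the middle of a character can skip continuation octets to find the next boundary.

```python
# Sample strings chosen only for illustration.
samples = {"ascii": "code point", "japanese": "符号点"}


def encoded_sizes(text: str) -> tuple[int, int]:
    """Return (UTF-8 length, UTF-16 length) in octets, without a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


sizes = {name: encoded_sizes(text) for name, text in samples.items()}


def is_continuation(octet: int) -> bool:
    """UTF-8 continuation octets always have the bit pattern 10xxxxxx."""
    return octet & 0xC0 == 0x80


def resync(buf: bytes, offset: int) -> int:
    """Advance from an arbitrary offset to the start of the next character."""
    while offset < len(buf) and is_continuation(buf[offset]):
        offset += 1
    return offset


# Backwards compatibility: ASCII text is byte-for-byte identical in UTF-8.
ascii_unchanged = "code point".encode("utf-8") == b"code point"

# Self-synchronisation: starting mid-character, skip to the next boundary.
buf = "符号".encode("utf-8")       # two characters, three octets each
next_boundary = resync(buf, 1)     # offset 1 is inside the first character
```

Which encoding is smaller depends on the script: the ASCII sample is half the size in UTF-8, while the Japanese sample is smaller in UTF-16.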

Similarly, storage efficiency was not a consideration here - in fact, I specifically avoided two-octet encodings in order to improve the self-synchronisation properties, at the expense of storage efficiency.

> Finally, your solution to the (again, non-existent) code point scarcity
> makes the problem worse by halving the existing space.

If code point exhaustion is not a concern, then halving the available space is not a significant extra restriction; but if we ever do allocate half of any registry, it will have proven the need for an extension scheme and the reservation will have been worth it. In the meantime, we don’t have to implement anything. I doubt that any extension scheme would become necessary in the near future, and we may not want to use this particular one if and when the time comes. But I do think that we should be careful to keep our options open.

Thanks,
A