[openpgp] Algorithm-specific data: problems with Simple Octet Strings, and possible alternatives
Daniel Kahn Gillmor <dkg@fifthhorseman.net> Thu, 25 March 2021 00:54 UTC
From: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
To: openpgp@ietf.org
Cc: NIIBE Yutaka <gniibe@fsij.org>
Date: Wed, 24 Mar 2021 20:54:49 -0400
Message-ID: <87eeg42gti.fsf@fifthhorseman.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/openpgp/t5qhbQCnlDaUc6Zbze7r8tTDd0Q>
Subject: [openpgp] Algorithm-specific data: problems with Simple Octet Strings, and possible alternatives

Hi OpenPGP folks--

I spent a good part of today trying to write a patch for including Simple Octet Strings in the crypto-refresh draft, pursuant to gniibe's presentation at IETF 110 and his public memo at:

https://www.gniibe.org/memo/standard/openpgp/ecc-in-openpgp-sos.html

This message doesn't include the patch I wanted to make, because I think I ran into a problem with it. I'm trying to explain the problem here, and how my thinking has shifted in the course of trying to write the thing down. Please correct me where I've erred.

IIUC, SOS is basically intended as a genericization of MPI; it represents its length in a two-octet scalar as (8*octet-count) instead of as (bit-count). It "fits" where MPI fits, but doesn't require leading-zero removal, or require interpretation as an integer at all.

The problem comes with the different length representations, in particular for existing ECC mechanisms. Because existing ECC points are currently represented with a leading prefix byte (0x04 for NIST curves, 0x40 for 25519), and that byte has its leading bit cleared, the MPI version of the length scalar is *different* on the wire from the SOS version of the length scalar.
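
To make the two length rules concrete, here is a minimal sketch that computes both two-octet headers for the same octet string. The function names are illustrative only (they come from neither the draft nor gniibe's memo), and the coordinate octets are placeholders rather than real curve points.

```python
def mpi_header(octets: bytes) -> bytes:
    """Two-octet MPI length: bits counted from the highest set bit."""
    return int.from_bytes(octets, "big").bit_length().to_bytes(2, "big")


def sos_header(octets: bytes) -> bytes:
    """Two-octet SOS length as described above: 8 * octet-count."""
    return (8 * len(octets)).to_bytes(2, "big")


# Uncompressed NIST P-256 point: prefix 0x04 plus 64 octets of X || Y.
# The 64 trailing octets are placeholders, not a valid point.
nist_point = b"\x04" + bytes(range(64))

# Curve25519-style point: prefix 0x40 plus 32 octets (placeholders again).
cv_point = b"\x40" + bytes(range(32))

for name, point in (("nist", nist_point), ("25519", cv_point)):
    print(name, mpi_header(point).hex(), sos_header(point).hex())
```

Because 0x04 has only three significant bits and 0x40 only seven, the MPI header comes out smaller than the SOS header in both cases.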

As a concrete example, a NIST p256 pubkey looks today like this (using visualization from "sq packet dump --hex"):

```
Public-Key Packet, old CTB, 2 header bytes + 82 bytes
    Version: 4
    Creation time: 2021-03-24 22:06:14 UTC
    Pk algo: ECDSA public key algorithm
    Pk size: 256 bits
    Fingerprint: D4732B93D4E82D068EE197A63A73ED20925926A6
    KeyID: 3A73ED20925926A6

00000000  98                                               CTB
00000001  52                                               length
00000002  04                                               version
00000003  60 5b b7 d6                                      creation_time
00000007  13                                               pk_algo
00000008  08                                               curve_len
00000009  2a 86 48 ce 3d 03 01                             curve
00000010  07
00000011  02 03                                            ecdsa_public_len
00000013  04 29 75 bb 53 55 49 31 80 ab 03 54 15           ecdsa_public
00000020  a0 6c 29 10 a8 85 b7 2f e5 93 81 90 48 08 b8 0e
00000030  c7 02 55 aa 51 b2 b7 6a 00 7a a1 2b cf 25 a7 e7
00000040  9c 95 b1 fa 79 87 c2 cb 94 53 a8 da c2 00 58 45
00000050  79 a0 96 8e
```

but if it was represented in SOS form, we'd see this change:

```text/x-diff
 00000010  07
-00000011  02 03                                            ecdsa_public_len
+00000011  02 08                                            ecdsa_public_len
 00000013  04 29 75 bb 53 55 49 31 80 ab 03 54 15           ecdsa_public
```

The result is that the fingerprint would also change (to C9EAB3BE6B20CAEA568BE93F9BA3D29CBB873842), since the bytestream that feeds into the fingerprint calculation changes.

This means that we can represent a single EC public key (with the exact same creation date) in at least two different ways just by varying whether the public point is seen as an SOS or an MPI.

(note: this concern is not necessarily just for ECC; it's also relevant for traditional RSA, DSA, and ElGamal key MPI components that might have the highest set bit somewhere other than at the top of an octet. For example, if we were to "retcon" RSA's e component as an SOS instead of an MPI, the most common value for e (65537) could change on the wire from [00 11 01 00 01] to [00 18 01 00 01], resulting in a similar change to the fingerprint)

This leaves me wondering about the utility of the SOS abstraction.
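
The fingerprint change can be reproduced with a short sketch. It assumes the standard v4 fingerprint construction from RFC 4880 (SHA-1 over the octet 0x99, a two-octet body length, and the packet body) and uses the packet octets transcribed from the dump above. The variable and function names are illustrative.

```python
import hashlib

# Packet body from the dump above, split just before the two-octet
# length that precedes the EC point.
prefix = bytes.fromhex(
    "04"                # version
    "605bb7d6"          # creation_time
    "13"                # pk_algo (ECDSA)
    "08"                # curve_len
    "2a8648ce3d030107"  # curve OID (NIST P-256)
)
point = bytes.fromhex(
    "042975bb5355493180ab035415"
    "a06c2910a885b72fe59381904808b80e"
    "c70255aa51b2b76a007aa12bcf25a7e7"
    "9c95b1fa7987c2cb9453a8dac2005845"
    "79a0968e"
)


def v4_fingerprint(body: bytes) -> str:
    """SHA-1 over 0x99, the two-octet body length, and the packet body."""
    data = b"\x99" + len(body).to_bytes(2, "big") + body
    return hashlib.sha1(data).hexdigest().upper()


as_mpi = prefix + b"\x02\x03" + point  # length header counts bits
as_sos = prefix + b"\x02\x08" + point  # length header is 8 * octet-count

print("MPI:", v4_fingerprint(as_mpi))
print("SOS:", v4_fingerprint(as_sos))
```

If the octets are transcribed correctly, the two printed values should correspond to the two fingerprints quoted above; either way, a single changed octet in the hashed body is enough to produce a different fingerprint.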

I can imagine a few distinct options:

a) Introduce SOS and apply it to existing ECC pubkeys. This is what I understand to be gniibe's proposal. It looks like this means an implementation will have to cope with multiple distinct wire representations and fingerprints of the exact same public key.

b) Introduce the SOS abstraction, but only use it for *new* pubkey algorithms. All existing ECC (including NIST and 25519) would still be this weird pseudo-MPI, representing the count based on the highest set bit, not 8*octet-count, and requiring clearing of leading zeros.

c) Introduce a variant of SOS that indicates count based on highest set bit (not 8*octet-count) and use that for everything, but without requiring removal of leading all-zero octets.

d) Not introduce the SOS abstraction at all, but instead explicitly say that new algorithms can define their own fixed algorithm-specific bytestream for public key material, secret key material, and signatures -- it does not need to be MPIs (gniibe's memo calls this approach "Each data format definition by each curve").

e) No SOS abstraction, and just keep on doing the kind of pseudo-MPI we've been doing for any new curves (gniibe's memo calls this "Practically easiest").

f) Introduce a new distinct data type that uses an actual octet count for each bytestring (gniibe's memo calls this "Non-strange but Just an Octet String").

Of these choices, I'm leaning toward (d) but I'd love to hear what other people think.

(a) Seems problematic because of the multiple acceptable representations outlined above. It's already pretty tricky that the fingerprint of a given piece of public key material can vary depending on the creation timestamp. Some implementations might try to normalize an SOS into a "compliant" MPI, thereby affecting their ability to calculate the correct fingerprint. I don't think we currently have any tests that demonstrate interoperability in the face of this kind of confusion.
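
The normalization hazard described for (a) can be sketched as follows, using the RSA e = 65537 example. The helpers are hypothetical and stand in for a parser that reads any length-prefixed field and then re-serializes it as a strict MPI; no particular implementation is being described.

```python
from typing import Tuple


def read_field(buf: bytes, pos: int = 0) -> Tuple[bytes, int]:
    """Read one field with a two-octet bit-count header (MPI or SOS)."""
    bits = int.from_bytes(buf[pos:pos + 2], "big")
    n = (bits + 7) // 8
    return buf[pos + 2:pos + 2 + n], pos + 2 + n


def write_mpi(octets: bytes) -> bytes:
    """Re-serialize as a strict MPI: drop leading zeros, count bits."""
    value = int.from_bytes(octets, "big")
    bits = value.bit_length()
    return bits.to_bytes(2, "big") + value.to_bytes((bits + 7) // 8, "big")


wire_sos = bytes.fromhex("0018010001")   # e = 65537 with an SOS-style header
octets, _ = read_field(wire_sos)
normalized = write_mpi(octets)

print(wire_sos.hex(), "->", normalized.hex())
```

An implementation that hashes `normalized` instead of the octets it actually received would compute a fingerprint over different input than its peers.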

(b) Seems sort of pointless because it doesn't let us clean up any of the existing mess. If we're going to keep all the existing mess, we might as well go with (d).

(c) Introduces a new problem: When encoding an SOS string, how many leading zero-octets should be used? If the answer is "implementation gets to decide" then we share the multiple-representations problem with (a); and if the answer is "each algorithm has a mandated size" then we might as well go with (d). Also, when consuming such a string, an implementation would need to first scan through all the leading zero-octets before it could use the length count to jump to the next field. That's a new piece of ugliness for OpenPGP parsing.

(e) Bothers me because it means that each new algorithm introduced has to declare not only its data formats, but how to transform those data formats into the semblance of an OpenPGP "MPI". This just seems like extra work and new ways for both implementers and specifiers to get things wrong.

(f) Seems excessive: why declare a standard for encoding lengths of fields in those cases where the lengths are pre-determined by the algorithm in question (I expect that to be most cases for new algorithms)? For specific algorithms which have variable length parameters, *those algorithms* can declare how they want to delimit the fields. If the goal is to be able to skip over algorithm-specific details of an algorithm that we don't know, what are we skipping *to*? For public key packets and signature packets, the only thing that comes next is the end of the packet, whose size we already know. For secret key packets, we *might* try to skip from the public key material to the secret key material, but again, if we don't know how many MPIs (or SOSes or JOSes) to skip for public keys we might not even be able to tell where the s2k usage octet starts, so we can't do that today anyway.
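
The extra scan that (c) would force on a parser can be sketched as below. The wire layout is an assumption about what such a variant might look like (a two-octet count of significant bits, then any number of all-zero octets, then the significant octets); it is not taken from the draft or from gniibe's memo, and the function name is made up.

```python
def skip_option_c_field(buf: bytes, pos: int) -> int:
    """Return the offset just past one hypothetical option-(c) field.

    Assumes a nonzero value: with a bit count of 0, the zero scan below
    could not tell padding apart from the start of the next field.
    """
    bits = int.from_bytes(buf[pos:pos + 2], "big")
    pos += 2
    # The header counts from the highest set bit, so it says nothing about
    # how many all-zero octets precede that bit. Scan past them first.
    while pos < len(buf) and buf[pos] == 0:
        pos += 1
    # Only now can the bit count be used to jump to the next field.
    return pos + (bits + 7) // 8


# 65537 (17 significant bits) padded with two zero octets, then a marker
# octet standing in for whatever comes next.
buf = bytes.fromhex("0011" "0000" "010001" "ff")
end = skip_option_c_field(buf, 0)
print(end, hex(buf[end]))
```

With a plain MPI or an 8*octet-count SOS, the same jump is a single addition with no loop over the data.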

I will probably try to draw up a concrete patch for (d) if other folks in the WG think it would be useful for consideration. The patch will likely ask IANA to add new columns to the "public key algorithm" registry (e.g. "public key representation", "secret key representation" and "signature representation") and to formally include an Elliptic Curves registry with similar columns, so that we have a compact way to view these distinctions, and a place for new proposals to include updates.

If any of my analysis above is wrong, please help me understand it better!

    --dkg