[openpgp] Partial review of the crypto refresh

Daniel Huigens <d.huigens@protonmail.com> Thu, 24 November 2022 13:53 UTC

Return-Path: <d.huigens@protonmail.com>
X-Original-To: openpgp@ietfa.amsl.com
Delivered-To: openpgp@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 5DDAFC14F738 for <openpgp@ietfa.amsl.com>; Thu, 24 Nov 2022 05:53:36 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.098
X-Spam-Level:
X-Spam-Status: No, score=-2.098 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_MSPIKE_H2=-0.001, RCVD_IN_ZEN_BLOCKED_OPENDNS=0.001, SPF_PASS=-0.001, URIBL_DBL_BLOCKED_OPENDNS=0.001, URIBL_ZEN_BLOCKED_OPENDNS=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=protonmail.com
Received: from mail.ietf.org ([50.223.129.194]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id clqnbNjJcgO6 for <openpgp@ietfa.amsl.com>; Thu, 24 Nov 2022 05:53:32 -0800 (PST)
Received: from mail-4322.protonmail.ch (mail-4322.protonmail.ch [185.70.43.22]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id AE6E1C14CEE4 for <openpgp@ietf.org>; Thu, 24 Nov 2022 05:53:31 -0800 (PST)
Date: Thu, 24 Nov 2022 13:53:21 +0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=protonmail.com; s=protonmail3; t=1669298009; x=1669557209; bh=LfKScxcwiGexpBFhJWvxg8cJjtnlTncEHA2QbYzZHtc=; h=Date:To:From:Subject:Message-ID:Feedback-ID:From:To:Cc:Date: Subject:Reply-To:Feedback-ID:Message-ID:BIMI-Selector; b=lDF12kQ2tVUgCKU/dGCOQDoaCfPHFk9esXXnuRkgJGumpcg2RgwmszTcpG/JUaU9j QeL+1p4TCXGZGHt0INFmO4qjqDK66QfCc2WmY4OprvIr4EYA2nnaEufwJIzHOIUi0s 2NLG1Zzvh4UZ51MEWUneaoz1kjOEQXRMye2KZbAbmK/arQRpwjcxNApZXUhlv3J8Bz EFGB+ngJd/qNVES4GVqw3Fi302E9w+b5P5+lWRmGq3Sz69r2iyscZBUhuuaMtklImf DVAMvG5m4f/tCwSPCzek1DhBUg8rj4HnCtPmUhEt+yvVua/B0Xbq+FZW/05/K8MAU1 DPbOo4YcAz9ZA==
To: IETF OpenPGP WG <openpgp@ietf.org>
From: Daniel Huigens <d.huigens@protonmail.com>
Message-ID: <HniDSkOrqQhzJeIb0B_7yLgQjsIDVZZdGPnwttTdfpk4LCN7B4Nh1J6xzv1eZIV-OR6UemykSEdao4pWe5gFfr5BUWhEfHX8mdj6Jhla6xg=@protonmail.com>
Feedback-ID: 2934448:user:proton
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/openpgp/pVH1QU2N9Vz6p_juqqvh8jjBBVw>
Subject: [openpgp] Partial review of the crypto refresh
X-BeenThere: openpgp@ietf.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: "Ongoing discussion of OpenPGP issues." <openpgp.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/openpgp>, <mailto:openpgp-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/openpgp/>
List-Post: <mailto:openpgp@ietf.org>
List-Help: <mailto:openpgp-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/openpgp>, <mailto:openpgp-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 24 Nov 2022 13:53:36 -0000

Hi all,

The following review is based on the current main branch on GitLab
(commit 3fd930c6), since that contains fixes for some previous comments
that I didn't want to accidentally repeat.

It's not a complete review; I skipped most of Section 5 (Packet Types)
- except for one small thing that caught my eye - since it's the part we
looked the most at in the Design Team, and this review was getting quite
long already :') I'll try to still do a complete read-through of that
section later.

> 1.1.  Terms
> 
>    *  OpenPGP - This is a term for security software that uses PGP 5 as
>       a basis, formalized in this document.
> 
>    *  PGP - Pretty Good Privacy.  PGP is a family of software systems
>       developed by Philip R. Zimmermann from which OpenPGP is based.
> 
>    *  PGP 2 - This version of PGP has many variants; where necessary a
>       more detailed version number is used here.  PGP 2 uses only RSA,
>       MD5, and IDEA for its cryptographic transforms.  An informational
>       RFC, RFC 1991, was written describing this version of PGP.
> 
>    *  PGP 5 - This version of PGP is formerly known as "PGP 3" in the
>       community.  It has new formats and corrects a number of problems
>       in the PGP 2 design.  It is referred to here as PGP 5 because that
>       software was the first release of the "PGP 3" code base.
> 
>    *  GnuPG - GNU Privacy Guard, also called GPG.  GnuPG is an OpenPGP
>       implementation that avoids all encumbered algorithms.
>       Consequently, early versions of GnuPG did not include RSA public
>       keys.
> 
>    "PGP", "Pretty Good", and "Pretty Good Privacy" are trademarks of PGP
>    Corporation and are used with permission.  The term "OpenPGP" refers
>    to the protocol described in this and related documents.

Except for the first one, most of these "terms" are really just names of
implementations, and they are not used much in the document, only "PGP"
and "PGP 2" are referenced a couple times in the context of backwards
compatiblity considerations - but I'm not sure if it's necessary to
define them up front here?

This definition of "OpenPGP" is also not a very relevant one at this
point; I think it's better to just say that OpenPGP is what's defined
by this document, which is sort of implied by its title. So all in all
I think that all these terms, and the note below them, can be removed.

> 2.  General functions
> 
>    OpenPGP provides data integrity services for messages and data files
>    by using these core technologies:
> 
>    *  digital signatures
> 
>    *  encryption
> 
>    *  compression
> 
>    *  Radix-64 conversion

I'm not sure that compression and base64 should be considered "core
technologies" of OpenPGP. Also, I think "confidentiality" is missing
from this sentence. So how about something like:

   OpenPGP provides data confidentiality and integrity for messages
   and data files by using public-key and/or symmetric encryption,
   and digital signatures.

> 2.1.  Confidentiality via Encryption
> 
>    OpenPGP combines symmetric-key encryption and public-key encryption
>    to provide confidentiality.  When made confidential, first the object
>    is encrypted using a symmetric encryption algorithm.  Each symmetric
>    key is used only once, for a single object.  A new "session key" is
>    generated as a random number for each object (sometimes referred to
>    as a session).  Since it is used only once, the session key is bound
>    to the message and transmitted with it.  To protect the key, it is
>    encrypted with the receiver's public key.  The sequence is as
>    follows:

Since this section and the list that follows refer to using public-key
encryption, perhaps the first two lines should be changed to:

   OpenPGP combines symmetric-key encryption and (optionally) public-key
   encryption to provide confidentiality.  When using public keys, (...)

>    With symmetric-key encryption, an object may be encrypted with a
>    symmetric key derived from a passphrase (or other shared secret), or
>    a two-stage mechanism similar to the public-key method described
>    above in which a session key is itself encrypted with a symmetric
>    algorithm keyed from a shared secret.

The first method is deprecated, and doesn't need to be mentioned here,
I think.

>    Both digital signature and confidentiality services may be applied to
>    the same message.  First, a signature is generated for the message
>    and attached to the message.  Then the message plus signature is
>    encrypted using a symmetric session key.  Finally, the session key is
>    encrypted using public-key encryption and prefixed to the encrypted
>    block.

Should be:

   (...) encrypted using public-key or symmetric encryption and prefixed
   to the encrypted block.

> 2.2.  Authentication via Digital Signature
> 
>    The digital signature uses a hash code or message digest algorithm,
>    and a public-key signature algorithm.  The sequence is as follows:

This reads a bit like there's a choice to be made here. How about:

   The digital signature uses a cryptographic hash function, and a
   public-key signature algorithm.  The sequence is as follows:

And then, in the list that follows it, I would change "hash code" to
"hash digest" everywhere.

> 2.4.  Conversion to Radix-64

Maybe let's call this "base-64" (throughout the document)?

>    (...)  OpenPGP provides the
>    service of converting the raw 8-bit binary octet stream to a stream
>    of printable ASCII characters, called Radix-64 encoding or ASCII
>    Armor.

How about:

   (...)  OpenPGP allows converting the raw 8-bit binary octet stream to
   a stream of printable ASCII characters using base-64 encoding, in a
   format called ASCII Armor (see Section 6).

> 3.2.  Multiprecision Integers
>    (...)
>    Also note that when an MPI is encrypted, the length refers to the
>    plaintext MPI.  It may be ill-formed in its ciphertext.

This note has always confused me. Does someone know what it means? It
seems obvious to me that when you encrypt an MPI, it's no longer an MPI
as such, it's a ciphertext that decrypts to an MPI. Was there some
previous version where the octets of the MPI were encrypted without the
length, perhaps? In any case I think this note can probably be removed.

> 3.2.1.  Using MPIs to encode other data
> 
>    Note that MPIs are used in some places used to encode non-integer

Typo: one of the "used"s should be removed.

> 3.7.  String-to-Key (S2K) Specifiers
> 
>    A string-to-key (S2K) specifier is used to convert a passphrase
>    string into a symmetric-key encryption/decryption key.  They are used
>    in two places, currently: to encrypt the secret part of private keys
>    in the private keyring, and to convert passphrases to encryption keys
>    for symmetrically encrypted messages.

I think "in the private keyring" should be removed since a private key
doesn't necessarily have to be part of a keyring.

> 3.7.1.  String-to-Key (S2K) Specifier Types
> 
>    There are four types of S2K specifiers currently supported, and some
>    reserved values:

This is a bit nitpicky but I think "supported" should be "specified",
since the former is more a property of an implementation.

>    |  ID | S2K Type     | Generate?        | S2K field     | Reference |
>    |     |              |                  | size (octets) |           |

Even more nitpicky, but maybe the "Generate?" column should be last?

> 3.7.1.3.  Iterated and Salted S2K
> 
>    This includes both a salt and an octet count.  The salt is combined
>    with the passphrase and the resulting value is hashed repeatedly.

To make this clearer, "hashed repeatedly" should be changed to
"repeated and then hashed", as that's what actually happens. Similarly,

>    Then the salt, followed by the passphrase data, is repeatedly hashed
>    until the number of octets specified by the octet count has been
>    hashed.

could be something like "repeatedly passed to the hash function".

>    After the hashing is done, the data is unloaded from the hash
>    context(s) as with the other S2K algorithms.

I think this is trying to say:

   After the hashing is done, the key data is produced from the hash
   digest(s) as with the other S2K algorithms.

> 3.7.2.  String-to-Key Usage

I think what is potentially missing from this section is some guidance
about "Iterated and Salted S2K", e.g. hinting that it is not very
strong, or at least that a high octet count should be used. For
example, we could append to this section:

   If Argon2 is not available, Iterated and Salted S2K MAY be used if
   care is taken to use a sufficiently high octet count. However, this
   method does not provide memory-hardness, unlike Argon2.

>               Table 2: Version 4 Secret Key protection details
> 
>    If the "Generate?" column is not "Y", the Secret Key protection
>    details entry is used only for reading in backwards compatibility
>    mode and MUST NOT be used to generate new output.

This table uses "Yes", not "Y" like Table 1 does. It would be good to
use either "Yes"/"No" or "Y"/"N" consistently, in Table 1 and 2 and the
text below both.

> 5.2.3.19.  Regular Expression
> 
>    (null-terminated regular expression)

Is it really null-terminated...? (I've never used this, so I'm not sure,
does anyone know?) If it is, maybe that should be expanded on a bit,
e.g. say that implementations should strip the trailing \0 when parsing
and add it when generating this subpacket, if that's really required.

> 6.3.  Encoding Binary in Radix-64
> (through..)
> 6.5.  Examples of Radix-64

Are these sections really necessary? Section 6 already says that
"The base64 encoding is identical to the MIME base64 content-transfer-
encoding [RFC2045]." Can we just refer to that, or some other source
for the definition of base64, rather than redefining it here?

> 8.  Regular Expressions

Can we get rid of this, also? There is a reference for this in Section
5.2.3.19 (Regular Expression), but it is to a book, which is not super
accessible, perhaps. But, after doing a bit of research, apparently
there are only two formats in common use, "<[^>]+[@.]example\.com>$" and
"example.com" (with the latter seemingly not really being intended as a
regular expression at all?), so perhaps we can just hardcode in the
subpacket that it must be one of those formats? And/or, deprecate this
subpacket and replace it with something simpler, if needed at all?
(Of course that would be a larger change, and may be out of scope.)

> 12.1.  Supported ECC Curves

"Supported" should be removed, I think.

> 13.9.  OpenPGP CFB Mode

I have many quarrels with this section, but I already discussed them and
offered a rewrite in merge request 205:
https://gitlab.com/openpgp-wg/rfc4880bis/-/merge_requests/205.

> 12.1.  Supported ECC Curves
> 
>    This document references three named prime field curves defined in
>    [FIPS186] as "Curve P-256", "Curve P-384", and "Curve P-521"; and
>    three named prime field curves defined in [RFC5639] as
>    "brainpoolP256r1", "brainpoolP384r1", and "brainpoolP512r1".  These
>    three [FIPS186] curves and the three [RFC5639] curves can be used
>    with ECDSA and ECDH public key algorithms. (...)

Tiny nit: "These three [FIPS186] curves" should be "The three (...)",
since the brainpool curves have been inserted in between.

>    The named curves are referenced as a sequence of octets in this
>    document, called throughout, curve OID.

Awkward phrasing, how about:

   The named curves are identified by a sequence of octets, called the
   curve OID.

> 13.2.  Symmetric Algorithm Preferences
>    (...) Note that the
>    MUST-implement algorithm, AES-128, ensures that the intersection is
>    not null.

"non-empty" might be clearer.

> 14.  Security Considerations
>    (...)
>    *  The MD5 hash algorithm has been found to have weaknesses, with
>       collisions found in a number of cases.  MD5 is deprecated for use
>       in OpenPGP.  Implementations MUST NOT generate new signatures
>       using MD5 as a hash function.  They MAY continue to consider old
>       signatures that used MD5 as valid.

This guidance is mostly redundant with, and slightly conflicting with,
Section 9.5. Also, SHA-1 is missing here (although it is mentioned in
Section 9.5 and 14.1.) And finally, should there be normative
requirements in this section at all? Maybe it can be replaced with:

   *  The MD5 and SHA-1 hash algorithms have been found to have
      weaknesses, with collisions found in a number of cases.
      MD5 and SHA-1 are deprecated for use in OpenPGP.

>    *  SHA2-224 and SHA2-384 require the same work as SHA2-256 and
>       SHA2-512, respectively.  In general, there are few reasons to use
>       them outside of DSS compatibility.  You need a situation where one
>       needs more security than smaller hashes, but does not want to have
>       the full 256-bit or 512-bit data length.

This doesn't really seem like a security consideration, perhaps it can
be removed?

>    *  Many security protocol designers think that it is a bad idea to
>       use a single key for both privacy (encryption) and integrity
>       (signatures).  In fact, this was one of the motivating forces
>       behind the v4 key format with separate signature and encryption
>       keys.  If you as an implementer promote dual-use keys, you should
>       at least be aware of this controversy.

Maybe we can make the last sentence a bit stronger and at least say:

      Using a single key for encrypting and signing is discouraged.

>    *  Some of the encryption algorithms mentioned in this document have
>       been analyzed less than others.  For example, although CAST5 is
>       presently considered strong, it has been analyzed less than
>       TripleDES.  Other algorithms may have other controversies
>       surrounding them.

To avoid this coming across as an implicit endorsement for TripleDES,
maybe let's mention AES in its place, instead?

>    *  Some technologies mentioned here may be subject to government
>       control in some countries.

Is this still true? Can this be removed?

---

That's all for now. For the things that are not purely editorial, let me
know if you want me to propose some more concrete text or make a pull
request.

Thanks!

Best,
Daniel