Re: [openpgp] User ID conventions (it's not really a RFC2822 name-addr)
Daniel Kahn Gillmor <dkg@fifthhorseman.net> Wed, 06 November 2019 17:52 UTC
From: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
To: "Neal H. Walfield" <neal@walfield.org>
Cc: openpgp@ietf.org
In-Reply-To: <87v9rydk9s.wl-neal@walfield.org>
References: <87woe7zx7o.fsf@fifthhorseman.net> <87v9rydk9s.wl-neal@walfield.org>
Date: Wed, 06 Nov 2019 01:37:14 -0500
Message-ID: <878soto6hx.fsf@fifthhorseman.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/openpgp/guulh9vGTsKM84ww4VV2rsgh9Rs>
Subject: Re: [openpgp] User ID conventions (it's not really a RFC2822 name-addr)

Hi Neal--

Thanks for this thoughtful writeup!

On Tue 2019-11-05 23:35:11 +0100, Neal H. Walfield wrote:
> Beyond being more fleshed out, this grammar is different from the
> grammar in dkg's second proposal in a few ways.
>
> First, it matches comments.  dkg made this a non-goal.  Given that
> people who add comments intend them as comments and not as part of
> their name, it seems reasonable to me to not display comments in
> places where only the user's name is desired.  And, since it turns out
> that matching non-nested comments is relatively straightforward, why
> not?  Note: doing this might actually help deprecate comments, because
> they won't be shown as often.

User IDs are full UTF-8 strings.  The idea that any part of that string
would be hidden from the user is pretty disturbing to me.  Consider the
situation where someone is certifying a user ID on an OpenPGP
certificate.  If the comment is hidden, do they know what identity
assertion they're making?

I'd much rather have comments be deprecated *because they are weird and
show up in places where you'd think your name should go* rather than
have them be some vestigial thing that people don't even notice any
longer.

I would recommend dropping the comment from your grammar and letting the
"name" part subsume it, when you're splitting out e-mail address from
the rest of the user ID.

Furthermore, because you've allowed "(" and ")" in atext-specials, it
looks to me like your proposed grammar is ambiguous:

    bob (joe) <bob@example.net>

is either:

    name: "bob (joe)"
    comment: None
    addr-spec: "bob@example.net"

or:

    name: "bob"
    comment: "joe"
    addr-spec: "bob@example.net"

I don't think this is helpful to anyone.

> The grammar more carefully handles whitespace.  It ignores whitespace
> at the beginning of the User ID (this is what motivates the
> name-char-start production) and between the individual components in
> the pgp-uid-convention production.  As is, the grammar only ignores
> the 0x20 space character.  We may also want to include the tab
> character, unicode's NO-BREAK SPACE (U+00A0) character and its
> IDEOGRAPHIC SPACE (U+3000) character for thoroughness.  But, since
> software will normally concatenate the individual components, just
> recognizing the ASCII space character here is probably fine.  Whatever
> the case, I think we can safely ignore the rest of unicode's
> whitespace characters:
>
>   https://en.wikipedia.org/wiki/Whitespace_character

I'm fine with being judicious about selecting whitespace characters.  In
addition to tab (U+0009, ascii "HT"), i note that you've declined to
include U+000A and U+000D (ascii "LF" and "CR") in the grammar at all.
I like that kind of opinionated decision, as unprintable symbols like
this are likely to be problematic in many ways (hard for users to
distinguish at least!)

I also think that whitespace at the beginning of a user ID is asking for
trouble, and would be happy with a grammar that considers that user ID
non-conventional.  Is there a use case for leading whitespace in a user
ID?

> My pgp-uid-convention production also matches user ids without email
> addresses, e.g., "Daniel Kahn Gillmor".  This is convenient.  Instead
> of having to figure out why parsing failed (is it not valid UTF-8? is
> it just missing an addr-spec?), we explicitly cover this common
> pattern in the grammar.  I think this will significantly simplify code
> that uses this interface: if there is an error, then the code can just
> assume the User ID is trash and can be ignored.

I should be clear that i intended my earlier proposal specifically to
match OpenPGP User ID conventions *that have an e-mail address in them*.
There are indeed other User ID conventions (like "Daniel Kahn Gillmor",
or "ssh://foo.example") that aren't covered by this, and i thought i
would be doing folks a favor by focusing on the e-mail address side of
things specifically.

My thought was that common interfaces would allow for matching against a
User ID that has an e-mail address, and then they would have other
matchers for other common conventions that they could try applying if
this convention didn't match.  This is probably an implementation
detail, though.

> In RFC 2822, "specials" are only allowed in a display name if they are
> quoted.  dkg removes this requirement.  I think this is mostly
> sensible, but it means that we can have User IDs like:
> "<foo@example.org> <foo@example.org>" where the first
> <foo@example.org> is the display name and the second is the addr-spec.
> I think we should exclude angle brackets from the display name.  In my
> grammar, I have an "atext-specials" which is just RFC 2822 specials
> without the angle brackets.

I totally agree with this constraint.  If you're doing away with
comments (as i recommend above) then you would have to prohibit angle
brackets in comments too, which seems fine to me.  Even if you decide to
go ahead with splitting out comments, I would go so far as to ban them
in comments too.  Is there any plausible reason for including angle
brackets in a comment?  Simplify simplify :)

> I'm a bit concerned about allowing the backslash character: with this
> grammar, it is just a normal character, but for an RFC 2822 parser,
> it's an escape character.  Since User IDs may be used in contexts
> where RFC 2822 things are expected, we should be careful.  But, I fear
> that if we reject it, we'll end up gratuitously rejecting some
> emojis.  ¯\_(ツ)_/¯.

There are all kinds of things that will break if implementations
casually stick OpenPGP user IDs into an e-mail header, not just
backslashes.  For example, commas are likely to cause a problem.
Consider trying to mail two people whose OpenPGP certificates have these
User IDs:

    Lucy Hernandez, MD <lucy@example.com>
    Chuck Wilson, Jr. <chuck@example.net>

A simple concatenation with commas yields the disastrous:

    To: Lucy Hernandez, MD <lucy@example.com>, Chuck Wilson, Jr. <chuck@example.net>

and DQUOTE is just as bad if not worse :)

So i have no problem with including backslash in the display name area.

      --dkg
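
[Editorial illustration, not part of the original message.]  The comma
problem described in the message can be reproduced with Python's standard
email.utils module.  This is only a rough sketch: split_user_id() is a
hypothetical helper that cuts at the last " <" and is not the grammar
under discussion in this thread, and quoting the display name with
formataddr() is just one possible way for a mail client to build the
header, not something the message itself prescribes.

```python
from email.utils import formataddr, getaddresses


def split_user_id(user_id):
    """Hypothetical helper: split 'name <addr-spec>' at the last ' <'.

    A rough illustration only; it does not implement the grammar
    proposed in the thread and does no validation of either part.
    """
    if not user_id.endswith(">") or " <" not in user_id:
        raise ValueError("no trailing <addr-spec> found")
    name, _, rest = user_id.rpartition(" <")
    return name, rest[:-1]  # drop the closing '>'


user_ids = [
    "Lucy Hernandez, MD <lucy@example.com>",
    "Chuck Wilson, Jr. <chuck@example.net>",
]

# Naive approach: paste the raw User IDs together with commas.
naive_header = ", ".join(user_ids)

# Alternative: split first, then let formataddr() quote each display name
# that contains characters which are special in an e-mail header.
recipients = [split_user_id(u) for u in user_ids]
quoted_header = ", ".join(formataddr(r) for r in recipients)

# Parse both headers back the way a receiving mail program might.
naive_parsed = getaddresses([naive_header])
quoted_parsed = getaddresses([quoted_header])

print("naive: ", naive_header)
print("       ", naive_parsed)
print("quoted:", quoted_header)
print("       ", quoted_parsed)
```

The naive header does not parse back into the two intended recipients,
because the commas inside the display names are read as address
separators; the exact mis-parse depends on the Python version.  The
quoted header round-trips to the original two (name, address) pairs.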