Re: [openpgp] User ID conventions (it's not really a RFC2822 name-addr)

"Neal H. Walfield" <neal@walfield.org> Wed, 06 November 2019 07:37 UTC

Date: Wed, 06 Nov 2019 08:37:22 +0100
Message-ID: <87tv7he9ql.wl-neal@walfield.org>
From: "Neal H. Walfield" <neal@walfield.org>
To: "brian m. carlson" <sandals@crustytoothpaste.net>
Cc: Daniel Kahn Gillmor <dkg@fifthhorseman.net>, openpgp@ietf.org
In-Reply-To: <20191106000546.GE32531@camp.crustytoothpaste.net>
References: <87woe7zx7o.fsf@fifthhorseman.net> <87v9rydk9s.wl-neal@walfield.org> <20191106000546.GE32531@camp.crustytoothpaste.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/openpgp/Cmf3wQFp_uChnsZYExG0pkl-xTw>

Hi Brian,

On Wed, 06 Nov 2019 01:05:46 +0100,
brian m. carlson wrote:
> On 2019-11-05 at 22:35:11, Neal H. Walfield wrote:
> > I'm considering using the following "grammar".  (I've put grammar in
> > scare quotes, because it is not a valid grammar according to RFC 5322
> > due to several ambiguities.  In particular, the production "*WS
> > [name] *WS" is ambiguous when applied to a string containing a single
> > whitespace character: the whitespace character could match the first
> > WS or the second one.  In practice, this ambiguity doesn't matter,
> > because we only care about what the "name", "comment-content" and
> > "addr-spec" productions match.)
> > 
> >      WS                 = 0x20 (space character)
> > 
> >      comment-specials   = "<" / ">" /   ; RFC 2822 specials - "(" and ")"
> >                           "[" / "]" /
> >                           ":" / ";" /
> >                           "@" / "\" /
> >                           "," / "." /
> >                           DQUOTE
> > 
> >      atext-specials     = "(" / ")" /   ; RFC 2822 specials - "<" and ">".
> >                           "[" / "]" /
> >                           ":" / ";" /
> >                           "@" / "\" /
> >                           "," / "." /
> >                           DQUOTE
> > 
> >      atext              = ALPHA / DIGIT /   ; Any character except controls,
> >                           "!" / "#" /       ;  SP, and specials.
> >                           "$" / "%" /       ;  Used for atoms
> >                           "&" / "'" /
> >                           "*" / "+" /
> >                           "-" / "/" /
> >                           "=" / "?" /
> >                           "^" / "_" /
> >                           "`" / "{" /
> >                           "|" / "}" /
> >                           "~" /
> >                           \u{80}-\u{10ffff} ; Non-ascii, non-control UTF-8
> > 
> >      name-char-start    = atext / atext-specials
> > 
> >      name-char-rest     = atext / atext-specials / WS
> > 
> >      name               = name-char-start *name-char-rest
> > 
> >      comment-char       = atext / comment-specials / WS
> > 
> >      comment-content    = *comment-char
> > 
> >      comment            = "(" *WS comment-content *WS ")"
> > 
> >      addr-spec          = dot-atom-text "@" dot-atom-text
> 
> dot-atom-text isn't defined here, so it isn't clear to me what it
> includes.  Does it permit UTF-8 in addresses according to the SMTPUTF8
> RFCs?

Thanks for catching that.  When turning my code into a grammar, I
somehow forgot that production.


The dot-atom-text production is unchanged from e.g. RFC 2822:

   dot-atom-text      = 1*atext *("." 1*atext)

But since we've extended atext to include non-control UTF-8
characters, this should allow international email addresses.
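
As a rough illustration of the point above, here is a minimal Python sketch of the extended atext, dot-atom-text and addr-spec productions. It assumes the RFC 2822 form of dot-atom-text, 1*atext *("." 1*atext), and the names ATEXT, DOT_ATOM_TEXT and is_addr_spec are made up for this example; it is not the code the grammar was derived from.

```python
import re

# atext: the ASCII characters from the quoted grammar plus the
# U+0080..U+10FFFF extension.  Assumption: we match on decoded
# Python strings, so the range is expressed in code points.
ATEXT = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~\u0080-\U0010FFFF]"

# dot-atom-text = 1*atext *("." 1*atext)   (RFC 2822 form)
DOT_ATOM_TEXT = rf"{ATEXT}+(?:\.{ATEXT}+)*"

# addr-spec = dot-atom-text "@" dot-atom-text
ADDR_SPEC = re.compile(rf"{DOT_ATOM_TEXT}@{DOT_ATOM_TEXT}")


def is_addr_spec(candidate: str) -> bool:
    """Return True if the whole string matches the addr-spec sketch."""
    return ADDR_SPEC.fullmatch(candidate) is not None


if __name__ == "__main__":
    for candidate in ("neal@walfield.org", "用户@例子.广告",
                      "a..b@example.org", "a b@example.org"):
        print(candidate, is_addr_spec(candidate))
```

With this sketch, an ASCII address and an address made up entirely of non-ASCII characters both match, while consecutive dots, leading or trailing dots, and embedded spaces are rejected.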

RFC 6531 (the SMTPUTF8 RFC) extends atext as follows:

  atext   =/  UTF8-non-ascii
    ; extend the implicit definition of atext in
    ; RFC 5321, Section 4.1.2, which ultimately points to
    ; the actual definition in RFC 5322, Section 3.2.3

  https://tools.ietf.org/html/rfc6531#section-3.3

which, I think, is what I did above.

But I've only skimmed RFC 6531, so I might have missed something else.
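
One way to sanity-check the comparison is to test, code point by code point, whether the \u{80}-\u{10ffff} range agrees with "anything that encodes to a 2-4 byte UTF-8 sequence". That is how UTF8-non-ascii (UTF8-2 / UTF8-3 / UTF8-4) reads to me, but treat that reading as an assumption. The function names below are hypothetical, and this is only a sketch.

```python
def in_extended_range(cp: int) -> bool:
    # The range written in the grammar above: U+0080 through U+10FFFF.
    return 0x80 <= cp <= 0x10FFFF


def is_utf8_non_ascii(cp: int) -> bool:
    # Assumed reading of UTF8-non-ascii: the code point has a valid
    # UTF-8 encoding that is 2, 3 or 4 bytes long.
    try:
        encoded = chr(cp).encode("utf-8")
    except UnicodeEncodeError:
        # Lone surrogates have no valid UTF-8 encoding.
        return False
    return 2 <= len(encoded) <= 4


# Code points where the two definitions disagree.
mismatches = [cp for cp in range(0x110000)
              if in_extended_range(cp) != is_utf8_non_ascii(cp)]

if __name__ == "__main__":
    print(len(mismatches), hex(mismatches[0]), hex(mismatches[-1]))
```

Under that assumption the two definitions differ only on the surrogate code points U+D800 through U+DFFF, which cannot occur in well-formed UTF-8 anyway, so the difference only matters to a parser that works on code points without validating the encoding first.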

Thanks!

:) Neal