[Emailcore] Re: A/S outstanding issue #51 (email addresses in HTML forms)

John C Klensin <john-ietf@jck.com> Sun, 23 October 2022 03:00 UTC

Date: Sat, 22 Oct 2022 22:59:57 -0400
From: John C Klensin <john-ietf@jck.com>
To: Ken Murchison <murch@fastmail.com>, emailcore@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/emailcore/EXjo-CU5rTEJAvvR3vg7n9HuBo8>

Ken,

I think this is probably ok.  It is very different from what I
had half-written when your note arrived, but the differences are
mostly a matter of style.  (If you want to see my incomplete
version, let me know.)

Two suggestions:

The key distinction made in 5321 (if it is not clear enough, that
should be identified and fixed) is between what it describes as
"final delivery" and everything else, e.g., systems handling
outgoing messages from submission and in transit.  There is also
an implicit question of authority over those final delivery
systems (you say "recipient mail server" in at least one place,
which I think is fine): they (or those responsible for them)
determine which mailbox addresses they assign or allow to be
allocated.  It may therefore be useful to separate the two
explicitly: rather than talking just about "implementations",
talk more specifically about implementations, operations, or
administrative decisions at the delivery end versus what goes on
elsewhere.

I've got mixed feelings about the HTML form material.  First,
there are ongoing discussions, initiated by the W3C I18N WG,
about relaxing the HTML format rules enough to allow
SMTPUTF8-conformant addresses.  If that effort succeeds, we
may get rid of some of the other restrictions imposed by the
current syntax rules.  It won't change the "no comments" and "no
name phrases" restrictions: they (W3C and the HTML group) are
really interested in addresses (<addr-spec> in 5322-speak and
"Mailbox" in 5321-speak).  You may also want to reference the
<Mailbox> syntax of 5321bis rather than that section of 5322bis:
it would get rid of the phrase and comment issues and should be
the slightly more restrictive syntax.  Referencing 5322bis would
make more sense if you were concerned about non-Internet mail,
but you aren't.  OTOH, if you want to stick with 5322bis, note
that comments and phrases are in the first part of Section 3.4,
not in 3.4.1.  For your draft text, all of this, but especially
the proposed changes to the HTML spec, suggests you may need to
be a bit careful to indicate that the HTML rules might change
and to give a good reference.  

Then, I don't know what "Implementations that intend to use
email addresses in HTML forms..." means.  In general, people
establish email addresses for whatever reasons they do that.
Those addresses and HTML forms intersect only when someone
building a web application decides to ask the user to put an
email address into a form.  Nothing requires that they use an
<input> element with type=email either -- they can accept free
text and do their own evaluation (if any), and that is the only
way to get SMTPUTF8 addresses into web forms today.  Internet
email implementations don't use HTML forms or have much to do
with this.

best,
    john




--On Friday, October 21, 2022 09:08 -0400 Ken Murchison
<murch@fastmail.com> wrote:

> Revised Section 3.2 for bashing:
> 
> 3.2.  Use of Email Addresses
> 
>     SMTP specifies that the local-part of an email address is
>     case-sensitive (see Section 2.4 of [I-D.ietf-emailcore-rfc5321bis]):
> 
>               The local-part of a mailbox MUST BE treated as case
>               sensitive.  Therefore, SMTP implementations MUST take
>               care to preserve the case of mailbox local-parts.
>               In particular, for some hosts, the user "smith" is
>               different from the user "Smith".  However, exploiting
>               the case sensitivity of mailbox local-parts impedes
>               interoperability and is discouraged.
> 
>     While case-sensitivity is specified as an absolute requirement, it is
>     important to stress that most implementations do not make case
>     distinctions in local parts (most treat "smith", "Smith", and "SMITH"
>     as the same), and most implementations do preserve the case that is
>     received (from SMTP or HTTP, from address books, or from user input).
>     Maximum interoperability will be achieved by keeping local-parts
>     unchanged (and especially making no attempt to change their case in
>     any way) and by assuming that local-parts that differ only in their
>     case probably refer to the same mailbox.  This is particularly
>     important for software that validates user-input fields, where case
>     changes are tempting, but must be avoided.
> 
>     It is also important to note, as we encounter non-ASCII local-parts
>     over time, that case changes are both character-set dependent and
>     language dependent, and attempts to change case without having the
>     full context necessary are likely to be wrong often enough to matter.
> 
>     Additionally, implementations vary in how they interpret the use of
>     delimiters such as '+' and '.' in local-parts.  Some implementations
>     make distinctions between local-parts such as "smith" and
>     "smith+foo", or "jane.doe" and "janedoe", while others treat them as
>     referring to the same mailboxes respectively.  Since only the
>     recipient mail server can properly interpret the local-part of an
>     address, implementations are discouraged from making any changes to
>     local-parts containing such delimiters.
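
To make the guidance in the quoted 3.2 text concrete, here is a minimal sketch of how software away from the delivery end might handle local-parts.  It is an illustration only, not text from the draft: the function names (split_address, probably_same_mailbox) are hypothetical, and folding only ASCII letters is an assumption made here so that non-ASCII local-parts are never case-mapped.

```python
import string

# Fold ASCII letters only; non-ASCII characters are left alone because
# case mapping for them is character-set and language dependent.
_ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def split_address(addr: str) -> tuple[str, str]:
    """Split at the last '@' and return (local-part, domain) unchanged."""
    local, sep, domain = addr.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"expected local-part@domain, got {addr!r}")
    return local, domain


def probably_same_mailbox(a: str, b: str) -> bool:
    """Heuristic comparison, never a rewrite of either address.

    Local-parts that differ only in ASCII case are assumed to match.
    Delimiters such as '+' and '.' are not interpreted, since only the
    recipient mail server knows what they mean.
    """
    local_a, domain_a = split_address(a)
    local_b, domain_b = split_address(b)
    return (
        domain_a.translate(_ASCII_FOLD) == domain_b.translate(_ASCII_FOLD)
        and local_a.translate(_ASCII_FOLD) == local_b.translate(_ASCII_FOLD)
    )
```

The comparison result is only a guess for things like duplicate detection; the address that is stored or sent should be the one that was received, byte for byte.
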
> 
> 3.2.1.  HTML Forms
> 
>     Email addresses are frequently used as input in HyperText Markup
>     Language (HTML) forms but the allowed grammar of these email
>     addresses is more restrictive than the grammar in Section 3.4.1 of
>     [I-D.ietf-emailcore-rfc5322bis] (limited characters allowed in
>     domains and the lack of comments and quoted strings).
>     Implementations that intend to use email addresses in HTML forms
>     SHOULD consult the valid email address grammar in Section 4.10.5.1.5
>     of [HTML].
> 
>     Additionally, the following implementation guidance is provided:
> 
>     *  Few mail systems allow leading, trailing, or consecutive unquoted
>        dots ('.') in the local-part of email addresses even though the
>        HTML grammar allows them.  Therefore, implementations are
>        discouraged from accepting such addresses in HTML forms.
> 
>     *  Some mail systems allow a trailing dot ('.') in the domain part of
>        email addresses (as allowed by Domain Names [RFC1035]), but this
>        is not interoperable with all systems.  As such, implementations
>        of HTML forms are encouraged to allow a trailing dot in the domain
>        and then strip it prior to using the address.
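
The two bullets above could be sketched on the web-application side roughly as follows.  This is a hedged illustration rather than anything taken from the draft or from the HTML spec: normalize_form_address is a hypothetical helper, it assumes the form hands over free text of the shape local-part@domain with no quoted strings, and it does not check the rest of the HTML grammar.

```python
def normalize_form_address(raw: str) -> str:
    """Apply the dot guidance to an address typed into a web form.

    The local-part is checked but never modified; only a single
    trailing dot on the domain is stripped.
    """
    # Surrounding whitespace is not part of the address.
    local, sep, domain = raw.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected local-part@domain")

    # Few mail systems accept these, even though the HTML grammar does.
    if local.startswith(".") or local.endswith(".") or ".." in local:
        raise ValueError("leading, trailing, or consecutive dots in local-part")

    # Accept a trailing dot in the domain, then strip it before use.
    if domain.endswith("."):
        domain = domain[:-1]
    if not domain or domain.endswith("."):
        raise ValueError("empty or malformed domain")

    return f"{local}@{domain}"
```

For example, "jane.doe@example.com." would come back as "jane.doe@example.com", while ".jane@example.com" would be rejected.  Case and delimiters such as '+' in the local-part pass through untouched.
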