Re: [Emailcore] A/S outstanding issue #51 (email addresses in HTML forms)

Ken Murchison <murch@fastmail.com> Thu, 20 October 2022 18:43 UTC

Message-ID: <ed2c9cc2-3fa1-164b-186b-cc3a69706d36@fastmail.com>
Date: Thu, 20 Oct 2022 14:42:55 -0400
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.3.1
Content-Language: en-US
To: Barry Leiba <barryleiba@computer.org>
Cc: John C Klensin <john-ietf@jck.com>, Alexey Melnikov <aamelnikov@fastmail.fm>, emailcore@ietf.org, John R Levine <johnl@taugh.com>
References: <20221007203938.49CCD4C1266B@ary.qy> <f4e4025f-82dc-4453-866c-8c8893f64421@app.fastmail.com> <5A01B9831F9D4C0D01CA61BB@JcK-HP5> <fd5dc688-621f-4f1e-97fd-0231dcff2232@app.fastmail.com> <7D9B45F3E50A3F0DBF3BAE98@JcK-HP5> <CALaySJJeM6myw0ZhmDp=-A-46WfutWNQdL0+iV-FXDA5HQ25Cg@mail.gmail.com> <9b021a56-e226-3a34-3a72-933ceaf724b5@fastmail.com> <CALaySJKbVOTmXijit-nZO2wasWVmoLkQCe6SF+_85Xe5zE9iQg@mail.gmail.com>
From: Ken Murchison <murch@fastmail.com>
In-Reply-To: <CALaySJKbVOTmXijit-nZO2wasWVmoLkQCe6SF+_85Xe5zE9iQg@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/emailcore/0RFWZd0rYHk6EVY_27SsASl2Xr8>
Subject: Re: [Emailcore] A/S outstanding issue #51 (email addresses in HTML forms)

Thanks Barry,

Are we intending for this new text to apply only within the context of 
Section 3.2 (HTML Forms) or should this be considered more general guidance?

I'm also wondering if the point regarding "plus" addressing is more 
general guidance rather than limited to just HTML forms.



On 10/20/22 2:34 PM, Barry Leiba wrote:
>> I look forward to your proposed text.  If you can post it and/or send it
>> to me, I can get it in the A/S update that I intend to post before the
>> deadline on Monday.
> Here is text that I proposed adding after the paragraph Ken proposed
> for the new Section 3.2:
>
> ADD
>
> In particular, SMTP specifies that the local-part of an email address
> is case-sensitive (see Section 2.4 of
> [I-D.ietf-emailcore-rfc5321bis]):
>
>     The local-part of a mailbox MUST BE treated as case sensitive.
>     Therefore, SMTP implementations MUST take care to preserve the case
>     of mailbox local-parts.  In particular, for some hosts, the user
>     "smith" is different from the user "Smith".  However, exploiting the
>     case sensitivity of mailbox local-parts impedes interoperability and
>     is discouraged.
>
> While case-sensitivity is specified as an absolute requirement, it is
> important to stress that most implementations do not make case
> distinctions in local parts (most treat “smith”, “Smith”, and “SMITH”
> as the same), and most implementations do preserve the case that is
> received (from SMTP or HTTP, from address books, or from user input).
> Maximum interoperability will be achieved by keeping local-parts
> unchanged (and especially making no attempt to change their case in
> any way) and by assuming that local-parts that differ only in their
> case probably refer to the same mailbox.  This is particularly
> important for software that validates user-input fields, where case
> changes are tempting, but must be avoided.
>
> It is also important to note, as we encounter non-ASCII local-parts
> over time, that case changes are both character-set dependent and
> language dependent, and attempts to change case without having the
> full context necessary are likely to be wrong often enough to matter.
>
> END
>
> I also wonder if, somewhere, we should say that new implementations
> SHOULD make local-parts that differ only in their case refer to the
> same mailbox, thus strengthening the "is discouraged" from the SMTP
> spec.  We might also be able to get away with moving from "is
> discouraged" to some SHOULD NOT wording in SMTP, while still moving to
> Internet Standard.
>
> Barry
>
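
As a rough illustration of the guidance in the proposed text above (keep the
local-part exactly as received, but assume that local-parts differing only
in ASCII case probably name the same mailbox), here is a minimal Python
sketch. The function names are hypothetical and come from no draft or
library; the ASCII-only restriction on case folding is an assumption made
here to stay clear of the non-ASCII cases discussed further down the thread.

```python
def split_address(address: str) -> tuple[str, str]:
    """Split on the last '@' and return (local_part, domain) unchanged."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a usable address: {address!r}")
    return local, domain


def ascii_caseless_equal(x: str, y: str) -> bool:
    """Exact match, or a case-insensitive match when both are pure ASCII."""
    if x == y:
        return True
    if x.isascii() and y.isascii():
        return x.lower() == y.lower()
    # Assumption: never guess at case equivalence for non-ASCII strings.
    return False


def probably_same_mailbox(a: str, b: str) -> bool:
    """Heuristic comparison only; the stored addresses are never rewritten."""
    local_a, domain_a = split_address(a)
    local_b, domain_b = split_address(b)
    return (ascii_caseless_equal(domain_a, domain_b)
            and ascii_caseless_equal(local_a, local_b))


# The case the user typed is preserved for storage and for sending.
stored = "Smith@Example.COM"
print(split_address(stored))
print(probably_same_mailbox(stored, "smith@example.com"))
```

The point of the split is that comparison is the only place where case is
ignored; nothing here ever hands a lower-cased address back to the caller.
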
>> On 10/17/22 10:53 AM, Barry Leiba wrote:
>>> Process: I think that if we change the case-sensitivity of local-part,
>>> we are no longer in an Internet Standard path, but would have to go
>>> back to Proposed Standard.
>>>
>>> I think the best approach for us now is to leave the text in 5321bis
>>> that's in Section 2.4, which discourages case-sensitivity, to put very
>>> clear text in the AS that actually using case-sensitive local-part is
>>> bad for interoperability and will break with a lot of current software
>>> that assumes insensitivity, however incorrectly, and to thus have the
>>> AS highlight that discouragement.
>>>
>>> The result would be that the formal grammar would still allow
>>> case-sensitive local-part and SMTP would still normatively say, "The
>>> local-part of a mailbox MUST BE treated as case sensitive.  Therefore,
>>> SMTP implementations MUST take care to preserve the case of mailbox
>>> local-parts."  (Except that the "BE" should be in lower case... JCK
>>> please note.)  But it also would still say, "However, exploiting the
>>> case sensitivity of mailbox local-parts impedes interoperability and
>>> is discouraged," and the AS would follow up on that part.
>>>
>>> I'm working on some text to propose for the AS in line with what I'm suggesting.
>>>
>>> Barry
>>>
>>> On Mon, Oct 17, 2022 at 10:32 AM John C Klensin <john-ietf@jck.com> wrote:
>>>>
>>>> --On Monday, 17 October, 2022 14:35 +0100 Alexey Melnikov
>>>> <aamelnikov@fastmail.fm> wrote:
>>>>
>>>>> Hi John,
>>>>>
>>>>> On Mon, Oct 17, 2022, at 2:25 PM, John C Klensin wrote:
>>>>>> As participant only...
>>>>> Likewise.
>>>>>
>>>>>> --On Monday, 17 October, 2022 14:00 +0100 Alexey Melnikov
>>>>>> <aamelnikov@fastmail.fm> wrote:
>>>>>>
>>>>>>> Hi John,
>>>>>>> I agree with you that we should say a bit more about
>>>>>>> problematic cases. Possibly add something like your text
>>>>>>> after the paragraph that Ken suggested.
>>>>>>>
>>>>>>> Some specific comments below:
>>>>>>>
>>>>>>> On Fri, Oct 7, 2022, at 9:39 PM, John Levine wrote:
>>>>>>>> It appears that Ken Murchison  <murch@fastmail.com> said:
>>>>>>>>> I have crafted the following text for this issue:
>>>>>>> ...
>>>>>>>> If we are going to stick our foot into this swamp at all, I
>>>>>>>> think we should dive in and describe the popular ways that
>>>>>>>> non-mail systems screw up mail addresses such as
>>>>>>>>
>>>>>>>> * Everyone assumes ASCII upper and lower case are
>>>>>>>> equivalent. Many turn addresses into all upper or all lower
>>>>>>>> before sending
>>>>>>> Yes, I think we should do this.
>>>>>> Agreed, but "everyone" is too strong and therein lies the
>>>>>> problem.  A bit more needs to be said to discourage the
>>>>>> practices and/or to predict occasional problems when those
>>>>>> transformations are made.
>>>>> I think enough systems assume ASCII case-insensitivity that
>>>>> insisting that they are not is not going to work in many
>>>>> cases. I am afraid the boat has sailed on enforcing this one.
>>>> Then someone should be proposing that we change 5321bis, not
>>>> just make a comment in the A/S.  Either way, this increases my
>>>> concern about excluding SMTPUTF8 comments/advice from the A/S.
>>>> Based on the "case sensitive local parts" requirement, the EAI
>>>> WG decided that it did not need to explicitly insist on that.
>>>> However, if we say something equivalent to "it is ok to assume
>>>> that local-parts of addresses are case-insensitive because
>>>> everyone else does", then we probably need to be clear that, in
>>>> general, that does not apply to non-ASCII addresses in either
>>>> the local-part or, if expressed in UTF-8 rather than Punycode
>>>> encoding, the domain part. The A/S already steps rather far into
>>>> that swamp by saying that Internationalized Email SHOULD be
>>>> supported in Section 2.4 (incidentally the citation there is
>>>> wrong).  And then we probably need to figure out whether those
>>>> who assume case insensitivity for ASCII also assume it for
>>>> non-ASCII Latin script strings.  A reasonable, but naive,
>>>> assumption is that it should ("after all, what difference does a
>>>> diacritical make?") but the reality is that it does not work for
>>>> many cases.
>>>>
>>>> (( Example for those who have avoided immersion in the i18n
>>>> swamp: for some languages, in some localities, the upper case of
>>>> "á" (U+00E1) is "A" (U+0041).   Now, in a context in which
>>>> SMTPUTF8 addresses are allowed, what is the lower case of
>>>> "ABC@EFG"?  If one assumes, a priori, that it is an ASCII string,
>>>> then "abc@efg" is a reasonable (and correct and unique) answer.
>>>> But what if the "real" address was "ábc@éfg" and someone got
>>>> "ABC@EFG" by applying a "drop the diacritical marks when going
>>>> to upper case" rule?   The Unicode Case Mapping and Case Folding
>>>> rules prevent doing that, but the SMTPUTF8 specs don't reference
>>>> them as useful operations.   And, at the risk of invoking an
>>>> issue that brought about conflicting standards in the IDN world,
>>>> the character "ß" (U+00DF) does not have a distinct upper case
>>>> form... except when it does.  Those are just examples that should
>>>> be at least mostly understandable to those reading this: there
>>>> are cases that are arguably much worse.  ))
>>>>
>>>> So, if we are going to say something in the A/S that essentially
>>>> changes the requirement, we'd better write it very carefully --
>>>> and probably explicitly include RFC 6530ff in its scope.
>>>>
>>>>>>> ...
>>>> More generally, as non-ASCII email addresses (even ASCII local
>>>> parts with IDNs expressed in UTF-8 not Punycode) become more
>>>> prevalent and especially if the A/S is going to put a SHOULD on
>>>> Internationalized Address support, I am becoming convinced that
>>>> we would be performing a real disservice to the international
>>>> email community, as well as nearly contradicting ourselves, by
>>>> ignoring i18n issues like the above and, in particular, by saying
>>>> "ASCII addresses" and assuming the reader will understand all of
>>>> those subtleties.
>>>>
>>>> (A/S co-author hat momentarily back on.)
>>>> Ken, unless someone sees a way to avoid the i18n issues that I
>>>> don't and can quickly get what appears to be WG consensus behind
>>>> it, I believe the next draft should include (at least) a
>>>> placeholder section after the current Section 4 ("MIME and Its
>>>> Implications") called "Internationalization of Addresses and
>>>> Headers and Its Implications" or words to that effect.
>>>>
>>>> And I hope that at least some of those who are actively
>>>> promoting the use of SMTPUTF8 addresses and also following this
>>>> list will do some writing rather than either expecting me to do
>>>> it or assuming the correct text will magically appear.
>>>>
>>>> best,
>>>>       john
>>>>
>>>> --
>>>> Emailcore mailing list
>>>> Emailcore@ietf.org
>>>> https://www.ietf.org/mailman/listinfo/emailcore
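
The non-ASCII case-mapping hazards raised in the quoted discussion can be
seen with Python's built-in string methods, which apply the default,
locale-independent Unicode case mappings. This is only a small sketch: the
helper name round_trips is hypothetical, and the sample strings are
illustrations rather than text from any draft.

```python
def round_trips(s: str) -> bool:
    """True if upper-casing and then lower-casing gives back the original."""
    return s.upper().lower() == s


# ASCII survives a round trip through upper case.
print(round_trips("abc@efg"))

# The default Unicode mapping keeps the diacritic, so "ábc" becomes "ÁBC".
# A "drop the marks when going to upper case" rule would give "ABC" instead,
# and the original address could not be recovered from that.
print("ábc".upper() == "ÁBC")
print("ábc".upper() == "ABC")

# "ß" has no single-character upper case in the default mapping: it expands
# to "SS", and lower-casing "SS" yields "ss", not "ß".
print("ß".upper())
print(round_trips("straße"))

# Case folding for comparison is a separate operation from lower-casing.
print("ß".lower() == "ß", "ß".casefold() == "ss")
```

Even these default mappings are not language-aware, so software that
rewrites the case of a local-part can silently produce a different address.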

-- 
Kenneth Murchison
Senior Software Developer
Fastmail US LLC