Re: [Emailcore] Applicability statement and spam blowback

John C Klensin <john-ietf@jck.com> Fri, 12 January 2024 17:39 UTC

Date: Fri, 12 Jan 2024 12:39:40 -0500
From: John C Klensin <john-ietf@jck.com>
To: Pete Resnick <resnick@episteme.net>
cc: Steffen Nurpmeso <steffen@sdaoden.eu>, emailcore@ietf.org
Message-ID: <9D19855C1E0525B119A19389@PSB>
In-Reply-To: <E3D3B1FA-7768-4DB5-80CE-6269A7B9CE51@episteme.net>
References: <20240103212013.3D1607FB4F23@ary.qy> <24EDAA5E899507B3F067E6EB@PSB> <20240111004300.7PMe4svt@steffen%sdaoden.eu> <0FC201C5B4CEB19EAE7F03F7@PSB> <E3D3B1FA-7768-4DB5-80CE-6269A7B9CE51@episteme.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/emailcore/BHSUOA0p-ckZgdjpPCLCoZM-NkE>
Subject: Re: [Emailcore] Applicability statement and spam blowback


--On Thursday, January 11, 2024 16:43 -0600 Pete Resnick
<resnick@episteme.net> wrote:

> On 11 Jan 2024, at 15:46, John C Klensin wrote:
> 
>> --On Thursday, January 11, 2024 01:43 +0100 Steffen Nurpmeso
>> <steffen@sdaoden.eu> wrote:
>> 
>>> ...i have seen that the address examples no longer show
>>> syntax as weird as in the past.  This also seems to root in
>>> syntax adjustments regarding placement of comments (in
>>> addresses etc), but in hindsight to the actual state of I-M-F
>>> code parsers in the wild i think it would be beneficial if at
>>> least one or two totally exaggerated examples showing
>>> obsolete syntax, with lots of FWS everywhere, would be
>>> beneficial.
> 
> I'm not sure how much more "weird" or "exaggerated" the
> examples from A.5 and A.6.3 could be. :-)
> 
> https://www.ietf.org/archive/id/draft-ietf-emailcore-rfc5322bis-08.html#name-white-space-comments-and-ot
> https://www.ietf.org/archive/id/draft-ietf-emailcore-rfc5322bis-08.html#name-obsolete-white-space-and-co

>...

Sorry, Pete.  I did not look closely enough at 5322bis to
realize that those "Chris" examples came directly out of it.
My apologies.  It is probably a clear sign that I need to
concentrate on 5321bis (and finishing up AUTH48 on
draft-freed-smtp-limits) for a few days and leave 5322bis to you
and others.  I will now finish up this note and then do that.

>...

> [Insert Pete yelling about how stupid he was 20-odd years ago
> to be part of the group in favor of removing the explicit
> multi-level parsing model. I can only claim I was young and
> stupid.]

Without getting anywhere near "stupid", that model would
certainly have made this discussion much easier.

> What do you mean by "will become"? Any reasonable parser will
> take "Chris Jones <c@(Chris's host.)public.example>" and turn
> it into a series of tokens: A display-name of "Chris Jones", a
> special of "<", a local-part of "c", a special of "@", a
> comment of "Chris's host.", a domain of "public.example", and
> a special of ">". (And yes, the display-name is a phrase and
> the local-part is a dot-atom made up entirely of atext, etc.).
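
As a rough illustration of the token series described in the
quoted paragraph, here is a minimal sketch in Python.  The
function name tokenize and the three token kinds ("text",
"special", "comment") are assumptions made for this example; they
are a simplification and not the 5322bis grammar.  Quoted strings
and the other specials are deliberately not handled.

```python
def tokenize(field):
    """Split an address field into (kind, text) tokens.

    Illustrative only: kinds are "text", "special" and "comment".
    """
    tokens = []
    buf = []  # characters of the text run currently being collected

    def flush():
        # Emit the pending text run, if any, without surrounding spaces.
        text = "".join(buf).strip()
        if text:
            tokens.append(("text", text))
        buf.clear()

    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "(":
            flush()
            depth, start = 1, i + 1
            i += 1
            # Comments may nest, and a backslash escapes the next character.
            while i < len(field) and depth:
                if field[i] == "\\":
                    i += 1
                elif field[i] == "(":
                    depth += 1
                elif field[i] == ")":
                    depth -= 1
                i += 1
            if depth:
                raise ValueError("unterminated comment")
            tokens.append(("comment", field[start:i - 1]))
            continue
        if ch in "<>@":
            flush()
            tokens.append(("special", ch))
        else:
            buf.append(ch)
        i += 1
    flush()
    return tokens


for token in tokenize("Chris Jones <c@(Chris's host.)public.example>"):
    print(token)
```

Classifying the "text" runs as display-name, local-part, or
domain would be the next level of parsing, based on where each
run sits relative to the "<", "@", and ">" specials.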

The only part of that where we _might_ disagree is the
definition of "Any" and/or "reasonable" in "Any reasonable
parser".  The author of a different sort of parser at either
end of the pipe, who might still qualify as reasonable, might
get as far into 5322bis (or its predecessors) as 

   name-addr = [display-name] angle-addr
   angle-addr =   [CFWS] "<" addr-spec ">" [CFWS] /
           obs-angle-addr
and
   addr-spec =   local-part "@" domain

and decide that anyone who was sending or expecting anything
that required an understanding of obs-angle-addr, or of other
5322bis details of <addr-spec> that were not conformant to
5321bis, was either hostile or stupid, and that they just did not
want to handle email messages whose address part (aka
<angle-addr>) did not conform to the 5321bis definition of a
mailbox.

Now, if one could see the inside of the code in which a parser
designed that way was written, we could say "does not strictly
conform to rfc5322bis".  However, not only would that
declaration plus a few dollars be worth a cup of coffee, but if
one looked at the system using that code as a black box, it
would be indistinguishable from an administrative decision.  We
have nothing in the specs that requires that an Internet mail
system send messages to, or accept messages from, any particular
address (and such a requirement would get us nowhere).
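
To make the "black box" point concrete, here is a minimal sketch,
in Python, of the sort of strict acceptor described above.  The
name accepts and the two regular expressions are assumptions for
illustration: they approximate a dot-string local part and an LDH
("preferred name syntax") domain, and they ignore quoted local
parts, address literals, and SMTPUTF8.

```python
import re

# One LDH label: letters, digits, hyphens; no leading or trailing hyphen.
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
_DOMAIN = re.compile(rf"{_LABEL}(?:\.{_LABEL})*")

# Dot-separated runs of atext for the local part (no quoted strings).
_ATOM = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+"
_LOCAL = re.compile(rf"{_ATOM}(?:\.{_ATOM})*")


def accepts(addr_spec):
    """Return True only for a plain local-part@domain string."""
    local, sep, domain = addr_spec.rpartition("@")
    return bool(sep and _LOCAL.fullmatch(local) and _DOMAIN.fullmatch(domain))


print(accepts("c@public.example"))
print(accepts("c@(Chris's host.)public.example"))
```

From the outside, a False here looks the same whether it came
from a parser that never learned about comments or from a policy
of refusing such addresses.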


> And you'd get a different order of tokens in the second
> example, but in any event:
> 
>>> whereas the colourful
>>> 
>>>   Chris Jones <c@(in the past.)public.example(.host of
>>>   Chris)>
>>> 
>>> or the like ... you know?
>>> .. it is only that i stumbled upon it.
> 
> In any of these examples, if you're looking for an address, it
> is surely the addr-spec of "c@public.example". And if we still
> had the multi-level parser in there the way 822 did, that
> would be clearer. C'est la vie.

Exactly.  But, if an actual system looked at the example above
and said "the person or entity who was asserting this address is
just being too clever for words and is likely either immature or
hostile enough that we are not going to handle the message", that
would not be a parser decision, it would be an administrative one.
Where we could start some (IMO really useless) quibbling would
be whether the rejection message it produced would be "bad
address", "nonsense address", or "just not going to handle a
message with that sort of address in it".  Only the last would
be conforming and accurate, but...


>> I think there are at least three almost-separate
>> questions/issues with examples like that:
>> 
>> (1) There is a complicated tradeoff between illustrating what
>> the syntax does (or does not) permit, so that the odds of
>> implementations getting their "accept" syntax right improve,
>> and (even if inadvertently) encouraging people to use tricky
>> stuff that I think we can predict would sometimes not work.
>> 
>> (2) The examples themselves are inherently confusing so, if
>> they were to be used, I would think they would need to be
>> accompanied by prose explanations about what is going on
>> when, e.g., parts of the domain part are reordered.
> 
> I think (hope) the 5322bis examples sections explain and make
> those tradeoffs well.

Having now read what I think is all of it, I agree, at least to
a "good enough that we should not mess further with it"
approximation.  Again, apologies for not doing that reading
earlier.

>> (3) One thing all three examples have in common is that the
>> domain-part of the addresses would be rejected by a conforming
>> SMTP/5321bis parser because neither "(" nor ")" are allowed in
>> the RFC 1035 "preferred name syntax" (see Section 2.3.5 of
>> rfc5321bis-24).  Consistent with that, the opening paragraph
>> of Section 3.4.1 of rfc5322bis-08 forbids use of comments or
>> FWS around the "@" and the note that follows reinforces the
>> message. That prohibition is a "SHOULD NOT"; I wonder if it
>> would improve interoperability if either we made it "MUST
>> NOT" or if the text explained what would justify an exception.
> 
> So, referring back to the parsing explanation I gave above:
> 5322bis and its precursors assume that as a MUA, you can jam
> all sorts of stuff into an email address header, like the
> display name or other textual information in the form of
> comments (for whatever nefarious or wonderful purposes you can
> come up with), but the process of taking whatever is there and
> forming an envelope to pass to an MTA (i.e., the MSA process)
> involves parsing out the address itself and handing it off in
> the correct form. So, beyond the comments within the addr-spec
> in the examples above, 5321bis does not allow a display name
> to appear before the angle brackets, and also requires the
> angle brackets. Even if those comments weren't there, you'd
> have some parsing and some reassembling to do in order to get
> the address into the correct form.

Yes, and I think that is consistent with my comments above
(although probably not with a reasonable reading of my earlier
message).
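
For what it is worth, the step of parsing out the address and
handing it off in envelope form can be sketched as follows.  This
is a minimal Python illustration, not a conforming parser: the
names strip_comments and envelope_path are made up for the
example, and it assumes a single address with an unquoted local
part, so removing all white space from the addr-spec is safe.

```python
def strip_comments(text):
    """Remove (possibly nested) parenthesized comments from text."""
    out, depth, i = [], 0, 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and depth:
            i += 2  # drop a quoted-pair inside a comment
            continue
        if ch == "(":
            depth += 1
        elif ch == ")" and depth:
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out)


def envelope_path(header_value):
    """Reduce a header address to the <local-part@domain> envelope form."""
    bare = strip_comments(header_value)
    start, end = bare.find("<"), bare.rfind(">")
    if start != -1 and end > start:
        bare = bare[start + 1:end]  # drop display name and angle brackets
    addr_spec = "".join(bare.split())  # drop any folding white space
    return f"<{addr_spec}>"


print(envelope_path("Chris Jones <c@(Chris's host.)public.example>"))
print(envelope_path("Chris Jones <c@(in the past.)public.example(.host of Chris)>"))
```

Both calls reduce to the same addr-spec, which is the point made
in the quoted text: the comments and display name never need to
reach the MTA.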

> The SHOULD NOT in there is to warn folks about putting
> comments around the "@" because historically some
> implementations have screwed up the "parse and reassemble"
> step if they are there, but there might be reasons to still
> engage in such behavior. I'm still not convinced it ought to
> be a MUST NOT.

And that "historically" position was the one I was taking.  I'm
not convinced that MUST NOT is justified either.  However, I
think the balance between the two (or, more likely, SHOULD NOT
with an even stronger and more explicit warning) depends on whether
we can identify any necessary use cases for the construction.  As
of this instant (I reserve the option of changing my mind), I
think 5322bis should be left unchanged.  One of the roles of
the A/S, IMO, should be to elaborate on cases where 5322bis
(and 5321bis and maybe related specs) are very permissive to
allow for odd cases, explain, and say "you really should not
encourage that" or something stronger.  If there is consensus
that this particular probably dead horse needs further kicking,
that would be how and where to do it and the proposed paragraph
below and yours above might be a starting point.

>> Even if we decided to incorporate examples like that, I don't
>> know whether more text in rfc5322bis would be appropriate or
>> whether the better answer would be a paragraph in the A/S that
>> might look something like:
>> 
>>     "As noted in 3.4.1 of rfc5322bis-08 the syntax given
>>     there for the domain portion of the address is very
>>     permissive (even more so when obs- forms are considered).
>>     It would permit, for example, forms like [...].
>>     However, as it also notes, rfc5321bis and the basic
>>     Internet domain name specifications are far more
>>     restrictive.  Users and systems that expect email to
>>     actually be moved between hosts on the Internet without
>>     errors or confusion are advised to allow only the domain
>>     part of an address that conforms to those other
>>     specifications or, when appropriate, to the SMTPUTF8
>>     specifications."
> 
> Perhaps including something akin to the previous paragraph I
> typed would also help.

Yes, see above.
 
>> And that might justify a MUST even in rfc5322bis, because of
>> that explanation, does not.
> 
> Nor do I think it needs to.

In case the above is not clear (and even I don't know any more
exactly what my botched sentence above was intended to say), I
am far from convinced that any change from SHOULD NOT to MUST
NOT in 5322bis is justified or needed.  I'm not even convinced
that additional explanation in the A/S is really needed and/or
worth the effort.   

So I think we agree.  

best,
  john