Re: [ietf-smtp] Should we update an RFC if people refuse to implement parts of it ?

John C Klensin <john-ietf@jck.com> Mon, 31 May 2021 01:19 UTC

Date: Sun, 30 May 2021 21:19:09 -0400
From: John C Klensin <john-ietf@jck.com>
To: Ned Freed <ned.freed@mrochek.com>, Viktor Dukhovni <ietf-dane@dukhovni.org>
cc: ietf-smtp@ietf.org
Message-ID: <E23639ADA7487360C9B5A93C@PSB>
In-Reply-To: <01RZNI90M6SS0085YQ@mauve.mrochek.com>
References: <20210525182946.079748B872C@ary.qy> <EFDA46E00EFF0E48802D046A@PSB> <2021052700585304660213@cnnic.cn> <YK7E1dBKneP8B8Ib@straasha.imrryr.org> <01RZNI90M6SS0085YQ@mauve.mrochek.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-smtp/y8Xk4cAwAFMg_w_2jQMRh8t6kxs>

Gentlemen,

An observation and a question...

--On Sunday, May 30, 2021 15:49 -0700 Ned Freed
<ned.freed@mrochek.com> wrote:

>> On Thu, May 27, 2021 at 12:59:53AM +0800, Jiankang Yao wrote:
> 
>> > > I recognize the distinction while also realizing that, as
>> > > you know, things often leak.  My recollection (if
>> > > Jiankang is following this, his memory is probably better
>> > > than mine) is that the WG explicitly discussed the issue
>> > > and concluded that U-labels were a better idea than
>> > > A-labels.
>> > 
>> > Yes, I think so. The EAI WG discussed this issue.  Section
>> > 3.7.3. Trace Information  encourages to use UTF-8 form.
>> > One reason I think is that trace information will be put
>> > into header for human reading.
> 
>> But, and this is crucial, the human reading the trace
>> information is rarely either the sender or the ultimate
>> recipient of the message, who are generally presented with a
>> subset of the headers fields ("To", "Cc", "Date", "Subject"
>> ...).  Examination of trace headers is far more likely to a
>> task for a mail system administrator.  They're used in abuse
>> reports and the like, and a uniform representation is more
>> important than familiarity to the community of readers of
>> some given language.
> 
> And what the admin usually wants to do is either a comparison
> or check the domain with the DNS in some way. So an A-label
> can be more convenient.
> 
> And in the unlikely event an admin needs to translate the
> A-label to a U-label, there are an abundance of tools that I
> can use to do it.

I may live in a different world, or at least with different
users, than the two of you, but I quite often see users pointed
to trace fields in order to make estimates of the validity of a
message.  And that brings me close to the reasoning I think the
WG used more generally: that it was better to stick to "native
character" forms (in this case, U-labels), especially when there
was a possibility of a full mailbox name with a UTF-8 local part
and a domain part containing IDN labels.  That is obviously not
the case for clauses of "Received:" fields other than "for",
but the WG may have extrapolated from that principle to this
particular case (I don't remember how much detailed attention
the particular case got).  The other argument for
U-labels is that, precisely as Ned points out, if an admin needs
to translate an A-label, there are many tools and admins are
likely to know where to find them.  By contrast, if a user is
presented with an A-label, the reaction is at least as likely to
be "WT<rude word>" as "I have a tool handy that does that".
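
For readers who have not used such a tool, here is a minimal
sketch of the conversion in Python.  It assumes the standard
library's "idna" codec, which implements the older IDNA2003
rules rather than IDNA2008, so treat it as an illustration and
not as a reference implementation.  The function names are
mine, not taken from any specification.

```python
def to_u_label(domain: str) -> str:
    """Convert a domain containing A-labels (xn--...) to U-label form.

    Uses Python's built-in "idna" codec (IDNA2003), applied per label.
    """
    return domain.encode("ascii").decode("idna")


def to_a_label(domain: str) -> str:
    """Convert a domain containing U-labels to its ASCII A-label form."""
    return domain.encode("idna").decode("ascii")
```

An admin could run to_u_label() on a string such as the
"xn--b1adqpd3ao5c.org" example quoted below to see the native
form, or to_a_label() on a U-label before a DNS lookup.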

A different piece of the same story is that, if I had much more
confidence that authors of MUAs and other mail access tools
would think carefully about these subtle issues and get them
right, I'd say that, given the dual relationship between
A-labels and U-labels, it makes no difference which of the two
is used on the wire -- it is all a presentation issue.  On the
other hand, my experience of the last decade or so has given me
no such confidence.

>...
>> Sure, if the text is Russian, some Latin-based alphabet or at
>> a stretch Greek, I can more easily distinguish one U-label
>> string from another than an A-label form like
>> "xn--b1adqpd3ao5c.org", ... and yet I'd much rather see
>> A-labels in trace headers than Arabic or Chinese.  The text
>> in 3.7.3 is not something I'm inclined to implement.

And you might feel differently if, instead of reading primarily
Greek, Latin, and Cyrillic, your primary familiarity was with
Chinese or Devanagari or Mongolian or Thai or Arabic or anything
else that is very different from those three very closely
related scripts.  And your ability to distinguish U-labels in
the three scripts you cite may be severely impaired if someone
is deliberately trying to deceive, given the number of
characters that are reused among the three scripts and are, in
many type styles, almost indistinguishable.
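
To make the deception risk concrete, here is a rough sketch of
a check for labels that mix scripts.  It is a crude heuristic
of my own: it guesses the script from the first word of each
character's Unicode name, which is an assumption that holds for
Latin, Greek, and Cyrillic letters but is not a proper Unicode
script lookup.  It also cannot catch a label written entirely
in one look-alike script.

```python
import unicodedata


def scripts_in_label(label: str) -> set[str]:
    """Return the (approximate) scripts of the letters in a label.

    The script is guessed from the first word of the Unicode
    character name, e.g. "CYRILLIC SMALL LETTER IE" -> "CYRILLIC".
    Digits and hyphens are ignored.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def looks_mixed(label: str) -> bool:
    """True if the label's letters appear to come from several scripts."""
    return len(scripts_in_label(label)) > 1
```

For instance, looks_mixed("\u0435xample") is true because the
first letter is a Cyrillic "ie" that most fonts draw exactly
like a Latin "e".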

And, yes, the conclusion from the above is that sometimes
A-labels are better and sometimes U-labels are better and that
sometimes it depends on the reader.

> Actually, it depends on the A-labels. Because of the
> compression involved A-labels often emphasize small
> differences that may be difficult to see in a crap monospaced
> font for a non-Latin script, even one you're familiar with.

Exactly.
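
A small sketch of Ned's point, again assuming Python's
built-in "idna" codec (IDNA2003) and a helper name of my own
choosing: two labels that render almost identically produce
A-labels that differ at a glance.

```python
def a_label(label: str) -> str:
    """Return the ASCII (A-label) form of a single label."""
    return label.encode("idna").decode("ascii")


latin = "example"        # all Latin letters
spoof = "\u0435xample"   # first letter is CYRILLIC SMALL LETTER IE

# The all-ASCII label passes through unchanged, while the
# look-alike gains an "xn--" prefix and a Punycode suffix.
latin_a = a_label(latin)
spoof_a = a_label(spoof)
```

Comparing latin_a with spoof_a shows the difference that the
U-label rendering hides.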

>> Specifying the use of U-labels in the "from" and "by" clauses
>> rather looks like a bad judgement call, rough consensus or
>> not.  Until the Protocol Police show up, I'm sticking with
>> A-labels. :-(
> 
> Me too.

IMO, wfm.  The text says "SHOULD", the two are easily
convertible by anyone who knows where to find the tools (and
that will far more often be admins than end-users, even though
I'm concerned about the latter too) and, for "from" at least,
simply copying the EHLO field (which necessarily uses A-labels
for any IDN label fields) seems to me to be a fairly strong
argument.  But that isn't the question in the subject line that
started this thread: should we open the document and change it,
perhaps to say "SHOULD...A-label"?  Well, I'm not convinced that
it is worth the trouble, especially given all of the other issues we
worth the trouble, especially given all of the other issues we
are having getting SMTPUTF8 universally understood and deployed.

On the other hand, and here comes the question: There are
proposals floating around that would define new header fields
that would reflect, include, or depend on, different sorts of
forward-pointing and reverse-pointing addresses.  Should we be
taking the position that none of those should move forward
unless they explicitly address what should be done when those
fields are not traditional ASCII ones?

best,
  john