Re: [EAI] UTF-8 in Message-IDs

John C Klensin <klensin@jck.com> Wed, 05 October 2011 21:10 UTC

Date: Wed, 05 Oct 2011 17:13:59 -0400
From: John C Klensin <klensin@jck.com>
To: Frank Ellermann <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>
Message-ID: <A48F698A08B601A60F5A9719@PST.JCK.COM>
In-Reply-To: <CAHhFybrr0jWaSMxHwnJ4NuKFBJRSzw423aYHnEmta8M+1=2+1Q@mail.gmail.com>
References: <20111004014257.8027.qmail@joyce.lan> <op.v2viju2m6hl8nm@clerew.man.ac.uk> <34E8E4E5F1CBE344994E3F8B@PST.JCK.COM> <CAHhFybrr0jWaSMxHwnJ4NuKFBJRSzw423aYHnEmta8M+1=2+1Q@mail.gmail.com>
X-Mailer: Mulberry/4.0.8 (Win32)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
Cc: IMA <ima@ietf.org>
Subject: Re: [EAI] UTF-8 in Message-IDs

--On Wednesday, October 05, 2011 22:28 +0200 Frank Ellermann
<hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com> wrote:

First, thanks to John Levine for the response to your first
question.  I have nothing to add to his response.  These are two
systems that use very similar formats but that communicate
through gateways, not one system.

>...
>> Speaking personally, I would be much more sympathetic to the
>> position you are taking had I not seen all sorts of chaos
>> caused over the years by news systems believing that netnews
>> Message-IDs could be compared, uncritically, to Internet email
>> Message-IDs and actions taken on the results.  But those
>> disruptions have occurred and presumably continue to occur.
>> People live with them, and I don't see the disruptions this
>> change to Message-IDs is likely to cause being any worse.
>>  YMMV.
> 
> A message-ID is a message-ID:  Any chaos you have seen
> presumably involved naive or broken gateways from NetNews to
> e-mail, or more obscure sources, e.g., Fido (FTN) echomail.

A Message-ID is a Message-ID until something (usually a gateway)
decides it needs or wants to improve on it.  Those decisions can
be due to misunderstanding or stupidity; they can be made because
other things change and the gateway believes that makes a new
message; or they can be made because the gateway wants to be able
to guarantee a unique identifier in the other environment.
Sometimes changes are made to protect systems further downstream
that are believed to be fragile; sometimes the gateways
themselves are fragile and change things to recover from
breakage.  The changes are usually made with the best of
intentions; whether they are naive, broken, or really useful and
creative depends on the beholder and on the purpose to which
they are supposed to be put.  I would not presume to try
to categorize all of the transformations I've seen, but some of
them probably don't fall under your description above.
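
To make one of those cases concrete, here is a minimal sketch of a
gateway that mints its own identifier so it can guarantee uniqueness
in the other environment.  The function name, the hashing policy, and
the host "gateway.example" are all hypothetical, chosen only for
illustration; no real gateway is being described.

```python
import hashlib


def gateway_rewrite(message_id: str, gateway_host: str) -> str:
    """Hypothetical policy: derive a fresh ID scoped to the gateway's own host."""
    digest = hashlib.sha1(message_id.encode("ascii")).hexdigest()[:16]
    return f"<{digest}@{gateway_host}>"


original = "<A48F698A08B601A60F5A9719@PST.JCK.COM>"
rewritten = gateway_rewrite(original, "gateway.example")

# A reply written on the far side of the gateway cites the rewritten ID.
in_reply_to = rewritten

# A system on the near side that compares IDs uncritically finds no match,
# even though both strings name the same message.
matches_original = in_reply_to == original
print(original)
print(rewritten)
print("naive match:", matches_original)
```

Nothing in the rewritten string says it was derived from the
original, so a system on either side that compares the two has no
way to tell that they identify the same message.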

This isn't trivial.  Remember that simple transcoding of header
fields from net-ASCII into, e.g., EBCDIC or what we call
ISO-2022-JP means that dumb octet-string comparisons of
Message-IDs (and other things) will fail.  There is a rather
long history of those issues.
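
A small sketch of the EBCDIC case, assuming Python's "cp037" codec
as a stand-in for whichever EBCDIC code page a given system actually
uses (that choice is an assumption, as is the use of this message's
own Message-ID as sample data):

```python
message_id = "<A48F698A08B601A60F5A9719@PST.JCK.COM>"

# The same characters as octets in net-ASCII and in one EBCDIC code page.
ascii_octets = message_id.encode("ascii")
ebcdic_octets = message_id.encode("cp037")  # assumed code page

# A dumb octet-string comparison sees two different identifiers.
octets_equal = ascii_octets == ebcdic_octets

# Comparing after decoding each side with its own charset recovers the match.
text_equal = ascii_octets.decode("ascii") == ebcdic_octets.decode("cp037")

print("octets equal:", octets_equal)
print("text equal:  ", text_equal)
```

The comparison only works when the comparer knows which charset
each side is in, which is exactly the information a dumb comparison
lacks.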

If someone asked my opinion about whether a gateway should mess
with Message-IDs, I'd say "no, unless it is absolutely required
by the systems on the other end".  But lots of people don't ask
and even fewer listen.  Neither RFC 5321/5322 nor RFC 5536
offers any reciprocal guarantees.

> EAI won't make that worse.

If there are fragile systems out there -- and there almost
certainly still are -- how can you possibly know that?

>  Sadly SASL killed it, otherwise
> CRAM-MD5 might be now in theory ready for "EAI-IDs".  In
> practice I'd guess that CRAM-MD5 implementations never bother
> to check Message-ID syntax details.

In spite of being a co-author of CRAM-MD5, I don't understand this
comment at all.  Certainly the CRAM-MD5 spec, regardless of its
other strengths and weaknesses, doesn't even mention Message-IDs
(or, for that matter, any signature over headers or message
content).

    john

> 
> -Frank