Re: [EAI] UTF-8 in Message-IDs

John C Klensin <klensin@jck.com> Tue, 04 October 2011 02:31 UTC

Date: Mon, 03 Oct 2011 22:34:35 -0400
From: John C Klensin <klensin@jck.com>
To: =?UTF-8?Q?Julien_=C3=89LIE?= <julien@trigofacile.com>, ima@ietf.org
Message-ID: <725E50D595CDB0AD7E4687A0@PST.JCK.COM>
In-Reply-To: <4E8A2E61.5060308@trigofacile.com>
References: <CAHhFybo47--0YjCRcvSO4asoV_R89+ULDB3tyij+ba=O_6gKsQ@mail.gmail.com> <01O4T11O8X4M00VHKR@mauve.mrochek.com> <op.vz8z3v0a6hl8nm@clerew.man.ac.uk> <01O4VFNKDGEE00VHKR@mauve.mrochek.com> <op.v0cswsg76hl8nm@clerew.man.ac.uk> <4E8A2E61.5060308@trigofacile.com>
X-Mailer: Mulberry/4.0.8 (Win32)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
Cc: Charles Lindsey <chl@clerew.man.ac.uk>
Subject: Re: [EAI] UTF-8 in Message-IDs

--On Monday, October 03, 2011 23:51 +0200 Julien ÉLIE
<julien@trigofacile.com> wrote:

> Hi Charles,
> 
>> The Message-ID is crucial to the correct operation of the
>> Netnews transport, and the effect of using utf-8 in it has
>> not been investigated (but should be).  It is expected that
>> some ancient transport agents might not like it but even
>> that might not be a show stopper, since Netnews is very good
>> at navigating around such obstacles.
> 
> For what it is worth, I do not think there would be any major
> issue with INN (a news server) because message-IDs are
> internally "hashed" after having parsed them.  They can be
> retrievable.
> 
> The problem is the transition period.  INN currently rejects
> any message whose Message-ID: header field does not comply
> with RFC 5536. It also does not want to search for such
> message-IDs.
>...
 
Julien,

The edge case of a message with a non-ASCII Message-ID but no
other non-ASCII header fields aside, how would the news servers
you and Charles are familiar with respond to non-ASCII addresses
or header fields more generally?

Note also that, to some extent, this is not a news reader
problem at all but a gateway issue between Internet mail and
news.  If articles originate in news and move to mail, there is
no issue because they will be ASCII.  So the question is what
the gateways in the "to news" direction do.  If they reject
messages with non-ASCII headers (Message-IDs or otherwise), the
readers won't see such messages... and the problem is really no
different from an attempt to send an extended mail message to a
server that is not UTF8SMTP-capable.

   john
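
The gateway behavior discussed in the message above (a mail-to-news
gateway that refuses messages carrying non-ASCII header fields,
Message-ID or otherwise) could look roughly like the sketch below. It
is only an illustration under stated assumptions: the function names
non_ascii_header_fields and gateway_accepts are hypothetical, no real
gateway or news server is known to work this way, and the check looks
only at raw bytes above 0x7F in the header block. RFC 2047 encoded
words such as =?UTF-8?Q?...?= are pure ASCII on the wire and therefore
pass.

```python
def non_ascii_header_fields(raw_message: bytes) -> list[str]:
    """Return the names of header fields that contain raw non-ASCII bytes.

    Hypothetical helper for illustration; not taken from any real gateway.
    """
    # The header block ends at the first empty line; accept CRLF or bare LF.
    normalized = raw_message.replace(b"\r\n", b"\n")
    header_block = normalized.split(b"\n\n", 1)[0]

    offending: list[str] = []
    current_name = None
    for line in header_block.split(b"\n"):
        is_continuation = line[:1] in (b" ", b"\t") and current_name is not None
        if not is_continuation:
            # A new field starts here; remember its name for folded lines.
            name, _, _ = line.partition(b":")
            current_name = name.decode("ascii", errors="replace").strip()
        # Any byte above 0x7F means the field is not plain ASCII.
        if any(byte > 0x7F for byte in line) and current_name not in offending:
            offending.append(current_name)
    return offending


def gateway_accepts(raw_message: bytes) -> bool:
    """Assumed policy: forward to news only if every header field is ASCII."""
    return not non_ascii_header_fields(raw_message)


if __name__ == "__main__":
    # <é@example.com> is a made-up Message-ID used only for this example.
    sample = "Message-ID: <é@example.com>\r\nSubject: test\r\n\r\nbody".encode("utf-8")
    print(non_ascii_header_fields(sample))
    print(gateway_accepts(sample))
```

Under such a policy the news side never sees the extended message, which
is the same outcome as offering it to a server that is not
UTF8SMTP-capable.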