Re: [EAI] UTF-8 in Message-IDs

"Charles Lindsey" <chl@clerew.man.ac.uk> Mon, 29 August 2011 13:24 UTC

Date: Mon, 29 Aug 2011 14:26:16 +0100
To: IMA <ima@ietf.org>
From: Charles Lindsey <chl@clerew.man.ac.uk>
Content-Type: text/plain; format="flowed"; delsp="yes"; charset="iso-8859-1"
MIME-Version: 1.0
References: <C31E821E731AC23ED7EE191F@PST.JCK.COM> <20110815175214.4833.qmail@joyce.lan> <CAHhFybo9PxBHehzTxkwj+-bCNvGcc0A66vnhbXOJi3AssMOPdw@mail.gmail.com> <01O4VRGRKZH800VHKR@mauve.mrochek.com> <4E4A8316.50401@dcrocker.net> <7B2F664A7C469046D955E05E@PST.JCK.COM>
Content-Transfer-Encoding: 8bit
Message-ID: <op.v0y8x2tq6hl8nm@clerew.man.ac.uk>
In-Reply-To: <7B2F664A7C469046D955E05E@PST.JCK.COM>
User-Agent: Opera Mail/9.25 (SunOS)
Subject: Re: [EAI] UTF-8 in Message-IDs

On Tue, 23 Aug 2011 15:42:40 +0100, John C Klensin <klensin@jck.com> wrote:

> To strengthen the distinction, one would actually need a
> lot more than a "truly pure UTF-8 environment".  Identifier
> comparison in a non-ASCII environment is actually quite
> difficult in the general case, requiring a lot of additional
> comparison rules and/or restrictions (see RFC 6055 for a partial
> discussion).  While we take it for granted these days, that is
> even true in the ASCII environment because we have to specify
> choices between case-dependent and case-independent comparison.
> But, in the DNS case, even if we substituted "pure UTF-8" for
> the IETF-invented Punycode encoding, we would still need fairly
> extensive rules about permitted characters, normalization, etc.,
> to make things work -- and even more extensive rules if any sort
> of mapping (or non-exact comparison) was permitted.

All of which is exactly why I oppose allowing UTF-8 in Message-IDs (which
get treated like identifiers in many situations). It would be fine if the
job were done properly by specifying the normalizations etc., but not
otherwise.
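
To illustrate the concern, here is a minimal sketch. It is not part of any proposal; the Message-ID values and the helper names are made up for the example, and NFC is only one possible choice of normalization. It shows two Message-IDs that render identically but compare unequal byte for byte, so software that treats them as opaque identifiers would fail to match them unless a normalization rule is specified.

```python
import unicodedata


def same_bytes(a: str, b: str) -> bool:
    """Compare two identifiers the way an octet-by-octet matcher would."""
    return a.encode("utf-8") == b.encode("utf-8")


def same_after_nfc(a: str, b: str) -> bool:
    """Compare after applying Unicode NFC (an assumed rule, for illustration)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Hypothetical Message-IDs: the first uses precomposed U+00E9,
# the second uses "e" followed by combining acute accent U+0301.
composed = "<caf\u00e9.1234@example.com>"
decomposed = "<cafe\u0301.1234@example.com>"

if __name__ == "__main__":
    print("byte-identical:     ", same_bytes(composed, decomposed))
    print("equal under NFC:    ", same_after_nfc(composed, decomposed))
    # The same question already arises in ASCII with letter case.
    print("case-folded (ASCII):", "<ABC@example.com>".casefold() == "<abc@example.com>".casefold())
```

Unless every agent that generates, stores, or threads on Message-IDs applies the same rule, a reply whose In-Reply-To carries the decomposed form would not be matched to the original.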

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131                       Web: http://www.cs.man.ac.uk/~chl
Email: chl@clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5