Re: [idn] Re: Unicode categories

"Martin v. Löwis" <martin@v.loewis.de> Sat, 12 March 2005 18:18 UTC

To: John C Klensin <klensin@jck.com>
CC: Erik van der Poel <erik@vanderpoel.org>, idn@ops.ietf.org, Kenneth Whistler <kenw@sybase.com>

John C Klensin wrote:
> From the standpoint of the IETF, or anyone else worried about a
> piece of protocol that must support many applications, the
> problem is a little different.  Some of the recent developments
> in automatic updating tools notwithstanding, IDNA (and its
> supporting tables) are designed to be embedded in and used from
> clients.  Many of those clients, and the associated operating
> systems, have been historically updated only when the machine in
> which they run is replaced.   That argues for an extremely
> conservative view of protocol design and compatibility, with
> very high thresholds for justifying incompatible changes of any
> sort.  From that viewpoint, the differences between 0.01%
> changes and 5% changes is like measures of being partially
> pregnant: perhaps helpful in some types of risk assessment, but
> less so in making the next design decision.  

While the facts are all true (clients are updated only rarely, and
protocol stability is important), I somewhat disagree with the
conclusion. Design decisions need to take all these things into
account, but on a factual basis, not a moral one. E.g. if a protocol
change is known to potentially break existing clients, but the
number of clients actually broken is also known to be small,
the protocol change might still be acceptable. Likewise if the
impact of the "breakage" would be small.

In the context of IDNA, I would conclude that upgrading to a
newer Unicode version in IDNA would do no significant harm, even
if it is, strictly speaking, an incompatible protocol change.
Existing clients must be considered, but in doing so, the effects
of a proposed change to such clients must also be considered -
not in a theoretical way, but by trying the changes on a real
existing client. For IDNA, it seems likely that the existing
clients would not change in any user-visible way, and that
the behaviour change in artificial cases should be considered
acceptable.

That said, I also believe that upgrading to a newer Unicode
version would do little good. Additional characters are likely
of little use to the broad masses, as font support etc. is still
lacking. If certain new characters are of high interest to IDNA
users, I expect registrars to weaken the "AllowUnassigned"
setting of "false" to, say, "AllowUnassignedAsPerLatestUnicodeSpec" -
independent of what any RFC says.
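
To illustrate the AllowUnassigned distinction, here is a minimal Python
sketch. It assumes the standard library's stringprep module, whose table
A.1 check is based on the Unicode 3.2 database that RFC 3454 uses. The
helper names unassigned_in_unicode_3_2 and label_permitted are
hypothetical and not part of any IDNA implementation; the policy check is
a simplification, not a full ToASCII.

```python
import stringprep
import unicodedata


def unassigned_in_unicode_3_2(label):
    """Return the characters of `label` that fall in RFC 3454 table A.1,
    i.e. code points that were unassigned in Unicode 3.2."""
    return [ch for ch in label if stringprep.in_table_a1(ch)]


def label_permitted(label, allow_unassigned):
    """Hypothetical policy check: reject labels containing code points
    unassigned in Unicode 3.2 unless allow_unassigned is true."""
    return allow_unassigned or not unassigned_in_unicode_3_2(label)


if __name__ == "__main__":
    # U+0221 was assigned after Unicode 3.2, so table A.1 still lists it.
    label = "d\u0221"
    for ch in unassigned_in_unicode_3_2(label):
        # The name lookup uses the interpreter's current Unicode database.
        print("U+%04X" % ord(ch), unicodedata.name(ch, "<no name available>"))
    print("registration (AllowUnassigned=false):", label_permitted(label, False))
    print("query (AllowUnassigned=true):", label_permitted(label, True))
```

A registrar that relaxes the setting would, in effect, swap the Unicode
3.2 table for one derived from a newer Unicode version, while clients
built against the old table would still treat such code points as
unassigned.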

Regards,
Martin