[idn] Re: character tables

John C Klensin <klensin@jck.com> Mon, 28 February 2005 03:03 UTC

Received: from psg.com (mailnull@psg.com [147.28.0.62]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id WAA00985 for <idn-archive@lists.ietf.org>; Sun, 27 Feb 2005 22:03:39 -0500 (EST)
Received: from majordom by psg.com with local (Exim 4.44 (FreeBSD)) id 1D5b7X-000EHy-FX for idn-data@psg.com; Mon, 28 Feb 2005 02:58:55 +0000
Received: from [209.187.148.211] (helo=bs.jck.com) by psg.com with esmtp (Exim 4.44 (FreeBSD)) id 1D5b7V-000EHZ-Cp for idn@ops.ietf.org; Mon, 28 Feb 2005 02:58:53 +0000
Received: from [209.187.148.215] (helo=scan.jck.com) by bs.jck.com with esmtp (Exim 4.34) id 1D5b7V-000IgY-1j; Sun, 27 Feb 2005 21:58:53 -0500
Date: Sun, 27 Feb 2005 21:58:52 -0500
From: John C Klensin <klensin@jck.com>
To: Erik van der Poel <erik@vanderpoel.org>
cc: idn@ops.ietf.org
Subject: [idn] Re: character tables
Message-ID: <45781B7428C6AA07C3B283BD@scan.jck.com>
In-Reply-To: <42227EBF.9040703@vanderpoel.org>
References: <421B8484.3070802@vanderpoel.org> <20050223072837.GA21463~@nicemice.net> <D872CCF059514053ECF8A198@scan.jck.com> <421D8411.9030006@vanderpoel.org> <p06210208be4390618c81@[192.168.0.101]> <421E0D0C.2000309@vanderpoel.org> <p06210202be43c3888991@[192.168.0.101]> <E07CE813AD23B2D95DA0C740@scan.jck.com> <421E30F2.1040408@vanderpoel.org> <0E7F74C71945B923C52211F3@scan.jck.com> <421EA0C9.1010500@vanderpoel.org> <00a401c51af3$7863aae0$030aa8c0@DEWELL> <A574CA1BE87BFDA3C2A1AC0E@scan.jck.com> <421FA55B.9000308@vanderpoel.org> <421FCBD7.8000805@vanderpoel.org> <42227EBF.9040703@vanderpoel.org>
X-Mailer: Mulberry/3.1.6 (Win32)
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
X-Spam-Checker-Version: SpamAssassin 3.0.1 (2004-10-22) on psg.com
X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.0.1
Sender: owner-idn@ops.ietf.org
Precedence: bulk


--On Sunday, 27 February, 2005 18:15 -0800 Erik van der Poel
<erik@vanderpoel.org> wrote:

>...
> As I indicate at nameprep.org, I found some character tables
> at the IANA site, but I found even more at the GNU libidn
> site. One of the first things to do is to agree on a single
> machine-readable format. The tables do not all use the same
> format yet, it seems. Then we would also need to have the
> latest and most official tables from the registries themselves
> (instead of possibly out of date IANA tables and possibly
> embellished unofficial GNU libidn tables).

Erik,

I've been mildly resisting standard-format,
machine-interpretable tables at IANA for a few reasons. The two
most important ones are:

	(i) ICANN is still assuming that this is a registry
	issue.  As such, if someone else starts guessing at what
	a registry is doing, we may get into trouble, especially
	since the tables may not show all of the relevant
	registry rules and restrictions.
	
	(ii) We've got at least two models for processing a
	proposed IDN.  One compares the proposed label against a
	list of characters for the selected language as
	maintained by that registry and, if it passes, registers
	it if it isn't already taken.  The other involves the
	JET "variant" model, or some relative of it, to
	determine what labels, or sets of labels, are permitted.
	The first plan requires a simple list of characters; the
	second requires a three (or two, or four) column table.
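
The difference between the two models can be sketched roughly as
follows. This is only an illustration under stated assumptions: the
character set, the variant rows, and all function names are
hypothetical and are not taken from any registry's table, and the
variant handling is a simplified relative of the JET approach, not
a faithful implementation of it.

```python
from itertools import product

# Hypothetical permitted-character list for one registry and "language".
PERMITTED = frozenset("abcdefghijklmnopqrstuvwxyz0123456789-") | {
    "\u00e4", "\u00f6", "\u00fc",
}


def check_against_list(label, permitted, registered):
    """Model 1: every character must be on the list, label must be free."""
    if not label:
        return False
    if any(ch not in permitted for ch in label):
        return False
    return label not in registered


# Hypothetical three-column variant table:
# valid character -> (preferred variants, other variants).
VARIANT_TABLE = {
    "\u4e2d": (("\u4e2d",), ()),
    "\u56fd": (("\u56fd",), ("\u570b",)),
    "\u570b": (("\u56fd",), ("\u570b",)),
}


def variant_labels(label, table):
    """Return all labels reachable by variant substitution, or None."""
    choices = []
    for ch in label:
        row = table.get(ch)
        if row is None:
            return None  # character is not valid for this table
        preferred, others = row
        choices.append({ch, *preferred, *others})
    return {"".join(combo) for combo in product(*choices)}


def check_with_variants(label, table, registered):
    """Model 2: the label and all of its variants must be free."""
    labels = variant_labels(label, table)
    if labels is None:
        return False
    return labels.isdisjoint(registered)
```

The first check needs only a set of characters. The second needs
the extra columns, because a request for one label can conflict
with an existing registration of a different label in the same
variant set.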

Also please note that the IANA tables make no attempt to be
authoritative for any given language.  They are just
documentation of what characters a given registry permits to be
associated with a given "language" for their registry.  To have
three different, and incompatible, tables --associated with
three different registries-- for "the same language" is not only
possible, but likely.
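
One way to picture this: a table is identified by the pair
(registry, language), not by the language alone. A minimal sketch,
in which the registry names, the language tag, and the character
sets are all invented for illustration:

```python
ASCII_LDH = frozenset("abcdefghijklmnopqrstuvwxyz0123456789-")

# Hypothetical tables: three registries, one "language", three answers.
TABLES = {
    ("registry-a.example", "de"): ASCII_LDH | {"\u00e4", "\u00f6", "\u00fc"},
    ("registry-b.example", "de"): ASCII_LDH | {"\u00e4", "\u00f6", "\u00fc", "\u00e9"},
    ("registry-c.example", "de"): ASCII_LDH,
}


def is_permitted(registry, language, label):
    """Check a label against the table of one specific registry."""
    table = TABLES.get((registry, language))
    if table is None:
        return False  # this registry documents no table for the language
    return bool(label) and all(ch in table for ch in label)
```

Under these assumed tables the same label can pass at one
registry and fail at the other two, even though all three tables
carry the same language tag.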

     john