Re: [idn] something a little lighter for the weekend

"Adam M. Costello" <idn.amc+0@nicemice.net.RemoveThisWord.cnri.reston.va.us> Mon, 28 February 2005 02:04 UTC

Received: from psg.com (mailnull@psg.com [147.28.0.62]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id VAA25633 for <idn-archive@lists.ietf.org>; Sun, 27 Feb 2005 21:04:50 -0500 (EST)
Received: from majordom by psg.com with local (Exim 4.44 (FreeBSD)) id 1D5a8V-0005KD-Ct for idn-data@psg.com; Mon, 28 Feb 2005 01:55:51 +0000
Received: from [128.32.132.165] (helo=nicemice.net) by psg.com with esmtp (Exim 4.44 (FreeBSD)) id 1D5a8R-0005Ji-6D for idn@ops.ietf.org; Mon, 28 Feb 2005 01:55:47 +0000
Received: from amc by nicemice.net with local (Exim 3.35 #1 (Debian)) id 1D5a8P-0006pI-00 for <idn@ops.ietf.org>; Sun, 27 Feb 2005 17:55:45 -0800
Date: Mon, 28 Feb 2005 01:55:45 +0000
From: "Adam M. Costello" <idn.amc+0@nicemice.net.RemoveThisWord.cnri.reston.va.us>
To: IETF idn working group <idn@ops.ietf.org>
Subject: Re: [idn] something a little lighter for the weekend
Message-ID: <20050228015545.GA23896~@nicemice.net>
Reply-To: IETF idn working group <idn@ops.ietf.org>
References: <p06210208be4390618c81@[192.168.0.101]> <421E0D0C.2000309@vanderpoel.org> <p06210202be43c3888991@[192.168.0.101]> <E07CE813AD23B2D95DA0C740@scan.jck.com> <421E30F2.1040408@vanderpoel.org> <0E7F74C71945B923C52211F3@scan.jck.com> <421EA0C9.1010500@vanderpoel.org> <00a401c51af3$7863aae0$030aa8c0@DEWELL> <20050226081913.GD14956~@nicemice.net> <4220D1C4.7000909@vanderpoel.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-2022-jp"
Content-Disposition: inline
In-Reply-To: <4222607A.80804@vanderpoel.org> <4220D1C4.7000909@vanderpoel.org>
User-Agent: Mutt/1.5.6+20040722i
X-Spam-Checker-Version: SpamAssassin 3.0.1 (2004-10-22) on psg.com
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.0.1
Sender: owner-idn@ops.ietf.org
Precedence: bulk

Erik van der Poel <erik@vanderpoel.org> wrote:

> Oh, this one's just priceless. I have to share it with you all:
> 
> http://e.netpia.com/
> 
> Move your mouse over the "Why NLIA" near the top, and then read the 
> words that appear.

That was extremely funny.

And then, after I read more about this company, somewhat ominous.  In a
nutshell, they're setting up an alternate root for DNS, so that people
unfamiliar with Latin characters won't have to deal with ASCII TLDs.
It's a nice goal, but the means are worrisome to say the least.

Maybe this is a sign that it's time to figure out a standard way to
support non-ASCII TLDs.

> Finally, regarding displaying ".com" in Chinese, there is currently
> no reason to display ".com" in ASCII.  This could easily be displayed
> in Chinese if the application developers were only willing to modify
> their programs to be more user-friendly.

I don't think it's that simple.  The purpose of domain names is to
serve as global identifiers.  If non-ASCII synonyms for TLDs were
left as a UI issue for each application to solve independently, two
different applications could choose different Thai spellings for .uk
(for example), and their users wouldn't be able to refer each other
to sites; the domain names wouldn't be fulfilling their purpose as
global identifiers.  Therefore, the spellings of all the TLDs in all
the scripts need to be standardized.  That's about 300 TLDs times about
50 scripts, potentially around 15,000 localized TLDs.  Such a table
probably shouldn't be hard-coded into every application.  It should be
kept in an online database, like... the DNS!

According to the Unicode standard, there are 52 scripts.  Currently,
all TLDs use the Latin script.  I suggest that every country be allowed
to register up to 51 additional TLDs, one per non-Latin script, in the
root zone.  Countries would choose abbreviations for themselves, which
would need to be ratified by some review process to make sure they were
reasonable, and not homographs of other TLDs.

A similar policy could exist for gTLDs, except that .com and .商
[that's my guess at the analogue of .com in the Han script] would not
necessarily be operated by the same registry; any accredited registry
could apply to operate a synonym for an existing ASCII gTLD in a script
that was not already in service, and the proposed new gTLD would be
checked for being a reasonable synonym, but would not have to redo the
arduous approval process that the original ASCII gTLD did.

To be fair to early registrants of IDNs, perhaps every new non-ASCII
gTLD should be required to initialize its zone with any names from
the corresponding ASCII gTLD zone that satisfy the (possibly more
restrictive) syntax rules of the new gTLD.  The names should be added
in order of seniority, in case the new gTLD has more restrictive
name-blocking rules that prevent two admissible names from coexisting.
The names in the new zone would inherit their owners and expiration
dates from the old zone.  After the initialization, the new zone would
be independent of the old zone, and people could opt to register/renew
names in one and not the other.
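
As a rough illustration of that initialization step, here is a minimal
Python sketch.  Everything in it is hypothetical: the Registration
record, the initialize_zone function, and the two rule callbacks
(is_admissible for the new gTLD's syntax rules, conflicts for its
name-blocking rules) are names made up for this example, not part of
any registry's actual software.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Registration:
    """Hypothetical record for one name in the old ASCII gTLD zone."""
    name: str          # the registered label
    owner: str
    registered: date   # used here as the measure of seniority
    expires: date


def initialize_zone(
    old_zone: Iterable[Registration],
    is_admissible: Callable[[str], bool],
    conflicts: Callable[[str, str], bool],
) -> List[Registration]:
    """Seed the new zone from the old one, oldest registrations first.

    is_admissible stands in for the new gTLD's syntax rules, and
    conflicts for its name-blocking rules; both are assumptions
    supplied by the caller.
    """
    new_zone: List[Registration] = []
    for reg in sorted(old_zone, key=lambda r: r.registered):
        if not is_admissible(reg.name):
            continue  # fails the new zone's syntax rules
        if any(conflicts(reg.name, kept.name) for kept in new_zone):
            continue  # blocked by a more senior name already added
        # Owner and expiration date carry over unchanged.
        new_zone.append(reg)
    return new_zone
```

Because the names are processed in order of registration date, when two
admissible names block each other the older one wins, and the copied
records keep their owners and expiration dates.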

Non-ASCII TLDs would be represented using IDNA, no different from labels
at any other level.
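
To make that concrete, here is a small Python sketch using the
standard library's "idna" codec, which implements the IDNA ToASCII
and ToUnicode operations from RFC 3490.  The helper names to_ace and
from_ace are made up for this example, and "example.商" is only an
illustrative name, not a real delegation.

```python
def to_ace(domain: str) -> str:
    """Convert each label of a domain name to its ASCII-compatible form."""
    return domain.encode("idna").decode("ascii")


def from_ace(domain: str) -> str:
    """Convert an ASCII-compatible domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    name = "example.商"   # hypothetical name under a hypothetical TLD
    ace = to_ace(name)
    # The TLD label gets the same "xn--" prefix as any other IDN label.
    print(ace)
    print(from_ace(ace))
```

The codec treats the rightmost label exactly like the others, so
nothing special is needed for a TLD; all-ASCII names such as
example.com pass through unchanged.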

AMC