Re: [idn] Re: dichotomies

"JFC (Jefsey) Morfin" <jefsey@jefsey.com> Mon, 28 February 2005 03:22 UTC

Message-Id: <6.1.2.0.2.20050228014024.02d4e920@mail.jefsey.com>
X-Sender: jefsey+jefsey.com@mail.jefsey.com
X-Mailer: QUALCOMM Windows Eudora Version 6.1.2.0
Date: Mon, 28 Feb 2005 04:18:51 +0100
To: Erik van der Poel <erik@vanderpoel.org>
From: "JFC (Jefsey) Morfin" <jefsey@jefsey.com>
Subject: Re: [idn] Re: dichotomies
Cc: idn@ops.ietf.org
In-Reply-To: <4222607A.80804@vanderpoel.org>
References: <D872CCF059514053ECF8A198@scan.jck.com> <421D8411.9030006@vanderpoel.org> <p06210208be4390618c81@[192.168.0.101]> <421E0D0C.2000309@vanderpoel.org> <p06210202be43c3888991@[192.168.0.101]> <E07CE813AD23B2D95DA0C740@scan.jck.com> <421E30F2.1040408@vanderpoel.org> <0E7F74C71945B923C52211F3@scan.jck.com> <421EA0C9.1010500@vanderpoel.org> <00a401c51af3$7863aae0$030aa8c0@DEWELL> <20050226081913.GD14956~@nicemice.net> <42221AB7.9070000@vanderpoel.org> <42221C34.2060505@vanderpoel.org> <6.1.2.0.2.20050227203118.02f22eb0@mail.jefsey.com> <4222607A.80804@vanderpoel.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format="flowed"
Sender: owner-idn@ops.ietf.org
Precedence: bulk

On 01:06 28/02/2005, Erik van der Poel said:
>JFC (Jefsey) Morfin wrote:
>>(I will correct Erik's "xn--s" proposed notation into "xs--" not to 
>>create havoc in punycode)
>
>You are right, of course. What was I thinking? Duh. Well, how about 
>"xs--n", to differentiate it from "xn--", and to still have the advantage 
>of a non-delimiter-like character before the rest of the string?

No need to have 4 and 5 characters. xs-- would be enough.

>>One of the reason why I disagreed with IDNA is it creates a possibly 
>>conflicting left-to-right hierarchy while the DNS hierarchy is right-to-left.
>
>I'm afraid I don't understand this.

The DNS hierarchy runs 1st level, 2nd level, etc. from right to left.

The scripting hierarchy introduced by the ACE prefix (on the left of the 
name) decides whether the label is ASCII or not. This is a hierarchy local 
to the label. But using Tables while permitting other versions, as you 
propose, creates a de facto lame hierarchy, with various Tables applying or 
not, all within the same URI.
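
To make the two directions concrete, here is a minimal Python sketch. It 
assumes only the "xn--" prefix defined by IDNA; the function names are 
hypothetical and chosen just for illustration. Zone levels are counted from 
the right of the name, while the ACE prefix is tested at the left edge of 
each individual label.

```python
ACE_PREFIX = "xn--"  # the single ACE prefix defined by IDNA


def levels_right_to_left(fqdn):
    """Return (level, label) pairs, level 1 being the TLD."""
    labels = fqdn.rstrip(".").split(".")
    return list(enumerate(reversed(labels), start=1))


def is_ace_label(label):
    """The ACE prefix is read at the left edge of one label only."""
    return label.lower().startswith(ACE_PREFIX)


def describe(fqdn):
    """Combine both views: zone level (right to left) and per-label ACE flag."""
    return [(level, label, is_ace_label(label))
            for level, label in levels_right_to_left(fqdn)]
```

For example, describe("www.xn--bcher-kva.example") lists "example" as level 
1, the ACE label as level 2 and "www" as level 3.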

>>This does not simplify understanding, management, security. Why not to 
>>just use DNS zones? I have not yet understood why it was opposed. IMHO 
>>the future of ML.ML names are in the form "name2.name.xx--chicom.com" 
>>where "xx-nn.com" will print as ".com" in Chinese and name, name2 etc. 
>>will all have to use codes from the Chinese Table of ".com".
>
>I think your proposal makes some sense. It is similar to my proposal in a 
>way -- recall my .jp example, with the rule that *all* labels would have 
>to use the same ACE prefix or pure ASCII.

ACE prefix: I suppose you mean the same Table.
Please understand that there are three layers:
- internationalization: the scripts, the structure, etc., which is what 
IDNA discusses. This is the basis, but it does not make a service.
- multilingualization: the support of languages. IDNA is poor at that 
because it has not analyzed the structure of naming enough. It overlooked 
that ASCII is just another lingual support, using the ASCII Table, which IS 
defined (by default), and it forgot to structurally require the other 
Tables to be defined. (".com" is actually ".ascii.com", which helps you 
understand that ".chinese.com" is NOT ".ascii.com" or ".com": it is not the 
same zone, while IDNA makes them the same.)
- vernacularization: what permits the users to use the system (e.g. the 
colors, etc.). They not only refused to consider it but also did not work 
out the tools permitting it to be built (like a compression method to 
distribute Tables, etc.).

>There isn't really any way to force the TLDs or zone administrators to 
>follow any rules that we might come up with. The best we can do is write 
>down some guidelines that are well thought out. And write them clearly.

We have no _rule_ at all to write for anyone. If you issue a rule you must 
be able to enforce it. You can only describe the way things should work and 
hope people will adhere. So you have to keep it as simple, unique and 
logical as possible. The strength of the DNS is that it is such. This 
results from the zones.

When you manage a zone you are the master in your zone but you _cannot_ 
affect the layer above. With IDNA you can, because if you enter an ACE 
prefix in your zone, the whole FQDN becomes an FQIDN.

For 22 years there has been only one version of the DNS; the ACE prefix 
does not change that because it is unique. But if there are several 
prefixes, the DN becomes complex.

>If registries and zone administrators fail to follow the guidelines, the 
>applications may have to display their domain names differently, to 
>indicate some level of risk.

The error (IMHO) of IDNS is to require "guidelines". The DNS has no 
"guidelines". It has functions. You use them correctly and it works, you 
don't and it does not.

>Also, we can't just suddenly switch a TLD from one encoding to another and 
>then expect all the subdomains to follow suit the same night. Instead, we 
>might have a rule specifying that all labels under the first new ACE 
>prefix must use the same prefix. For example, suppose we have a new domain 
>with the new prefix, called "xs--nfoo-abc.jp". Since the 2LD uses the new 
>prefix, any 3LDs, 4LDs and so on would also have to use the new prefix. 
>Does this make sense?

No, because today we already have two rules.
1. a hierarchy of zones, from right to left. If I have the ".com" TLD, the 
SLD will obey the ".com" rules, etc.
2. an ACE prefix that is _local_ to the label, even if, as I said above, it 
has an impact on the whole FQDN, creating a lame bottom-up hierarchy.

Now, what you propose is that if you put an "xn--" label somewhere it will 
"pollute" the whole FQDN into an FQIDN, to be read entirely using punycode 
without the ACE prefix. This would be mad. Each label can be read/used 
separately; the ACE prefix is part of the punycode transcoding.
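
As an illustration of reading each label on its own, here is a minimal 
sketch using the "punycode" codec shipped with Python. The helper names are 
hypothetical, and the sketch deliberately skips nameprep and every other 
IDNA check; it only shows that the ACE prefix marks one label and that the 
other labels pass through untouched.

```python
ACE_PREFIX = "xn--"


def decode_label(label):
    """Decode one label independently; non-ACE labels are returned as is."""
    if label.lower().startswith(ACE_PREFIX):
        # Strip the prefix, then let the standard punycode codec do the rest.
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


def decode_fqdn(fqdn):
    """Apply decode_label to every label; no label influences another."""
    return [decode_label(label) for label in fqdn.split(".")]
```

Decoding "www.xn--bcher-kva.example" this way changes only the middle 
label.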

Important point: once the FQDN has been properly declared an FQIDN (through 
the top level information) you can have all the transcodings you want (with 
as many xn--/xs--/zq--/etc. prefixes as you like) and stay consistent: the 
right-to-left hierarchy has been respected.

>Finally, regarding displaying ".com" in Chinese, there is currently no 
>reason to display ".com" in ASCII. This could easily be displayed in 
>Chinese if the application developers were only willing to modify their 
>programs to be more user-friendly.

No. Here is the confusion between the internationalization and the 
multilingualization layers. What Adams called the host name (which becomes 
too complex in reality due to the probable extensive use of aliases - 
another topic). ".com" in Chinese does not make sense: who is going to say 
that it is to be printed in Chinese? As a Chinese name chosen by whom? Or 
in ASCII? ".com" is a default for ".ascii.com". Once you have understood 
that, there is no more problem of any kind.

The DNS is something very powerful because it is simple. It knows very 
little: labels and dots. IDNA says that applications can transcode labels 
at the application level, but it did not address the top level. This was 
partly corrected with the Tables, but there is no mechanical relation 
between the TLD and a given Table as there is in the ASCII Domain Name. In 
the ASCII Domain Name, the Table is ASCII; you cannot use EBCDIC.

This information MUST be provided. The way the DNS (actually the global 
naming) does it is through a zone. This zone can be a primary zone (a _new_ 
Chinese .com equivalent having the Chinese Table as a default) or a lingual 
primary zone (a .chinese.com zone). In that zone, names will then have to 
be IDN labels using the .com Chinese Table (or other accepted Chinese 
codes). If there is no Table, no name can be registered.
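
The zone/Table relation can be sketched as follows. This is only an 
illustration under stated assumptions: the ZONE_TABLES mapping, its tiny 
stand-in Chinese Table and the can_register function are all hypothetical 
and do not come from any existing registry or specification.

```python
# Hypothetical data: zone name -> set of code points its Table allows.
ZONE_TABLES = {
    "com": set("abcdefghijklmnopqrstuvwxyz0123456789-"),  # ".ascii.com"
    "chinese.com": {"\u4e2d", "\u6587", "\u7f51"},         # stand-in Table
}


def can_register(label, zone):
    """A label is acceptable only if the zone has a Table covering it."""
    table = ZONE_TABLES.get(zone)
    if table is None:
        return False  # no Table, no registration
    return bool(label) and all(ch in table for ch in label)
```

With this data, an ASCII label is accepted under "com" and refused under 
"chinese.com", and nothing can be registered in a zone with no Table.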

Just remember this very simple point: ".com" is actually an abbreviation 
for ".ascii.com" using the ANSI Table.

This means that the transcoding must be adapted so that:
- nameprep can know the Table and check the IDN consistency.
- the ".chinese.com" sequence is read/presented in Chinese as the Chinese 
".com" label.

>  Of course, this brings up all sorts of issues like what to do with copy 
> and paste, educating users about the new kind of display, being able to 
> type ".com" in Chinese in one application while being required to type it 
> in ASCII in another, and so on. There are some issues, but theoretically, 
> you *can* already display ".com" in Chinese.
>This is not so different from Punycode itself, which you wouldn't normally 
>display as is. You first decode it and then show the Unicode to the user.

I am not sure what you are discussing here. If you punycode a TLD, do you 
create a new TLD?
jfc