Re: [idn] opting out of SC/TC equivalence

"ben" <ben@cc-www.com> Tue, 14 August 2001 14:52 UTC

From: ben <ben@cc-www.com>
To: David Hopwood <david.hopwood@zetnet.co.uk>, idn@ops.ietf.org
Subject: Re: [idn] opting out of SC/TC equivalence
Date: Tue, 14 Aug 2001 10:35:34 -0400

Hi

I am looking for support for my Supreme Chinese Domain Name System so
that it can become a standard.  It involves a standard method of
registration for all registries that offer CDNs, as well as adding a
prohibit function to the nameprep of CDNs.

However, if I don't have your support in making it a standard, that is
absolutely fine.  Perhaps you can still comment on my alternative way
of implementing this system, which is to create a new type of TLD, the
lsTLD.  If you are concerned that discussion of <IDN>.<IDN> is out of
scope for the IDN WG, please email me privately.

Attached below is my draft.

Thanks for your support
Ben Chan


_________________________________



Internet Draft
Author: Ben Chan
July 14, 2001
Expires in six months


  Supreme Chinese Domain Name System



Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

    The list of current Internet-Drafts can be accessed at
    http://www.ietf.org/ietf/1id-abstracts.txt

    The list of Internet-Draft Shadow Directories can be accessed at
    http://www.ietf.org/shadow.html


Abstract

Chinese can be written in 2 different scripts, traditional Chinese and
simplified Chinese.  Many people, depending on their background,
culture or group, do not distinguish between the 2 scripts and use
them interchangeably.  As a result, users of Chinese Domain Names
(CDN) have special needs that can only be satisfied by adding a label
to CDNs that distinguishes a CDN with traditional characters from a
CDN with simplified characters.  This labeling forms a complete system
that can be implemented with SLDs or by creating a new type of TLD
called a Language Script TLD (lsTLD).  This draft describes the
benefits that the system provides and the techniques involved in
implementing it.


1.  Introduction

(Labeling of a CDN can be accomplished with either SLDs or lsTLDs.
However, for simplicity, most of this draft will only use lsTLD to
describe the system and its techniques.  For more information on how
SLDs can be used, please see section 5.2.)

The <.traditional> and <.simplified> TLDs in Chinese characters are:
a) <.traditional> in traditional Chinese is ".繁體" (U+7E41)(U+9AD4)
b) <.traditional> in simplified Chinese is ".繁体" (U+7E41)(U+4F53)
c) <.simplified> in simplified Chinese is ".简体" (U+7B80)(U+4F53)
d) <.simplified> in traditional Chinese is ".簡體" (U+7C21)(U+9AD4)
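
The code points listed above can be checked mechanically.  The
following is a minimal Python sketch for illustration only; the names
LSTLD_LABELS and code_points are made up here and are not defined by
this draft.

```python
# The four lsTLD labels listed above, keyed by (meaning, script it is written in).
# LSTLD_LABELS and code_points are illustrative names, not part of the draft.
LSTLD_LABELS = {
    ("traditional", "TC"): "\u7e41\u9ad4",  # 繁體
    ("traditional", "SC"): "\u7e41\u4f53",  # 繁体
    ("simplified", "SC"): "\u7b80\u4f53",   # 简体
    ("simplified", "TC"): "\u7c21\u9ad4",   # 簡體
}


def code_points(label):
    """Return the U+XXXX notation for each character of a label."""
    return ["U+{:04X}".format(ord(ch)) for ch in label]


for key, label in LSTLD_LABELS.items():
    print(key, label, code_points(label))
```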

Using the 4 language script TLDs above, a Chinese Domain Name System
can be created to satisfy the needs of CDN users by combining together
the following 3 benefits:

Benefit A- A registrant is given the choice of pointing a traditional
CDN to one location (e.g. a traditional Chinese website) and pointing
the corresponding simplified CDN to another location (e.g. a
simplified Chinese website).

Benefit B- The registration of a simplified CDN will automatically
reserve the corresponding traditional CDN(s) and vice versa, thus
giving users the flexibility of using either script.

Benefit C- A method that will guide users to enter a CDN in their
applications (e.g. web browsers) in the form that matches the meaning
intended by the registrant when the CDN was registered.


2.  Description of the Importance of the Benefits

It is important to understand the needs of the ordinary everyday people
who will be the users of CDNs.  The following subsections explain in
detail how the 3 benefits of this system satisfy those needs.


2.1  Importance of Benefit A.

It would be appropriate for a traditional CDN to point to a
traditional website with content suitable for visitors from Hong Kong
or Taiwan.  Along the same lines, it would be appropriate for a
simplified CDN to point to a simplified website with content suitable
for visitors from China or Singapore.


2.2  Importance of Benefit B.

Since many Chinese can read / write in both scripts, it is only
appropriate for a traditional CDN to be mapped to its corresponding
simplified CDN by applying a conversion.  This will ensure that no
matter what script the user types in, he will always be able to reach
the intended location(s).


2.3  Importance of Benefit C.

The relationship between simplified Chinese and traditional Chinese is
very complicated.  A TC character that corresponds to a SC character
may not have the same meaning.  To complicate the situation, one TC
character can map to several different SC characters and vice versa.
One CDN can therefore potentially have a great number of different
written variations.  Without a guiding method, a user can be given a
CDN, type in what appears to be the correct CDN, and still fail to
reach the proper destination because what he typed is a variation of
the original CDN that the registrant did not intend.
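
To give a feel for how quickly the variations multiply, here is a
minimal Python sketch.  The two table entries are well-known
one-to-many cases chosen for illustration; they are not taken from
this draft, and the names SC_TO_TC and traditional_variants are
hypothetical.

```python
from itertools import product

# Hypothetical two-entry excerpt of an SC -> TC table (not from the draft).
# A real table would be far larger.
SC_TO_TC = {
    "\u5e72": ["\u4e7e", "\u5e79", "\u5e72"],  # 干 -> 乾, 幹, 干
    "\u53d1": ["\u767c", "\u9aee"],            # 发 -> 發, 髮
}


def traditional_variants(sc_label):
    """List every TC spelling that the given SC label could correspond to."""
    # Characters missing from the table are assumed to map to themselves.
    choices = [SC_TO_TC.get(ch, [ch]) for ch in sc_label]
    return ["".join(combo) for combo in product(*choices)]


variants = traditional_variants("\u5e72\u53d1")  # 干发
print(len(variants), variants)
```

Under these assumptions a label of only 2 characters already has 3 x 2
= 6 possible traditional spellings.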


3.  Solution / Method

This system is implemented both at the registration system and at the
client end.


3.1  Implemented at the registration system

The solution that delivers the 3 benefits explained above is a Chinese
domain name system that uses language script TLDs: a TLD of
<.traditional> for traditional CDNs (defined here as CDNs that use
only traditional characters) and a TLD of <.simplified> for simplified
CDNs (defined here as CDNs that use only simplified characters).
During registration, a person may register a CDN either entirely in
traditional Chinese characters or entirely in simplified Chinese
characters, but not in a mix of the 2 scripts.  If he registers in
traditional characters, he will be given a traditional CDN (with the
TLD of <.traditional>) and any similar traditional CDNs will be
reserved.  At the same time, the corresponding simplified CDN(s) (with
the TLD of <.simplified>) will also be reserved, to be activated at a
later date if the registrant chooses to do so.  If he registers in
simplified characters, he will be given a simplified CDN (with the TLD
of <.simplified>) and any similar simplified CDNs will be reserved.
At the same time, the corresponding traditional CDN(s) (with the TLD
of <.traditional>) will be reserved in the same way.
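
The registration rule described above might be sketched in Python as
follows.  This is only an illustration under simplifying assumptions:
the table holds 3 pairs from Appendix A, every character is assumed to
have exactly one counterpart (so only one name is reserved), and the
function name register and its "SC"/"TC" argument are hypothetical.

```python
# Three (simplified, traditional) pairs taken from Appendix A.
PAIRS = [("\u5b9d", "\u5bf6"), ("\u8d1d", "\u8c9d"), ("\u7f16", "\u7de8")]
SC_TO_TC = {sc: tc for sc, tc in PAIRS}
TC_TO_SC = {tc: sc for sc, tc in PAIRS}

SIMPLIFIED_TLD = "\u7b80\u4f53"   # 简体
TRADITIONAL_TLD = "\u7e41\u9ad4"  # 繁體


def register(label, script):
    """Return (active_name, reserved_name) for a label registered in one script.

    script is "SC" or "TC" (a hypothetical parameter). A label that contains
    a character belonging to the other script is rejected.
    """
    if script == "SC":
        foreign, table = TC_TO_SC, SC_TO_TC
        own_tld, other_tld = SIMPLIFIED_TLD, TRADITIONAL_TLD
    elif script == "TC":
        foreign, table = SC_TO_TC, TC_TO_SC
        own_tld, other_tld = TRADITIONAL_TLD, SIMPLIFIED_TLD
    else:
        raise ValueError("script must be 'SC' or 'TC'")
    if any(ch in foreign for ch in label):
        raise ValueError("label mixes the 2 scripts")
    # One-to-one conversion only; one-to-many cases are not handled here.
    counterpart = "".join(table.get(ch, ch) for ch in label)
    return label + "." + own_tld, counterpart + "." + other_tld


print(register("\u5b9d\u8d1d", "SC"))  # 宝贝 registered in simplified
```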


3.2 Implemented at the client end.

If a user types in a traditional CDN (with the <.traditional> TLD),
error checking can be done on the CDN by the application (e.g. a web
browser, with nameprep prohibiting invalid entries) by looking up each
character in a Unicode table containing all the valid traditional
Chinese characters.  If a character is found not to be a valid
traditional character, an error will be displayed to point out which
character is invalid.  If a user types in a simplified CDN (with the
<.simplified> TLD), the same error checking will be performed against
the valid simplified Chinese characters.  (Please see Appendix A for a
partial list of the Unicodes that are acceptable in one script and
therefore disallowed in the other.)
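
A minimal Python sketch of this client-side check is given below.  It
assumes that, without TC<->SC equivalence, only ".繁體" and ".简体"
are in use, and it uses a 3-pair excerpt of Appendix A.  The names
TLD_SCRIPT and check_cdn are hypothetical, and the sketch tests against
a disallowed set rather than the full table of valid characters that
the text describes.

```python
# Excerpt of Appendix A: characters acceptable in only one of the 2 scripts.
SIMPLIFIED_ONLY = set("\u5b9d\u8d1d\u53d8")   # 宝 贝 变
TRADITIONAL_ONLY = set("\u5bf6\u8c9d\u8b8a")  # 寶 貝 變

# Assumption: one lsTLD form per script is in use when there is no equivalence.
TLD_SCRIPT = {
    "\u7e41\u9ad4": "TC",  # 繁體
    "\u7b80\u4f53": "SC",  # 简体
}


def check_cdn(name):
    """Return a list of (position, character) that are invalid for the lsTLD."""
    label, _, tld = name.rpartition(".")
    script = TLD_SCRIPT.get(tld)
    if script is None:
        raise ValueError("name does not end in a known lsTLD")
    disallowed = SIMPLIFIED_ONLY if script == "TC" else TRADITIONAL_ONLY
    return [(i, ch) for i, ch in enumerate(label) if ch in disallowed]


# 宝貝.繁體 has a simplified character at position 0.
print(check_cdn("\u5b9d\u8c9d.\u7e41\u9ad4"))
```

An application could use the returned positions to point out to the
user exactly which character is invalid.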


4.  Conclusion

With the lsTLDs related to each other in this way, all 3 benefits will
be satisfied.  Benefit A will be satisfied because the registrant can
point <whatever>.<traditional> to a traditional website and point
<whatever>.<simplified> to a simplified website.  Benefit B will be
satisfied because a user who is given a <whatever>.<traditional> CDN,
but who is from mainland China and is more comfortable using the
simplified script, can simply use the corresponding CDN of
<whatever>.<simplified>.  In other words, a user can use the script of
his choice, whether it is traditional Chinese or simplified Chinese,
and still reach the location(s) intended by the registrant.  Benefit C
will be satisfied because when the user is given the
<whatever>.<traditional> CDN, the <.traditional> tells him that he
must set his Chinese input editor to recognize TC only, thereby
preserving the meaning that was intended when the CDN was first
registered.  In other words, the language script TLDs give users much
more control and eliminate any guesswork.


5.  Other Comments

There are 2 important related issues- “TC<->SC equivalence” and
“lsSLDs”.


5.1  TC<->SC equivalence

An interesting question is how this system is affected by TC<->SC
equivalence in the DNS protocol.  The answer is that it will work even
better.  With TC<->SC equivalence in the DNS protocol, all 4 lsTLDs
are used.  No error checking will be performed.  The <.traditional>
lsTLD in both its simplified and traditional forms is considered
equivalent and points to the same location (e.g. a traditional
website).  The <.simplified> lsTLD in both its simplified and
traditional forms is considered equivalent and points to the same
location (e.g. a simplified website).  The author of this draft
strongly endorses any efforts made in finding a reasonable solution to
TC<->SC equivalence.
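
One way to picture this behaviour is as a comparison key, sketched
below in Python.  How equivalence would really be done in the DNS
protocol is not specified in this draft; the folding direction (TC to
SC), the 2-pair excerpt of Appendix A and the names LSTLD_MEANING and
equivalence_key are all assumptions made for illustration.

```python
# Two (simplified, traditional) pairs taken from Appendix A.
PAIRS = [("\u5b9d", "\u5bf6"), ("\u8d1d", "\u8c9d")]
TC_TO_SC = {tc: sc for sc, tc in PAIRS}

# All 4 lsTLD forms, mapped to the kind of website they stand for.
LSTLD_MEANING = {
    "\u7e41\u9ad4": "traditional",  # 繁體
    "\u7e41\u4f53": "traditional",  # 繁体
    "\u7b80\u4f53": "simplified",   # 简体
    "\u7c21\u9ad4": "simplified",   # 簡體
}


def equivalence_key(name):
    """Return (folded label, lsTLD meaning); equal keys mean same location."""
    label, _, tld = name.rpartition(".")
    # Assumed folding: every TC character is replaced by its SC counterpart.
    folded = "".join(TC_TO_SC.get(ch, ch) for ch in label)
    return folded, LSTLD_MEANING[tld]


# 宝贝.繁体 and 寶貝.繁體 compare equal; 宝贝.简体 does not.
print(equivalence_key("\u5b9d\u8d1d.\u7e41\u4f53") == equivalence_key("\u5bf6\u8c9d.\u7e41\u9ad4"))
```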


5.2 lsSLDs

The same techniques documented in this draft can also be applied to
the current gTLD and ccTLD registries by using SLDs.  In order to be
fair, everyone must agree to this system and make it a standard.  In
addition, every registry must change its currently registered second
level domains to third level domains (i.e.
<whatever>.<traditional>.TLD, <whatever>.<simplified>.TLD).
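
The renaming from second level to third level could look like the
following Python sketch.  The function name to_third_level, its
"SC"/"TC" argument and the choice of ".tw" in the example are
illustrative assumptions only.

```python
TRADITIONAL_LABEL = "\u7e41\u9ad4"  # 繁體
SIMPLIFIED_LABEL = "\u7b80\u4f53"   # 简体


def to_third_level(domain, script):
    """Turn <whatever>.TLD into <whatever>.<script label>.TLD."""
    labels = domain.split(".")
    if len(labels) != 2:
        raise ValueError("expected a name of the form <whatever>.TLD")
    script_label = {"TC": TRADITIONAL_LABEL, "SC": SIMPLIFIED_LABEL}[script]
    return ".".join([labels[0], script_label, labels[1]])


# 寶貝.tw becomes 寶貝.繁體.tw
print(to_third_level("\u5bf6\u8c9d.tw", "TC"))
```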


6.  Author’s Address

Ben Chan
cc-www.com
Box 92241
2900 Warden Avenue
Scarborough, Ontario
Canada
M1W 3Y9



Appendix A-  Error checking for Unicodes of traditional/simplified
Chinese characters

(The following is a partial list for information only.  A complete
list will be presented upon actual implementation.)


Acceptable Unicodes      Acceptable Unicodes
for a simplified CDN     for a traditional CDN

7691                     769A
788D                     7919
7231                     611B
8884                     8956
5965                     5967
575D                     58E9
7F62                     7F77
6446                     64FA
8D25                     6557
9881                     9812
529E                     8FA6
7ECA                     7D46
5E2E                     5E6B
7ED1                     7D81
9551                     938A
8C24                     8B17
5265                     525D
9971                     98FD
5B9D                     5BF6
62A5                     5831
9C8D                     9B91
8F88                     8F29
8D1D                     8C9D
94A1                     92C7
72C8                     72FD
5907                     5099
60EB                     618A
7EF7                     7E43
7B14                     7B46
6BD5                     7562
6BD9                     6583
5E01                     5E63
95ED                     9589
8FB9                     908A
7F16                     7DE8
8D2C                     8CB6
53D8                     8B8A

etc.