Re: Update of RFC 2606 based on the recent ICANN changes?

Lyman Chapin <lyman@acm.org> Mon, 07 July 2008 15:45 UTC

In-Reply-To: <558a39a60807021729m1fc299c2ted96064ce73228a7@mail.gmail.com>
References: <20080701223655.14768.qmail@simone.iecc.com> <C7F7E8A9-C844-4E1C-827D-189D4937BA6B@acm.org> <14AE948B18197467AE4D96A4@p3.JCK.COM> <558a39a60807021729m1fc299c2ted96064ce73228a7@mail.gmail.com>
Mime-Version: 1.0 (Apple Message framework v753)
Message-Id: <D400669B-EA1C-4494-8094-20DC762F0EB5@acm.org>
From: Lyman Chapin <lyman@acm.org>
Subject: Re: Update of RFC 2606 based on the recent ICANN changes?
Date: Thu, 03 Jul 2008 16:05:14 -0400
To: James Seng <james@seng.sg>
X-Mailer: Apple Mail (2.753)
Cc: John C Klensin <john-ietf@jck.com>, idna-update@alvestrand.no, ietf@ietf.org

>> Is "сом" identical to "com"? (the first of these is U+0441
>> U+043E U+043C)
>
> The current principle is that it should be a "confusing string",
> which is vague enough to cover the case above (but perhaps not able to
> cover .co)

"Similarity" can be defined and tested, by setting thresholds and the  
like, but "confusing" refers to a state of mind - something is  
"confusing" if the people who are likely to encounter it consider it  
to be confusing. There's no way to objectively define or test for  
"confusing" similarity without reference to how actual people respond  
to a particular string. That means either mining data collected from  
circumstances in which people have mistaken one string for another  
(perhaps a history of Google searches), or consulting a panel of real  
people whenever it is necessary to decide whether or not two strings  
are "confusingly" similar.
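
To make the distinction concrete, here is a minimal Python sketch of a threshold-based "similarity" test. It is an illustration only, not ICANN's method: the look-alike table, the `skeleton` and `similarity` helpers, and the use of `difflib` as the scoring function are all assumptions made for this example. Whether a given score means real people are actually confused is exactly what such a test cannot tell you.

```python
import difflib
import unicodedata

# Hypothetical, deliberately tiny look-alike table (an assumption for this
# sketch). A real test would need a much larger, carefully reviewed table.
LOOKALIKES = {
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u043c": "m",  # CYRILLIC SMALL LETTER EM
}


def skeleton(label):
    """Replace each character with its assumed Latin look-alike, if any."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in label.lower())


def similarity(a, b):
    """Score in [0.0, 1.0] computed on the skeletons of the two labels."""
    return difflib.SequenceMatcher(None, skeleton(a), skeleton(b)).ratio()


cyrillic = "\u0441\u043e\u043c"  # the string from the question above

# The two strings are different code point sequences...
for ch in cyrillic:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
print(cyrillic == "com")

# ...but a look-alike test scores them as indistinguishable.
print(similarity(cyrillic, "com"))

# Where "co" falls depends entirely on the threshold someone picks.
print(similarity("com", "co"))
```

With this scoring function, "com" versus "co" comes out at 0.8, so a threshold of 0.9 would let ".co" through and a threshold of 0.75 would not. The number is objective; the choice of threshold is a judgment about people.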

>>> (b) be identical to a Reserved Name;
>>
>>> (c) consist of a single character;
>>
>> I've heard it argued repeatedly that this is an unreasonable
>> rule for ideographic characters.   I don't have an opinion, but
>> hope that ICANN has considered that case in full detail.
>
> This is where we dive into a discussion of what a "character" is. In
> ideograph-based languages, there isn't a concept of a "word".
>
> For example, Chinese, Japanese and Korean are actually "phonetic
> languages", and ideographic characters are used to express the
> phonetics. A "word", or more accurately a "morpheme", can be expressed
> in one or more ideographs. A single Latin character is unlikely to be
> useful by itself (except for "a" and "i"), but that's not the case in CJK.
>
> If the condition is "no single ASCII character", I may be neutral
> about it (since a single ideograph would never translate to a single
> ASCII character in the zonefile, due to the xn-- prefix) but if the
> "character" is defined more broadly to cover "U-label" character, then
> I would have strong objections.

At the moment, the condition is "no single Unicode code point." To  
the extent that a single CJK ideograph can be expressed using a  
single Unicode code point, this would represent the situation to  
which you say you would object. I will dig through my notes to find  
out why the "single character" condition was adopted.
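
The two readings of "single character" discussed above can be compared with a short Python sketch. It uses Python's built-in `punycode` codec to build the ASCII form that would appear in a zone file; the helper names `to_a_label` and `is_single_code_point` are made up for this illustration, and the sketch skips the rest of IDNA processing (mapping, validity checks), so treat it as a rough approximation.

```python
def to_a_label(u_label):
    """Approximate the ASCII (A-label) form of a non-ASCII label.

    Assumption: the label is already in the form IDNA expects; only the
    Punycode step and the "xn--" prefix are applied here.
    """
    return "xn--" + u_label.encode("punycode").decode("ascii")


def is_single_code_point(u_label):
    """The "no single Unicode code point" reading of the rule."""
    return len(u_label) == 1


ideograph = "\u4e2d"  # one CJK ideograph, one Unicode code point
a_label = to_a_label(ideograph)

print(a_label)                          # ASCII form, several characters long
print(len(a_label) > 1)                 # passes a "no single ASCII character" rule
print(is_single_code_point(ideograph))  # caught by a "single code point" rule
```

In other words, the same label is acceptable under the ASCII reading and rejected under the code point reading, which is the gap between the two positions.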

- Lyman