Re: [DNSOP] clue w.r.t. arabic

Eric Brunner-Williams <ebw@abenaki.wabanaki.net> Fri, 19 November 2010 00:35 UTC

Message-ID: <4CE5C667.8010106@abenaki.wabanaki.net>
Date: Thu, 18 Nov 2010 19:35:51 -0500
From: Eric Brunner-Williams <ebw@abenaki.wabanaki.net>
Organization: wampumpeag
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.12) Gecko/20101027 Thunderbird/3.1.6
MIME-Version: 1.0
To: dnsop@ietf.org
References: <20101115213532.GD322@shinkuro.com> <04856F66-598D-43CC-8164-90178A6F2952@virtualized.org> <4CE283DA.5080606@abenaki.wabanaki.net> <20101116145308.GG1389@shinkuro.com> <alpine.LSU.2.00.1011161601450.14239@hermes-2.csi.cam.ac.uk> <20101116164818.GN1389@shinkuro.com> <alpine.LSU.2.00.1011171101250.14239@hermes-2.csi.cam.ac.uk> <20101117121906.GC3773@shinkuro.com> <8CEF048B9EC83748B1517DC64EA130FB43C309F821@off-win2003-01.ausregistrygroup.local> <4CE52226.70502@necom830.hpcl.titech.ac.jp> <20101118140728.GB5795@shinkuro.com> <4CE5523A.7090407@abenaki.wabanaki.net> <4CE5B12E.8020502@necom830.hpcl.titech.ac.jp>
In-Reply-To: <4CE5B12E.8020502@necom830.hpcl.titech.ac.jp>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Subject: Re: [DNSOP] clue w.r.t. arabic

On 11/18/10 6:05 PM, Masataka Ohta wrote:

> If it's just a reference, that's fine. But, if you want to
> make some point, make it with your own words or quotation,
> not just a reference to lengthy video, please.

noted.

> Anyway, Arabic strings are examples of exponential explosions
> with large coefficients a lot easier to understand for most
> of you than Chinese ones.

the density of variant (context dependent) characters in arabic
script, whether sampled as text or sampled as domain names, is
sparse relative to the density of "variant" characters in the
(unified) han script(s). for han, the number of variant forms of a
label is not quite 2^n, where n is the number of characters in the
label, but is sufficiently close to allow the "exponential" term to
be reasonably used.
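
a minimal sketch of the counting argument, assuming each character's
variants can be substituted independently of its neighbours. the
per-character counts are hypothetical, chosen only to contrast a dense
case with a sparse one; they are not measurements from any registry.

```python
from math import prod

def variant_label_count(forms_per_char):
    """Count the labels reachable by independent variant substitution.

    forms_per_char[i] is the number of forms character i can take,
    counting the character itself (1 means "no variants").
    """
    return prod(forms_per_char)

# hypothetical 8-character labels
dense = [2] * 8                     # every character has one variant
sparse = [1, 1, 2, 1, 1, 1, 2, 1]   # only two characters have a variant

print(variant_label_count(dense))   # 2^8 = 256
print(variant_label_count(sparse))  # 2^2 = 4
```

the dense case grows as 2^n in the label length n; the sparse case
grows only with the number of characters that actually have variants.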

i know this to be the case, but just in case the point was not 
appreciated by the icann*, and other ietf persons present** at the idn 
session during the icann brussels meeting, i carefully asked the 
density question of the presenters, one a han script (chinese) domain 
name registry operator, the other an arabic script (arabic) domain 
name registry operator. their answers were as i expected, and fail to
support an "Arabic strings are examples of exponential explosions with
large coefficients" claim.

variants in arabic script present problems to the idn(a)
specification(s) that assume "unicode" as the character repertoire,
but they are unlike, in scale, the problems presented by sc/tc
equivalence classes under similar conditions and assumptions.

-e

*  t. dam
** o. gudmundsson