Re: [Idna-update] Genart telechat review of draft-faltstrom-unicode11-08

Asmus Freytag <> Wed, 20 March 2019 06:22 UTC



The property gc=Nd is used by software identifying strings of characters 
that can be parsed as decimal numbers using offsets. That was implicit 
in the initial design but not made explicit until the first exceptions 
started to show up with new characters.
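
To make the "parsed using offsets" point concrete, here is a minimal Python sketch. It is an illustration only, not how any particular library does it; the names `ZEROS`, `digit_value_by_offset` and `parse_decimal` are made up for this example, and the results depend on the Unicode version shipped with the Python interpreter (6.0 or later assumed). It relies on the standard `unicodedata` module: a digit's value is computed as its distance from the zero of its range, which only works because gc=Nd digits sit in contiguous runs that start at zero.

```python
import sys
import unicodedata

# Hypothetical helper table: every code point that is gc=Nd with value 0,
# i.e. the first code point of each contiguous run of decimal digits.
ZEROS = [
    cp
    for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == "Nd"
    and unicodedata.decimal(chr(cp), None) == 0
]


def digit_value_by_offset(ch):
    """Return the value of a gc=Nd digit as an offset from its range's zero.

    Returns None for anything that is not gc=Nd.
    """
    if unicodedata.category(ch) != "Nd":
        return None
    cp = ord(ch)
    for zero in ZEROS:
        if zero <= cp <= zero + 9:
            return cp - zero
    return None


def parse_decimal(s):
    """Parse a string of gc=Nd digits as a decimal number."""
    if not s:
        raise ValueError("empty string")
    value = 0
    for ch in s:
        d = digit_value_by_offset(ch)
        if d is None:
            raise ValueError(f"U+{ord(ch):04X} is not a gc=Nd digit")
        value = value * 10 + d
    return value
```

With this sketch, `parse_decimal("\u19d1\u19d2")` (NEW TAI LUE DIGIT ONE, DIGIT TWO) parses by offset from U+19D0, while U+19DA is rejected: it is a lone digit one outside that run, so no offset from a zero can yield its value.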

It is indeed the case that this new code point would have made a fine 
one to use in domain names. It might even be a better choice than the 
'standard' digit one that is used in running numbers, because that one 
looks like a letter (a common issue for digits: just think of l and 1, 
or o and 0, but here the similarities are far more striking).

However, the decision to omit this code point isn't the end of the world 
for domain names. Given the information available today about usage, any 
practical impact may be rather limited; the population is reported to be 
highly literate, but the sources admit that this may reflect literacy in 
the dominant surrounding language. And an unknown, though by some 
indications growing, fraction writes the old script.

One thing I find is that the information available for many scripts with 
at least some modern use generally keeps getting better long after they 
have been encoded. This should be a useful reminder that striving for 
absolute perfection here is a fool's errand.


On 3/19/2019 10:19 PM, Martin J. Dürst wrote:
> Sorry I forgot to mention it, but I think the change of the Unicode
> property for U+19DA NEW TAI LUE THAM DIGIT ONE happened because the
> General_Category value "Nd" was tightened to only apply to consecutive
> decimal digits that start with zero. See
>, page 175:
>   >>>>
> The Numeric_Type = Decimal property value (which is correlated with the
> General_Category = Nd property value) is limited to those numeric
> characters that are used in decimal-radix numbers and for which a full
> set of digits has been encoded in a contiguous range, with ascending
> order of Numeric_Value, and with the digit zero as the first code point
> in the range.
>   >>>>
> Either the tightening must have happened in the 6.0.0 timeframe, or
> U+19DA NEW TAI LUE THAM DIGIT ONE was originally classified wrongly.
> Anyway, the DNS doesn't need decimal digits to be in a contiguous block
> starting with zero.