[Ltru] Re: UTF-8

"Doug Ewell" <dewell@adelphia.net> Wed, 20 September 2006 04:44 UTC

Message-ID: <004501c6dc6f$828124a0$6401a8c0@DGBP7M81>
From: Doug Ewell <dewell@adelphia.net>
To: LTRU Working Group <ltru@ietf.org>
References: <E1GPhXC-0006JR-Fy@megatron.ietf.org>
Date: Tue, 19 Sep 2006 21:44:50 -0700

Peter Constable <petercon at microsoft dot com> wrote:

> There's a prior question: what will the content of registry records 
> contain? (We don't need an encoding that supports the entire UCS if we 
> don't intend to have records using characters from that entire 
> repertoire.) I don't recall if that's been discussed and resolved.

The existing RFC 4646 Registry already contains instances of 11 
different non-ASCII characters, including 3 that are not in Latin-1. 
The current set of 7,600 reference names in ISO/FDIS 639-3 contains 
almost 500 instances of non-ASCII Latin-1 characters, all of which will 
end up in the Registry. (They have to; some language names in 639-3 
are differentiated only by an accent.)

Clearly the Registry needs an encoding that supports characters beyond 
ASCII or even Latin-1.  We already have one: the much-grumbled-about hex 
NCRs.  UTF-8 would be another.  While we could limit the Registry to a 
subset of the UCS, such as MES-2 -- or MES-1, if we were willing to go 
through the intense pain of removing "Ethiopic (Ge&#x2BB;ez)" -- I don't 
see what advantage it would bring.
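
For anyone who wants to see the two representations side by side, here 
is a minimal Python sketch. It is not part of any Registry tooling; the 
function names are my own invention, and it assumes the only escapes in 
a record are hex NCRs of the form "&#x...;". It expands the NCR in the 
Ethiopic description, encodes the result as UTF-8, and then converts 
back again.

```python
import re

# Matches hexadecimal numeric character references such as "&#x2BB;".
HEX_NCR = re.compile(r"&#x([0-9A-Fa-f]+);")


def decode_hex_ncrs(text: str) -> str:
    """Replace each hex NCR with the character it names (hypothetical helper)."""
    return HEX_NCR.sub(lambda m: chr(int(m.group(1), 16)), text)


def encode_hex_ncrs(text: str) -> str:
    """Replace each non-ASCII character with an uppercase hex NCR (hypothetical helper)."""
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in text)


# The description as it appears in an ASCII-only record.
record = "Ethiopic (Ge&#x2BB;ez)"

decoded = decode_hex_ncrs(record)      # contains U+02BB directly
utf8_bytes = decoded.encode("utf-8")   # U+02BB becomes the two bytes CA BB

print(len(record), len(decoded), len(utf8_bytes))  # 22 16 17
print(encode_hex_ncrs(decoded) == record)          # True
```

The round trip is lossless in both directions, so the choice between 
the two is about readability and tooling rather than about which 
characters can be represented.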

--
Doug Ewell
Fullerton, California, USA
http://users.adelphia.net/~dewell/
RFC 4645  *  UTN #14

