Re: [Ltru] Re: UTF-8

Addison Phillips <addison@yahoo-inc.com> Tue, 19 September 2006 15:45 UTC

Message-ID: <451010AB.3090406@yahoo-inc.com>
Date: Tue, 19 Sep 2006 08:45:47 -0700
From: Addison Phillips <addison@yahoo-inc.com>
User-Agent: Thunderbird 1.5.0.7 (Windows/20060909)
MIME-Version: 1.0
To: Peter Constable <petercon@microsoft.com>
Subject: Re: [Ltru] Re: UTF-8
References: <F8ACB1B494D9734783AAB114D0CE68FE0AFF895A@RED-MSG-52.redmond.corp.microsoft.com>
In-Reply-To: <F8ACB1B494D9734783AAB114D0CE68FE0AFF895A@RED-MSG-52.redmond.corp.microsoft.com>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 7bit
Cc: LTRU Working Group <ltru@ietf.org>

Descriptions and comments both allow the full range of Unicode today. 
Doug has noted some number of items (perhaps a dozen) that use non-ASCII 
(mostly Latin-script) letters or symbols. There is no check for the 
range of characters permitted, nor (in our discussion in draft-registry 
days) should there be.

I think that limiting the repertoire to anything less than the full 
range of the UCS would prove problematic anyway. And while I don't see a 
pressing need to identify languages in their native form, some people 
might find it offensive if it were banned outright. Similarly, some 
might find comments clearer if they contained some native description. I 
don't know.

What I do know is that the only limitation that could reasonably be
made without being arbitrary would be a limitation to ASCII. And I,
personally, will not support any encoding for the registry other than
these two: US-ASCII or UTF-8.
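
To make the two options concrete, here is a minimal Python sketch of the same text stored as US-ASCII with hex NCR escapes and as UTF-8. The sample value and both helper names are illustrative assumptions, not taken from the registry file or from any existing tool.

```python
import html


def to_ascii_with_ncrs(text: str) -> bytes:
    """Encode text as US-ASCII, writing each non-ASCII character as a hex NCR."""
    escaped = "".join(
        ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in text
    )
    return escaped.encode("ascii")


def from_ascii_with_ncrs(data: bytes) -> str:
    """Decode US-ASCII bytes and expand NCRs back into characters.

    html.unescape also expands named references such as &amp;, so this
    sketch assumes the text contains no other literal ampersands.
    """
    return html.unescape(data.decode("ascii"))


# Hypothetical description value; \u00e5 is LATIN SMALL LETTER A WITH RING ABOVE.
sample = "Norwegian Bokm\u00e5l"

as_ascii = to_ascii_with_ncrs(sample)  # readable only after NCR expansion
as_utf8 = sample.encode("utf-8")       # readable directly by a UTF-8 aware viewer
```

With the escaped form, a reader needs software that expands the NCRs before the name looks right; with UTF-8, any viewer that honors the declared encoding shows the character as is.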

Addison

Peter Constable wrote:
> There's a prior question: what will registry records contain? (We
> don't need an encoding that supports the entire UCS if we don't intend
> to have records using characters from that entire repertoire.) I don't
> recall if that's been discussed and resolved.
> 
> I do support using an encoding that directly supports whatever
> characters we wish to allow in the registry without use of NCRs or other
> such escape mechanisms; and if we wish to allow any UCS character in the
> registry then I would support using UTF-8 as the encoding.
> 
> 
> Peter
> 
> 
> 
>> -----Original Message-----
>> From: Martin Duerst [mailto:duerst@it.aoyama.ac.jp]
>> Sent: Tuesday, September 19, 2006 6:18 AM
>> To: Peter Constable; LTRU Working Group
>> Subject: RE: [Ltru] Re: UTF-8
>>
>> [chair hat on]
>>
>> Peter, with the observation below, do you want to say
>> you are in favor of moving to UTF-8, or against, or
>> did you write that strictly as an observation only?
>>
>> Regards,    Martin.
>>
>> At 16:04 06/09/18, Peter Constable wrote:
>>>> From: Martin Duerst [mailto:duerst@it.aoyama.ac.jp]
>>>
>>>> Agreed. In my point of view, the pain of Bokm&#$!@*l outweighs
>>>> a lot of things. The main benefit is that native people can
>>>> easily check that things are correct, which they won't do
>>>> if we show them just a number.
>>> There's lots of software that will interpret an NCR in an HTML, XML or
>>> HTML file and display it to the user as the actual character in the UCS
>>> for which it is a reference. I don't think there is any software that
>>> will do this for the file located at
>>> http://www.iana.org/assignments/language-subtag-registry.
>>>
>>>
>>>
>>>
>>> Peter Constable
>>
>>
>> #-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
>> #-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp
> 
> 
> _______________________________________________
> Ltru mailing list
> Ltru@ietf.org
> https://www1.ietf.org/mailman/listinfo/ltru

-- 
Addison Phillips
Globalization Architect -- Yahoo! Inc.

Internationalization is an architecture.
It is not a feature.
