Re: [Ltru] Re: UTF-8

Addison Phillips <addison@yahoo-inc.com> Fri, 15 September 2006 15:54 UTC

This thread is very amusing but, I think, not very useful. I think that
using a widely-recognized, plain-text character encoding (UTF-8: "the
new ASCII") would be a Good Thing. But changing which escape syntax we
use, especially to something, um, "unusual" like UTF-1 or BOCU-1,
gives us nothing over using US-ASCII with some escape syntax (such as
our existing NCR format).

So:

Is there any support for changing to UTF-8?
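
To make the trade-off concrete, here is a minimal Python sketch of the two options: US-ASCII text carrying hexadecimal NCRs versus the same text stored directly as UTF-8. It assumes the NCR form is "&#x" + hex digits + ";". The helper names and the sample description are illustrative only and are not taken from any registry tooling.

```python
import re

# Assumption: NCRs use the hexadecimal form "&#x...;".
NCR = re.compile(r"&#x([0-9A-Fa-f]+);")

def decode_ncrs(text: str) -> str:
    """Replace each hexadecimal NCR with the character it names."""
    return NCR.sub(lambda m: chr(int(m.group(1), 16)), text)

def encode_ncrs(text: str) -> str:
    """Escape every non-ASCII character as a hexadecimal NCR.

    Note: a literal '&' is left untouched, so this simple sketch is not
    a full escaping scheme.
    """
    return "".join(ch if ord(ch) < 0x80 else f"&#x{ord(ch):X};" for ch in text)

# Illustrative description string, not copied from a real record.
escaped = "Norwegian Bokm&#xE5;l"
plain = decode_ncrs(escaped)

print(escaped.encode("ascii"))  # pure US-ASCII bytes, readable only after unescaping
print(plain.encode("utf-8"))    # same text, the non-ASCII letter stored as two octets
```

The escaped form survives any ASCII-only channel but needs an unescaping step before display; the UTF-8 form is directly readable in any UTF-8-aware editor.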

Addison

Frank Ellermann wrote:
> Doug Ewell wrote:
> 
>> I'm not sure why you chose 0x86 as your sequence introducer
> 
> Six is the number of trailing octets: 91909F9F9F9F (for John's
> example u+10FFFF).
> 
>> you could make each sequence 1 byte shorter by marking the
>> lead or trail byte specially
> 
> Yes, but then only 2 octets (80+81) would never occur (instead
> of 11), and lost 9x bytes won't cause an error.  UTF-8 has now
> 13 "impossible" octets, and similar features.
> 
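
The count of 13 "impossible" octets quoted above can be checked by brute force. The following Python sketch (the function name is hypothetical) encodes every Unicode scalar value as UTF-8 and collects the octet values that never appear. It assumes the current definition of UTF-8, limited to U+0000..U+10FFFF with surrogates excluded.

```python
def impossible_utf8_octets() -> list[int]:
    """Return the octet values that occur in no valid UTF-8 sequence."""
    used = set()
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate code points cannot be encoded as UTF-8
        used.update(chr(cp).encode("utf-8"))
    return sorted(set(range(256)) - used)

missing = impossible_utf8_octets()
print(len(missing))
print(" ".join(f"{b:02X}" for b in missing))
```

The unused values are the overlong lead octets C0 and C1 plus F5 through FF, which would start sequences beyond U+10FFFF.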

-- 
Addison Phillips
Globalization Architect -- Yahoo! Inc.

Internationalization is an architecture.
It is not a feature.
