RE: [Ltru] Wrapping up the UTF-8 debate

Martin Duerst <duerst@it.aoyama.ac.jp> Sat, 21 July 2007 05:11 UTC

Date: Sat, 21 Jul 2007 09:57:40 +0900
To: Peter Constable <petercon@microsoft.com>, Randy Presuhn <randy_presuhn@mindspring.com>, LTRU Working Group <ltru@ietf.org>
From: Martin Duerst <duerst@it.aoyama.ac.jp>
Subject: RE: [Ltru] Wrapping up the UTF-8 debate

I agree with Peter (and Mark) on the issues below.
I have a strong preference against a BOM.
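
To make the BOM question concrete, here is a minimal Python sketch of what a consumer has to do when a UTF-8 file may or may not begin with the signature bytes EF BB BF. The helper name split_bom and the sample record are illustrative assumptions, not anything taken from the drafts or from IANA's tooling.

```python
import codecs
from typing import Tuple


def split_bom(data: bytes) -> Tuple[bool, bytes]:
    """Return (had_bom, payload) for bytes that may start with a UTF-8 BOM."""
    if data.startswith(codecs.BOM_UTF8):
        return True, data[len(codecs.BOM_UTF8):]
    return False, data


# Illustrative record only; not copied from the actual registry file.
record = "Description: Norwegian Bokm\u00e5l\n".encode("utf-8")
with_bom = codecs.BOM_UTF8 + record

had_bom, payload = split_bom(with_bom)
print(had_bom, payload == record)

# A plain UTF-8 decode keeps the signature as U+FEFF at the start of the text,
# while Python's "utf-8-sig" codec drops it.
print(with_bom.decode("utf-8").startswith("\ufeff"))
print(with_bom.decode("utf-8-sig") == record.decode("utf-8"))
```

A file without a BOM needs none of this special handling, which is one practical argument for leaving it out.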

Separately from our work on the drafts, we should continue
(as Randy already has done) to lobby the IETF into updating
their mailing list/archiving software so that UTF-8 (as well
as any other character encoding) is shown correctly.

Regards,   Martin.

At 07:42 07/07/21, Peter Constable wrote:
>> From: Randy Presuhn [mailto:randy_presuhn@mindspring.com]
>
>
>>    1)  Are we willing to use a representation (for the discussion
>>        of changes on the ietf-languages@iana.org list and communication
>>        with IANA) which is different (perhaps only mechanically/trivially so)
>>        from what is actually published in the registry files?
>
>I assume that those responsible for actually updating the records will be 
>competent to do what's needed, so I am willing.
>
>
>>    2)  The registry file itself currently uses something which is similar
>>        to an NCR.  Are we willing to change the registry format to
>>           a) use actual NCRs for non-ASCII code points, making conversion
>>              to XML even more trivial than it already is, while still
>>              giving some fallback to folks inspecting the data for errors
>>              or looking at it through ASCII windows
>>           b) embed the actual (UTF-8 encoded) characters into the file
>>           c) something else?
>
>My vote: 2B
>
>
>>    3)  Are we going to instruct IANA to maintain a "pure" ASCII version
>>        of the registry, in which everything not ASCII will have been
>>        flattened or transliterated?
>
>My vote: No.
>
>
>Another possible issue (assuming consensus on a file with actual 
>UTF-8-encoded characters): does the UTF-8 file begin with a UTF 
>BOM/encoding signature?
>
>
>Peter
>
>
>_______________________________________________
>Ltru mailing list
>Ltru@ietf.org
>https://www1.ietf.org/mailman/listinfo/ltru
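
As a rough illustration of the difference between options 2a and 2b in the quoted question, the following Python sketch converts between numeric character references (NCRs) and the actual characters. It assumes standard XML/HTML NCR syntax; the thread only says the current registry escape is "similar to an NCR", so the exact registry syntax is not modeled here. The function names to_ncr and from_ncr and the sample value are hypothetical.

```python
import html


def to_ncr(text: str) -> str:
    """Option 2a style: keep ASCII as-is, write other characters as decimal NCRs."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_ncr(text: str) -> str:
    """Option 2b style: resolve NCRs (decimal or hex) into the actual characters."""
    return html.unescape(text)


# Illustrative value only; not copied from the actual registry file.
name = "Bokm\u00e5l"

escaped = to_ncr(name)        # pure ASCII, safe to view through "ASCII windows"
restored = from_ncr(escaped)  # the real characters, ready to be written as UTF-8

print(escaped)
print(restored == name)
print(restored.encode("utf-8"))
```

Note that html.unescape also resolves named references such as &amp;amp;, so a stricter converter would be needed if only numeric references should be recognized.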


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     


