[Ltru] Re: IANA, UTF-8 and the Language Subtag Registry

Frank Ellermann <nobody@xyzzy.claranet.de> Tue, 20 March 2007 17:50 UTC

To: ltru@lists.ietf.org
From: Frank Ellermann <nobody@xyzzy.claranet.de>
Date: Tue, 20 Mar 2007 18:47:38 +0100
Organization: <URL:http://purl.net/xyzzy>
Lines: 25
Message-ID: <46001E3A.3D46@xyzzy.claranet.de>
References: <E1HTbLi-0001AN-18@megatron.ietf.org> <005701c76af9$0dcbced0$6401a8c0@DGBP7M81> <B624DF0E-20A1-483A-97CC-4B09DCC5085C@icann.org>

David Conrad wrote:

> no idea what "NCRs" are.

Numeric Character References.  In Web documents they take the
form &#x103456; for U+103456.  It's what you have in the language
subtag registry today.

Some "misguided" folks (= roughly anybody but me here) want
to switch the registry to UTF-8.  You'd then get one last
Internet-Draft in the old style (because I-Ds are US-ASCII)
with hex NCRs, and "somehow" you'd have to translate this
to UTF-8 (once).
 
> So you want IANA to do a post processing step to translate
> hex NCRs (whatever they are) into UTF-8?  That's fine, as
> long as the IANA considerations section is clear about what
> we need to do for the translation.

Deriving UTF-32 from the NCR input would be very simple: the hex
digits are the code point.  From there to UTF-8 you'd use whatever
converter you have.  And most likely deriving UTF-16 will do if
all NCRs stay at or below &#xFFFF;, because then no surrogate
pairs are needed.
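
As a rough illustration of that one-time translation, here is a
minimal Python sketch.  It is only an assumption about how such a
step could look, not a procedure specified anywhere in this thread:
the names HEX_NCR, ncr_to_text and ncr_to_utf8 are made up for the
example, and the sample input line is illustrative.  Each hex NCR is
turned into its code point (the "UTF-32" step), and the result is
then encoded as UTF-8.

```python
import re

# Hypothetical pattern for hexadecimal NCRs such as &#xE5;
HEX_NCR = re.compile(r"&#[xX]([0-9A-Fa-f]+);")


def ncr_to_text(line):
    """Replace every hex NCR in `line` with the character it names."""
    def decode(match):
        code_point = int(match.group(1), 16)
        # Surrogates and values above U+10FFFF are not valid characters.
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            raise ValueError("invalid code point in " + match.group(0))
        return chr(code_point)

    return HEX_NCR.sub(decode, line)


def ncr_to_utf8(line):
    """Translate an ASCII line with hex NCRs into UTF-8 bytes."""
    return ncr_to_text(line).encode("utf-8")


if __name__ == "__main__":
    # Illustrative input only; not copied from the registry.
    print(ncr_to_utf8("Description: Norwegian Bokm&#xE5;l"))
```

Text without NCRs passes through unchanged, so the same function
could be run over a whole file line by line.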

Frank


