[Ltru] RE: Solving the UTF-8 problem; was Language Tag Modification 1694acad;

Peter Constable <petercon@microsoft.com> Tue, 03 July 2007 15:08 UTC

From: Peter Constable <petercon@microsoft.com>
To: "ietf-languages@iana.org" <ietf-languages@iana.org>, LTRU Working Group <ltru@ietf.org>
Date: Tue, 03 Jul 2007 08:07:18 -0700
Message-ID: <DDB6DE6E9D27DD478AE6D1BBBB83579560F3AAE4D8@NA-EXMSG-C117.redmond.corp.microsoft.com>
References: <BAY114-F31053BEBAE0817D81180DFB30D0@phx.gbl> <009301c7bd35$1c649a60$6401a8c0@DGBP7M81> <20070703071146.GA8412@nic.fr> <001201c7bd7a$7c126ab0$6401a8c0@DGBP7M81>
In-Reply-To: <001201c7bd7a$7c126ab0$6401a8c0@DGBP7M81>

+1 to specifying NFC (whether we use UTF-8 or NCRs).


Peter



-----Original Message-----
From: ietf-languages-bounces@alvestrand.no [mailto:ietf-languages-bounces@alvestrand.no] On Behalf Of Doug Ewell
Sent: Tuesday, July 03, 2007 7:00 AM
To: ietf-languages@iana.org; LTRU Working Group
Subject: Re: Solving the UTF-8 problem; was Language Tag Modification 1694acad;

Stephane Bortzmeyer <bortzmeyer at nic dot fr> wrote:

> But allow me a little troll: if we choose UTF-8, what about
> normalization?
>
> 1) Do not mention it (this would mean that IANA would be free to
> suddenly canonicalize the registry, thus making it different in a
> byte-to-byte comparison)
>
> 2) Mandate NFC or NFD (which means an automatic registry checker would
> have to check it)

There's actually nothing new here, since the Registry is already using
Unicode with hex NCRs as the encoding scheme, and we would just be
changing it to Unicode with UTF-8 as the encoding scheme.
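As a rough illustration of that change of encoding scheme, the sketch below decodes hex NCRs into characters and then serializes the result as UTF-8. The helper name decode_hex_ncrs and the sample description string are assumptions made for the example; this is not the Registry's actual tooling.

```python
import re

# Matches hex numeric character references such as "&#xE5;".
_HEX_NCR = re.compile(r"&#x([0-9A-Fa-f]+);")


def decode_hex_ncrs(text: str) -> str:
    """Replace each hex NCR with the Unicode character it names.

    Hypothetical helper for illustration only.
    """
    return _HEX_NCR.sub(lambda m: chr(int(m.group(1), 16)), text)


# Sample description written with a hex NCR (assumed example value).
ncr_form = "Norwegian Bokm&#xE5;l"

# Same Unicode text, now as real characters instead of references.
decoded = decode_hex_ncrs(ncr_form)

# The UTF-8 encoding scheme applied to that text.
utf8_bytes = decoded.encode("utf-8")
```

The character content is identical in both forms; only the way each non-ASCII character is written out differs.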

However, it wouldn't hurt to specify NFC somewhere in the draft.  This
is what we are already using and what the IETF and W3C seem to prefer.
Descriptions and comments are supposed to be non-normative, so I'm not
sure any user's tools would *have* to do any checking or correcting,
though of course ours should.
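A minimal sketch of the kind of NFC check a registry-maintenance tool could run, using Python's standard unicodedata module. The function name is_nfc and the sample strings are assumptions for illustration, not part of any existing checker.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Return True if text is already in Normalization Form C.

    Hypothetical helper: compares the text with its NFC-normalized form.
    """
    return unicodedata.normalize("NFC", text) == text


# Precomposed form: U+00E5 LATIN SMALL LETTER A WITH RING ABOVE.
composed = "Bokm\u00e5l"

# Decomposed form: "a" followed by U+030A COMBINING RING ABOVE.
decomposed = "Bokma\u030al"

# Both render the same, but their UTF-8 byte sequences differ, which is
# why a byte-to-byte comparison is sensitive to normalization.
same_bytes = composed.encode("utf-8") == decomposed.encode("utf-8")

# Correcting a non-NFC string is a single call.
corrected = unicodedata.normalize("NFC", decomposed)
```

A checker built this way could reject or silently correct any field for which is_nfc returns False; which of the two is appropriate is a policy choice the draft would have to make.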

--
Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages
