[Ltru] Re: Solving the UTF-8 problem

Stephane Bortzmeyer <bortzmeyer@nic.fr> Mon, 02 July 2007 13:57 UTC

Date: Mon, 02 Jul 2007 15:55:33 +0200
From: Stephane Bortzmeyer <bortzmeyer@nic.fr>
To: Doug Ewell <dewell@roadrunner.com>
Message-ID: <20070702135533.GC23350@sources.org>
References: <006501c7bc33$637b08b0$6401a8c0@DGBP7M81>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <006501c7bc33$637b08b0$6401a8c0@DGBP7M81>
Cc: ietf-languages@iana.org, LTRU Working Group <ltru@ietf.org>
Subject: [Ltru] Re: Solving the UTF-8 problem

On Sun, Jul 01, 2007 at 03:58:48PM -0700,
 Doug Ewell <dewell@roadrunner.com> wrote 
 a message of 161 lines which said:

> Another possibility is to have IANA post an official version of the
> Registry in one encoding, such as UTF-8, and additional, unofficial
> versions in other encodings, such as Latin-1 or hex NCRs.

Why not? Currently, we do exactly the opposite: IANA publishes the
official registry in hex NCR
(http://www.iana.org/assignments/language-subtag-registry) and
langtag.net publishes an unofficial version in UTF-8.

> Potential problems with this approach are unintentional mismatches
> between the versions (I caught one of these problems for the ISO
> 639-3 people recently)

I do not get it. If the unofficial version is produced by a program,
how can a mismatch exist (unless there is a bug in the program)? 
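
To make the point concrete, here is a minimal sketch of such a program,
assuming the official file marks every non-ASCII character as a hex NCR
of the form &#xHHHH; and nothing else needs translating. The function
names and the sample line are hypothetical, chosen for illustration, and
are not taken from the tools IANA or langtag.net actually use.

```python
import re

# Matches hexadecimal numeric character references such as &#xE5;
HEX_NCR = re.compile(r"&#x([0-9A-Fa-f]+);")


def ncr_to_text(s: str) -> str:
    """Replace each hex NCR with the character it denotes."""
    return HEX_NCR.sub(lambda m: chr(int(m.group(1), 16)), s)


def text_to_ncr(s: str) -> str:
    """Replace each non-ASCII character with an uppercase hex NCR."""
    return "".join(
        c if ord(c) < 128 else "&#x{:X};".format(ord(c)) for c in s
    )


# Illustrative sample line (an assumption, not copied from the registry).
official_line = "Description: Norwegian Bokm&#xE5;l"

# Derive the unofficial version and encode it as UTF-8 bytes.
unofficial_text = ncr_to_text(official_line)
unofficial_bytes = unofficial_text.encode("utf-8")

# Mechanical mismatch check: converting back must give the official line.
assert text_to_ncr(unofficial_bytes.decode("utf-8")) == official_line
```

The last line is the interesting one: if both versions come from the
same program, the mismatch check can run at every publication. The
round trip is exact only if the official file writes its NCRs in one
consistent form (here, uppercase hex without leading zeros).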

And if the unofficial version is done by hand, should we tell ISO
639-3 that computers are better than people for boring and repetitive
tasks?
