RE: Pinyin

"Phillips, Addison" <addison@amazon.com> Wed, 24 September 2008 19:34 UTC

From: "Phillips, Addison" <addison@amazon.com>
To: David Starner <prosfilaes@gmail.com>, John Cowan <cowan@ccil.org>
Date: Wed, 24 Sep 2008 12:34:14 -0700
Subject: RE: Pinyin
Thread-Topic: Pinyin
Thread-Index: Ackee3SxqJ99qzyGQNeWvbucnKkTRwAAB87g
Message-ID: <4D25F22093241741BC1D0EEBC2DBB1DA014C26B041@EX-SEA5-D.ant.amazon.com>
References: <83C5E5CB-FE27-47BA-A98F-F5003F586A64@evertype.com> <006e01c91e68$9e4abce0$6801a8c0@oemcomputer> <20080924172101.GU19886@mercury.ccil.org> <6d99d1fd0809241042g44eba0e8q613989437a958ee@mail.gmail.com> <20080924190502.GD11053@mercury.ccil.org> <6d99d1fd0809241226k384e9a11h41ebae090bb1b8d6@mail.gmail.com>
In-Reply-To: <6d99d1fd0809241226k384e9a11h41ebae090bb1b8d6@mail.gmail.com>
Cc: "ietf-languages@iana.org" <ietf-languages@iana.org>

The guiding principle in forming language tags is to "tag content wisely." Wisdom, of course, takes different forms, but 'km', 'es', and 'pqm' are almost certainly better tag choices than tags that include dubious region subtags when the tagger is not familiar enough with the language to know whether a regional distinction applies. A librarian (or anyone else) is better off following RFC 4646 and/or its successor, where it says:

--
Subtags SHOULD only be used where they add useful distinguishing information; extraneous subtags interfere with the meaning, understanding, and processing of language tags.
--
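
As a rough illustration of that rule, here is a minimal Python sketch that drops a region subtag unless the tagger has recorded that the distinction matters. The function name minimal_tag and the known_distinctions set are hypothetical, and the pattern is an assumption that handles only language, optional script, and optional region; it does not cover the variants, extensions, or private-use subtags that RFC 4646 defines.

```python
import re

# Simplified pattern (an assumption for illustration): language,
# optional script, optional region. Variants, extensions, and
# private-use subtags from RFC 4646 are not handled.
_TAG = re.compile(
    r"^(?P<lang>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def minimal_tag(tag, known_distinctions=frozenset()):
    """Drop the region subtag unless (language, region) is listed in
    known_distinctions, a hypothetical set maintained by the tagger."""
    m = _TAG.match(tag)
    if m is None:
        return tag  # leave anything this sketch cannot parse untouched
    lang = m.group("lang").lower()
    parts = [lang]
    if m.group("script"):
        parts.append(m.group("script").title())
    region = m.group("region")
    if region and (lang, region.upper()) in known_distinctions:
        parts.append(region.upper())
    return "-".join(parts)


# The tagger knows of no regional distinction for these, so the
# region is treated as extraneous and dropped.
for tag in ("km-US", "es-US", "pqm-US"):
    print(tag, "->", minimal_tag(tag))

# Here the tagger has recorded that the distinction is useful.
print(minimal_tag("en-GB", {("en", "GB")}))
```

The choice to default to the shorter tag mirrors the quoted text: a region is kept only when someone has positively decided that it adds distinguishing information.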

Addison

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.


> -----Original Message-----
> From: ietf-languages-bounces@alvestrand.no [mailto:ietf-languages-bounces@alvestrand.no] On Behalf Of David Starner
> Sent: Wednesday, September 24, 2008 12:26 PM
> To: John Cowan
> Cc: ietf-languages@iana.org
> Subject: Re: Pinyin
> 
> On Wed, Sep 24, 2008 at 3:05 PM, John Cowan <cowan@ccil.org> wrote:
> > David Starner scripsit:
> >
> >> km-US and tr-DE make sense; even if bo-TW doesn't exist today, it
> >> doesn't seem that unlikely a combination to appear in the future.
> >
> > Not really.  km-US may make sense as a *locale*, but as a *language
> > tag* it would mean "Khmer as spoken/written in the U.S.", which as far
> > as I know is not significantly different from Khmer as spoken/written
> > in Cambodia.
> 
> But whether or not that distinction is important may not be something
> the tagger knows; a librarian tagging recordings is probably better
> off tagging km-US, es-US and pqm-US for recordings of those languages
> as spoken by Americans in the US than fussing over the fine details of
> dialectal variation in those languages.