RE: Pinyin

"Phillips, Addison" <addison@amazon.com> Wed, 24 September 2008 20:15 UTC

From: "Phillips, Addison" <addison@amazon.com>
To: David Starner <prosfilaes@gmail.com>
Date: Wed, 24 Sep 2008 13:15:10 -0700
Subject: RE: Pinyin
Message-ID: <4D25F22093241741BC1D0EEBC2DBB1DA014C26B0EF@EX-SEA5-D.ant.amazon.com>
References: <83C5E5CB-FE27-47BA-A98F-F5003F586A64@evertype.com> <006e01c91e68$9e4abce0$6801a8c0@oemcomputer> <20080924172101.GU19886@mercury.ccil.org> <6d99d1fd0809241042g44eba0e8q613989437a958ee@mail.gmail.com> <20080924190502.GD11053@mercury.ccil.org> <6d99d1fd0809241226k384e9a11h41ebae090bb1b8d6@mail.gmail.com> <4D25F22093241741BC1D0EEBC2DBB1DA014C26B041@EX-SEA5-D.ant.amazon.com> <6d99d1fd0809241250k18a51b12p7b13d313d3eb41a3@mail.gmail.com>
In-Reply-To: <6d99d1fd0809241250k18a51b12p7b13d313d3eb41a3@mail.gmail.com>
Cc: "ietf-languages@iana.org" <ietf-languages@iana.org>, John Cowan <cowan@ccil.org>

> 
> es-US is going to be very distinct from es-SP

Definitely, since 'SP' isn't a valid region code :-).
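As an illustration of that check, here is a minimal sketch in Python. The `KNOWN_REGIONS` set is a tiny hand-picked sample of ISO 3166-1 alpha-2 codes chosen for this example, not the real subtag registry, and the helper names are hypothetical. The region lookup is deliberately simplified (it takes the first two-letter alphabetic subtag after the language).

```python
# Illustrative sample only: a real check would consult the full
# IANA Language Subtag Registry rather than this short list.
KNOWN_REGIONS = {"ES", "US", "MX", "TW", "CN"}


def region_subtag(tag):
    """Return the first two-letter alphabetic subtag after the language, or None."""
    for subtag in tag.split("-")[1:]:
        if len(subtag) == 1:
            # A singleton starts an extension or private-use section; stop here.
            break
        if len(subtag) == 2 and subtag.isalpha():
            return subtag.upper()
    return None


def has_known_region(tag):
    """True if the tag carries a region subtag found in the sample set."""
    return region_subtag(tag) in KNOWN_REGIONS
```

With this sample set, `has_known_region("es-US")` and `has_known_region("es-ES")` hold, while `has_known_region("es-SP")` does not; the code for Spain is 'ES'.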

> 
> As for the ongoing discussion, there is data tagged zh-TW, and when
> that data is romanized into Hanyu Pinyin, it will be very natural to
> tag it zh-TW-pinyin, and I see it hard to argue that's formally wrong.
> So zh-TW(-Latn)-pinyin can't be trusted to be Tongyong Pinyin.

No, it can't. I think that's the point Randy is trying to make. A subtag that means, ambiguously, just 'pinyin' can pick up an implied additional meaning from the surrounding subtags, but that inference may be wrong. An explicitly defined subtag isn't ambiguous (within the limits of what it defines).

So "zh-Latn-TW-hpinyin" is definitely not Tongyong (ditto a "zh-Latn-TW-pinyin-hanyu"). Whereas "zh-Latn-TW-pinyin" might be either--or some other pinyin.
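The distinction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: 'hpinyin' and 'hanyu' are the hypothetical subtags floated in this thread, not registered ones, and `pinyin_system` and `EXPLICIT_VARIANTS` are names made up for the example.

```python
# Hypothetical variant subtags from this discussion that would name
# one specific romanization. None of these is assumed to be registered.
EXPLICIT_VARIANTS = {
    "hpinyin": "Hanyu Pinyin",
    "hanyu": "Hanyu Pinyin",  # as in "pinyin-hanyu"
}


def pinyin_system(tag):
    """Classify a tag by what its subtags alone say about the pinyin system."""
    subtags = tag.lower().split("-")[1:]  # skip the primary language subtag
    for subtag in subtags:
        if subtag in EXPLICIT_VARIANTS:
            # An explicitly defined subtag settles the question by itself.
            return EXPLICIT_VARIANTS[subtag]
    if "pinyin" in subtags:
        # A bare 'pinyin' does not say which system; guessing from the
        # region (e.g. TW implies Tongyong) may be wrong, so do not guess.
        return "ambiguous"
    return "not pinyin"
```

Here `pinyin_system("zh-Latn-TW-hpinyin")` and `pinyin_system("zh-Latn-TW-pinyin-hanyu")` both give "Hanyu Pinyin", while `pinyin_system("zh-Latn-TW-pinyin")` gives "ambiguous". The function never consults the region subtag, which is the point: the region does not reliably identify the romanization.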

What is key here for me is whether this distinction is actually important for the requesters or others who would use the subtag. If, in practice, the only thing that matters is "it's in Latin script and it isn't Wade-Giles", then 'pinyin' does that just fine.

Addison