Re: Pinyin

CE Whitehead <cewcathar@hotmail.com> Sat, 27 September 2008 15:01 UTC

From: CE Whitehead <cewcathar@hotmail.com>
To: ietf-languages@iana.org
Subject: Re: Pinyin
Date: Sat, 27 Sep 2008 11:01:42 -0400


Hi (sorry to take so long to reply)

I agree with Michael and John that the subtag [pinyin] should apply to all languages that use an orthography closely related to Pinyin.  I think it will be convenient to have a subtag that distinguishes this Romanization from Yale and other Romanizations.

I can see that Romanizations of Mandarin Chinese will be the most important in terms of the number of people they reach.  But looking at the characters (sorry for my misuse of the term) in each orthography, I did not see enough differences (though I am not an expert) to warrant excluding the Pinyin Romanizations of Tibetan, or the Pinyin Romanization of Mandarin that is called Tongyong.  The big trick is distinguishing Tongyong from Hanyu, but my guess is that most people who can read one can read the other, so is differentiating these that important?

I do agree, however, with Randy and others that [zh-TW] might mean a particular variety of Mandarin as spoken in Taiwan, and so I dislike the use of country codes to differentiate Tongyong from Hanyu if/when it becomes necessary to differentiate them.

I'd prefer additional variants be registered to distinguish these--as Doug Ewell has suggested; we could register [Hanyu] right now.

I think such a tagging system will be specific enough for any text processing that needs to be done; I guess we are going to split a few hairs over this though.
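
As a rough illustration of the kind of text processing I have in mind, here is a minimal sketch in Python. The tags zh-Latn-pinyin and bo-Latn-pinyin, the narrower [hanyu] variant, and the helper names are all assumptions for illustration only; nothing here reflects what is actually registered, and the parsing handles only simple tags (no grandfathered tags).

```python
# Illustrative sketch only: example tags and helper names are assumptions,
# not entries taken from the language subtag registry.

def variants(tag: str) -> list[str]:
    """Return the variant subtags of a simple language tag.

    After the primary language subtag, a variant is 5-8 alphanumerics,
    or 4 characters starting with a digit. A single-character subtag
    starts an extension or private-use sequence, so scanning stops there.
    """
    found = []
    for sub in tag.lower().split("-")[1:]:
        if len(sub) == 1:
            break
        if 5 <= len(sub) <= 8 or (len(sub) == 4 and sub[0].isdigit()):
            found.append(sub)
    return found


def uses_pinyin(tag: str) -> bool:
    """True if the tag carries the (assumed) umbrella variant 'pinyin'."""
    return "pinyin" in variants(tag)


# One umbrella variant covers several languages; a narrower, hypothetical
# variant could follow it if Hanyu and Tongyong ever need separating.
for example in ("zh-Latn-pinyin", "bo-Latn-pinyin", "zh-Latn-pinyin-hanyu", "zh-Latn-TW"):
    print(example, uses_pinyin(example), variants(example))
```

A processor that only needs to know "is this some Pinyin orthography?" checks for the umbrella variant; one that cares about Hanyu versus Tongyong would look at the further variant, without touching the region subtag.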

Thanks.

--C. E. Whitehead
cewcathar@hotmail.com

From: John Cowan cowan@ccil.org

> Randy Presuhn scripsit:

>>> Regarding Niall Tracey's concern, I think that if the various orthographies
>>> use an almost identical character set,
>>> uh, -Latn- ?


> Clearly "character set" is not used in the standard sense here, but in
> a sense like "grapheme-to-phoneme mapping". (Technically, "Latn" is
> not a character set either, but a script that corresponds to a character
> repertoire.)
Yes this is it; sorry for my vagueness; thanks for clarifying what I wrote.

From: Michael Everson 
> On 25 Sep 2008, at 11:40, Tracey, Niall wrote:
>> To us the differences seem minor as all pinyins fall far outside of
>> our "western" frames of reference.
> Not at all. Not to me anyway. Nor to John, evidently.
>> But the more familiar something is to the observer, the more the
>> differences are exaggerated.
>> Conversely, the difference is massively
>> de-emphasised when both are unfamiliar.
Sometimes; I do not think this is what is going on here.

>> Remember that many white
>> people can't tell the difference between (for example) Chinese, Thai
>> and Japanese people despite a massive difference in phenotype.
>> What you perceive as a logical generalisation, the native speakers
>> may see as a racist generalisation.
> OK, this is over the top. The fact is that "Pinyin" refers to a particular
> use of the Latin alphabet (just as "fonupa" does) whose properties (in
> particular its definition of j, c, q, and x for instance) make it quite
> unique as regards other orthographies. In addition to Mandarin, the
> Chinese themselves apply this alphabet to other Chinese languages as
> well as other languages of China. That is why Pinyin is an umbrella term
> for these orthographies.

> Michael Everson * http://www.evertype.com
Thanks; this is what I'd understood from the discussion!

Best,

--C. E. Whitehead
cewcathar@hotmail.com