RE: wadegile and pinyin LANGUAGE SUBTAG REGISTRATION FORMs

Peter Constable <petercon@microsoft.com> Wed, 03 September 2008 18:57 UTC

From: Peter Constable <petercon@microsoft.com>
To: ietflang IETF Languages Discussion <ietf-languages@iana.org>
Date: Wed, 03 Sep 2008 11:57:36 -0700
Subject: RE: wadegile and pinyin LANGUAGE SUBTAG REGISTRATION FORMs
Thread-Topic: wadegile and pinyin LANGUAGE SUBTAG REGISTRATION FORMs
Thread-Index: AckN1LHA2FcHtFtIS7G5lmFxtix2AQAIUT2Q
Message-ID: <DDB6DE6E9D27DD478AE6D1BBBB835795633B2E3A54@NA-EXMSG-C117.redmond.corp.microsoft.com>
References: <mailman.5.1219744802.2264.ietf-languages@alvestrand.no> <240C1D5B2BCD4479894DE98CA7FDF0C5@DGBP7M81> <CAE7BB83-82BA-4411-B2AC-9E23E090719C@evertype.com> <DDB6DE6E9D27DD478AE6D1BBBB835795633B2E3832@NA-EXMSG-C117.redmond.corp.microsoft.com> <20080903145216.GF5136@mercury.ccil.org>
In-Reply-To: <20080903145216.GF5136@mercury.ccil.org>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
acceptlanguage: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0

> From: John Cowan [mailto:cowan@ccil.org]


> > I think we all agree that Latin is implied. Chinese is also implied.
> > By this rationale, a complete tag of "wadegile" would work just as
> > well as "zh-wadegile" (BCP47 syntax requirements aside). In terms of
> > semantic representation, that is true: "wadegile" contains just as
> > much information as does "zh-wadegile".
>
> However, all tags MUST have a language subtag, so this analogy is not
> on all fours.

Even if the language subtag weren't required, I think we'd still recommend it be included for reasons I described.


> > Those are made more complicated if "Latn" is not part of the prefix
> > for "wadegile" and "pinyin".
>
> You still need to know that "wadegile" and "pinyin" imply Latin,
> because the Prefix for a subtag is only a SHOULD, so people are still free to
> send you "zh-wadegile" whether the Prefix says "zh" or "zh-Latn".

No, I (or some process) do not *need* to know. By analogy, that is like saying that Unicode implementations need to be able to interpret and process U+A866 correctly, which clearly is not what that standard requires. Nor does BCP47 require that implementations know that "wadegile" and "pinyin" imply Latin. In fact, I suspect that many *will not* know that, yet they would still be able to do the expected matching when comparing "zh-Latn" with "zh-Latn-wadegile" or "zh-Latn-pinyin". If we want people to get the best results -- or, as Niall said, the most fault-tolerant behaviour -- then we should recommend that they tag "zh-Latn-wadegile" and "zh-Latn-pinyin". The prefix field will be treated as exactly that: a recommendation.
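
To make the matching point concrete, here is a minimal sketch of prefix-based matching in the style of RFC 4647 basic filtering. The function name basic_filter_match is hypothetical and not taken from any library, and the sketch assumes a process that knows nothing about what individual variant subtags imply.

```python
def basic_filter_match(language_range: str, tag: str) -> bool:
    """Return True if `tag` matches `language_range` by simple prefix matching.

    Sketch of RFC 4647 basic filtering: a range matches a tag when the two
    are equal, or when the range is a prefix of the tag and the next
    character in the tag is "-". Comparison is case-insensitive.
    """
    rng = language_range.lower()
    candidate = tag.lower()
    if rng == "*":
        # The wildcard range matches every tag.
        return True
    return candidate == rng or candidate.startswith(rng + "-")


if __name__ == "__main__":
    # A process asking for "zh-Latn" content, with no knowledge that
    # "wadegile" or "pinyin" imply Latin script.
    for tag in ["zh-Latn-wadegile", "zh-Latn-pinyin", "zh-wadegile", "zh-pinyin"]:
        print(tag, basic_filter_match("zh-Latn", tag))
```

Under this kind of matching, a request for "zh-Latn" finds "zh-Latn-wadegile" and "zh-Latn-pinyin" but misses "zh-wadegile" and "zh-pinyin", which is the loss of fault tolerance described above.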



Peter