Re: [Ltru] Macrolanguage usage

"Phillips, Addison" <addison@amazon.com> Fri, 16 May 2008 17:13 UTC

From: "Phillips, Addison" <addison@amazon.com>
To: Peter Constable <petercon@microsoft.com>, LTRU Working Group <ltru@ietf.org>
Date: Fri, 16 May 2008 10:13:27 -0700

Peter wrote:
>
> Again, I don't see why what I said is interpreted as equating 'zh' with
> Mandarin. I'm completely opposed to that. Read carefully the text that
> Mark sent: nowhere does it say, "If your content is Mandarin, you
> SHOULD use 'zh'," or, "It can be assumed that content tagged 'zh' is in
> Mandarin."
>

The text said (in part):

--
• where content written in an encompassed language is also understandable in the predominant language (that being a distinct language encompassed by the same macrolanguage), the content could also be tagged with the macrolanguage identifier.
--

It then goes on to say:

--
Thus if a Cantonese passage is understandable if read as Mandarin, it could also be tagged with "zh",...
--

You are correct that this doesn't explicitly *say* 'zh' == 'cmn'. But it at least implies that Mandarin is the predominant language.

The real issue with the text you and Mark proposed is that it does not provide clear guidance, which, as I understand it, is one of Karen's key objections. (I happen to know that she's away from email for a couple of days, so she may not be able to respond in a timely fashion; I ask pardon if I misstate or misinterpret her position.) As I interpret them, the three recommendations in your bullet list, when applied to cmn/zh, are:

1. You can choose to tag or look up Mandarin as either 'cmn' or 'zh', since Mandarin is the predominant language.
2. You can choose to consider 'zh' and 'cmn' to be synonyms or not as determined by your application's needs.
3. If the content is consistent with 'cmn', you can tag it as 'zh', since 'cmn' is the predominant form of 'zh'.
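
To make interpretation #2 concrete, here is a minimal sketch of an application that decides for itself whether 'zh' and 'cmn' are synonyms. Everything in it is hypothetical and for illustration only: the PREDOMINANT table, the function names, and the treat_as_synonyms switch are my assumptions, not part of the proposed text or of any registry or library.

```python
# Hypothetical illustration only; none of these names come from a real library.
# Assumed mapping from a macrolanguage subtag to its predominant encompassed language.
PREDOMINANT = {"zh": "cmn"}


def canonical(tag: str, treat_as_synonyms: bool) -> str:
    """Return the lowercase primary language subtag, optionally folding a
    macrolanguage into its predominant encompassed language."""
    subtag = tag.split("-")[0].lower()
    if treat_as_synonyms:
        return PREDOMINANT.get(subtag, subtag)
    return subtag


def same_language(a: str, b: str, treat_as_synonyms: bool) -> bool:
    """Compare two tags by primary language subtag only (script, region, and
    other subtags are ignored in this sketch)."""
    return canonical(a, treat_as_synonyms) == canonical(b, treat_as_synonyms)
```

With the switch on, 'zh-Hans' and 'cmn' compare equal while 'yue' and 'zh' do not; with it off, 'zh' and 'cmn' are distinct. That asymmetry between the predominant language and the other encompassed languages is what the questions below are about.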

I don't have an argument with any of these except #3. The guidance on this last point is not clear. Why the predominant language only? What about other forms? What about the example text I just suggested (note that it needed some editing):

--
While documents written in Standard Mandarin SHOULD use the 'cmn' (Mandarin) language subtag to form the larger language tag, where it is necessary to indicate that the text is intended for wide accessibility, applications MAY use the 'zh' subtag.
--

Was this your intention with the above?

Addison