Re: [Ltru] Consensus call: extlang

Peter Constable <petercon@microsoft.com> Wed, 04 June 2008 03:18 UTC

From: Peter Constable <petercon@microsoft.com>
To: LTRU Working Group <ltru@ietf.org>
Date: Tue, 03 Jun 2008 20:18:15 -0700

> From: ltru-bounces@ietf.org [mailto:ltru-bounces@ietf.org] On Behalf Of
> Doug Ewell


> > It must be crystal clear that 'zh' does NOT mean Mandarin, but rather
> > any kind of Chinese.  Of course, there are cases where one might
> > deliberately be imprecise in tagging, but we should not be so sloppy
> > in our own definitions to equate the label for the superset with the
> > subset, just as we must avoid falsely limiting 'de' to "Standard
> > German" or 'ar' to "Standard Arabic".
>
> We should decide whether this is the general sense of the working group,
> or if it's more along the lines of "Tagging non-Mandarin Chinese with
> 'zh' is comparable to walking through dangerous neighborhoods in the
> middle of the night, blindfolded, with $100 bills pinned to one's
> body."

Whether or not it is so risky depends on the scenario. There are some scenarios in which it may be completely unproblematic to tag content in either Mandarin or Cantonese as "zh". Librarians do the ISO 639-2 equivalent of that all the time, for example. There may also be applications in which there would be certain risks in tagging Cantonese content as "zh", and in principle there may also be applications in which tagging Mandarin content as "zh" might present risks.

"zh" means "Chinese"; not Mandarin, not Cantonese.

Know the requirements of your application scenarios; tag wisely.
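
To make the scenario dependence concrete, here is a minimal sketch of RFC 4647 basic filtering, in which a language range matches a tag when the two are equal or the range is a prefix of the tag followed by "-". The function name, the sample tags, and their descriptions are assumptions made for illustration; the extlang-style tags "zh-cmn" and "zh-yue" are the form under discussion in this thread, not a settled recommendation.

```python
def basic_filter(language_range: str, tag: str) -> bool:
    """RFC 4647 basic filtering (case-insensitive prefix match on subtags)."""
    r = language_range.lower()
    t = tag.lower()
    if r == "*":
        # The wildcard range matches every tag.
        return True
    # Match on equality, or on a prefix that ends at a subtag boundary.
    return t == r or t.startswith(r + "-")


# Hypothetical content tags, for illustration only.
content = {
    "zh": "Chinese, variety unspecified",
    "zh-cmn": "Mandarin, tagged with an extlang subtag",
    "zh-yue": "Cantonese, tagged with an extlang subtag",
    "yue": "Cantonese, tagged without the 'zh' prefix",
}

# A request for "zh" retrieves anything tagged as some kind of Chinese.
matches_zh = [tag for tag in content if basic_filter("zh", tag)]

# A request for "zh-yue" does not retrieve content tagged only as "zh".
matches_zh_yue = [tag for tag in content if basic_filter("zh-yue", tag)]

print(matches_zh)
print(matches_zh_yue)
```

Under these assumptions, a consumer asking for "zh" gets Mandarin and Cantonese content alike, which is fine for a library catalogue and risky for, say, a speech application. Content tagged only "zh" is invisible to a consumer asking specifically for "zh-yue", and content tagged "yue" is invisible to one asking for "zh".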




Peter