Re: [Ltru] Macrolanguage usage

"Mark Davis" <mark.davis@icu-project.org> Thu, 15 May 2008 21:01 UTC

Message-ID: <30b660a20805151400g7f84bc7em81304f19c6b969cc@mail.gmail.com>
Date: Thu, 15 May 2008 14:00:41 -0700
From: Mark Davis <mark.davis@icu-project.org>
To: Shawn Steele <Shawn.Steele@microsoft.com>
In-Reply-To: <C9BF0238EED3634BA1866AEF14C7A9E56155D47AF4@NA-EXMSG-C116.redmond.corp.microsoft.com>
MIME-Version: 1.0
References: <30b660a20805150829hda2c1e4p114504a973843543@mail.gmail.com> <C9BF0238EED3634BA1866AEF14C7A9E56155D47AF4@NA-EXMSG-C116.redmond.corp.microsoft.com>
Cc: LTRU Working Group <ltru@ietf.org>
Subject: Re: [Ltru] Macrolanguage usage

Perhaps it does, but I don't see why it would need to. Could you explain a
bit more?

Mark

On Thu, May 15, 2008 at 1:09 PM, Shawn Steele <Shawn.Steele@microsoft.com>
wrote:

> Sounds reasonable to me, but it seems like RFC 4647 would need to be updated to
> handle the 2nd bullet.
>
> - Shawn
>
> From: ltru-bounces@ietf.org [mailto:ltru-bounces@ietf.org] On Behalf Of Mark Davis
> Sent: Thursday, May 15, 2008 8:29 AM
> To: LTRU Working Group
> Subject: [Ltru] Macrolanguage usage
>
> Peter Constable and I were at the Unicode Technical Committee meeting, and
> had a chance to talk about macrolanguages. One of the key points is to
> provide implementers with enough guidance as to what they should do, while
> not precluding reasonable alternatives. Here's what we were thinking about
> (not in formal language, but the points we wanted to make).
>
> ===
>
> Formally, a macrolanguage identifier could be used to tag or look up
> content in any encompassed language; alternately, specific,
> individual-language identifiers could be used to tag or look up content in
> those languages. In consideration of these alternatives, the following
> provides guidance for how to maintain backwards compatibility while giving
> the ability to clearly tag and look up content.
>
> 1. Implementations generally should tag and/or look up content with the
> specific language where possible (this is the general recommendation with
> language tags). In the case of macrolanguages, this means that Cantonese
> should be tagged and looked up with "yue", Hakka with "hak", Tajiki Arabic
> with "abh", Plains Cree with "crk", and so on.
>
> 2. An exception to this general recommendation may apply in the case of
> macrolanguages with predominant forms, listed in Table 8. For backwards
> compatibility in those cases:
>
>    - an implementation could tag and/or look up content in the predominant
>    language either with the macrolanguage or the encompassed language (e.g.,
>    either "ar" or "arb" for Standard Arabic).
>    - an implementation could make a distinction between these in lookup,
>    or could return the same content. That is, lookup for "zh-SG" and "cmn-SG"
>    may return the same content, or may return different content if the
>    implementation needs to make a distinction for some purpose.
>
>    - where content written in an encompassed language is also
>    understandable in the predominant language (that being a distinct language
>    encompassed by the same macrolanguage), the content could also be tagged
>    with the macrolanguage identifier. Thus, if a Cantonese passage is
>    understandable when read as Mandarin, it could also be tagged with "zh";
>    likewise, where a Tajiki Arabic passage is also understandable in Standard
>    Arabic, it could be tagged with "ar".
>
> 3. Another exception to this general recommendation applies in the case of
> applications that have limitations that exclude the identifiers for
> encompassed, individual languages of a macrolanguage. For example, some
> content cataloguing systems limit language identifiers to those in ISO
> 639-2; as a result, they may support a macrolanguage identifier but not the
> identifiers for the encompassed languages of that macrolanguage.
>
> Mark and Peter
>
> --
> Mark
>
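
To make the second bullet of point 2 concrete, here is a rough sketch of
how a lookup could treat "zh-SG" and "cmn-SG" as the same request, or keep
them distinct. It is only an illustration, not proposed text: the function
names, the `distinguish` flag, and the mapping table are all hypothetical,
and the table holds just the two predominant-form pairs mentioned above
(the full list would come from Table 8, which is not reproduced here).

```python
# Hypothetical table: predominant encompassed language -> macrolanguage.
# Only the two pairs named in the message are included.
PREDOMINANT_TO_MACRO = {"cmn": "zh", "arb": "ar"}


def canonical_lookup_key(tag):
    """Swap a predominant encompassed-language subtag for its macrolanguage.

    Only the primary language subtag is touched; the rest of the tag
    (region, script, ...) is passed through unchanged.
    """
    primary, sep, rest = tag.partition("-")
    primary = primary.lower()
    primary = PREDOMINANT_TO_MACRO.get(primary, primary)
    return primary + sep + rest


def lookup(content, tag, distinguish=False):
    """Return the content stored for `tag`, or None if nothing matches.

    With distinguish=False, "zh-SG" and "cmn-SG" resolve to the same entry.
    With distinguish=True, only an exact match on the tag is accepted.
    """
    if distinguish:
        return content.get(tag)
    wanted = canonical_lookup_key(tag)
    for key, value in content.items():
        if canonical_lookup_key(key) == wanted:
            return value
    return None
```

With content stored under "zh-SG", a call such as lookup(content, "cmn-SG")
finds it, while lookup(content, "cmn-SG", distinguish=True) does not. Tags
for other encompassed languages, such as "yue", are left alone, in line
with point 1.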



-- 
Mark