Re: [Ltru] Extended language tags (long reply)

Addison Phillips <addison@yahoo-inc.com> Mon, 08 October 2007 03:31 UTC

Message-ID: <4709A420.80508@yahoo-inc.com>
Date: Sun, 07 Oct 2007 20:29:36 -0700
From: Addison Phillips <addison@yahoo-inc.com>
User-Agent: Thunderbird 2.0.0.6 (Windows/20070728)
MIME-Version: 1.0
To: Mark Davis <mark.davis@icu-project.org>
Subject: Re: [Ltru] Extended language tags (long reply)
References: <E1IdT7z-0001vv-Ly@megatron.ietf.org> <C9BF0238EED3634BA1866AEF14C7A9E55A597AC370@NA-EXMSG-C116.redmond.corp.microsoft.com> <4709146F.6020504@yahoo-inc.com> <9d70cb000710071715p398a669fhd06326843d9d9390@mail.gmail.com> <30b660a20710071740ma6d39a3u61c8543c70125847@mail.gmail.com>
In-Reply-To: <30b660a20710071740ma6d39a3u61c8543c70125847@mail.gmail.com>
Cc: "ltru@ietf.org" <ltru@ietf.org>

I would say:

1. With Extlangs.

- change to filtering: none, but you probably want to use extended 
filtering instead of basic filtering (e.g. the range "zh-Hant-HK" then 
matches both "zh-yue-Hant-HK" and "zh-cmn-Hant-HK")

- change to lookup: treat the extlang as atomic with the primary 
language subtag; potentially loop back through the remaining subtags 
without the extlang. That is, given the range "zh-yue-Hant-HK", the 
fallback pattern is either this:

  zh-yue-Hant-HK
  zh-yue-Hant
  zh-yue
  zh-Hant-HK
  zh-Hant
  zh
  (default)

Or, without the loop-back, this:

  zh-yue-Hant-HK
  zh-yue-Hant
  zh-yue
  (default)
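
The two lookup patterns above can be sketched in Python roughly as 
follows. This is only an illustration, not text from RFC 4647: the 
function names are made up, the extlang test (a three-letter second 
subtag) follows the RFC 4646 grammar, and singleton/extension handling 
is left out.

```python
def truncations(subtags, keep):
    """Yield progressively shorter tags, never going below `keep` subtags."""
    for end in range(len(subtags), keep - 1, -1):
        yield "-".join(subtags[:end])


def extlang_fallback(lang_range, loop_back=True):
    """Build a lookup fallback chain that treats language+extlang as atomic.

    Assumption: a second subtag of exactly three letters is an extlang.
    """
    subtags = lang_range.split("-")
    has_extlang = (
        len(subtags) > 1 and len(subtags[1]) == 3 and subtags[1].isalpha()
    )
    if not has_extlang:
        # Plain right-to-left truncation, as in ordinary lookup.
        return list(truncations(subtags, 1)) + ["(default)"]

    # Never truncate below "language-extlang".
    chain = list(truncations(subtags, 2))
    if loop_back:
        # Drop the extlang and walk the remaining subtags again.
        chain += list(truncations([subtags[0]] + subtags[2:], 1))
    return chain + ["(default)"]


for candidate in extlang_fallback("zh-yue-Hant-HK"):
    print(candidate)
```

Calling extlang_fallback("zh-yue-Hant-HK", loop_back=False) gives the 
shorter second pattern, which stops at "zh-yue" before the default.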

2. Without extlangs.

- change to filtering: none

- change to lookup: none

BUT... you want to include the macrolanguage in your ranges in some 
cases. Alternatively, we would have to define new filtering and lookup 
options that include mapping to macrolanguages. For example, with the 
range "yue-Hant-HK", you would want the fallback to be:

  yue-Hant-HK
  yue-Hant
  yue
  zh-Hant-HK
  zh-Hant
  zh
  (default)

This can be achieved either by having the language priority list 
"yue-Hant-HK;zh-Hant-HK" or by inferring it using registry data about 
macrolanguages.
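
As a rough Python sketch of both options: the MACROLANGUAGE table below 
is a hypothetical two-entry stand-in for registry data, and the 
function names are invented for illustration. The first function 
infers the priority list from the table; the second expands any 
priority list into a fallback chain by plain truncation.

```python
# Hypothetical stand-in for macrolanguage data from the subtag registry.
MACROLANGUAGE = {"yue": "zh", "cmn": "zh"}


def infer_priority_list(lang_range):
    """Append the macrolanguage variant of a range, if one is known."""
    primary, _, rest = lang_range.partition("-")
    macro = MACROLANGUAGE.get(primary.lower())
    if macro is None:
        return [lang_range]
    return [lang_range, macro + ("-" + rest if rest else "")]


def truncate(tag):
    """Return the tag followed by its right-to-left truncations."""
    subtags = tag.split("-")
    return ["-".join(subtags[:end]) for end in range(len(subtags), 0, -1)]


def chain_from_priority_list(ranges):
    """Expand each range in order, skipping candidates already listed."""
    chain = []
    for lang_range in ranges:
        for candidate in truncate(lang_range):
            if candidate not in chain:
                chain.append(candidate)
    return chain + ["(default)"]


explicit = chain_from_priority_list(["yue-Hant-HK", "zh-Hant-HK"])
inferred = chain_from_priority_list(infer_priority_list("yue-Hant-HK"))
print(explicit == inferred)
print(inferred)
```

Under these assumptions the explicit and the inferred priority lists 
expand to the same seven-step chain shown above.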

A similar case can be made for filtering.
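
One way to read the filtering case, sketched in Python: extended 
filtering here follows the matching steps in RFC 4647 section 3.3.2, 
while filter_with_macro and the MACROLANGUAGE table are hypothetical 
additions showing what a macrolanguage-aware filter might look like.

```python
# Hypothetical stand-in for macrolanguage data from the subtag registry.
MACROLANGUAGE = {"yue": "zh", "cmn": "zh"}


def basic_filter(lang_range, tag):
    """Basic filtering: the range must be a case-insensitive prefix."""
    r, t = lang_range.lower(), tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")


def extended_filter(lang_range, tag):
    """Extended filtering: range subtags match in order, gaps allowed."""
    rs = lang_range.lower().split("-")
    ts = tag.lower().split("-")
    if rs[0] != "*" and rs[0] != ts[0]:
        return False
    r, t = 1, 1
    while r < len(rs):
        if rs[r] == "*":
            r += 1                      # wildcard matches any run of subtags
        elif t >= len(ts):
            return False                # tag ran out of subtags
        elif rs[r] == ts[t]:
            r += 1
            t += 1
        elif len(ts[t]) == 1:
            return False                # never skip over a singleton
        else:
            t += 1                      # skip a non-matching tag subtag
    return True


def filter_with_macro(lang_range, tag):
    """Also try the tag with its primary language mapped to a macrolanguage."""
    if extended_filter(lang_range, tag):
        return True
    primary, _, rest = tag.partition("-")
    macro = MACROLANGUAGE.get(primary.lower())
    if macro is None:
        return False
    return extended_filter(lang_range, macro + ("-" + rest if rest else ""))


print(basic_filter("zh-Hant-HK", "zh-yue-Hant-HK"))
print(extended_filter("zh-Hant-HK", "zh-yue-Hant-HK"))
print(extended_filter("zh-Hant-HK", "yue-Hant-HK"))
print(filter_with_macro("zh-Hant-HK", "yue-Hant-HK"))
```

With extlangs, the range "zh-Hant-HK" reaches "zh-yue-Hant-HK" through 
extended filtering alone; without them, "yue-Hant-HK" is only reached 
if something like the macrolanguage mapping is applied.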

Addison

Mark Davis wrote:
> I think what might help is for us to draw up in detail what the matching 
> algorithms (changes to RFC 4647) would be like in either of the two 
> proposals, and look at what happens with different cases 'ar', 'zh', 
> 'no', and others.
> 
> Mark
> 
> On 10/7/07, Andrew Cunningham <lang.support@gmail.com> wrote:
> 
>     On 08/10/2007, Addison Phillips <addison@yahoo-inc.com> wrote:
> 
>      >
>      > So, for me, the main issue is whether we are going to explicitly
>      > break the connection between the language and its macrolanguage
>      > (at the tag level). In some cases (Norwegian) we already know this
>      > can be problematic; in others, it may actually be desirable.
>      >
> 
>     It probably comes down to philosophical differences.
> 
>     So, cutting to the chase, which way forward?
> 
>     Andrew
> 
> 
>     _______________________________________________
>     Ltru mailing list
>     Ltru@ietf.org
>     https://www1.ietf.org/mailman/listinfo/ltru
> 
> 
> 
> 
> -- 
> Mark

-- 
Addison Phillips
Globalization Architect -- Yahoo! Inc.
Chair -- W3C Internationalization Core WG

Internationalization is an architecture.
It is not a feature.

