Re: [Ltru] Re: Punjabi

Addison Phillips <addison@yahoo-inc.com> Fri, 16 March 2007 23:46 UTC

Message-ID: <45FB2C4E.9090303@yahoo-inc.com>
Date: Fri, 16 Mar 2007 16:46:22 -0700
From: Addison Phillips <addison@yahoo-inc.com>
User-Agent: Thunderbird 1.5.0.10 (Windows/20070221)
MIME-Version: 1.0
To: Mark Davis <mark.davis@icu-project.org>
Subject: Re: [Ltru] Re: Punjabi
References: <E1HRsNL-0001ob-5h@megatron.ietf.org> <003501c76756$f2213760$6401a8c0@DGBP7M81> <30b660a20703161305h1f007acalb7ecf2c45224b4da@mail.gmail.com> <20070316210509.GF17950@mercury.ccil.org> <30b660a20703161537q77fcf86y9c6488e0eb0603b@mail.gmail.com> <45FB2259.7050202@yahoo-inc.com> <30b660a20703161617u85dbfe1r44ddc29fcfcf1a6d@mail.gmail.com>
In-Reply-To: <30b660a20703161617u85dbfe1r44ddc29fcfcf1a6d@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 7bit
Cc: Doug Ewell <dewell@adelphia.net>, LTRU Working Group <ltru@ietf.org>

What extlangs buy us is this: for languages that are already tagged, the 
current primary language subtags remain consistently the primary 
language subtag of choice. You might add additional subtags, but you 
don't have to retag all of your content.

Thus "zh-Hans-CN" is still valid. You would not (could not) retag with 
"cmn-Hans-CN". You might retag with "zh-cmn-Hans-CN", but this 
interferes less drastically with matching than outright change of the 
primary subtag.
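
To make the matching point concrete, here is a rough sketch. It assumes 
simple prefix matching of a language range against a tag (case-insensitive, 
on subtag boundaries, in the style of RFC 4647 basic filtering). The 
function name basic_filter_match is made up for this illustration and is 
not taken from any library.

```python
def basic_filter_match(language_range: str, tag: str) -> bool:
    """Return True if `tag` matches `language_range` by prefix.

    Assumption for this sketch: a range matches a tag when the two are
    equal, or when the range is a prefix of the tag that ends on a
    subtag boundary ("-"). Comparison is case-insensitive.
    """
    rng = language_range.lower()
    candidate = tag.lower()
    if rng == "*":
        # The wildcard range matches every tag.
        return True
    return candidate == rng or candidate.startswith(rng + "-")


if __name__ == "__main__":
    tags = ["zh-Hans-CN", "zh-cmn-Hans-CN", "cmn-Hans-CN"]
    for rng in ["zh", "zh-Hans", "cmn"]:
        for tag in tags:
            print(f"range {rng!r} vs tag {tag!r}: {basic_filter_match(rng, tag)}")
```

Under these assumptions a range of "zh" still finds content retagged as 
"zh-cmn-Hans-CN" but misses content retagged as "cmn-Hans-CN". A range of 
"zh-Hans" misses both retagged forms, so inserting the extlang is not free 
either; it just breaks less.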

What we want are consistent choices for language tags.

One alternative would be to allow both "zh-cmn" and "cmn". Users would 
have to be careful to use these consistently in their content and range 
selection.

Another alternative would be to forget extlang altogether and permit 
*either* "zh" *or* "cmn" but not both in the same tag (except by 
grandfathering). This frees the extlang up for other, nefarious, purposes.

My surmise is that macro-languages are a one-time event: "discovery" of 
future macro-languages will mostly be prohibited by rule (since most of 
the languages will already have codes in the "primary" position when 
they become part of a macro-language collection). If my surmise is 
correct, we could ban future extlang additions and use the remainder of 
that namespace for (well) nefarious purposes.

The only exception to my surmise would be: a language not previously 
given a code that is part of an existing macro-language. A language that 
isn't currently a macro-language that receives a new sublanguage would 
be a problem (all of the macro-languages are, by definition, one-to-many 
mappings). That is, the sublanguages always travel in (at least) pairs.

Does that make sense?

Addison

Mark Davis wrote:
> I must not be clearly stating my point. Let me try again.
> 
> I'm getting at the fundamental reasoning behind extlang at all. I'm not 
> arguing that we should use different DATA than ISO 639-3 in order to 
> decide what are extlangs and what are not. What I'm saying is: why do we 
> need the extlang construction at all? Why do we need to have zh-cmn 
> instead of just cmn?
> 
>  > Worst case we have some languages that could have been extlangs that
>  > become primary language subtags instead.
> 
> That is: Why don't we simply have *all* languages be primary language 
> subtags instead of extlangs? What do extlangs buy us, that we need the 
> complication that they introduce?
> 
> Mark
> 
> On 3/16/07, *Addison Phillips* <addison@yahoo-inc.com 
> <mailto:addison@yahoo-inc.com>> wrote:
> 
> 
> 
>     Mark Davis wrote:
>      > Yes, and it concerns me that we are baking in a particular view of the
>      > world by requiring that some language tags can only be used in
>      > conjunction with others (eg pmu can only be used with lah-pmu). Now
>      > maybe I'm missing something, can you articulate the reasons why we must
>      > use lah-pmu (for example) instead of just pmu?
>      >
> 
>     The reasons are all procedural: any ISO 639-3 code that is contained by
>     a macro-language and is not previously encoded by ISO 639-1 or ISO 639-2
>     must be an extlang whose Prefix is the macro-language code.
> 
>     This allows us to piggyback on ISO 639-3's work in this area to create
>     tags such as zh-cmn and avoid naked 'cmn' language tags without having,
>     ourselves, to squint at the lists and make separate, possibly
>     unreasonable, decisions.
> 
>     One possible way to avoid this would be to limit the "automatic"
>     creation of mappings to those languages that have ISO 639-1 codes. This
>     greatly limits the impact. The problem here is that it is arbitrary (why
>     Aymara and not Baluchi? why Cree and not Delaware?)
> 
>     To respond to Doug's point, I think that we are not *forced* to delay
>     RFC 4646bis or even 4645bis's appearance. What we need are clear rules
>     for the incorporation of ISO 639-3 into the registry scheme. This
>     *could* take the form of a Big Bang insertion. But it would certainly be
>     valid to insert only the language (and not the extlang) codes initially
>     or to include only the finalized and stable extlang codes when they are
>     mature---on a different day.
> 
>     I would suggest that a mechanism for doing this would be to take each
>     macro-language, as a collection, and vet the contents with the RA and
>     on-list with ietf-languages before doing an insert. The Chinese and
>     Arabic collections probably get in straight-away. Lahnda's subtags (by
>     way of example) could wait---and no one is hurt by that delay. The only
>     hurt is getting the mapping wrong. Worst case we have some languages
>     that could have been extlangs that become primary language subtags
>     instead.
> 
>     Addison
> 
>     --
>     Addison Phillips
>     Globalization Architect -- Yahoo! Inc.
> 
>     Internationalization is an architecture.
>     It is not a feature.
> 
> 
> 
> 
> -- 
> Mark

-- 
Addison Phillips
Globalization Architect -- Yahoo! Inc.

Internationalization is an architecture.
It is not a feature.
