RE: [Ltru] Re: Punjabi

"Don Osborn" <> Sat, 17 March 2007 00:34 UTC

From: "Don Osborn" <>
To: "'Addison Phillips'" <>, "'Mark Davis'" <>
References: <> <003501c76756$f2213760$6401a8c0@DGBP7M81> <> <> <> <> <> <>
In-Reply-To: <>
Subject: RE: [Ltru] Re: Punjabi
Date: Fri, 16 Mar 2007 20:33:51 -0400
Cc: 'Doug Ewell' <>, 'LTRU Working Group' <>

This is getting interesting but tricky. I think John's point...

"we don't know what the scope of actual practice will be.  For the first time we are extending language tagging to languages that either don't exist in written form at all, or are "written" by writing some closely related language.
This opens up a whole new set of hitherto unknown applications."

...was on target. I also concur with Mark's unease with "baking" in a certain approach, given the potential alternative needs. I would say "flexibility" here but that got me into trouble before.

Re Addison's explanations, replies in text...

> -----Original Message-----
> From: Addison Phillips []
> What we want are consistent choices for language tags.
> One alternative would be to allow both "zh-cmn" and "cmn". Users would
> have to be careful to use these consistently in their content and range
> selection.

In effect, cmn would "map" to zh? I suspect the mere existence of the codes pretty much guarantees that some people will, for whatever reason, use the language code without attention to the macrolanguage code. One would have to consider this possibility and provide for it, so allowing both forms seems to be a good idea.

> Another alternative would be to forget extlang altogether and permit
> *either* "zh" *or* "cmn" but not both in the same tag (except by
> grandfathering). This frees the extlang up for other, nefarious,
> purposes.

But presumably there is a relationship between zh and cmn? What about permitting all three options? This goes back to the ff examples. IOW, one could tag the macrolanguage, or the language, or both, with the latter two having the same function.

For some languages at least, I wonder if removing the extlang relationship wouldn't lead to some confusion. On the other hand, if overlapping locales and the possibility of alternative tags being used for essentially the same thing are not a problem, then maybe this could work.
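
To make the "all three options" idea a bit more concrete, here is a rough sketch in Python. It is only an illustration, not anything from the drafts: the table MACRO_OF is hand-written and hypothetical (a real one would come from the registry), and the function names canonical and matches are made up for this example. It treats "zh-cmn" and "cmn" as the same thing, and lets a range of "zh" also pick up content tagged with a member language.

```python
# Hypothetical, hand-written table: language subtag -> its macrolanguage.
# A real implementation would load this from the registry instead.
MACRO_OF = {"cmn": "zh", "yue": "zh"}


def canonical(tag: str) -> str:
    """Collapse 'macro-lang' (e.g. 'zh-cmn') to the bare language ('cmn')."""
    subtags = tag.lower().split("-")
    # Drop the leading macrolanguage only when the next subtag is one of
    # its member languages according to the table above.
    if len(subtags) > 1 and MACRO_OF.get(subtags[1]) == subtags[0]:
        subtags = subtags[1:]
    return "-".join(subtags)


def matches(language_range: str, tag: str) -> bool:
    """Simple prefix comparison after canonicalizing both sides."""
    want = canonical(language_range).split("-")
    have = canonical(tag).split("-")
    if have[:len(want)] == want:
        return True
    # A macrolanguage range (e.g. 'zh') also covers its member languages.
    macro = MACRO_OF.get(have[0])
    if macro is not None:
        return ([macro] + have[1:])[:len(want)] == want
    return False


if __name__ == "__main__":
    for rng, tag in [("zh", "cmn"), ("cmn", "zh-cmn"), ("cmn", "zh")]:
        print(rng, tag, matches(rng, tag))
```

Under these assumptions, someone who tags content "cmn" without attention to the macrolanguage is still found by a "zh" range, while a "cmn" range does not pull in content tagged only "zh". Whether that asymmetry is what we want is of course part of the question.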

> My surmise is that macro-languages are a one-time event: "discovery" of
> future macro-languages will mostly be prohibited by rule (since most of
> the languages will already have codes in the "primary" position when
> they become part of a macro-language collection). If my surmise is
> correct, we could ban future extlang additions and use the remainder of
> that namespace for (well) nefarious purposes.

I don't agree here. I think there are de facto macrolanguages out there, without the title, that may need new codes, along with the extlang relationships those would imply with other (sub)languages already coded in 639-3 or even 639-1/2. Runyakitara, Oshiwambo, and Beti are possible examples, from my understanding of the descriptions. At the very least, the door should not be closed.

It is a truism that language is changing, but for a number of languages there are factors (per John's comment) that are still not clear or are more "changeable" than the ones we have more experience with in ICT.

