RE: [Ltru] John Cowan throws in the towel on extlangs

Peter Constable <> Fri, 30 November 2007 15:38 UTC


> From: Don Osborn []

> Apologies in advance if this is a non sequitur to this particular
> thread,

It is not.

> but is there provision for new macrolanguages? A while back there was a
> suggestion that we know them all, but I could suggest other
> possibilities such as Runyakitara (already mentioned), Oshiwambo,
> possible new mix for Manding languages. This may be ISO 639 territory,
> but also would relate to use of tags.

New macrolanguage entries can be added to ISO 639.

If RFC4646bis were to use extlang subtags prefixed by macrolanguage subtags, then a new macrolanguage ID could not be prefixed to existing language IDs. The macrolanguage ID could still be used on its own, though; that might require data in the LSR to support matching a macrolanguage ID against the IDs of the languages it encompasses (as would be done if extlangs are not used).

For example, if none of zh, cmn and yue were registered today, they could be added tomorrow and the extlang forms zh-cmn and zh-yue could be used. If, on the other hand, cmn and yue were registered today but zh was not, zh could still be added tomorrow, but cmn and yue would already be in use as primary subtags, so zh could not be prefixed to them. It could still be used as a primary subtag, without extlang subtags; we would just need to allow for matching content tagged "cmn" or "yue" against a request for "zh", or vice versa.
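
To make the matching requirement concrete, here is a minimal sketch in Python of how a request for a macrolanguage might be matched against content tagged with an encompassed language, and vice versa. The table and function names (MACROLANGUAGE_OF, primary_subtag, matches) are hypothetical, and the table is hard-coded purely for illustration; the assumption is that equivalent data would come from the LSR. Only the zh/cmn/yue relationships are taken from the discussion above.

```python
# Hypothetical lookup table: encompassed language -> macrolanguage.
# Assumption: in practice this data would be drawn from the LSR.
MACROLANGUAGE_OF = {
    "cmn": "zh",
    "yue": "zh",
}


def primary_subtag(tag: str) -> str:
    """Return the lowercased primary language subtag of a tag."""
    return tag.split("-")[0].lower()


def matches(request: str, content: str) -> bool:
    """Return True if both tags name the same language, or if one names
    a macrolanguage that encompasses the other."""
    req = primary_subtag(request)
    con = primary_subtag(content)
    if req == con:
        return True
    # Macrolanguage requested, encompassed language found (or the reverse).
    return MACROLANGUAGE_OF.get(con) == req or MACROLANGUAGE_OF.get(req) == con
```

Under this sketch a request for "zh" matches content tagged "cmn" or "yue", and a request for "yue" matches content tagged "zh", but "cmn" does not match "yue": two languages under the same macrolanguage are not treated as interchangeable. Whether the reverse direction should match at all is a policy choice that this sketch simply assumes.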

(Btw, the possibility of new macrolanguages encompassing pre-registered languages forms one of the arguments against extlang: you end up needing to support approaches to matching that would be needed without extlangs anyway.)

Now, in the kinds of scenarios you have in mind, it's not clear to me whether macrolanguage is the appropriate scope for the new entries that might be requested. Keep in mind that a macrolanguage has a measure of ambiguity: it does not denote one specific variety, but rather any one of several distinct varieties. For instance, either Mandarin or Cantonese content can be labeled "Chinese", and content labeled "Chinese" could well be either of those (or any of the other Chinese languages).

If Kwanyama and Ndonga are distinct languages, and if "Oshiwambo" can refer to either and is in some contexts used as though it were a single language rather than a collection, then it can be considered a macrolanguage. But if Oshiwambo refers to a distinct variety that is not the same as either Kwanyama or Ndonga but is used, or is coming into use, by speakers in both the Kwanyama and Ndonga communities, then it might make more sense to consider Oshiwambo an individual language.

(I don't know which situation applies to Oshiwambo, and I'm not intending to steer discussion in that direction -- which would be off topic.)

There are cases in which language differentiation has occurred so that, over some extended period of time, a single language community has descendants speaking multiple distinct languages, but in which a unifying process has then occurred (whether natural or engineered), with a single language emerging that is used by speakers of the various languages. An example of an engineered case is Filipino; I gather that N'Ko is something of an example of natural evolution of this type. These are individual languages, not macrolanguages. E.g., "Filipino" would not be an appropriate label for Cebuano content.

