RE: [Ltru] Re: extlang

"Don Osborn" <> Tue, 20 March 2007 12:34 UTC

From: "Don Osborn" <>
To: "'GerardM'" <>
Subject: RE: [Ltru] Re: extlang
Date: Tue, 20 Mar 2007 07:54:08 -0400

Hi Gerard,

But from what I have heard from some people working on Arabic localization, “no one” writes “spoken Egyptian Arabic” (arz), in Egypt or elsewhere. Maybe that’s an overstatement (I don’t know), but it reflects a certain reality. This is not to suggest that ar-EG is necessarily the same as arz, but rather that they describe different realities (perhaps one written, one spoken, if indeed it is that simple). In such a circumstance, what might seem to be the “LCD” unit of language definition (and coding) may actually not be, or at least may not always be, the single best choice.


In other contexts, such as software localization (and hence locale), one might seek an appropriate level of aggregation rather than the smallest accurate language description.


So it seems that the context, in combination with the specific realities of a particular macrolanguage-sublanguage relationship, determines the choice.


All the best.





From: GerardM [] 
Sent: Tuesday, March 20, 2007 6:14 AM
To: Don Osborn
Cc: John Cowan; Addison Phillips; Frank Ellermann;
Subject: Re: [Ltru] Re: extlang


Reading Don's message, I see yet another great argument for being sensible and allowing the use of tags that are the single lowest common denominator for a language. Don explains quite nicely the ambiguity involved.

When you have a choice between ar-EG and arz and think you can say they are the same, how would you then code spoken Egyptian Arabic as used by the immigrant community in the USA? I would say that ar-arz-US and arz-US are appropriate. The notion that ar-EG is necessarily the same as arz is imho wrong. Coding it like ar-EG-US is probably not acceptable.


On 3/20/07, Don Osborn <> wrote:

Reply below...

> -----Original Message-----
> From: John Cowan []
> Addison Phillips scripsit:
> > The idea of macro-languages is that they are not, themselves, languages.
> > Rather, they are groupings of languages that can be usefully referred to
> > collectively. That is, strictly speaking, "Chinese" (meaning "zh") isn't
> > a language---and neither is "Arabic" (by which I mean the code "ar").
>
> Collections are one thing and macrolanguages are another.  A macrolanguage
> is a grouping of languages that can usefully be treated as a single
> language in certain contexts.

Thanks, John. The question I have is: what are the contexts? IOW, are the
various macrolanguage-sublanguage relationships, which seem to represent
varying realities in terms of usage, categorizable into certain types? If so,
might that help discussions?

"Macrolanguage" is an interesting concept, but whatever its original intent
(and whatever linguists in general think of it), it seems bound to accrue
other meanings and applications. This is not necessarily bad, though it might
complicate life for coding.

Part of the issue with Arabic may be that one standard form is used pretty
widely while the spoken forms vary. I've heard from localisers of Arabic
that they only really use one Arabic. In the case of Arabic in Egypt, does 
anyone write arz? In literature, theater, journalism, or...? I don't know,
but I suspect that the existence of the code for "Egyptian spoken Arabic"
(which, by the way, seems to be only one of the colloquial forms in the
country) will prompt some to choose it for what might more appropriately be
ar-EG or arb-EG. The part that I don't know about Arabic far exceeds the
little I think I know, so best to stop here.

Fula (ff) is another complex case. I won't get into it
now, but a couple of things come to mind: publications in the 1970s or so
that treated the (macro)language as a whole, and "pan-Fula" dictionaries
(with notes re different usages among the varieties of Fula). I have been
talking up the idea of a pan-Fula monolingual dictionary online, and hope to
discuss that some more in a few days (at ACAL/ALTA in Gainesville). It
really seems like you can shift perspective and see such tongues from the
point of view of either their functional unity or their functional differences.

<way out in left field>All this has me thinking that a macrolanguage is
maybe something like a Platonic ideal but with a functional aspect. Thus 
macrolanguages are actually all over the place, but we don't see them as
such since each so often coincides with a single language. So, for instance,
Hausa is a macrolanguage with only one language (and some dialect 
differences from west to east). On the other hand, something like Fula is a
macrolanguage with several languages that have high degrees of
interintelligibility. IOW, language has an ideological component reflected 
in "macrolanguage," which has some practical meaning/use. Take the case of
Runyakitara in western Uganda, which is a conscious effort to provide a
standard for a group of 4 very close languages that also have a common 
historical identity - is that creating a macrolanguage or identifying
something that was always there? (conversely, none of the 4 tongues under
that were themselves cases of "macrolanguages" in the sense I'm exploring 
here).</way out in left field>

Not sure that any of this helps the coding questions at hand but maybe it
gives another useful perspective on possible use of "macrolanguage."

All the best. 


Ltru mailing list 

