RE: [Ltru] Extended language tags

"Don Osborn" <> Fri, 05 October 2007 01:40 UTC

From: "Don Osborn" <>
To: "'Andrew Cunningham'" <>, "'Shawn Steele'" <>
Subject: RE: [Ltru] Extended language tags
Date: Thu, 4 Oct 2007 21:39:44 -0400
Message-ID: <002101c806f0$9c7dffc0$d579ff40$@net>

Just to confirm that what Andrew is saying applies to a number of languages in Africa. I've raised similar concerns in the past with respect to (macro)languages like Fula and Manding (though the latter might better be called a cluster), although I believe I addressed them to the ietf-languages list. These are just a few examples.

I haven't been following the recent discussions here as closely as I should have, but I do believe that if the system is biased toward specificity, when specificity is not always the ideal approach, it will inevitably encounter calls for flexibility later. At this point I can't say how extlang responds to this, so I can only offer these general comments on the context of some languages.

This is not to say that specificity in language tags is wrong, but that specific tags describe only one aspect of a more complex reality. High degrees of interintelligibility, speakers' sense of the essential unity of a language (based on their experience and the common wisdom in the culture), and the hopes and potential for standardized versions all probably mean that less specific tags will become more important as these languages emerge more fully into digital culture and the digital economy.

The language situations are dynamic, and the medium in which we are trying to describe and categorize languages is adding a new dimension to the development of lesser-resourced languages and their variants.

Hope this makes some sense.

Don Osborn

> -----Original Message-----
> From: Andrew Cunningham []
> Sent: Thursday, October 04, 2007 9:02 PM
> To: Shawn Steele
> Cc:
> Subject: Re: [Ltru] Extended language tags
> Throwing in my two cents' worth:
> One aspect of language tagging is the preference for the tags to be as
> specific as possible. In some cases it is desirable to be less
> specific.
> An example I mentioned on the teleconference was the case of Dinka. In
> ISO-639-2 this is represented by the language code "din",
> while ISO-639-3 has the following language codes:
> dib  South Central Dinka
> dik  Southwestern Dinka
> dip  Northeastern Dinka
> diw  Northwestern Dinka
> dks  Southeastern Dinka
> If I were applying a language tag to a Rek grammar, then I'd use "dik";
> for a collection of Ciec folktales I'd use "dib". For a collection of
> Bor proverbs I'd use "dks".
> To describe the literacy materials and classroom materials being
> developed in Australia by the Dinka community, I'd use "din". Within
> the diaspora and in Australia specifically the literacy and language
> teachers, translators and interpreters are discussing a standardized
> approach to written Dinka.
> The original SPLA/M education policies highlighted Rek as a standard
> for written Dinka. What seems to be occurring is an amalgam based on
> Rek, but including aspects and vocab from other Dinka dialects. There
> will be a locally hosted conference next year to thrash out some of
> the issues.
> In this context "din" would be the most appropriate way of tagging
> the new educational material, while keeping the existing five
> ISO-639-3 language tags to more accurately describe information and
> data written in one of the 20-odd specific dialects.
> From the perspective of the user community, an extlang approach would
> make more sense, i.e. Rek labeled as "din-dik" makes more sense than
> "dik". To the community Rek is a Dinka language, Dinka Rek, not a
> separate language called Rek.
> In this sense extlang reflects the community's understanding of their
> language.
> This is just an observation; I'm neither for nor against extlang.
> Although, from the perspective of web development and how CSS and web
> browsers handle language pseudo-class selectors and attribute selectors,
> I'd suggest that the extlang approach simplifies things for those rare
> individuals amongst us who use these selectors.
> Andrew
> --
> Andrew Cunningham
> State Library of Victoria, Australia
> _______________________________________________
> Ltru mailing list
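
For readers who want to see the selector point concretely, here is a minimal sketch in Python of the hyphen-delimited prefix matching that CSS :lang() and the [lang|="..."] attribute selector rely on. The function names prefix_matches and extlang_style are hypothetical and exist only for this illustration. The "din-dik" form is the extlang-style tag discussed in the quoted message, not a claim about what any registry actually contains.

```python
# ISO 639-3 codes for the Dinka varieties listed in the quoted message.
DINKA_639_3 = {
    "dib": "South Central Dinka",
    "dik": "Southwestern Dinka",
    "dip": "Northeastern Dinka",
    "diw": "Northwestern Dinka",
    "dks": "Southeastern Dinka",
}

# ISO 639-2 code covering Dinka as a whole, as given in the message.
DINKA_MACRO = "din"


def prefix_matches(tag: str, language_range: str) -> bool:
    """Case-insensitive match on whole hyphen-delimited subtags.

    True when `tag` equals `language_range` or begins with
    `language_range` followed by a hyphen.
    """
    tag = tag.lower()
    language_range = language_range.lower()
    return tag == language_range or tag.startswith(language_range + "-")


def extlang_style(code: str) -> str:
    """Build the hypothetical extlang-style tag, e.g. 'dik' -> 'din-dik'."""
    if code not in DINKA_639_3:
        raise ValueError(f"not one of the listed Dinka codes: {code}")
    return f"{DINKA_MACRO}-{code}"


if __name__ == "__main__":
    for code, name in DINKA_639_3.items():
        bare = prefix_matches(code, DINKA_MACRO)
        ext = prefix_matches(extlang_style(code), DINKA_MACRO)
        print(f"{name}: '{code}' matches 'din': {bare}; "
              f"'{extlang_style(code)}' matches 'din': {ext}")
```

Under this matching rule, a single selector for "din" would pick up content tagged "din" or "din-dik", whereas content tagged with the bare code "dik" would need its own selector, and likewise for each of the other four codes.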
