Re: [Ltru] How to handle macrolanguage when no code?

Peter Constable <petercon@microsoft.com> Thu, 09 April 2009 03:10 UTC

From: Peter Constable <petercon@microsoft.com>
To: "Phillips, Addison" <addison@amazon.com>, Don Osborn <dzo@bisharat.net>, 'LTRU Working Group' <ltru@ietf.org>, 'IETF Languages Discussion' <ietf-languages@iana.org>
Date: Wed, 08 Apr 2009 20:11:39 -0700

If it is content in one linguistic variety and crafted to serve two audiences deemed in 639-3 to be distinct languages, then that strikes me as a potential macrolanguage scenario.

One key question is how narrow a scope of content is needed and how much deliberate effort is needed to craft something like that. For instance, a document consisting of “Papa!” can serve many different audiences, but that is solely because the scope of content is so constrained, and for that reason the bar is not met for a macrolanguage. But if it’s easy for a content provider to come up with content that serves both, then that’s interesting.

Another key question is why that content is functional for both audiences. Is it because it is expressed in a variety that can really be considered common, or is it because it’s actually in language A and 90% of speakers in language B are functionally bilingual in A? Does the common-identity label reflect actual linguistic commonality, or is it a logistical tool used in the repository to reflect merely dual tasking?


Some thoughts. Discuss it with the 639-3 RA.


Peter

From: ltru-bounces@ietf.org [mailto:ltru-bounces@ietf.org] On Behalf Of Phillips, Addison
Sent: Wednesday, April 08, 2009 5:53 PM
To: Don Osborn; 'LTRU Working Group'; 'IETF Languages Discussion'
Subject: Re: [Ltru] How to handle macrolanguage when no code?

HTML certainly allows you to declare that some content is applicable to more than one language audience. See:

   http://www.w3.org/TR/i18n-html-tech-lang/#ri20040728.121358444

Otherwise, John Cowan’s advice seems appropriate… ISO 639-3 or ISO 639-5 would be your next stop. Note that macrolanguages are sometimes problematical, so you might also consider a collection code instead.
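
As a rough illustration of that distinction: the HTML lang attribute takes a single language tag describing the text itself, while the HTTP Content-Language header (or its meta equivalent) may carry a comma-separated list of tags describing the intended audience. The sketch below is only that, a sketch. The function names are hypothetical, and the pattern is a simplified plausibility check, not a full BCP 47 validator.

```python
import re

# Simplified shape check: a 2-3 letter primary subtag plus optional subtags.
# Illustrative only; this is NOT a complete BCP 47 validator.
_TAG = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z0-9]{1,8})*$")


def content_language_header(tags):
    """Build a Content-Language value listing every intended audience."""
    for tag in tags:
        if not _TAG.match(tag):
            raise ValueError(f"not a plausible language tag: {tag!r}")
    return ", ".join(tags)


def html_page(text_lang, audience_tags, body):
    """Return (headers, html): one tag for the text, a list for the audience."""
    # The lang attribute holds exactly one tag, so a list such as
    # "rw, rn" is rejected here.
    if not _TAG.match(text_lang):
        raise ValueError(f"lang takes a single language tag, got {text_lang!r}")
    headers = {"Content-Language": content_language_header(audience_tags)}
    html = f'<html lang="{text_lang}"><body>{body}</body></html>'
    return headers, html


# Hypothetical page: text labelled Kinyarwanda, audience Kinyarwanda and Kirundi.
headers, html = html_page("rw", ["rw", "rn"], "<p>...</p>")
print(headers["Content-Language"])
print(html)
```

Under these assumptions the audience list travels in the header while lang stays a single tag, so something like lang="rw, rn" would be refused by the check.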

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.

From: ltru-bounces@ietf.org [mailto:ltru-bounces@ietf.org] On Behalf Of Don Osborn
Sent: Wednesday, April 08, 2009 4:40 PM
To: 'LTRU Working Group'; 'IETF Languages Discussion'
Subject: [Ltru] How to handle macrolanguage when no code?

In looking at the BBC website's offerings in African languages, one notes that they have grouped Kinyarwanda and Kirundi together under http://www.bbc.co.uk/greatlakes/. This makes sense from a linguistic point of view since as I understand it, the two languages are almost the same. When looking at the view (page) source, one notes that they use lang="rw" (for Kinyarwanda). It may be that the pages I checked are properly Kinyarwanda and an expert would know that they are not Kirundi (rn), but it is in any event true that there is no code element to cover both languages.

I'm curious if there is any other recommended way to handle such a situation where web content may be deliberately and easily designed to cover more than one language as defined by ISO 639 when there is not currently any macrolanguage code for them. Could one for example define a whole page as having two languages? E.g., something like lang="rw, rn"?

Thanks in advance for any feedback.

Don