Re: [Ltru] How to handle macrolanguage when no code?

"Phillips, Addison" <addison@amazon.com> Thu, 09 April 2009 00:52 UTC


HTML certainly allows you to declare that some content is applicable to more than one language audience. See:

   http://www.w3.org/TR/i18n-html-tech-lang/#ri20040728.121358444
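As a rough illustration of the distinction that document draws (a sketch, not quoted from it): the lang attribute carries a single tag for the text-processing language, while audience metadata such as the HTTP Content-Language header, or its meta equivalent, can list several tags separated by commas. The minimal Python sketch below renders both for the Kinyarwanda/Kirundi case; the function names content_language_value and page_head are made up for this example and do not come from any library.

```python
from html import escape


def content_language_value(audience_langs):
    """Join language tags into a comma-separated Content-Language value."""
    # Drop empty entries and surrounding whitespace; no BCP 47 validation is done here.
    tags = [tag.strip() for tag in audience_langs if tag.strip()]
    if not tags:
        raise ValueError("at least one language tag is required")
    return ", ".join(tags)


def page_head(text_lang, audience_langs):
    """Render the opening of an HTML page (hypothetical helper).

    text_lang: the single text-processing language for the lang attribute.
    audience_langs: every language audience the content is meant to serve.
    """
    audience = content_language_value(audience_langs)
    return (
        f'<html lang="{escape(text_lang, quote=True)}">\n'
        "<head>\n"
        f'<meta http-equiv="Content-Language" content="{escape(audience, quote=True)}">\n'
        "</head>"
    )


# One text-processing language (rw), two audience languages (rw and rn).
print(page_head("rw", ["rw", "rn"]))
```

The lang attribute itself takes only one language tag, so lang="rw, rn" is not a valid value. Later HTML specifications discourage the meta form, so the HTTP Content-Language header is probably the safer place for a multi-language audience declaration.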

Otherwise, John Cowan’s advice seems appropriate: ISO 639-3 (which defines macrolanguage codes) or ISO 639-5 (which defines codes for language families and groups) would be your next stop. Note that macrolanguages are sometimes problematic, so you might also consider a collection code instead.

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.

From: ltru-bounces@ietf.org [mailto:ltru-bounces@ietf.org] On Behalf Of Don Osborn
Sent: Wednesday, April 08, 2009 4:40 PM
To: 'LTRU Working Group'; 'IETF Languages Discussion'
Subject: [Ltru] How to handle macrolanguage when no code?

In looking at the BBC website's offerings in African languages, one notes that they have grouped Kinyarwanda and Kirundi together under <http://www.bbc.co.uk/greatlakes/>. This makes sense from a linguistic point of view since, as I understand it, the two languages are almost the same. Looking at the page source, one notes that they use lang="rw" (for Kinyarwanda). It may be that the pages I checked are properly Kinyarwanda and an expert would know that they are not Kirundi (rn), but it is in any event true that there is no code element to cover both languages.

I'm curious if there is any other recommended way to handle such a situation where web content may be deliberately and easily designed to cover more than one language as defined by ISO 639 when there is not currently any macrolanguage code for them. Could one for example define a whole page as having two languages? E.g., something like lang="rw, rn"?

Thanks in advance for any feedback.

Don