Re: [Ltru] Proposed resolution for Issue 13 (language tags)

"Phillips, Addison" <> Tue, 15 April 2008 17:26 UTC

From: "Phillips, Addison" <>
To: Martin Duerst <>, Julian Reschke <>, HTTP Working Group <>
Date: Tue, 15 Apr 2008 10:26:32 -0700
Cc: LTRU Working Group <>


I have a number of thoughts on the proposed changes, which appear below my signature.

I would suggest that HTTPbis wait in order to reference 4646bis, which is in WG last call, if at all possible. I mention this for two reasons:

1. It would be better to reference the update than the soon-to-be-obsolete version, even though the differences are minor.

2. The new version includes an ABNF production of value to HTTPbis (obs-language).

Best Regards,


Addison Phillips
Globalization Architect -- Lab126 (Amazon)
Chair -- W3C Internationalization Core WG

Internationalization is not a feature.
It is an architecture.

> Martin Dürst wrote comments on Julian's email which said in part:
> >
> >(see also <>).
> >
> >

> The above text gives the impression that there is a separate
> concept of a "HTTP language tag". Why not just say something
> like "HTTP uses language tags as defined in ...".

I agree with Martin here. However, it may be useful to reference the RFC 3066 Language-Tag production in Section 2.2.9, for compatibility with existing RFC 2616 implementations, and to specify "well-formed" conformance.

> So I strongly suggest you reference BCP 47
> rather than a specific RFC.


> >Section 3.5., paragraph 3:
> >OLD:
> >
> >      language-tag  = primary-tag *( "-" subtag )
> >      primary-tag   = 1*8ALPHA
> >      subtag        = 1*8ALPHA
> >
> >NEW:
> >
> >      language-tag  = <Language-Tag, defined in [RFC4646], Section 2.1>
>
> See above.

It may be better to reference Language-Tag as defined in 2.2.9 for compatibility. While it would be good to adopt the modern language tag ABNF, that would suggest that receivers reject tags that were well-formed but no longer are.
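
To illustrate the compatibility concern, here is a minimal sketch of a "well-formed" check under the two older grammars, assuming Python's standard re module. The names RFC2616_TAG, RFC3066_TAG, and is_well_formed are illustrative only and appear in neither RFC. The first pattern follows the OLD grammar quoted above; the second follows the RFC 3066 grammar, which also permits digits in subtags after the first.

```python
import re

# OLD grammar quoted above (RFC 2616): every subtag is 1 to 8 letters.
RFC2616_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")

# RFC 3066 grammar: the primary subtag is 1 to 8 letters,
# later subtags are 1 to 8 letters or digits.
RFC3066_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*")


def is_well_formed(tag, pattern=RFC3066_TAG):
    """Return True if the whole tag matches the given grammar.

    This checks syntax only; it says nothing about whether the
    subtags are actually registered (that is, whether the tag is valid).
    """
    return pattern.fullmatch(tag) is not None


if __name__ == "__main__":
    for tag in ("en-US", "es-419", "i-cherokee", "en-"):
        print(tag, is_well_formed(tag, RFC2616_TAG), is_well_formed(tag))
```

A receiver that accepts anything matching the RFC 3066 pattern keeps accepting every tag that older senders could legitimately produce, whether or not those tags are valid under the current registry.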

> This has to be reworded. en-US is a tag allowed based on the current
> subtag registrations. I'm not totally sure about en-cockney and
> i-cherokee.

"en-cockney" is not valid, nor is "i-cherokee". "i-cherokee" can NEVER be registered. The "cockney" subtag could be registered. However, there is no need to use artificial examples (as there was when RFC 2616 was written). It would be better to use examples from RFC 4646/4646bis.

Note: Cherokee is in the registry with the subtag 'chr'.

> For 14.4, Accept-Language, please note that BCP 47 (RFC 4647 currently)
> also defines a language-range, probably the same as you have, so you
> should reference that. There are also various variants for matching
> predefined; you should be able to choose the one that fits your needs
> best and then only have to define a few details.

Actually, Accept-Language is an example of a "language priority list" in RFC 4647. However, RFC 4647 doesn't define the exact syntax for a language priority list. The "language-range" production *is* defined by RFC 4647, as a "Basic Language Range".
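
As a rough sketch of how an Accept-Language value becomes a language priority list, the following assumes the RFC 2616 header syntax (comma-separated language ranges, each with an optional ";q=" weight). The function name parse_accept_language is hypothetical, and dropping ranges with q=0 or an unparsable weight is a simplifying assumption of this sketch, not something either RFC requires.

```python
def parse_accept_language(header):
    """Turn an Accept-Language value into an ordered list of language ranges.

    Ranges are sorted by descending q value; ranges with equal weight
    keep their order of appearance in the header.
    """
    items = []
    for position, element in enumerate(header.split(",")):
        element = element.strip()
        if not element:
            continue
        language_range, _, params = element.partition(";")
        q = 1.0  # a range without an explicit weight defaults to q=1
        name, _, value = params.strip().partition("=")
        if name.strip().lower() == "q":
            try:
                q = float(value)
            except ValueError:
                continue  # assumption: ignore elements with a malformed weight
        if q > 0:  # assumption: q=0 ("not acceptable") is simply dropped here
            items.append((-q, position, language_range.strip()))
    return [language_range for _, _, language_range in sorted(items)]


if __name__ == "__main__":
    print(parse_accept_language("da, en-gb;q=0.8, en;q=0.7"))
```

The resulting list is what RFC 4647 calls a language priority list; which matching algorithm is then applied to it is a separate decision.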

The text about matching of language ranges and tags in RFC 2616 is the same as the Basic Filtering algorithm--in fact, the latter was designed to be identical. However, I think that choosing this algorithm explicitly would be a Bad Thing. Sometimes the Lookup algorithm is the right choice for an application (as when selecting user interface language for an application), while at other times one of the Filtering algorithms makes more sense (as when selecting search results).
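
The difference between the two families of algorithm can be sketched as follows. This is a simplified reading of RFC 4647 Basic Filtering and Lookup, not a complete implementation; the function names basic_filter and lookup and the default parameter are illustrative assumptions.

```python
def basic_filter(language_range, tags):
    """Basic Filtering: return every tag that equals the range or
    starts with the range followed by "-" (case-insensitive)."""
    wanted = language_range.lower()
    if wanted == "*":
        return list(tags)  # the wildcard range matches every tag
    return [
        tag for tag in tags
        if tag.lower() == wanted or tag.lower().startswith(wanted + "-")
    ]


def lookup(priority_list, tags, default=None):
    """Lookup: return the single best tag, truncating each range
    from the end until it equals one of the available tags."""
    available = {tag.lower(): tag for tag in tags}
    for language_range in priority_list:
        if language_range == "*":
            continue  # the wildcard range is skipped during lookup
        subtags = language_range.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in available:
                return available[candidate]
            subtags.pop()
            # A single-character subtag (such as "x") is removed together
            # with the subtag that followed it.
            while subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default


if __name__ == "__main__":
    tags = ["de", "de-CH", "de-DE-1996", "en", "en-US"]
    print(basic_filter("de", tags))
    print(lookup(["de-CH-1996", "en"], tags))
```

Filtering returns a set of matching tags (possibly empty), which suits cases like search results; lookup returns exactly one tag or a default, which suits cases like choosing a user interface language.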

Note that the text in RFC 2616 should have been superseded by RFC 3282. Ultimately, I think that 2616bis should define Accept-Language using the existing ABNF but adopting the terminology in RFC 4647. It should then guide users to BCP 47 (RFC 4647) regarding the selection of matching algorithms.
