Re: [Ltru] matching: Language Range Lists

Martin Duerst <duerst@it.aoyama.ac.jp> Thu, 29 September 2005 07:26 UTC

Date: Thu, 29 Sep 2005 16:23:33 +0900
To: Mark Davis <mark.davis@icu-project.org>
From: Martin Duerst <duerst@it.aoyama.ac.jp>
Subject: Re: [Ltru] matching: Language Range Lists
In-Reply-To: <433AEF9C.9070405@icu-project.org>
References: <FA13712B13469646A618BC95F7E1BA8F0E2763@alvmbxw01.prod.quest.corp> <433AE893.4030304@icu-project.org> <433AEF9C.9070405@icu-project.org>
Cc: ltru@ietf.org

As chair: please use 'matching:' in the subject of all mails
concerning the matching draft. Thanks!
[I have done so in this subject, and have also changed the subject
to be closer to the actual topic.]


At 04:31 05/09/29, Mark Davis wrote:
>The main issue I have with the document as it stands is that it neglects a 
>very important case. The general case of language matching involves having 
>a *list* of language ranges. You take a list like "en-US en fr it" and use 
>it for either matching or lookup. So I would suggest adding the following 
>text. (It probably needs fleshing out a bit, but this is a start.)

Without my chair hat:

This is already defined, in quite a bit more detail and
with some slight differences (i.e. allowing a q value) in
the HTTP 1.1 spec. See e.g.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.4

Do we need anything more?
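
For illustration, here is a minimal sketch of how such a list with
q values might be read. The function name is hypothetical, the
parsing is simplified (no handling of malformed input), and keeping
header order for equal q values is a choice of this sketch rather
than something the HTTP spec requires. The sample header is the one
used in section 14.4 of the HTTP 1.1 spec.

```python
def parse_accept_language(header):
    """Return (language_range, q) pairs sorted by descending q.

    Hypothetical helper: a simplified reading of an Accept-Language
    style header, not a complete HTTP parser.
    """
    ranges = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        parts = [p.strip() for p in item.split(";")]
        q = 1.0  # a range without an explicit q value defaults to q=1
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        ranges.append((parts[0], q))
    # sorted() is stable, so ranges with equal q keep their header order
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)


print(parse_accept_language("da, en-gb;q=0.8, en;q=0.7"))
```

The q value is what a plain space-delimited list lacks: the list can
only express order, not relative weight.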

Please also note that, according to the HTTP rules, zh-Hant-CN
in the 'language range list' does NOT match zh or zh-Hant
(whereas the other way round, a document tagged zh-Hant-CN IS
matched by a request for zh or zh-Hant).
This was done to avoid false positives. In the example below,
zh is in third place, but if the user explicitly specifies
zh-Hant-CN, this probably means (or should be reserved to
mean) that she doesn't read simplified Chinese. In that case,
it seems that en-US or fr should have higher priority,
because the user actually understands those.
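
The asymmetry can be sketched as a prefix test. This is only an
illustration of the HTTP rule as I read it (a range matches a tag if
it equals the tag, or equals a prefix of the tag that is followed by
'-'); the function name is made up for this example.

```python
def range_matches_tag(language_range, tag):
    """Prefix matching in the style of the HTTP 1.1 rule.

    Hypothetical helper: the range matches if it equals the tag or is
    a prefix of the tag ending at a '-' boundary. Comparison is
    case-insensitive.
    """
    r = language_range.lower()
    t = tag.lower()
    if r == "*":
        return True  # the wildcard range matches every tag
    return t == r or t.startswith(r + "-")


# A request for zh is satisfied by a zh-Hant-CN document ...
print(range_matches_tag("zh", "zh-Hant-CN"))
# ... but a request for zh-Hant-CN is not satisfied by a zh document.
print(range_matches_tag("zh-Hant-CN", "zh"))
```

So a more specific range never falls back to a less specific
document under this rule.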

Regards,    Martin.
>2.3 Language Range Lists
>
>A language range list is a space-delimited list of zero or more language 
>ranges. An extended language range list contains extended language ranges; 
>a basic language range list is limited to containing basic language 
>ranges. In matching, anything that matches any element of the list 
>according to the above options is included. (With distance metrics, the 
>shortest distance from any element in the list is the one used for the 
>overall distance).
>
>In lookup, the list is examined from first to last for a match, according 
>to the above options. For example, starting with the list range 
>"zh-Hant-CN en-US fr", and using Lookup with truncation as in 2.1.1, the 
>lookup would progressively search for content as shown below:
>
>
>List Range to match: zh-Hant-CN en-US fr
>1. zh-Hant-CN
>2. zh-Hant
>3. zh
>4. en-US
>5. en
>6. fr
>7. (default content or the empty tag)
>
>
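
To make the proposed search order concrete, here is a rough sketch
of lookup with truncation over a space-delimited list. The function
names are hypothetical, and the sketch simply drops the last subtag
at each step, ignoring any special cases (e.g. private-use subtags).

```python
def lookup_candidates(range_list):
    """List the tags tried, in order, for a space-delimited range list."""
    candidates = []
    for language_range in range_list.split():
        subtags = language_range.split("-")
        while subtags:
            candidates.append("-".join(subtags))
            subtags.pop()  # truncate: drop the last subtag and retry
    return candidates


def lookup(range_list, available, default=None):
    """Return the first available tag hit by the candidate sequence."""
    by_lowercase = {tag.lower(): tag for tag in available}
    for candidate in lookup_candidates(range_list):
        if candidate.lower() in by_lowercase:
            return by_lowercase[candidate.lower()]
    return default  # step 7: default content


print(lookup_candidates("zh-Hant-CN en-US fr"))
# ['zh-Hant-CN', 'zh-Hant', 'zh', 'en-US', 'en', 'fr']
print(lookup("zh-Hant-CN en-US fr", ["en", "zh"]))
# zh
```

With content available in both en and zh, this order picks zh ahead
of en, which is the case the HTTP rule is meant to avoid.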
>
>2.3 Meaning of Language Tags and Ranges [RENUMBER to 2.4...]
>
>_______________________________________________
>Ltru mailing list
>Ltru@ietf.org
>https://www1.ietf.org/mailman/listinfo/ltru

