Re: [Ltru] my technical position on extlang
John Cowan <cowan@ccil.org> Fri, 23 May 2008 04:43 UTC
Date: Fri, 23 May 2008 00:43:05 -0400
To: Mark Davis <mark.davis@icu-project.org>
From: John Cowan <cowan@ccil.org>
Cc: LTRU Working Group <ltru@ietf.org>
Subject: Re: [Ltru] my technical position on extlang

Mark Davis scripsit:

> For filtering, extlang offers no particular advantage. Let's look at queries
> of "ar-ary" Moroccan Arabic vs "ary". In either case I need a way to match
> all and only Moroccan Arabic; I must not fallback to "ar". If I fallback to
> include all Arabic, the actual content that is in Moroccan Arabic would be
> completely and utterly swamped by Standard Arabic. So for filtering, the
> extlang model just gives us a more complicated syntax, with no benefit.

I agree with most of your document, which is almost entirely about
lookup, but I can't agree with your characterization of filtering.
It would apply to filtering applied to a search engine, but a search
engine is a very atypical case of a Web server -- its content is vast
and not controlled by the server's owner except in the broadest sense.

Filtering is the algorithm applied to language-tag negotiation by the
HTTP RFC (2616); indeed, 4647's simple filtering algorithm is taken
directly from 2616.  So as an IETF group we must take filtering
seriously, and it is my contention that using extlang syntax is a big
win for filtering.

Remember that the operational difference between filtering and lookup
is that in filtering, a range with *fewer* subtags can match a resource
tagged with *more* subtags; whereas in lookup, a range with *more*
subtags can match a resource tagged with *fewer* subtags if no longer
match is available.

Suppose we are downloading Arabic-language audio, and we wish to
receive Egyptian Arabic by choice; failing that, Standard Arabic; and
failing that, any available Arabic variety except Shuwa Arabic, which
we do not understand at all.
We can inform the server of our wishes using the following header:

	Accept-Language: ar-arz, ar-arb, ar, ar-shw;q=0

A 2616-conformant server will apply the filtering algorithm and return
anything beginning with "ar-arz"; if none, then anything beginning with
"ar-arb"; if none, then anything beginning with "ar" except something
beginning with "ar-shw".  (Some web servers ignore "q=0", making the
exclusion of Shuwa not work, but that's just a bug; RFC 2616 section 3.9
is clear that ";q=0" means content so tagged is not acceptable to the
client.)

What is more (and this is key) if the server has more than one match
for ar-arb, say ar-arb-koranic and ar-arb-modern, it can return either,
or return a disambiguation page allowing the user to choose.  This is
entirely different from lookup, where at most one resource must be
returned, and we remove subtags until we get an exact match.

Now what if extlang syntax is not used?  Then we must say:

	Accept-Language: arz, arb, ar, shw;q=0

And in order for this to mean what it is intended to mean, the server
must extend RFC 2616 and treat filtering for "ar" as magically matching
resources that begin with "arq", "aao", "bbz", ..., "aeb", "auz" (see
http://www.sil.org/iso639-3/documentation.asp?id=ara ).  Either that,
or the user must specify all these tags in Accept-Language.  What's
more, if the list grows, the server or the client must be updated.
Whereas if all Arabics begin with "ar", there is no such issue.

Obviously extlang syntax won't solve *all* problems with filtering,
but it will help a great deal with some important cases.  That's why
I think the issue needs to be reconsidered.

--
If you understand,                      John Cowan
   things are just as they are;         http://www.ccil.org/~cowan
if you do not understand,               cowan@ccil.org
   things are just as they are.
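
The filtering behaviour argued for above can be sketched in a few lines
of Python.  This is only an illustration under stated assumptions, not a
reference implementation: the function names (parse_accept_language,
matches, quality, negotiate) are made up for this sketch; ranges are
tried in header order because that is how the message reads the header,
whereas RFC 2616 itself only ranks by q-value; and the resource tags are
the hypothetical ones used in the message.

```python
def parse_accept_language(header):
    """Return (range, q) pairs in header order; q defaults to 1.0."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        prefs.append((parts[0].lower(), q))
    return prefs


def matches(lang_range, tag):
    """Prefix match on whole subtags: "ar" matches "ar-arz", not "arz"."""
    lang_range, tag = lang_range.lower(), tag.lower()
    return (lang_range == "*" or tag == lang_range
            or tag.startswith(lang_range + "-"))


def quality(tag, prefs):
    """q-value of the longest range matching the tag (0.0 if none match)."""
    matching = [(len(r), q) for r, q in prefs if matches(r, tag)]
    return max(matching)[1] if matching else 0.0


def negotiate(header, available):
    """Return every acceptable tag matching the first range that has any.

    Assumption: ranges are tried in header order, as the message describes.
    """
    prefs = parse_accept_language(header)
    for lang_range, q in prefs:
        if q == 0:
            continue  # a q=0 range only excludes; it never selects
        hits = [t for t in available
                if matches(lang_range, t) and quality(t, prefs) > 0]
        if hits:
            return hits
    return []


with_extlang = "ar-arz, ar-arb, ar, ar-shw;q=0"
without_extlang = "arz, arb, ar, shw;q=0"

# Several matches for ar-arb: the server may pick one or offer a choice.
print(negotiate(with_extlang, ["ar-shw", "ar-arb-koranic", "ar-arb-modern"]))
# -> ['ar-arb-koranic', 'ar-arb-modern']

# Falls through to "ar"; Shuwa is excluded by the q=0 range.
print(negotiate(with_extlang, ["ar-shw", "ar-ary"]))
# -> ['ar-ary']

# Without extlang, plain prefix matching on "ar" finds nothing.
print(negotiate(without_extlang, ["ary", "aeb"]))
# -> []
```

The last call shows the point at issue: with bare tags such as "ary",
a server doing unmodified prefix matching has no way to know they are
Arabic unless it carries a table of macrolanguage members.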