Re: [Ltru] Issue 181, was: Issue 113 (language tag matching (Accept-Language) vs RFC4647), was: Proposed resolution for Issue 13 (language tags)

Julian Reschke <julian.reschke@gmx.de> Mon, 02 November 2009 18:20 UTC

Message-ID: <4AEF22E9.5060200@gmx.de>
Date: Mon, 02 Nov 2009 19:20:25 +0100
From: Julian Reschke <julian.reschke@gmx.de>
To: "Phillips, Addison" <addison@amazon.com>
References: <48037FF9.9030103@gmx.de> <48049274.3090501@gmx.de> <4A61B8B7.7030200@gmx.de> <4D25F22093241741BC1D0EEBC2DBB1DA01AB843B4F@EX-SEA5-D.ant.amazon.com> <4A61F5C2.3050906@gmx.de> <4D25F22093241741BC1D0EEBC2DBB1DA01AB843BEE@EX-SEA5-D.ant.amazon.com> <4A6D9396.5020506@gmx.de> <4D25F22093241741BC1D0EEBC2DBB1DA01ABD5D45E@EX-SEA5-D.ant.amazon.com> <20090727173222.GF32524@mercury.ccil.org> <4D25F22093241741BC1D0EEBC2DBB1DA01ABD5D6D6@EX-SEA5-D.ant.amazon.com>
In-Reply-To: <4D25F22093241741BC1D0EEBC2DBB1DA01ABD5D6D6@EX-SEA5-D.ant.amazon.com>
Cc: LTRU Working Group <ltru@ietf.org>, HTTP Working Group <ietf-http-wg@w3.org>

Phillips, Addison wrote:
>> Phillips, Addison scripsit:
>>
>>> I tend to think that HTTP's requirements are most like what the
>>> Lookup algorithm provides. That is, you can (and must) return
>>> exactly one result for a given request.
>> Actually, no; that's an oversimplification of HTTP.  
> 
> Well... what I was trying to say was "HTTP is most commonly used in a way that I think works better with Lookup". That is, most typically, HTTP is used to return a single resource at the end of a URI. I'm well aware that there are other cases, including the 300 case, which is why I went on to say the rest of what I said :-). 
> 
> This is also why I didn't suggest that we merely replace filtering with lookup. I do think that the most common use of HTTP involves returning a single information object for a single request and, in the case where language negotiation is done at all, these typically fit Lookup more closely than Basic Filtering. A significant subset fit Basic Filtering better. And a different significant subset happen to use Basic Filtering (even if Lookup would have been a better choice) simply because 2616 said to.
> 
>> The question is, then, what to do if there is no resource that specifies
>> those minimum requirements.  Apache in this case applies the lookup
>> algorithm to loosen the client requirement in hopes of finding something
>> usable.
>>
> 
> Yes, and this is neither "Basic Filtering" nor "Lookup". Similarly, implementations sometimes make use of outside information (Mark Davis's example of Breton falling back to French, for example). And so forth. The problem, as I see it, is that over-specificity in HTTPbis might lead implementers astray and we really need a more comprehensive treatment of language negotiation so that folks can choose wisely.
> 
> Addison

Several of us just met at the W3C TPAC, and we produced the following 
proposal (this is the full text for the Accept-Language definition):

-- snip --
5.4.  Accept-Language

    The "Accept-Language" request-header field can be used by user agents
    to indicate the set of natural languages that are preferred in the
    response.  Language tags are defined in Section 2.4.

      Accept-Language   = "Accept-Language" ":" OWS
                        Accept-Language-v
      Accept-Language-v =
                        1#( language-range [ OWS ";" OWS "q=" qvalue ] )
      language-range    =
                <language-range, defined in [RFC4647], Section 2.1>

    Each language-range can be given an associated quality value which
    represents an estimate of the user's preference for the languages
    specified by that range.  The quality value defaults to "q=1".  For
    example,

      Accept-Language: da, en-gb;q=0.8, en;q=0.7

    would mean: "I prefer Danish, but will accept British English and
    other types of English." (see also Section 2.3 of [RFC4647])

    For matching, Section 3 of [RFC4647] defines several matching
    schemes.  Implementations can offer the most appropriate matching
    scheme for their requirements.

       Note: the "Basic Filtering" scheme ([RFC4647], Section 3.3.1) is
       identical to the matching scheme that was previously defined in
       Section 14.4 of [RFC2616].

    It might be contrary to the privacy expectations of the user to send
    an Accept-Language header with the complete linguistic preferences of
    the user in every request.  For a discussion of this issue, see
    Section 7.1.

    As intelligibility is highly dependent on the individual user, it is
    recommended that client applications make the choice of linguistic
    preference available to the user.  If the choice is not made
    available, then the Accept-Language header field MUST NOT be given in
    the request.

       Note: When making the choice of linguistic preference available to
       the user, we remind implementors of the fact that users are not
       familiar with the details of language matching as described above,
       and should provide appropriate guidance.  As an example, users
       might assume that on selecting "en-gb", they will be served any
       kind of English document if British English is not available.  A
       user agent might suggest in such a case to add "en" to get the
       best matching behavior.
-- snip --
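
To make the header syntax above concrete, here is a minimal Python sketch 
of how a recipient might split a field value into language ranges and 
quality values. It is an illustration only, not part of the proposed 
text: the function name parse_accept_language is hypothetical, and the 
handling of a malformed qvalue is an assumption on my part, since the 
proposal does not say what to do in that case.

```python
def parse_accept_language(value):
    """Split an Accept-Language field value into (language-range, q) pairs.

    The result is sorted by descending q. A range without a "q" parameter
    gets the default quality value of 1.0, as stated in the proposal.
    """
    ranges = []
    for element in value.split(","):
        element = element.strip()
        if not element:
            continue  # skip empty list elements
        parts = [p.strip() for p in element.split(";")]
        language_range = parts[0].lower()  # matching is case-insensitive
        q = 1.0
        for param in parts[1:]:
            name, _, val = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(val.strip())
                except ValueError:
                    # Assumption: treat an unparsable qvalue as "not acceptable".
                    q = 0.0
        ranges.append((language_range, q))
    # sorted() is stable, so ranges with equal q keep their header order.
    return sorted(ranges, key=lambda item: item[1], reverse=True)
```

For the example header in the proposal, "da, en-gb;q=0.8, en;q=0.7", this 
yields da (1.0), then en-gb (0.8), then en (0.7).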

The change is that the definition of matching is now delegated to RFC 
4647, and we just mention that the "Basic Filtering" scheme is identical 
to what RFC 2616 used to say.
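
To show why the choice of scheme matters, here is a rough Python sketch 
of two of the RFC 4647 schemes, Basic Filtering (Section 3.3.1) and 
Lookup (Section 3.4). The function names and the list-of-tags interface 
are my own assumptions for illustration; the sketch does not handle 
wildcards inside a range (such as "en-*") and is not meant as a complete 
implementation of either scheme.

```python
def basic_filter(language_range, tags):
    """Basic Filtering: return every tag the range matches.

    A range matches a tag if they are equal (ignoring case), or if the
    range is a prefix of the tag and the next character is "-".
    The range "*" matches any tag.
    """
    wanted = language_range.lower()
    matches = []
    for tag in tags:
        candidate = tag.lower()
        if wanted == "*" or candidate == wanted or candidate.startswith(wanted + "-"):
            matches.append(tag)
    return matches


def lookup(priority_list, tags, default=None):
    """Lookup: return exactly one best-matching tag, or `default`.

    Each range is truncated from the end, subtag by subtag, until it
    equals an available tag.
    """
    available = {tag.lower(): tag for tag in tags}
    for language_range in priority_list:
        subtags = language_range.lower().split("-")
        if subtags == ["*"]:
            continue  # the wildcard range is skipped during lookup
        while subtags:
            candidate = "-".join(subtags)
            if candidate in available:
                return available[candidate]
            subtags.pop()
            # A trailing single-character subtag (such as "x") is removed
            # together with the subtag that followed it.
            if subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default
```

With the available tags "en", "en-US" and "fr", basic_filter("en-gb", ...) 
returns nothing, while lookup(["en-gb"], ...) falls back to "en". That is 
the "en-gb" situation the note to implementors in the proposal warns about.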

See also 
<http://trac.tools.ietf.org/wg/httpbis/trac/attachment/ticket/181/181.diff> 
for the proposed change as a diff from the text in -08.

Feedback appreciated,

Julian