Re: #428 Accept-Language ordering for identical qvalues

"Adrien W. de Croy" <adrien@qbik.com> Sat, 19 January 2013 07:11 UTC

From: "Adrien W. de Croy" <adrien@qbik.com>
To: Amos Jeffries <squid3@treenet.co.nz>, Nicholas Shanks <nickshanks@nickshanks.com>
Cc: "Roy T. Fielding" <fielding@gbiv.com>, Julian Reschke <julian.reschke@gmx.de>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Date: Sat, 19 Jan 2013 07:09:52 +0000

RFC 2616 already strongly implies (in prose, not ABNF) an order of preference.

From section 14.4:

"The quality value defaults to "q=1". For example,

Accept-Language: da, en-gb;q=0.8, en;q=0.7

would mean: "I prefer Danish, but will accept British English and
other types of English."

The prose states a preference of Danish over the following languages.
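
As a rough illustration of how a recipient might read that example, here is a minimal Python sketch. The function names (parse_accept_language, rank) are made up for this message, and the handling of a malformed q is an assumption rather than anything RFC 2616 specifies. It applies the q=1 default and sorts by q; because Python's sort is stable, entries with identical qvalues keep their header order.

```python
def parse_accept_language(header):
    """Return (language_range, q) pairs in the order they appear in the header."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        parts = [p.strip() for p in item.split(";")]
        q = 1.0  # RFC 2616 section 14.4: the quality value defaults to q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed q as "not acceptable"
        prefs.append((parts[0], q))
    return prefs


def rank(prefs):
    # sorted() is stable, so ranges with equal q stay in header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


print(rank(parse_accept_language("da, en-gb;q=0.8, en;q=0.7")))
# [('da', 1.0), ('en-gb', 0.8), ('en', 0.7)]
```

With "en, de" both entries come back with q=1.0 and in that order, which is exactly the case this thread is about.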

Actually, the fact that a server is not required to honour any
preferences indicated in Accept-* headers leads to some interesting
conundra for caching.

For instance,

GET /something HTTP/1.1
Host: someserver.org
Accept-Language: fr
...

HTTP/1.1 200 OK
Content-Language: en
Vary: Accept-Language
...

leads to problems when the cache sees a request

GET /something HTTP/1.1
Host: someserver.org
Accept-Language: en
...

Even though the cached response is obviously the right thing to send,
the cache cannot correctly reuse it, since the selecting header
(Accept-Language) does not match between the two requests.  Could this
be ameliorated if there were ETags, since the cache could then at least
revalidate the stored version?

Adrien




------ Original Message ------
From: "Amos Jeffries" <squid3@treenet.co.nz>
To: "Nicholas Shanks" <nickshanks@nickshanks.com>
Cc: "Roy T. Fielding" <fielding@gbiv.com>; "Julian Reschke" 
<julian.reschke@gmx.de>; "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Sent: 19/01/2013 5:29:20 p.m.
Subject: Re: #428 Accept-Language ordering for identical qvalues
>On 19/01/2013 1:50 a.m., Nicholas Shanks wrote:
>>>>On 2013-01-18 09:46, Amos Jeffries wrote:
>>>>>I'm with Roy on this one. It's not adding any new requirement about
>>I feel I concur with Julian the most.
>>
>>On 18 January 2013 12:11, Roy T. Fielding <fielding@gbiv.com> wrote:
>>>Yes. It would also be conformant to send Mäori text.
>>Use a macron or leave it off ;-)
>>[Option-a] [a] on a Mac with one of the "Extended" keyboard layouts
>>
>>>Ignoring the preferences sent in Accept-Language is conforming 
>>>behavior.
>>>
>>>Conformance is not a relevant issue here. What matters is what the
>>>user actually prefers. It is my opinion that when a user sets an
>>>Accept-Language header to
>>>
>>>    Accept-Language: en, de
>>>
>>>what they are actually saying is that they accept both languages
>>>but would prefer en if the de representation is no better.
>>You cannot assume that. They are either using a broken client, or both
>>are acceptable. Please don't change the standard to accommodate broken
>>clients, especially as these are going to become fewer in number as
>>time progresses and machines get upgraded.
>>
>>>The reason I believe this is because user agents that allow a
>>>user to send such a header field have explicitly instructed the
>>>user that the field is ordered (or based the value on some other
>>>ordered list for the host UI, as is the case for some cell phones).
>>All UAs I know of that allow users to set an ordered list of
>>languages, also send auto-generated q-values.
>>
>>Do you actually have any statistics to back up your belief, or is it
>>just a gut feeling?
>>Some numbers to say that "versions x and earlier of so-and-so browser
>>on X-series phones allow users to define an ordered list but do not
>>send q-values; those browsers currently have a worldwide market share
>>of 0.0001%" would be useful to know whether it's worth ignoring such
>>broken UAs or pandering to them.
>>
>>
>>FWIW, my usual AL string, in browsers that let you set one, is:
>>"en-GB, en-IE, en-AU, en-US;q=0, en;q=0.95, fr;q=0.5, de;q=0.5,
>>zh-Hant;q=0.1, *;q=0.2"
>>My goals should be self-evident from the q-values, specifically to get
>>English, French or German, to demote 'complicated' Han script and fall
>>back to anything else. The US thing is to see if sites are actually
>>obeying my preferences (I get many more "y'all"s than 406's sadly!)
>
>According to the 2616 spec we are quibbling over you would get en-GB, 
>en-IE, or en-AU if any of them were available. These being assumed to 
>be q=1 and all equal valued; one is supposed to be selected *randomly*.
>
>Now suppose you had a blog page with each comment loaded by XHR as a 
>separate GET request using that AL header and auto-translation of 
>comments - your page looks like a group of multi-cultural responses 
>some possibly with pidgin-English style wording despite probably all 
>actually being en-US text to begin with. AND if you refresh the page 
>everybody's language changes from what it was on the last load.
>
>Versus a server which assumed en-GB, en-IE, en-AU were equal q=1 and 
>ordered by preference would supply you with the en-GB for each response 
>part of the page.
>
>Which is the better outcome for web developers to rely on?
>Which one is easier for servers to write fast code for?
>
>Amos
>
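
The two behaviours Amos contrasts can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, matching is by exact tag only (no prefix or "*" matching), and the header is Nicholas's string from earlier in the thread joined onto one line.

```python
import random


def parse(header):
    """Return (lowercased tag, q) pairs in header order; q defaults to 1."""
    prefs = []
    for item in header.split(","):
        tag, _, params = item.strip().partition(";")
        q = 1.0
        name, _, value = params.partition("=")
        if name.strip().lower() == "q":
            q = float(value)
        prefs.append((tag.strip().lower(), q))
    return prefs


def candidates(available, prefs):
    """Available languages that share the highest non-zero q, in header order."""
    offered = {a.lower() for a in available}
    usable = [(tag, q) for tag, q in prefs if q > 0 and tag in offered]
    if not usable:
        return []
    best = max(q for _, q in usable)
    return [tag for tag, q in usable if q == best]


def choose_ordered(available, prefs):
    # Ties are broken by position in the header: first listed wins.
    tied = candidates(available, prefs)
    return tied[0] if tied else None


def choose_random(available, prefs, rng=random):
    # Ties are broken arbitrarily, so repeated requests may differ.
    tied = candidates(available, prefs)
    return rng.choice(tied) if tied else None


AL = ("en-GB, en-IE, en-AU, en-US;q=0, en;q=0.95, fr;q=0.5, de;q=0.5, "
      "zh-Hant;q=0.1, *;q=0.2")
prefs = parse(AL)
available = ["en-AU", "en-GB", "en-IE", "en-US"]

print({choose_ordered(available, prefs) for _ in range(20)})  # {'en-gb'}
print({choose_random(available, prefs) for _ in range(20)})   # varies per run
```

The ordered variant needs no random source and gives every XHR on the page the same answer, which is the stability Amos is arguing for.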