Re: #428 Accept-Language ordering for identical qvalues

Julian Reschke <julian.reschke@gmx.de> Thu, 17 January 2013 21:05 UTC

On 2013-01-17 11:17, Roy T. Fielding wrote:
> On Jan 17, 2013, at 1:14 AM, Julian Reschke wrote:
>
>> On 2013-01-17 09:59, Roy T. Fielding wrote:
>>> The change is to improve interoperability when the preferences sent
>>> result in a tie or contain no qvalues.
>>>
>>> http://www.w3.org/International/questions/qa-lang-priorities.en.php
>>>
>>> Firefox and Chrome have an ordered language UI that takes whatever
>>> list the user comes up with and creates q-values to associate with
>>> each language tag after the first.  The languages are always listed
>>> in order of preference.  I've heard that Opera and MSIE do the same
>>
>> But they have different qvalues, so these UAs do *not* rely on ordering.
>
> They order them *and* they send qvalues, because of broken sites like
>
>     http://wiki.nginx.org/AcceptLanguageModule

...so they do not rely on recipients treating the list as ordered.

> and the change I made has no effect on UAs that send qvalues.

Agreed.

> It does, however, improve the lot for users of other user agents
> that either do not send qvalues or allow the user to specify the
> value by hand, either of which can result in same-valued tags.

Well, it's doing that at the cost of making recipients that implement 
RFC 2068/2616 non-compliant. So this is a change that would need to be 
listed in the Changes section.

>>> but haven't tested.  Chrome has a bug with the ordering of tags to
>>> match the UI, but that's orthogonal to this issue.
>>>
>>> ...
>>>
>>> Older browsers did not send qvalues. Hence, server implementations
>>> of language negotiation do use the ordering provided as I described
>>> in the change.
>>
>> Some, apparently. But not all. Servers have been written according to the spec, ignoring ordering. If we make ordering significant, these will not interoperate anymore.
>
> In what way do they interoperate now, and in what way does that
> change?  As far as I can tell, the only effect this change has
> is a suggestion that they not respond randomly.

Right now they interoperate as specified by the spec. If we change the 
spec, they do not anymore (or only some of the time).

>>>> I believe this change should be backed out.
>>>
>>> The change has no impact on user agents that send distinct
>>> qvalues.  At most, it would change the interpretation for those
>>> few requests that still rely on ordered language tags, for which
>>> the prior specs had no interpretation at all.
>>
>> Well, it had an interpretation ("same weight").
>>
>> It's good that there are only few requests on relying on this. Do you want to encourage more, potentially breaking servers that ignore ordering?
>
> It isn't possible to break those servers any more than they
> are already broken by responding in a random fashion.

They are not broken. They do what the spec says.

>>> http://forums.thedailywtf.com/forums/t/15895.aspx
>>
>> Yes, I'm aware that this is a FAQ. But we're not starting from scratch here.
>>
>>> What I added is how Apache httpd has implemented it since before
>>> any of the HTTP specs were RFCs, which was compatible with how
>>> CERN httpd implemented it before that (no qvalues at all).
>>> The change has no impact on conformance because language
>>> negotiation is optional and the change is not expressed as a
>>> requirement.  It also resolves an inconsistency with RFC4647.
>>
>> It does have an impact, because servers that previously implemented the optional feature now do not anymore.
>
> A server is not required to obey Accept-Language.

Yes, but a server that *does* implement it according to the *current* 
spec (literally) won't be conforming to the new spec anymore.

>>> There are far more examples on the Web where applications
>>> incorrectly assume the list is ordered
>>
>> That'll be hard to count. :-)
>>
>>>    http://pic.dhe.ibm.com/infocenter/tivihelp/v2r1/index.jsp?topic=%2Fcom.ibm.itame.doc_6.1%2Fam61_webseal_admin134.htm
>>>
>>>    http://www.developershome.com/wap/detection/detection.asp?page=acceptLanguageHeader
>>>
>>> than there are servers that implement language negotiation and
>>> actually want to resolve ties at random.
>>
>> They do not "want" to resolve at random; they do so because they have implemented what the spec says. There's no reason to create an ordered list structure when the spec says that an unordered list is sufficient.
>
> Yes, which is why we are changing the spec so that they can eventually
> improve their implementation (or not).  An unordered list is not
> sufficient because users don't send unordered lists, not even when
> they are hand-crafted.  Everyone assumes they are ordered or they
> send them with correct qvalues.

If people stick to what 2068 and 2616 say, there is no problem.

>>> As Harald said,
>>>
>>>> it seems still to be fairly normal to give a sequence of
>>>> language-ranges in this header without any q= values, and expect the
>>>> result to be deterministic.
>>
>> I'd like so see evidence of that, not hearsay. Browsers do *not* rely on this.
>
> ENOCARE.

Not helpful.

>>> because that's how it works in practice for the vast majority
>>> of systems that implement content negotiation.  The spec should
>>> reflect what is most interoperable for the users.
>>
>> As far as I can tell, what's interoperable is the exact opposite: not relying on ordering, but sending qvalues.
>
> One has nothing to do with the other.  The added text does not
> change the ordering for UAs that send proper, non-identical
> qvalues on language tags, even if they choose not to sort them
> by order (which the latest browsers do for a good reason).

Yes. But it makes recipients that are currently conforming potentially 
non-conforming. Yes, for an optional feature, but still.

It seems there are two logical approaches:

(1) Leave things as they are, but explain that HTTP is not totally 
consistent with RFC4647, and that there may be broken senders that rely 
on ordering, and also recipients that do support this; thus it might be 
a good idea to mirror that behavior.

OR

(2) Change the spec to what it says now, noting that this is a change 
from 2068/2616, and also inconsistent with other Accept header fields 
(or do you want to change those, too???).

Note that introducing an inconsistency here may be a bad idea; either it 
defeats code re-use, or the "new old" semantics may bleed over 
unintentionally into the other header fields.
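
To make the difference between the two readings concrete, here is a 
minimal sketch in Python. It is not taken from any server; the function 
names are hypothetical, and it assumes a well-formed header value. 
ordered_preferences() treats list position as a tie-breaker for equal 
qvalues (the changed text), while weight_groups() only reports which 
ranges share a weight (the 2068/2616 reading), so it cannot tell 
"da, en" from "en, da".

```python
def parse_accept_language(header):
    """Return (language_range, q) pairs in the order they appear.

    Assumes a well-formed field value; a missing qvalue defaults to 1.
    """
    prefs = []
    for item in header.split(","):
        tag, _, params = item.strip().partition(";")
        if not tag:
            continue
        q = 1.0
        for param in params.split(";"):
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value.strip())
        prefs.append((tag.strip(), q))
    return prefs


def ordered_preferences(header):
    """Reading with ordering: equal qvalues keep their header order.

    sorted() is stable, so ties are resolved by list position.
    """
    ranked = sorted(parse_accept_language(header), key=lambda p: -p[1])
    return [tag for tag, q in ranked if q > 0]


def weight_groups(header):
    """Reading without ordering: ranges with the same q are one
    unordered group, highest weight first."""
    groups = {}
    for tag, q in parse_accept_language(header):
        if q > 0:
            groups.setdefault(q, set()).add(tag)
    return [(q, frozenset(tags))
            for q, tags in sorted(groups.items(), reverse=True)]
```

For senders that use distinct qvalues, both functions yield the same 
ranking regardless of list position; they differ only for ties.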

My preference is (1). I also don't think we have even rough 
consensus for a change from 2616 here.

Best regards, Julian