Re: #428 Accept-Language ordering for identical qvalues
Amos Jeffries <squid3@treenet.co.nz> Thu, 24 January 2013 08:39 UTC
On 23/01/2013 2:53 a.m., Julian Reschke wrote:
> On 2013-01-22 14:40, Nicholas Shanks wrote:
>> On 17 January 2013 09:14, Julian Reschke wrote:
>>> On 2013-01-17 09:59, Roy T. Fielding wrote:
>>>> than there are servers that implement language negotiation and
>>>> actually want to resolve ties at random.
>>>
>>> They do not "want" to resolve at random; they do so because they have
>>> implemented what the spec says. There's no reason to create an
>>> ordered list
>>> structure when the spec says that an unordered list is sufficient.
>>
>> I think no implication of randomness should be permitted by the
>> specifications.
>> They should instead require that a deterministic process be used, and
>> that, other than requests to services which explicitly exist to
>> provide random results (e.g. Wikipedia's "Random Page" link), the same
>> request should generate the same result providing nothing pertinent to
>> the resource has changed on the server.
>>
>> Someone, I don't recall who, gave the example of a home page loading
>> blog posts via AJAX, where the blog posts are available in two
>> languages. Random selection between the variants, where (q * qs)
>> values are equal for both languages, or are being ignored, would

That would be me. Take a note of the Androids below...

>
> Can you please give an example of clients sending these kind of header
> field values?
>
> Clients that care can provide different qvalues, and as a matter of
> fact, they do.

Uhm. Let's see... where shall I start?

I think I'll start with an overview of what happens when agents "care" enough to send q-values, followed by a small sample of the 513 agents I have on record with no q-values at all. Judge for yourself which ones are interpreted better as sorted lists.
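To make the two readings concrete, here is a minimal sketch (not code from any agent or server discussed in this thread) that parses an Accept-Language value, ranks it with a stable sort so that entries sharing a q-value keep their header order, and reports which q-values are shared by more than one entry. The function names are hypothetical, and treating a malformed q parameter as absent is an assumption of this sketch, not something the spec mandates.

```python
def parse_accept_language(header):
    """Return (language-range, q) pairs in header order; a missing q means 1.0."""
    entries = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0].lower()
        if not tag:
            continue  # skip empty list elements
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 1.0  # assumption: an unparsable q is treated as absent
        entries.append((tag, q))
    return entries


def ordered_preference(entries):
    """Rank by descending q; sorted() is stable, so ties keep header order."""
    ranked = sorted(entries, key=lambda entry: -entry[1])
    return [tag for tag, q in ranked if q > 0]


def tied_groups(entries):
    """Map each non-zero q-value shared by several entries to those entries."""
    groups = {}
    for tag, q in entries:
        groups.setdefault(q, []).append(tag)
    return {q: tags for q, tags in groups.items() if q > 0 and len(tags) > 1}


yandex = parse_accept_language("ru, uk;q=0.8, be;q=0.8, en;q=0.7, *;q=0.01")
print(ordered_preference(yandex))  # ['ru', 'uk', 'be', 'en', '*']
print(tied_groups(yandex))         # {0.8: ['uk', 'be']}
```

Under the "unordered" reading, every group returned by `tied_groups` is a place where a server may pick either language; under the "ordered" reading, `ordered_preference` already resolves it. The sketch does not merge duplicate language-ranges that appear with different q-values.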
For starters, I would like to say that, to be completely fair, the majority of agents that I have on record (~54% of unique language:agent pair entries) *do* send q-values properly in accordance with the specification - and that same 54% of unique agent entries is all 'voting' for the list to be ordered. I am presenting this sub-set to show the kinds of complexity/confusion issues we introduce when we rely solely on q-values to provide ordering semantics in the list.

WebKit ...

cs, en-us; 0.9, de-de; 0.8, ru-ru; 0.7
  - Mozilla/5.0 (X11; U; Linux; cs-CZ) AppleWebKit/532.4 (KHTML, like Gecko) Arora/0.10.1 Safari/532.4
  + do we consider that a list with q-values or not?
  + notice also how it is a much more "up to date" version than the following...

en;q=1.0, en;q=0.5, zh-cn, zh;q=0.5, en;q=0.5
  - Mozilla/5.0 (SymbianOS/9.2; U; Series60/3.1 NokiaE71-1/300.21.012; Profile/MIDP-2.0 Configuration/CLDC-1.1 ) AppleWebKit/413 (KHTML, like Gecko) Safari/413
  + Nokia Symbian and SonyEricsson WebKit/4XX-532 derived agents across the board seem to have one primary language set at q=1.0, followed by a list of others all sharing q=0.5 or no q-value at all, as seen above.

cs-CZ, en-US
  - Mozilla/5.0 (Linux; U; Android 2.2; cs-cz; HTC Legend Build/FRF91) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1
  + Starting with WebKit/533, all the mobiles seem to have moved to this 2-language model: something, then "en-US".

da-DK, en-US
  - Mozilla/5.0 (Linux; U; Android 4.0.4; da-dk; GT-P5110 Build/IMM76D) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Safari/534.30

en-us,en
  - Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; Valve Steam Client; ) AppleWebKit/534.1 (KHTML, like Gecko) Chrome/6.0.444.0 Safari/534.1

th-TH, en-US
  - Mozilla/5.0 (Linux; U; Android 4.0.3; th-th; A1 Build/IML74K) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30

... and then we have iTunes. A massive "WTF?" going out to the iTunes developers if anyone is reading.
en;q=1.0,fr;q=1.0,de;q=0.9,ja;q=0.9,nl;q=0.9,it;q=0.9,es;q=0.8,pt;q=0.8,pt-PT;q=0.8,da;q=0.7,fi;q=0.7,nb;q=0.7,sv;q=0.7,ko;q=0.6,zh-Hans;q=0.6,zh-Hant;q=0.6,ru;q =0.5,pl;q=0.5,tr;q=0.5,uk;q=0.5,ar;q=0.4,hr;q=0.4,cs;q=0.4,el;q=0.3,he;q=0.3,ro;q=0.3,sk;q=0.3,th;q=0.2,id;q=0.2,ms;q=0.2,en-GB;q=0.1,ca;q=0.1,hu;q=0.1,vi;q=0.1
  - iTunes-iPad/5.1.1 (2; 32GB; dt:74)

en;q=1.0,fr;q=1.0,de;q=0.9,ja;q=0.9,nl;q=0.9,it;q=0.9,es;q=0.8,pt;q=0.8,pt-PT;q=0.8,da;q=0.7,fi;q=0.7,nb;q=0.7,sv;q=0.7,ko;q=0.6,zh-Hans;q=0.6,zh-Hant;q=0.6,ru;q =0.5,pl;q=0.5,tr;q=0.5,uk;q=0.5,ar;q=0.4,hr;q=0.4,cs;q=0.4,el;q=0.3,he;q=0.3,ro;q=0.3,sk;q=0.3,th;q=0.2,id;q=0.2,ms;q=0.2,en-GB;q=0.1,ca;q=0.1,hu;q=0.1,vi;q=0.1
  - iTunes-iPhone/5.0 (4; 16GB)

en;q=1.0,fr;q=1.0,de;q=0.9,ja;q=0.9,nl;q=0.9,it;q=0.9,es;q=0.8,pt;q=0.8,pt-PT;q=0.8,da;q=0.7,fi;q=0.7,nb;q=0.7,sv;q=0.7,ko;q=0.6,zh-Hans;q=0.6,zh-Hant;q=0.6,ru;q =0.5,pl;q=0.5,tr;q=0.5,uk;q=0.5,ar;q=0.4,hr;q=0.4,cs;q=0.4,el;q=0.3,he;q=0.3,ro;q=0.3,sk;q=0.3,th;q=0.2,id;q=0.2,ms;q=0.2,en-GB;q=0.1,ca;q=0.1,hu;q=0.1,vi;q=0.1
  - iTunes-iPhone/4.3.5 (3; 16GB)

... spiders are mostly doing a remarkably good job. At least it looks that way until the q-values get involved.

ja-JP,ja
  - Baiduspider+(+http://www.baidu.jp/spider/)

ja,en
  - Mozilla/5.0 (compatible; Steeler/3.5; http://www.tkl.iis.u-tokyo.ac.jp/~crawler/)

ru, uk;q=0.8, be;q=0.8, en;q=0.7, *;q=0.01
  - Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
  + q=0.8 - Ukrainian or Belarusian?

en-us,en-gb,en;q=0.99,*;q=0.01
  - TosCrawler/Nutch-1.5.1 (http://www.toshiba.co.jp/rdc/about/crawl_info.htm; <dc-crawler at ml dot toshiba dot co dot jp>)
  + q=1.0 - English US or British? (not so much trouble for humans, but for a search engine it might cause indexing trouble).

Don't know if you would call some of the major search engine bots popular, or even a "fixable problem"?

I host a translation server, so it is likely that these below are from actual users working on text translation.
You know, the kind of person who *really* objects to getting a randomly-wrong language displayed. Also, these people are highly knowledgeable about language codes and what they mean, so if they entered these manually it was for a specific reason, according to how they or their tool's author interpreted the Accept-Language specs. Note how the first entries have no q-value and are *sorted* as if they were q=1.0, which is what the spec says to do when no q-value is supplied, remember ... Treat it as q=1.0.

ca,ca-ES,es-es;q=0.9,es;q=0.9,en-US;q=0.9,en;q=0.9,es-419;q=0.8,ca-AD;q=0.8,en-gb;q=0.8,de-de;q=0.7,de;q=0.7,ca-CA;q=0.7,cs-CZ;q=0.6,cs;q=0.6,it-it;q=0.6,it;q=0.6,es-CL;q=0.5,en-au;q=0.5,fr-FR;q=0.5,fr;q=0.4,ru-ru;q=0.4,ru;q=0.4,es-x-mtfrom-en;q=0.4,es-ar;q=0.3,ja-JP;q=0.3,ja;q=0.3,pt-PT;q=0.2,pt;q=0.2,do-es;q=0.2,do;q=0.1,es-x-mtfrom-it;q=0.1,nl-nl;q=0.1,nl;q=0.1,en-en;q=0.0
  - Mozilla/5.0 (X11; Linux x86_64; rv:10.0.6) Gecko/20100101 Firefox/10.0.6 Iceweasel/10.0.6
  + q=1.0 - Catalan Valencian or Spanish Catalan?
  + q=0.9 - Spanish or English? Generic or nationalized grammar?
  + q=0.8 - Spanish or Catalan Andorran or English or German or Catalan Valencian?
  + q=0.6 - want to try again with German or Catalan Generic?
  + q=0.5 - Spanish or Australian English or French?
  + q=0.4 - what about French or Russian?
  + q=0.3 - Argentine Spanish or Japanese?
  + q=0.1 - Spanish or Dutch?

de,de-DE,en-US;q=0.9,en;q=0.9,nl-nl;q=0.8,nl;q=0.8,en-gb;q=0.8,ro-RO;q=0.7,ro;q=0.7,fr-FR;q=0.6,fr;q=0.6,de-DE-1901;q=0.5,tr-TR;q=0.5,tr;q=0.5,pl-PL;q=0.4,pl;q=0.4,nl-NL;q=0.3,de-de;q=0.3,de-at;q=0.3,en-us;q=0.2,pl-pl;q=0.2,de;q=0.1,en-us;q=0.1,en;q=0.0
  - Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.15) Gecko/20110303 Firefox/3.6.15
  + q=0.9 - English Generic or US-centric?
  + q=0.8 - Dutch or English?
  + q=0.5 - German or Turkish?
  + q=0.3 - Dutch or German?
  + q=0.2 - English or Polish?
  + q=0.1 - German or English?
  + q=0.1 - oops. Cancel that q=0.9 US English option.
  + q=0.0 - oops. Cancel that q=0.9 generic English option.
  + I skip q=1.0 (none), q=0.7, q=0.6 and q=0.4 because these, while being alternatives sharing a q-value, are in the ISO definitions semantically equivalent aliases for the same language. So any selection algorithm other than if-it-exists is a waste of CPU cycles, but not a user problem.

We have only a few agents sending "q=1.0". By my interpretation of RFC 2616, these few are the "correct" users of q-values when q=1:

en;q=1.0
  - w3m/0.5.2

Also the YoudaoBot spider, with a mix of language codes. It seems to be trying to fetch different translations specifically, for some reason.

en-us;q=1.0, es-ve;q=0.5
  - Mozilla/4.1 (U; BREW 3.1.5; en-US; Teleca/Q05A/INT)
  - NetFront/3.5.1 (BREW 5.0.1.2; U; en-us; LG; NetFront/3.5.1/AMB) Sprint LN510 MMP/2.0 Profile/MIDP-2.1 Configuration/CLDC-1.1

There are a few other variations of this "NetFront/" framework from Samsung and LG mobile devices.

The rest (~50 unique agent:language pairs) using q=1.0 somewhere in the A-L header are all WebKit-derived agents. We already covered how well they handle q-values.

Still a fair few browser agents around with no q-values.

zh-cn,zh-tw
  - Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1

zh-cn,zh-tw
  - Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3

en,zh,fr,de,it
  - Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.20) Gecko/20081217 Firefox/2.0.0.20 Novarra-Vision/8.0

ru, en-US, en
  - Mozilla/5.0 (compatible; Konqueror/4.4; Linux) KHTML/4.4.5 (like Gecko)

ru, uk, en-US, en
  - Mozilla/5.0 (compatible; Konqueror/4.4; FreeBSD) KHTML/4.4.3 (like Gecko)

HTH
Amos