Re: #428 Accept-Language ordering for identical qvalues

Amos Jeffries <> Mon, 21 January 2013 01:58 UTC

On 21/01/2013 12:30 p.m., Adrien W. de Croy wrote:
>   ------ Original Message ------
> From: "James M Snell" < <>>
>> +1.. in fact, for 2.0, I'd very much like to get rid of q-values 
>> entirely and depend entirely on order.
> same here.
> The idea may have been laudable in 1998, but really, how can a web 
> server tell if some resource is 80% better than another? A human needs 
> to tell it, and humans have enough trouble with other things.
> the q=0 option would need to be turned into a Naccept-* header or 
> something.   But does anyone even use it outside of testing for 406 
> responses which never come?

My collection of 2 years' worth of language headers says no.

Of 2018 unique Accept-Language header field-values:
   1532 are using q-values in a strictly sorted list
   491 are not using q-values
   14 are using "q=0.0".
   5 are using q-values and non-qvalues without ordering the sent list 
(1 looks otherwise normal, the others are using puny-codes)
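
For anyone wanting to run the same kind of tally over their own logs, here is a rough sketch of how a field-value could be sorted into the buckets above. It is not the script I used: the function names and bucket labels are made up for illustration, "sorted" is taken to mean non-increasing q-values with a missing q-value counted as 1.0, and the q=0 bucket is checked before the ordering test.

```python
def parse_accept_language(value):
    """Split a field-value into (language-range, q or None) pairs."""
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = None
        for param in params.split(";"):
            name, _, val = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(val.strip())
                except ValueError:
                    q = None  # malformed q-value: treat as absent
        entries.append((tag.strip(), q))
    return entries


def classify(value):
    """Return a hypothetical bucket label for one Accept-Language value."""
    qs = [q for _, q in parse_accept_language(value)]
    if all(q is None for q in qs):
        return "no-qvalues"
    if any(q == 0 for q in qs if q is not None):
        return "has-q0"
    # Assumption: an entry without a q-value counts as q=1.0 for ordering.
    effective = [1.0 if q is None else q for q in qs]
    if all(a >= b for a, b in zip(effective, effective[1:])):
        return "sorted"
    return "unordered"
```

With this, "en-US,en;q=0.8,fr;q=0.5" lands in the sorted bucket, while a list ending in an entry with no q-value after lower q-values lands in the unordered one.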

The 14 are also unique in being very long and having multiple entries 
with equal q-values. They are still, without exception, strictly ordered, 
with the entries having no q-value listed first (as if q=1.0 was used 
for sorting but omitted when sending). They also contain a number of 
oddities such as multiple entries for language codes with differing 

NP: Of those 14 odd A-L headers noted above I have UA details on 8 of 
them. All claim to be Firefox but the Gecko dates do not line up with 
other info on those versions (the 11.0 was released some years before 
3.5.9 on the same OS) so the whole input is a bit suspect.

The 5 cases of un-ordered lists have puny-code values with no q-value 
listed after an otherwise normal series of languages. Like so:

I have a few cases of q-value ordered list followed by wildcard "*" with 
no q-value. Sender obviously assuming the list is ordered.
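
To illustrate why that matters: the HTTP/1.1 definition gives a missing q-value the default of q=1, so a receiver that ranks strictly by q-value will put the trailing "*" first, which is clearly not what the sender meant. A minimal sketch follows; the function name and the sample header are invented for illustration, not taken from my logs.

```python
def rank_by_qvalue(value):
    """Rank language-ranges by q-value, ties broken by list position."""
    ranked = []
    for position, item in enumerate(value.split(",")):
        tag, _, params = item.strip().partition(";")
        q = 1.0  # default when no q parameter is present
        name, _, val = params.partition("=")
        if name.strip().lower() == "q":
            q = float(val)
        # Negate q so that a plain ascending sort puts the highest q first.
        ranked.append((-q, position, tag.strip()))
    return [tag for _, _, tag in sorted(ranked)]


# Hypothetical header of the kind described above.
print(rank_by_qvalue("en-US;q=0.9,en;q=0.8,*"))  # ['*', 'en-US', 'en']
```

A receiver that relies on list order alone would keep "*" last, matching what the sender appears to intend.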

Broken down by UA, which I started ~6 months ago at Julien's suggestion, I 
have 54289 distinct UAs visiting, of which:
   21756 are not sending A-L header at all
   19621 unique UA are using a single language code with no q-value
   12495 unique UA are using q-values as above.
   8 are sending only wildcard "*" or "*/*"

The remaining ~400 roughly match up with the 491 A-L field-values not 
using q-values. They are older agents (Windows 98, NT, 2k stand out), agents 
sending the same language multiple times (VoilaBot variants and Safari 
there), or agents sending sub-language variants with the generic form last, 
e.g. "en-GB,en", "en-US,en", "en-US,en,*" (Tablets and Mobile Safari mostly). 
All are obviously assuming sorted lists, even back to the Windows 98 ones.

There are also a few bots sending exactly 2 puny-code entries.