Re: [http-state] non-ASCII cookie values (was Re: Closing

Adam Barth, Wed, 10 February 2010 02:39 UTC


I'm sorry, I didn't quite understand your message.  It sounds like
you're interested in non-ASCII cookies because you work in Thailand.

Would you prefer that we:

1) Recommend non-ASCII characters be encoded into ASCII (for example,
with base64).
2) Allow non-ASCII characters and leave the character set undefined
(or up to the server internally).
3) Specify a particular character set, such as UTF-8 or ISO-8859-1.
4) Some other option that hasn't been discussed yet.
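
To make options 1 and 3 a little more concrete, here is a minimal Python sketch. It is only an illustration, not a proposal: the helper names (`encode_cookie_value`, `decode_cookie_value`, `fits_charset`) are hypothetical, and the choice of UTF-8 before base64 is an assumption, since option 1 as stated does not say which bytes get base64-encoded.

```python
import base64


def encode_cookie_value(text: str) -> str:
    """Option 1 sketch: turn arbitrary text into an ASCII-only cookie value.

    Assumption: the text is serialized as UTF-8 before base64 encoding.
    """
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_cookie_value(value: str) -> str:
    """Reverse of encode_cookie_value (same UTF-8 assumption)."""
    return base64.b64decode(value.encode("ascii")).decode("utf-8")


def fits_charset(text: str, charset: str) -> bool:
    """Option 3 sketch: can this text be represented in the given charset?"""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


thai = "สวัสดี"
encoded = encode_cookie_value(thai)

# The encoded value contains only ASCII and round-trips losslessly.
assert encoded.isascii()
assert decode_cookie_value(encoded) == thai

# A fixed charset matters: Thai text fits UTF-8 but not ISO-8859-1.
assert fits_charset(thai, "utf-8")
assert not fits_charset(thai, "iso-8859-1")
```

Under option 1 the cookie syntax itself stays ASCII-only and the encoding question moves into the application; under option 3 the choice of character set decides which text can be carried at all.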


On Tue, Feb 9, 2010 at 6:31 PM, eric bianchetti wrote:
> It was that idea that made me raise the point about non-ASCII characters. At some point in the near future we will have to support them.
> Having worked in a non-ASCII country (Thailand) since 2002, I can tell you there is a need to include non-ASCII characters in the specs (non-normatively for now), for both human and technical reasons.
> Some general wording would do, such as specifying that non-ASCII characters have to be encoded if they are used in cookies (would it be better to specify the encoding or to leave it unspecified?).
> Eric
> Message: 2
> Date: Tue, 9 Feb 2010 08:47:58 -0800
> From: Adam Barth <>
> Subject: Re: [http-state] non-ASCII cookie values (was Re: Closing
>     Ticket 3:    Public Suffixes)
> To: Julian Reschke <>
> On Wed, Feb 3, 2010 at 12:50 AM, Julian Reschke wrote:
>> Adam Barth wrote:
>>> In general, I'm very reluctant to specify a behavior that conflicts
>>> with IE, Firefox, Chrome, and Opera.  In specific, IE has something
>>> like 99% market share in some Asian markets.  The fact that this issue
>>> affects Asian markets disproportionately means the pressure to be
>>> IE-like is even greater than usual.
>>> ...
>> On the other hand, why require something that's known not to be in use (if that
>> is the case)?
> As Maciej points out, this is not the standard of evidence that user
> agent implementors usually use when making compatibility decisions.  I
> understand that this is a bit of a cultural divide in this group, but
> I'll try to give you a flavor of why UA folks think this way.
> A number of months ago, I helped change the way JavaScript prototypes
> were configured in WebKit.  Instead of using the lexical or dynamic
> scope (which is what WebKit was using a mix of at the time), we
> changed them to some other esoteric rule that matched Firefox and IE.
> At the time, we had zero evidence that this change had any
> compatibility benefit because the cases where these things are different are
> extremely obscure.  So, why did we change it?  Well, because there's
> no value in being different from other browsers and some (potential)
> value in being the same.  That means the benefit (potentially
> non-zero) outweighs the cost (essentially zero).
> Now, as far as I know, the project still has zero evidence that this
> change actually improved compatibility in the real world.  However, I
> happen to know by some random coincidence that this change was hugely
> beneficial.  It just so happens that when folks use both Selenium (a
> popular web application testing harness) and Prototype.js (a popular
> JavaScript library), they arrive in precisely the situation where
> these computations give different answers.  Without the change,
> developers have a lot of trouble testing a Prototype.js-using web site
> with Selenium, but, with the change, it works fine.
> So what are the consequences of the change?  Well, it means developers
> are much more likely to test their web sites in Safari because they
> can just re-use their existing Selenium tests instead of having to
> build or find another testing harness.  That means they're more likely
> to fix bugs that affect Safari, which means their web sites will work
> better in Safari.  That, in turn, means users of Safari will have
> better experiences at web sites and the market share of Safari will
> grow.
> If the WebKit project had required actual examples of the change being
> beneficial, they probably would not have made that change and would
> not have enjoyed the benefits.  Now, not every change pays off as well
> as the one I describe, and, in many cases, we never find out how well
> they pay off.  The point is more that user agent implementors are
> being rational when they make these sorts of decisions.
>> If you allow non-ASCII characters (or actually require support), you'll also
>> have to figure out how existing server and client based APIs are to deal
>> with them (so what encoding is it?).
> For our purposes, it doesn't really matter how existing servers handle
> the encodings.  In the end, they do what they do.  As long as we send
> the "right" bytes on the wire, they'll do the "right" thing, whatever
> that is.  As for client APIs, yes, we'll need to get that right,
> either in this specification or in the specification of those APIs (as
> we've discussed elsewhere in this thread).
> Adam
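
On the "right bytes on the wire" point in the quoted message, a small Python sketch may help show why the server-side interpretation is outside the sender's control. It assumes nothing about any real server; the `interpret` helper and the two candidate character sets are hypothetical choices for illustration only.

```python
def interpret(wire_bytes: bytes, charset: str) -> str:
    """Hypothetical server-side step: decode cookie bytes with some charset."""
    return wire_bytes.decode(charset)


# The sender puts these two bytes on the wire (UTF-8 for "ü").
wire_bytes = "ü".encode("utf-8")
assert wire_bytes == b"\xc3\xbc"

# Two servers receive identical bytes but disagree about the text.
as_utf8 = interpret(wire_bytes, "utf-8")
as_latin1 = interpret(wire_bytes, "iso-8859-1")
assert as_utf8 == "ü"
assert as_latin1 == "Ã¼"

# ISO-8859-1 maps every byte to a character, so decoding never fails,
# whereas a lone 0xFC byte is not valid UTF-8.
assert interpret(b"\xfc", "iso-8859-1") == "ü"
try:
    interpret(b"\xfc", "utf-8")
    utf8_accepts_lone_byte = True
except UnicodeDecodeError:
    utf8_accepts_lone_byte = False
assert not utf8_accepts_lone_byte
```

The bytes are the same in both cases; only the receiver's assumption differs, which is why a wire-level rule can fix the bytes without fixing what each server makes of them.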