Re: [http-state] non-ASCII cookie values (was Re: Closing Ticket 3: Public Suffixes)

Adam Barth, Tue, 09 February 2010 16:47 UTC

From: Adam Barth
Date: Tue, 09 Feb 2010 08:47:58 -0800
To: Julian Reschke

On Wed, Feb 3, 2010 at 12:50 AM, Julian Reschke wrote:
> Adam Barth wrote:
>> In general, I'm very reluctant to specify a behavior that conflicts
>> with IE, Firefox, Chrome, and Opera.  In specific, IE has something
>> like 99% market share in some Asian markets.  The fact that this issue
>> affects Asian markets disproportionately means the pressure to be
>> IE-like is even greater than usual.
>> ...
> On the other hand, why require something that's known not to be in use (if that
> is the case)?

As Maciej points out, this is not the standard of evidence that user
agent implementors usually use when making compatibility decisions.  I
understand that this is a bit of a cultural divide in this group, but
I'll try to give you a flavor of why UA folks think this way.

A number of months ago, I helped change the way JavaScript prototypes
were configured in WebKit.  Instead of using the lexical or dynamic
scope (which is what WebKit was using a mix of at the time), we
changed them to some other esoteric rule that matched Firefox and IE.
At the time, we had zero evidence that this change had any
compatibility benefit because the cases where these things are different are
extremely obscure.  So, why did we change it?  Well, because there's
no value in being different from other browsers and some (potential)
value in being the same.  That means the benefit (potentially
non-zero) outweighs the cost (essentially zero).

Now, as far as I know, the project still has zero evidence that this
change actually improved compatibility in the real world.  However, I
happen to know by some random coincidence that this change was hugely
beneficial.  It just so happens that when folks use both Selenium (a
popular web application testing harness) and Prototype.js (a popular
JavaScript library), they arrive in precisely the situation where
these computations give different answers.  Without the change,
developers have a lot of trouble testing a Prototype.js-using web site
with Selenium, but, with the change, it works fine.

So what are the consequences of the change?  Well, it means developers
are much more likely to test their web sites in Safari because they
can just re-use their existing Selenium tests instead of having to
build or find another testing harness.  That means they're more likely
to fix bugs that affect Safari, which means their web sites will work
better in Safari.  That, in turn, means users of Safari will have
better experiences at web sites and the marketshare of Safari will
increase.

If the WebKit project had required actual examples of the change being
beneficial, they probably would not have made that change and would
not have enjoyed the benefits.  Now, not every change pays off as well
as the one I describe, and, in many cases, we never find out how well
they pay off.  The point is more that user agent implementors are
being rational when they make these sorts of decisions.

> If you allow non-ASCII characters (or actually require support), you'll also
> have to figure out how existing server and client based APIs are to deal
> with them (so what encoding is it?).

For our purposes, it doesn't really matter how existing servers handle
the encodings.  In the end, they do what they do.  As long as we send
the "right" bytes on the wire, they'll do the "right" thing, whatever
that is.  As for client APIs, yes, we'll need to get that right,
either in this specification or in the specification of those APIs (as
we've discussed elsewhere in this thread).
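
To make the "bytes on the wire" point concrete, here is a minimal
sketch (not from the thread) of how one non-ASCII cookie value turns
into different byte sequences depending on the encoding.  The cookie
name, the value, and the helper cookie_header_bytes are all
hypothetical, and the two encodings are only examples; nothing here
claims to match what any particular browser sends.

```python
# Illustrative only: the cookie name/value and this helper are hypothetical,
# and the encodings shown are assumptions, not any browser's actual behavior.

def cookie_header_bytes(name: str, value: str, encoding: str) -> bytes:
    """Serialize a Cookie header line using the given character encoding."""
    return f"Cookie: {name}={value}".encode(encoding)


value = "caf\u00e9"  # "café": three ASCII characters plus one non-ASCII one

utf8_bytes = cookie_header_bytes("drink", value, "utf-8")
latin1_bytes = cookie_header_bytes("drink", value, "iso-8859-1")

print(utf8_bytes)    # the non-ASCII character becomes two bytes
print(latin1_bytes)  # the same character becomes a single byte

# A receiver that assumes a different encoding than the sender used
# recovers a different string from the same bytes.
misread = utf8_bytes.decode("iso-8859-1")
print(misread)
```

The same logical value produces two different wire forms, so a server
that stored one form only sees the same cookie again if the user agent
keeps sending exactly those bytes.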