Re: [http-state] Ticket 5: Cookie ordering

Adam Barth <ietf@adambarth.com> Wed, 20 January 2010 09:48 UTC

From: Adam Barth <ietf@adambarth.com>
Date: Wed, 20 Jan 2010 01:48:23 -0800
Message-ID: <7789133a1001200148k1aa148e6qe7d1b900e5b434c7@mail.gmail.com>
To: Daniel Stenberg <daniel@haxx.se>
Cc: http-state <http-state@ietf.org>
Subject: Re: [http-state] Ticket 5: Cookie ordering

It sounds like you're now advocating:

(3) Require that cookies with longer paths appear before cookies with
shorter paths, but do not require an ordering among cookies with equal
path lengths.

In particular, because you wrote:

On Wed, Jan 20, 2010 at 12:13 AM, Daniel Stenberg <daniel@haxx.se> wrote:
> The (3) "sort-them-all" approach is indeed the easier way to implement this,
> as figuring out exactly which cookies that have the same name and only
> special-case the order of those will probably be a bigger effort than to
> simply unconditionally sort all cookies based on path length.

and

> (I should add that curl will sort cookies based on path length starting
> soon, as I've been convinced that there will be a few rare sites that cannot
> work properly without it.)

You didn't answer my question, which I'll repeat here re-phrased for
your current position:

Can you explain why (3) provides a better cost / benefit trade-off
than (1)?  In particular, what is the advantage of us adopting (3)
over (1)?

I claim that (1) has a better cost / benefit trade-off than (3) for
the following two reasons:

A) There is a risk that some sites will fail to function properly if
the cookies are presented in another order.

B) It is essentially cost-free for user agents to sort cookies in a
particular order, especially in light of the fact that we agree that
user agents must at least sort cookies using a partial order based on
path length.
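A minimal sketch of that partial order, for illustration only.  The
Cookie record and both function names are hypothetical and not taken
from any user agent.  It relies on Python's sort being stable, so
cookies with equal path lengths simply keep whatever order the store
already had them in:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cookie:
    # Hypothetical record; a real cookie store tracks more fields.
    name: str
    value: str
    path: str


def sort_by_path_length(cookies: List[Cookie]) -> List[Cookie]:
    # Longer paths first.  sorted() is stable, so cookies whose paths
    # have the same length stay in their existing relative order.
    return sorted(cookies, key=lambda c: -len(c.path))


def cookie_header(cookies: List[Cookie]) -> str:
    # Serialize the cookies in sorted order into one Cookie header.
    pairs = "; ".join(f"{c.name}={c.value}" for c in sort_by_path_length(cookies))
    return "Cookie: " + pairs
```

The sort is a single call on a list the user agent already has to
walk in order to build the header.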

Detailed comments below.

On Wed, Jan 20, 2010 at 12:13 AM, Daniel Stenberg <daniel@haxx.se> wrote:
> I like specifying only what needs to be specified and leave the rest to the
> implementor. In every spec I read or work with.

I'm not sure what your personal preference has to do with anything.
I'm sure some folks would prefer that we specify things in more detail
and others would prefer that we have a looser specification.  In the
end, technical arguments and data carry more weight.

> Since the only cookies that NEED an order are those that have the same name,
> I think we should only specify that.

You haven't presented any evidence for this claim.

>> Do you have data about the relative compatibility of (1), (3), and (4)?
>>  If not, are you willing to generate this data?  It seems like there is some
>> interoperability / compatibility risk to (4) that would be mitigated in (3)
>> and (1).  Without a way to quantify this risk, we are flying blind.
>
> I've not seen anyone present any numbers regarding cookie sort order so I
> don't quite see why I would be different here. No, I don't have any numbers
> and I don't think I'm able to come up with any numbers either.

If there were some cost associated with sorting the cookies, then we
would need to balance that cost against the compatibility cost of not
sorting the cookies.  As it stands, there is no cost to sorting the
cookies, only a risk in leaving them unsorted.  You claim that the
risk is small, but you haven't presented any evidence as to its
magnitude.  My argument doesn't depend on the magnitude of the risk.
That's why your argument requires data and mine does not.

> My assumptions and views of reality here comes from the fact that I've
> written a cookie implementation that has been used rather widely by
> applications, frameworks and browsers for quite a number of years by now.
> Like most other people here base their views on work with other
> implementations and/or users. I assume we're all similar in that aspect.

I'd rather make decisions based on data and technical arguments
instead of assumptions.

>> Why introduce interoperability problems where none exist?
>
> The interoperability here is against servers. Can you show me two servers in
> active use today that a client doesn't work with if the cookie order is as
> "free" as I suggest?

That's not a good criterion to use when deciding to relax a
requirement.  For example, if you'd asked me to show you a server that
would fail to operate if we changed the number of cookies allowed per
domain from 20 to 50, I would not have been able to produce one.  Yet,
when IE changed to allow 50 cookies per domain, they rendered a major
financial institution's web site completely inoperable:

http://blogs.msdn.com/ieinternals/archive/2009/08/20/WinINET-IE-Cookie-Internals-FAQ.aspx

In that case, however, there was a good reason for allowing 50 cookies
per domain (which I can elaborate on).  In the case of cookie sorting, I
haven't heard a good reason not to sort the cookies.

Here's what Eric Lawrence learned from this episode:

"That, in turn, is one reason why the IE team must exercise great care
when making any change to IE’s cookie implementation."

We would be well served to heed this advice.  If we ignore it, user
agent implementors will ignore us and we'll have another dead-letter
spec like RFC 2109.

> Some of those non-web tools that deal with cookies know how to import
> cookies from the traditional cookies file in the netscape file format, and
> that format has no "creation time" field and thus importing cookies from
> such a file makes it impossible to send the cookies back sorted on the
> original "creation time" but instead it could only do it based on the time
> of the read from a file.

This is a good technical argument.  Maybe we should state that the
creation time ordering is required only if the creation time
information is available to the user agent.
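
One way that relaxed rule could look, as a sketch under stated
assumptions.  It assumes the full ordering is longer paths first, then
earlier creation time first among equal path lengths.  Placing cookies
with an unknown creation time after the ones with a known time, in
store order, is my own choice for the example and not something agreed
on in this thread.  StoredCookie and order_for_request are hypothetical
names:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoredCookie:
    # Hypothetical record.  creation_time is None when the store has no
    # such information, e.g. after importing a Netscape-format file.
    name: str
    value: str
    path: str
    creation_time: Optional[float] = None


def order_for_request(cookies: List[StoredCookie]) -> List[StoredCookie]:
    def key(item):
        index, cookie = item
        known = cookie.creation_time is not None
        return (
            -len(cookie.path),                       # longer paths first
            0 if known else 1,                       # known times first (assumption)
            cookie.creation_time if known else 0.0,  # earlier creation first
            index,                                   # otherwise keep store order
        )

    return [cookie for _, cookie in sorted(enumerate(cookies), key=key)]
```

A tool that only ever imports cookies from a file would have no
creation times at all, and for it this reduces to the plain
path-length sort.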

>> or omit the cookie header entirely.  I can easily imagine server-side code
>> that would break if presented with "Cookie: baz=qux; foo=bar".
>
> Really? I mean, sure I can imagine that there is also sites that require
> that the Cookie: header is the 5th header in the request but that doesn't
> make it a correct assumption even though it might work with a browser or
> two.

The HTTP spec would be quite different if I were writing it.  For
example, this poor proxy vendor wouldn't have had this problem:

http://lists.w3.org/Archives/Public/ietf-http-wg/2009JulSep/0264.html

However, my job is not to write the HTTP spec.  My job is to write the
highest quality cookie spec I can.

> I'm only against specifying it in the spec if it's a very rare assumption
> that we can declare is a server bug. If the assumption is spread and used
> more than rarely then I am wrong and it should be specified.

Suppose we declare it a server bug.  What then?  Are user agent
vendors going to spend their evangelism budget to convince site
operators to unbreak their servers?  In reality, the user agents with
high market share will continue to sort cookies.  User agents with low
market share will be blissfully ignorant until they come across a
server that breaks.  If they're lucky enough to get a bug report and
reduce the issue, they can either spend evangelism dollars to get the
site to change or they can sort the cookies in the same way as the
user agents with high market share.

As someone who's been around these decisions, I can tell you that a
lot of evangelism dollars get taken up on things like fixing the way
servers parse the user agent string.  In these cases, the user agent
has no recourse except to copy the user agent string of a high market
share user agent.

>> The easiest way to work with all servers is to sort the cookies as
>> proposed in (1).
>
> The easiest way is to just document the exact behavior of one browser and
> one server implementation and we're done. That's however not our job here, I
> think.

Our job is the following:

"Where commonalities exist in the most widely used implementations,
the working group will specify the common behavior."

In my view, option (1) best accomplishes this goal.

Adam