Re: [http-state] Is this an omission in the parser rules of draft-ietf-httpstate-cookie-21?

Adam Barth <ietf@adambarth.com> Tue, 15 February 2011 21:11 UTC

From: Adam Barth <ietf@adambarth.com>
Date: Tue, 15 Feb 2011 13:11:15 -0800
To: "Roy T. Fielding" <fielding@gbiv.com>
Cc: http-state@ietf.org
Subject: Re: [http-state] Is this an omission in the parser rules of draft-ietf-httpstate-cookie-21?

On Tue, Feb 15, 2011 at 12:29 PM, Roy T. Fielding <fielding@gbiv.com> wrote:
> On Feb 14, 2011, at 8:42 PM, Adam Barth wrote:
>> On Mon, Feb 14, 2011 at 7:23 PM, Roy T. Fielding <fielding@gbiv.com> wrote:
>>> Parsing for user agents is defined in section 5.  Servers have to
>>> parse cookies as well, and the grammar provided in section 4 is wrong
>>> for both generation and parsing.
>>
>> I agree that the grammar depicted in Section 4 is not appropriate for
>> parsing.  That grammar is not for parsing.  It's for generation.
>> Perhaps we should add instructions for how servers should parse the
>> Cookie header?  That's absent from the current document.
>
> We don't need more instructions.  We need a correct ABNF that tells
> us what to generate and what to parse.
>
>> As for generation, there's no exogenous notion of correctness.  We can
>> speak about usefulness, if you like, but correctness is endogenous to
>> this document.
>
> Then you should delete the introduction, since it says
>
>   This document specifies the syntax and semantics of these headers as
>   they are actually used on the Internet.  In particular, this document
>   does not create new syntax or semantics beyond those in use today.
>
>>> Why do you want to push forward a document with a known error in the
>>> grammar?  Just fix it.
>>
>> I disagree that it's an "error."
>
> Then please explain to Amazon why you want to break their site?
> Look at your browser's cookies for amazon.com and you will probably
> find cookies named session-token, at-main, and x-main that do not
> follow your grammar.  They are quoted strings and valid under all
> prior descriptions of the Cookie and Set-Cookie header fields.
>
> And while you are at it, maybe you'd like to explain to Google
> why you want to make Google Analytics cookies invalid.
>
> http://code.google.com/apis/analytics/docs/concepts/gaConceptsCookies.html
>
> The cookie named __utmz is not a quoted-string, not a token, and not base64.
>
>>> This is not a trivial detail.  As written, the specification says
>>> server developers SHOULD break sites like amazon.com.
>>>
>>>   cookie-value      = token / ""
>>
>> Can you explain how the developers of site B using a restricted
>> grammar for generation will break amazon.com?  That seems entirely
>> nonsensical.
>
> Site B is Amazon.com.  You are editing a proposed standard for the
> Internet and, last I checked, they use HTTP on the Internet.
>
>>> The correct grammar is in the original Netscape cookie spec: "This
>>> string is a sequence of characters excluding semi-colon, comma and
>>> white space."  Which easily translates (conservatively) into
>>>
>>>   cookie-value      = %x21-2B / %x2D-3A / %x3C-7E / %x80-FF
>>>
>>> and it does not conflict with section 5.
>>
>> I've updated the draft to recommend the grammar below for generating
>> cookie-values:
>>
>> cookie-value      = token / *base64-character
>> base64-character  = ALPHA / DIGIT / "+" / "/" / "="
>>
>> Do you have a particular use case in mind for generating Set-Cookie
>> headers outside this grammar?
>
> Yes.  Backwards compliance with existing content stored within
> several hundred million deployed browsers that won't expire
> until sometime in 2031 dictates that the grammar be
>
>  cookie-value      = %x21-2B / %x2D-3A / %x3C-7E / %x80-FF
>
> because servers have to parse Cookie in every form that they
> ever sent a Set-Cookie in the past.
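
To make the disagreement concrete, the three cookie-value grammars quoted in this thread can be compared side by side. The following is only a rough sketch under stated assumptions, not text from the draft: the names GRAMMARS and accepted_by are hypothetical, token is taken from RFC 2616, the Netscape rule is read as zero or more of the listed octets (as written above it names a single octet), and the __utmz-style value is made up to resemble the general shape of such cookies.

```python
import re

# token per RFC 2616: one or more CHARs except CTLs and separators.
TOKEN = rb"[!#$%&'*+\-.0-9A-Z^_`a-z|~]+"
# base64-character = ALPHA / DIGIT / "+" / "/" / "=", repeated zero or more times.
BASE64 = rb"[A-Za-z0-9+/=]*"

# Hypothetical labels for the three grammars discussed in the thread.
GRAMMARS = {
    # cookie-value = token / ""
    "draft-21": re.compile(rb"(?:" + TOKEN + rb")?"),
    # cookie-value = token / *base64-character
    "token-or-base64": re.compile(rb"(?:" + TOKEN + rb"|" + BASE64 + rb")"),
    # Assumption: the Netscape octet rule applied zero or more times.
    "netscape": re.compile(rb"[\x21-\x2B\x2D-\x3A\x3C-\x7E\x80-\xFF]*"),
}

def accepted_by(value: bytes) -> list:
    """Return the names of the grammars that match the whole value."""
    return [name for name, pattern in GRAMMARS.items() if pattern.fullmatch(value)]

if __name__ == "__main__":
    samples = [
        b"abc123",
        b'"quoted"',  # a quoted string, as described for the amazon.com cookies
        b"1.1297804307.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)",  # made-up value
        b"a;b",
    ]
    for sample in samples:
        print(sample, accepted_by(sample))
```

Under these assumptions, a quoted string and the made-up __utmz-style value match only the Netscape-derived rule, while a value containing a semicolon matches none of the three.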

You really think we should recommend that servers use invalid UTF-8
sequences as cookie-values?  That sounds like bad advice...
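
As a small illustration of the concern (a sketch only; the helper name is_valid_utf8 is hypothetical and not from the draft): a rule that admits any octet in %x80-FF also admits octet strings that are not valid UTF-8.

```python
def is_valid_utf8(value: bytes) -> bool:
    """Report whether the octets decode cleanly as UTF-8."""
    try:
        value.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

if __name__ == "__main__":
    # Every octet below is either ASCII or falls in %x80-FF.
    for sample in (b"abc", b"\xc3\xa9", b"\x80", b"\xff", b"\xc3"):
        print(sample, is_valid_utf8(sample))
```

Here b"\xc3\xa9" is a well-formed two-octet sequence, whereas a lone b"\x80", b"\xff", or a truncated b"\xc3" is not, even though each octet is individually permitted by the %x80-FF range.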

Adam


> Why are you arguing against this?  You are adamant about
> supporting the above syntax in browsers, for the same reasons,
> and there is no harm in supporting the complete Netscape syntax
> in the server grammar.
>
> ....Roy
>