Re: [http-state] algorithm definitions

Adam Barth <ietf@adambarth.com> Sat, 17 July 2010 17:39 UTC

In-Reply-To: <4C4061C3.6090606@gmx.de>
References: <4C4061C3.6090606@gmx.de>
From: Adam Barth <ietf@adambarth.com>
Date: Sat, 17 Jul 2010 10:39:05 -0700
Message-ID: <AANLkTikiTa6YDrTdRYfkDEM6GJx8tuhBBe6lMsRxZ5FM@mail.gmail.com>
To: Julian Reschke <julian.reschke@gmx.de>
Cc: "http-state@ietf.org" <http-state@ietf.org>
Subject: Re: [http-state] algorithm definitions

On Fri, Jul 16, 2010 at 6:42 AM, Julian Reschke <julian.reschke@gmx.de> wrote:
> Wow -- all of this to say that a string should be tokenized where ";"
> occurs, that the first token and the remaining tokens have different roles,
> and how to parse the individual tokens.
>
> A few ideas how to compress this:

What problem are we trying to solve by compressing this presentation?
The current presentation is only two pages of a 38-page document.
IMHO, parsing the set-cookie-string is the most important algorithm in
the document.  I think it's fine to dedicate 5% of the document to
getting it exactly right.

> - If part 2 / step 1 removes the leading semicolon, why include it in the
> first place?

Why not?  It doesn't matter from a conformance point of view.  It just
makes it easier for me to ensure that I've gotten everything 100%
correct.

> - Maybe just say ";" and "=" after stating the Unicode code point once?
> Speaking of which, is *anybody* confused about what these characters might
> be?

Sounds like an editorial issue.  You have your favorite pedantries, I have mine.

> - Instead of expressing a for-loop in prose, simply state that the string is
> to be split on semicolons, and a certain set of steps is to be applied to
> each fragment.

I don't see that as an improvement over the current text.

> I've heard that this part is exclusively for those who actually write the
> parsing code, and nobody else need to care. I disagree with that. If the
> spec makes normative requirements on handling non-conforming input, then it
> should be phrased in a way so that it's clear what gets processed how.

The current text is as clear as I can make it about how every sequence
of octets in a Set-Cookie header field is processed.

> Giving an example of a conforming algorithm is fine, but substituting the
> description with that algorithm IMHO is not.

IMHO, it is.  I guess we'll have to agree to disagree.

> For instance, when I debug an HTTP/cookie problem and look at an HTTP trace,
> I want to be able to understand how the recipient is going to parse the
> string. Reading the algorithm really isn't very helpful for that.

Actually, it should be very helpful.  Just mentally process the string
using the steps outlined in the document and you'll see exactly what
the UA is supposed to do.
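
To make that mental walk-through concrete, here is a minimal Python
sketch of the two-part procedure.  It is modeled on the steps as they
appear in section 5.2 of the later published specification (RFC 6265),
so the draft text under discussion may differ in detail.  The function
name parse_set_cookie_string and the returned tuple shape are
illustrative assumptions, and processing of individual attributes
(Expires, Path, and so on) is left out.

```python
WSP = " \t"  # whitespace stripped from names and values (space and tab)


def parse_set_cookie_string(s):
    """Return (name, value, attributes), or None if the string is ignored.

    Hypothetical helper: a sketch of the parsing steps, not the spec text.
    """
    # Part 1: split off the name-value-pair at the first ";".
    semi = s.find(";")
    if semi == -1:
        name_value_pair, unparsed = s, ""
    else:
        # The remainder deliberately keeps its leading ";".
        name_value_pair, unparsed = s[:semi], s[semi:]

    # No "=" at all: ignore the whole set-cookie-string.
    if "=" not in name_value_pair:
        return None
    name, _, value = name_value_pair.partition("=")
    name, value = name.strip(WSP), value.strip(WSP)
    # An empty name also means the whole string is ignored.
    if not name:
        return None

    # Part 2: consume the unparsed attributes one ";"-delimited piece at a time.
    attributes = []
    while unparsed:
        unparsed = unparsed[1:]  # step 1: discard the leading ";"
        semi = unparsed.find(";")
        if semi == -1:
            cookie_av, unparsed = unparsed, ""
        else:
            cookie_av, unparsed = unparsed[:semi], unparsed[semi:]
        # Without "=", the whole piece is the name and the value is empty.
        attr_name, _, attr_value = cookie_av.partition("=")
        attributes.append((attr_name.strip(WSP), attr_value.strip(WSP)))
    return name, value, attributes
```

For example, parse_set_cookie_string("SID=31d4d96e407aad42; Path=/; Secure")
yields the name SID, the value 31d4d96e407aad42, and the attribute
pairs (Path, /) and (Secure, empty string).  A string such as "foo",
which has no "=", is ignored entirely.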

> Also, if we need algorithms instead of format descriptions, why is it ok to
> define date parsing using an ABNF (see section 5.1.1)?

Would you prefer I switched the date parser to be more algorithmic?
In that case, I was able to find a precise presentation of the date
parsing algorithm that I thought would appeal more to IETF aesthetics.
If you read that section carefully, you'll notice that the ABNF
doesn't tell the whole story about how to parse cookie-dates.  You
still need the algorithmic aspects below to get the right answer, but
I tried to make it as declarative as possible.
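
A small Python sketch may show why the grammar alone is not enough.
It is modeled on section 5.1.1 as it appears in the later published
specification (RFC 6265), so the draft under discussion may differ.
The grammar supplies the token shapes (the regular expressions below),
while the found-flags, the two-digit year adjustment, and the range
checks are the algorithmic part.  The name parse_cookie_date and the
choice to return None on failure are illustrative assumptions.

```python
import re
from datetime import datetime, timezone

# Token shapes taken from the grammar (ASCII digits only).
DELIMITER = re.compile(r"[\x09\x20-\x2F\x3B-\x40\x5B-\x60\x7B-\x7E]+")
TIME = re.compile(r"(\d{1,2}):(\d{1,2}):(\d{1,2})(?:\D.*)?", re.ASCII | re.DOTALL)
DAY = re.compile(r"(\d{1,2})(?:\D.*)?", re.ASCII | re.DOTALL)
YEAR = re.compile(r"(\d{2,4})(?:\D.*)?", re.ASCII | re.DOTALL)
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def parse_cookie_date(text):
    """Return an aware UTC datetime, or None if parsing fails (a sketch)."""
    time = day = month = year = None  # None plays the role of "flag not set"
    for token in DELIMITER.split(text):
        if not token:
            continue
        # Each token fills at most one field; the first unset field wins.
        if time is None and (m := TIME.fullmatch(token)):
            time = tuple(int(g) for g in m.groups())
        elif day is None and (m := DAY.fullmatch(token)):
            day = int(m.group(1))
        elif month is None and token[:3].lower() in MONTHS:
            month = MONTHS.index(token[:3].lower()) + 1
        elif year is None and (m := YEAR.fullmatch(token)):
            year = int(m.group(1))

    if None in (time, day, month, year):
        return None
    # Two-digit years: 70-99 mean 19xx, 0-69 mean 20xx.
    if 70 <= year <= 99:
        year += 1900
    elif 0 <= year <= 69:
        year += 2000
    hour, minute, second = time
    if not 1 <= day <= 31 or year < 1601:
        return None
    if hour > 23 or minute > 59 or second > 59:
        return None
    try:
        return datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    except ValueError:
        return None  # e.g. 30 Feb: matches the grammar but is not a real date
```

Note that "30 Feb 2010 00:00:00" matches every token shape and still
has to be rejected, which is the kind of case the grammar cannot
express by itself.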

On Fri, Jul 16, 2010 at 8:50 AM, Bjoern Hoehrmann <derhoermi@gmx.net> wrote:
> * Julian Reschke wrote:
>>Also, if we need algorithms instead of format descriptions, why is it ok
>>to define date parsing using an ABNF (see section 5.1.1)?
>
> There is no need for an algorithm, for instance, "parsing unparsed
> attributes" is just greedily matching against e.g. (using XML's EBNF)
>
>  attributes ::= (';' s* av-name s* '='? s* av-value s*)*
>  av-name    ::= [^=;]* ([^=;] - s) | ''
>  av-value   ::= [^;]* ([^;] - s) | ''
>
> For the initial key-value-pair the grammar would be similar, you would
> just make the empty string an invalid name and require the "=" sign,
> and specify that failure to match means to ignore the whole thing.
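
For comparison, the quoted grammar can be approximated with a regular
expression.  This is only a sketch: it assumes that "s" in the grammar
means space or horizontal tab (the message does not define it), and
the names ATTRIBUTE and match_attributes are made up for illustration.

```python
import re

# One repetition of: ';' s* av-name s* '='? s* av-value s*
# Assumption: "s" is space or horizontal tab.
ATTRIBUTE = re.compile(
    r";[ \t]*"
    r"(?P<name>(?:[^=;]*[^=; \t])?)"   # av-name: ends in a non-space, or empty
    r"[ \t]*=?[ \t]*"
    r"(?P<value>(?:[^;]*[^; \t])?)"    # av-value: ends in a non-space, or empty
    r"[ \t]*"
)


def match_attributes(unparsed):
    """Greedily match the whole string; return (name, value) pairs or None."""
    pairs = []
    pos = 0
    while pos < len(unparsed):
        m = ATTRIBUTE.match(unparsed, pos)
        if m is None:
            return None  # input does not start an attribute with ";"
        pairs.append((m.group("name"), m.group("value")))
        pos = m.end()  # always advances: each match consumes at least ";"
    return pairs
```

With this, match_attributes("; Path=/; Secure") gives the pairs
(Path, /) and (Secure, empty string), with surrounding whitespace
already excluded from the names and values.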

I'm not familiar with EBNF (nor am I keen on introducing yet
another grammar type into this document).

I'm sure there are 1000 different presentations we could use for the
set-cookie-string parser.  The one we have seems fine to me.

Adam