Re: [http-state] algorithm definitions

Adam Barth <ietf@adambarth.com> Sat, 17 July 2010 17:39 UTC

MIME-Version: 1.0
In-Reply-To: <4C4061C3.6090606@gmx.de>
References: <4C4061C3.6090606@gmx.de>
From: Adam Barth <ietf@adambarth.com>
Date: Sat, 17 Jul 2010 10:39:05 -0700
Message-ID: <AANLkTikiTa6YDrTdRYfkDEM6GJx8tuhBBe6lMsRxZ5FM@mail.gmail.com>
To: Julian Reschke <julian.reschke@gmx.de>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
Cc: "http-state@ietf.org" <http-state@ietf.org>
Subject: Re: [http-state] algorithm definitions
Precedence: list

On Fri, Jul 16, 2010 at 6:42 AM, Julian Reschke <julian.reschke@gmx.de> wrote:
> Wow -- all of this to say that a string should be tokenized where ";"
> occurs, that the first token and the remaining tokens have different roles,
> and how to parse the individual tokens.
>
> A few ideas how to compress this:

What problem are we trying to solve by compressing this presentation?
The current presentation is only two pages of a 38-page document.
IMHO, parsing the set-cookie-string is the most important algorithm in
the document.  I think it's fine to dedicate 5% of the document to
getting it exactly right.

> - If part 2 / step 1 removes the leading semicolon, why include it in the
> first place?

Why not?  It doesn't matter from a conformance point of view.  It just
makes it easier for me to ensure that I've gotten everything 100%
correct.

> - Maybe just say ";" and "=" after stating the Unicode code point once?
> Speaking of which, is *anybody* confused about what these characters might
> be?

Sounds like an editorial issue.  You have your favorite pedantries, I have mine.

> - Instead of expressing a for-loop in prose, simply state that the string is
> to be split on semicolons, and a certain set of steps is to be applied to
> each fragment.

I don't see that as an improvement over the current text.

> I've heard that this part is exclusively for those who actually write the
> parsing code, and nobody else needs to care. I disagree with that. If the
> spec makes normative requirements on handling non-conforming input, then it
> should be phrased in a way so that it's clear what gets processed how.

The current text is as clear as I can make it about how every sequence
of octets in a Set-Cookie header field is processed.

> Giving an example of a conforming algorithm is fine, but substituting the
> description with that algorithm IMHO is not.

IMHO, it is.  I guess we'll have to agree to disagree.

> For instance, when I debug an HTTP/cookie problem and look at an HTTP trace,
> I want to be able to understand how the recipient is going to parse the
> string. Reading the algorithm really isn't very helpful for that.

Actually, it should be very helpful.  Just mentally process the string
using the steps outlined in the document and you'll see exactly what
the UA is supposed to do.
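
To illustrate what "mentally process the string" amounts to, here is a
rough Python sketch of the two parts discussed in this thread: split
off the name-value pair at the first ";", then repeatedly peel
attributes off the remainder, dropping the leading ";" each time.  It
is an illustration only, not the draft's normative text; the function
name, the return shape, and treating whitespace as space and tab are
assumptions made for the example.

```python
WSP = " \t"  # assumption: "whitespace" means space and horizontal tab


def parse_set_cookie_string(s):
    """Return (name, value, attributes), or None if the string is ignored."""
    # Part 1: everything before the first ";" is the name-value pair.
    # The remainder keeps its leading ";" (it is removed in part 2).
    pair, sep, rest = s.partition(";")
    unparsed = sep + rest

    # A name-value pair without "=" means the whole string is ignored.
    if "=" not in pair:
        return None
    name, _, value = pair.partition("=")
    name, value = name.strip(WSP), value.strip(WSP)
    if not name:
        return None

    # Part 2: loop over the unparsed attributes.
    attributes = []
    while unparsed:
        unparsed = unparsed[1:]  # step 1: discard the leading ";"
        av, sep, rest = unparsed.partition(";")
        unparsed = sep + rest
        # An attribute without "=" has an empty value.
        av_name, _, av_value = av.partition("=")
        attributes.append((av_name.strip(WSP), av_value.strip(WSP)))
    return name, value, attributes
```

For example, parse_set_cookie_string("SID=31d4; Path=/; Secure") yields
("SID", "31d4", [("Path", "/"), ("Secure", "")]), while a string whose
first piece has no "=" yields None.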

> Also, if we need algorithms instead of format descriptions, why is it ok to
> define date parsing using an ABNF (see section 5.1.1)?

Would you prefer I switched the date parser to be more algorithmic?
In that case, I was able to find a precise presentation of the date
parsing algorithm that I thought would appeal more to IETF aesthetics.
If you read that section carefully, you'll notice that the ABNF
doesn't tell the whole story about how to parse cookie-dates.  You
still need the algorithmic aspects below to get the right answer, but
I tried to make it as declarative as possible.

On Fri, Jul 16, 2010 at 8:50 AM, Bjoern Hoehrmann <derhoermi@gmx.net> wrote:
> * Julian Reschke wrote:
>>Also, if we need algorithms instead of format descriptions, why is it ok
>>to define date parsing using an ABNF (see section 5.1.1)?
>
> There is no need for an algorithm, for instance, "parsing unparsed
> attributes" is just greedily matching against e.g. (using XML's EBNF)
>
>  attributes ::= (';' s* av-name s* '='? s* av-value s*)*
>  av-name    ::= [^=;]* ([^=;] - s) | ''
>  av-value   ::= [^;]* ([^;] - s) | ''
>
> For the initial key-value-pair the grammar would be similar, you would
> just make the empty string an invalid name and require the "=" sign,
> and specify that failure to match means to ignore the whole thing.

I'm not familiar with EBNF (nor am I keen on introducing yet
another grammar type into this document).

I'm sure there are 1000 different presentations we could use for the
set-cookie-string parser.  The one we have seems fine to me.

Adam
