[http-state] algorithm definitions
Julian Reschke <julian.reschke@gmx.de> Fri, 16 July 2010 13:42 UTC

Hi,

from
<http://tools.ietf.org/html/draft-ietf-httpstate-cookie-09#section-5.2>:

   A user agent MUST use an algorithm equivalent to the following
   algorithm to parse set-cookie-strings:

   1.  If the set-cookie-string contains a U+003B (";") character:

          The name-value-pair string consists of the characters up
          to, but not including, the first U+003B (";"), and the
          unparsed-attributes consist of the remainder of the
          set-cookie-string (including the U+003B (";") in
          question).

       Otherwise:

          The name-value-pair string consists of all the characters
          contained in the set-cookie-string, and the
          unparsed-attributes is the empty string.

   2.  If the name-value-pair string lacks a U+003D ("=") character,
       ignore the set-cookie-string entirely.

   3.  The (possibly empty) name string consists of the characters
       up to, but not including, the first U+003D ("=") character,
       and the (possibly empty) value string consists of the
       characters after the first U+003D ("=") character.

   4.  Remove any leading or trailing WSP characters from the name
       string and the value string.

   5.  If the name string is empty, ignore the set-cookie-string
       entirely.

   6.  The cookie-name is the name string, and the cookie-value is
       the value string.

   The user agent MUST use an algorithm equivalent to the following
   algorithm to parse the unparsed-attributes:

   1.  If the unparsed-attributes string is empty, skip the rest of
       these steps.

   2.  Discard the first character of the unparsed-attributes (which
       will be a U+003B (";") character).

   3.  If the remaining unparsed-attributes contains a U+003B (";")
       character:

          Consume the characters of the unparsed-attributes up to,
          but not including, the first U+003B (";") character.

       Otherwise:

          Consume the remainder of the unparsed-attributes.

       Let the cookie-av string be the characters consumed in this
       step.

   4.  If the cookie-av string contains a U+003D ("=") character:

          The (possibly empty) attribute-name string consists of the
          characters up to, but not including, the first U+003D
          ("=") character, and the (possibly empty) attribute-value
          string consists of the characters after the first U+003D
          ("=") character.

       Otherwise:

          The attribute-name string consists of the entire cookie-av
          string, and the attribute-value string is empty.

   5.  Remove any leading or trailing WSP characters from the
       attribute-name string and the attribute-value string.

   6.  Process the attribute-name and attribute-value according to
       the requirements in the following subsections.  (Notice that
       attributes with unrecognized attribute-names are ignored.)

   7.  Return to Step 1.

Wow -- all of this to say that a string should be tokenized where ";"
occurs, that the first token and the remaining tokens have different
roles, and how to parse the individual tokens.

A few ideas on how to compress this:

- If step 2 of the second algorithm discards the leading semicolon,
  why include it in the unparsed-attributes in the first place?

- Maybe just say ";" and "=" after stating the Unicode code point
  once? Speaking of which, is *anybody* confused about what these
  characters might be?

- Instead of expressing a for-loop in prose, simply state that the
  string is to be split on semicolons, and that a certain set of
  steps is to be applied to each fragment.

Etc.

I've heard that this part is exclusively for those who actually write
the parsing code, and that nobody else needs to care. I disagree with
that. If the spec makes normative requirements on handling
non-conforming input, then it should be phrased in a way that makes
clear what gets processed how. Giving an example of a conforming
algorithm is fine, but substituting the description with that
algorithm IMHO is not.

For instance, when I debug an HTTP/cookie problem and look at an HTTP
trace, I want to be able to understand how the recipient is going to
parse the string. Reading the algorithm really isn't very helpful for
that.

Also, if we need algorithms instead of format descriptions, why is it
OK to define date parsing using an ABNF (see section 5.1.1)?

Best regards, Julian
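
For illustration, the "split on semicolons, then apply a set of steps
to each fragment" reading suggested above can be sketched in Python.
This is a minimal sketch under stated assumptions, not text from the
draft: the function name parse_set_cookie_string and the shape of the
returned tuple are hypothetical, WSP is taken to mean space and
horizontal tab, and attributes are only collected as (name, value)
pairs rather than processed according to the draft's later
subsections.

```python
WSP = " \t"  # assumption: WSP means space and horizontal tab


def parse_set_cookie_string(set_cookie_string):
    """Hypothetical helper: return (name, value, attributes),
    or None if the set-cookie-string is to be ignored."""
    # Tokenize on ";": the first fragment is the name-value-pair,
    # every later fragment is one cookie-av.
    name_value_pair, *cookie_avs = set_cookie_string.split(";")

    # A name-value-pair without "=" means the string is ignored.
    if "=" not in name_value_pair:
        return None

    # Split on the first "=" only; later "=" stay in the value.
    name, _, value = name_value_pair.partition("=")
    name, value = name.strip(WSP), value.strip(WSP)

    # An empty name also means the string is ignored.
    if not name:
        return None

    attributes = []
    for cookie_av in cookie_avs:
        # partition() leaves the attribute-value empty when the
        # cookie-av contains no "=".
        attr_name, _, attr_value = cookie_av.partition("=")
        attributes.append((attr_name.strip(WSP), attr_value.strip(WSP)))

    return name, value, attributes
```

Calling parse_set_cookie_string("SID=31d4d96e407aad42; Path=/; Secure")
yields the name "SID", the value "31d4d96e407aad42", and the attribute
pairs ("Path", "/") and ("Secure", ""). A trailing ";" produces one
attribute with an empty name, which matches what the step-by-step
loop does for an empty cookie-av.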