Re: Ambiguities in header-field rules (p1-messaging)

Frank Mertens <frank@cyblogic.de> Thu, 18 August 2011 08:09 UTC

Message-ID: <4E4CC8A3.5070503@cyblogic.de>
Date: Thu, 18 Aug 2011 10:09:07 +0200
From: Frank Mertens <frank@cyblogic.de>
To: ietf-http-wg@w3.org
References: <4E4C013D.2090407@cyblogic.de> <88b489507e504d9eef318438194f929e@treenet.co.nz>
In-Reply-To: <88b489507e504d9eef318438194f929e@treenet.co.nz>
Archived-At: <http://www.w3.org/mid/4E4CC8A3.5070503@cyblogic.de>

On 08/18/2011 05:16 AM, Amos Jeffries wrote:
> On Wed, 17 Aug 2011 19:58:21 +0200, Frank Mertens wrote:
>> Hi,
>>
>> I played around with the ABNF published by this WG and stumbled
>> over some rough edges.
>>
>> Current rules:
>>
>> OWS = *( [ obs-fold ] WSP )
>> header-field = field-name ":" OWS [ field-value ] OWS
>> field-value = *( field-content / OWS )
>> field-content = *( WSP / VCHAR / obs-text )
>>
>> Problems:
>>
>> - field-value and field-content both match the empty string,
>> which requires searching for the longest match, which is costly
>> (and confusing for the human reader)
>> - because field-value matches the empty string, making it optional
>> in header-field allows ambiguous productions of the same length
>> (with or without a zero-length field-value?)
>>
>> Suggested improvement:
>>
>> field-value = 1*( field-content OWS )
>> field-content = 1*( VCHAR / WSP / obs-text )
>>
>> Best Regards,
>> Frank Mertens.
>
>
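
The empty-match problem described in the quoted message can be checked mechanically. The following is only a sketch: the tuple encoding of the ABNF and the helper names (nullable, empty_match_ambiguities) are made up for illustration, and terminals are reduced to "consumes at least one octet". It reports rules that wrap an already-nullable element in [ ] or *( ), which is the source of the same-length ambiguities listed under "Problems". It does not detect other overlaps, such as WSP being reachable through both field-content and OWS.

```python
# Tiny ABNF model (hypothetical encoding, for illustration only):
#   ("lit",)          a terminal that consumes at least one octet
#   ("ref", name)     reference to another rule
#   ("seq", a, b...)  concatenation
#   ("alt", a, b...)  alternation
#   ("rep", n, a)     n*( a )
#   ("opt", a)        [ a ]
LIT = ("lit",)


def ref(name):
    return ("ref", name)


CHARS = ("alt", ref("WSP"), ref("VCHAR"), ref("obs-text"))
BASE = {
    "field-name": LIT, "WSP": LIT, "VCHAR": LIT,
    "obs-text": LIT, "obs-fold": LIT,
    "OWS": ("rep", 0, ("seq", ("opt", ref("obs-fold")), ref("WSP"))),
    "header-field": ("seq", ref("field-name"), LIT, ref("OWS"),
                     ("opt", ref("field-value")), ref("OWS")),
}
ORIGINAL = {
    **BASE,
    "field-value": ("rep", 0, ("alt", ref("field-content"), ref("OWS"))),
    "field-content": ("rep", 0, CHARS),
}
SUGGESTED = {
    **BASE,
    "field-value": ("rep", 1, ("seq", ref("field-content"), ref("OWS"))),
    "field-content": ("rep", 1, CHARS),
}


def nullable(node, rules):
    """True if node can match the empty string (rules must be non-recursive)."""
    kind = node[0]
    if kind == "lit":
        return False
    if kind == "ref":
        return nullable(rules[node[1]], rules)
    if kind == "seq":
        return all(nullable(child, rules) for child in node[1:])
    if kind == "alt":
        return any(nullable(child, rules) for child in node[1:])
    if kind == "rep":
        return node[1] == 0 or nullable(node[2], rules)
    if kind == "opt":
        return True
    raise ValueError(kind)


def walk(node):
    """Yield node and all nested nodes, without following rule references."""
    yield node
    if node[0] in ("seq", "alt"):
        for child in node[1:]:
            yield from walk(child)
    elif node[0] == "rep":
        yield from walk(node[2])
    elif node[0] == "opt":
        yield from walk(node[1])


def empty_match_ambiguities(rules):
    """Names of rules that put a nullable element inside [ ] or *( )."""
    flagged = []
    for name, body in rules.items():
        for node in walk(body):
            inner = node[2] if node[0] == "rep" else (
                node[1] if node[0] == "opt" else None)
            if inner is not None and nullable(inner, rules):
                flagged.append(name)
                break
    return sorted(flagged)


print(empty_match_ambiguities(ORIGINAL))   # ['field-value', 'header-field']
print(empty_match_ambiguities(SUGGESTED))  # []
```

With the suggested rules neither field-value nor field-content is nullable any more, so the optional [ field-value ] in header-field has exactly one way to match an empty value: by being absent.
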
> The OWS on header-field remains ambiguous as well.
>
> Also, with WSP being in field-content there is the possibility of header-field matching:
>
> field-name ":" [ obs-fold ] 1*( WSP OWS ) OWS
>
> Nasty. But section 3.2 comes to the rescue:
> "The field value does not include any leading or trailing white space"
> and
> "HTTP/1.1 senders MUST NOT produce messages that include line folding"
>
> So OWS in the field-value ABNF appears to be invalid in several ways going by the text.
>
>
> Perhaps this would be better:
>
> header-field = field-name ":" [ WSP ] BWS [ field-value ]
> field-value = 1*( field-content BWS )
> field-content = 1*( VCHAR / WSP / obs-text )
>
>
>
>
> Nit: section 1.2.2 currently says:
>
> "Multiple OWS octets that occur within field-content
> SHOULD be replaced with a single SP before interpreting the field
> value or forwarding the message downstream."
> ...
> "Multiple RWS octets that occur within field-content SHOULD be
> replaced with a single SP before interpreting the field value or
> forwarding the message downstream."
>
> But there is no OWS or RWS in the field-content ABNF.
>
> I think both should say header-field instead of field-content. Or maybe drop the "within field-content" condition to make it general.
>
>
> AYJ
>
>

Yes, we should also have a strict version of the grammar.
But for now, I'm happy with a working tolerant one ;)
Replacing OWS with BWS would also disable support for line folding.
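
To make the line-folding point concrete, here is a rough regex translation of the tolerant rules (field-value = 1*( field-content OWS )). Several things in it are assumptions and not taken from the draft text quoted above: obs-fold is modelled as a bare CRLF, field-name is reduced to a simple character class instead of the real token rule, and PLAIN_WS is a hypothetical whitespace rule without obs-fold, used only for comparison.

```python
import re

# Assumption: obs-fold is a bare CRLF, so OWS = *( [ CRLF ] WSP ).
OWS = r"(?:(?:\r\n)?[ \t])*"
# Hypothetical fold-free whitespace rule, for comparison only.
PLAIN_WS = r"[ \t]*"
# field-content = 1*( VCHAR / WSP / obs-text ), on Latin-1 decoded text.
FIELD_CONTENT = r"[\x21-\x7e \t\x80-\xff]+"
# Simplified stand-in for field-name (the real rule is "token").
FIELD_NAME = r"[A-Za-z0-9-]+"


def header_field(ws):
    """Compile header-field = field-name ":" ws [ field-value ] ws."""
    # The nested repetition mirrors the grammar; since WSP is reachable
    # through both field-content and ws, long non-matching inputs may
    # backtrack heavily. Fine for a demonstration, not for production.
    field_value = rf"(?:{FIELD_CONTENT}{ws})+"
    return re.compile(rf"{FIELD_NAME}:{ws}(?:{field_value})?{ws}")


tolerant = header_field(OWS)
fold_free = header_field(PLAIN_WS)

folded = "Subject: first\r\n second"
print(bool(tolerant.fullmatch(folded)))    # True
print(bool(fold_free.fullmatch(folded)))   # False
print(bool(fold_free.fullmatch("Subject: first second")))  # True
```

Only the variant whose whitespace rule includes obs-fold accepts the folded header; both accept the unfolded form and an empty value such as "Subject:".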

FM