Re: [abnf-discuss] constrained-01 - advantage?

Sean Leonard <dev+ietf@seantek.com> Tue, 15 November 2016 04:51 UTC

From: Sean Leonard <dev+ietf@seantek.com>
In-Reply-To: <ea711a12-2edd-917b-2588-2e87fed3c379@gmx.de>
Date: Tue, 15 Nov 2016 13:51:31 +0900
Content-Transfer-Encoding: quoted-printable
Message-Id: <3D847FB9-936C-4426-BCF1-DBE3E16326E6@seantek.com>
References: <5828DD42.8010009@gmail.com> <36FC0A35-2ADA-4710-ABFB-08E8B916718E@seantek.com> <a5d764a8-c560-bed3-095f-f1a1a5e35688@gmx.de> <F49254F8-41B7-499F-8745-5F2374693AA7@seantek.com> <ea711a12-2edd-917b-2588-2e87fed3c379@gmx.de>
To: Julian Reschke <julian.reschke@gmx.de>
X-Mailer: Apple Mail (2.3124)
Archived-At: <https://mailarchive.ietf.org/arch/msg/abnf-discuss/-UbNZaBE12xHlUwLnxd7cdwS3lc>
Cc: Doug Royer <douglasroyer@gmail.com>, abnf-discuss@ietf.org
Subject: Re: [abnf-discuss] constrained-01 - advantage?

> On Nov 14, 2016, at 2:12 PM, Julian Reschke <julian.reschke@gmx.de> wrote:
> 
> On 2016-11-14 05:55, Sean Leonard wrote:
>> ...
>>> See how RFC 7230 distinguishes between parsing the HTTP message into fields and field values, and then has completely *separate* ABNFs for each field value.
>> 
>> Yes, I saw that. It is a different approach from the “mail standards” approach, which uses a lot of / and =/ and extension-header syntaxes.
> 
> And RFC 2616 did that as well - it's just that we (that is the HTTPbis WG) decided that this use of ABNF isn't helpful -- IMHO the structure of the ABNF should be aligned with how messages are parsed; so this change in RFC 723x reflects that reality.

Okay.

>> It is desirable to express the relationship between the <Via> production, and the <field-name> “Via”. Specifically, when <field-name> is “Via”, then <field-value> is <Via>.
> 
> Yes, I wouldn't be opposed to a mechanism that just deals with that.
> 
>> A recent place where this leads to a protocol problem is the PKCS #11 URI scheme. RFC 7512 says that “|” can appear in <pk11-query-res-avail>, as a delimiter to some command-path. But “|” is not a part of URIs under [RFC3986]. Therefore, any [RFC3986] conforming URI parser is going to reject what RFC 7512 says are valid pkcs11: URIs. If the relationship between URI@[RFC3986] and pk11-URI@[RFC7512] could have been expressed formally (namely that pk11-URI is a subset of URI), then a validator could have easily flagged that problem during the editorial process.
> 
> I believe that's a much better example to use.


Sounds like we have a plan for progress then.
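
To make the PKCS #11 point concrete, here is a minimal sketch of the kind of check a validator could run. It only tests characters against the RFC 3986 character repertoire, which is a necessary condition for "pk11-URI is a subset of URI" and not a full grammar comparison. The function name and the sample URI are made up for illustration; the URI is not taken from RFC 7512.

```python
import string

# Characters that RFC 3986 permits somewhere in a URI.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")
URI_CHARS = UNRESERVED | GEN_DELIMS | SUB_DELIMS | {"%"}  # "%" introduces pct-encoded

def chars_outside_rfc3986(candidate):
    """Return the characters in candidate that no RFC 3986 production allows."""
    return sorted(set(candidate) - URI_CHARS)

# Hypothetical pkcs11: URI containing "|" in the query part (illustration only).
example = "pkcs11:object=my-key?pin-source=|/usr/bin/get-pin"
print(chars_outside_rfc3986(example))  # prints ['|']
```

A tool that knew the constrained rule was meant to be a subset of the generic one could run this sort of check over the literals of the constrained grammar itself, instead of over sample strings.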


Regarding the other stuff:

> 
>> A disadvantage of the RFC 7230 approach is that the relationship between the generic header production and specific headers is not formalized. For example, 7230 says:
>> 
>>   HTTP-message = start-line *( header-field CRLF ) CRLF [ message-body
>>    ]
>> 
>>   header-field = field-name ":" OWS field-value OWS
>> 
>>   Via = *( "," OWS ) ( received-protocol RWS received-by [ RWS comment
>>    ] ) *( OWS "," [ OWS ( received-protocol RWS received-by [ RWS
>>    comment ] ) ] )
> 
> Can you clarify why that is a problem?

When an author wants it, it is useful to verify that every rule in a specification is reachable from some starting rule. This helps to identify and eliminate orphan rules that the spec no longer uses. In the rules quoted above, nothing connects <header-field> to <Via>, so a reachability check that starts from <HTTP-message> cannot tell <Via> apart from an orphan.
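
As a rough sketch of such a reachability check, the following takes the three RFC 7230 rules quoted above (joined onto single lines) and reports which defined rules cannot be reached from a chosen start rule. The helper names are my own, and the parser is deliberately naive: it assumes one rule per line and does not handle =/ or comments.

```python
import re
from collections import deque

# The three rules quoted above, one rule per line.
ABNF = '''
HTTP-message = start-line *( header-field CRLF ) CRLF [ message-body ]
header-field = field-name ":" OWS field-value OWS
Via = *( "," OWS ) ( received-protocol RWS received-by [ RWS comment ] ) *( OWS "," [ OWS ( received-protocol RWS received-by [ RWS comment ] ) ] )
'''

RULE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]*")

def parse_rules(text):
    """Map each defined rule name (lowercased) to the set of names it references."""
    rules = {}
    for line in text.strip().splitlines():
        name, _, body = line.partition("=")
        # Drop quoted literals and numeric terminals so they are not read as rule names.
        body = re.sub(r'"[^"]*"', " ", body)
        body = re.sub(r"%[bdxBDX][0-9A-Fa-f.-]+", " ", body)
        rules[name.strip().lower()] = {m.lower() for m in RULE_NAME.findall(body)}
    return rules

def unreachable(rules, start):
    """Return the defined rules that a breadth-first walk from start never visits."""
    seen, queue = {start.lower()}, deque([start.lower()])
    while queue:
        for ref in rules.get(queue.popleft(), ()):
            if ref not in seen:
                seen.add(ref)
                queue.append(ref)
    return sorted(set(rules) - seen)

print(unreachable(parse_rules(ABNF), "HTTP-message"))  # prints ['via']
```

Rule names are compared in lowercase because ABNF rule names are case-insensitive. Names that are referenced but not defined in the input (start-line, OWS, and so on) are simply treated as leaves.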

> 
>> All that is very interesting, but if you want to verify that the ABNF is correct (or if you want to verify that sample HTTP messages conform to the ABNF), you have to do extra steps to extract each header-field and match them to the particular field values.
> 
> There are two questions here.
> 
> - if you want the validity of the message itself, you check the message ABNF
> 
> - if you want to check the validity of a given field value, you check the field value against the field's ABNF
> 

Yes. In this example, I would say that a message matching the “message ABNF” can contain a header line that conforms to the generic syntax but does not conform to the particular field-name -> field-value production. This should not be a fatal error in the sense that the (HTTP) message is unusable; it merely means that the particular field cannot be used. (The actual behavior is up to whatever spec applies.) I think we are on the same page here...
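
A small sketch of that two-level behavior, assuming regular expressions as stand-ins for the real grammars. The generic pattern only approximates <header-field>, and the Via pattern is a heavily simplified, hypothetical substitute for the RFC 7230 <Via> rule (no comments, no empty list elements). The function and result names are invented for this example.

```python
import re

# Generic level: loose approximation of
#   header-field = field-name ":" OWS field-value OWS
HEADER_FIELD = re.compile(
    r"^(?P<name>[!#$%&'*+.^_`|~0-9A-Za-z-]+):[ \t]*(?P<value>.*?)[ \t]*$"
)

# Specific level: simplified, hypothetical per-field rules keyed by field name.
VIA_ELEMENT = r"(?:[A-Za-z0-9]+/)?[A-Za-z0-9.]+[ \t]+[A-Za-z0-9.:-]+"
FIELD_RULES = {
    "via": re.compile(rf"^{VIA_ELEMENT}(?:[ \t]*,[ \t]*{VIA_ELEMENT})*$"),
}

def check_header_line(line):
    """Classify one header line against the generic rule, then the field's own rule."""
    generic = HEADER_FIELD.match(line)
    if generic is None:
        return "malformed"        # fails the generic syntax
    rule = FIELD_RULES.get(generic["name"].lower())
    if rule is None:
        return "ok-generic"       # no constrained rule known for this field
    # A mismatch here is not fatal for the message; only this field is unusable.
    return "ok" if rule.match(generic["value"]) else "field-unusable"

for line in ["Via: 1.1 proxy.example.com", "Via: ???", "X-Custom: anything", "no colon here"]:
    print(line, "->", check_header_line(line))
```

The lookup table is the glue that a formal field-name -> field-value relationship would make unnecessary: with the relationship expressed in the ABNF itself, a tool could derive it.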

Overall, for document authors, it helps to have a tool check that the subordinate rule (constrained rule) is strictly a subset of the generic rule. For development, debugging, testing, and analysis of data on the wire, it helps to use a single ABNF spec to check the generic data (generic rule) as well as the specific data (constrained rule) in one pass, without much additional code to glue the two together.

Regards,

Sean