Re: ABNF sets and sequences

Dave Crocker <dcrocker@brandenburg.com> Tue, 11 July 2000 09:28 UTC

Message-Id: <4.3.2.20000711173432.00c043e0@mail.bayarea.net>
X-Sender: dcrocker@mail.bayarea.net
X-Mailer: QUALCOMM Windows Eudora Version 4.3
Date: Tue, 11 Jul 2000 18:26:27 +0900
To: drums@cs.utk.edu
From: Dave Crocker <dcrocker@brandenburg.com>
Subject: Re: ABNF sets and sequences
In-Reply-To: <7683695.3172227524@nifty-jr.west.sun.com>
References: <4.3.2.20000709223011.00bfebf0@mail.bayarea.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format="flowed"
List-Unsubscribe: <mailto:drums-request@cs.utk.edu?Subject=unsubscribe>

PRIMARY COMMENT:

At 02:18 PM 7/10/00 -0700, Chris Newman wrote:
>(1) Syntax for unordered lists.  But since we already have *(a / b / c), 
>we don't need to add anything.

While I would like to explore my original concerns and proposal further, I 
frankly will consider this thread adequately productive for having surfaced 
the above sentence sufficiently to give me guidance for some additional 
text in the ABNF specification, so IT can give guidance to users of ABNF.



FURTHER DISCUSSION

The thin edge of the wedge I will use to keep the door open is RFC 1894, as 
an example of the above insight not making it into a specification  -- and 
please note the use of the word "example", which means that there is no 
criticism of 1894 itself, just interest in it as representative of ABNF 
writing style.  Also there is the question of *( a / b / c )'s adequacy.

RFC 1894 was written in 1996.  Perhaps the insight is more recent?

Will adding advisory "usage" text to the ABNF specification be sufficient 
to help future writers use the *(a / b / c) construct?

Is the *(a/b/c) construct sufficient for 1894?  My effort at re-formulation 
produces:

     per-message-fields = *(
           ( original-envelope-id-field CRLF )
           / ( reporting-mta-field CRLF )
           / ( dsn-gateway-field CRLF )
           / ( received-from-mta-field CRLF )
           / ( arrival-date-field CRLF )
           / ( extension-field CRLF ) )

The main point is that the distinction between optional and required 
fields is entirely removed from the syntax, as is the restriction of 
multiple occurrences to extension-field.
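
A minimal sketch of that loss, in Python.  It assumes each field together 
with its CRLF can be treated as one opaque token; the token encoding and 
the accepts() helper are made up for illustration and are not part of 
RFC 1894.

```python
import re

FIELDS = [
    "original-envelope-id-field",
    "reporting-mta-field",
    "dsn-gateway-field",
    "received-from-mta-field",
    "arrival-date-field",
    "extension-field",
]

# *( a / b / c ) over whole tokens: any listed name, any number of
# times, in any order.  Each token is written as "name ".
LOOSE = re.compile("(?:(?:%s) )*" % "|".join(re.escape(f) for f in FIELDS))


def accepts(names):
    """Return True if the sequence of field names matches the loose rule."""
    text = "".join(name + " " for name in names)
    return LOOSE.fullmatch(text) is not None


# No reporting-mta-field at all, and arrival-date-field twice:
# the loose rule still accepts the sequence.
print(accepts(["arrival-date-field", "arrival-date-field"]))
# A name outside the alternation is the only thing it rejects.
print(accepts(["subject"]))
```

The only thing the rule still checks is membership in the alternation; 
presence and repetition are no longer visible to it.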



DRUMS EXEMPLAR

With respect to draft-ietf-drums-msg-fmt-07.txt, there seems to be a problem:

>message         =       (fields / obs-fields)   [CRLF body]
>
>fields          =       *(trace
>                           *(resent-date /
>                            resent-from /
>                            resent-sender /
>                            resent-to /
>                            resent-cc /
>                            resent-bcc /
>                            resent-id))
>                         *(orig-date /
>                         from /
>                         sender /
>                         reply-to /
>                         to /
>                         cc /
>                         bcc /
>                         message-id /
>                         in-reply-to /
>                         references /
>                         subject /
>                         comments /
>                         keywords /
>                         optional-field)

does NOT fully show the non-ordering that is cited in:

>....  So from a programmer viewpoint 822bis is heading in exactly the 
>right direction -- combine the *(a / b / c) unordered-list syntax with a 
>table of additional restrictions.

or did I miss something?

It looks as if, for example, the syntax requires trace information to 
precede the From field.
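
The ordering can be checked with a rough Python sketch of the quoted 
"fields" rule.  It treats every field as an opaque token and assumes that 
optional-field does not itself match a trace field; that assumption, the 
token encoding and the follows_rule() helper are illustrative only and 
are not taken from the draft.

```python
import re

RESENT = ["resent-date", "resent-from", "resent-sender", "resent-to",
          "resent-cc", "resent-bcc", "resent-id"]
OTHER = ["orig-date", "from", "sender", "reply-to", "to", "cc", "bcc",
         "message-id", "in-reply-to", "references", "subject",
         "comments", "keywords", "optional-field"]


def alt(names):
    """Build a regex alternation over whole tokens written as 'name '."""
    return "(?:" + "|".join(re.escape(n) + " " for n in names) + ")"


# *(trace *(resent-...)) followed by *(orig-date / from / ...)
FIELDS_RULE = re.compile("(?:trace %s*)*%s*" % (alt(RESENT), alt(OTHER)))


def follows_rule(names):
    """Return True if the token sequence matches the quoted 'fields' rule."""
    text = "".join(name + " " for name in names)
    return FIELDS_RULE.fullmatch(text) is not None


print(follows_rule(["trace", "from", "to"]))   # trace first
print(follows_rule(["from", "trace"]))         # trace after From
```

Under those assumptions the first sequence matches and the second does 
not, so the rule is unordered only within each of its two groups.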



PRECISE NUMBER OF OCCURRENCES

>(2) Syntax for a list of attribute-value pairs where each attribute can 
>occur only once.  This isn't useful in the message format case since the 
>rules are far more complex.

0.  "only once"... or any specified number of times.

1.  In fact, the *( a/b/c ) construct guarantees loss of all syntactic 
control over the number of times.

         *** Specification of number of permitted/required occurrences
         *** must be placed into the prose.

2.  Can we get some elaboration on the "far more complex" assessment?  The 
formal rules in RFC822 permit zero or one header in most cases, requiring 
only a few.  With respect to generic multiple occurrences of the same 
header RFC822 states

>      4.1.  SYNTAX
>...
>             This specification permits multiple  occurrences  of  most
>             fields.   Except  as  noted,  their  interpretation is not
>             specified here, and their use is discouraged.

However, most implementations treat multiple occurrences as if they were 
concatenated into a single header.

My own feeling is that the rules pertaining to number of occurrences of 
list members tend to be pretty simple, along the lines of:

         1.  Exactly once
         2.  At most once
         3.  Zero or more
         4.  One or more

and that other rules are very rare indeed.
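
As a rough illustration of how little is needed, the four rules can be 
kept in a table next to an unordered *( a / b / c ) list and checked by 
counting.  The names a, b, c, d, the TABLE contents and the violations() 
helper below are hypothetical; they are not the actual RFC822 or DRUMS 
restrictions.

```python
from collections import Counter

# (minimum, maximum) occurrences; None means no upper bound.
EXACTLY_ONCE = (1, 1)
AT_MOST_ONCE = (0, 1)
ZERO_OR_MORE = (0, None)
ONE_OR_MORE = (1, None)

# Hypothetical occurrence table for a list *( a / b / c / d ).
TABLE = {
    "a": EXACTLY_ONCE,
    "b": AT_MOST_ONCE,
    "c": ZERO_OR_MORE,
    "d": ONE_OR_MORE,
}


def violations(names, table):
    """Return a list of occurrence-rule violations; order is ignored."""
    counts = Counter(names)
    problems = []
    for name in counts:
        if name not in table:
            problems.append(f"{name}: not a permitted item")
    for name, (low, high) in table.items():
        found = counts[name]
        if found < low or (high is not None and found > high):
            limit = "*" if high is None else high
            problems.append(f"{name}: found {found}, allowed {low}..{limit}")
    return problems


print(violations(["c", "a", "d", "c"], TABLE))  # any order, counts fine
print(violations(["b", "b", "a"], TABLE))       # b repeated, d missing
```

The check never looks at position, only at counts, which is the sense in 
which the number-of-occurrences rules are separable from ordering.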


>Furthermore, I don't believe it's possible for a formal syntax to express 
>this clearly since it implies the formal syntax has explicit knowledge of 
>the attribute-value concept, and since attribute-value lists can be 
>represented by many different syntaxes, I suspect this would have problems 
>similar to the # construct, only much worse.

Not clear to me that explicit knowledge of the attribute-value concept is 
at all required.  I believe the focus on "number of occurrences" fits the 
issue adequately.  The fact that the occurrences are of things that are 
attribute/value does not seem relevant.


>(3) Syntax to generally support complex restrictions on attributes within 
>an attribute-value construct.  Similar to (2), but will be both more 
>useful and vastly more complex and unintuitive.

I agree entirely.



At 09:29 AM 7/10/00 +0100, shelness@lotus.com wrote:
>1. Do Dave's two new proposed constructs (if defined with enough rigor)
>allow a human being to more easily distinguish whether an arbitrary string
>is legal or not? My gut feeling is that they do, though clearly there are
>some nasty problems to be solved. For example, under Dave's proposed
>notation, are the two grammars
>
>   foo = SET ( "a" "b" *("c"))
>
>and
>
>    foo = SET ( "a" "b" c)
>    c = *("c")
>
>equivalent? Everything I know about BNF would imply they are. My sense of
>what Dave is saying is that they are not, and that "cabc" is legal under

Classic BNF is strictly ordered, so, yes, the two forms yield the same 
result.  And, yes, I am suggesting something that does not.  And, yes, 
that's a problem if only for clarity of understanding.

However, like defining the difference between URLs that are part of a web 
page set, versus URLs that are just external references, there needs to be 
a boundary on the un-ordering.

I'd be perfectly happy exploring a variation on *(a/b/c) for unordering, if 
it results in something more clear.  At base, focus on concatenation seemed 
better than alternation because unordering is messing with the 
combinatorials of items, rather than with what items are 'selected'.

d/

=-=-=-=-=
Dave Crocker  <dcrocker@brandenburg.com>
Brandenburg Consulting  <www.brandenburg.com>
Tel: +1.408.246.8253,  Fax: +1.408.273.6464
675 Spruce Drive,  Sunnyvale, CA 94086 USA