[rfc-i] v3imp #8 Fragment tagging on sourcecode

dev+ietf at seantek.com (Sean Leonard) Wed, 28 January 2015 05:16 UTC

From: "dev+ietf at seantek.com"
Date: Tue, 27 Jan 2015 21:16:37 -0800
Subject: [rfc-i] v3imp #8 Fragment tagging on sourcecode
In-Reply-To: <54C7FAD7.7040500@alum.mit.edu>
References: <54C20F92.4090400@seantek.com> <54C232FC.1000604@gmx.de> <54C275BC.1040905@alum.mit.edu> <20150123175511.GI2350@localhost> <54C28E3F.4040901@alum.mit.edu> <E378C876-5217-4274-86B6-1DBFB653DE24@vpnc.org> <54C29891.6040101@alum.mit.edu> <54C3576A.9030206@greenbytes.de> <54C3BE06.8010707@alum.mit.edu> <54C3C6A3.6080003@seantek.com> <54C3CF7F.6090901@seantek.com> <54C4AFF1.6030608@gmx.de> <54C7FAD7.7040500@alum.mit.edu>
Message-ID: <54C870B5.7000205@seantek.com>

Overall I still stand by my proposition that the RFC is the module for 
ABNF purposes. Honestly it just makes things a lot simpler. To the 
extent that you need to split things inside the RFC, you can refer to 
specific sections. Specific comments below.

With regard to import rules, I can concur with Julian's comment that 
ABNF should not be extended.

Instead, I propose the following *informal* definition, which is based 
on Section 2.7 in RFC 7230 (and is repeated by Julian below, in RFC 7231):

localrule = <foreignrule, see [RFCXXXX], Section Y.Z>

For example, the text in Section 2.7 of RFC 7230 says:

...

    paths that begin with "//".)  A "partial-URI" rule is defined for
    protocol elements that can contain a relative URI but not a fragment
    component.

      URI-reference = <URI-reference, see [RFC3986], Section 4.1>
      absolute-URI  = <absolute-URI, see [RFC3986], Section 4.3>
      relative-part = <relative-part, see [RFC3986], Section 4.2>
      scheme        = <scheme, see [RFC3986], Section 3.1>
      authority     = <authority, see [RFC3986], Section 3.2>
      uri-host      = <host, see [RFC3986], Section 3.2.2>
      port          = <port, see [RFC3986], Section 3.2.3>
      path-abempty  = <path-abempty, see [RFC3986], Section 3.3>
      segment       = <segment, see [RFC3986], Section 3.3>
      query         = <query, see [RFC3986], Section 3.4>
      fragment      = <fragment, see [RFC3986], Section 3.5>

      absolute-path = 1*( "/" segment )
      partial-URI   = relative-part [ "?" query ]
...



Any ABNF analyzer can be "smart" enough to recognize when the stuff in 
<> is formatted as <foreignrule, see [RFCXXXX], Section Y.Z>. That looks 
like an informal import directive to me.
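
As a rough illustration of how an analyzer could recognize that convention, here is a minimal Python sketch. The pattern and the function name parse_import are hypothetical (they come from neither RFC 7230 nor any existing tool), and the sketch assumes the reference is a numbered section, so something like "Appendix B" would not match.

```python
import re

# Hypothetical pattern for the informal convention seen in RFC 7230/7231:
#   <foreignrule, see [RFCXXXX], Section Y.Z>
IMPORT_RE = re.compile(
    r"^<\s*(?P<rule>[A-Za-z][A-Za-z0-9-]*)\s*,\s*see\s*"
    r"\[(?P<ref>[^\]]+)\]\s*,\s*Section\s+(?P<section>\d+(?:\.\d+)*)\s*>$"
)

def parse_import(prose_val):
    """Return (foreign rule, reference, section), or None if the
    prose value does not follow the informal import convention."""
    m = IMPORT_RE.match(prose_val.strip())
    if m is None:
        return None
    return (m.group("rule"), m.group("ref"), m.group("section"))
```

A prose value that does not follow the convention, such as <n>, simply yields None, so ordinary prose rules are left alone.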

Moreover, we need to distinguish between an ABNF compiler, and an ABNF 
validator. I think that Paul is thinking of some kind of ABNF compiler, 
to compile to some other computer language. But all that is 
needed/helpful for RFC publication purposes is an ABNF validator.

All an ABNF validator needs to do is make sure that all rules are 
composed of valid ABNF primitives, or of other rules that decompose into 
valid ABNF primitives. ABNF primitives are:
1. literals, such as %d13 and %x0D (which, incidentally, are equivalent)
2. rules assumed to exist (i.e., RFC 5234 Appendix B)
3. rules defined by <> (prose-val)

Such a validator is quite easy to program, and doesn't need to import 
anything from other RFCs.
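
To make the idea concrete, here is a minimal Python sketch of such a check. It is only an approximation under stated assumptions: it does not parse full ABNF syntax, it assumes each quoted string and each <> prose value fits on one line, and the names undefined_rules, NON_REFERENCE, RULE_START and NAME are made up for this example. It reports every rule name that is referenced but is neither defined locally nor one of the Appendix B core rules.

```python
import re

# Core rule names from RFC 5234, Appendix B.1 (assumed to exist).
CORE_RULES = {
    "ALPHA", "BIT", "CHAR", "CR", "CRLF", "CTL", "DIGIT", "DQUOTE",
    "HEXDIG", "HTAB", "LF", "LWSP", "OCTET", "SP", "VCHAR", "WSP",
}

# Text that can never be a rule reference. The leftmost match wins, so a
# ";" or "<" inside a quoted string is not misread as a comment or prose.
NON_REFERENCE = re.compile(
    r'(?:%[siSI])?"[^"\n]*"'        # quoted string literal
    r"|<[^>\n]*>"                   # prose value, e.g. <foo, see [RFCXXXX], ...>
    r"|%[bdxBDX][0-9A-Fa-f.-]+"     # numeric literal, e.g. %d13 or %x0D
    r"|;[^\n]*"                     # comment to end of line
)
RULE_START = re.compile(r"\s*([A-Za-z][A-Za-z0-9-]*)\s*=/?")
NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]*")

def undefined_rules(abnf_text):
    """Return the sorted, upper-cased rule names that are referenced but
    neither defined in abnf_text nor listed in CORE_RULES."""
    stripped = NON_REFERENCE.sub(" ", abnf_text)
    defined, referenced = set(), set()
    for line in stripped.splitlines():
        m = RULE_START.match(line)
        if m:
            # "name =" or "name =/" starts a rule; the rest is its body.
            defined.add(m.group(1).upper())
            body = line[m.end():]
        else:
            # Any other line is treated as a continuation of a rule body.
            body = line
        # Rule names are case-insensitive, so compare in upper case.
        referenced.update(n.upper() for n in NAME.findall(body))
    return sorted(referenced - (defined | CORE_RULES))
```

An empty result means every rule bottoms out in literals, core rules, or <> definitions. Nothing is fetched from the referenced RFCs.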

On 1/27/2015 12:53 PM, Paul Kyzivat wrote:
> On 1/25/15 3:57 AM, Julian Reschke wrote:
>> On 2015-01-24 17:59, Sean Leonard wrote:
>>> On 1/24/2015 8:21 AM, Sean Leonard wrote:
>>>> First of all there is no such thing as "ABNF modules" yet--only ABNF
>>>> grammar (combined with specification text). I recognize this
>>>> conversation is trending to creating them.
>>>> Providing different definitions of the same rule in the same RFC is
>>>> reckless
>>>
>>> The more I thought about this, the more I would like to propose that 
>>> the
>>> RFC itself be unit of analysis (i.e., "module").
>>> ...
>>
>>
>> I agree that it's good to formalize this somewhat, but I'm not convinced
>> updating/extending RFC 5234 is a good idea.
>>
>> For instance, in the HTTP specs we use prose rules with a well-defined
>> syntax:
>>
>> <http://greenbytes.de/tech/webdav/rfc7231.html#imported.abnf>
>>
>> This might be enough for automated checkers to do the right thing.
>
> I agree that we need to be careful not to extend ABNF too much, making 
> it more difficult. OTOH, the people who use ABNF are not, for the most 
> part, stupid. (Does ABNF need to be understandable to someone who 
> doesn't know at least one real programming language?)

I don't find this relevant to the analysis.

>
> The use of some symbols defined in another draft presents a 
> particularly interesting issue:
>
> To verify the using ABNF, you need to import at least the rule 
> defining the symbol in question. But that rule may well refer to other 
> rules in the referenced document. Should you:
>
> - selectively import rules that are needed, one by one, until there
>   are no more undefined symbols?
>
> - OR, simply import the full set of rules from the referenced document?

Neither. See how it works in RFC 7230 and RFC 7231. {localrule = 
<foreignrule, ...>}. The rule name of the foreign rule is irrelevant 
since a local rule is defined with the standard syntax.

>
> Either way, there may then be conflicts between rules defined in the 
> new document and those imported from the old document. The potential 
> is greater if you have imported all the ABNF from the referenced 
> document.

See above; not an issue.

>
> And this of course depends a bit on whether the ABNF in the referenced 
> document was intended to be one "module" or not.
>
> RFC5234 is itself an interesting case study. It includes:

RFC 5234 is not an interesting case, because it is defining "itself". It 
is not appropriate to view all of the so-called "definitions" in RFC 5234 
as actual instances of ABNF. For example, Section 3.7:

***


3.7.  Specific Repetition:  nRule

    A rule of the form:

          <n>element

    is equivalent to

          <n>*<n>element

    That is, exactly <n> occurrences of <element>.  Thus, 2DIGIT is a
    2-digit number, and 3ALPHA is a string of three alphabetic
    characters.


***

Clearly, neither {<n>element} nor {<n>*<n>element} is intended to be 
interpreted as ABNF as-is. Marking them as <artwork type="abnf"> would 
just be incorrect.

Actually I think that they should be:

<t>A rule of the form:<br/>
<tt xml:space="preserve">    &lt;n&gt;element</tt><br/>
is equivalent to<br/>
<tt xml:space="preserve">    &lt;n&gt;*&lt;n&gt;element</tt><br/>
That is, exactly <tt>&lt;n&gt;</tt>
occurrences of <tt>&lt;element&gt;</tt>. Thus, <tt>2DIGIT</tt> is a
    2-digit number, and <tt>3ALPHA</tt> is a string of three alphabetic
    characters.</t>


The text is crystal clear that the examples {<n>element} and 
{<n>*<n>element} are treated as quotations, which are nouns for 
grammatical purposes. I.e., the entire Section 3.7 comprises *ONE* 
paragraph. Splitting these sample verbatim text elements out into 
<figure><artwork> blocks is ludicrous. There you have it: yet another 
use case in support of Improvement #1 (fine control over spaces and line 
breaks).

If you actually go through RFC 5234 piece by piece, you will see that 
there is no conflict between rule names in Section 4 (ABNF Definition of 
ABNF) and Appendix B (Core ABNF of ABNF). In any case, as I have already 
argued, future RFCs should consider the Appendix B names pre-defined, 
and therefore should not need to import any part of RFC 5234.

Respectfully submitted,

Sean

>
> - a set of "Core Rules" in Appendix B. This could be viewed as one
>   ABNF "module".
> - a complete ABNF definition of ABNF. This could also be viewed as
>   a separate ABNF "module", but it informally indicates that it
>   depends (imports) the Core Rules.
> - ABNF fragments interspersed with text, duplicating rules in
>   both of the above.
>
> *Many* uses of ABNF reuse rules defined in the Core Rules. When doing 
> so, it would probably be fine to import the full set of Core Rules, 
> but it would probably be inappropriate to also import the rules 
> defining the ABNF of ABNF, and it certainly would be inappropriate to 
> also import all the fragments.
>
> IMO it would make sense to introduce enough new syntax to ABNF to 
> define named modules, and to specify the import of specific named 
> modules from an external document.
>
>     Thanks,
>     Paul