[abnf-discuss] BNFs and determinism (Re: [art] [Technical Errata Reported] RFC7601 (5435))

Carsten Bormann <cabo@tzi.org> Mon, 23 July 2018 19:19 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <5b55c7e3.1c69fb81.10c36.95dc@mx.google.com>
Date: Mon, 23 Jul 2018 21:19:18 +0200
Cc: "John R. Levine" <johnl@iecc.com>, "abnf-discuss@ietf.org" <abnf-discuss@ietf.org>, "adam@nostrum.com" <adam@nostrum.com>, "ben@nostrum.com" <ben@nostrum.com>, "art@ietf.org" <art@ietf.org>, Murray Kucherawy <superuser@gmail.com>, Alexey Melnikov <aamelnikov@fastmail.fm>
Message-Id: <2F8E9F6E-E4CD-4307-A14A-2FCED18E0556@tzi.org>
References: <20180723000558.CD758B80F99@rfc-editor.org> <alpine.OSX.2.21.1807222152380.18947@ary.qy> <5b5539e6.1c69fb81.b4b6d.5477@mx.google.com> <alpine.OSX.2.21.1807222231200.19369@ary.qy> <5b55c7e3.1c69fb81.10c36.95dc@mx.google.com>
To: Peter Occil <poccil14@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/abnf-discuss/D7tsjd3X7DSQBYW4BTtYxnMRK7k>

On Jul 23, 2018, at 14:19, Peter Occil <poccil14@gmail.com> wrote:
> 
> Thus, whether a received-token parser parses an "atom" as a "word" or as a "domain" depends on the context in which the "atom" appears,

Hi Peter,

actually, this is not how BNF traditionally works.

BNF is a productive grammar that produces a language.

It doesn’t tell you how to “parse”, i.e., assign rules to sections of the strings that make up the language.

E.g.,

  Foo = Bar / Baz

  Bar = "abc"

  Baz = "ab" "c"

is perfectly fine as BNF (here in ABNF form) and takes no position on whether "abc" is a Bar or a Baz; it is certainly part of the language (as is "aBc", since ABNF quoted strings are case-insensitive, but that is a different story).
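
To make the point concrete, here is a minimal Python sketch of the three rules above, read as a generative grammar. The helper names (match_literal, derivations, in_language) are made up for illustration and come from no ABNF tool; the only thing assumed from RFC 5234 is that quoted strings match case-insensitively.

```python
def match_literal(literal, text, pos):
    """Return the end position if the literal matches text at pos, else None.

    ABNF quoted strings are case-insensitive, hence the lower() calls.
    """
    end = pos + len(literal)
    if text[pos:end].lower() == literal.lower():
        return end
    return None


def bar(text):
    # Bar = "abc"
    return match_literal("abc", text, 0) == len(text)


def baz(text):
    # Baz = "ab" "c"
    mid = match_literal("ab", text, 0)
    return mid is not None and match_literal("c", text, mid) == len(text)


def derivations(text):
    """List every alternative of Foo = Bar / Baz that produces text."""
    return [name for name, rule in (("Bar", bar), ("Baz", baz)) if rule(text)]


def in_language(text):
    """Membership only asks whether at least one derivation exists."""
    return bool(derivations(text))


print(derivations("abc"))   # ['Bar', 'Baz']
print(in_language("aBc"))   # True
print(in_language("abd"))   # False
```

Membership is settled as soon as one derivation exists; the grammar itself never says which of the two derivations is "the" parse.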

Clearly, parser generators do generate “parsers” that do want to know, so they are happier if the grammar lets that decision be made unambiguously.

When writing software based on ABNF, I generally prefer to use PEG (parsing expression grammar) parsers instead of traditional BNF-based parser generators; this sometimes means I have to massage the ABNF slightly.  I also try to deliver my ABNF in PEG-friendly form.  But none of that is supported by RFC 5234 (STD 68).
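
As a rough illustration of what PEG semantics change, here is a small Python sketch of ordered choice applied to the Foo = Bar / Baz example. The combinator names (literal, sequence, ordered_choice) are hypothetical and not taken from any particular PEG library, and the reordering shown at the end is only one plausible kind of "massaging", not necessarily the kind meant above.

```python
def literal(s):
    """Parser for one quoted string (case-insensitive, as in ABNF)."""
    def parse(text, pos):
        end = pos + len(s)
        return end if text[pos:end].lower() == s.lower() else None
    return parse


def sequence(*parsers):
    """Run the parsers one after another; fail if any of them fails."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse


def ordered_choice(*named_parsers):
    """PEG '/': try alternatives in order and commit to the first success."""
    def parse(text, pos):
        for name, p in named_parsers:
            end = p(text, pos)
            if end is not None:
                return name, end
        return None
    return parse


bar = literal("abc")                         # Bar = "abc"
baz = sequence(literal("ab"), literal("c"))  # Baz = "ab" "c"

foo = ordered_choice(("Bar", bar), ("Baz", baz))
foo_swapped = ordered_choice(("Baz", baz), ("Bar", bar))

print(foo("abc", 0))          # ('Bar', 3)
print(foo_swapped("abc", 0))  # ('Baz', 3)

# Order also affects what gets consumed: a short alternative listed first
# shadows a longer one that starts the same way.
short_first = ordered_choice(("a", literal("a")), ("ab", literal("ab")))
long_first = ordered_choice(("ab", literal("ab")), ("a", literal("a")))

print(short_first("ab", 0))   # ('a', 1), the "b" is left unconsumed
print(long_first("ab", 0))    # ('ab', 2)
```

With ordered choice the rule assignment is always decided, but only because the order of alternatives now carries meaning, which plain ABNF does not give it.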

CDDL (the ABNF derivative for CBOR and JSON) does prefer a clear assignment of specific rules to a subtree of the data model item tree; we are using PEG semantics for this.

BTW, the first RFC to use BNF (without actually using that word) was RFC 5 of June 2, l969 [sic!]; this wasn’t Ken L. Harrenstien’s version of BNF but already somewhat close.  Prepare for the “BNF in RFCs” fiftieth anniversary in 10.5 months…

Grüße, Carsten