Re: [abnf-discuss] ABNF colloquialism for end-of-line

Carsten Bormann <> Sat, 18 November 2017 09:57 UTC

From: Carsten Bormann <>
Date: Sat, 18 Nov 2017 10:57:27 +0100
Cc: ABNF-Discuss <>, Sean Leonard <>
To: Dave Crocker <>
Subject: Re: [abnf-discuss] ABNF colloquialism for end-of-line

On Nov 16, 2017, at 23:31, Dave Crocker <> wrote:
> Carsten,
> On 11/15/2017 6:35 PM, Carsten Bormann wrote:
>> Hi Dave,
>> On Nov 15, 2017, at 23:37, Dave Crocker <> wrote:
>>> Given that the thread in CBOR says 'matching rules', I'm guessing that the goal here is to describe freeform data coming from the net.  Hence, requiring a simple, canonicalized data form is not appropriate.  (This is an essential point; if it's not correct, then what follows won't be either.)
>> The thread title unfortunately is misleading.
>> The ABNF is not for on-the-wire packets, but for defining the syntax of the CDDL language (which then defines the syntax of the on-the-wire data items).
>> So this ABNF is about files on computers, which probably run a form of Linux/Unix or Windows (and very likely not pre-2001 classic MacOS).  So
> I take your point, but suspect there is still an issue.  At the least, being clear /and explicit/ about this in the specification document(s) will be helpful.


> The issue I suspect is the intended portability of the file.  If the file is intended to be portable, then it, too, needs to be in a canonical form.  It's a type of 'over the wire' even though it isn't part of a wire protocol.

Well, for a while now, software development has focused on Unix line ends, tolerating DOS line ends in some places.
That seems to work for so many languages that we can just emulate it.

Let’s take a page from RFC 7950 (YANG 1.1):

   line-break          = CRLF / LF

That is essentially the same as what I was proposing, but the explicit name “line-break” is probably somewhat better than NL.
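
As a minimal sketch of what this rule means for a tool reading such files, here is one way to split text at line-breaks in Python. The name `split_lines` and the regular expression are assumptions made for illustration; neither comes from RFC 7950 or from the CDDL specification.

```python
import re

# line-break = CRLF / LF
# CRLF is listed first so that "\r\n" is consumed as a single break.
LINE_BREAK = re.compile(r"\r\n|\n")

def split_lines(text: str) -> list:
    """Split text at each CRLF or LF; an isolated CR stays inside its line."""
    return LINE_BREAK.split(text)
```

Note that a trailing line-break yields a final empty string, which a caller may want to drop.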

> This, then, would require separate translation from native, local representation to the canonical form.  But that's a pretty simple definition effort.

Again, I think that most source control systems and programmers’ editors know how to do that.
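
For illustration only, the translation in question can be as small as the following Python sketch. It assumes LF is chosen as the canonical form, which this thread has not decided; `to_canonical` is a hypothetical name.

```python
def to_canonical(text: str) -> str:
    """Replace each CRLF with LF (assumed canonical form).

    Isolated CRs are deliberately left untouched so that a later
    syntax check can still see and reject them.
    """
    return text.replace("\r\n", "\n")
```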

>>    EOL = [CR] LF
>> is probably the right way to describe line ends for these files.
> Possibly, unless folk really want
>   EOL = *CR LF

We don’t want to tolerate more than one CR here; these would be isolated CRs in today’s line-end worlds.
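
To make the difference between the two candidate rules concrete, here is a small Python sketch that expresses each as a regular expression. The names `STRICT_EOL`, `LOOSE_EOL` and `is_eol` are hypothetical, and the regexes are only one assumed rendering of the ABNF.

```python
import re

STRICT_EOL = re.compile(r"\r?\n")  # EOL = [CR] LF  (at most one CR)
LOOSE_EOL = re.compile(r"\r*\n")   # EOL = *CR LF   (any number of CRs)

def is_eol(pattern, s: str) -> bool:
    """True if the whole string s is a single EOL under the given rule."""
    return pattern.fullmatch(s) is not None
```

Under the first rule, `"\r\r\n"` is not one line end; under the second it is, which is exactly the tolerance we don’t want.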

> So while there is an historical basis for saying EOL, I'd think that in this context, it would be sufficient and simpler just to have:
>  WS = SP / CR / LF

That generally works, except in certain strings, where it is good to identify actual line ends.

> (why not also include TAB?)

I can’t find the string TAB in RFC 20; I think you probably mean HT.
HT is evil(**) and, approximately since the time 300 bit/s LA36 terminals(***) went out of use, should never be used(*).
Easy fix as applied here: Simply don’t allow HT in specification source files.
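
A checker for that restriction is easy to write. The following Python sketch reports where HT occurs in a source text; `find_tabs` is a hypothetical helper, not part of any CDDL tool I know of.

```python
def find_tabs(text: str) -> list:
    """Return 1-based (line, column) pairs for every HT (U+0009) in text.

    Lines are counted by LF; a CR before the LF just counts as one more
    column and does not affect the positions reported for earlier HTs.
    """
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == "\t":
                hits.append((lineno, col))
    return hits
```

An empty result means the file is acceptable as far as HT is concerned.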

Grüße, Carsten

(*) Outside certain very sheltered environments such as Linux Kernel development.
(**) RFC 7386/7396 should be proof enough here.
(***) p. 82; additionally insert fond memories here…