[abnf-discuss] ABNF colloquialism for end-of-line

Sean Leonard <dev+ietf@seantek.com> Wed, 15 November 2017 11:54 UTC

From: Sean Leonard <dev+ietf@seantek.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
Message-Id: <97E6D6C0-7010-46D6-8641-670F10A2504C@seantek.com>
Date: Wed, 15 Nov 2017 19:54:36 +0800
To: ABNF-Discuss <abnf-discuss@ietf.org>
X-Mailer: Apple Mail (2.3273)
Archived-At: <https://mailarchive.ietf.org/arch/msg/abnf-discuss/g3nMGkoB4ISYm_fBqoTW-RGvYmY>
Subject: [abnf-discuss] ABNF colloquialism for end-of-line

Hello ABNF “Doctors”:

On a recent thread on cbor@ <https://mailarchive.ietf.org/arch/msg/cbor/ZL2NvalH6jmSVqfvkBH4af3BHsE>, the following question arose:
What is the best ABNF colloquialism for end-of-line?

CR LF is the “Internet standard newline”, with rule name <CRLF> [RFC5234]. However, some specification writers want to admit Unix LF as well. I have not heard of people clamoring for bare CR; however, if bare LF is on the table, then bare CR might be on the table too, for completeness. A small minority of published RFC specifications define an end-of-line production as CRLF / LF, and an even smaller minority allow for bare CR as well.
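
To make the three candidate conventions concrete, here is a minimal Python sketch (an illustration only, not taken from any RFC). It writes each convention as a regular expression and shows how the same input splits under each. The names EOL_STRICT, EOL_UNIX, EOL_ANY, and split_lines are hypothetical, chosen only for this example.

```python
import re

# Hypothetical names; each pattern mirrors one ABNF alternative discussed above.
# CRLF is listed first in each alternation so that CR LF counts as ONE line end.
EOL_STRICT = re.compile(r"\r\n")      # eol = CRLF
EOL_UNIX = re.compile(r"\r\n|\n")     # eol = CRLF / LF
EOL_ANY = re.compile(r"\r\n|\r|\n")   # eol = CRLF / CR / LF


def split_lines(text, eol):
    """Split text on the given end-of-line pattern, dropping one trailing empty piece."""
    parts = eol.split(text)
    if parts and parts[-1] == "":
        parts.pop()
    return parts


# One input that mixes all three terminators.
sample = "a\r\nb\nc\rd\r\n"

print(split_lines(sample, EOL_STRICT))  # ['a', 'b\nc\rd']
print(split_lines(sample, EOL_UNIX))    # ['a', 'b', 'c\rd']
print(split_lines(sample, EOL_ANY))     # ['a', 'b', 'c', 'd']
```

The same four-character payload yields two, three, or four lines depending on the production chosen, which is why the choice matters for interoperability.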

So I just wanted to see what ABNF people’s opinions are on this one.

Thanks,

Sean

~~~~~~~
2017-11-15 Facts about CRLF and new lines in ABNF in RFC Series
Sean Leonard

Facts about CRLF and new lines:

Surveyed the ABNF of all RFCs from RFC 2000 through RFC 7999.

“What is the end-of-line marker called, other than CRLF?” (by search of CRLF)

Answer: Most call it CRLF. Amongst the minority, EOL is the most common name, followed by line-break or lineBreak. Most define the end-of-line marker as CR LF only; of the minority that do not, most use CRLF / LF, and a smaller subset also admits bare CR (that is, any of bare CR, bare LF, or CRLF).

RFC	Hint	Name	=	Definition
2705	MGCP	EOL	=	CRLF / LF
2967*	TISDAG	nl	=	%d13 %d10
2967*	TISDAG	SEP	=	(CR LF) | LF   <-- NOT actually ABNF 
3108	SDP	EOL	=	(CR / LF / CRLF)
3435	MGCP	EOL	=	CRLF / LF
3780	SMIng	lineBreak =	CRLF / LF
4997	ROHC-FN	CRLF	=	%x0A / %x0D.0A
4997	ROHC-FN	NL	=	COMMENT / CRLF
4997	ROHC-FN	COMMENT	=	"//" *(SP / HTAB / VCHAR) CRLF
5228*	Sieve	(Sieve explicitly limits to CRLF-only in text)
5234*	ABNF	c-nl	=	comment / CRLF
5234*	ABNF	comment =	";" *(WSP / VCHAR) CRLF
6020	YANG	line-break =	CRLF / LF
7208*	SPF	CRLF	=	<standard end-of-line token as per [RFC5322]>
7468	PKIXt	eol	=	CRLF / CR / LF
7468	PKIXt	eolWSP	=	WSP / CR / LF
7468	PKIXt	W	=	WSP / CR / LF / %x0B / %x0C


“How is ‘nl’ defined?” (by search of ‘nl’)

Answer: Always CRLF.

RFC	Hint	Name	=	Definition
2957	whoispp	nl	=	%d13 %d10
2958	whoispp	nl	=	%d13 %d10
2967*	TISDAG	nl	=	%d13 %d10
4997	ROHC-FN	NL	=	COMMENT / CRLF
5234*	ABNF	c-nl	=	comment / CRLF
5234*	ABNF	comment =	";" *(WSP / VCHAR) CRLF
6787*	MRCPv2	phrase-nl =	"Phrase-NL" ":" 1*UTFCHAR CRLF


“How is ‘newline’ used?” (by search of ‘newline’)

Answer: ‘newline’ is never defined as a rule name. When it appears in comments, it is *always* associated with CR LF only. (Same as ‘nl’.)

“How is ‘0A’ used?” (by search of 0A)

Answer: ‘0A’ is almost always defined as LF or used as LF; the search uncovered no further end-of-line variations. The one notable exception is RFC 4464, which has %x0a / %x0d / (others) as whitespace generically, but that protocol is not line-oriented.

Items marked * do not address the premise of the inquiry, but are included anyway for completeness.
~~~~~~~
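
The unstarred rows of the first survey table can also be tallied mechanically. The following is a minimal Python sketch under stated assumptions: TERMINALS, classify, and SURVEY are hypothetical names, the rows are copied by hand from the table, and the parser handles only flat alternations of the terminals listed (it is not a general ABNF parser).

```python
# Map the ABNF terminals that appear in the surveyed rows to the characters
# they denote. CRLF, CR, and LF carry their RFC 5234 core-rule meanings.
TERMINALS = {
    "CRLF": "\r\n",
    "CR": "\r",
    "LF": "\n",
    "%x0A": "\n",
    "%x0D.0A": "\r\n",
}


def classify(definition):
    """Return which terminators a flat alternation such as 'CRLF / LF' admits."""
    body = definition.strip().strip("()")
    accepted = {TERMINALS[alt.strip()] for alt in body.split("/")}
    if accepted == {"\r\n"}:
        return "CRLF only"
    if accepted == {"\r\n", "\n"}:
        return "CRLF or LF"
    if accepted == {"\r\n", "\r", "\n"}:
        return "CRLF, CR, or LF"
    return "other"


# Unstarred end-of-line definitions, keyed by RFC number.
# For RFC 4997 this is the body of its redefined CRLF rule.
SURVEY = {
    "2705": "CRLF / LF",
    "3108": "(CR / LF / CRLF)",
    "3435": "CRLF / LF",
    "3780": "CRLF / LF",
    "4997": "%x0A / %x0D.0A",
    "6020": "CRLF / LF",
    "7468": "CRLF / CR / LF",
}

# Group RFC numbers by the class of terminators they accept.
tally = {}
for rfc, definition in SURVEY.items():
    tally.setdefault(classify(definition), []).append(rfc)

for label, rfcs in tally.items():
    print(label, len(rfcs), rfcs)
```

On these rows the sketch groups five RFCs under "CRLF or LF" and two under "CRLF, CR, or LF", consistent with the summary in the answer above.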