Re: [abnf-discuss] ABNF colloquialism for end-of-line
Sean Leonard <dev+ietf@seantek.com> Mon, 20 November 2017 09:19 UTC
To: abnf-discuss@ietf.org
References: <97E6D6C0-7010-46D6-8641-670F10A2504C@seantek.com> <3fbd228d-c6cf-be73-c7f2-f6b15979b852@gmail.com> <477FA5E8-FBAA-47D4-98A6-79DBAE4498C7@tzi.org> <7db503ef-3db4-9a72-6d14-001831742600@gmail.com> <62B9A765-E6EE-4C20-9A4E-58ADA9FDE975@seantek.com> <c10a79f2-5e42-fc00-ed5a-4459064b5af4@gmail.com>
From: Sean Leonard <dev+ietf@seantek.com>
Message-ID: <c9a7213d-0412-2280-6e24-dacaa00b4ee3@seantek.com>
Date: Mon, 20 Nov 2017 01:17:18 -0800
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0
MIME-Version: 1.0
In-Reply-To: <c10a79f2-5e42-fc00-ed5a-4459064b5af4@gmail.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: quoted-printable
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/abnf-discuss/2rDcbfsBe1ulrN1pwFetaaE6rRw>

With respect to the original topic about end-of-line, I think the solution has only 10% to do with ABNF per se, and 90% to do with the surrounding computing environment. Therefore, I have come up with the following conventions based on inspection of a large quantity of RFCs:

Some protocols are binary-oriented (unit is based on octet), others are text-oriented (unit is based on code point, e.g., ASCII, or Unicode/UTF-8).

Within the text-oriented protocols, some are line-oriented and some are whitespace-oriented. (Some are oriented towards both.)

CRLF is the Internet standard newline. (RFC 5198)
LF is the XML standard newline. (https://www.w3.org/TR/xml/ Section 2.11)
CRLF is the Windows standard newline.
LF is the Unix standard newline.
(No citations necessary.)
And so on.

A format is a "protocol" for our purposes if the format is sent over the Internet.

Specifically, RFC 2046 says:

    The canonical form of any MIME "text" subtype MUST always represent a line break as a CRLF sequence. Similarly, any occurrence of CRLF in MIME "text" MUST represent a line break. Use of CR and LF outside of line break sequences is also forbidden.

Therefore:

When ABNF is used to describe a line-oriented protocol that has to do with the Internet, it ought to use <CRLF>.

When ABNF is used to describe a line-oriented protocol that does not have to do with the Internet, it can define line markers "to taste"; a common convention is <EOL>. But don't define EOL = CRLF, because there is no point. EOL is an indication to the reader that something is going on with end-of-line markers, but there needs to be a good reason. Don't redefine CRLF (or LF for that matter) either, because that confuses readers.

When ABNF is used to describe a whitespace-oriented (but not line-oriented) protocol, it is acceptable to define whitespace as a glob of any of SP, HTAB, CR, and LF. A key example is JSON (RFC 7159), which is whitespace-oriented but does not care about lines.
It is media-typed as application/json, not text/json, for that and related reasons.

When ABNF is used to describe a binary protocol, do whatever, but don't use <CRLF> or <EOL> rule names in a binary protocol definition, since those conventions imply a text-oriented protocol.

When in doubt, ask: "If this format were to be dumped into a MIME part, would this format foreseeably be transmitted as text/* such as text/plain or message/* such as message/http? Or is it going to have to be transmitted as application/* such as application/xml?" That is a reasonable guide.

If there is disagreement about line endings and it matters to the protocol, say something about that during "preprocessing", rather than in the (A)BNF definition. An example is CSS3: Section 3.3 of <https://www.w3.org/TR/css-syntax-3/>.

Regards,

Sean

On 11/19/2017 10:56 AM, Dave Crocker wrote:
> On 11/16/2017 1:26 PM, Sean Leonard wrote:
>> To add some color to this point, “cuts” was discussed in the CBOR WG
>> in the context of CDDL. It is a technique from Parsing Expression
>> Grammars. Here is overview:
> ...
>> Basically it commits the parser at a particular point, so that it
>> does not backtrack.
>>
>> However, PEGs are ordered; ABNF is unordered. With ABNF (as presently
>> constituted), all alternatives in a choice are considered
>> simultaneously (order is not relevant). Even if you match one
>> alternative, you’re supposed to try all other alternatives.
>
>
> (Disclaimer: What follows is pure personal opinion.)
>
>
> I'll offer this for consideration, without also offering any specific
> action...
>
> Why did ABNF become popular?
>
> (For a time, RFC 733 and then RFC 822 were the most-cited RFCs. It
> turned out this was not due to the email portion but folk were
> re-using the ABNF meta-specification, which is why the later, revision
> effort to RFC 822 split the ABNF text out into its own RFC.)
>
> At the time ABNF was defined in the latter 1970s, most specs provided
> their own variation of BNF.
> Everyone wanted tailoring to the basic
> tool. But while folk have often wanted to enhance ABNF, over the
> years, the 'let's define a new variant' tendency mostly died out --
> ignoring the much more recent move towards JSON... Why did this
> popularity happen?
>
> Languages need to balance expressive power against human usage
> complexity. Enough but not too much, of each. ABNF seemed to strike
> a good balance. (I like to think the documentation clarity in RFC 733
> also helped, but then I'm quite biased about this, given how much
> effort I put into that aspect of the work...)
>
> I think the biggest danger in creating a meta-language is
> specification obscurity. The tendency to want to add features can
> too-easily create too much complexity for easy human comprehension.
> The result is that seemingly-simple specifications can too-easily have
> implications that are not understood by most readers.
>
> Computers are not the target audience for computer languages. Human
> readers are. Subtle effects (nevermind side-effects) are very easily
> missed by human readers.
>
> d/
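
To illustrate the "preprocessing" suggestion in the message above: the rough sketch below normalizes line endings before any grammar is applied, so the (A)BNF itself only ever has to name one newline. It assumes the newline step of CSS Syntax Level 3, Section 3.3 (CRLF, lone CR, and FF each become a single LF). The function names are hypothetical and do not come from the thread or from any specification.

```python
import re

# Assumption: mirrors only the newline portion of CSS Syntax Level 3,
# Section 3.3. CRLF is listed first so a CR LF pair becomes one LF, not two.
_NEWLINE = re.compile(r"\r\n|\r|\f")


def preprocess_newlines(text: str) -> str:
    """Collapse CRLF, lone CR, and FF to LF before the grammar sees the input."""
    return _NEWLINE.sub("\n", text)


def to_wire_crlf(text: str) -> str:
    """Re-emit text with CRLF line breaks, as a text/* MIME part would need.

    Hypothetical helper: normalizes first so existing CRLF pairs are not doubled.
    """
    return preprocess_newlines(text).replace("\n", "\r\n")
```

With a step like this in front of the parser, a grammar for a local format can refer to a single LF, while a grammar for an Internet line-oriented protocol can keep using <CRLF> and leave the tolerance for other line endings to the preprocessing text.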