Re: [abnf-discuss] ABNF colloquialism for end-of-line

Paul Kyzivat <pkyzivat@alum.mit.edu> Sat, 18 November 2017 17:54 UTC

To: abnf-discuss@ietf.org
References: <97E6D6C0-7010-46D6-8641-670F10A2504C@seantek.com> <3fbd228d-c6cf-be73-c7f2-f6b15979b852@gmail.com> <477FA5E8-FBAA-47D4-98A6-79DBAE4498C7@tzi.org> <7db503ef-3db4-9a72-6d14-001831742600@gmail.com> <8733D93A-58A9-4D6C-BF1C-CE02F221DEA1@tzi.org>
From: Paul Kyzivat <pkyzivat@alum.mit.edu>
Message-ID: <edef3996-ccbd-9596-6cf6-c04e3091af77@alum.mit.edu>
Date: Sat, 18 Nov 2017 12:54:30 -0500
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:52.0) Gecko/20100101 Thunderbird/52.4.0
MIME-Version: 1.0
In-Reply-To: <8733D93A-58A9-4D6C-BF1C-CE02F221DEA1@tzi.org>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/abnf-discuss/s0uN2AXbWoImbFAsSZSBXo9hlm0>
Subject: Re: [abnf-discuss] ABNF colloquialism for end-of-line

From a practical perspective, when parsing things that need EOL marks, 
ISTM that you can generally get good results by treating LF as EOL, and 
CR as WS. It works for CRLF, LFCR and naked LF. (I don't recall anything 
in the last few decades that treats a naked CR as EOL.)
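
A minimal sketch of that heuristic in Python, assuming CRs at the edges
of a line can simply be dropped as insignificant whitespace. The name
split_lines is made up for illustration:

```python
def split_lines(text: str) -> list[str]:
    """Split on LF only; a CR next to a line end is discarded as whitespace."""
    # Only LF ends a line. Stripping CR from both edges of each piece
    # makes CRLF, LFCR and naked LF all produce the same result.
    lines = [piece.strip("\r") for piece in text.split("\n")]
    # A terminating LF leaves one empty piece at the end; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines
```

A naked CR in the middle of a line is not treated as a line end here; it
stays inside the line, where the grammar can handle it as whitespace.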

This of course isn't enough to know what forms to produce when 
generating text.

	Thanks,
	Paul

On 11/18/17 4:57 AM, Carsten Bormann wrote:
> On Nov 16, 2017, at 23:31, Dave Crocker <dcrocker@gmail.com> wrote:
>>
>> Carsten,
>>
>> On 11/15/2017 6:35 PM, Carsten Bormann wrote:
>>> Hi Dave,
>>> On Nov 15, 2017, at 23:37, Dave Crocker <dcrocker@gmail.com> wrote:
>>>>
>>>> Given that the thread in CBOR says 'matching rules', I'm guessing that the goal here is to describe freeform data coming from the net.  Hence, requiring a simple, canonicalized data form is not appropriate.  (This is an essential point; if it's not correct, then what follows won't be either.)
>>> The thread title unfortunately is misleading.
>>> The ABNF is not for on-the-wire packets, but for defining the syntax of the CDDL language (which then defines the syntax of the on-the-wire data items).
>>> So this ABNF is about files on computers, which probably run a form of Linux/Unix or Windows (and very likely not pre-2001 classic MacOS).  So
>>
>> I take your point, but suspect there is still an issue.  At the least, being clear /and explicit/ about this in the specification document(s) will be helpful.
> 
> Right.
> 
> 
>> The issue I suspect is the intended portability of the file.  If the file is intended to be portable, then it, too, needs to be in a canonical form.  It's a type of 'over the wire' even though it isn't part of a wire protocol.
> 
> Well, software development has focused on Unix line ends, tolerating DOS line ends in some spaces, for a while.
> That seems to work for so many languages, we can just emulate that.
> 
> Let’s take a page from RFC 7950 (YANG 1.1):
> 
>     line-break          = CRLF / LF
> 
> That is essentially the same as what I was proposing, though some will probably find the explicit name “line-break” better than NL.
> 
>> This, then, would require separate translation from native, local representation to the canonical form.  But that's a pretty simple definition effort.
> 
> Again, I think that most source control systems and programmers’ editors know how to do that.
> 
>>>     EOL = [CR] LF
>>> is probably the right way to describe line ends for these files.
>>
>> Possibly, unless folk really want
>>
>>    EOL = *CR LF
> 
> We don’t want to tolerate more than one CR here; any extra ones would be isolated CRs in today's line-end worlds.
> 
>> So while there is an historical basis for saying EOL, I'd think that in this context, it would be sufficient and simpler just to have:
>>
>>   WS = SP / CR / LF
> 
> That generally works, except in certain strings, where it is good to identify actual line ends.
> 
>> (why not also include TAB?)
> 
> I can’t find the string TAB in RFC 20; I think you probably mean HT.
> HT is evil(**) and, approximately since the time 300 bit/s LA36 terminals(***) went out of use, should never be used(*).
> Easy fix as applied here: Simply don’t allow HT in specification source files.
> 
> Grüße, Carsten
> 
> (*) Outside certain very sheltered environments such as Linux Kernel development.
> (**) RFC 7386/7396 should be proof enough here.
> (***) https://archive.org/details/handbookofintera00duan p. 82; additionally insert fond memories here…