[netmod] Benjamin Kaduk's Discuss on draft-ietf-netmod-artwork-folding-09: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Thu, 05 September 2019 06:07 UTC

Return-Path: <noreply@ietf.org>
X-Original-To: netmod@ietf.org
Delivered-To: netmod@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id B09B4120A9B; Wed, 4 Sep 2019 23:07:46 -0700 (PDT)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-netmod-artwork-folding@ietf.org, Lou Berger <lberger@labn.net>, netmod-chairs@ietf.org, lberger@labn.net, netmod@ietf.org
X-Test-IDTracker: no
X-IETF-IDTracker: 6.100.0
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <156766366671.22774.7481795788724573201.idtracker@ietfa.amsl.com>
Date: Wed, 04 Sep 2019 23:07:46 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/netmod/m7salwQonJAwDYlNXfHOVTlTni8>
Subject: [netmod] Benjamin Kaduk's Discuss on draft-ietf-netmod-artwork-folding-09: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-netmod-artwork-folding-09: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-netmod-artwork-folding/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

I think the procedures described herein are incomplete without a footer
to terminate the un-folding process.  Otherwise, it seems that the
described algorithms would leave the two-line header for the second and
subsequent instances of folded text in a single document.  (If we tried
to just blindly remove all instances of the header without seeking
boundaries, then we would misreconstruct content when different folding
algorithms are used in the same document with the single-backslash
algorithm occurring first.)

I don't think it's proper to refer to a script that requires bash
specifically as a "POSIX shell script".  I did not attempt to check
whether any bash-specific features are used or whether this requirement
stems solely from the shebang line, though.

I think the shell script does need to use double-quotes around some
variable expansions, especially "$infile" and "$outfile", to work
properly for filenames containing spaces.  We do quote "$infile" when
we're checking that it exists, just not (most of the time) when we
actually use it!
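
A minimal sketch of the quoting problem, not taken from the draft's
script; the file name "my draft.txt" and the use of cat are purely
illustrative assumptions:

```bash
# Create a file whose name contains a space (hypothetical name).
tmp_dir=$(mktemp -d)
infile="$tmp_dir/my draft.txt"
printf 'line one\n' > "$infile"

# Unquoted: the shell splits the expansion at the space, so cat is
# handed two operands, neither of which exists.
if cat $infile > /dev/null 2>&1; then unquoted=ok; else unquoted=failed; fi

# Quoted: the expansion stays a single word and the file is found.
if cat "$infile" > /dev/null 2>&1; then quoted=ok; else quoted=failed; fi

printf 'unquoted: %s, quoted: %s\n' "$unquoted" "$quoted"
rm -rf "$tmp_dir"
```

The same reasoning applies to "$outfile" and to any redirection target
built from a user-supplied path.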

In addition to the above, I also share Alissa's (and Mirja's) concerns,
but feel that Discuss is more appropriate than Abstain, so we can discuss
what the best way to get this content published is.  It is fine
content, and we should see it published; it's just not immediately clear
to me what the right way to do so is.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Section 4.1

   Automated folding of long lines is needed in order to support draft
   compilations that entail a) validation of source input files (e.g.,
   XML, JSON, ABNF, ASN.1) and/or b) dynamic generation of output, using
   a tool that doesn't observe line lengths, that is stitched into the
   final document to be submitted.

I don't think the intended meaning of "source input files" will be clear
to all readers just from this text.  Some discussion of how RFCs can
consider source code, data structures, generated output, etc., that have
standalone representations and natural formats, and the need to display
their contents in the RFC format that has different requirements might
be helpful context for this paragraph and the next.

Section 7.1.2

For some reason my mental model of "RFC style" does not use the word
"really" in this way, and prefers alternatives like "very" or
"exceptionally".  (Also in Section 8.1.2.)

Section 7.2.1

   1.  Determine where the fold will occur.  This location MUST be
       before or at the desired maximum column, and MUST NOT be chosen
       such that the character immediately after the fold is a space ('
       ') character.  For forced foldings, the location is between the

This is a rather awkward natural line break.  I suggest an RFC Editor
note to make sure that the punctuation around the space character all
appears on the same line.

   3.  On the following line, insert any number of space (' ')
       characters.

I'm not sure I'd characterize the procedure as "complete" when it leaves
the value of the output subject to implementation choice such as this.
(Note that the next paragraph talks about the resulting "arbitrary
number of space" characters, and would presumably also need to be
adjusted if this text was adjusted.)
We also don't seem to bound this number of spaces to be fewer than the
target line length, which only matters in some weirdly pedantic sense.

Section 7.2.2

   Scan the beginning of the text content for the header described in
   Section 7.1.1.  If the header is not present, starting on the first
   line of the text content, exit (this text contents does not need to
   be unfolded).

I'm not sure I understand what "starting on the first line of the text
content" is intended to mean.  (Also in 8.2.2.)

Section 8.2.1

   If this text content needs to and can be folded, insert the header
   described in Section 8.1.1, ensuring that any additional printable
   characters surrounding the header do not result in a line exceeding
   the desired maximum.

We discussed above some cases when text could not be folded using the
algorithm from Section 7.2.1; in what case could text not be folded with
this algorithm?  Just the case when the implementation doesn't support
forced folding?

Section 10

We should warn against implementations scanning past the end of a buffer
(containing the entire contents of a file) when checking what's in the
beginning of the next line -- if a file ends with a backslash and "end
of line" but no further content, we could perform an out of bounds
access if the code assumes it is safe to check for the next line's
initial content.
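
As a sketch of the kind of check meant here (not the draft's script, and
assuming bash 4+ for mapfile), an unfolder can refuse to look at a "next
line" unless one actually exists.  The function name unfold_lines is
hypothetical, and only the single-backslash strategy is handled:

```bash
# Sketch only: joins single-backslash folds read from stdin.
unfold_lines() {
  local -a lines
  local n i out next
  mapfile -t lines          # read all of stdin into an array
  n=${#lines[@]}
  i=0
  while [ "$i" -lt "$n" ]; do
    out=${lines[i]}
    # Join only while the line ends in a backslash AND a next line exists.
    while [[ $out == *\\ ]] && [ $((i + 1)) -lt "$n" ]; do
      i=$((i + 1))
      next=${lines[i]}
      next=${next#"${next%%[! ]*}"}   # drop leading spaces on the continuation
      out=${out%\\}$next
    done
    printf '%s\n' "$out"
    i=$((i + 1))
  done
}
```

With this shape, a final line that ends in a backslash is emitted
unchanged rather than triggering a read past the end of the input.  In a
language with raw buffers, the equivalent guard would be the index
comparison before dereferencing the next line.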

Section 12.2

I think that RFC 7991 could be normative, since we say "per RFC 7991" to
describe some requirements on behavior.  Likewise for RFC 7994, whose
character encoding requirements we incorporate by reference.

Appendix A

I could perhaps argue that we should include a reference to POSIX for
"POSIX shell script" but find it somewhat hard to believe that this
would be a problem in practice.  It's also moot since we require bash
specifically, so we'd need to reference bash instead of POSIX.

   copy/paste the script for local use.  As should be evident by the
   lack of the mandatory header described in Section 7.1.1, these
   backslashes do not designate a folded line, such as described in
   Section 7.

It perhaps should be, but I think currently is not -- we only talk about
using the two-line header to detect instances of folding, without
mention of a requirement to be contained within <CODE BEGINS>/<CODE
ENDS> or similar.

It seems that my perception of "common shell style" diverges from that
presented in this document, which is not necessarily problematic.
(Things like what diagnostics go to stdout vs. stderr, use of ">
/dev/null" vs ">> /dev/null", etc.)

     printf "Usage: rfcfold [-s <strategy>] [-c <col>] [-r] -i <infile>"
     printf " -o <outfile>\n"

This summary usage line doesn't mention -d, -q, or -h.  (Maybe it
doesn't have to, of course.)

     # ensure input file doesn't contain a TAB
     grep $'\t' $infile >> /dev/null 2>&1

(`grep -q` is a thing, here and elsewhere.)
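
For illustration, the TAB check could be written along these lines; the
function name has_tab is hypothetical and this is a sketch rather than a
proposed replacement for the draft's text:

```bash
# Succeeds (exit status 0) if the named file contains a TAB character.
# -q suppresses all output, so no redirection to /dev/null is needed.
has_tab() {
  grep -q "$(printf '\t')" "$1"
}

# Usage example with a throwaway file.
tmp_file=$(mktemp)
printf 'a\tb\n' > "$tmp_file"
if has_tab "$tmp_file"; then
  echo "input contains a TAB"
fi
rm -f "$tmp_file"
```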

     # unfold wip file
     "$SED" '{H;$!d};x;s/^\n//;s/\\\n *//g' $temp_dir/wip > $outfile

[I don't remember why the s/^\n// is needed; similarly for the
unfold_it_2() case.]
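
A likely explanation, going by sed's documented behavior: the H command
appends a newline followed by the pattern space to the hold space, and
the hold space starts out empty, so the accumulated text begins with a
newline.  The sketch below shows this in isolation; it assumes GNU sed,
and the function names are hypothetical:

```bash
# Collect every input line in the hold space, then print it at the end.
slurp_raw() {
  sed -n 'H;${x;p;}'
}

# Same, but strip the leading newline that the first H introduced.
slurp_clean() {
  sed -n 'H;${x;s/^\n//;p;}'
}

printf 'a\nb\n' | slurp_raw   | wc -l   # counts the extra leading empty line
printf 'a\nb\n' | slurp_clean | wc -l
```

Without the s/^\n// step, the unfolded output would gain a blank first
line.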

     if [[ $strategy -eq 2 ]]; then
       min_supported=`expr ${#hdr_txt_2} + 8`
     else
       min_supported=`expr ${#hdr_txt_1} + 8`
     fi

On the face of it this seems like it will produce "folded" output that
exceeds the line length, when we give min_supported of 54, use
autodetection of strategy, and have input that is incompatible with
fold_it_1().
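
The arithmetic can be sketched as follows.  The two header strings are
an assumption about what hdr_txt_1 and hdr_txt_2 contain (they are not
quoted in this message), chosen so that the first one reproduces the
figure of 54 mentioned above:

```bash
# ASSUMED header texts; the draft's actual strings may differ.
hdr_txt_1="NOTE: '\\' line wrapping per BCP XXX (RFC XXXX)"
hdr_txt_2="NOTE: '\\\\' line wrapping per BCP XXX (RFC XXXX)"

# Same computation as the quoted snippet, using shell arithmetic.
min_1=$(( ${#hdr_txt_1} + 8 ))
min_2=$(( ${#hdr_txt_2} + 8 ))

printf 'strategy 1 minimum: %d\n' "$min_1"
printf 'strategy 2 minimum: %d\n' "$min_2"
```

Under these assumed strings the second header is one character longer,
so a column limit that passes the strategy-1 check can still be one
short for the strategy-2 header when autodetection falls back to
fold_it_2().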

     process_input $@

Need double-quotes around "$@" to properly handle arguments with
embedded spaces.
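
A small sketch of the difference; count_args and the two forwarding
functions are hypothetical helpers, not part of the draft's script:

```bash
# Prints how many arguments it received.
count_args() {
  echo "$#"
}

# Unquoted: each argument is re-split on whitespace before forwarding.
forward_unquoted() {
  count_args $@
}

# Quoted: each original argument is forwarded as exactly one word.
forward_quoted() {
  count_args "$@"
}

forward_unquoted -i "my draft.txt"
forward_quoted   -i "my draft.txt"
```

In the unquoted case the file name arrives as two separate arguments,
which is the failure mode for option parsing in process_input.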