Re: [netmod] Benjamin Kaduk's Discuss on draft-ietf-netmod-artwork-folding-09: (with DISCUSS and COMMENT)
Benjamin Kaduk <kaduk@mit.edu> Wed, 11 September 2019 00:03 UTC
Date: Tue, 10 Sep 2019 19:03:38 -0500
From: Benjamin Kaduk <kaduk@mit.edu>
To: Kent Watsen <kent+ietf@watsen.net>
Cc: The IESG <iesg@ietf.org>, "netmod-chairs@ietf.org" <netmod-chairs@ietf.org>, draft-ietf-netmod-artwork-folding@ietf.org, "netmod@ietf.org" <netmod@ietf.org>
Message-ID: <20190911000337.GQ18198@kduck.mit.edu>
References: <156766366671.22774.7481795788724573201.idtracker@ietfa.amsl.com> <0100016d0372debf-16e6e132-b334-41b3-ad9c-953fd9314963-000000@email.amazonses.com>
In-Reply-To: <0100016d0372debf-16e6e132-b334-41b3-ad9c-953fd9314963-000000@email.amazonses.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/netmod/0icg3aA4-ZSOSLETMnF5FKzZOyA>
Subject: Re: [netmod] Benjamin Kaduk's Discuss on draft-ietf-netmod-artwork-folding-09: (with DISCUSS and COMMENT)
On Thu, Sep 05, 2019 at 10:02:03PM +0000, Kent Watsen wrote:
> Hi Ben,
>
> Thank you for your review. Comments below.
>
> Update @ https://tools.ietf.org/html/draft-ietf-netmod-artwork-folding-10
>
> Kent // as co-author
>
>
> > On Sep 5, 2019, at 2:07 AM, Benjamin Kaduk via Datatracker <noreply@ietf.org> wrote:
> >
> > Benjamin Kaduk has entered the following ballot position for
> > draft-ietf-netmod-artwork-folding-09: Discuss
> >
> > When responding, please keep the subject line intact and reply to all
> > email addresses included in the To and CC lines. (Feel free to cut this
> > introductory paragraph, however.)
> >
> > Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
> > for more information about IESG DISCUSS and COMMENT positions.
> >
> > The document, along with other ballot positions, can be found here:
> > https://datatracker.ietf.org/doc/draft-ietf-netmod-artwork-folding/
> >
> > ----------------------------------------------------------------------
> > DISCUSS:
> > ----------------------------------------------------------------------
> >
> > I think the procedures described herein are incomplete without a footer
> > to terminate the un-folding process. Otherwise, it seems that the
> > described algorithms would leave the two-line header for the second and
> > subsequent instances of folded text in a single document. (If we tried
> > to just blindly remove all instances of the header without seeking
> > boundaries, then we would misreconstruct content when different folding
> > algorithms are used in the same document with the single-backslash
> > algorithm occurring first.)
>
> Are you referring to when an RFC contains multiple inclusions and one is
> trying to unfold them all at once? That's not the intention here, as

Yes, that was what I was thinking; sorry for missing or misinterpreting the
notes in Sections 7.2/8.2.

> noted in paragraph 3 in both sections 7.2 and 8.2. FWIW, this sounds
> like the framing problem that the WG discussed with the conclusion that
> extracting from plain-text is dead, now that XML is the required
> submission format, and XML provides a superior framing mechanism than any
> footer we could add.
>
> BTW, yes, each text inclusion in a single RFC may independently be folded
> using either the '\' or '\\' strategy, with the recommendation that '\'
> always be tried first and '\\' only used when '\' fails.
>
> If referring to a single text content instance, could you provide an
> example illustrating the concern?
>
> > I don't think it's proper to refer to a script that requires bash
> > specifically as a "POSIX shell script". I did not attempt to check
> > whether any bash-specific features are used or this requirement stems
> > solely from the shebang line, though.
>
> I just changed "POSIX" to "Bash" in the title for Appendix A.
>
> Not that it matters, but "--posix" is passed into `bash` on the first
> line of the script ;)
>
> > I think the shell script does need to use double-quotes around some
> > variable expansions, especially "$infile" and "$outfile", to work
> > properly for filenames containing spaces. We do quote "$infile" when
> > we're checking that it exists, just not (most of the time) when we
> > actually use it!
>
> Updated.
>
> > In addition to the above, I also share Alissa's (and Mirja's) concerns,
> > but feel that Discuss is more appropriate than Abstain, so we can
> > discuss what the best way to get this content published is. For it's
> > fine content, and we should see it published; it's just not immediately
> > clear to me what the right way to do so is.
>
> Agreed. For now, I've changed it to Informational, but I think there
> remains a discussion around if the draft should be re-run through the
> IAB stream. My responses today to Alissa's Abstain and Suresh's Discuss
> dig into this.
> Is it okay to use those threads for this item?

Please do; this point was mostly intended to make sure that we didn't
inadvertently approve the document while those discussions were still going
on.

> > ----------------------------------------------------------------------
> > COMMENT:
> > ----------------------------------------------------------------------
> >
> > Section 4.1
> >
> >    Automated folding of long lines is needed in order to support draft
> >    compilations that entail a) validation of source input files (e.g.,
> >    XML, JSON, ABNF, ASN.1) and/or b) dynamic generation of output, using
> >    a tool that doesn't observe line lengths, that is stitched into the
> >    final document to be submitted.
> >
> > I don't think the intended meaning of "source input files" will be
> > clear to all readers just from this text. Some discussion of how RFCs
> > can consider source code, data structures, generated output, etc., that
> > have standalone representations and natural formats, and the need to
> > display their contents in the RFC format that has different
> > requirements might be helpful context for this paragraph and the next.
>
> Is the updated text more understandable?

Yes, thanks

> > Section 7.1.2
> >
> > For some reason my mental model of "RFC style" does not use the word
> > "really" in this way, and prefers alternatives like "very" or
> > "exceptionally". (Also in Section 8.1.2.)
>
> s/Really/Exceptionally/ in both cases.
>
> > Section 7.2.1
> >
> >    1. Determine where the fold will occur. This location MUST be
> >    before or at the desired maximum column, and MUST NOT be chosen such
> >    that the character immediately after the fold is a space (' ')
> >    character. For forced foldings, the location is between the
> >
> > This is a rather awkward natural line break. I suggest an RFC Editor
> > note to make sure that the punctuation around the space character all
> > appears on the same line.
>
> RFC Editor note added, near the top of the draft.
>
> >    3. On the following line, insert any number of space (' ')
> >    characters.
> >
> > I'm not sure I'd characterize the procedure as "complete" when it
> > leaves the value of the output subject to implementation choice such as
> > this. (Note that the next paragraph talks about the resulting
> > "arbitrary number of space" characters, and would presumably also need
> > to be adjusted if this text was adjusted.) We also don't seem to bound
> > this number of spaces to be fewer than the target line length, which
> > only matters in some weirdly pedantic sense.
>
> Added "subject to the resulting line not exceeding the desired maximum"
> to both locations in the draft.
>
> > Section 7.2.2
> >
> >    Scan the beginning of the text content for the header described in
> >    Section 7.1.1. If the header is not present, starting on the first
> >    line of the text content, exit (this text contents does not need to
> >    be unfolded).
> >
> > I'm not sure I understand what "starting on the first line of the text
> > content" is intended to mean. (Also in 8.2.2.)
>
> I think you are saying that it seems overly prescriptive, given that the
> previous sentence says "beginning" and "header", it defies logic that the
> header might not start on the first line and, by this text calling it
> out, it suggests something special is going on. Is this what you mean?
> To be clear, the only intention here is to catch the case whereby there
> might be, e.g., some blank lines preceding the header. Do you think the
> "starting on the first line of the text content" fragment should be
> removed?

I think I was too confused by the text to be complaining that it was overly
prescriptive :(
I guess my complaint is that it seems ambiguous whether this is "the
procedure says: start on the first line of text content, and check for the
header" or "If the header is not present [anywhere in the content], start
on the first line of content, and exit".
That is, I think the order in which the clauses appear confuses me, with
perhaps some exacerbation by verb tense.

I support being able to cope with some blank lines preceding the header!

> > Section 8.2.1
> >
> >    If this text content needs to and can be folded, insert the header
> >    described in Section 8.1.1, ensuring that any additional printable
> >    characters surrounding the header do not result in a line exceeding
> >    the desired maximum.
> >
> > We discussed above some cases when text could not be folded using the
> > algorithm from Section 7.2.1; in what case could text not be folded
> > with this algorithm? Just the case when the implementation doesn't
> > support forced folding?
>
> Yes, that's the only case known. But what does this have to do with
> Section 8.2.1? Are you keying off of the "needs to" part? Is it okay?

I was just trying to check that we have given the reader enough information
to ascertain the "can be folded" result.

> > Section 10
> >
> > We should warn against implementations scanning past the end of a
> > buffer (containing the entire contents of a file) when checking what's
> > in the beginning of the next line -- if a file ends with a backslash
> > and "end of line" but no further content, we could perform an out of
> > bounds access if the code assumes it is safe to check for the next
> > line's initial content.
>
> Both Sections 7.2.2 and 8.2.2 describe conditions to determine when
> unfolding occurs. AFAICT, in both cases, the unfolding algorithm stays
> within the bounds of those conditions.

These procedures are fine if you're operating in a context where you
interact with the text corpus via "get next line" operations. But I don't
think we have limited ourselves to such contexts; consider the case where I
(foolishly) write text-processing code in C, and read(2) the text in
question into a memory buffer.
I'm on my own for linebreak detection, and if I start peeking past escape
characters, it's not so hard to imagine that I could fail to check for "end
of buffer" and trigger undefined behavior.

> For instance, given the input sequence [ '\' '\n' EOF ], the 7.2.2
> algorithm would replace it with [ EOF ] and the 8.2.2 algorithm wouldn't
> even attempt to unfold it since the condition of the next line containing
> a second '\' character isn't met.
>
> Is this Security Consideration needed?

Well, it's a nonblocking comment. So if the above description seems
totally implausible to you, I can accept it not being included in the
document.

> > Section 12.2
> >
> > I think that RFC 7991 could be normative, since we say "per RFC 7991"
> > to describe some requirements on behavior. Likewise for RFC 7994,
> > whose character encoding requirements we incorporate by reference.
>
> Given that this format may be used in contexts outside the IETF, it seems
> that understanding RFC 7991 is optional. Agreed?

For most of the occurrences of 7991 references, I agree with you. The only
one that makes me think otherwise is in Section 7.1.2:

   The character encoding is the same as described in Section 2 of
   [RFC7994], except that, per [RFC7991], tab characters are prohibited.

which is a statement of behavior that defers to an external specification.

> > Appendix A
> >
> > I could perhaps argue that we should include a reference to POSIX for
> > "POSIX shell script" but find it somewhat hard to believe that this
> > would be a problem in practice. It's also moot since we require bash
> > specifically, so we'd need to reference bash instead of POSIX.
>
> Per above, "POSIX" is now "Bash" in the title. I added an Informative
> reference for Bash.

Thanks!

> >    copy/paste the script for local use. As should be evident by the
> >    lack of the mandatory header described in Section 7.1.1, these
> >    backslashes do not designate a folded line, such as described in
> >    Section 7.
> >
> > It perhaps should be, but I think currently is not -- we only talk
> > about using the two-line header to detect instances of folding, without
> > mention of a requirement to be contained within <CODE BEGINS>/<CODE
> > ENDS> or similar.
>
> Correct. The 2-line header is missing. That <CODE BEGINS>/<CODE ENDS>
> appears is secondary. Is there anything to be done here?

In light of the previous discussion about extracting artwork individually
from the document, probably not. Though it seems the -10 has added a
line-wrapping header to the script, which seems to be inadvertent, if I
understand correctly.

> > It seems that my perception of "common shell style" diverges from that
> > presented in this document, which is not necessarily problematic.
> > (Things like what diagnostics go to stdout vs. stderr, use of
> > "> /dev/null" vs ">> /dev/null", etc.)
>
> I fixed one "> /dev/null" case.

Heh, I was trying to say that I prefer to always write "> /dev/null", while
acknowledging that my preference is irrelevant for this document. I'm glad
it helped to fix a consistency nit, though!

> As for style, we could review line by line but, for the cases where
> output is directed to /dev/null, it's unclear where the output is
> needed, only the exit code status ever seems to matter.
>
> >   printf "Usage: rfcfold [-s <strategy>] [-c <col>] [-r] -i <infile>"
> >   printf " -o <outfile>\n"
> >
> > This summary usage line doesn't mention -d, -q, or -h. (Maybe it
> > doesn't have to, of course.)
>
> Added.
>
> >   # ensure input file doesn't contain a TAB
> >   grep $'\t' $infile >> /dev/null 2>&1
> >
> > (`grep -q` is a thing, here and elsewhere.)
>
> Added.
>
> >   # unfold wip file
> >   "$SED" '{H;$!d};x;s/^\n//;s/\\\n *//g' $temp_dir/wip > $outfile
> >
> > [I don't remember why the s/^\n// is needed; similarly for the
> > unfold_it_2() case.]
>
> Erik responded to this point already.
>
> >   if [[ $strategy -eq 2 ]]; then
> >     min_supported=`expr ${#hdr_txt_2} + 8`
> >   else
> >     min_supported=`expr ${#hdr_txt_1} + 8`
> >   fi
> >
> > On the face of it this seems like it will produce "folded" output that
> > exceeds the line length, when we give min_supported of 54, use
> > autodetection of strategy, and have input that is incompatible with
> > fold_it_1().
>
> Fixed off-by-one error.
>
> >   process_input $@
> >
> > Need double-quotes around "$@" to properly handle arguments with
> > embedded spaces.
>
> Added.

Thanks! I'll try to find time to look at the new script with an eye for
quoting, and update my position in the datatracker; please start
complaining if I haven't done so and the other threads about where/how to
publish have come to a conclusion.

-Ben
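The quoting points raised in the review above (double-quoting "$infile" and
"$@", and using `grep -q` for the TAB check) can be illustrated with a small
sketch. The helper names count_args, forward_unquoted, forward_quoted and
has_tab are hypothetical, chosen only for this illustration; they are not
taken from the draft's rfcfold script, and the sketch assumes the default
IFS.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of rfcfold.

# Print the number of arguments received.
count_args() {
  printf '%s\n' "$#"
}

# Forward arguments WITHOUT quotes: each one is re-split on whitespace,
# so a filename containing a space turns into two arguments.
forward_unquoted() {
  count_args $@
}

# Forward arguments WITH quotes: each one arrives unchanged.
forward_quoted() {
  count_args "$@"
}

# Succeed if the named file contains a TAB character.
# -q suppresses output, so no redirection to /dev/null is needed,
# and quoting "$1" keeps a filename with spaces in one piece.
has_tab() {
  grep -q "$(printf '\t')" "$1"
}

forward_unquoted "my file.txt" out.txt   # prints 3
forward_quoted   "my file.txt" out.txt   # prints 2
```

With two arguments passed in, only the quoted form reports two; the
unquoted form splits "my file.txt" and reports three, which is the failure
mode the review describes for filenames containing spaces.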