Re: [netmod] artwork folding: dual support modes?

Martin Bjorklund <mbj@tail-f.com> Mon, 04 March 2019 18:35 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/netmod/FF6Wbo_YSk8qKvxtfMFw4zum7FM>

Kent Watsen <kent+ietf@watsen.net> wrote:
> 
> 
> > On Mar 4, 2019, at 11:04 AM, Martin Bjorklund <mbj@tail-f.com> wrote:
> > 
> > Kent Watsen <kent+ietf@watsen.net> wrote:
> >> 
> >> 
> >>> But note that figures in RFCs are normally indented with 3 spaces
> >>> (they _can_ be outdented, if the lines are long enough).
> >> 
> >> 
> >> The days of scraping from plain-text RFCs are over [1].  Extracting,
> >> if needed at all, should be from the XML, where there are no such
> >> issues. Extracting from the plain-text output makes about as much
> >> sense as extracting from the HTML or PDF outputs.
> > 
> > I am confused.  Are you saying that the unfolding algorithm is only
> > supposed to work on data extracted from the XML version of the I-D or
> > RFC?  If so, I think this needs to be clarified in the draft.
> 
> The unfolding algorithm works as long as the input == the output.  The 
> problem is that plain-text RFCs introduce a lot of artifacts that make 
> lossless extraction difficult.  I don't believe we should try to design a 
> solution for input != output.
> 
> Now that IETF has officially moved to XML as the sole format

I'm not sure what you mean; can you provide a pointer?  AFAICT, the
latest published RFC is still only available as txt and pdf.

If the only format were XML, why bother with any line breaking at all?

>, there
> is no longer a need to support extracting from plain-text.   In general, 
> folks are advised to always extract from XML.   I support adding a 
> statement to this effect.
> 
> 
> 
> >> Lossless extractions are critical for formal verifications (e.g.,
> >> doctor reviews, shepherd reviews, AUTH48 reviews).  Both the
> >> double-backslash approach we currently have, and the single-backslash
> >> approach we had originally (where the continuation-line begins on
> >> column 1, as it has been in programming languages for decades) provide
> >> lossless extractions.
> > 
> > ... as does the single-backslash with leading space removal.
> 
> No, there are cases where this fails.  We went through this before.

Only if you have data with > 69 spaces in a row that needs to be
preserved.
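
To make the failure case concrete, here is a minimal sketch of
single-backslash folding with leading-space removal.  It is an
illustration only, not the algorithm text from the draft.  The names
fold and unfold, and the default width of 69, are assumptions made for
this example; it also assumes that no original line ends in a
backslash.  The folder never breaks a line so that the continuation
would start with a space, which is what lets the unfolder strip leading
spaces safely.  It gives up only when a run of spaces is about as long
as the line limit, so that no safe break point exists.

```python
def fold(text, width=69):
    """Fold long lines with a single trailing backslash.

    Sketch only: assumes no original line ends with a backslash.
    """
    out = []
    for line in text.split("\n"):
        while len(line) > width:
            # Leave one column for the trailing backslash.
            cut = width - 1
            # Move the break left until the continuation would start
            # with a non-space character.
            while cut > 0 and line[cut] == " ":
                cut -= 1
            if cut == 0:
                # No safe break point: the run of spaces is too long.
                raise ValueError("cannot fold losslessly")
            out.append(line[:cut] + "\\")
            line = line[cut:]
        out.append(line)
    return "\n".join(out)


def unfold(text):
    """Join folded lines, dropping leading spaces of continuations."""
    out = []
    pending = None
    for line in text.split("\n"):
        if pending is not None:
            # Continuation line: strip any indentation, then join.
            line = pending + line.lstrip(" ")
            pending = None
        if line.endswith("\\"):
            pending = line[:-1]
        else:
            out.append(line)
    if pending is not None:
        out.append(pending)
    return "\n".join(out)


if __name__ == "__main__":
    ok = "a" * 60 + " " * 20 + "b" * 10
    print(unfold(fold(ok)) == ok)
    try:
        fold("a" + " " * 80 + "b")
    except ValueError as exc:
        print("failed:", exc)
```

Because the unfolder strips leading spaces, continuation lines can also
be indented freely (for example by the 3 spaces that figures normally
get) without changing the unfolded result.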


/martin



> This is why
> we adopted the double-backslash approach.
> 
> 
> Kent // contributor  (also on my previous emails in this thread)
> 
> 
>