Re: [netmod] rfcstrip does not work on https://tools.ietf.org/html/draft-ietf-netmod-artwork-folding-12

Erik Auerswald <auerswal@unix-ag.uni-kl.de> Tue, 31 March 2020 12:01 UTC

To: Carsten Bormann <cabo@tzi.org>, Kent Watsen <kent@watsen.net>
Cc: Balázs Lengyel <balazs.lengyel=40ericsson.com@dmarc.ietf.org>, "netmod@ietf.org" <netmod@ietf.org>

Hi all,

On 31.03.20 09:36, Carsten Bormann wrote:
> On 2020-03-31, at 09:22, Carsten Bormann <cabo@tzi.org> wrote:
>
>> On 2020-03-31, at 01:47, Kent Watsen <kent@watsen.net> wrote:
>>>
>>> I thought someone said that `xml2rfc` was going to support a “markers” attribute for the <sourcecode> element, but I don’t see that here yet: https://github.com/rfc-format/draft-iab-xml2rfc-v3-bis/blob/master/draft-iab-rfc7991bis.xml#L8514.
>>
>> https://tools.ietf.org/html/draft-levkowetz-xml2rfc-v3-implementation-notes-10#section-3.1.22
>>

Thanks for the additional info.  The proposal states:

     Proposal:  Add an attribute 'markers' for <sourcecode>, to control
        the emission of <CODE BEGINS> and <CODE ENDS>.  If markers="true"
        and the "name" attribute is set, the filename will also be
        emitted, as specified in [RFC8407] for YANG modules.

Thus the `file' part of the <CODE ...> markers seems to be optional
there, too, unless the code in question is a YANG module.  RFC 8407
recommends (SHOULD) adding a file name specification to the
<CODE BEGINS> marker, so even for YANG modules it is not strictly
required.

I am trying to understand whether the omission of a file name for the
code in draft-ietf-netmod-artwork-folding-12 was "wrong" or just "not
great", and what the current practice is.  The omission did result in
Martin's `rfcstrip' not extracting the code from the text version.

Martin's `rfcstrip' could not extract `rfcfold' from the XML version
either, because the XML uses `artwork' instead of `sourcecode'.
Adjusting the code to expect the `artwork' tag confirms this.  I used

     sed -i 's,//sourcecode\[,//artwork[,' rfcstrip

to try this out.  (Note that I am not suggesting that this change be
applied to Martin's `rfcstrip'; I only want to show what I did.)
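For illustration, here is a minimal Python sketch of the same idea:
collecting code from the XML while accepting both `sourcecode' and
`artwork' elements.  It is not how `rfcstrip' is implemented; the
function name extract_code_blocks and the sample document are made up
for this example, and real RFC XML may need extra handling (entity
declarations, for instance) that is left out here.

```python
import xml.etree.ElementTree as ET


def extract_code_blocks(xml_text, tags=("sourcecode", "artwork")):
    """Return (tag, name, text) for each matching element, in document order.

    `name` is the element's "name" attribute, or None if it has none.
    """
    root = ET.fromstring(xml_text)
    wanted = set(tags)
    blocks = []
    for elem in root.iter():
        if elem.tag in wanted:
            blocks.append((elem.tag, elem.get("name"), "".join(elem.itertext())))
    return blocks


# Hypothetical input: one named <sourcecode> and one unnamed <artwork>.
SAMPLE = """<rfc>
<sourcecode name="example.yang" markers="true">module example {}</sourcecode>
<artwork>#!/bin/bash
echo folded</artwork>
</rfc>"""

for tag, name, text in extract_code_blocks(SAMPLE):
    print(tag, name, len(text))
```

Passing tags=("sourcecode",) restricts the search to `sourcecode'
elements only, in which case the unnamed `artwork' block in the sample
is not found.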

> Oh, and you can find examples for markers=“true” in the published RFCs
> 
> rfc8650.xml

RFC 8650 "Dynamic Subscription to YANG Events and Datastores over
RESTCONF" contains a YANG module inside a <CODE ...> section with
file name specification on the following line in the text rendering.

Since the code of Martin's `rfcstrip' that recognizes a <CODE ...>
section requires the keyword `file' on the same line as `<CODE BEGINS>',
I took a closer look at how it handles this RFC.

Martin's `rfcstrip' does not recognize this <CODE ...> section in the
text rendering, because the keyword `file' is not on the same line as
`<CODE BEGINS>'.  It still extracts the YANG module from the text
rendering, via the "type 3" code path for YANG modules without
<CODE ...> markers.
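To make the difference concrete, the following Python sketch shows one
way a tool could accept a file name on the same line as <CODE BEGINS>,
on the line after it, or not at all.  This is an assumption-laden
illustration, not the logic `rfcstrip' uses; the regular expressions
and the name find_code_sections are invented for this example, and
long file names that wrap across lines are not handled.

```python
import re

# <CODE BEGINS>, optionally followed by: file "name"
BEGIN_RE = re.compile(r'^\s*<CODE BEGINS>\s*(?:file\s+"([^"]+)")?\s*$')
# A line consisting only of: file "name"
FILE_RE = re.compile(r'^\s*file\s+"([^"]+)"\s*$')
END_RE = re.compile(r'^\s*<CODE ENDS>\s*$')


def find_code_sections(lines):
    """Return a list of (filename_or_None, body_lines) per marked section."""
    sections = []
    i = 0
    while i < len(lines):
        begin = BEGIN_RE.match(lines[i])
        if not begin:
            i += 1
            continue
        name = begin.group(1)
        i += 1
        # Tolerate the file name on the line following the marker.
        if name is None and i < len(lines):
            following = FILE_RE.match(lines[i])
            if following:
                name = following.group(1)
                i += 1
        body = []
        while i < len(lines) and not END_RE.match(lines[i]):
            body.append(lines[i])
            i += 1
        sections.append((name, body))
        i += 1  # step over <CODE ENDS>
    return sections
```

A section without any file name yields None as its name, so the caller
can decide whether to skip it or to pick a default output name.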

Martin's `rfcstrip' extracts the YANG module with the specified name
from the XML version of the RFC.  It extracts two of the many JSON
examples from the XML version as well.  The examples do not use
<CODE ...> markers in the text rendering.

One of the HTML renderings displays the file name on the same line
as the <CODE BEGINS> marker
(https://www.rfc-editor.org/rfc/rfc8650.html#name-yang-module), the
other shows it on the following line
(https://tools.ietf.org/html/rfc8650), as does the text rendering
(https://tools.ietf.org/rfc/rfc8650.txt).  The PDF rendering shows the
file name specification on the same line as <CODE BEGINS>
(https://tools.ietf.org/pdf/rfc8650.pdf).

As far as I understand it, the XML version of the RFC is authoritative
and tools are supposed to work on the XML, not renderings created from
XML.  [I am not sure about the exact RFC number where this change from
text to XML takes effect.]

> rfc8652.xml
> rfc8675.xml
> rfc8676.xml

The above RFCs contain YANG modules.  They specify file names for the
respective <CODE ...> sections.

> rfc8681.xml

RFC 8681 "Sliding Window Random Linear Code (RLC) Forward Erasure
Correction (FEC) Schemes for FECFRAME" does not contain a YANG module.
It contains C code.  It does not specify a file name for either of the
two code sections.

> rfc8682.xml

RFC 8682 "TinyMT32 Pseudorandom Number Generator (PRNG)" contains C code
inside a <CODE ...> section without specifying a file name.  The
preceding text does suggest a file name.

> rfc8695.xml

RFC 8695 is another YANG module RFC giving a file name for the
<CODE ...> section.

> rfc8696.xml

RFC 8696 "Using Pre-Shared Key (PSK) in the Cryptographic Message Syntax
(CMS)" contains an ASN.1 module inside a <CODE ...> section without file
name.

> rfc8748.xml

RFC 8748 "Registry Fee Extension for the Extensible Provisioning
Protocol (EPP)" contains an XML schema inside <CODE ...> markers without
giving a file name.

It seems to me that <CODE BEGINS> ... <CODE ENDS> sections without file
name specification are common for both older RFCs in text format and
newer RFCs in XML format.  Even for YANG modules it seems to be
recommended, but not required, to specify a file name.  Thus I would not
have called the <CODE ...> line of draft-ietf-netmod-artwork-folding-12
"messed up," but do think that adding a file name improves it. :-)

The empty line after <CODE BEGINS> in I-D.ietf-netmod-artwork-folding-12
looks like a potential problem, because the first line of a shell script
on Unix-like systems needs to specify the interpreter, but Martin's
`rfcstrip' correctly handles this by skipping leading and trailing empty
lines inside the <CODE BEGINS> ... <CODE ENDS> section.
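The trimming step described above can be sketched in a few lines of
Python.  This only illustrates the idea of dropping empty lines at the
edges of a section; it is not taken from `rfcstrip', and the function
name trim_blank_edges is made up for this example.

```python
def trim_blank_edges(lines):
    """Drop empty or whitespace-only lines at the start and end of a section.

    Blank lines between non-blank lines are kept unchanged.
    """
    start, end = 0, len(lines)
    while start < end and not lines[start].strip():
        start += 1
    while end > start and not lines[end - 1].strip():
        end -= 1
    return lines[start:end]


# Hypothetical section body with an empty line after <CODE BEGINS>.
section = ["", "#!/bin/bash", "", "echo hello", "", ""]
script = trim_blank_edges(section)
print(script[0])  # prints "#!/bin/bash"
```

After trimming, the interpreter line is the first line of the
extracted script, which is what a Unix-like system expects.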

I think I now have a better understanding of the use of <CODE ...>
markers and the possible issues. :-)

Thanks,
Erik