[xml2rfc] Differentiating prepped from unprepared documents

Jay Daley <exec-director@ietf.org> Wed, 22 June 2022 12:09 UTC

From: Jay Daley <exec-director@ietf.org>
Date: Wed, 22 Jun 2022 13:09:50 +0100
Cc: Julian Reschke <julian.reschke@gmx.de>, Carsten Bormann <cabo@tzi.org>, Robert Sparks <rjsparks@nostrum.com>, John Levine <john.levine@standcore.com>
To: xml2rfc@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/xml2rfc/LfzC2lkFvRdgpLb1G1tKQOFpkl8>
Subject: [xml2rfc] Differentiating prepped from unprepared documents

A group of us have been discussing [1] an issue we see with the current grammar and I would like to continue that discussion on list, but I thought I should give some background first for others on this list.

The issue is that certain elements and attributes of the grammar are only used by the prep tool and yet are 'visible' in the grammar specification, though not consistently, and are known to cause author confusion.  For those who are not clear on this, the 'prep tool' is a specific function of xml2rfc that adds various RFCXML markup [2] to a document as a key step in converting an I-D into an RFC.  What the five of us have been discussing is how we separate the prepped and unprepped grammar in order to make it significantly easier for any reader (human or machine) to distinguish between the two.

A good example of the problem is the <toc> (table of contents) element, which is added by the prep tool.  This is included in the documentation (https://authors.ietf.org/en/rfcxml-vocabulary#toc-1) but with the warning "As a document author, you should not use <toc> directly".  However, the "pn" attribute, which the prep tool adds to multiple elements and which is the target of the table of contents entries, is not covered in the documentation.

Julian has suggested that we use a namespace extension for the prepped grammar.  That would mean that the prep tool would add a namespace declaration, such as xmlns:prep="http://www.rfc-editor.org/prep-namespace", to the <rfc> element and then all markup inserted by the tool would be prefixed with "prep" ("prep" is just an example prefix), for example the <prep:toc …> element and the "prep:pn" attribute.  There would still only be one grammar file (i.e. rfcXXXX.rnc) but that would be amended to specify the namespace of each element/attribute.
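
To illustrate how a machine reader could tell the two apart under this proposal, here is a minimal sketch in Python.  It assumes the example namespace URI above is the one used; the sample document, the "section-1" value and the function names prep_markup and is_prepped are hypothetical and are not real prep tool output or xml2rfc APIs.

```python
import xml.etree.ElementTree as ET

# Example namespace URI from the proposal; "prep" is only an example prefix.
PREP_NS = "http://www.rfc-editor.org/prep-namespace"

# Hypothetical prepped fragment for illustration, not real prep tool output.
PREPPED = f"""
<rfc xmlns:prep="{PREP_NS}">
  <front>
    <prep:toc/>
  </front>
  <middle>
    <section prep:pn="section-1"><name>Introduction</name></section>
  </middle>
</rfc>
"""


def prep_markup(xml_text):
    """Return sorted names of elements ("<name>") and attributes ("@name")
    that belong to the prep namespace."""
    # ElementTree expands namespaced names to "{uri}local".
    marker = "{" + PREP_NS + "}"
    found = set()
    for elem in ET.fromstring(xml_text.strip()).iter():
        if elem.tag.startswith(marker):
            found.add("<" + elem.tag[len(marker):] + ">")
        for attr in elem.attrib:
            if attr.startswith(marker):
                found.add("@" + attr[len(marker):])
    return sorted(found)


def is_prepped(xml_text):
    """Treat a document as prepped if it contains any prep-namespace markup."""
    return bool(prep_markup(xml_text))


print(prep_markup(PREPPED))
print(is_prepped("<rfc><front/></rfc>"))
```

Because the check keys on the namespace URI rather than the prefix, it would keep working whatever prefix the prep tool ends up using.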

Personally, I think this is the best way forward given where we are now, though I would have preferred us to have started differently.  Ideally we would have more clearly differentiated between I-Ds and RFCs in the authoring format to match the distinction that is so apparent in how these documents are used.  That would mean having two different grammar files, one for I-Ds with a root element <i-d> and one for RFCs with a root element of <rfc>, and with the prep tool converting between the two.  However, moving to that now seems excessive.

The only practical downside of the namespace solution is that it constrains the I-D grammar to always being a subset of the RFC grammar (as the latter is implemented by an extension), which makes it messy to add things such as annotations that are only for use in I-Ds and not RFCs.

Jay

[1]  https://github.com/ietf-tools/xml2rfc/issues/791
[2]  https://www.rfc-editor.org/rfc/rfc7998.html

-- 
Jay Daley
IETF Executive Director
exec-director@ietf.org