Re: [rfc-i] Evolving document sources over a long time (Re: Comments on draft-roach-bis-documents-00)

Carsten Bormann <cabo@tzi.org> Sun, 12 May 2019 16:16 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <CADaq8jc1KJwC=Ypoo9a+-=Me=GP5tgX=2kcfUd56o53Mcu05kw@mail.gmail.com>
Date: Sun, 12 May 2019 18:16:34 +0200
Message-Id: <9179590B-C513-44DC-906C-16534DA8EC51@tzi.org>
References: <CADaq8jdRMUZAN3rRXoActXqvGpkgx_-kW67uwzGLtVPoh7LfAQ@mail.gmail.com> <6E787E2A-18F2-4EFE-BFBA-61B1B4300930@tzi.org> <CADaq8jc1KJwC=Ypoo9a+-=Me=GP5tgX=2kcfUd56o53Mcu05kw@mail.gmail.com>
To: David Noveck <davenoveck@gmail.com>
Subject: Re: [rfc-i] Evolving document sources over a long time (Re: Comments on draft-roach-bis-documents-00)
Cc: RFC Interest <rfc-interest@rfc-editor.org>, IETF Discussion Mailing List <ietf@ietf.org>

> 
> > — validating the syntax of anchors (no spaces, plus signs)
> 
> It is not clear to me, given that anchors are always in quotes, why these restrictions were added

RFC 2629 provides a DTD for RFCXML.  This enables XML-based tools to meaningfully process RFCXML.
In this DTD, anchors were supposed to be XML “ID” type strings.  Using non-conforming strings here created problems for some tools.  So the v1 tool’s overly enthusiastic “be liberal in what you accept” wasn’t really helpful.  This was fixed in v2.
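
As a rough illustration of what "validating the syntax of anchors" amounts to, here is a minimal Python sketch.  The pattern is an ASCII-only approximation of the XML Name production that ID-typed attributes have to match, and the function name is_valid_anchor is made up for this example; it is not how xml2rfc implements the check.

```python
import re

# ASCII-only approximation of the XML "Name" production (an assumption made
# for brevity): real XML also allows many non-ASCII letters, and ":" is left
# out here on purpose to stay namespace-friendly.
_ID_RE = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_valid_anchor(anchor: str) -> bool:
    """Return True if the anchor looks like a usable XML ID string."""
    return _ID_RE.fullmatch(anchor) is not None


if __name__ == "__main__":
    for candidate in ["sec-intro", "my anchor", "a+b", "2nd-section"]:
        print(repr(candidate), is_valid_anchor(candidate))
```

Spaces and plus signs fail this check, as does a leading digit; those are the kinds of strings the v1 tool let through.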

> In any case, they were added with no concern about the fact that many existing .xml files would be invalidated.  

(It would have been easy to create a transition tool.  I have no idea why this wasn’t done — maybe people finished their work on existing v1 documents with the v1 tool, so the problem never occurred in practice.  Or it was simply too easy to fix by hand — I remember doing this for a few documents before I completely stopped using XML as an authoring format.)

> I hope future instances of such changes can be avoided.

Yes (preferably by following the specification in the first place, or by providing transition tools, as we’ll have for v2 to v3).
 
> > — validating artwork.
> 
> Artwork, by its nature, does not need to be validated.  

But it still needs to be encoded in well-formed XML, which requires CDATA sections or escapes such as &amp;.
Again, the v1 tool was more “liberal” (broken) here than it should have been.
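
To make the well-formedness point concrete, here is a small Python sketch using only the standard library.  The artwork string and the helper name parses are invented for the example; the point is merely that raw "<" and "&" break an XML parser, while escaping or a CDATA section preserves the same text.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical artwork content containing characters that are special in XML.
art = "if (a < b && c > d) -> send"


def parses(xml_text: str) -> bool:
    """Return True if xml_text is well-formed according to the parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


raw = "<artwork>" + art + "</artwork>"                   # not well-formed
escaped = "<artwork>" + escape(art) + "</artwork>"       # &lt; &amp; &gt;
cdata = "<artwork><![CDATA[" + art + "]]></artwork>"     # CDATA section

if __name__ == "__main__":
    print(parses(raw), parses(escaped), parses(cdata))
    # Both well-formed variants carry exactly the original text.
    print(ET.fromstring(escaped).text == art)
    print(ET.fromstring(cdata).text == art)
```

One caveat for the CDATA variant: artwork that itself contains the sequence "]]>" would need to be split across two CDATA sections, which this sketch does not handle.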

> Since we can’t go back in time and change the v1 tool, a better way to provide compatibility would be for the existing tool to get a processing option, allowing it to accept formerly acceptable anchors that have been forbidden in recent times.

A better option is to provide a migration tool, so previously invalid (and incorrectly accepted) input documents can be transformed into well-formed input documents that have the intended effect — these can then be processed by other XML-based tools as well.
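
Such a migration tool could be quite small.  The following Python sketch is only one possible shape for it, not an existing tool: the function name migrate_anchors, the replacement policy (offending characters become "-"), and the assumption that only xref elements point at anchors via "target" are all mine.  Note also that ElementTree drops the DOCTYPE, comments and processing instructions on output, so a real tool would need a more careful serializer.

```python
import re
import xml.etree.ElementTree as ET

# ASCII-only approximation of what an XML ID may look like (an assumption).
_VALID = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def migrate_anchors(xml_text: str) -> str:
    """Rewrite non-conforming anchors and the xref targets that use them."""
    root = ET.fromstring(xml_text)
    anchors = [el.get("anchor") for el in root.iter()
               if el.get("anchor") is not None]
    used = {a for a in anchors if _VALID.fullmatch(a)}
    mapping = {}
    for old in anchors:
        if _VALID.fullmatch(old) or old in mapping:
            continue
        # Replace every character outside the conservative subset.
        new = re.sub(r"[^A-Za-z0-9._-]", "-", old)
        # An ID must start with a letter or underscore.
        if not re.match(r"[A-Za-z_]", new):
            new = "_" + new
        # Avoid colliding with an anchor that already exists.
        base, n = new, 2
        while new in used:
            new = f"{base}-{n}"
            n += 1
        used.add(new)
        mapping[old] = new
    for el in root.iter():
        if el.get("anchor") in mapping:
            el.set("anchor", mapping[el.get("anchor")])
        if el.tag == "xref" and el.get("target") in mapping:
            el.set("target", mapping[el.get("target")])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    src = ('<rfc><middle><section anchor="my section" title="A">'
           '<t>See <xref target="a+b"/>.</t></section>'
           '<section anchor="a+b" title="B"/></middle></rfc>')
    print(migrate_anchors(src))
```

Because references are rewritten together with the anchors they point at, the output keeps the intended cross-references while becoming acceptable to stricter tools.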

[…]
> The actual .xml file was generated by make

That is actually a good way to generate a non-trivial document.
In many cases, today’s Makefiles generate markdown documents, or generate pieces that are then included from the main markdown document.  The Makefile then translates the result into XML wholesale.

> > Right.  That will be less of a problem in the future, 
> 
> I hope so.  Why do you think that this will be getting better?

With the move to v3, an XML document will be the main publication format.
(It still isn’t the same as that submitted by the authors, but I think one could continue authoring from that.)

> > (really, for the sake of your readers, please don’t).
> 
> For RFC5661 and documents derived from it using Adam’s procedure, that ship has already sailed :-(.

As the references section needs to be updated anyway (for the DOIs), I’m not sure this is really true.  Or, if it is, RFC 5661 may not really be a candidate for this process, because it may be impractical to re-generate the exact numbering that RFC 5661 used.

> > Since RFC 5661, we also got DOIs on RFCs, so it is inevitable there are a lot of diffs.  
> 
> It is not inevitable as shown by the fact that I didn't run into that issue.   It's kind of nice to know that there was an issue out there that I didn't run into :-)
> 
> For reasons I  really don't understand, the xml for rfc5661 does not include rfc reference from external libraries.   It includes them inline, so a new rfc derived from that xml  file will not include DOIs.  

Yes.  All these RFC references would be updated by the RFC editor into current references.

> That is not a problem for Adam's procedure, but it may be for the IESG or the RFC editor.   I hope that, in processing RFC’s using Adam's procedure, people will overlook the lack of DOIs in the same way that they overlook other aspects of the document that would prevent a new document of that form from being published.

AFAICT, they can’t, as the RFC editor has committed to providing DOIs.

> > Of course, I would not recommend directly authoring in XML these days 
> 
> Why not?

Because:

> > (there are now good markdown and asciidoc choices, as well as an org-mode one if that is your thing), 
> 
> Where are these documented?

These are not maintained by the IETF, so they are documented wherever their authors chose to document them.

Try http://rfc.space and the links emanating from that (well, maybe re-consult in a day or so; I just noticed the links are rather incomplete as of today).

> > but that was the way things were done in 2010.  
> 
> I’m prepared to stick with that, unless there is something better about the alternatives.

Right, for a minor update, digging out the v1 tools and finding a platform where they can still run may actually be the best way to proceed.

> > (If you do want to make the transition, there are some conversion tools available; I may be able to help.)
> 
> I would only make a transition for new documents.

I have helped people transition the authoring format of quite well-advanced documents, and it seems it actually helped them complete them.  (A routine check after transition is whether the formatted output is the same, and it typically is, except occasionally for minor differences in hyphenation and spacing caused by different versions of xml2rfc being in use.)
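
The routine check can be approximated with a few lines of Python.  This is a sketch under my own assumptions, not the procedure actually used: the names normalize and differences are invented, and the normalization is a blunt heuristic that drops every intra-word hyphen (so "well-formed" and "wellformed" compare equal) in order to ignore hyphenation and spacing changes.

```python
import difflib
import re


def normalize(text: str) -> str:
    """Remove hyphenation and spacing differences before comparing."""
    # Drop hyphens between word characters, including across line breaks,
    # so "docu-\n   ment", "docu-ment" and "document" all look the same.
    text = re.sub(r"(?<=\w)-\s*(?=\w)", "", text)
    # Collapse every run of whitespace (including newlines) to one space.
    return re.sub(r"\s+", " ", text).strip()


def differences(old: str, new: str):
    """Return word-level changes that survive normalization."""
    a, b = normalize(old).split(), normalize(new).split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [(tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


if __name__ == "__main__":
    before = "This docu-\n   ment is  well-formed.\n"
    after = "This document is well-\n   formed.\n"
    print(differences(before, after))
```

An empty result means the two formatted outputs differ only in hyphenation and spacing; anything else deserves a look by a human.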

> For documents to be processed according to Adam’s procedure, the likelihood of minor diffs arising is such that I don't think a transition is possible.

There is no way the v2 tool will create the same formatting that the v1 tool did, so I think minor diffs are nearly impossible to avoid anyway.

> For a later full bis, I would be replacing major sections of an existing xml-written document which would make any transition especially difficult.
> 
> For completely new documents,my concerns would be to make sure that the needs of those who might later be called upon to update the document were taken account of.   Relevant issues: 
> 	• Could we be sure that the new source could be processed at a later time, and give rise to the same xml?

Different authoring tools may differ in their priority on forward compatibility.
The one I can talk about authoritatively has created the same formatted documents for more than 8 years now, but not necessarily exactly the same xml.

> 	• Would the xml produced by a new tool be easily human-editable? 

For all of them that I have looked at in some detail: yes.
But you wouldn’t want to do this (as with any compiler, you would edit the input), except if you want to migrate to a different authoring tool (which is an important option that is easily provided by the common XML format).

Grüße, Carsten
