Evolving document sources over a long time (Re: Comments on draft-roach-bis-documents-00)

Carsten Bormann <cabo@tzi.org> Sat, 11 May 2019 06:10 UTC

Subject: Evolving document sources over a long time (Re: Comments on draft-roach-bis-documents-00)
From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <CADaq8jdRMUZAN3rRXoActXqvGpkgx_-kW67uwzGLtVPoh7LfAQ@mail.gmail.com>
Date: Sat, 11 May 2019 08:10:03 +0200
Cc: IETF Discussion Mailing List <ietf@ietf.org>, RFC Interest <rfc-interest@rfc-editor.org>
Reply-To: RFC Interest <rfc-interest@rfc-editor.org>
To: David Noveck <davenoveck@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/fBwnY_-q66ILny1AZhWNYuzuYGM>

Hi David,

Thank you for your article on ietf@ietf.org; it is really a little treasure trove.

Let me extract the parts that pertain to your experiences with the evolution of RFCXML and add rfc-interest; please continue the discussion on rfc-interest (which you may want to subscribe to if RFC authoring is part of your life).

> On May 11, 2019, at 01:31, David Noveck <davenoveck@gmail.com> wrote:
> 
> Apparently an .xml file used to produce an rfc is not necessarily acceptable to later versions of xml2rfc.  You can get an idea of the issues by looking at rc5661Base.xml and the file I wound up with, rfc5661Ready.xml; both are attached.

Most of your changes have to do with the better validation in the current tools, namely:

— validating the syntax of anchors (no spaces, plus signs)
— validating artwork.

For the latter, doing a wholesale <![CDATA[ … ]]> is always a better approach than sprinkling &amp;/&lt; (you almost never need &gt;, by the way).  (See also authoring tools below.)
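
To illustrate why the two spellings are interchangeable as far as an XML parser is concerned, here is a minimal Python sketch using only the standard library. The artwork content is made up for the example, and the code says nothing about how xml2rfc itself processes artwork; it only shows that both forms parse to the same text, and that a bare > is accepted.

```python
import xml.etree.ElementTree as ET

# Hypothetical artwork content, chosen because it contains '<', '&' and '>'.
# Variant 1: every '<' and '&' escaped by hand ('>' is left alone on purpose).
ESCAPED = "<artwork>if (a &lt; b &amp;&amp; c > d) return;</artwork>"

# Variant 2: the same content wrapped wholesale in a CDATA section.
CDATA = "<artwork><![CDATA[if (a < b && c > d) return;]]></artwork>"


def artwork_text(xml_source: str) -> str:
    """Parse a single <artwork> element and return its text content."""
    return ET.fromstring(xml_source).text


if __name__ == "__main__":
    # Both variants should yield the identical string after parsing.
    print(artwork_text(ESCAPED) == artwork_text(CDATA))
```

The one thing a CDATA section cannot contain is the literal sequence ]]>, which would end it early.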

As for the anchor syntax: it is just too bad the v1 tool didn’t check that.
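
If you want to find the offending anchors before the tool complains, a small pre-check along these lines may help. This is only a sketch: the regular expression is an ASCII-only approximation of an XML name (the real rule also admits many non-ASCII characters), the function name and sample document are invented for the example, and it is not the check xml2rfc actually performs.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

# ASSUMPTION: ASCII-only approximation of an XML name without colons.
# Letters or underscore first, then letters, digits, '.', '_' or '-'.
ANCHOR_RE = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def bad_anchors(xml_source: str) -> List[str]:
    """Return, in document order, all anchor values that fail the pattern."""
    root = ET.fromstring(xml_source)
    found = []
    for element in root.iter():
        anchor = element.get("anchor")
        if anchor is not None and not ANCHOR_RE.fullmatch(anchor):
            found.append(anchor)
    return found


# Hypothetical input with one good anchor and two of the kind that
# older sources tend to contain (a space, a plus sign).
SAMPLE = (
    '<rfc>'
    '<section anchor="intro"/>'
    '<section anchor="open state"/>'
    '<section anchor="read+write"/>'
    '</rfc>'
)

if __name__ == "__main__":
    for anchor in bad_anchors(SAMPLE):
        print("suspicious anchor:", repr(anchor))
```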

> I was able to get an .xml from which a .txt file could be generated, but despite my best efforts there were differences (more than a few) between it and rfc5661, which your document would consider to be "spurious" but appear to be unavoidable. In any case they are not gratuitous changes and should not interfere with consideration of the document.   The majority of the diffs arise from the following issues:
> 	• Despite the fact that the xml for the reference sections of both xml files is identical and the processing options are identical (symrefs="no" sortrefs="yes"), the reference ids in rfc5661Ready.txt and in rfc5661 are different, so that rfcdiff shows every line containing a reference as part of a diff.  Apparently, different versions of xml2rfc use different approaches to sorting references.

Protip: DO NOT USE numeric references.  Ever.  This was stylistically appealing for some tiny documents, but rarely is appropriate for actual specifications (and certainly not for NFSv4-sized ones!).
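
To make the renumbering effect concrete, here is a small Python sketch. The anchor names and the two sort orders (case-sensitive versus case-insensitive) are assumptions made up for the illustration; I have not checked which ordering any particular xml2rfc version uses. The point is only that numeric labels depend on the sort order, while symbolic labels do not.

```python
def numeric_labels(anchors, key=None):
    """Number the references 1..n in sorted order, like symrefs="no"."""
    ordered = sorted(anchors, key=key)
    return {anchor: f"[{i}]" for i, anchor in enumerate(ordered, start=1)}


def symbolic_labels(anchors):
    """Label each reference by its own anchor, like symrefs="yes"."""
    return {anchor: f"[{anchor}]" for anchor in anchors}


# Hypothetical reference anchors.
ANCHORS = ["RFC2119", "nfsv4_old", "Kerberos"]

# Two plausible but invented sort orders standing in for two tool versions.
old = numeric_labels(ANCHORS)                  # case-sensitive
new = numeric_labels(ANCHORS, key=str.lower)   # case-insensitive

# Every anchor in this list shows up as a diff at each place it is cited.
changed = sorted(a for a in ANCHORS if old[a] != new[a])

if __name__ == "__main__":
    print("numeric labels that changed:", changed)
    print("symbolic labels:", symbolic_labels(ANCHORS))
```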

> 	• There are a fair number of differences that seem to have arisen because the RFC editor made minor corrections directly on the .txt file, so that each such correction (while valid) is reported by rfcdiff as a difference.

Right.  That will be less of a problem in the future, but it does require tediously porting back changes to the document source.  

> 	• The reference sections are exposed to the same sorting issues as the reference ids.  To the naked eye, and to rfcdiff, they look very different, despite the fact that they contain the same references.

The sorting issue should be taken care of by not using numeric references (really, for the sake of your readers, please don’t).
Since RFC 5661, we also got DOIs on RFCs, so it is inevitable there are a lot of diffs.  Again, tedious, but not really avoidable.

Of course, I would not recommend directly authoring in XML these days (there are now good markdown and asciidoc choices, as well as an org-mode one if that is your thing), but that was the way things were done in 2010.  (If you do want to make the transition, there are some conversion tools available; I may be able to help.)

Grüße, Carsten