Re: [Tools-discuss] Why post text and not XML? (was: I-D statistics)

John C Klensin <john-ietf@jck.com> Sun, 17 March 2024 00:09 UTC

Date: Sat, 16 Mar 2024 20:08:55 -0400
From: John C Klensin <john-ietf@jck.com>
To: Brian E Carpenter <brian.e.carpenter@gmail.com>, Michael Richardson <mcr+ietf@sandelman.ca>, Carsten Bormann <cabo@tzi.org>, tools-discuss <tools-discuss@ietf.org>
Message-ID: <E97D1FFB4A1584DE3710E14A@PSB>
In-Reply-To: <60f18950-a2e0-16be-3a05-33f9a637062d@gmail.com>
References: <1952067F-6467-4BEC-9CA5-BB8B16FA662B@tzi.org> <14807.1709682543@obiwan.sandelman.ca> <effb521c-1e20-cff8-acd3-17212a6b3fb9@gmail.com> <447A96F55A3D36851570B3B6@PSB> <60f18950-a2e0-16be-3a05-33f9a637062d@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tools-discuss/UlRSMUwTn_IN9fU5DeTUj_MHoqk>


--On Sunday, March 17, 2024 09:36 +1300 Brian E Carpenter
<brian.e.carpenter@gmail.com> wrote:

> John,
> 
> Thanks for explaining.
> 
> In line...
> 
> 
> On 17-Mar-24 07:51, John C Klensin wrote:
>> 
>> 
>> --On Saturday, March 16, 2024 17:13 +1300 Brian E Carpenter
>> <brian.e.carpenter@gmail.com> wrote:
>> 
>>> ...
>>> More confusingly, there are still a few current drafts
>>> submitted as txt only. No XML at all. I wonder why we still
>>> allow that, and what tools people are using. 
> 
> I certainly don't care about the XML as such; what I miss
> is the HTML version, which is much nicer to read than either
> the plain text or the HTMLized version. My problem statement
> was too simple.
> 
> Is it possible for you to submit the .txt and the .html
> versions?

Have not investigated that, but maybe.  Can try to look into it
when I have time.  The one difficulty is that, if someone is
going to hold me responsible for the HTML being nicely
formatted, that could easily be another matter because, for a
long document, I don't even feel like taking the time to check
it (an issue that has come up with the RPC during AUTH48 as well).

>>> Doesn't this
>>> create busywork if the document progresses?
>> 
>> As one of the offenders, I think I have explained this before
>> but let me do it again.  Short answer: it is a side-effect of
>> work that makes the document development process in WGs much
>> more efficient in part because it makes "we have been over
>> that before, in YYYYMM, and reached those conclusions
>> because..." input much easier and more accessible.  It also
>> makes preparation of final change summaries and
>> acknowledgments easier and more accurate.  The "tools" are an
>> emacs-clone editor with an XML mode and a handful of personal
>> macros and templates.   The only "busywork" is stripping that
>> stuff out just before the XML is handed to the RPC.

> Good. I was concerned about the RPC having to synthesize the
> XML from a plain text submission.

A different problem entirely even though we do allow that.  But
I believe I have always made a sanitized version of my XML
available to them or at least offered to do so.

>>     =================
>> 
>> For anyone interested and in the hope of not having to repeat
>> this again...
>> 
>> Especially for long, complex, and long-lived documents,
>> especially those that are replacements, significant updates
>> for earlier documents, or merges of others, I use extensive
>> comments in the XML to track changes and decisions.   Other
>> comments are used to provide information to, or prepare for
>> discussions with, the RPC about why particular text phrasing
>> and constructions or document organizations were chosen, etc.
>> With one current document, those comments add up to more than
>> 30% of the size of the XML file.  Some of those comments are
>> over 20 years old and have been carried forward from xml2rfc
>> v1 files associated with previous documents.

> Understood. The "modern" approach is of course to embed such
> comments in GitHub issues, which tends to lead to
> self-censorship
> of any "unkind" comments, and then the nit-picking takes place
> on GitHub too.

Setting aside my issues with GitHub --most of which also have to
do with acting to exclude people who are not very active
participants in WGs that have been discussed as part of the
"pre-meeting document posting deadlines" thread on the IETF
list-- that approach asks for an additional investment of author
time that does not contribute directly to the quality of the
resulting document.

>> Why not just post all of that information?  Because, given
>> experience with the IETF community, it would be only a matter
>> of time before someone, probably several someones, decided to
>> nit-pick details of the comments or complain about the
>> incorrect or unkind terminology in some of them, even some of
>> the 20-year-old ones.  They are personal notes and neither I
>> nor the community would need the wasted time that could be
>> spent on the substantive parts of the document or on other
>> work.  For those and other reasons, I don't want to share
>> those comments, and hence the XML, with the community.  Too
>> many of them are personal notes.  They could be edited into
>> generally acceptable forms, but it would take significant
>> work and time that could be better spent in other ways.
 
>> When the I-D is approved and handed off to the RPC for
>> production and publishing, I prepare a version of the RFCXML
>> file that contains only the comments that are likely to be
>> helpful to the RPC or in conversations with them.   Certainly
>> those that are relevant only to prior RFCs that the new
>> document replaces are gone.  However that is a
>> comment-by-comment editing job that typically takes some
>> hours, not something I want to do with every I-D posting.
 
>> Could I establish conventions within the comments that would
>> permit automatic removal such that there would be the
>> copy/version I work on and an easily generated redacted one I
>> could post?  Yes, probably, but that would be extra work too.
>> Moreover, over the quarter-century since xml2rfc was
>> introduced, my conventions have changed -- I have even used
>> different conventions during the WG I-D development period
>> and during IETF LC.  If I had perfect foresight around 2000,
>> maybe, but...
>> 
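A minimal sketch of what such a comment convention could look like, offered only as an illustration. It assumes, hypothetically, that comments meant to survive redaction begin with an `RPC:` marker and that everything else is a private note. The marker, the function name `redact_comments`, and the sample text are all invented here; nothing in this thread says anyone uses this scheme. It works on the raw text with a regular expression, so it does not account for comment-like sequences inside CDATA sections.

```python
import re

# Hypothetical convention (an assumption, not anyone's actual practice):
# a comment whose text starts with "RPC:" is kept; any other comment is
# treated as a private note and removed.
KEEP_PREFIX = "RPC:"

# Non-greedy match of a whole XML comment, including multi-line ones.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def redact_comments(xml_text, keep_prefix=KEEP_PREFIX):
    """Return xml_text with every comment removed except marked ones."""
    def replace(match):
        body = match.group(1).strip()
        # Keep the comment verbatim if it carries the marker, else drop it.
        return match.group(0) if body.startswith(keep_prefix) else ""
    return COMMENT_RE.sub(replace, xml_text)


sample = """<section>
<!-- RPC: the phrasing below is deliberate -->
<t>Text.</t><!-- private: settled in the WG, see old notes -->
</section>"""

redacted = redact_comments(sample)
print(redacted)
```

A scheme like this only helps if every comment follows the marker rule from the start, which is the foresight problem described above: comments written under older conventions would still need to be reviewed by hand.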
>> So, not going to happen.  And if someone makes a rule that the
>> XML must (MUST?) be posted, I've authored or edited my last
>> long and/or complex and/or updating or replacing document.
>> Find someone else to do it.  Or conclude the IETF is not
>> interested in the topic area any more.
 
> Rules of course are made to be bypassed.

Which, to the extent to which there is already moaning about my
not posting XML for I-Ds, is what I am doing by posting the text
only.  Of course, those whom we still allow to prepare documents
using MSWord, nroff, or other tools that don't involve editing
XML or an XML intermediary have an even better problem.  And,
IMO, deciding we want to drive some or all of them away or tell
them they can't write I-Ds and get them posted would not be
exactly welcoming, newcomer-friendly, open, or many of the other
things we claim to be.   And, IMO, not an issue that can or
should be resolved on this particular list.

>> Maybe I'm the only one, but I suspect I'm not.
 
> I don't know. Your usage is rational but I'm genuinely unsure
> about the ones that have crossed my screen lately.

As suggested above, I'm guessing that we still have some people
preparing documents in non-RFCXML forms or using mechanisms that
do not have it as an intermediary.  While I note that following
the path from the IETF home page through to
https://authors.ietf.org/choosing-a-format-and-tools leads to a
statement "3. Drafting in any other format is not recommended
for new authors." (a statement I don't remember anyone asking
the community about and determining consensus), some authors are
not new and even that page goes on to list, under "Full list of
authoring formats", several other formats (including plain text
submissions and word processors).  FWIW, I have some
other quibbles about the content of that page, but will save
them until after this week and, if everyone is lucky, not
remember them by then.

best,
   john