Re: [xml2rfc] Differentiating prepped from unprepared documents

Brian E Carpenter <brian.e.carpenter@gmail.com> Wed, 22 June 2022 21:45 UTC

Message-ID: <d8b7991c-344a-5741-afd8-52f5db1f8f50@gmail.com>
Date: Thu, 23 Jun 2022 09:45:23 +1200
To: Jay Daley <exec-director@ietf.org>, xml2rfc@ietf.org
Cc: Julian Reschke <julian.reschke@gmx.de>, John Levine <john.levine@standcore.com>
References: <222A7719-518F-45FB-BE05-A3CA1172DD9B@ietf.org>
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
In-Reply-To: <222A7719-518F-45FB-BE05-A3CA1172DD9B@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/xml2rfc/O6HacSlyumXJtUuIoN6oOQT4xug>
Subject: Re: [xml2rfc] Differentiating prepped from unprepared documents

From the viewpoint of a user, this sounds very sensible. However, I have a couple of questions:

1. (minor) Will markup that is only of use in an I-D look as it does now? For example:
<note removeInRFC="true">

2. (substantive) A common need when working on a bis document is to retrieve the *unprepped* XML as opposed to the canonical prepped XML. Ideally it should contain exactly what would have been in a final I-D that never actually existed: all the user's non-default choices, but also the text edits made by the RFC Editor. Can that need be satisfied?

Regards
    Brian Carpenter

On 23-Jun-22 00:09, Jay Daley wrote:
> A group of us have been discussing [1] an issue we see with the current grammar and I would like to continue that discussion on list, but I thought I should give some background first for others on this list.
> 
> The issue is that there are certain elements and attributes of the grammar that are only used by the prep tool and yet are 'visible' in the grammar specification, though not consistently, and are known to cause author confusion.  For those who are not clear on this, the 'prep tool' is a specific function of xml2rfc that adds various RFCXML markup [2] to a document as a key step in converting an I-D into an RFC.  What the five of us have been discussing is how we separate the prepped and unprepped grammar in order to make it significantly easier for any reader (human or machine) to distinguish between the two.
> 
> A good example of the problem is the <toc> (table of contents) element, which is added by the prep tool.  This is included in the documentation (https://authors.ietf.org/en/rfcxml-vocabulary#toc-1) but with the warning "As a document author, you should not use <toc> directly".  However, the "pn" attribute, which the prep tool adds to multiple elements and which is the target of the table of contents entries, is not covered in the documentation.
> 
> Julian has suggested that we use a namespace extension for the prepped grammar.  That would mean that the prep tool would add a namespace attribute, such as xmlns:prep="http://www.rfc-editor.org/prep-namespace", to the <rfc> element and then all markup inserted by the tool would be prefixed with "prep" ("prep" is just an example prefix) such as the <prep:toc …> element and "prep:pn" attributes.  There would still only be one grammar file (i.e. rfcXXXX.rnc) but that would be amended to specify the namespace of each element/attribute.
> 
> Personally, I think this is the best way forward given where we are now, though I would have preferred us to have started differently.  Ideally we would have more clearly differentiated between I-Ds and RFCs in the authoring format to match the distinction that is so apparent in how these documents are used.  That would mean having two different grammar files, one for I-Ds with a root element <i-d> and one for RFCs with a root element of <rfc>, and with the prep tool converting between the two.  However, moving to that now seems excessive.
> 
> The only practical downside of the namespace solution is that it constricts I-D grammar to always being a subset of RFC grammar (as the latter is implemented by an extension) and so it makes it messy to implement such things as including annotations in the grammar that are only for use in I-Ds and not RFCs.
> 
> Jay
> 
> [1]  https://github.com/ietf-tools/xml2rfc/issues/791
> [2]  https://www.rfc-editor.org/rfc/rfc7998.html
>
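
To make the namespace proposal quoted above concrete, here is a minimal sketch of how a machine reader might tell prepped from unprepped XML if prep-only markup carried its own namespace. Everything in it is an assumption for illustration: the namespace URI and the "prep" prefix are the example values from Jay's message, the sample documents are invented, and the helper names is_prepped and strip_prep_markup are hypothetical and not part of xml2rfc.

```python
import xml.etree.ElementTree as ET

# Assumption: the example namespace URI from the quoted proposal, not a
# registered or agreed value.
PREP_NS = "http://www.rfc-editor.org/prep-namespace"
PREP_TAG_PREFIX = "{" + PREP_NS + "}"  # ElementTree's "{uri}local" notation


def is_prepped(xml_text: str) -> bool:
    """Return True if any element or attribute is in the prep namespace."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.tag.startswith(PREP_TAG_PREFIX):
            return True
        if any(name.startswith(PREP_TAG_PREFIX) for name in elem.attrib):
            return True
    return False


def strip_prep_markup(xml_text: str) -> str:
    """Drop prep-namespace elements and attributes (hypothetical helper).

    This only removes markup. It does not restore an author's original
    non-default choices, and any text following a removed element is lost.
    """
    root = ET.fromstring(xml_text)
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag.startswith(PREP_TAG_PREFIX):
                parent.remove(child)
        for name in list(parent.attrib):
            if name.startswith(PREP_TAG_PREFIX):
                del parent.attrib[name]
    return ET.tostring(root, encoding="unicode")


# Invented sample documents, far smaller than any real I-D or RFC.
UNPREPPED = "<rfc><middle><section><name>Intro</name></section></middle></rfc>"
PREPPED = (
    '<rfc xmlns:prep="http://www.rfc-editor.org/prep-namespace">'
    "<prep:toc/>"
    '<middle><section prep:pn="section-1"><name>Intro</name></section></middle>'
    "</rfc>"
)

if __name__ == "__main__":
    print(is_prepped(UNPREPPED))
    print(is_prepped(PREPPED))
    print(is_prepped(strip_prep_markup(PREPPED)))
```

Under this assumption a reader needs no list of prep-only element and attribute names, since membership in the namespace is the whole test. Note that stripping the markup this way leaves the document text untouched, so RFC Editor text edits would survive, but it cannot by itself recover anything the prep tool changed outside the namespace.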