Re: [rfc-i] XInclude usage

Jay Daley <jay@ietf.org> Thu, 05 August 2021 02:44 UTC

From: Jay Daley <jay@ietf.org>
Message-Id: <798399E2-7B6B-44DE-97C4-B186219D3098@ietf.org>
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.100.0.2.22\))
Date: Thu, 05 Aug 2021 14:44:04 +1200
In-Reply-To: <8e3b0d82-fb22-4c99-924a-1889b4045343@mozilla.com>
To: Peter Saint-Andre <stpeter@mozilla.com>
References: <8e3b0d82-fb22-4c99-924a-1889b4045343@mozilla.com>
X-Mailer: Apple Mail (2.3654.100.0.2.22)
Subject: Re: [rfc-i] XInclude usage
Cc: RFC Interest <rfc-interest@rfc-editor.org>

Hi Peter

One of my high-level concerns about RFC XML is that it should use a formally defined syntax and thereby support validation by a range of common XML tools, not just xml2rfc.  Another is that if we’re going to use XML then we should use real XML and not some custom simulacrum that looks like it but doesn’t walk like it, because otherwise we lose the power of the many XML processing tools out there.

In that regard, imposing any limitations on XInclude is problematic because it means that either only xml2rfc can validate a document, or we need to define our own inclusion mechanism (e.g. RFCInclude) that can be validated.  My strong preference would therefore be not to limit XInclude as proposed in your a., b. and c. below.
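
To make concrete what a., b. and c. would ask of a tool beyond stock XInclude processing, here is a rough Python sketch of such a profile check.  It is only an illustration: the function name and the message strings are made up for this example, and nothing like it is claimed to exist in xml2rfc.

```python
import xml.etree.ElementTree as ET

# Namespace of XInclude 1.0 elements, in ElementTree's {uri}local form.
XI = "{http://www.w3.org/2001/XInclude}"


def profile_violations(xml_text):
    """Return a list of ways a document steps outside the proposed a./b./c. limits.

    Hypothetical helper: it only inspects the unexpanded document and does
    not perform any inclusion itself.
    """
    root = ET.fromstring(xml_text)
    found = []
    for elem in root.iter():
        if elem.tag == XI + "include":
            # a. only parse="xml" (the XInclude default) would be allowed
            if elem.get("parse", "xml") != "xml":
                found.append("a: parse is not xml")
            # c. the xpointer attribute would not be supported
            if elem.get("xpointer") is not None:
                found.append("c: xpointer used")
        elif elem.tag == XI + "fallback":
            # b. xi:fallback would not be supported
            found.append("b: fallback used")
    return found
```

A generic XInclude-aware tool performs no such check, so a document that it accepts could still be rejected under a restricted profile.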

I'm also not clear on why we need to limit XInclude this way.  When XInclude is used, it introduces a double validation step: one before the XInclude is processed and one after it is processed.  This can’t really be avoided.  If XInclude is used incorrectly then the second validation will pick that up.  For example, DocBook does not include XInclude in its syntax [1] and only validates the XML that is constructed *after* any XInclude statements are processed and the contents inserted.
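
As a rough illustration of the two validation points, the Python sketch below uses the standard library's xml.etree.ElementInclude.  The in-memory file table, the loader and the toy check_references function are assumptions standing in for real files and a real schema validator.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

XI_NS = "http://www.w3.org/2001/XInclude"

# Hypothetical in-memory "files" so the example needs no disk access.
FILES = {
    "good.xml": '<reference anchor="EXAMPLE"><front><title>Example</title></front></reference>',
    "bad.xml": "<note>not a reference</note>",
}


def loader(href, parse, encoding=None):
    """Loader for ElementInclude: an element for parse="xml", else raw text."""
    text = FILES[href]
    return ET.fromstring(text) if parse == "xml" else text


def check_references(root):
    """Toy stand-in for schema validation (an assumption, not a real schema):
    every child of <references> must be a <reference> with an anchor."""
    problems = []
    for refs in root.iter("references"):
        for child in refs:
            if child.tag != "reference" or "anchor" not in child.attrib:
                problems.append(child.tag)
    return problems


def expand_and_check(href):
    """Run the toy check before and after XInclude expansion."""
    doc = ET.fromstring(
        f'<rfc xmlns:xi="{XI_NS}"><back><references>'
        f'<xi:include href="{href}"/>'
        f'</references></back></rfc>'
    )
    before = check_references(doc)   # still sees the xi:include element
    ElementInclude.include(doc, loader=loader)
    after = check_references(doc)    # sees only the included content
    return before, after
```

The check before expansion trips over the xi:include element itself unless the schema allows for it, while the check after expansion sees only the included content and so catches an include that pulls in the wrong thing.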

However, we really should examine whether we need XInclude at all.  A good example is the way we include IPR boilerplate, which is done via an attribute rather than any form of XML inclusion.  (BTW, by doing this we have effectively created two separate XML vocabularies, as Mark was highlighting, and forced the use of our custom processor, xml2rfc, to get from one to the other.)  We could take a similar approach for BibXML references and define a syntax whereby they are simply identified by name and our custom processor inserts them at the right stage.  We already have a 'database' of BibXML references for which this name could serve as the key, and we could provide a mechanism for people to add new references to it if needed.
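
A minimal Python sketch of that name-based approach follows.  The <refname key="..."/> placeholder element, the function name and the one-entry dictionary standing in for the BibXML database are all hypothetical; no such syntax is defined today.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the BibXML reference database, keyed by name.
BIBXML_DB = {
    "RFC7991": (
        '<reference anchor="RFC7991"><front>'
        '<title>The "xml2rfc" Version 3 Vocabulary</title>'
        '</front></reference>'
    ),
}


def resolve_named_references(root, db=BIBXML_DB, tag="refname"):
    """Replace each <refname key="..."/> placeholder with its database entry.

    Returns the keys that could not be resolved; their placeholders are
    left in the tree untouched.
    """
    missing = []
    # Snapshot the elements first so the tree can be edited while looping.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != tag:
                continue
            key = child.get("key")
            if key not in db:
                missing.append(key)
                continue
            entry = ET.fromstring(db[key])
            entry.tail = child.tail      # keep surrounding whitespace
            parent[index] = entry        # swap the placeholder for the entry
    return missing
```

With something like this, the source document carries only the name, and the processor decides at which stage the full reference is inserted.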


> On 5/08/2021, at 5:51 AM, Peter Saint-Andre <stpeter@mozilla.com> wrote:
> 
> Back in 2017, Julian Reschke asked [1] that we clarify the scope of
> XInclude [2] usage, to wit:
> 
> ###
> 
> 1. It should be clarified how much of XInclude needs to be supported.
> I'm mentioning this because RFC7991 says that includes can not happen
> where no elements are allowed, but XInclude allows inclusion of plain
> text as well.
> 
> 2. Is support of the xpointer attribute required?

see above

> If so, do we have any
> guidance how this will work when the document from which to include uses
> xml2rfc format, and the pointer relies on the id-ness of an attribute?

I need to think this through more, but it would be useful to understand the use case.  Given that RFCs are immutable, is this only for I-Ds, and why are people doing it?

[1]  http://www.sagehill.net/docbookxsl/ValidXinclude.html

Jay 

> 
> ###
> 
> We discussed this recently [3] within the RFC XML and Style Guide Change
> Management Team. Our understanding is that x:include is used primarily
> or perhaps exclusively to pull in reference files structured as XML. In
> order to keep things as simple as possible, my suggestion to the team
> and to this list is as follows:
> 
> a. Limit the usage of XInclude to XML only (parse="xml"), not arbitrary
> text (parse="text")
> 
> b. Don't support fallback [4]
> 
> c. Don't support XPointer [5]
> 
> I'm curious what others think.
> 
> Peter
> 
> [1] https://github.com/rfc-format/draft-iab-xml2rfc-v3-bis/issues/18 see
> also https://github.com/rfc-format/draft-iab-xml2rfc-v3-bis/issues/128
> [2] https://www.w3.org/TR/2006/REC-xinclude-20061115/
> [3] https://codimd.ietf.org/cmt-20210726
> [4] https://www.w3.org/TR/2006/REC-xinclude-20061115/#fallback
> [5] https://www.w3.org/TR/2003/REC-xptr-framework-20030325/
> 
> _______________________________________________
> rfc-interest mailing list
> rfc-interest@rfc-editor.org
> https://www.rfc-editor.org/mailman/listinfo/rfc-interest
> 

-- 
Jay Daley
IETF Executive Director
jay@ietf.org
