Re: [Tools-discuss] xml2rfc in --v2 mode -- bug report?
Jay Daley <exec-director@ietf.org> Mon, 13 June 2022 21:36 UTC
From: Jay Daley <exec-director@ietf.org>
Message-Id: <5C2BC1A3-7DAE-4AB3-ABE3-19AB161D38BA@ietf.org>
Content-Type: multipart/alternative; boundary="Apple-Mail=_67D18536-29C9-4230-A6BF-158BC2ECCA82"
Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3696.100.31\))
Date: Mon, 13 Jun 2022 23:36:26 +0200
In-Reply-To: <280CEA676989D89FF1789101@PSB>
Cc: tools-discuss@ietf.org
To: John C Klensin <john-ietf@jck.com>
References: <B39D28F0353AE74800217ADC@PSB> <7EDFAAE2-3109-4D16-BC16-1A47DB365522@ietf.org> <E022AAF289DF04D70F449FF7@PSB> <49687028-4FF4-44D1-A3D3-79FDF670A5A1@ietf.org> <280CEA676989D89FF1789101@PSB>
X-Mailer: Apple Mail (2.3696.100.31)
Archived-At: <https://mailarchive.ietf.org/arch/msg/tools-discuss/bKfOOqMRCA4RVMwXw8LsrL0fyIg>
Subject: Re: [Tools-discuss] xml2rfc in --v2 mode -- bug report?
> On 13 Jun 2022, at 17:42, John C Klensin <john-ietf@jck.com> wrote:
>
> Jay,
>
> This is very helpful even though I think, for the person who is
> trying to get work done rather than being steeped in the theory,
> the vocabulary - schema and PI distinctions (including the
> additional distinction in Carsten's recent note) might be
> considered fussy details. I don't believe (but have not gone
> back and checked) that any of the definitions say something
> equivalent to "don't bother to try to understand this without
> first becoming very familiar with the details of XML terminology
> and distinctions".

Actually John, the only reason I had to explain that is because you
took my recommendation about adding a PI as meaning that this was
somehow creating a v2/v3 hybrid. Someone less worried about the
formalities might not need that level of detail. Having said that, XML
is indeed very complex and does come with a lot of baggage which we are
all exposed to at times - but that ship has sailed.

> As to the transition, you may reasonably disagree either about
> the principle or about the boundary point but, from my
> perspective, the purpose of tools like the RFCXML definition and
> xml2rfc is to let people contributing to the IETF get work
> done without unreasonable barriers (including sorting out or
> working around significant bugs). From that perspective, it
> would be a grave disservice to the community to declare v2
> support at an end until xml2rfc is actually stable and
> relatively problem-free. As long as people, such as Carsten,
> who have been paying far more attention to this than I have, are
> mentioning hundreds of "issues" as significant or making
> comments about keeping the newer versions of things alive, we
> are not near "relatively problem-free" yet.

I do not agree with the view that xml2rfc is not ready and I think that
does a disservice to the work put into it over the years.
There are indeed anomalies in the grammar that are a bit more
fundamental than xml2rfc but those are issues with the RFC (mostly due
to hindsight) and not the tool.

> Finally and most important, we should all remember that there
> has never been a publicly stated expectation that everyone in
> the IETF who might need to write a document or use other tools
> will be on this list. I have no way to know, but I assume that
> only a very small fraction of those contributors are watching
> the list carefully. Information such as that below --if you
> and others think it is as important as it appears to be-- should
> be on an easily found web page somewhere and the community
> pointed to it, including in a normative reference from the
> revised vocabulary document.

Good job we have https://authors.ietf.org which meets all of your
criteria above.

Jay

> thanks,
> john
>
>
> --On Monday, June 13, 2022 15:21 +0100 Jay Daley
> <exec-director@ietf.org> wrote:
>
>> Hi John
>>
>> From reading your message I think I need to start with a clear
>> taxonomy of the various moving parts here because this is
>> still not clear to most participants, afaict:
>>
>> 1. All XML languages define what elements and attributes are
>> acceptable within that language and what element can appear
>> inside what other element. This is what we call RFCXML and
>> what was previously called the xml2rfc vocabulary. This is
>> more generally called the *grammar*. The grammar is defined
>> in RFCs.
>>
>> 2. We have chosen to use another language to formally define
>> the grammar above. For v1, defined in RFC 2629, the formal
>> definition used a DTD. For v2, defined in RFC 7749, this was
>> specified in a RelaxNG schema and not a DTD. For v3, defined
>> in RFC 7991, this was also defined in a RelaxNG schema. This
>> is more generally called the *schema*.
>> At some point work was put into changing rfc2629.dtd to make it
>> compliant with v2 (possibly even v3) but I believe it does not
>> (and cannot, because of the limitations of DTDs) correctly
>> define the v2 grammar.
>>
>> 3. There is an XML construct called a *processing
>> instruction* (aka a PI), which is embedded inside an XML
>> document and provides instructions that are to be interpreted
>> by any XML processor. These are not part of the grammar and
>> therefore cannot be part of the formal definition. To repeat
>> myself - PIs sit outside of the grammar, they are conceptually
>> similar to escape codes in that respect. DOCTYPE is a
>> processing instruction, as are <?xml-model…> and
>> <?xml-stylesheet…>. None of the above RFCs have
>> comprehensively covered PIs - RFC 2629 does not mention them
>> at all, RFC 7749 notes that certain things are set by PIs but
>> does not define them and RFC 7991 notes that certain grammar
>> changes are intended to deprecate some PIs but doesn't
>> formally define those.
>>
>> With that in mind ...
>>
>>> On 12 Jun 2022, at 18:05, John C Klensin <john-ietf@jck.com>
>>> wrote:
>>>
>>>
>>> --On Sunday, June 12, 2022 16:43 +0100 Jay Daley
>>> <exec-director@ietf.org> wrote:
>>>
>>>> Hi John
>>>>
>>>>> On 11 Jun 2022, at 20:29, John C Klensin <john-ietf@jck.com>
>>>>> wrote:
>>>>>
>>>>> Hi. I have an old document, in xml2rfc v2 format, whose
>>>>> content I'm trying to upgrade. When I run
>>>>>     xml2rfc DocName.xml --v2
>>>>> (with version 3.12.7) I get a series of messages that look
>>>>> like
>>>>>
>>>>> Warning:
>>>>> file:/C:/Users/Klensin/AppData/Local/Programs/Python/Python37/lib/site-packages/xml2rfc/templates/rfc2629-xhtml.ent
>>>>> is no longer needed as the special processing of
>>>>> non-ASCII characters has been superseded by direct
>>>>> support for non-ASCII characters in RFCXML.
>>>>>
>>>>> The source does not contain any references to
>>>>> rfc2629-xhtml.ent.
>>>>> It does, of course, contain
>>>>>     <?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
>>>>> as well as the DOCTYPE statement that specifies the DTD.
>>>>>
>>>>> Are these warnings actually addressed to something in the
>>>>> DTD or stylesheet file that should be cleaned out and, if
>>>>> so, how does that get done?
>>>>
>>>> The DOCTYPE that you have included I am guessing is for
>>>> rfc2629.dtd which includes by reference rfc2629-xhtml.ent.
>>>>
>>>> The DOCTYPE is superfluous for document validation and has
>>>> been since v2 of the grammar because v2 is when the change
>>>> was made from a DTD (as specified in a DOCTYPE) to
>>>> RelaxNG. However, it continued to be used partly I think
>>>> because the templates were never updated to be RelaxNG-aware
>>>> and partly because it was a convenient way to incorporate
>>>> the character entities.
>>>
>>> But this is a v2 document whose first version dates to early
>>> 2017, when v3 was still in its infancy and, IIR, the online
>>> version of xml2rfc would not yet handle v3. RelaxNG was, again
>>> IIR, only introduced with RFC 7749 in February 2016, bringing
>>> the change from a DTD-based definition to a Schema-based one
>>> with it. Most of the changes described in its Appendix B had
>>> (as RFC 7749 says) been adopted some years earlier and
>>> templates adjusted, so that the actual changes needed in 2016
>>> were very small and many documents and templates did not
>>> require changes at all.
>>
>> That "most of the changes … had been adopted some years
>> earlier" doesn't change the fact that v2 is defined in RFC
>> 7749 and that uses a RelaxNG schema.
>>
>>> I also note that DOCTYPE appears in Section 4 of RFC
>>> 7749 with language that implies to me that it is required.
>>
>> No, it's a PI, not part of the grammar.
>>
>>>
>>>> I normally recommend that authors replace the DOCTYPE
>>>> statement with this:
>>>>
>>>>     <?xml-model href="rfc7991bis.rnc"?>
>>>>
>>>> (The file referenced can be found at
>>>> https://raw.githubusercontent.com/ietf-tools/rfcxml-templates-and-schemas/main/rfc7991bis.rnc)
>>>
>>> But that is a piece of version 3 vocabulary (see below).
>>
>> No, it's a PI and therefore outside of the v3 vocabulary.
>>
>> However, I did make a mistake here, forgetting that you are a
>> v2 user. The correct PI for a v2 document would be
>>
>>     <?xml-model href="rfc7749.rnc"?>
>>
>> Where that file can be found at
>> https://raw.githubusercontent.com/ietf-tools/legacy-templates-and-schemas/main/rfc7749.rnc
>>
>>>> Doing this tells any XML processor that this uses the
>>>> referenced RelaxNG schema and a RelaxNG aware editor will
>>>> then both validate against this schema and provide
>>>> schema-aware editing support (such as auto-suggestion and
>>>> auto-completion).
>>>
>>>> However afaict your editor, Epsilon, does not appear to do
>>>> schema validation of any sort and so neither the statement
>>>> above nor a DOCTYPE will result in any validation.
>>>
>>> Epsilon (and emacs) are "just" editors. Their modes for
>>> handling XML are not aware of schema, just such fundamental
>>> --and essentially lexical-- issues as formatting, element
>>> matching, and so on.
>>
>> As explained by Carsten that is not correct for Emacs. From
>> my research Epsilon is almost unique in not doing any form of
>> schema validation. The reason all the others do it appears to
>> be that they use the same underlying open source XML
>> libraries that provide this functionality.
>>
>>>
>>> So I am confused by your explanation and suggestions:
>>>
>>> (1) They would seem to lead to documents that are v2-v3
>>> hybrids.
>>> I don't know how the current versions of xml2rfc
>>> would deal with that but, given assorted v2 elements and
>>> constructions that were deprecated in v3, I'd guess it would
>>> be very hard to get right and that going in that direction
>>> would be a bad idea.
>>
>> As noted above, PIs are not part of the grammar and so you can
>> have a v2 document that uses new PIs and it is still a v2
>> document.
>>
>>>
>>> (2) If retaining DOCTYPE, or at least DOCTYPE with those
>>> definitions, in a v3 document is, as you suggest, obsolete
>>> and a bad practice, then that should be reflected in the v2v3
>>> conversion process. However, when I did the conversion
>>> yesterday, that definition (straight out of the v2 document
>>> and RFC 7749) was retained unchanged. I presume that should go
>>> onto the list of bugs in the converter.
>>
>> Except that it's not a bug. Having a DOCTYPE in a v3
>> document doesn't stop it from being a v3 document. What it
>> does is provide an instruction to any XML processor that may
>> be wrong and which it may choose to ignore. I agree however
>> that a warning from xml2rfc would be helpful.
>>
>>>
>>> More generally, I have no idea what happens behind the scenes
>>> when I invoke xml2rfc v3.12.7 with "--v2"
>>
>> https://authors.ietf.org/en/upgrading-from-v2
>>
>>> but, given the very
>>> large number of documents in the RFC Editor's collection in v2
>>> format that, I assume, have not been converted to v3 and
>>> tested for consistency with the output produced, a decision
>>> to retire support for v2 should be taken only with great care
>>> (and, IMO, given the risks and tradeoffs, made only with IESG
>>> signoff after a community Last Call). Until then, I believe
>>> the tools team and IETF staff have considerable
>>> responsibility for keeping version 2 supported.
>>> I don't think
>>> a few spurious warning messages are a big deal unless they
>>> are a sign of things to come, but, when the answer to a
>>> problem with constructions that are valid and well-documented
>>> under v2 is "convert to v3", that is not supporting v2
>>> properly and as the community has a reasonable right to
>>> expect.
>>
>> v2 was officially obsoleted when RFC 7991 was published and
>> RFC 7991 is explicit about that. Yes, a transition process
>> should be supported and it has been for six years now, but I
>> disagree that the community has any right to expect that to
>> continue. It will inevitably get more expensive and complex
>> to support v2. Having said that, the transition certainly has
>> taken longer than many might expect, which I attribute to the
>> templates that were provided until recently, which still used
>> a number of v2 idioms and did not showcase v3. The new
>> templates should ease the transition considerably.
>>
>>>
>>> Similar comments apply to your comment about epsilon: my
>>> expectation is that I can continue to use a text editor to
>>> work with what are now called RFCXML files in both v2 and v3,
>>> expecting that validation will come out of the xml2rfc program
>>> (or, at worst, a competent, LINT-like, validator will be
>>> provided by the IETF). I also expect those validation
>>> processes will produce clear, correct, and, where possible,
>>> actionable warning and error messages. If that ever becomes
>>> not the case, e.g., if it were expected that people creating
>>> documents will use an editor with XML Schema validating
>>> capability and validation responsibility will lie there, that
>>> would essentially require document authors who wish to work
>>> in XML to use such an editor.
>>> I would assume that, too, would
>>> require IESG signoff after an IETF Last Call if only because
>>> it would increase the barriers to entry and participation in
>>> the IETF and hence reduce the diversity of its active,
>>> document-writing, participants.
>>
>> There's no expectation that you use a schema-aware editor,
>> but if you do then with the appropriate processing
>> instructions, it will make your work considerably easier.
>> Your choice though,
>>
>> Jay
>>
>>>
>>> best,
>>> john
>
>

--
Jay Daley
IETF Executive Director
exec-director@ietf.org
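
For anyone who wants to see the "PIs sit outside of the grammar" point in practice, the following is a minimal sketch in Python. It is not taken from the thread and says nothing about how xml2rfc works internally. It uses only the standard library's xml.dom.minidom. The sample document and the function names are made up for illustration; only the two PI lines are copied from the messages above. The sketch lists the processing instructions that appear before the root element and shows that the parser reports them as separate nodes, not as part of the <rfc> element.

```python
from xml.dom import minidom

# Hypothetical sample: only the xml-model and xml-stylesheet lines come
# from the thread; the <rfc> content is a placeholder, not a valid draft.
SOURCE = """<?xml version="1.0"?>
<?xml-model href="rfc7749.rnc"?>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<rfc category="info"><front><title>Example</title></front></rfc>
"""


def list_processing_instructions(xml_text):
    """Return (target, data) pairs for PIs found outside the root element."""
    doc = minidom.parseString(xml_text)
    return [
        (node.target, node.data)
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    ]


def root_element_name(xml_text):
    """Return the name of the document's root element."""
    return minidom.parseString(xml_text).documentElement.tagName


if __name__ == "__main__":
    # The PIs are siblings of <rfc>, so adding or removing one does not
    # change the element tree that a schema describes.
    for target, data in list_processing_instructions(SOURCE):
        print("PI:", target, data)
    print("root element:", root_element_name(SOURCE))
```

Swapping the xml-model line for a different one changes what list_processing_instructions returns, while root_element_name and everything below the root stay the same.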