Re: [TOOLS-DEVELOPMENT] Preview release of Text Submission Converter, id2xml

Henrik Levkowetz <henrik@levkowetz.com> Fri, 16 June 2017 17:54 UTC

To: Megan Ferguson <mferguson@amsl.com>
References: <A31CAA04-572A-411C-8BA1-4020273DC3D4@amsl.com>
Cc: tools-development@ietf.org
From: Henrik Levkowetz <henrik@levkowetz.com>
Message-ID: <557d6a64-92ef-4616-66c1-8e522e7c2dbd@levkowetz.com>
Date: Fri, 16 Jun 2017 19:54:00 +0200
In-Reply-To: <A31CAA04-572A-411C-8BA1-4020273DC3D4@amsl.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tools-development/_QA9KOb6mFVfliNwJ8pjINKBaYo>
Subject: Re: [TOOLS-DEVELOPMENT] Preview release of Text Submission Converter, id2xml

Hi Megan,

On 2017-06-16 18:54, Megan Ferguson wrote:
> Hi Henrik,
> 
> Revisiting this file using the new release (v1.0.1):
> 
>>> Input file: draft-ietf-trill-directory-assist-mechanisms-12
>>> Version: id2xml 1.0.0
>>> Issues: File not originally generated with XML
>>> Files available: 
>>> https://www.rfc-editor.org/rfc/v3test/draft-ietf-trill-directory-assist-mechanisms-12v3.original 
>>> https://www.rfc-editor.org/rfc/v3test/draft-ietf-trill-directory-assist-mechanisms-12v3.txt
>>> https://www.rfc-editor.org/rfc/v3test/draft-ietf-trill-directory-assist-mechanisms-12v3-rfcdiff.html
> 
> Note - we edited the file somewhat to combat the previously discussed issues.
> 
> 1) There were a few reference oddities and follow-up questions:
> 
> 
> a) The following are the work in progress (WiP) references that could not be parsed.
> 
> draft-ietf-trill-directory-assist-mechanisms-12v3.txt(2237): Warning: Failed parsing a reference:
> [rfc6439bis] - D. Eastlake, Y. Li, M. Umair, A. Banerjee, and F. Hu,
>      "Routing Bridges (RBridges): Appointed Forwarders", draft-ietf-
>      trill-rfc6439bis, Work in Progress.
> 
> draft-ietf-trill-directory-assist-mechanisms-12v3.txt(2252): Warning: Failed parsing a reference:
> [ARPND] - Y. Li, D. Eastlake, L. Dunbar, R. Perlman, I. Gashinsky,
>      "TRILL: ARP/ND Optimization", draft-ietf-trill-arp-
>      optimization, Work in Progress.
> 
> draft-ietf-trill-directory-assist-mechanisms-12v3.txt(2256): Warning: Failed parsing a reference:
> [DirAsstEncap] L. Dunbar, D. Eastlake, R. Perlman, I. Gashingksy,
>      "Directory Assisted TRILL Encapsulation", draft-ietf-trill-
>      directory-assisted-encap, Work in Progress.
> 
> draft-ietf-trill-directory-assist-mechanisms-12v3.txt(2260): Warning: Failed parsing a reference:
> [SmartEN] R. Perlman, F. Hu, D. Eastlake, K. Krupakaran, T. Liao,
>      "TRILL Smart Endnodes", draft-ietf-trill-smart-endnodes",
>      draft-ietf-trill-smart-endnodes, Work in Progress.
> 
> Some questions/notes:
> 
> The general format for WiP references should be:
> 
>    [I-D.ietf-sidr-bgpsec-protocol]
>               Lepinski, M. and K. Sriram, "BGPsec Protocol
>               Specification", draft-ietf-sidr-bgpsec-protocol-20 (work
>               in progress), December 2016.
> 
> So we see that the first 4 references in the list above:
> 
> - are missing the date 
> - have the draft string before the Work in Progress designation
> - have draft strings that break across lines
> 
> We are curious which of these things, or whether all of them, make
> parsing fail and cause the empty reference to be included?

None of them.  I believe the patterns in place will catch all of those.
In 1.0.0 there was no pattern to handle ', work in progress' rather than
'(work in progress)', but I added that in 1.0.1.  What is throwing the
matching off is that, in all of the cases above, every name in the author
list except the last is given as 'A. Nonymous' instead of 'Nonymous, A.'.
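
To illustrate the name-order point, here is a rough sketch that rewrites an
initials-first author list into the surname-first form used in the BGPsec
example quoted above.  It is not id2xml's matching code; normalize_authors
and invert are hypothetical names, and the sketch assumes every name in the
input is written initials-first.

```python
import re

# Hypothetical helper, not part of id2xml.
# Matches "A. Nonymous" or "A.B. Nonymous": one or more initials, then the rest.
INITIALS_FIRST = re.compile(r"^((?:[A-Z]\.\s?)+)\s*(\S.*)$")


def invert(name):
    """Turn 'A. Nonymous' into 'Nonymous, A.'; leave other shapes untouched."""
    m = INITIALS_FIRST.match(name)
    if not m:
        return name
    return f"{m.group(2)}, {m.group(1).strip()}"


def normalize_authors(author_text):
    """Invert every name except the last one in a comma-separated author list.

    Assumption: all names in author_text are initials-first, so splitting
    on commas yields one name per piece.
    """
    text = author_text.strip().rstrip(",")
    text = re.sub(r",?\s+and\s+", ", ", text)  # drop the 'and' before the last name
    names = [n.strip() for n in text.split(",") if n.strip()]
    if len(names) == 1:
        return invert(names[0])  # a lone author is written surname-first
    head = [invert(n) for n in names[:-1]]
    if len(head) == 1:
        return f"{head[0]} and {names[-1]}"
    return ", ".join(head) + ", and " + names[-1]


print(normalize_authors("D. Eastlake, Y. Li, M. Umair, A. Banerjee, and F. Hu"))
```

An author string like this would presumably still have to be followed by a
quoted title and the draft name before a reference pattern could match it.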

> b) With regard to this reference warning:
> 
> 
> draft-ietf-trill-directory-assist-mechanisms-12v3.txt(2264): Warning: Failed parsing a reference:
> [X.233] - ITU-T Recommendation X.233: Protocol for providing the
>      connectionless-mode network service: Protocol specification,
>      International Telecommunications Union, August 1997
> 
> It seems that the colon instead of a comma is to blame.  Once updated, this parsed fine.

That's lucky, in that case -- there really should be double quotes around
the title.  Adding missing quotes around any title would be among the first
fixes to improve parsing.

> c)  It seems using a 3-digit RFC number is not handled well.
> 
> In the xml2rfc text output from the id2xml xml file, we see:
> 
>   [RFC826]   - Plummer, D., "An Ethernet Address Resolution Protocol",
>              RFC 826, November 1982.
> 
>   [RFC903]   - Finlayson, R., Mann, T., Mogul, J., and M. Theimer, "A
>              Reverse Address Resolution Protocol", STD 38, RFC 903,
>              June 1984.
> The XML file contains:
> <!ENTITY RFC0826 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.0826.xml">
> <!ENTITY RFC0903 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.0903.xml">
> 
> and it also contains:
> 
> 	<reference anchor="RFC826"><front>
> 	<title>An Ethernet Address Resolution Protocol</title>
> 	<author fullname="D. - Plummer" initials="D." surname="- Plummer">
> 	</author>
> 
> 	<date month="November" year="1982"/>
> 	</front>
> 
> 	<seriesInfo name="RFC" value="826"/>
> 	</reference>
> 
> 
> 	<reference anchor="RFC903"><front>
> 	<title>A Reverse Address Resolution Protocol</title>
> 	<author fullname="R. - Finlayson" initials="R." surname="- Finlayson">
> 	</author>
> 
> 	<author fullname="T. Mann" initials="T." surname="Mann">
> 	</author>
> 
> 	<author fullname="J. Mogul" initials="J." surname="Mogul">
> 	</author>
> 
> 	<author fullname="M. Theimer" initials="M." surname="Theimer">
> 	</author>
> 
> 	<date month="June" year="1984"/>
> 	</front>
> 
> 	<seriesInfo name="STD" value="38"/>
> 	<seriesInfo name="RFC" value="903"/>
> 	</reference>

Oops.  Right.  That's a bug.  Will fix.
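
For illustration only: the quoted XML shows the entity names zero-padded to
four digits (RFC0826) while the reference anchors are not (RFC826).  The
sketch below shows one way to map a short anchor onto the padded form and
the bibxml URL pattern seen in the ENTITY lines.  The function names are
hypothetical, and this is not necessarily where the actual bug sits or how
it will be fixed in id2xml.

```python
import re

# URL prefix copied from the ENTITY declarations quoted above.
BIBXML_BASE = "https://xml2rfc.ietf.org/public/rfc/bibxml/"


def canonical_rfc_anchor(anchor):
    """Return e.g. 'RFC0826' for 'RFC826', or None for non-RFC anchors."""
    m = re.fullmatch(r"RFC\s?0*(\d{1,5})", anchor.strip(), flags=re.IGNORECASE)
    if not m:
        return None
    return "RFC%04d" % int(m.group(1))  # pad to at least four digits


def bibxml_url(anchor):
    """Build the reference URL for an RFC anchor, or None if it is not one."""
    canon = canonical_rfc_anchor(anchor)
    if canon is None:
        return None
    return f"{BIBXML_BASE}reference.RFC.{canon[3:]}.xml"


for a in ("RFC826", "RFC903", "RFC4861", "X.233"):
    print(a, canonical_rfc_anchor(a), bibxml_url(a))
```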

> 
> 
> 2) List output variances. There are several areas where the lists in
> this document did not translate well.

Yes.  Out of the 195 hours spent on id2xml so far, I would guess that almost
half has been spent on variations of list analysis and handling.  Lists turned
out to be difficult!

> 
> Original:
>   The nature of dynamic distributed asynchronous systems is such that
>   it is impossible for a TRILL switch receiving Push Directory
>   information to be absolutely certain that it has complete
>   information.  However, it can obtain a reasonable assurance of
>   complete information by requiring two conditions to be met:
>      1. The PDSS field is 3 in the ESADI zero fragment from the server
>         for the relevant Data Label.
>      2. In so far as it can tell, it has had continuous data
>         connectivity to the server for a configurable amount of time
>         that defaults to twice the server's CSNP time (PushDirTimer,
>         see Section 2.7).
>   Condition 2 is necessary because a client TRILL switch might be just
>   coming up and receive an EASDI LSP meeting the requirement in
>   condition 1 above but has not yet received all of the ESADI LSP
>   fragments from the Push Directory server.
> 
> id2xml output:
> 
>   The nature of dynamic distributed asynchronous systems is such that
>  it is impossible for a TRILL switch receiving Push Directory
>  information to be absolutely certain that it has complete information.
>  However, it can obtain a reasonable assurance of complete information
>  by requiring two conditions to be met:
> 
>   1.  The PDSS field is 3 in the ESADI zero fragment from the server
>       for the relevant Data Label.
> 
>   1.  In so far as it can tell, it has had continuous data connectivity
>       to the server for a configurable amount of time that defaults to
>       twice the server's CSNP time (PushDirTimer, see Section 2.7).
>   Condition 2 is necessary because a client TRILL switch might be just
> 
>  coming up and receive an EASDI LSP meeting the requirement in
>  condition 1 above but has not yet received all of the ESADI LSP
>  fragments from the Push Directory server.

Right.  Clearly not optimal.  I've not seen this issue before, with the
same list number repeated, but I think that if you insert blank lines in
the original to make it look more like the default list output, it should
parse better.
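
The blank-line edit suggested here can be done by hand; as a rough sketch
of doing it mechanically, the hypothetical helper below puts a blank line
before each numbered item and after the end of the list.  It assumes item
markers look like '1.' and that the list ends at the first line indented
no deeper than the marker.  Whether id2xml then parses the result as
intended has not been verified here.

```python
import re

# A numbered item: optional indent, digits, a period, whitespace, then text.
ITEM = re.compile(r"^(\s*)\d+\.\s+\S")


def space_out_numbered_lists(lines):
    """Insert blank lines around numbered list items (hypothetical pre-pass)."""
    out = []
    marker_indent = None  # indent of the current list's markers; None outside a list
    for line in lines:
        m = ITEM.match(line)
        if m:
            if out and out[-1].strip():
                out.append("")  # separate the item from the preceding text
            marker_indent = len(m.group(1))
        elif marker_indent is not None and line.strip():
            indent = len(line) - len(line.lstrip())
            if indent <= marker_indent:
                # Back at or left of the marker column: the list has ended.
                if out and out[-1].strip():
                    out.append("")
                marker_indent = None
        out.append(line)
    return out


sample = [
    "  complete information by requiring two conditions to be met:",
    "     1. The PDSS field is 3 in the ESADI zero fragment from the server",
    "        for the relevant Data Label.",
    "     2. In so far as it can tell, it has had continuous data",
    "        connectivity to the server for a configurable amount of time.",
    "  Condition 2 is necessary because a client TRILL switch might be just",
]
print("\n".join(space_out_numbered_lists(sample)))
```

A wrapped paragraph line that happens to start with a number and a period
would be misread as an item, so the output should be checked by eye.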

> Original:
>            Query Address: The query is asking for any other addresses,
>               and the nickname of the TRILL switch from which they are
>               reachable, that correspond to the same interface as this
>               address, within the Data Label of the query of the
>               address provided. A typically Query Address would be
>               something like the following:
>               (1) A 48-bit MAC address with the querying TRILL switch
>                   primarily interested in either
>                   (1a) the RBridge by which that MAC address is
>                        reachable so that the querying RBridge can
>                        forward an unknown (before the query)
>                        destination MAC address native frame as a
>                        unicast TRILL Data packet rather than flooding
>                        it, or
>                   (1b) the IP address corresponding to the MAC address
>                        so that RBridge can locally respond to a RARP
>                        [RFC903] native frame.
>               (2) An IPv4 or IPv6 address with the querying RBridge
>                   interested in the corresponding MAC address so it can
>                   locally respond to an ARP [RFC826] or ND [RFC4861]
>                   native frame [ARPND].
>               But the query address could be some other address type
>               for which an AFN has been assigned, such as a 64-bit MAC
>               address [RFC7042] or a CLNS address [X.233].

Ouch.  That is going to be hard, particularly because of the non-standard
list markers (1) etc.  I don't think I'm even going to try to make it
handle this well.  Adding blank lines and using 1. instead of (1) is
probably the best approach, if you want this converted to something sensible.
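
As a sketch of the marker change suggested here (a hypothetical pre-pass,
not something id2xml does): rewrite a leading '(1)' as '1.' while keeping
the text column unchanged, so continuation lines stay aligned.  Mixed
markers such as '(1a)' are deliberately left alone for manual editing, and
blank lines between items would still have to be added separately.

```python
import re

# A parenthesised, purely numeric marker at the start of a line.
PAREN_MARKER = re.compile(r"^(\s*)\((\d+)\)(\s+)")


def rewrite_paren_markers(lines):
    """Replace leading '(N) ' markers with 'N.  ' of the same width."""
    out = []
    for line in lines:
        m = PAREN_MARKER.match(line)
        if m:
            indent, number, gap = m.groups()
            # '(N)' is one character wider than 'N.', so add one space back.
            line = f"{indent}{number}.{gap} {line[m.end():]}"
        out.append(line)
    return out


sample = [
    "              (1) A 48-bit MAC address with the querying TRILL switch",
    "                  primarily interested in either",
    "                  (1a) the RBridge by which that MAC address is",
    "              (2) An IPv4 or IPv6 address with the querying RBridge",
]
print("\n".join(rewrite_paren_markers(sample)))
```

A wrapped line of running text that happens to begin with '(2)' would also
be rewritten, so this is only safe on passages already known to be lists.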

> id2xml output:
>      Query Address: The query is asking for any other addresses,
>         and the nickname of the TRILL switch from which they are
>         reachable, that correspond to the same interface as this
>         address, within the Data Label of the query of the address
>         provided.  A typically Query Address would be something like
>         the following: (1) A 48-bit MAC address with the querying TRILL
>         switch primarily interested in either (1a) the RBridge by which
>         that MAC address is reachable so that the querying RBridge can
>         forward an unknown (before the query) destination MAC address
>         native frame as a unicast TRILL Data packet rather than
>         flooding it, or
> 
>              (1b) the IP address corresponding to the MAC address so
>              that RBridge can locally respond to a RARP [RFC903] native
>              frame.
> 
>              (2) An IPv4 or IPv6 address with the querying RBridge
>                  interested in the corresponding MAC address so it can
>                  locally respond to an ARP [RFC826] or ND [RFC4861]
>                  native frame [ARPND].
>              But the query address could be some other address type
>                  for which an AFN has been assigned, such as a 64-bit
>                  MAC address [RFC7042] or a CLNS address [X.233].

Right.  Sorry, but I've discovered that this is really hard for a program.
I don't have any suggestions for this beyond those given above.

> Original:
>   There are a wide variety of strategies that a TRILL switch can adopt
>   for making use of directory assistance. A few suggestions are given
>   below.
> 
>      -  Even if a TRILL switch will normally be operating with
>      information from a complete Push Directory server, there will be a
>      period of time when it first comes up before the information it
>      holds is complete.  Or, it could be that the only Push Directories
>      that can push information to it are incomplete or that they are
>      just starting and may not yet have pushed the entire directory.
> 
> 
> D. Eastlake, et al                                             [Page 41]
> 
> INTERNET-DRAFT                       TRILL: Directory Service Mechanisms
> 
> 
>      Thus, it is RECOMMENDED that all TRILL switches have a strategy
>      for dealing with the situation where they do not have complete
>      directory information. Examples are to send a Pull Directory query
>      or to revert to [RFC6325] behavior.
> 
>      -  If a TRILL switch receives a native frame X resulting in
>      seeking directory information, a choice needs to be made as to
>      what to do if it does not already have the directory information
>      it needs. In particular, it could (1) immediately flood the TRILL
>      Data packet resulting from ingressing X in parallel with seeking
>      the directory information, (2) flood that TRILL Data packet after
>      a delay, if it fails to obtain the directory information, or (3)
>      discard X if it fails to obtain the information. The choice might
>      depend on the priority of frame X since the higher that priority
>      typically the more urgent the frame is and the greater the
>      probability of harm in delaying it. If a Pull Directory request is
>      sent, it is RECOMMENDED that its priority be derived from the
>      priority of the frame X with the derived priority configurable and
>      having the following defaults:

Yup, the '-' is seen as a list bullet, but then the text doesn't have the
right following indentation.  When I tested this document, I edited the
indentation of the lines which follow the dash, and things were fine.
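
The indentation edit described here could be sketched as follows.  This is
a hypothetical helper, not part of id2xml: after a '-' bullet, lines that
sit at the bullet's own column are moved right to the column where the
bullet text starts.  It assumes a blank line ends the item, so text that
continues after a page break (as in the example above) still needs manual
attention.

```python
import re

# A dash bullet: optional indent, '-', whitespace, then text.
BULLET = re.compile(r"^(\s*)-(\s+)\S")


def fix_hanging_indent(lines):
    """Re-indent continuation lines so they align with the bullet text."""
    out = []
    bullet_indent = None  # column of the '-' for the current item
    text_col = 0          # column where the item's text starts
    for line in lines:
        m = BULLET.match(line)
        if m:
            bullet_indent = len(m.group(1))
            text_col = bullet_indent + 1 + len(m.group(2))
        elif not line.strip():
            bullet_indent = None  # a blank line ends the item in this sketch
        elif bullet_indent is not None:
            indent = len(line) - len(line.lstrip())
            if indent == bullet_indent:
                line = " " * text_col + line.lstrip()
        out.append(line)
    return out


sample = [
    "     -  Even if a TRILL switch will normally be operating with",
    "     information from a complete Push Directory server, there will be a",
    "",
    "     Thus, it is RECOMMENDED that all TRILL switches have a strategy",
]
print("\n".join(fix_hanging_indent(sample)))
```

Shifting lines to the right can push them past the usual line width, so a
re-wrap may be needed afterwards.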

> 
> 
> id2xml output:
>   There are a wide variety of strategies that a TRILL switch can adopt
>   for making use of directory assistance.  A few suggestions are given
>   below.
> 
> 
> 
>      -  Even if a TRILL switch will normally be operating with
>      -
> 
>      -
> 
>      -
> 
>      -
> 
>      -
> 
> 
>      Thus, it is RECOMMENDED that all TRILL switches have a strategy
>      for dealing with the situation where they do not have complete
> 
> 
> 
> Eastlake, et al.        Expires September 3, 2017              [Page 41]
> 
> Internet-Draft   TRILL: Edge Directory Assist Mechanisms      March 2017
> 
> 
>      directory information.  Examples are to send a Pull Directory
>      query or to revert to [RFC6325] behavior.
> 
> 
> 
>      -  If a TRILL switch receives a native frame X resulting in
>      -
> 
>      -
> 
>      -
> 
>      -
> 
>      -
> 
>      -
> 
>      -
> 
>      -
> 
>      -
> 
>      -
> 
>      -
> 
>      -
> 
>      -
> 
> 
> 3) Figure/table output
> 
> Original:
>   The 4-bit Flags field of the message header for an Update Message is
>   as follows:
> 
>         +---+---+---+---+
>         | F | P | N | R |
>         +---+---+---+---+
> 
> 
> id2xml output:
> 
>   The 4-bit Flags field of the message header for an Update Message is
>   as follows:
> 
>                             +---+---+---+---+
>                             | F | P | N | R |
>                             +---+---+---+---+
>                             +---+---+---+---+

Hmm.  This is identified as a texttable, but since there's only the one
row, things turn out wrong.  I may be able to fix this.


> 
> Original:
> 
>            Name         Default           Section   Note Below
>     ------------------  -------           -------   ----------
> 
>     DirQueryTimeout     100 milliseconds  3.2.1           1
>     DirQueryRetries       3               3.2.1           1
>     DirGenQPriority       5               3.2.1           2
> 
>     DirRespMaxPriority    6               3.2.2.1         3
> 
>     DirUpdateDelay       50 milliseconds  3.3
>     DirUpdatePriority     5               3.3.1
>     DirUpdateTimeout    100 milliseconds  3.3.3
>     DirUpdateRetries      3               3.3.3
> 
>     DirAckMaxPriority     5               3.3.2           4
> 
>   Note 1: Pull Directory Query client timeout waiting for response and
>   maximum number of retries
> 
> 
> id2xml output:
> 
>            Name         Default           Section   Note Below
>     ------------------  -------           -------   ----------
> 
>     DirQueryTimeout     100 milliseconds  3.2.1           1
>     DirQueryRetries       3               3.2.1           1
>     DirGenQPriority       5               3.2.1           2
> 
> 
> 
>      DirRespMaxPriority      6 3.2.2.1 3
> 
> 
>     DirUpdateDelay       50 milliseconds  3.3
>     DirUpdatePriority     5               3.3.1
>     DirUpdateTimeout    100 milliseconds  3.3.3
>     DirUpdateRetries      3               3.3.3
> 
> 
> 
>      DirAckMaxPriority       5 3.3.2 4
> 
> 
>   Note 1: Pull Directory Query client timeout waiting for response and
>   maximum number of retries

I actually don't know for sure what this should be modelled as.  Do you
have a suggestion of how to handle this?

> 
> 4) Header vs. Authors’ Addresses section
> 
> Ideally, the Addresses section would show the full organization name.

Yes, agreed.  Could you send me an example of where it doesn't?


Best regards,

	Henrik