[TOOLS-DEVELOPMENT] Preview release of Text Submission Converter, id2xml

Megan Ferguson <mferguson@amsl.com> Wed, 09 August 2017 16:02 UTC

From: Megan Ferguson <mferguson@amsl.com>
Date: Wed, 09 Aug 2017 09:02:36 -0700
Cc: tools-development@ietf.org
To: henrik@levkowetz.com
Archived-At: <https://mailarchive.ietf.org/arch/msg/tools-development/COYy9uDQziPDUulxuDfgF_3GBEs>
Subject: [TOOLS-DEVELOPMENT] Preview release of Text Submission Converter, id2xml

Hi Henrik,

This is mostly minor input and notes for our records, so I am combining several test docs in
the message below (with continuous numbering across them).
 
The majority are things that probably deserve no fix unless something is easy to update.
However, the Copyright title fix revisited (#6 below) would be good to have, IMHO.

General feedback: the handling of references to RFCs and other known citation tags is
*much improved* (thank you!).


Input file: draft-ietf-ipsecme-rfc4307bis-18
Version: id2xml 1.1.0
Issues: reference parsing (note - this was an xml2rfc file originally)
Files available: 
https://www.rfc-editor.org/rfc/v3test/draft-ietf-ipsecme-rfc4307bis-18v3.original
https://www.rfc-editor.org/rfc/v3test/draft-ietf-ipsecme-rfc4307bis-18v3.txt
https://www.rfc-editor.org/rfc/v3test/draft-ietf-ipsecme-rfc4307bis-18v3-rfcdiff.html
https://www.rfc-editor.org/rfc/v3test/draft-ietf-ipsecme-rfc4307bis-18v3.xml


1) Not sure what the deal with this reference is.  Initially, I got:

draft-ietf-ipsecme-rfc4307bis-18v3.txt(915): Warning: Failed parsing a reference.  Are all elements separated
   by commas (not periods, not just spaces)?:
   [TRANSCRIPTION]
              Bhargavan, K. and G. Leurent, "Transcript Collision
              Attacks: Breaking Authentication in TLS, IKE, and SSH",
              NDSS , feb 2016.


So I updated the weird space/comma and the date, and then I got:

Converting 'draft-ietf-ipsecme-rfc4307bis-18v3.txt'

draft-ietf-ipsecme-rfc4307bis-18v3.txt(918): Exception: need more than 1 value
   to unpack
Failure converting draft-ietf-ipsecme-rfc4307bis-18v3.txt: need more than 1 value to unpack
Traceback (most recent call last):
  File "/usr/bin/id2xml", line 9, in <module>
    load_entry_point('id2xml==1.1.0', 'console_scripts', 'id2xml')()
  File "/usr/lib/python2.7/site-packages/id2xml/run.py", line 226, in run
    xml = parser.parse_to_xml()
  File "/usr/lib/python2.7/site-packages/id2xml/parser.py", line 975, in parse_to_xml
    doc = self.document()
  File "/usr/lib/python2.7/site-packages/id2xml/parser.py", line 1004, in document
    self.root.append(self.back())
  File "<decorator-gen-37>", line 2, in back
  File "/usr/lib/python2.7/site-packages/id2xml/parser.py", line 578, in dtrace
    ret = fn(self, *params,**kwargs)
  File "/usr/lib/python2.7/site-packages/id2xml/parser.py", line 2738, in back
    references = self.references([ str(self.section_number) ])
  File "<decorator-gen-38>", line 2, in references
  File "/usr/lib/python2.7/site-packages/id2xml/parser.py", line 578, in dtrace
    ret = fn(self, *params,**kwargs)
  File "/usr/lib/python2.7/site-packages/id2xml/parser.py", line 2786, in references
    references = self.references(sublist, level+1)
  File "<decorator-gen-38>", line 2, in references
  File "/usr/lib/python2.7/site-packages/id2xml/parser.py", line 578, in dtrace
    ret = fn(self, *params,**kwargs)
  File "/usr/lib/python2.7/site-packages/id2xml/parser.py", line 2797, in references
    ref, entity = self.reference()
  File "<decorator-gen-39>", line 2, in reference
  File "/usr/lib/python2.7/site-packages/id2xml/parser.py", line 578, in dtrace
    ret = fn(self, *params,**kwargs)
  File "/usr/lib/python2.7/site-packages/id2xml/parser.py", line 2955, in reference
    name, value = docname.split(None, 1)
ValueError: need more than 1 value to unpack

The only way I could get it to parse was to remove “NDSS”.  
Just curious about this case as we see other items in that position frequently that don’t cause issues.
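
For the record, the failing statement in the traceback, `name, value = docname.split(None, 1)`, needs at least two whitespace-separated tokens, so a one-word name such as "NDSS" cannot be unpacked. Below is a minimal sketch of that failure and one possible guard. `split_docname` is a hypothetical helper, not part of id2xml, and the sample strings are assumptions about what the parser sees at that point.

```python
def split_docname(docname):
    """Split 'NAME VALUE' into (name, value); value is None when absent.

    Hypothetical helper for illustration only, not id2xml code.
    """
    parts = docname.split(None, 1)
    if len(parts) == 2:
        return parts[0], parts[1]
    if len(parts) == 1:
        return parts[0], None
    return None, None


# The unguarded form from the traceback fails on a single token.
try:
    name, value = "NDSS".split(None, 1)
except ValueError as exc:
    print("unpack failed:", exc)

print(split_docname("NDSS"))      # single token: value is None
print(split_docname("RFC 7296"))  # two tokens split as usual
```

Whether a bare series name should be accepted or reported with a warning is a design choice for the tool; the sketch only shows that the case can be detected before unpacking.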

----




Input file: draft-ietf-mpls-tp-aps-updates-04
Version: id2xml 1.1.0
Issues: Lowercase surnames, References, texttables
Files available: 
https://www.rfc-editor.org/rfc/v3test/draft-ietf-mpls-tp-aps-updates-04v3.original
https://www.rfc-editor.org/rfc/v3test/draft-ietf-mpls-tp-aps-updates-04v3.txt
https://www.rfc-editor.org/rfc/v3test/draft-ietf-mpls-tp-aps-updates-04v3-rfcdiff.html
https://www.rfc-editor.org/rfc/v3test/draft-ietf-mpls-tp-aps-updates-04v3.xml

2) It doesn’t appear that surnames beginning with a lowercase letter are recognized.  
Note - IMHO, this is okay to leave as is because the warning points out the issue and this 
is not common, so please feel free to leave it unless there is an easy fix.

draft-ietf-mpls-tp-aps-updates-04v3.txt(355): Warning: This author is listed in the Authors' Addresses section, but was
   not found  on the first page: Huub van Helvoort
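
One way such a name could be matched is sketched below, assuming the first page abbreviates each author as initials plus surname (for example "H. van Helvoort") and that a lowercase particle marks the start of the surname. `first_page_form` is a hypothetical helper, not id2xml code, and both assumptions would need checking against real drafts.

```python
def first_page_form(full_name):
    """Abbreviate 'Huub van Helvoort' to 'H. van Helvoort'.

    Assumption: the surname starts at the first token (after the given
    name) that begins with a lowercase letter, such as 'van' or 'de';
    if there is no such token, the surname is the last token.
    """
    tokens = full_name.split()
    if len(tokens) < 2:
        return full_name
    surname_start = len(tokens) - 1
    for i, tok in enumerate(tokens[1:], start=1):
        if tok[0].islower():
            surname_start = i
            break
    # Every token before the surname is reduced to an initial.
    initials = [tok[0] + "." for tok in tokens[:surname_start]]
    return " ".join(initials + tokens[surname_start:])


print(first_page_form("Huub van Helvoort"))
print(first_page_form("Henrik Levkowetz"))
```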


3) FYI - Here is another case where a texttable was created poorly.

Original:
   The last paragraph in Section 11 of [RFC7271] is modified as follows:

   ---------
   Old text:
   ---------
   In the state transition tables below, the letter 'i' stands for
   "ignore" and is an indication to remain in the current state and
   continue transmitting the current PSC message.
   ---------
   New text:
   ---------
   In the state transition tables below, the letter 'i' is the
   "ignore" flag, and if it is set it means that the top-priority
   global request is ignored.


id2xml text output:

   The last paragraph in Section 11 of [RFC7271] is modified as follows:

                                    Ol
                                    --
                                    In
                                    "i
                                    co
                                    Ne
                                    In
                                    gl


While this use of dashes (i.e., around “Old” and “New”) is unusual, I just want to point it out in case it is useful.
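
A rough heuristic that would tell this case apart from a real table is sketched below: treat a dash-ruled block as a table only when no content line is wider than the rules. This is an assumption about a possible check, not how id2xml detects tables, and `looks_like_table` is a hypothetical name.

```python
import re

# A rule line is a line made only of dashes (three or more).
RULE = re.compile(r"^\s*-{3,}\s*$")


def looks_like_table(lines):
    """Return True only if no content line is wider than the widest rule.

    Illustrative heuristic: table borders normally span the content,
    while decorative rules (as around 'Old text:') are shorter than
    the prose that follows them.
    """
    rules = [ln.strip() for ln in lines if RULE.match(ln)]
    if len(rules) < 2:
        return False
    width = max(len(r) for r in rules)
    content = [ln.strip() for ln in lines
               if ln.strip() and not RULE.match(ln)]
    return all(len(c) <= width for c in content)


block = [
    "   ---------",
    "   Old text:",
    "   ---------",
    "   In the state transition tables below, the letter 'i' stands for",
]
print(looks_like_table(block))
```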

------

Input file: draft-ietf-bmwg-dcbench-terminology-19
Version: id2xml 1.1.0
Issues: Numbered sections after references, sections missing general text indentation, Acks trouble with ToC
Files available: 
https://www.rfc-editor.org/rfc/v3test/draft-ietf-bmwg-dcbench-terminology-19v3.original
https://www.rfc-editor.org/rfc/v3test/draft-ietf-bmwg-dcbench-terminology-19v3.txt
https://www.rfc-editor.org/rfc/v3test/draft-ietf-bmwg-dcbench-terminology-19v3-rfcdiff.html
https://www.rfc-editor.org/rfc/v3test/draft-ietf-bmwg-dcbench-terminology-19v3.xml

4) The Abstract appeared without any indentation, which made things weird in the XML (every line was turned into a <note>).

Original:

Abstract

The purpose of this informational document is to establish definitions
and describe measurement techniques for data center benchmarking, as
well as it is to introduce new terminologies applicable to performance
evaluations of data center network equipment. This document establishes
the important concepts for benchmarking network switches and routers in
the data center and, is a pre-requisite to the test methodology
publication [draft-ietf-bmwg-dcbench-methodology]. Many of these terms
and methods may be applicable to network equipment beyond this
publication's scope as the technologies originally applied in the data
center are deployed elsewhere.


xml:
<abstract/>
<note title="The purpose of this informational document is to establish definitions"/>
<note title="and describe measurement techniques for data center benchmarking, as"/>
<note title="well as it is to introduce new terminologies applicable to performance"/>
<note title="evaluations of data center network equipment. This document establishes"/>
<note title="the important concepts for benchmarking network switches and routers in"/>
<note title="the data center and, is a pre-requisite to the test methodology"/>
<note title="publication [draft-ietf-bmwg-dcbench-methodology]. Many of these terms"/>
<note title="and methods may be applicable to network equipment beyond this"/>
<note title="publication's scope as the technologies originally applied in the data"/>
<note title="center are deployed elsewhere."/>
</front>
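
As a possible workaround on the input side, the unindented Abstract body could be re-indented before conversion. The sketch below is a hypothetical pre-processing step, not id2xml behavior; the function name and the 3-space indent are assumptions based on the usual draft layout.

```python
def indent_first_paragraph(lines, heading, indent=3):
    """Indent the first paragraph that follows an exact `heading` line.

    Only lines starting at column 0 are changed; everything else is
    returned untouched. Illustrative pre-processing only.
    """
    out = list(lines)
    try:
        i = out.index(heading) + 1
    except ValueError:
        return out  # heading not present, nothing to do
    # Skip blank lines between the heading and its body.
    while i < len(out) and not out[i].strip():
        i += 1
    # Indent until the paragraph ends at the next blank line.
    while i < len(out) and out[i].strip():
        if not out[i].startswith(" "):
            out[i] = " " * indent + out[i]
        i += 1
    return out


draft = [
    "Abstract",
    "",
    "The purpose of this informational document is to establish definitions",
    "and describe measurement techniques for data center benchmarking, as",
    "",
    "1.  Introduction",
]
for line in indent_first_paragraph(draft, "Abstract"):
    print(line)
```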


5) Related to the above:  The whole text of the Acknowledgments section was pulled into the ToC/section title 
because it was not indented (same as the Abstract).  

Original:

   10.  References  . . . . . . . . . . . . . . . . . . . . . . . . . 16
     10.1.  Normative References  . . . . . . . . . . . . . . . . . . 16
     10.2.  Informative References  . . . . . . . . . . . . . . . . . 17
     10.3.  Acknowledgments . . . . . . . . . . . . . . . . . . . . . 17


...

10.3.  Acknowledgments

         The authors would like to thank Alfred Morton, Scott Bradner,
         Ian Cox, Tim Stevenson for their reviews and feedback.



id2xml output (removed the numbering):
   10. References  . . . . . . . . . . . . . . . . . . . . . . . . .  16
     10.1.  Normative References . . . . . . . . . . . . . . . . . .  16
     10.2.  Informative References . . . . . . . . . . . . . . . . .  17
   Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . .  17
   The authors would like to thank Alfred Morton, Scott Bradner, Ian
   Cox, Tim Stevenson for their reviews and feedback.  . . . . . . .  17
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  17


…
Acknowledgments

authors would like to thank Alfred Morton, Scott Bradner, Ian Cox, Tim
Stevenson for their reviews and feedback.

------

Input file: draft-ietf-trill-mtu-negotiation-08
Version: id2xml 1.1.0
Issues: Updates values in header, Copyright title
Files available: 
https://www.rfc-editor.org/rfc/v3test/draft-ietf-trill-mtu-negotiation-08v3.original
https://www.rfc-editor.org/rfc/v3test/draft-ietf-trill-mtu-negotiation-08v3.txt
https://www.rfc-editor.org/rfc/v3test/draft-ietf-trill-mtu-negotiation-08v3-rfcdiff.html
https://www.rfc-editor.org/rfc/v3test/draft-ietf-trill-mtu-negotiation-08v3.xml

6) It appears that the title “Copyright and License Notice” is not recognized.  
Once I updated the title, the file parsed successfully.  I just want to revisit this one as it seems we get this title a lot.

Warning: Expected a back section, found '1.  Introduction'

The list of common section titles from my previous mail on the topic:

> Copyright
> Copyright Notice
> Copyright Notice and License
> Copyright and License Notice
> Copyright, Disclaimer, and Additional IPR Provisions
> Copyright and IPR Provisions
> Copyright Statement
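
In case it helps, the titles in the list above can be covered by a single pattern. The sketch below is only an illustration; the regex and the `is_copyright_title` name are assumptions, not how id2xml currently recognizes the section.

```python
import re

# Titles taken from the list above; the pattern itself is an assumption.
COPYRIGHT_TITLE = re.compile(
    r"^Copyright"
    r"(?: Notice(?: and License)?"
    r"| and License Notice"
    r"|, Disclaimer, and Additional IPR Provisions"
    r"| and IPR Provisions"
    r"| Statement)?$"
)


def is_copyright_title(line):
    """Return True if the stripped line is one of the known titles."""
    return COPYRIGHT_TITLE.match(line.strip()) is not None


for title in ("Copyright and License Notice", "1.  Introduction"):
    print(title, "->", is_copyright_title(title))
```

The pattern is anchored at both ends so that body lines that merely start with "Copyright" are not mistaken for the section title.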




Thanks!

Megan