Re: [TOOLS-DEVELOPMENT] Fwd: Preview release of Text Submission Converter, id2xml

Henrik Levkowetz <henrik@levkowetz.com> Fri, 26 May 2017 20:27 UTC

To: Megan Ferguson <mferguson@amsl.com>
References: <591F6199.60801@levkowetz.com> <E5A96E3B-8D3A-4046-8D14-FCD260B66CA6@amsl.com>
Cc: tools-development@ietf.org, Sandy Ginoza <sginoza@amsl.com>, Alice Russo <arusso@amsl.com>
From: Henrik Levkowetz <henrik@levkowetz.com>
Message-ID: <59288FA9.3010005@levkowetz.com>
Date: Fri, 26 May 2017 22:27:21 +0200
Archived-At: <https://mailarchive.ietf.org/arch/msg/tools-development/GerV4BQ9ar5bDBVGqHUSsRcrTX0>
Subject: Re: [TOOLS-DEVELOPMENT] Fwd: Preview release of Text Submission Converter, id2xml

Hi Megan,

On 2017-05-26 20:49, Megan Ferguson wrote:
> Hi Henrik,
> 
> Thank you for your reply and explanations.  Inline below with MF.
> 
> Megan
> 
> Begin forwarded message:
> 
>> From: Henrik Levkowetz <henrik@levkowetz.com>
>> Subject: Re: [TOOLS-DEVELOPMENT] Preview release of Text Submission Converter, id2xml
>> Date: May 19, 2017 at 2:20:25 PM PDT
>> To: Megan Ferguson <mferguson@amsl.com>
>> Cc: tools-development@ietf.org
>> 
>> Hi Megan,
>> 
>> On 2017-05-19 21:55, Megan Ferguson wrote:
>>> Hi Henrik,
>>> 
>>> Some notes on our initial test pass. Please let me know if you would
>>> like any further information on any of the points below.
>> 
>> In general, I'd very much appreciate the relevant draft name when you
>> point out an issue; that will let me use that in testing and fixing
>> the issue.  As you'll see below, I believe that many of the issues you
>> mention below are fixed in the latest release (1.0.0-rc1), but there
>> are some cases below where the draft name would be good to have.
> 
> MF- Ack.  I used draft-ietf-isis-mi-bis-03 originally.  I have rerun this same file with the newer version and found 
> most of the items previously discussed to be resolved.
> 
> I also tested on draft-ietf-rtgwg-yang-key-chain-24.
> 
> Latest Release Notes:
> ---------------------
> 
> 1) Missing references:
> 
> When running id2xml (v1.0.0-rc1) on draft-ietf-isis-mi-bis-03, I
> received the following warning:
> 
> id2xml draft-ietf-isis-mi-bis-03test.txt
> Converting 'draft-ietf-isis-mi-bis-03test.txt'
> 
> draft-ietf-isis-mi-bis-03test.txt(660): Warning: Failed parsing a reference:
>    [ISO10589]
>               "Intermediate system to Intermediate system intra-domain
>               routeing information exchange protocol for use in
>               conjunction with the protocol for providing the
>               connectionless-mode Network Service (ISO 8473), ISO/IEC
>               10589:2002, Second Edition.", Nov 2002.
> Written to 'draft-ietf-isis-mi-bis-03test.xml'
> 
> With the reference being removed but the reference element still
> included (causing v2 not to parse until the empty element was
> removed).

Yes, this is the current behaviour with a reference that cannot be parsed.

If you are happy with the text in the input document (i.e., it looks the way
you would want to see it in an RFC), then please let me know, and I'll see if
I can adjust the set of regexes used to parse reference text.  In the case
above, it seems to me that series information has incorrectly been made part
of the document title.  I would expect the reference to be correctly parsed
if you break out the series info the way you'd like to see it in a published
document.


> Note that I also got a similar error running
> draft-ietf-rtgwg-yang-key-chain-24:
> 
> id2xml draft-ietf-rtgwg-yang-key-chain-24test.txt
> Converting 'draft-ietf-rtgwg-yang-key-chain-24test.txt'
> 
> draft-ietf-rtgwg-yang-key-chain-24test.txt(983): Warning: Failed parsing a reference:
>    [Dobb96b]  Dobbertin, H., "The Status of MD5 After a Recent Attack",
>               CryptoBytes Vol. 2, No. 2, Summer 1996.

Yes.  There's nothing in the current rule set for reference text which knows
how to handle 'Summer 1996'.
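
To illustrate the kind of rule that would be needed, here is a rough
sketch of a date pattern that accepts a season in place of a month.
This is not the actual id2xml rule set; the pattern, the group names and
the parse_date() helper are all made up for illustration.

```python
import re

# Hypothetical date pattern; NOT the actual id2xml rule set.
MONTHS = r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?"
SEASONS = r"(?:Spring|Summer|Autumn|Fall|Winter)"

# Optional day, then an optional month or season, then a 4-digit year
# at the very end of the reference text (a trailing period is allowed).
DATE_RE = re.compile(
    r"(?:(?P<day>\d{1,2})\s+)?"
    r"(?:(?P<month>" + MONTHS + r")|(?P<season>" + SEASONS + r"))?\s*"
    r"(?P<year>(?:19|20)\d{2})\.?$"
)

def parse_date(text):
    """Return the date parts found at the end of a reference, or None."""
    m = DATE_RE.search(text.strip())
    return m.groupdict() if m else None
```

With a pattern along these lines, a reference ending in 'Summer 1996.'
would yield a season and a year rather than failing outright.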

> *This second instance was more interesting to me based on the fact
> that the document contains a [Dobb96a] reference that is used very
> similarly, but was picked up just fine.
> 
> 2) I was curious about how a reference to a BCP that included more
> than one RFC would be handled (for example, BCP 9 below).

[snip]

Sandy posted a comment on this, which was to-the-point.

> 3) There are XML snippets in the appendices of
> draft-ietf-rtgwg-yang-key-chain-24. id2xml put the first two of them
> into lists, which broke the alignment (the original XML by the
> authors used <artwork>), but the third snippet in the last appendix
> was put into <artwork>.

Right.  This is something I also found meanwhile.  I've enhanced the
regex which identifies code snippets to recognise xml and relatives;
this should be handled correctly in 1.0.0rc3.
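
For the curious, the idea is roughly as in the sketch below.  This is
only an illustration under my own assumptions: the regex, the 50%
threshold and the looks_like_xml() name are hypothetical and not what
id2xml actually contains.

```python
import re

# Hypothetical heuristic; the real id2xml pattern may well differ.
# Matches a line that starts with an XML declaration, a comment,
# or an opening/closing/empty element tag.
XML_LINE_RE = re.compile(
    r"^\s*(?:<\?xml\b|<!--|</?[A-Za-z_][\w:.-]*(?:\s[^<>]*)?/?>)"
)

def looks_like_xml(block):
    """Guess whether a text block is an XML snippet.

    Returns True when at least half of the non-blank lines begin with
    a tag, a comment or an XML declaration.
    """
    lines = [line for line in block.splitlines() if line.strip()]
    if not lines:
        return False
    hits = sum(1 for line in lines if XML_LINE_RE.match(line))
    return hits / len(lines) >= 0.5
```

A block that passes such a test would be emitted as <artwork> instead
of being interpreted as a list.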


[snipping a number of points where you confirm that 1.0.0-rc1 fixed
the issue -- thank you, good to get the confirmation]


>> The choice of inserting both entity definitions at the start, and expanded
>> xml references in the <references/> sections was done to make it easier for
>> you to choose one or the other.  If you'd rather I inserted the appropriate
>> entity at the point of reference, I could do so.  You would then instead
>> of:
>> 
>>  <back>
>>    <references title="Normative References">
>>      <?rfc include="reference.RFC.2119"?>
>>      <?rfc include="reference.RFC.5304"?>
>>      ...
>> 
>> see:
>> 
>>  <back>
>>    <references title="Normative References">
>>      &RFC2119;
>>      &RFC5304;
>>      ...
>> 
>> and it would pull the reference entries from the citation library in the
>> same way as with the <?rfc include ?> pi.
>> 
> MF - We believe it is more desirable for our purposes to have the
> entity definitions at the start and the &RFC2119;-style calls in the
> References section.

Ok, I will remove the generated reference xml and instead insert the
reference entities in the next release.
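
As a minimal sketch of what that output amounts to (not the id2xml
code itself): one entity declaration per RFC for the DOCTYPE at the
start, and one matching entity call for the references section.  The
rfc_entities() helper and the BIBXML_BASE location are assumptions for
the sake of the example.

```python
# Assumed location of the citation library; adjust to the one in use.
BIBXML_BASE = "https://xml2rfc.ietf.org/public/rfc/bibxml/"

def rfc_entities(numbers):
    """Return (declarations, references) for the given RFC numbers.

    declarations: <!ENTITY ...> lines for the DOCTYPE at the start.
    references:   &RFCnnnn; calls for the <references> section.
    """
    decls = [
        '<!ENTITY RFC%04d SYSTEM "%sreference.RFC.%04d.xml">'
        % (n, BIBXML_BASE, n)
        for n in numbers
    ]
    refs = ["&RFC%04d;" % n for n in numbers]
    return decls, refs
```

Calling rfc_entities([2119, 5304]) gives the two declarations plus the
&RFC2119; and &RFC5304; calls shown in the quoted example above.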

>> 
>>> 8) In v2, there is currently a setting to keep DOIs, draft strings,
>>> and URLs on the same line (if possible for the latter).
>> 
>> Ok; which setting are you thinking of?  I can certainly insert it if
>> desired.
> 
> MF - Sorry - I don’t know of a specific setting.  I simply remember them implementing it into one of the v2 versions. 

Ok.  FWIW, when xml2rfc processes what id2xml currently generates, it
tries to keep DOIs etc. on one line.  That's actually a source of
differences with the input documents in the test suite, in several cases,
because they don't always try as hard to keep the DOIs etc. on the same
line ,:-)

>>> 9) We generally see a single space after the author initials instead
>>> of two spaces there.
>>> original text:
>>> 
>>>   [RFC5310]  Bhatia, M., Manral, V., Li, T., Atkinson, R., White, R.,
>>>              and M. Fanto, "IS-IS Generic Cryptographic
>>>              Authentication", RFC 5310, DOI 10.17487/RFC5310, February
>>>              2009, <http://www.rfc-editor.org/info/rfc5310>.
>>> 
>>> id2xml text output:
>>> 
>>>   [RFC5310]  Bhatia, M., Manral, V., Li, T., Atkinson, R., White, R.,
>>>              and M.  Fanto, "IS-IS Generic Cryptographic
>>>              Authentication", RFC 5310, DOI 10.17487/RFC5310, February
>>>              2009, <http://www.rfc-editor.org/info/rfc5310>.
>> 
>> There was an issue in 0.9.2 where extra spaces were not stripped from the
>> initials string; that was fixed in 0.9.3, and should also be fixed in 
>> 1.0.0-rc1.  Please let me know if you find that it's not the case.
>> 
> MF - This is resolved.

Ok, splendid.
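
For reference, the fix amounts to something like the sketch below.
The function names are hypothetical and the real code is organised
differently; it only shows the whitespace handling.

```python
import re

def normalize_initials(initials):
    """Collapse runs of whitespace and trim both ends of the initials."""
    return re.sub(r"\s+", " ", initials).strip()

def format_author(initials, surname):
    """Join initials and surname with exactly one space between them."""
    return "%s %s" % (normalize_initials(initials), surname)
```

With this, initials captured as 'M. ' (with the trailing space) come
out as 'M. Fanto' rather than 'M.  Fanto'.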


Thank you again for the feedback!


Best regards,

	Henrik