Re: [Ietf-and-github] New Version Notification for draft-kwatsen-git-xiax-automation-00.txt

Kent Watsen <kent+ietf@watsen.net> Tue, 26 February 2019 18:05 UTC

Cc: ietf-and-github@ietf.org, RFC Interest <rfc-interest@rfc-editor.org>
To: Henrik Levkowetz <henrik@levkowetz.com>

Hi Henrik,


> For both xml2rfc version 2 <artwork>, and version 3 <sourcecode> (I don't
> think you should touch v3 <artwork>, but that's a separate discussion),

I can see why you might say that, assuming the primary value-add is build-time
validation, but note also that the "xiax:gen" attribute is used to dynamically 
*generate artwork*, which extends to v3 artwork (including SVGs) as well.


> I believe you should be using the "name" attribute, not the 
> "src"/"originalSrc" or xiax:src attributes to provide the file names
> to export to; see https://tools.ietf.org/html/rfc7991#section-2.48.2.
> 
> The "name" attribute on <artwork> is also supported in the v2 schema.
> 
> If there are any specific reasons why you've not used "name" attribute,
> which is provided with the intention of being used by extraction tools,
> we should have a discussion and understand why.


As I just wrote to Julian, `xiax` originally used the "name" attribute, but
it needed to store the file's entire path (not just its name) so that the same
directory structure can be recreated during extraction.  That keeps the paths
found in the validation/generation scripts valid, and hence round-tripping works.
Note that authors tend to place various inclusion files in subdirectories so
as to keep their document's top-level directory clean, so paths are fairly
common.

But I also saw that prep-tool didn't auto-set the "name" attribute and, as
`xiax` is effectively a preptool too, I figured it shouldn't muck with the "name"
attribute either.   That said, I think that it would make sense for `xiax` to
also set the "name" attribute, to be just the "basename" component of the
filepath string, as this must be intended 99.9% of the time...
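
For illustration only, here is a minimal Python sketch of the path handling
described above.  The helper names (`name_attribute`, `extract_inclusion`) and
the example path are hypothetical and are not taken from `xiax`; the sketch
only shows the full relative path being used to recreate subdirectories on
extraction, with the basename being what would land in the "name" attribute.

```python
import tempfile
from pathlib import Path, PurePosixPath


def name_attribute(src_path: str) -> str:
    """Return the basename component that would go in the "name" attribute."""
    return PurePosixPath(src_path).name


def extract_inclusion(root: Path, src_path: str, content: str) -> Path:
    """Write content under root, recreating the subdirectories in src_path."""
    rel = PurePosixPath(src_path)
    # Refuse absolute paths and ".." so extraction stays inside root.
    if rel.is_absolute() or ".." in rel.parts:
        raise ValueError(f"unsafe path: {src_path}")
    target = root.joinpath(*rel.parts)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical inclusion kept in a subdirectory of the document tree.
        out = extract_inclusion(
            Path(tmp), "yang/example-module.yang", "module example-module {}\n"
        )
        print(out.relative_to(tmp))
        print(name_attribute("yang/example-module.yang"))
```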

FWIW, `xiax` originally tried to set the "originalSrc" attribute, which was
nice and clean but, unfortunately, `xml2rfc` didn't accept it.  That's when
"xiax-block" came to be, which would be total overkill if it were for just 
this single purpose, but now xiax-block is being used to store much more
metadata than the original "src" value, so I'm happy the switch was made.

As a further side note, the original plan was to store all the per-element
metadata as a bunch of "xiax" prefixed attributes in each element.  While
this would work, it would be ugly and difficult on the copy editors.  For
instance, looking at the "yang-data xiax-block" in
https://tools.ietf.org/html/draft-kwatsen-git-xiax-automation-00#appendix-B.3.1,
note how the per-inclusion "src" and "gen" elements are hierarchical and
each can include whole files.   If per-element "xiax" prefixed attributes were
used, they would wind up being large/ugly base64-encoded blobs.  Having
one large block at the end helped keep things clean in the body of the
document.


> Some additional comments on the draft (I looked at -01):
> 
> |  
> |  5.  Updates to RFC 7991
> |  
> |     This section is just a placeholder for now, but it is expected that
> |     [RFC7991] will need to be modified in order to support some of this
> |     work.
> |  
> |     At a minimum, [RFC7991] should be updated to support attributes from
> |     other namespaces, such that the `rfc2xml` tool would neither process
> |     nor discard them.
> 
> Blanket acceptance of unknown attributes would make validation of schema
> attributes go away.  I'm fine with discussing specific new attributes for
> specific elements, but not a blanket acceptance of any attributes as valid
> in the input.  Accepting wildcard attributes in specific namespaces for
> <sourcecode> in the v3 grammar seems doable (I've just tested a grammar
> tweak for that).  I think this is what you're after, but the text in the
> draft seems a bit too open-ended.

I appreciate your concern but, in my effort to understand it better: what
problem arises if elements/attributes with arbitrary prefixes appear, provided
the default processing ignores them all (other than not discarding them)?

Tying in my previous comment, having the xiax-block at the end, where it is
out of the way, seems good, but I worry about it being stored as an XML
comment.  It seems we are relying on luck that comments are not
discarded.  At least, I'm unaware of any statement that XML comments
are guaranteed to be preserved.  Absent such a statement,
it might be safer to use a <xiax:block> element just before
the closing </rfc> tag, but this assumes a guarantee that elements with
arbitrary prefixes won't be discarded either...
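
To make the concern concrete, the following Python sketch round-trips a small
document through the standard library's ElementTree parser, which drops
comments by default but keeps namespaced elements.  It says nothing about how
`xml2rfc` or any other IETF tool behaves; the namespace URI and the document
content are placeholders invented for the example.

```python
import xml.etree.ElementTree as ET

XIAX_NS = "https://example.com/xiax"  # hypothetical namespace URI

DOC = (
    f'<rfc xmlns:xiax="{XIAX_NS}">'
    "<middle/>"
    "<!-- xiax-block stored as a comment -->"
    "<xiax:block>stored as an element</xiax:block>"
    "</rfc>"
)


def round_trip(xml_text: str) -> str:
    """Parse and re-serialize using ElementTree's default parser."""
    # Keep the "xiax" prefix on output instead of an auto-generated one.
    ET.register_namespace("xiax", XIAX_NS)
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    result = round_trip(DOC)
    print(result)
    print("comment survived:", "stored as a comment" in result)
    print("element survived:", "stored as an element" in result)
```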



> |  7.  Previous Work
> ...
> |     o  The RFC Submit [submit] tool has been modified to test YANG
> |        modules contained within I-Ds, and the resulting document page in
> |        Datatracker [datatracker] displays a new "Yang Validation" field
> |        containing a varying color yin-yang symbol (green if no errors,
> |        red if errors) along with counts.  This tool is okay for what it
> |        is, but it neither aids authors between updates nor validates
> |        anything beyond YANG modules.
> 
> Additional info: The datatracker submission checking for YANG modules
> was written to be easily extended, exactly for the purpose of adding
> additional checkers in the future (I'm mentioning this as an additional
> point in support of generalizing the tool work in this area).

Good to know!    And, as I've mentioned to you in a private thread before,
I hope to join you at the Code Sprint in Prague to see about integrating
some of this work.



> |  10.  Security Considerations
> |  
> |  10.1.  Automated Execution of Arbitrary Scripts
> ...
> |     o  Allow arbitrary scripts, but don't execute them automatically when
> |        a document is extracted.  This solution is appealing as it still
> |        ensures these scripts were executed on the author's computer at
> |        time of construction, and the scripts themselves can be extracted
> |        and audited on the reviewer's computer.  If desired, after
> |        auditing a script, a reviewer could choose to manually execute it
> |        on their own computer.
> 
> Creating a generalized solution that would permit packaging of the verification
> code in the document seems sooo tempting.  But we've seen time and again that
> if you make it possible to automate execution of arbitrary code, it will be
> expanded to actually do the automation by someone, and then used by bad actors
> down the road. -1.

As a hyper-paranoid Security person, I'm very much aligned with your thinking
here.  That said, as the quoted text above states, what harm is there if the
scripts are *NOT* executed automatically on extraction, so that running them
requires a manual/explicit action and, presumably, only after reviewing the
script for such shenanigans?   The lure is pretty strong...
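
As a rough sketch of "extract but never execute", assuming a hypothetical
`extract_scripts` helper (not an actual `xiax` function) and POSIX file
permissions: embedded scripts are written to disk for auditing with their
execute bits cleared, and nothing in the extraction path runs them.

```python
import stat
import tempfile
from pathlib import Path, PurePosixPath

# Mask that clears the execute bit for user, group, and other.
NO_EXEC = ~(stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def extract_scripts(root: Path, scripts: dict[str, str]) -> list[Path]:
    """Write embedded scripts under root for review; never run them."""
    written = []
    for src_path, body in scripts.items():
        rel = PurePosixPath(src_path)
        # Refuse absolute paths and ".." so extraction stays inside root.
        if rel.is_absolute() or ".." in rel.parts:
            raise ValueError(f"unsafe path: {src_path}")
        target = root.joinpath(*rel.parts)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(body)
        # Clear execute bits so running the script takes a deliberate step.
        mode = stat.S_IMODE(target.stat().st_mode)
        target.chmod(mode & NO_EXEC)
        written.append(target)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical embedded script; it is only written, not executed.
        paths = extract_scripts(
            Path(tmp), {"scripts/validate.sh": "#!/bin/sh\necho ok\n"}
        )
        for path in paths:
            print("extracted for review:", path.relative_to(tmp))
```

Running a script after review would then be a separate, explicit action by
the reviewer, outside of anything the extraction step does.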


> |     o  Don't allow arbitrary scripts but, instead, support parameterized
> |        files that declare all the information necessary to construct the
> |        command(s) necessary to generate derived views and/or validate
> |        inclusions.
> 
> I like this better, but it's a much larger apparatus.  It might mean building
> a registry of validation tools (for yang, the state of the art is such
> that it seems feasible, but for other content that might not be so easy).

This approach is the only one that `xiax` supports now.   While I do think the
other approach has merit, it seemed important to first test how feasible this
approach would be.  Yes, code needs to be added for each new content-type  
(https://tools.ietf.org/html/draft-kwatsen-git-xiax-automation-00#appendix-B.2).
And, even in the world of YANG, there's the issue of which YANG-tools (e.g.,
pyang, yanglint, etc) to support and to what extent.
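
One way to picture the parameterized approach is a fixed table mapping each
supported content-type to a command template, so a document can only select
an entry and supply a path, never arbitrary code.  The sketch below is an
assumption about the general shape, not how `xiax` implements it: the
content-type names and the `build_command` helper are hypothetical, and the
two `pyang` invocations are just examples of what such a table might hold.
The sketch only constructs the argument list; it does not run anything.

```python
import shlex

# Hypothetical registry: content-type -> fixed argv template.
COMMAND_TEMPLATES = {
    "yang-module": ["pyang", "--ietf", "{path}"],
    "yang-tree": ["pyang", "-f", "tree", "{path}"],
}


def build_command(content_type: str, path: str) -> list[str]:
    """Build the argv for a known content-type; reject anything else."""
    template = COMMAND_TEMPLATES.get(content_type)
    if template is None:
        raise ValueError(f"unsupported content-type: {content_type}")
    # A path that looks like an option could change the tool's behavior.
    if path.startswith("-"):
        raise ValueError(f"suspicious path: {path}")
    return [arg.replace("{path}", path) for arg in template]


if __name__ == "__main__":
    argv = build_command("yang-tree", "yang/example-module.yang")
    # Shown as a shell-quoted string for display only.
    print(shlex.join(argv))
```

Because the result is an argument list rather than a shell string, a caller
that eventually runs it can avoid invoking a shell at all.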


> I think it's worth exploring, in any case.

Yes, exactly.  Nothing is locked down yet, so there is plenty of opportunity
for mid-course corrections.


Cheers,
Kent