Re: [Ietf-and-github] [rfc-i] New Version Notification for draft-kwatsen-git-xiax-automation-00.txt

Henrik Levkowetz <henrik@levkowetz.com> Tue, 26 February 2019 18:37 UTC

To: Kent Watsen <kent+ietf@watsen.net>
Cc: RFC Interest <rfc-interest@rfc-editor.org>, ietf-and-github@ietf.org

Hi Kent,

On 2019-02-26 19:05, Kent Watsen wrote:
> 
> Hi Henrik,
> 
> 
>> For both xml2rfc version 2 <artwork>, and version 3 <sourcecode> (I don't
>> think you should touch v3 <artwork>, but that's a separate discussion),
> 
> I can see why you might say that, assuming the primary value-add is build-time
> validation, but note also that the "xiax:gen" attribute is used to dynamically 
> *generate artwork*, which extends to v3 artwork (including SVGs) as well.

Ahh.  Point.  Yes.

>> I believe you should be using the "name" attribute, not the 
>> "src"/"originalSrc" or xiax:src attributes to provide the file names
>> to export to; see https://tools.ietf.org/html/rfc7991#section-2.48.2.
>> 
>> The "name" attribute on <artwork> is also supported in the v2 schema.
>> 
>> If there are any specific reasons why you've not used "name" attribute,
>> which is provided with the intention of being used by extraction tools,
>> we should have a discussion and understand why.
> 
> As I just wrote Julian, `xiax` was using the "name" attribute originally, but 
> it needed to store the file's entire path (not just its name), so that the same
> directory structure can be recreated during extraction, so that paths found
> in the validation/generation scripts are valid, and hence round-tripping works.
> Note that authors tend to place various inclusion files in subdirectories so
> as to keep their document's top-level directory clean, so paths are fairly 
> common.  

But there's no limitation on the "name" attribute that would exclude using
a relative path, that I'm aware of?
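
A minimal sketch of what path-aware extraction driven by "name" could look like, assuming an extraction tool written in Python. The function `extract_named_blocks` and the sample document are hypothetical and not taken from `xiax` or xml2rfc. Rejecting absolute paths and ".." components is an added assumption about what a careful extractor would want, not something RFC 7991 states.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path, PurePosixPath


def extract_named_blocks(xml_text, out_dir):
    """Write each <sourcecode>/<artwork> carrying a "name" to out_dir/name.

    Returns the relative paths written, in document order.
    """
    out_root = Path(out_dir)
    written = []
    for elem in ET.fromstring(xml_text).iter():
        if elem.tag not in ("sourcecode", "artwork"):
            continue
        name = elem.get("name")
        if not name:
            continue  # nothing to extract to
        rel = PurePosixPath(name)
        # Assumption: only relative paths without ".." are acceptable,
        # so extraction cannot write outside out_dir.
        if rel.is_absolute() or ".." in rel.parts:
            raise ValueError(f"unsafe name attribute: {name!r}")
        target = out_root.joinpath(*rel.parts)
        # Recreate the author's subdirectory layout on extraction.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(elem.text or "", encoding="utf-8")
        written.append(rel.as_posix())
    return written


# Hypothetical input; file names are made up for illustration.
SAMPLE = """<rfc><middle><section>
<sourcecode name="yang/example@2019-02-26.yang">module example {}</sourcecode>
<artwork name="trees/example-tree.txt">+--rw example</artwork>
<artwork>no name, not extracted</artwork>
</section></middle></rfc>"""

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(extract_named_blocks(SAMPLE, tmp))
```

With a relative path in "name", the subdirectories come back on extraction, which is the round-tripping property discussed above.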

> But I also saw that prep-tool didn't auto-set the "name" attribute and, as
> `xiax` is effectively a preptool too, I figured it shouldn't muck with the "name"
> attribute either.   That said, I think that it would make sense for `xiax` to
> also set the "name" attribute, to be just the "basename" component of the
> filepath string, as this must be intended 99.9% of the time...

Mmm.  It seems to me that for xiax, all the generation, inclusion and
extraction purposes fit within the intended use of "name".  The exception
is that you'd want to set "src" if, at some point, the end product of
external tool processing were a non-standalone .XML file that depends on
external files intended to be included by xml2rfc or an equivalent tool.

But given that you've worked with this more, I believe you have a better
understanding of what's needed -- what I'm trying to achieve here is to
separate concerns, trying to get to the point where we don't have multiple
interrelated attributes; instead having clearly separate ones, with
distinct purposes and effects.

> FWIW, `xiax` originally tried to set the "originalSrc" attribute, which was
> nice and clean but, unfortunately, `xml2rfc` didn't accept it.  That's when
> "xiax-block" came to be, which would be total overkill if it were for just 
> this single purpose, but now xiax-block is being used to store much more
> metadata than the original "src" value, so I'm happy the switch was made.

Ok, I can see that.

> As a further side note, the original plan was to store all the
> per-element metadata as a bunch of "xiax" prefixed attributes in each
> element. While this would work, it would be ugly and difficult on the
> copy editors. For instance, looking at the "yang-data xiax-block" in 
> https://tools.ietf.org/html/draft-kwatsen-git-xiax-automation-00#appendix-B.3.1,
> note how the per-inclusion "src" and "gen" elements are hierarchical and
> each can include whole files. If per-element "xiax" prefixed
> attributes were used, they would wind up being large/ugly
> base64-encoded blobs. Having one large block at the end helped keep
> things clean in the body of the document.

Ack, makes sense.

>> Some additional comments on the draft (I looked at -01):
>> 
>> |  
>> |  5.  Updates to RFC 7991
>> |  
>> |     This section is just a placeholder for now, but it is expected that
>> |     [RFC7991] will need to be modified in order to support some of this
>> |     work.
>> |  
>> |     At a minimum, [RFC7991] should be updated to support attributes from
>> |     other namespaces, such that the `rfc2xml` tool would neither process
>> |     nor discard them.
>> 
>> Blanket acceptance of unknown attributes would make validation of schema
>> attributes go away.  I'm fine with discussing specific new attributes for
>> specific elements, but not a blanket acceptance of any attributes as valid
>> in the input.  Accepting wildcard attributes in specific namespaces for
>> <sourcecode> in the v3 grammar seems doable (I've just tested a grammar
>> tweak for that).  I think this is what you're after, but the text in the
>> draft seems a bit too open-ended.
> 
> I appreciate your concern but, in my effort to understand better, what issue
> is there for random-prefixed elements/attributes appearing, if the default
> processing ignores them all (other than not discarding them)?

Mainly, the issue is expressing this in the grammar in such a way that it
doesn't preclude validation of the base schema, and keeps the grammar bounded.

There's also a secondary wish to capture sensible requirements, generalising
them, and making them available in the base vocabulary, eventually, in order
to enrich it over time.

> Tying in my previous comment, having the xiax-block at the end where it is
> out of the way seems good, but I worry about it being stored as an XML 
> comment.  It seems that trepidatious luck is in play that comments are not
> discarded.  At least, I'm unaware of any statement that XML comments
> are guaranteed to be preserved.

Umm.  Right.  There have been several times when changes in the code have led
to upsets with comments, and I've had to take care to preserve comments, even
if it caused the code to become slightly more complex.
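
To illustrate how easily comments get lost, here is a small sketch using Python's standard-library ElementTree (not xml2rfc's own code, which is not shown in this thread). By default the parser drops comments; keeping them requires asking for it explicitly. The sample document and the `roundtrip` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with metadata stored in a trailing comment.
DOC = "<rfc><middle/><!-- xiax-block: metadata --></rfc>"


def roundtrip(xml_text, keep_comments):
    """Parse and re-serialize xml_text, optionally preserving comments."""
    if keep_comments:
        # TreeBuilder(insert_comments=True) requires Python 3.8 or later.
        parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    else:
        # The default parser silently discards comments.
        parser = ET.XMLParser()
    root = ET.fromstring(xml_text, parser=parser)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(roundtrip(DOC, keep_comments=False))
    print(roundtrip(DOC, keep_comments=True))
```

Any tool in the processing chain that parses with defaults like these would strip a comment-based xiax-block without warning.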

> In lieu of such statement being made,
> it seems that it might be safer to use a <xiax:block> element just before
> the close of the </rfc> tag but, this assumes a guarantee that random
> prefixed elements won't be discarded either...

Ack.  But as I mentioned, I'm not opposed to adding attributes (and possibly
even elements) from specific namespaces in selected places -- it's the
unboundedness of permitting any elements and any attributes with any
namespace that makes me uncomfortable.
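
A rough sketch of the bounded variant, written as a Python check rather than as the actual grammar tweak (which is not reproduced here). The namespace URI, the `unexpected_attributes` helper and the sample input are hypothetical; the list of <sourcecode> attributes is my reading of RFC 7991 and should be treated as an assumption.

```python
import xml.etree.ElementTree as ET

# Assumed base-vocabulary attributes per element (illustrative subset).
KNOWN_ATTRS = {"sourcecode": {"anchor", "name", "src", "type"}}
# Hypothetical allow-list of extension namespaces.
ALLOWED_NAMESPACES = {"https://example.org/xiax"}


def unexpected_attributes(xml_text):
    """Return (element, attribute) pairs that the bounded rule rejects."""
    problems = []
    for elem in ET.fromstring(xml_text).iter():
        known = KNOWN_ATTRS.get(elem.tag)
        if known is None:
            continue  # this sketch only checks elements listed in KNOWN_ATTRS
        for attr in elem.attrib:
            if attr.startswith("{"):
                # ElementTree spells namespaced names as "{uri}local".
                uri = attr[1:].split("}", 1)[0]
                if uri not in ALLOWED_NAMESPACES:
                    problems.append((elem.tag, attr))
            elif attr not in known:
                # Un-namespaced attributes are still validated strictly.
                problems.append((elem.tag, attr))
    return problems


SAMPLE = (
    '<rfc xmlns:xiax="https://example.org/xiax" '
    'xmlns:other="https://example.org/other">'
    '<sourcecode name="a.yang" xiax:gen="cmd" other:x="1" colour="red">x'
    "</sourcecode></rfc>"
)

if __name__ == "__main__":
    for tag, attr in unexpected_attributes(SAMPLE):
        print(tag, attr)
```

Attributes from the allow-listed namespace pass through untouched, while base-schema validation still catches both the stray plain attribute and the one from an unlisted namespace.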

>> |  7.  Previous Work
>> ...
>> |     o  The RFC Submit [submit] tool has been modified to test YANG
>> |        modules contained within I-Ds, and the resulting document page in
>> |        Datatracker [datatracker] displays a new "Yang Validation" field
>> |        containing a varying color yin-yang symbol (green if no errors,
>> |        red if errors) along with counts.  This tool is okay for what it
>> |        is, but it neither aids authors between updates nor validates
>> |        anything beyond YANG modules.
>> 
>> Additional info: The datatracker submission checking for YANG modules
>> was written to be easily extended, exactly for the purpose of adding
>> additional checkers in the future (I'm mentioning this as an additional
>> point in support of generalizing the tool work in this area).
> 
> Good to know!    And, as I've mentioned to you in a private thread before,
> I hope to join you at the Code Sprint in Prague to see about integrating
> some of this work.

Ack :-)

>> |  10.  Security Considerations
>> |  
>> |  10.1.  Automated Execution of Arbitrary Scripts
>> ...
>> |     o  Allow arbitrary scripts, but don't execute them automatically when
>> |        a document is extracted.  This solution is appealing as it still
>> |        ensures these scripts were executed on the author's computer at
>> |        time of construction, and the scripts themselves can be extracted
>> |        and audited on the reviewer's computer.  If desired, after
>> |        auditing a script, a reviewer could choose to manually execute it
>> |        on their own computer.
>> 
>> Creating a generalized solution that would permit packaging of the verification
>> code in the document seems sooo tempting.  But we've seen time and again that
>> if you make it possible to automate execution of arbitrary code, it will be
>> expanded to actually do the automation by someone, and then used by bad actors
>> down the road. -1.
> 
> As a hyper-paranoid Security person, I'm very much aligned with your thinking
> here.  That said, as the above quoted text states, what harm is there if the 
> scripts are *NOT* executed automatically on extraction, that it would require
> a manual/explicit action and, presumably, only then after reviewing the script
> for such shenanigans?   The lure is pretty strong...

That you would not run embedded scripts automatically won't prevent others
from building such automation, if we build in features to embed scripts
intended to be run for validation, in the first place.

I'm sure everybody who's originally made it possible to put executable code
into .doc, .pdf, etc. would be horrified at what people have done with the
capability later.  We should have learnt that lesson by now ,:-}

>> |     o  Don't allow arbitrary scripts but, instead, support parameterized
>> |        files that declare all the information necessary to construct the
>> |        command(s) necessary to generate derived views and/or validate
>> |        inclusions.
>> 
>> I like this better, but it's a much larger apparatus.  It might mean building
>> a registry of validation tools (for yang, the state of the art is such
>> that it seems feasible, but for other content that might not be so easy).
> 
> This approach is the only one that `xiax` supports now.   While I do think the
> other approach has merit, it seemed important to first test how feasible this
> approach would be.  Yes, code needs to be added for each new content-type  
> (https://tools.ietf.org/html/draft-kwatsen-git-xiax-automation-00#appendix-B.2).
> And, even in the world of YANG, there's the issue of which YANG-tools (e.g.,
> pyang, yanglint, etc) to support and to what extent.

Ack.
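
For concreteness, a minimal sketch of the parameterized approach, assuming a small registry that maps a declared content type to a fixed command. The registry layout, the parameter names and `build_command` are hypothetical and do not reflect how `xiax` is implemented; the pyang options shown are ones I believe exist, but treat them as assumptions too. The function only builds an argument list and never executes anything.

```python
# Hypothetical registry: content type -> fixed argv prefix plus the only
# options a document is allowed to set.
REGISTRY = {
    "yang-module": {
        "argv": ["pyang", "--strict"],
        "options": {"search-path": "--path"},
    },
}


def build_command(params):
    """Turn declarative parameters into an argv list (no shell involved)."""
    entry = REGISTRY.get(params["type"])
    if entry is None:
        raise ValueError(f"no registered tool for type {params['type']!r}")
    filename = params["file"]
    # A leading "-" could be mistaken for an option by the tool.
    if filename.startswith("-"):
        raise ValueError(f"suspicious file name: {filename!r}")
    argv = list(entry["argv"])
    for key, value in params.get("options", {}).items():
        flag = entry["options"].get(key)
        if flag is None:
            raise ValueError(f"option {key!r} not permitted for this type")
        argv += [flag, str(value)]
    argv.append(filename)
    return argv


if __name__ == "__main__":
    print(build_command({
        "type": "yang-module",
        "file": "yang/example.yang",
        "options": {"search-path": "yang"},
    }))
```

The document supplies data only; which program runs, and with which flags, is decided by the registry on the reviewer's side, which is where the per-content-type code mentioned above would live.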

>> I think it's worth exploring, in any case.
> 
> Yes, exactly, nothing is locked down as of yet, plenty of opportunity for 
> mid-course corrections.

Ack.


	Henrik