Re: [Netconf] WG LC for zerotouch

Martin Bjorklund <mbj@tail-f.com> Thu, 28 September 2017 07:46 UTC

Return-Path: <mbj@tail-f.com>
X-Original-To: netconf@ietfa.amsl.com
Delivered-To: netconf@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A750E135457 for <netconf@ietfa.amsl.com>; Thu, 28 Sep 2017 00:46:50 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.9
X-Spam-Level:
X-Spam-Status: No, score=-1.9 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id RCfhY8ILZX_B for <netconf@ietfa.amsl.com>; Thu, 28 Sep 2017 00:46:48 -0700 (PDT)
Received: from mail.tail-f.com (mail.tail-f.com [46.21.102.45]) by ietfa.amsl.com (Postfix) with ESMTP id BB0EF135449 for <netconf@ietf.org>; Thu, 28 Sep 2017 00:46:46 -0700 (PDT)
Received: from localhost (unknown [173.38.220.41]) by mail.tail-f.com (Postfix) with ESMTPSA id EFF8A1AE02A7; Thu, 28 Sep 2017 09:46:44 +0200 (CEST)
Date: Thu, 28 Sep 2017 09:45:14 +0200
Message-Id: <20170928.094514.526215195914925651.mbj@tail-f.com>
To: kwatsen@juniper.net
Cc: netconf@ietf.org
From: Martin Bjorklund <mbj@tail-f.com>
In-Reply-To: <B08CEFF5-8E7E-4D67-A7E4-9C8992114623@juniper.net>
References: <9F33F0C4-7697-4774-B7F1-A8556A6A73A1@gmail.com> <20170926.192701.1726303533240950350.mbj@tail-f.com> <B08CEFF5-8E7E-4D67-A7E4-9C8992114623@juniper.net>
X-Mailer: Mew version 6.7 on Emacs 24.5 / Mule 6.0 (HANACHIRUSATO)
Mime-Version: 1.0
Content-Type: Text/Plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/netconf/huKv0UuA5TY-K9uBi8lo8Kx9pu0>
Subject: Re: [Netconf] WG LC for zerotouch
X-BeenThere: netconf@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: Network Configuration WG mailing list <netconf.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/netconf>, <mailto:netconf-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/netconf/>
List-Post: <mailto:netconf@ietf.org>
List-Help: <mailto:netconf-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/netconf>, <mailto:netconf-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 28 Sep 2017 07:46:51 -0000

Kent Watsen <kwatsen@juniper.net> wrote:
> Hi Martin,
> 
> Thanks for your review!  Comments below.
> 
> K.
> 
> 
> > I have reviewed this document, and I believe the document is ready to
> > be published after a couple of issues have been addressed (see below).
> 
> We're getting there.
> 
> 
> > I don't have the expertise to tell if the solution is secure enough or
> > not.
> 
> This comment has been made by others as well.  The assumption is that
> SecDir will do that evaluation.  Is that fair?

Fine with me!

> > Here are my comments.
> > 
> > Major issues
> > ------------
> >
> > o  Section 7.2
> >
> >  Adding custom query parameters like this breaks the normal YANG
> >  contract.  If I understand this correctly, the instance data tree
> >  presented to the client is supposed to change based on these query
> >  parameters.
> 
> Correct.  The idea is that the response could vary based on the input
> query params.  Of course, in case there was any doubt, the response 
> would always conform to same YANG schema.

See below.

> >  If you really need to have that functionality, it would better to
> >  model the device-specific data as an action; input would be the
> >  os-name, os-version etc, and output would be the
> >  zerotouch-information and voucher etc.
> 
> I thought of this as well, but didn't do anything about it as my 
> last message to the list said that we'd add "query parameters"
> and I wanted to follow through on that statement.
> 
> But what is the issue?  Is it primarily that using query params 
> on GET should be limited to returning different representations
> of a resource, and this use seems to be returning different 
> resources altogether?

Yes, this is my concern.  A normal RESTCONF / YANG server has *one*
single instance of the operational state, and it just provides an API
to this data.  It can be filtered and pruned based on access rights,
but not modified from the outside based on query parameters.

> I think that this is a grey area, but
> appreciate the perspective.
> 
> Alternatively, is the issue more pragmatic, in that the list of
> query params seems unbounded, and we might run out of room on
> the URL string (1024 chars?) someday?  Or maybe perhaps that
> input can be hierarchal whereas params can't?
> 
> I imagine the answer is "all of the above"  ;)
> 
> FWIW, if we were to do this, we might also consider moving the 
> 'serial-number' field from the URL to an input field

Hmm, did you mean 'unique-id'?

> , and make
> a similar change to the 'update-progress' action.  But that then
> would mess up your NACM rule below.  Thoughts?

We can have:

  list device {
    key unique-id;

    leaf unique-id { ... }

    action get-bootstrapping-data {
      input {
        leaf os-name { ... }
        leaf os-version { ... }
        ...
      }
      output {
        leaf zerotouch-information { ... }
        leaf ownership-voucher { ... }
        leaf owner-certificate { ... }
      }
    }

    action report-progress {  // I like this name better :)
      ...
    }
  }

And then NACM can still be used to provide access control to this
list.


> >  Also, the draft doesn't discuss things like "one-time use voucher";
> >  this term is just used here.  It seems to me that this requires more
> >  descriptive text.
> 
> Good catch.  draft-ietf-anima-voucher informally calls it an "audit
> voucher", which is an unfortunate misnomer.   Nonetheless, point taken,
> the draft should tie into what's in the voucher draft.  Max Pritikin
> told me a couple weeks ago that calling it a "nonced voucher" or a 
> "voucher with a nonce" was probably best.  BTW, this seems like an
> editorial fix to me, but I can see how it could look like a major
> issue if you didn't know what the text was alluding to... 

Ok.

> > o  Section 4.4
> >
> >     When a device is
> >     not able to trust a bootstrap server, it MUST NOT send its IDevID
> >     certificate in the form of a TLS client certificate
> >
> >  How will the server authenticate the client in this case?
> 
> The idea is that the server doesn't auth the client in this case.
> The goal behind not sending the IDevID certificate was so a rouge
> bootstrap server wouldn't be able to discover the device's serial
> number, which is why Appendix B suggests also masking the serial
> number the device sends in the URL.  That said, we might need to
> rethink this, especially in conjunction with your next comment.  
> (continued below)  
> 
> 
> >  I see that in section 7.3 you have:
> >
> >   Note that the bootstrap server MUST NOT process a progress update
> >   from a device without first authenticating the device.  This is in
> >   contrast to when a device is fetching data from the server, a read-
> >   only operation, in which case device authentication is not strictly
> >   required (e.g., when sending signed information).
> >
> >  But the server is a normal RESTCONF server, so it will follow
> >  section 2.5 of RFC 8040; it will require clients to be
> >  authenticated.
> 
> I was hoping that the draft could suppress this auth requirement
> for this specific "untrusted" interaction.  In essence, reducing
> the bootstrap server to a simple file server providing anonymous
> read-access.
> 
> But in the security analysis of things, it comes down to what's 
> more important to prevent:
>  a) a rogue server discovering a valid device's serial number,
>     while not being able to provide the device a response the
>     device can trust, or
>  b) a valid server giving signed redirect information to a
>     rogue device, where the information given doesn't provide
>     the device any more information than the device must have
>     already had to make the request in the first place, other
>     than a confirmation that indeed that serial number is one
>     that the server is expecting.  
> 
> Looking at it this way, (b) is more important to prevent and, 
> besides:
> 1) there is already a Security Consideration related to the use
>    of serial numbers in the solution, so (a) seems to be just
>    more of the same, 
> 2) rogue connections could play havoc with the server's logs, and
> 3) (to your point) it would be tricky business to define an
>    exception for the standard RESTCONF server to follow.
> 
> All this is to say that I think we should go back to the device
> always sending its IDevID cert and, of course, the server always
> authenticating the device.  But I also think that a device should
> continue NOT sending any other information (i.e. query params,
> progress updates, etc.), beyond its serial number & IDevID cert,
> until it can fully authenticate the server.

Ok.

> > Normal issues
> > -------------
> >
> > o  Section 1.2
> >       The term "manufacturer" is used herein to refer to the
> >       manufacturer of a device or a delegate of the manufacturer.
> >
> >  In the text you sometimes use "manufacturer" and sometimes
> >  "manufacturer ot delegate".  Perhaps the term "manufacturer" should
> >  be reserved to mean just the manufacturer, and then use
> >  "manufacturer or delegate" in the places where that's appropriate.
> 
> I mean for it to always be "manufacturer or delegate".  The better
> fix is to remove "or delegate" throughout the document.  Agreed?

Ok.

> > o  Section 3.2
> >
> >  In the first paragraph, maybe include a reference to RFC 5280.
> 
> Yes.
> 
> 
> > o  Section 5.1
> >
> >  The numbers in the picture don't match the numbers in the list
> >  (specifically 4-5).
> 
> Ooops, I meant 2 & 3 to be 2a and 2b respectively.  Will fix, most
> likely by flattening the (2) text into distinct 2 & 3 sections
> again.
> 
> 
> > o  Section 5.6
> >
> >     Upon rebooting, the device MUST still be
> >     in its initial state, causing the bootstrapping process to run again,
> >     which will eventually come to this very point,
> >
> >  Why this MUST?  What if the new boot image contains other factory
> >  defaults that does not enable ZTP, would that be illegal?
> >
> >  I would think that implementations are free to do whatever they want
> >  in case of errors.
> 
> Maybe not *illegal*, but it would certainly brick the device at that
> point.  Would just reducing the "MUST" to a "must" suffice, or should
> we add a qualifier along the lines of "in order for the solution to
> work..." ?

I would just say something like 

     Upon rebooting, if the new image also enables zerotouch, the
     device will still be
     in its initial state, causing the bootstrapping process to run again,
     which will eventually come to this very point,


> >    In the case of errors, the device MUST reset
> >    itself in such a way that forces a reinstallation of the boot image,
> >    thereby wiping out any bad state the script may have left behind.
> >
> >  Is this also required?  Why can't stop and wait for manual
> >  intervention be allowed?
> 
> It could, but again, it would brick the device.  Same question as above,
> would reducing the "MUST" to a "must" suffice, or should we add a 
> qualifier along the lines of "in order for the solution to work..." ?

Well, this depends if the error is transient or not.  If it is not
transient, it will just happen again and again.

I don't have a strong opinion, but I would probably be silent about
what to do in case of these kinds of errors.  Leave that to the
implementation to figure out.

> > o  Section 7.3
> >
> >  The examples have "device=123456" in the URLs; it should be
> >  "device=123456789".
> 
> Yes.
> 
> 
> >  I suggest you shorten the lines of the examples even more, so that
> >  the examples are properly indented in the draft (they are currently
> >  outdented)
> 
> It's hard to make the lines much shorter with the URN being so long.
> Options are 1) don't indent just the xmlns line or 2) move from two-
> space indent to single-space indent.  Do you have a preference?

In the second example you use (1).  You can also use "\", which you
are also already using.

I just found another outdented figure in section 5.3, which probably
is easier to fix.

> > o  Section 9
> >
> >  Should you also mention that /device list should be protected by
> >  nacm rules?
> 
> Yes, indeed, I left out the standard security template for the bootstrap
> server YANG module.  Will add.
> 
> 
> >  If the RESTCONF user name is the device's unique-id, then a single
> >  NACM rule can be used:
> >
> >       <rule>
> >         <name>allow-device</name>
> >         <path xmlns:ztbs="urn:ietf:params:xml:ns:yang:\ietf-zerotouch-bootstrap-server">
> >           /ztbs:device[ztbs:unique-id=$USER]
> >         </path>
> >         <access-operations>read</access-operations>
> >         <action>permit</action>
> >       </rule>
> 
> Yes, but would you expect this to appear in the draft?  Typically, the specific
> NACM rule syntax is left as an exercise to the reader...

In order to do this, the "username" must be the same as the
"unique-id".  This requires the client cert to be constructed
properly.

Since the NACM rule is not obvious, and due to the cert implications,
I think it would be very helpful with an appendix that describes how
this can be set up.

> > o  ietf-zerotouch-information.yang
> >
> >  o sha256 is defined like this:
> >
> >             leaf sha256 {
> >                type string;
> >                  "The hex-encoded SHA-256 hash over the boot
> >                   image file.
> >
> >    Should this be type yang:hex-string?  If not, I think you need to
> >    specify how the hex-chars are encoded in the string.
> 
> Yes, it should be yang:hex-string.
> 
> 
> >   o uri is defined like this:
> >
> >          leaf-list uri {
> >            type inet:uri;
> >            min-elements 1;
> >            description
> >              "An ordered list of URIs to where the boot-image file MAY
> >               be obtained.
> >
> >   This leaf-list should be ordered-by user to indicate that the order is
> >   significant.
> 
> Yes, it should be ordered-by user.
> 
> 
> > o  ietf-zerotouch-bootstrap-server.yang
> >
> >  o the action "update-progress" should have a description
> 
> Strange, this wasn't caught by any of:
>   pyang --ietf --strict ...
>   pyang --canonical ...
>   yanglint ...

Yes, this is b/c --ietf implements 6087, and actions were not listed
there.  Will be fixed!

> But will add anyways.
> 
> 
> >  o     If a script is erroneously provided to a device that does not
> >        support the execution of scripts, the device SHOULD send a
> >        'script-warning' notification message
> >
> >    Rephrase to avoid the term "notification message"
> >    Also, the types are "pre-script-warning" and "post-script-warning".
> 
> Will fix.
> 
> 
> >   o The description of the "script" typedef says:
> >
> >       No attempt is made to standardize the contents, running context,
> >       or programming language of the script.
> >       [...]
> >       The script returns exit status code '0' on success and non-zero
> >       on error, with accompanying stderr/stdout for logging purposes.
> >
> >     I think the last quoted sentence contradicts the first - it does in
> >     fact make assumptions about the running context of the script.
> >
> >     I suggest the latter sentence is removed, and the rest of the test
> >     adjusted.  Don't talk about exit codes, but instead talk about
> >     "success" or "error" etc.
> 
> Will do.
> 
> 
> > o  Other comment
> >
> >  Is it assumed that there will be a vendor-specific YANG module to
> >  enable the zero-touch process?  Did you consider an
> >  ietf-zerotouch-device module for this purpose?  Such a configuration
> >  must be carefully designed to allow for a "merge" configuration file
> >  to actually disable the zero touch process.
> 
> It is assumed to be vendor-specific currently.  I have not thought to
> define a ietf-zerotouch-device module as of yet that, presumably, would
> contain a boolean or an enum (but not a p-container) for enabling the
> service.  Do you think we should add such a module to the draft, or just
> put a note somewhere about it?

I don't have a strong opinion, but if there's not a standard module
for this, the text should mention that it is assumed that vendors
define this themselves.   Hmm, I think I would prefer a standard
module; it would make the solution more complete.


> > Editorial nits:
> > ---------------
> >
> > o  use "" instead of '' consistently  (except in YANG strings)
> 
> Thanks. Will double-check all.
> 
> 
> > o  you probably should include a reference to ITU-T X.690.
> 
> Will do.
> 
> 
> > o  s/bootstrap data/bootstrapping data/
> 
> Good catch, thought I got them all before.
> 
> 
> > o  I suggest you run the modules through 'pyang -f yang
> >   --keep-comments' in order to fix some inconsistent indentations
> 
> Will do.
> 
> 
> Thanks again!
> Kent



/martin