Re: [Netconf] WG LC for zerotouch

Kent Watsen <kwatsen@juniper.net> Fri, 29 September 2017 01:44 UTC

From: Kent Watsen <kwatsen@juniper.net>
To: Martin Bjorklund <mbj@tail-f.com>
CC: "netconf@ietf.org" <netconf@ietf.org>
Thread-Topic: [Netconf] WG LC for zerotouch
Thread-Index: AQHTNmkgNr3NKKCDrEalHAYIEUO5aqLHbIGAgAH3n4CAAIp+AIAA6muA
Date: Fri, 29 Sep 2017 01:44:16 +0000
Message-ID: <36B6D286-F8B6-4423-8DD2-4DF74943F2A1@juniper.net>
References: <9F33F0C4-7697-4774-B7F1-A8556A6A73A1@gmail.com> <20170926.192701.1726303533240950350.mbj@tail-f.com> <B08CEFF5-8E7E-4D67-A7E4-9C8992114623@juniper.net> <20170928.094514.526215195914925651.mbj@tail-f.com>
In-Reply-To: <20170928.094514.526215195914925651.mbj@tail-f.com>
Accept-Language: en-US
Content-Language: en-US
user-agent: Microsoft-MacOutlook/f.20.0.170309
received-spf: None (protection.outlook.com: juniper.net does not designate permitted sender hosts)
spamdiagnosticoutput: 1:99
spamdiagnosticmetadata: NSPM
Content-Type: text/plain; charset="utf-8"
Content-ID: <8711C489189E9344B089A533EFC54A9B@namprd05.prod.outlook.com>
Content-Transfer-Encoding: base64
MIME-Version: 1.0
X-MS-Exchange-CrossTenant-originalarrivaltime: 29 Sep 2017 01:44:16.1657 (UTC)
X-MS-Exchange-CrossTenant-fromentityheader: Hosted
X-MS-Exchange-CrossTenant-id: bea78b3c-4cdb-4130-854a-1d193232e5f4
X-MS-Exchange-Transport-CrossTenantHeadersStamped: BLUPR05MB273
Archived-At: <https://mailarchive.ietf.org/arch/msg/netconf/_fdahdpx8-ef_D4MAFq2aIa6jx0>
Subject: Re: [Netconf] WG LC for zerotouch
X-BeenThere: netconf@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: Network Configuration WG mailing list <netconf.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/netconf>, <mailto:netconf-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/netconf/>
List-Post: <mailto:netconf@ietf.org>
List-Help: <mailto:netconf-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/netconf>, <mailto:netconf-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 29 Sep 2017 01:44:24 -0000

Hi Martin,

I trimmed the discussion down to just the open threads.


>> > o  Section 7.2
>> >
>> >  Adding custom query parameters like this breaks the normal YANG
>> >  contract.  If I understand this correctly, the instance data tree
>> >  presented to the client is supposed to change based on these query
>> >  parameters.
>> 
>> Correct.  The idea is that the response could vary based on the input
>> query params.  Of course, in case there was any doubt, the response 
>> would always conform to same YANG schema.
>
>See below.
>
>> >  If you really need to have that functionality, it would better to
>> >  model the device-specific data as an action; input would be the
>> >  os-name, os-version etc, and output would be the
>> >  zerotouch-information and voucher etc.
>> 
>> I thought of this as well, but didn't do anything about it as my 
>> last message to the list said that we'd add "query parameters"
>> and I wanted to follow through on that statement.
>> 
>> But what is the issue?  Is it primarily that using query params 
>> on GET should be limited to returning different representations
>> of a resource, and this use seems to be returning different 
>> resources altogether?
>
> Yes, this is my concern.  A normal RESTCONF / YANG server has *one*
> single instance of the operational state, and it just provides an API
> to this data.  It can be filtered and pruned based on access rights,
> but not modified from the outside based on query parameters.

I agree, it seems more proper.  Okay, let's make this change.  No
objections from the WG, right?  As a heads-up for those following
along, this change doesn't affect the essence of the solution, but
the diff will appear somewhat big...


>> FWIW, if we were to do this, we might also consider moving the 
>> 'serial-number' field from the URL to an input field
>
> Hmm, did you mean 'unique-id'?

Yes. But what about the idea, moving 'unique-id' from the URL path
to an input param?  Or, even more dramatically, we could remove 
'unique-id' altogether, relying instead on the RESTCONF username
extracted from the DevID certificate?

If relying on the extracted username, note that none of the 
'cert-to-name' identities defined in RFC 7407, which is also
used in draft-ietf-netconf-restconf-client-server, would work.
The 'common-name' identity is closest, but what is really needed
is a combination of a 'serialNumber' attribute and an optional
'hardwareModuleName' attribute (not all DevID certs define it).
I suppose that this is an issue for the afore-mentioned restconf
client/server draft more so than this one though.  Still, since
we're talking about it, assuming we did this, would that draft
add said identities, or would a 7407bis be better?


>> , and make
>> a similar change to the 'update-progress' action.  But that then
>> would mess up your NACM rule below.  Thoughts?
>
>We can have:
>
>  list device {
>    key unique-id;
>
>    leaf unique-id { ... }
>
>    action get-bootstrapping-data {
>      input {
>        leaf os-name { ... }
>        leaf os-version { ... }
>        ...
>      }
>      output {
>        leaf zerotouch-information { ... }
>        leaf ownership-voucher { ... }
>        leaf owner-certificate { ... }
>      }
>    }
>
>    action report-progress {  // I like this name better :)
>      ...
>    }
>  }

Yes, this is roughly what switching to an action statement would look
like though, pending on the above discussion, we may use top-level RPCs
instead of actions, if relying on the RESTCONF username.

I agree, 'report-progress' does seem better.  It's a minor thing, but
happy to oblige.  (no objections from the WG, right?)


>> > o  Section 4.4
>> >
>> >     When a device is
>> >     not able to trust a bootstrap server, it MUST NOT send its IDevID
>> >     certificate in the form of a TLS client certificate
>> >
>> >  How will the server authenticate the client in this case?
>> 
>> The idea is that the server doesn't auth the client in this case.
>> The goal behind not sending the IDevID certificate was so a rouge
>> bootstrap server wouldn't be able to discover the device's serial
>> number, which is why Appendix B suggests also masking the serial
>> number the device sends in the URL.  That said, we might need to
>> rethink this, especially in conjunction with your next comment.  
>> (continued below)  
>> 
>> 
>> >  I see that in section 7.3 you have:
>> >
>> >   Note that the bootstrap server MUST NOT process a progress update
>> >   from a device without first authenticating the device.  This is in
>> >   contrast to when a device is fetching data from the server, a read-
>> >   only operation, in which case device authentication is not strictly
>> >   required (e.g., when sending signed information).
>> >
>> >  But the server is a normal RESTCONF server, so it will follow
>> >  section 2.5 of RFC 8040; it will require clients to be
>> >  authenticated.
>> 
>> I was hoping that the draft could suppress this auth requirement
>> for this specific "untrusted" interaction.  In essence, reducing
>> the bootstrap server to a simple file server providing anonymous
>> read-access.
>> 
>> But in the security analysis of things, it comes down to what's 
>> more important to prevent:
>>  a) a rogue server discovering a valid device's serial number,
>>     while not being able to provide the device a response the
>>     device can trust, or
>>  b) a valid server giving signed redirect information to a
>>     rogue device, where the information given doesn't provide
>>     the device any more information than the device must have
>>     already had to make the request in the first place, other
>>     than a confirmation that indeed that serial number is one
>>     that the server is expecting.  
>> 
>> Looking at it this way, (b) is more important to prevent and, 
>> besides:
>> 1) there is already a Security Consideration related to the use
>>    of serial numbers in the solution, so (a) seems to be just
>>    more of the same, 
>> 2) rogue connections could play havoc with the server's logs, and
>> 3) (to your point) it would be tricky business to define an
>>    exception for the standard RESTCONF server to follow.
>> 
>> All this is to say that I think we should go back to the device
>> always sending its IDevID cert and, of course, the server always
>> authenticating the device.  But I also think that a device should
>> continue NOT sending any other information (i.e. query params,
>> progress updates, etc.), beyond its serial number & IDevID cert,
>> until it can fully authenticate the server.
>
> Ok.

Okay, it seems that we agree.  Any objections from the WG?  In case it's
unclear, this change is fairly significant from a solution-perspective.
It will also have a non-trivial impact from a draft-diff perspective.
Please chime in now with any opinions now.

 
>> > o  Section 5.6
>> >
>> >     Upon rebooting, the device MUST still be
>> >     in its initial state, causing the bootstrapping process to run again,
>> >     which will eventually come to this very point,
>> >
>> >  Why this MUST?  What if the new boot image contains other factory
>> >  defaults that does not enable ZTP, would that be illegal?
>> >
>> >  I would think that implementations are free to do whatever they want
>> >  in case of errors.
>> 
>> Maybe not *illegal*, but it would certainly brick the device at that
>> point.  Would just reducing the "MUST" to a "must" suffice, or should
>> we add a qualifier along the lines of "in order for the solution to
>> work..." ?
>
> I would just say something like 
>
>     Upon rebooting, if the new image also enables zerotouch, the
>     device will still be
>     in its initial state, causing the bootstrapping process to run again,
>     which will eventually come to this very point,

Okay.


>> >    In the case of errors, the device MUST reset
>> >    itself in such a way that forces a reinstallation of the boot image,
>> >    thereby wiping out any bad state the script may have left behind.
>> >
>> >  Is this also required?  Why can't stop and wait for manual
>> >  intervention be allowed?
>> 
>> It could, but again, it would brick the device.  Same question as above,
>> would reducing the "MUST" to a "must" suffice, or should we add a 
>> qualifier along the lines of "in order for the solution to work..." ?
>
> Well, this depends if the error is transient or not.  If it is not
> transient, it will just happen again and again.

Maybe, maybe not.  The device could potentially land onto another/different
source of bootstrapping data that works the next go around.  The hope is
that the device keeps trying (and posting progress-updates regarding its
errors) ad infinitum until things eventually clear up.  For instance, an
admin eventually takes notice and administers corrective action.

> I don't have a strong opinion, but I would probably be silent about
> what to do in case of these kinds of errors.  Leave that to the
> implementation to figure out.

I feel that it's critical that manual intervention is never required.
Yes, a manual step could be employed, but it should be a rare step.  I
think it's best to say something along these lines.  I'll try to state
this more clearly.


>> >  I suggest you shorten the lines of the examples even more, so that
>> >  the examples are properly indented in the draft (they are currently
>> >  outdented)
>> 
>> It's hard to make the lines much shorter with the URN being so long.
>> Options are 1) don't indent just the xmlns line or 2) move from two-
>> space indent to single-space indent.  Do you have a preference?
>
> In the second example you use (1).  You can also use "\", which you
> are also already using.

So I do, albeit only in the HTTP-header parts, not the HTTP-body parts.

FWIW, my build-script inserts source files in situ as directed by
processing instructions. I can modify the build-script to do the 
auto-'\' insertion logic on some column-boundary, but it may need
a complimentary update on the extraction tools.  Do we need some 
sort of standard around this?

I'm aware that here you're only talking about examples, which 
currently are not extracted via tools, but the same auto-'\' 
insertion logic could also help with the YANG modules, as they
could become unwieldy too...though I suppose liberal use of 
groupings could always be employed to keep line-lengths in
check.  Okay, but still, a YANG Doctors discussion at IETF 99
suggested introducing the ability to auto-extract and validate
examples in drafts too, to automate as much of the YD review
process as possible, so still I think there may be a need for
a '\' standard of some sort here.  Thoguhts?



> I just found another outdented figure in section 5.3, which probably
> is easier to fix.

Huh?  I don't see it.  The figure in 5.3 looks okay to me...


>> > o  Section 9
>> >
>> >  Should you also mention that /device list should be protected by
>> >  nacm rules?
>> 
>> Yes, indeed, I left out the standard security template for the bootstrap
>> server YANG module.  Will add.
>> 
>> 
>> >  If the RESTCONF user name is the device's unique-id, then a single
>> >  NACM rule can be used:
>> >
>> >       <rule>
>> >         <name>allow-device</name>
>> >         <path xmlns:ztbs="urn:ietf:params:xml:ns:yang:\ietf-zerotouch-bootstrap-server">
>> >           /ztbs:device[ztbs:unique-id=$USER]
>> >         </path>
>> >         <access-operations>read</access-operations>
>> >         <action>permit</action>
>> >       </rule>
>> 
>> Yes, but would you expect this to appear in the draft?  Typically, the specific
>> NACM rule syntax is left as an exercise to the reader...
>
> In order to do this, the "username" must be the same as the
> "unique-id".  This requires the client cert to be constructed
> properly.

Agreed.  In this case, the client cert is a DevID cert, and  802.1AR 
governs its construction.  Not too many options there.  Of course, 
the mapping of said cert to username is being discussed above wrt the
cert-to-name mapping logic... 


>> Since the NACM rule is not obvious, and due to the cert implications,
>> I think it would be very helpful with an appendix that describes how
>> this can be set up.

I think that this will depend on if we go with an action or a top-level 
RPC. If an action, then NACM may have a place and an example may help.
But, if using a top-level RPC, then it seems that the server has no 
option but to auth the DevID cert and provide a RESTCONF username-specific
response - NACM can't constrain it any further, right?  (note: this seems
more secure to me - I like it)


>> > o  Other comment
>> >
>> >  Is it assumed that there will be a vendor-specific YANG module to
>> >  enable the zero-touch process?  Did you consider an
>> >  ietf-zerotouch-device module for this purpose?  Such a configuration
>> >  must be carefully designed to allow for a "merge" configuration file
>> >  to actually disable the zero touch process.
>> 
>> It is assumed to be vendor-specific currently.  I have not thought to
>> define a ietf-zerotouch-device module as of yet that, presumably, would
>> contain a boolean or an enum (but not a p-container) for enabling the
>> service.  Do you think we should add such a module to the draft, or just
>> put a note somewhere about it?
>
> I don't have a strong opinion, but if there's not a standard module
> for this, the text should mention that it is assumed that vendors
> define this themselves.   Hmm, I think I would prefer a standard
> module; it would make the solution more complete.

We can add a module, but maybe it should be called something like
ietf-zerotouch-bootstrap-client, to mirror the ietf-zerotouch-
bootstrap-server module? - though this may be a mismatch as the
former regards a southbound protocol and the latter regards a
configuration model...

Maybe we can discuss what would be in this model.  I'm not sure if
I'm oversimplifying it, but I think the module might be just a single
config true leaf called something like "enabled".   There might also
be some config false leafs for the stuff mentioned in section 5.1, 
though it doing so is not so important since the bootstrapping only
occurs once (not ongoing management).  So a module would primarily
be for the "enabled" leaf.  Anything else?  Still think it's worth it?


Thanks again,
Kent

Re: [Netconf] WG LC for zerotouch Martin Bjorklund
[Netconf] WG LC for zerotouch Mahesh Jethanandani
Re: [Netconf] WG LC for zerotouch Kent Watsen
Re: [Netconf] WG LC for zerotouch Martin Bjorklund
Re: [Netconf] WG LC for zerotouch Kent Watsen
Re: [Netconf] WG LC for zerotouch Martin Bjorklund
Re: [Netconf] WG LC for zerotouch Kent Watsen
Re: [Netconf] WG LC for zerotouch Martin Bjorklund
Re: [Netconf] WG LC for zerotouch Kent Watsen
Re: [Netconf] WG LC for zerotouch Martin Bjorklund
[Netconf] IPR and known implementation (Was Re: W… Mahesh Jethanandani
Re: [Netconf] IPR and known implementation (Was R… Mikael Abrahamsson
Re: [Netconf] IPR and known implementation (Was R… ianfarrer
Re: [Netconf] IPR and known implementation (Was R… Kent Watsen
Re: [Netconf] IPR and known implementation (Was R… Kent Watsen