Re: [netmod] AD review: draft-ietf-netmod-revised-datastores-08

Robert Wilton <rwilton@cisco.com> Thu, 21 December 2017 14:54 UTC

To: Vladimir Vassilev <vladimir@transpacket.com>, NETMOD Working Group <netmod@ietf.org>
References: <e2fd599f-7547-d2f7-d450-f67a3f409ae1@cisco.com> <fe856e5c-5760-9bb9-ace3-cec0cfb39278@cisco.com> <79d1baae-397d-883e-3bc0-e1c5f71fc4f8@transpacket.com> <64f59023-e000-18c4-8830-29ba6e9be7e9@cisco.com> <6e899e21-8931-b61c-3b73-6c8a8a1c912a@transpacket.com>
From: Robert Wilton <rwilton@cisco.com>
Message-ID: <537f059a-d2ed-1242-8636-11e5c125e0a2@cisco.com>
Date: Thu, 21 Dec 2017 14:53:52 +0000
In-Reply-To: <6e899e21-8931-b61c-3b73-6c8a8a1c912a@transpacket.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/netmod/GymJ8wrFHzXZelkCBdsrOpyQ0RE>


On 21/12/2017 13:03, Vladimir Vassilev wrote:
> On 12/21/2017 11:34 AM, Robert Wilton wrote:
>
>> Hi Vladimir,
>>
>> First point of clarification is that this is not about 
>> running/intended at all.  The contents of running/intended do not 
>> change in any way depending on whether hardware is present or absent.
>>
>> The section is only concerned with how the configuration is applied 
>> in operational, and basically says that you cannot apply 
>> configuration for resources that are missing (which seems 
>> reasonable).  E.g. I cannot configure an IP address on a physical 
>> interface that isn't there.  Or if the physical interface gets 
>> removed then the configuration associated with that interface is also 
>> removed from operational.
>>
>> Operational isn't validated and data model constraints are allowed to 
>> be broken (ideally transiently).
> I want to focus on this. IMO giving up schema validity for any 
> datastore is an unacceptable price. Pre-NMDA devices had full model 
> support in operational data (all YANG constraints that are part of the 
> model were enforced without discrimination). If this is about to change it 
> will compromise interoperability and a significant portion of the 
> client implementation workload that can be automated will need to be 
> coded by hand and tested.
I don't agree with this.  A client can easily see if configuration has 
been applied by looking at the corresponding data node in the 
operational datastore.  If <operational> is fully implemented then this 
applies to any data node in the configuration.
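
A minimal sketch of that client-side check follows. It assumes the two datastores have already been fetched and flattened into path-to-value dicts; the paths, values, and the function name unapplied_config are made up for illustration and are not taken from the draft.

```python
def unapplied_config(intended, operational):
    """Return intended paths that are absent from, or differ in, operational.

    Both arguments are flat dicts mapping a data node path to its value.
    The result maps each such path to a (wanted, actual) pair, where
    actual is None if the node is missing from operational.
    """
    result = {}
    for path, wanted in intended.items():
        actual = operational.get(path)
        if actual != wanted:
            result[path] = (wanted, actual)
    return result


# Hypothetical contents of <intended> and <operational>.
intended = {
    "/interfaces/interface[name='eth0']/enabled": "true",
    "/interfaces/interface[name='eth0']/mtu": "9000",
    "/interfaces/interface[name='eth1']/enabled": "true",
}
operational = {
    "/interfaces/interface[name='eth0']/enabled": "true",
    "/interfaces/interface[name='eth0']/mtu": "1500",
}

pending = unapplied_config(intended, operational)
```

Here eth1 is reported as not applied because it is missing from the operational dict, and the eth0 MTU is reported because the value in use differs from the intended one.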


> Unresolved leafrefs, undefined behaviour of different implementations 
> removing different configuration nodes in violation of YANG semantic 
> constraints (which I do not think can be so clearly separated from the 
> syntactic constraints when one considers types like leafref, 
> instance-identifier etc.) and the corresponding side effects based on 
> the server implementers' own creativity is eventually going to create 
> more problems.
I believe that returning the truth is more important than returning 
false information (or no information at all) to satisfy constraints in a 
schema model.

>
> 1. IMO the only acceptable solution is to have YANG valid operational 
> datastore at all times. operational like any other datastore MUST be 
> valid YANG data tree and it has to be a system implementation task to 
> consider all complications resulting from the removal of the resources 
> leading to any data transformations. If this is difficult or 
> impossible other mechanisms to flag missing resources should be used 
> (e.g. /interfaces/interface/oper-status=not-present) This sounds like 
> a useful contract providing the value of a standard the alternative 
> does not.
I think that forcing operational to always be consistent quickly falls 
down as a solution:
  - In the case of a device with multiple linecards, you cannot force 
them to always be in lockstep with each other.
  - In the case of a device with multiple concurrent daemon processes, you 
cannot force them all to always have a consistent view of the 
operational state.
  - When configuration is changing, the system may go through transient 
states where the operational state is invalid.
  - The system might run out of memory, or daemons might crash, fail, or 
become corrupted, all leaving the system in an invalid operational state.
  - You can't stop someone from physically removing a piece of hardware, 
which could immediately make the applied configuration and operational 
state inconsistent (or inaccurate).

If you don't allow invalid states to be reported then you merely prevent 
the client from querying any operational state from the device whilst it 
is in an inconsistent state, i.e. you make the device harder to manage 
and less maintainable.
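
To illustrate, here is a rough sketch of a client that tolerates an unresolved reference in <operational> instead of rejecting the whole reply. It borrows the filter-with-egress-port example from earlier in the thread; the dict layout, the field names, and split_by_reference are assumptions made up for this example.

```python
def split_by_reference(filters, present_interfaces):
    """Partition filters by whether their egress interface is present.

    Filters whose reference cannot be resolved are kept and returned
    separately rather than dropped, so their counters stay readable.
    """
    resolved, dangling = [], []
    for flt in filters:
        target = resolved if flt["egress"] in present_interfaces else dangling
        target.append(flt)
    return resolved, dangling


# Hypothetical <operational> content after the linecard hosting eth1 is pulled.
present_interfaces = {"eth0"}
filters = [
    {"name": "f1", "egress": "eth0", "matched-packets": 120},
    {"name": "f2", "egress": "eth1", "matched-packets": 45},
]

resolved, dangling = split_by_reference(filters, present_interfaces)
```

The client can still read the statistics for f2 and can flag that its egress port is currently missing, which would not be possible if the server refused to report the inconsistent state.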

The "other mechanism" that you describe above sounds like it requires an 
ad hoc solution to this problem for every schema node. The <operational> 
datastore solves this problem in a generic way.  A client can always see 
what configuration is currently applied in the system.


>
> 2. Even with the change in 1. I do not see the removal of intended 
> configuration nodes from operational as a solution worth implementing 
> on our servers. I do not see a real world plug-and-play scenario that 
> can be automatically solved without specific additions to the models 
> e.g. /interfaces/interface/oper-status=not-present is oversimplified 
> solution but it needs to be extended exactly as much as the solution 
> provided by the removal of config true; nodes without the sacrifice of 
> YANG validity of operational.
There is somewhat of an implementation choice in this.

The server is obliged to return what is "in use" in operational, but it 
is up to the server vendor to decide what "in use" actually means for a 
particular item of configuration (e.g. for a configured value that might 
be distributed to multiple internal daemons).
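
As a sketch of that implementation choice, suppose a configured value is pushed to several internal daemons and each one acknowledges it independently. The daemon names and the two policies below are invented for illustration; the draft does not define them.

```python
def is_in_use(daemon_acks, policy="all"):
    """Decide whether a configured value counts as "in use".

    daemon_acks maps a daemon name to True if that daemon has applied
    the value. policy is a vendor choice:
      "all" - every daemon must have applied it
      "any" - at least one daemon has applied it
    """
    if policy == "all":
        return bool(daemon_acks) and all(daemon_acks.values())
    if policy == "any":
        return any(daemon_acks.values())
    raise ValueError(f"unknown policy: {policy!r}")


# Hypothetical acknowledgements for one configured value.
acks = {"rib": True, "fib": True, "linecard-0": False}

strict = is_in_use(acks, "all")
lenient = is_in_use(acks, "any")
```

Under the strict policy the value would not yet appear in <operational>; under the lenient one it would. Either reading is consistent with leaving the meaning of "in use" to the server vendor.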


> 3. Solutions like /interfaces/interface/admin-state stop working. With 
> the interface removed you can no longer figure if the if-mib has or 
> does not have the interface enabled so an operator has to use SNMP or 
> wait for a replacement line card to be connected to figure this bit of 
> information. My interpretation of the MAY as requirement level in sec. 
> 5.3. The Operational State Datastore (<operational>) is that 
> plug-and-play solutions can be implemented without this limited 
> approach that has the same problem as the pre-NMDA only now we have to 
> have /interfaces-state to keep config false; data relevant to hardware 
> that is configured but not present:
Admin-status is the intended configuration; clients can always query 
<running> or <intended> to see the desired state of the system.  They 
can also query <operational> and see that the interface doesn't 
currently exist in the system.  The rest of the properties associated 
with the interface seem somewhat irrelevant at that point.

/interfaces-state is deprecated and going away.  Its direct equivalent 
is /interfaces in <operational>.  I'm not sure that the semantics 
of the two have necessarily changed very much, except that the NMDA 
architecture now describes a formal mechanism for hardware removal, 
whereas previously it was left entirely to the vendors to each do their 
own thing.

For some of our systems, the applied configuration for a physical 
interface is managed on the linecard hosting that interface.  If that 
linecard is pulled out then all of that applied configuration really has 
gone.  In other cases, a user might remove the optics module for an 
interface, in which case the interface still exists but the interface is 
reported as being operationally down.
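
The two cases can be sketched as follows. This is only an illustration of the behaviour described above, not of any particular product; the dict layout and the names operational_interfaces, linecards_present, and optics_present are assumptions.

```python
def operational_interfaces(intended, linecards_present, optics_present):
    """Build a toy <operational> view of the interfaces.

    An interface whose linecard is absent is omitted entirely, since its
    applied configuration went away with the card. An interface whose
    linecard is present but whose optics are missing is still reported,
    with oper-status "down".
    """
    view = {}
    for name, cfg in intended.items():
        if cfg["linecard"] not in linecards_present:
            continue  # applied config is gone along with the linecard
        is_up = cfg["enabled"] and name in optics_present
        view[name] = {
            "enabled": cfg["enabled"],
            "oper-status": "up" if is_up else "down",
        }
    return view


# Hypothetical intended configuration for three interfaces.
intended = {
    "eth0/0": {"linecard": 0, "enabled": True},
    "eth0/1": {"linecard": 0, "enabled": True},
    "eth1/0": {"linecard": 1, "enabled": True},
}

# Linecard 1 has been pulled; the optics for eth0/1 have been removed.
view = operational_interfaces(
    intended, linecards_present={0}, optics_present={"eth0/0"}
)
```

In this example eth1/0 does not appear in the view at all, while eth0/1 appears with oper-status "down". All three interfaces remain unchanged in the intended configuration.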


>
>    configuration data nodes supported in a configuration datastore
>    MAY be omitted from <operational> if a server is not able to
>    accurately report them.
>
> I realize this discussion comes late. I have stated my objections to 
> this particular part of the NMDA draft earlier.
Yes, this discussion does seem very late.

Did you raise your comments during either of the WG LCs?  I thought that 
I had been pretty diligent in tracking and replying to all issues that 
had been raised during the WG LCs.

Thanks,
Rob


>
> Vladimir
>
>>   But I agree that there could be configuration that is referencing 
>> those missing resources, and depending on implementation then that 
>> configuration may need to become not applied as well.  Or perhaps the 
>> failure is reported in a different way (e.g. IGP neighbor is down).
>>
>> I also agree that this is non trivial, but the systems that I am 
>> familiar with have always had to deal with this issue.  At the data 
>> model level I don't think that this is any more complex than the 
>> existing 'when' statement processing that has exactly the same issues 
>> if a "when" statement becomes invalid during a config change and 
>> requires the associated configuration to be deleted (which again can 
>> recursively require configuration to be removed).
>>
>> Alternative solutions are:
>>  - mandate that nobody physically removes a linecard if there is 
>> still configuration referencing it, but it is hard to enforce this in 
>> software :-)
>>  - freeze the config from any further changes if a linecard is 
>> removed that makes the config invalid, but this doesn't seem like a 
>> robust solution ...
>>
>> I think that the existing solution is the best approach.
>>
>> A couple of further comments inline below as well ...
>>
>> On 20/12/2017 21:44, Vladimir Vassilev wrote:
>>> Hello,
>>>
>>> On 12/20/2017 05:40 PM, Benoit Claise wrote:
>>>
>>>> Dear all,
>>>>
>>>> In order not to be the bottleneck in the process and assuming that 
>>>> the document will be in "publication requested" pretty soon, here 
>>>> is my AD review of draft-ietf-netmod-revised-datastores-08
>>>>
>>>> -
>>>>
>>>>
>>>>         5.3.2. Missing Resources
>>>>
>>>>     Configuration in <intended> can refer to resources that are not
>>>>     available or otherwise not physically present.  In these situations,
>>>>     these parts of <intended> are not applied.  The data appears in
>>>>     <intended> but does not appear in <operational>.
>>>
>>> I have some concerns with this section.
>>>
>>> Systems implementing this are expected to remove config true; nodes 
>>> while figuring the necessary changes to ensure the remaining set of 
>>> config true; nodes in operational validates against the operational 
>>> datastore model. The implementation of this is not a trivial task at 
>>> all. In order to remove configuration nodes considered inactive on 
>>> the fly one needs to remove all references to those nodes in 
>>> mandatory leafrefs in the best case and a potentially long and 
>>> complex dependency chain of YANG constrain-statements (Xpath etc.) 
>>> have to be resolved in a worse case. It is difficult to automate 
>>> this. It requires significant effort to track and remove/fix all 
>>> those dependencies just to come up with valid configuration that 
>>> represents the configuration without the "inactive" nodes which in 
>>> many usecases is completely unjustified implementation effort.
>>>
>>> In addition in many cases it is not desirable to remove config true; 
>>> nodes that depended on a removed resource. For example:
>>>
>>> 1. A configuration instance of a filter with mandatory interface-ref 
>>> ingress and egress ports has to be removed from the operational 
>>> datastore if the egress port is removed as a physical resource. This 
>>> in effect removes the config false; statistics that might be still 
>>> of interest counting the matched traffic while the filter does not 
>>> have physical egress port to send the packets.
>> This isn't necessarily true.  The architecture does not require that 
>> the filter object is removed because operational is allowed to 
>> violate the constraints.  Ultimately I think that the behaviour here 
>> will depend on implementation.
>>
>>>
>>> 2. Alarm that is configured with mandatory reference to the missing 
>>> resource containing a counter of the elapsed time since the resource 
>>> went missing etc.
>> Again, the draft does not require that the alarm becomes not 
>> applied.  This also depends on the implementation.
>>
>> Thanks,
>> Rob
>>
>>
>>>
>>> I do not find any text in the draft addressing the concerns above. I 
>>> do not propose a change yet but I hope to hear what others think 
>>> about that.
>>>
>>> Vladimir
>>>
>>>> I understand what you want to say.
>>>> Let me take an example. I have a router with a Line Card configured 
>>>> and working well. if I remove the LC, the configuration should 
>>>> still be in the <running> and <intended> but not in <operational>.
>>>> However, based on figure below, the notion of "inactive" nodes 
>>>> might be misleading. Indeed, people might read that the LC is 
>>>> inactive, so the LC configuration should not be in <intended>
>>>>       +-------------+                 +-----------+
>>>>       | <candidate> |                 | <startup> |
>>>>       |  (ct, rw)   |<---+       +--->| (ct, rw)  |
>>>>       +-------------+    |       |    +-----------+
>>>>              |           |       |           |
>>>>              |         +-----------+         |
>>>>              +-------->| <running> |<--------+
>>>>                        | (ct, rw)  |
>>>>                        +-----------+
>>>>                              |
>>>>                              |        // configuration transformations,
>>>>                              |        // e.g., removal of "inactive"
>>>>                              |        // nodes, expansion of templates
>>>>                              v
>>>>                        +------------+
>>>>                        | <intended> | // subject to validation
>>>>                        | (ct, ro)   |
>>>>                        +------------+
>>>> I understand that "inactive nodes" has a different meaning.
>>>>
>>>> Proposal:
>>>> OLD: removal of "inactive" nodes
>>>> NEW: removal of the nodes marked as "inactive"
>>>>
>>>> - In the C.1 example,
>>>>     <system
>>>>         xmlns="urn:example:system"
>>>>         xmlns:or="urn:ietf:params:xml:ns:yang:ietf-origin">
>>>>
>>>>       <hostname or:origin="or:dynamic">bar</hostname>
>>>>
>>>>       <interface or:origin="or:intended">
>>>>         <name>eth0</name>
>>>>         <auto-negotiation>
>>>>           <enabled or:origin="or:default">true</enabled>
>>>>           <speed>1000</speed>
>>>>         </auto-negotiation>
>>>>         <speed>100</speed>
>>>>         <address>
>>>>           <ip>2001:db8::10</ip>
>>>>           <prefix-length>64</prefix-length>
>>>>         </address>
>>>>         <address or:origin="or:dynamic">
>>>>           <ip>2001:db8::1:100</ip>
>>>>           <prefix-length>64</prefix-length>
>>>>         </address>
>>>>       </interface>
>>>> I guess it "or:dynamic" should be replaced by "or:learned"
>>>>
>>>> Justification:
>>>>
>>>>       identity learned {
>>>>         base origin;
>>>>         description
>>>>           "Denotes configuration learned from protocol interactions with
>>>>            other devices, instead of via either the intended
>>>>            configuration datastore or any dynamic configuration
>>>>            datastore.
>>>>
>>>>            Examples of protocols that provide learned configuration
>>>>            include link-layer negotiations, routing protocols, _and DHCP._";
>>>>
>>>> _Editorial:_
>>>>
>>>> - number the figures
>>>>
>>>> - section 8.2
>>>>     This document registers two YANG modules in the YANG Module Names
>>>>     registry [RFC6020 <https://tools.ietf.org/html/rfc6020>].  
>>>> Following the format in [RFC6020 
>>>> <https://tools.ietf.org/html/rfc6020>], the the
>>>>     following registrations are requested:
>>>>
>>>> duplicated "the the"
>>>>   Regards, Benoit (OPS AD)
>>>>
>>>>
>>>> _______________________________________________
>>>> netmod mailing list
>>>> netmod@ietf.org
>>>> https://www.ietf.org/mailman/listinfo/netmod
>>>
>>
>
> .
>