Re: [netmod] AD review: draft-ietf-netmod-revised-datastores-08

Martin Bjorklund <mbj@tail-f.com> Tue, 09 January 2018 11:29 UTC

Date: Tue, 09 Jan 2018 12:28:07 +0100
Message-Id: <20180109.122807.1121028038684414186.mbj@tail-f.com>
To: rwilton@cisco.com
Cc: andy@yumaworks.com, netmod@ietf.org
From: Martin Bjorklund <mbj@tail-f.com>
In-Reply-To: <d2f8abd1-56fb-93b0-da3c-37cf16d2d4db@cisco.com>
References: <cf27d398-1883-c1ce-a54a-4644bac8a1dc@cisco.com> <CABCOCHQCv8ih9uKFxmews_=3c_rX6fSAA=L8vtW91k-pMSHOEg@mail.gmail.com> <d2f8abd1-56fb-93b0-da3c-37cf16d2d4db@cisco.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/netmod/HWAi4pN4tQeYHcCFZwpZZ4FtBHo>
Subject: Re: [netmod] AD review: draft-ietf-netmod-revised-datastores-08

Robert Wilton <rwilton@cisco.com> wrote:
> Hi Andy,
> 
> 
> On 08/01/2018 19:45, Andy Bierman wrote:
> >
> >
> > On Mon, Jan 8, 2018 at 5:55 AM, Robert Wilton <rwilton@cisco.com
> > <mailto:rwilton@cisco.com>> wrote:
> >
> >     Hi Andy,
> >
> >     Regarding your comment below, this intent is captured by this text
> >     describing the operational datastore in section 5.3:
> >
> >         <operational> SHOULD conform to any constraints specified in the
> >         data
> >         model, but given the principal aim of returning "in use" values, it
> >         is possible that constraints MAY be violated under some
> >         circumstances, e.g., an abnormal value is "in use", the structure of
> >         a list is being modified, or due to remnant configuration (see
> >         Section 5.3.1).  Note, that deviations SHOULD be used when it is
> >         known in advance that a device does not fully conform to the
> >         <operational> schema.
> >
> >         Only semantic constraints MAY be violated, these are the YANG
> >         "when",
> >         "must", "mandatory", "unique", "min-elements", and "max-elements"
> >         statements; and the uniqueness of key values.
> >
> >         Syntactic constraints MUST NOT be violated, including hierarchical
> >         organization, identifiers, and type-based constraints.  If a node in
> >         <operational> does not meet the syntactic constraints then it MUST
> >         NOT be returned, and some other mechanism should be used to flag the
> >         error.
> >
> >
> >     Do you agree that this is sufficient?
> >
> >
> >
> > Not really.
> > It does not address my concern, which is that NMDA is
> > removing the YANG constraints on config=false data nodes
> > for no apparent reason.
> 
> There is a reason. I don't think that the constraints on config=false
> nodes are really being removed, because I don't think that they truly
> existed in the first place (despite what RFC 7950 might indicate!).

I agree.  But note that RFC 7950 says:

   o  If the constraint is defined on state data, it MUST be true in a
      valid state data tree.

It is not defined anywhere that <get> must return a "valid state data
tree".

In reality, I suspect that all implementations of <get> call various
instrumentation callback functions in some order, possibly in
parallel, which means that data will be collected at different times
from the backend systems.  I don't think it is feasible to freeze the
operational state of a device, collect all data, and unfreeze, in
order to get a consistent snapshot of the operational state.
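
To illustrate the timing problem described above, here is a minimal
sketch in Python. It is not taken from any real implementation: the
Device class, the two "callbacks", and the "when"-style check are all
hypothetical, and the only point is that reading two related values at
different times can yield a snapshot that never existed on the device.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical backend: one interface whose type and counters change together."""
    if_type: str = "ethernetCsmacd"
    has_ethernet_counters: bool = True

    def swap_optics_to_atm(self) -> None:
        # The backend updates both facts; a reader may still observe
        # one of them before the change and the other after it.
        self.if_type = "atm"
        self.has_ethernet_counters = False


def collect(device, between_callbacks=None):
    """Read two values in sequence, as a <get> calling two callbacks might."""
    snapshot = {"ethernet-counters": device.has_ethernet_counters}  # callback 1
    if between_callbacks is not None:
        between_callbacks()  # the device state changes between the two reads
    snapshot["type"] = device.if_type  # callback 2
    return snapshot


def violates_when(snapshot):
    """Assumed 'when'-style rule: ethernet counters only on ethernetCsmacd."""
    return bool(snapshot["ethernet-counters"] and snapshot["type"] != "ethernetCsmacd")
```

Calling collect() with between_callbacks set to the device's
swap_optics_to_atm method returns a snapshot that violates the rule,
although the device itself was consistent both before and after the
change.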


/martin

> I think that we all agree on the expected behavior for configuration:
> If a client sends configuration to a server that would cause <running>
> to become invalid then the server should reject that change, to ensure
> that <running> always holds a consistent configuration.  Having a
> consistent configuration is the most important property here. 
> I.e. the server has the right to reject an invalid configuration
> request from a client.
> 
> However, the flow of operational state data in the opposite direction
> cannot hold to the same rules.  If during the processing of a get
> request (or YANG push) a server sends operational state data back to a
> client then a client has to choose how to process the message:
>  - if the message is garbled or not sane then it makes sense to
> discard it.
>  - however, what should the client do if the message is well formed
> but either (i) contains some values outside the permitted schema range
> (but representable by the schema datatype), or (ii) applying the
> values would cause the client's copy of <operational> to become
> invalid?
> 
> If the client discards the message because of one bad value, then that
> doesn't seem to be helpful, since it allows for a very fragile model
> of system management.  I.e. if one small thing is bad then the whole
> house of cards collapses.
> 
> So I think that the only sensible behaviour here is that the client
> has to process the operational state update in a best effort fashion,
> keep all the good data and probably flag any values that are outside
> the value constraints.  Any reference constraint failures
> (i.e. when/must) can similarly be flagged up, but throwing away an
> update message that would cause the operational state to become
> inconsistent doesn't seem to be helpful.  I.e. it is much better if
> the client gets to see the true state of the server, even if that
> state isn't good (or consistent).
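
The best-effort client behaviour described above could be sketched
roughly as follows. This is only an illustration under stated
assumptions: the function name, the flat path-to-value update format,
and the range table are hypothetical and do not correspond to any
particular client library.

```python
def process_update(update, ranges):
    """Keep every received value; flag those outside an assumed permitted range.

    update: dict mapping a (hypothetical) data node path to its reported value
    ranges: dict mapping a path to an inclusive (low, high) tuple
    """
    accepted = {}
    flagged = []
    for path, value in update.items():
        # Never discard the value: the client should see the true state.
        accepted[path] = value
        bounds = ranges.get(path)
        if bounds is not None:
            low, high = bounds
            if not (low <= value <= high):
                flagged.append(path)  # record the violation for the operator
    return accepted, flagged
```

With an update containing one out-of-range value, the whole update is
still kept and only the offending path is flagged, instead of the
message being thrown away.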
> 
> Similar questions arise on the server itself:
>  - what if the real value in use (e.g. that is read from the hardware)
> is outside the permitted range (because of a logic defect)?  Is it
> really better to suppress that value entirely or return a value that
> the server knows to be wrong?
>  - can a server even know that its operational view is consistent? For
> complex systems where the real operational state is split across
> multiple underlying linecards, or remote devices, I think that this is
> very hard (if not impossible) to do.
> 
> So what the NMDA architecture states is:
>  (i) if a server knows that it won't conform to the operational schema
> then it must use deviations,
>  (ii) a server in a normal steady state should conform to the
> operational schema (and be valid),
>  (iii) but, if the system is churning (e.g. configuration, route
> update, etc) then the operational state of the server might be
> transiently inconsistent and this is OK,
>  (iv) if the server is in a bad state, then it is better to return
> the actual state than to lie or not report a particular value (as long
> as it can be encoded).
>  (v) a server does not need to explicitly validate that its view of
> operational is valid. It is unclear what it would/could do if it
> detected that the operational state is invalid, nor is it clear that
> servers would generally be able to always perform this operation.
> 
> >
> > The server implementation requirements expressed in YANG constraints
> > are applicable to any data node, not just config=true data nodes.
> > The requirement to implement the ancestor nodes (with keys) does not
> > change.
> 
> The draft does not allow this to be violated.  I.e. the following
> statement prevents this: "Syntactic constraints MUST NOT be violated,
> including hierarchical organization".
> 
> 
> > The requirement to conform to the YANG constraints defined within
> > config=false
> > data nodes does not change.
> >
> > To do otherwise does not make sense.  E.g. "when" conditions that add
> > ethernet
> > counters only when the interface type is ethernetCsmacd. Why would it
> > be OK for
> > the server to ignore that when-stmt and add ethernet counters to every
> > interface?
> 
> It is not OK for a server to ignore that and add Ethernet counters to
> every interface (without using a deviation).  The draft is not trying
> to allow that.
> 
> But if an interface could change type (e.g. between Ethernet and ATM
> via a different optics module being inserted) then it would be allowed
> for a server to transiently report the ethernet counters on the
> interface whilst it is in the process of changing the interface type
> from ethernet to ATM (e.g. if the counters are maintained by a
> separate daemon that is updated asynchronously with respect to the
> config or optics change).  Once the change has completed and the
> system reaches steady state, the Ethernet counters must no longer
> be reported.
> 
> Thanks,
> Rob
> 
> 
> >
> > IMO the text above can only apply to the operational values of
> > config=true nodes.
> >
> >
> >     Thanks,
> >     Rob
> >
> >
> >
> > Andy
> >
> >
> >
> >     On 21/12/2017 22:49, Andy Bierman wrote:
> >>     Hi,
> >>
> >>     It should be clear somehow that server requirements to provide
> >>     config=false data
> >>     that is valid according to the YANG definitions is not affected
> >>     by NMDA.
> >>     That is not being taken away.  The ability to validate
> >>     operational values
> >>     of configuration data has never been provided, and therefore is
> >>     not being taken away either.
> >>
> >>     A constraint on config=true nodes only applies to configuration
> >>     datastores.
> >>     These are the only constraints that should be ignored in
> >>     <operational>.
> >>     Constraints on config=false nodes still apply in <operational>.
> >>
> >>
> >>     Andy
> >>
> >>
> >>
> >>     On Thu, Dec 21, 2017 at 2:27 PM, Juergen Schoenwaelder
> >>     <j.schoenwaelder@jacobs-university.de
> >>     <mailto:j.schoenwaelder@jacobs-university.de>> wrote:
> >>
> >>         On Thu, Dec 21, 2017 at 07:52:54PM +0100, Vladimir Vassilev
> >>         wrote:
> >>         > On 12/21/2017 02:20 PM, Juergen Schoenwaelder wrote:
> >>         >
> >>         > > On Thu, Dec 21, 2017 at 02:03:45PM +0100, Vladimir
> >>         Vassilev wrote:
> >>         > > > On 12/21/2017 11:34 AM, Robert Wilton wrote:
> >>         > > >
> >>         > > > > Hi Vladimir,
> >>         > > > >
> >>         > > > > First point of clarification is that this is not
> >>         about running/intended
> >>         > > > > at all.  The contents of running/intended do not
> >>         change in anyway
> >>         > > > > depending on whether hardware is present or absent.
> >>         > > > >
> >>         > > > > The section is only concerned with how the
> >>         configuration is applied in
> >>         > > > > operational, and basically says that you cannot apply
> >>         configuration for
> >>         > > > > resources that are missing (which seems reasonable). 
> >>         E.g. I cannot
> >>         > > > > configure an IP address on a physical interface that
> >>         isn't there.  Or if
> >>         > > > > the physical interface gets removed then the
> >>         configuration associated
> >>         > > > > with that interface is also removed from operational.
> >>         > > > >
> >>         > > > > Operational isn't validated and data model
> >>         constraints are allowed to be
> >>         > > > > broken (ideally transiently).
> >>         > > > I want to focus on this. IMO giving up schema validity
> >>         for any datastore is
> >>         > > > an unacceptable price. Pre-NMDA devices had full model
> >>         support in operational
> >>         > > > data (all YANG constraints part of the model without
> >>         discrimination were
> >>         > > > enforced).
> >>         > > There was a long debate about the value of returning the true
> >>         > > operational state. What do you do if the operational
> >>         state is invalid?
> >>         > > A server can reject configuration changes if they lead to
> >>         invalid
> >>         > > state, a server can not reject reality.
> >>         > IMO if the model can represent reality then data conforming
> >>         to the model
> >>         > can. If not a better model is needed not a hack that breaks
> >>         the datastore
> >>         > conformance to the YANG model. I do not see how
> >>         > /interfaces/interface/oper-status=not-present was not
> >>         representing the
> >>         > reality of a system with removed line card that is
> >>         configured and ready to
> >>         > resume operation as soon as the line card is reconnected.
> >>
> >>         I assume this is all system and implementation specific. If your
> >>         system knows about interfaces that are not present (i.e.,
> >>         there is
> >>         operational state about them), you can report these
> >>         interfaces.  But
> >>         'is configured' is confusing here. I am not sure a line card
> >>         that does
> >>         not exist should be considered configured. But yes, this may
> >>         be system
> >>         specific. Anyway, draft-ietf-netmod-rfc7223bis-01.txt still has
> >>         oper-status 'not-present' - so this seems to be a moot point.
> >>
> >>         > > > If this is about to change it will compromise
> >>         interoperability
> >>         > > > and a significant portion of the client implementation
> >>         workload that can be
> >>         > > > automated will need to be coded in hand and tested.
> >>         Unresolved leafrefs,
> >>         > > > undefined behaviour of different implementations
> >>         removing different
> >>         > > > configuration nodes in violation of YANG semantic
> >>         constraints (which I do
> >>         > > > not think can be so clearly separated from the
> >>         syntactic constraints when
> >>         > > > one considers types like leafref, instance-identifier
> >>         etc.) and the
> >>         > > > corresponding side effects based on the server
> >>         implementers' own creativity
> >>         > > > is eventually going to create more problems.
> >>         > > >
> >>         > > > 1. IMO the only acceptable solution is to have YANG
> >>         valid operational
> >>         > > > datastore at all times. operational like any other
> >>         datastore MUST be valid
> >>         > > > YANG data tree and it has to be a system implementation
> >>         task to consider all
> >>         > > > complications resulting from the removal of the
> >>         resources leading to any
> >>         > > > data transformations. If this is difficult or
> >>         impossible other mechanisms to
> >>         > > > flag missing resources should be used (e.g.
> >>         > > > /interfaces/interface/oper-status=not-present) This
> >>         sounds like a useful
> >>         > > > contract providing the value of a standard the
> >>         alternative does not.
> >>         > > As said above, it is impossible to report valid
> >>         operational state if
> >>         > > the operational state is not valid according to the models.
> >>         > >
> >>         > > > 2. Even with the change in 1. I do not see the removal
> >>         of intended
> >>         > > > configuration nodes from operational as a solution
> >>         worth implementing on our
> >>         > > > servers. I do not see a real world plug-and-play
> >>         scenario that can be
> >>         > > > automatically solved without specific additions to the
> >>         models e.g.
> >>         > > > /interfaces/interface/oper-status=not-present is
> >>         oversimplified solution but
> >>         > > > it needs to be extended exactly as much as the solution
> >>         provided by the
> >>         > > > removal of config true; nodes without the sacrifice of
> >>         YANG validity of
> >>         > > > operational.
> >>         > > Your thinking is likely wrong. <operational> reports the
> >>         operational
> >>         > > state. It may have little in common with <intended>.
> >>         Trying to derive
> >>         > > operational from intended is likely a not well working
> >>         approach.
> >>         > The proposal for this solution ("derive operational from
> >>         intended" e.g.
> >>         > merge /interfaces-state in /interfaces) comes from the
> >>         revised datastores
> >>         > draft not me.
> >>         >
> >>         > By definition config true; data represents intent. Reusing
> >>         the model of a
> >>         > config true; data to represent state absent of intent (e.g.
> >>         > /interfaces/interface with origin="or:system") is a hack.
> >>         The hack works
> >>         > fine without compromising the conformance of operational to
> >>         the YANG model
> >>         > as long as certain conditions are met. I am pointing out
> >>         that one of the
> >>         > conditions is to keep all of the intended configuration
> >>         data present in
> >>         > 'operational' and handle missing resources with
> >>         conventional means e.g.
> >>         > /interfaces/interface/oper-status=not-present instead of
> >>         adding the straw
> >>         > that breaks the camel's back.
> >>
> >>         I fail to see why you believe all objects that appear in intended
> >>         configuration needs to exist in applied configuration. In fact,
> >>         operators told us very clearly that they care about the
> >>         distinction
> >>         between intended and applied config.
> >>
> >>         > > > 3. Solutions like /interfaces/interface/admin-state
> >>         stop working. With the
> >>         > > > interface removed you can no longer figure if the
> >>         if-mib has or does not
> >>         > > > have the interface enabled so an operator has to use
> >>         SNMP or wait for a
> >>         > > > replacement line card to be connected to figure this
> >>         bit of information.
> >>         > > At least on my boxes, if I remove a line card, the
> >>         interface also
> >>         > > disappears in SNMP tables. Stuff that is operationally
> >>         not present is
> >>         > > simply operationally not present.
> >>         > >
> >>         > > > My
> >>         > > > interpretation of the MAY as requirement level in sec.
> >>         5.3. The Operational
> >>         > > > State Datastore (<operational>) is that plug-and-play
> >>         solutions can be
> >>         > > > implemented without this limited approach that has the
> >>         same problem as the
> >>         > > > pre-NMDA only now we have to have /interfaces-state to
> >>         keep config false;
> >>         > > > data relevant to hardware that is configured but not
> >>         present:
> >>         > > >
> >>         > > >     configuration data nodes supported in a
> >>         configuration datastore
> >>         > > >     MAY be omitted from <operational> if a server is
> >>         not able to
> >>         > > >     accurately report them.
> >>         > > >
> >>         > > > I realize this discussion comes late. I have stated my
> >>         objections to this
> >>         > > > particular part of the NMDA draft earlier.
> >>         > > I believe there is a conceptual misunderstanding. I think
> >>         there never
> >>         > > was a requirement that a server reports the state of
> >>         hardware that is
> >>         > > not present.
> >>         > "Data relevant to hardware that is configured but not
> >>         present" is different
> >>         > from "state of hardware that is not present". For example
> >>         information
> >>         > indicating when the line card became unavailable, what was
> >>         the reason, or
> >>         > other information like how many packets that had this
> >>         interface as egress
> >>         > destination are being dropped as a result of the removal.
> >>
> >>         I think that systems handle non-existing interfaces
> >>         differently. It
> >>         seems that ietf-interfaces is flexible enough to accommodate the
> >>         different styles.
> >>
> >>         /js
> >>
> >>         --
> >>         Juergen Schoenwaelder           Jacobs University Bremen gGmbH
> >>         Phone: +49 421 200 3587         Campus Ring 1 | 28759 Bremen
> >>         | Germany
> >>         Fax:   +49 421 200 3103         <http://www.jacobs-university.de/>
> >>
> >>         _______________________________________________
> >>         netmod mailing list
> >>         netmod@ietf.org <mailto:netmod@ietf.org>
> >>         https://www.ietf.org/mailman/listinfo/netmod
> >>         <https://www.ietf.org/mailman/listinfo/netmod>
> >>
> >>
> >>
> >>
> >
> >
>