Re: [Netconf] mbj's WGLC review of subscribed-notifications-16

Martin Bjorklund <mbj@tail-f.com> Wed, 12 September 2018 09:34 UTC

Date: Wed, 12 Sep 2018 11:34:46 +0200
Message-Id: <20180912.113446.2017030234338816835.mbj@tail-f.com>
To: evoit@cisco.com
Cc: netconf@ietf.org
From: Martin Bjorklund <mbj@tail-f.com>
In-Reply-To: <aad419155e494b69933b6fb24c6685db@XCH-RTP-013.cisco.com>
References: <20180910.194152.138356385922962289.mbj@tail-f.com> <aad419155e494b69933b6fb24c6685db@XCH-RTP-013.cisco.com>
Mime-Version: 1.0
Content-Type: Text/Plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/netconf/CZrlVYIyKyCOl5KOyv5RD9WVUPc>
Subject: Re: [Netconf] mbj's WGLC review of subscribed-notifications-16
Precedence: list

"Eric Voit (evoit)" <evoit@cisco.com> wrote:
> Thanks for the comments Martin,  thoughts in-line...
> 
> > From: Martin Bjorklund, September 10, 2018 1:42 PM

[...]

> > o  Section 2.1
> > 
> >   The text says:
> > 
> >     It is out of the scope of this document
> >     to identify [...] b) how event records are defined
> > 
> >   Is this really what we want?  Shouldn't this document specify that
> >   an event record is an instance of a YANG "notification" statement?
> >   Do we want to support event records that are not defined in YANG?
> >
> >   This was discussed in the thread "comments on
> >   draft-ietf-netconf-subscribed-notifications-12", and it seems that
> >   noone really have any use case for non-YANG-defined event records,
> >   accessed with YANG-defined operations.
> 
> Yes this was discussed previously.  My reading was that others wanted
> to support subscription to non-YANG events.
> E.g., see Andy & Tianran's thoughts in
> https://www.ietf.org/mail-archive/web/netconf/current/msg14764.html 

Yes, but read on in that thread, see
https://www.ietf.org/mail-archive/web/netconf/current/msg14766.html
for a clarification.

AFAIK, noone has presented a use case for sending non-YANG-defined
data here.

> >   I think that this document should state that an event record is an
> >   instance of a YANG "notification" statement.  If not, it is unclear
> >   both how subtree/xpath filters and xml/json/... encodings would work.
> 
> Applying xpath filters against XML based event records seems viable to
> me.

Where does it say that it has to be XML?  Whatabout mixed-content?
What if such a notification is sent when the encoding is JSON?

If you want to make this specification open for other data modeling
languages, I think you need to carefully define how that is supposed
to work; essentially similar to how we handle transports; we have a
set of requirement on what a transport document must define.  IMO this
is super complicated and overkill.  This draft is complex enough.
Don't add even more complexity.

> Of course it will be up to the definition of any particular
> stream to articulate the event record contents sufficiently for
> filters to be created.
> 
> > o  Section 2.1
> > 
> >   The text says:
> > 
> >     Among
> >     these included NETCONF XML event records are individual YANG 1.1
> >     notifications described in section 7.16 of [RFC7950].
> > 
> >   First of all, does this mean that there can be event records on the
> >   "NETCONF" stream that are *not* defined with a "notification"
> >   statement?
> 
> For the NETCONF stream, personally I only care about things YANG
> notification statement.  But looking again at Andy's comment from:
> https://www.ietf.org/mail-archive/web/netconf/current/msg14764.html 
> he says "I would say SHOULD be defined in YANG, for the "NETCONF"
> stream."

I strongly disagree.

> So for -v16 I updated the text in a way that doesn't
> preclude this.

See above.  If all events records are defined in YANG we can remove
this.


> >   Second, this seems to imply that *all* notifications defined in all
> >   YANG modules end up in the "NETCONF" stream, but I don't think this
> >   is the intention.  At least I hope so.  ("yp:push-update" is an
> >   example of a notification that will not be sent on the "NETCONF"
> >   stream.)
> 
> The NETCONF stream has a very broad/inclusive definition in RFC-5277.
> This definition is: "This stream contains all NETCONF XML event
> notifications supported by the NETCONF server".

This is very underspecified.  IIRC the original intention was
notifications related to NETCONF, hence "*NETCONF* XML event
notifications".

> To me, thinking about how that statement and applying it to YANG, this
> translates as YANG 1.1 notifications should go into the NETCONF
> stream, unless explicitly excluded.

I disagree.  This is not how 5277 works.  Until we have worked out a
useful way to assign notifications to streams, we shouldn't restrict
anything in this regard.

> This provides a very simple way
> determining which types of notifications should be placed on the
> stream.  (Otherwise, how is a developer of a YANG notification
> supposed to know whether it should go into the NETCONF stream or
> not?

They don't.  But note that it might be an operator decision for how to
assign notifications to streams.

> The alternative would seem to be having each notification opt-in to
> the NETCONF stream.)
> 
> So if you agree with this reasoning, how about:
> 
> The "NETCONF" event stream contains all instances of YANG 1.1
> notifications generated by a publisher, except where a type of
> notification has explicitly been excluded from the stream.

I think that this document should keep the definition of the NETCONF
stream as defined in RFC 5277.

Also, I think it should state that an event record is an instantiation
of a YANG "notification" statement (any version of YANG).

[...]

> > o  Section 2.4.2
> > 
> >   The text says:
> > 
> >     The transport selected by the subscriber
> >     to reach the publisher MUST be able to support multiple "establish-
> >     subscription" requests made within the same transport session.
> > 
> >   This seems to indicate that RESTCONF cannot be used with this
> >   specification, since RESTCONF doesn't have sessions.
> 
> The desired behavior is that a subscriber can request multiple
> subscriptions without each requiring a separate transport session
> establishment.  I think we are ok here with RESTCONF because
> underneath RESTCONF is HTTP, TLS, TCP.  And each of those lower layers
> can have sessions.
> 
> To clarify further, I have tweaked the text to:
> 
> "The transport selected by the subscriber to reach the publisher MUST
> be able to support multiple "establish-subscription" requests without
> each requiring a separate persistent transport connection."
> 
> >   This makes me wonder, why does this document specify this
> >   requirement?  I agree that this is the behaviour we want from
> >   NETCONF, but then I think that should be defined in the
> >   netconf-notif draft.
> 
> By saying something in SN about this requirement, I believe we can
> preclude an implementation which can only support one subscription per
> subscriber.  In any case, the specific "how" details will still need
> to be articulated in each transport draft.

I think you are mixing the specification and implementation.  If a
transport document says that an implemention MUST support multiple
subscriptions per session (in a transport-specific way), then an
implementation cannot just support a sinlge subscription.

My point is that this a transport-specific requirement, and it doesn't
belong in this document.

> > o  Section 2.4.2
> > 
> >   s/"start-time"/"replay-start-time"/  (3 occurrences)
> 
> Fixed
> 
> >   Also, the text says that the replay-start-time MUST be in the past.
> > 
> >   Suppose the server receives a "replay-start-time" of 12:00:00.001,
> >   and the replay buffer is empty.
> > 
> >   A) If the current time is 12:00:00.002, the request is accepted, and
> >      a "replay-completed" notif is sent.
> > 
> >   B) If the current time is 12:00:00.000, the request is rejected with
> >      an error.
> > 
> >   Why is this important?  Why not simply just send the
> >   replay-completed notif in both cases?
> 
> For (B), if an event occurs at 12:00:00.0005, should it be sent?
> Should it be sent before or after the "replay-start-time"?  How long
> should we allow the clocks to drift before we reject the subscription?
> As we don't support future/pending subscriptions, we don't have to
> worry about these questions with the current definition.

I would simply say that if the replay-start-time is later than any
buffered notif, then "replay-complete" is immediately sent.  It dont'
matter if it is one microsecond or one year later.

The problem otherwise is how to decide if something is "in the past".
In the past when the client sent the request?  When the server parsed
the XML?  When the instrumentation code runs?

But this is not a big issue.

[...]

> > o  Section 2.4.2.1
> > 
> >   There is no text that explains what happens in the "succesful" case;
> >   i.e., when there are buffered event records with time greater than
> >   or equal to the requested "replay-start-time".
> > 
> >   Maybe the first paragraph is supposed to cover this, but it just
> >   says "meets the timeframe criteria", which is a bit vague.
> 
> This was supposed to be covered by the first paragraph.  To help
> clarify, I have updated the text to:
> 
> "In other words, as the subscription initializes itself, it sends any
> event records within the target event stream which meet the filter
> criteria, which have an event time which is after the
> "replay-start-time", and which have an event time before the
> "stop-time" should this "stop-time" exist."

Ok.

[...]

> > o  Section 2.4.6
> > 
> >   The text says:
> > 
> >     1.  "establish-subscription-stream-error-info": This MUST be returned
> >         if an RPC error reason has not been placed elsewhere within the
> >         transport portion of a failed "establish-subscription" RPC
> >         response.
> > 
> >   (and similar for the other errors)
> > 
> >   What exactly does "if an RPC error reason has not been placed
> >   elsewhere" mean?
> 
> The intent is to match how NETCONF & RESTCONF historically handles
> errors.  As you know with NETCONF, each error identity will be
> inserted as the "error-app-tag".

It could also be in "error-tag" and/or "error-message".  My point is
that I don't think that you mean that if an implementation populate
say "error-message", then it should not popluate error-info with
establish-subscription-stream-error-info.

> And therefore placing the YANG data
> "establish-subscription-stream-error-info" would be redundant.  For
> NETCONF, "establish-subscription-stream-error-info" are only needed
> when hints to be provided.
> 
> Therefore assuming you are looking for an alternative way to make this
> point, I have tweaked the text to:
> "This MUST be returned if an RPC error reason has not been mapped down
> into transport specific error objects contained within a failed
> "establish-subscription" RPC response."

Howabout simply removing this sentence?

> > o  Section 2.5.1
> > 
> >   In the second diagram (for the valid state), the states are called
> >   "receiver connecting", "reciever timeout" etc.  I find this a bit
> >   confusing; it is not the *receiver* that is connecting, it is the
> >   publisher.  Ok, the legend then says:
> > 
> >      dashed boxes which include the word 'receiver' show the possible
> >      states for an individual receiver
> > 
> >  But since all dashed boxes include the word "receiver", and the
> >  diagram
> > specifically is the state machine *per receiver*, maybe we can simply
> > remove
> > the word "receiver" from the boxes?
> 
> There have been some discussions on this.  I agree it is certainly
> valid to remove "receiver" from these four boxes.  However I have
> found that getting readers to really internalize that the state
> machine is actually per-receiver rather than per-subscription, it
> seems to go faster/cleaner when receiver is shown in the four boxes.
> 
> Nevertheless, if you absolutely want "receiver" removed from the
> boxes, I can do that.

Maybe you can remove receiver from the individual boxes and then
change the word "valid" from the outer box to "per-receiver state
machine"?

Or maybe remove the outer box and add a Figure title: 'per-receiver
state machine for the subscription's "valid" state'

Also not a big deal, but I found it confusing.

[...]

> > o  Section 2.5.6
> > 
> >   I know that this feature was presented in Montreal.  I think it is
> >   problematic, and suggest it is removed.
> 
> This issue was discussed in Montreal.  The feature is of course is
> optional.  There are a large number of downsides for someone who
> chooses not to implement for certain use cases.  Please see the
> Montreal slides and the long threads with Kent for more details.

I did watch the Montreal session, but I didn't see any discussion on
this issue (just your presentation).   Can you send a pointer to the
email thread with Kent?

> If
> you want to pursue this further it is probably best to start a new
> thread dedicated to this topic alone.
> 
> >   For example, what happens if the transport session is lost and
> >   re-established; are all event records in the buffer replayed again?
> 
> No.  The publisher will not lose context of what was sent.

This is not clear from the text.  In fact, the text explicitly says:

   The leading event record sent will be the first event record
   subsequent to the latest of three different times: the
   "replay-log-creation-time", "replay-log-aged- time", or the most
   recent publisher boot time.

> >   What happens if a new receiver is added to a subscription that
> >   already has a receiver, will all events in the buffer be replayed to
> >   the new receiver?
> 
> Yes.  (To clarify, I added "to each configured receiver" at the end of
> the last sentence of the first paragraph.)
> 
> >   When a new subscription is configured with configured-replay set,
> >   the publisher will immediately try to connect to the publisher and
> >   send all buffered event from the last reboot.  Is this really
> >   useful?
> 
> Yes.  See the security cases reviewed on the earlier threads with
> Kent.  If someone doesn't care about log entries since boot, they
> simply don't have to implement this optional feature.
> 
> >   If the mechansim relies on the subscriber being able to also use
> >   dynamic subscriptions for replay of all event records, maybe we
> >   could simplify things by always relying on dynamic subscriptions for
> >   replay for configured subscriptions.  Specifically, remove
> >   "configured-replay" etc, but keep the "replay-previous-event-time"
> >   in the "subscription-started" notification.  Then a subscriber can
> >   simply check this value to see if anything is missing, and if it is,
> >   use a dynamic subscription to request replay from the latest time he
> >   knows about.
> 
> This is a valid approach, and can be used as an alternative to this
> optional feature.  However there are still many downsides to the
> alternative.  See the slides from Montreal, as well as the previous
> thread discussions for details.
> 
> > o  Section 2.6
> > 
> >   It just occurred to me that this text about using the <notification>
> >   message shouldn't be in this document, but in the netconf-notif (and
> >   restconf-notif) document.  It is highly transport and encoding
> >   specific.
> 
> I think you are referring to both the sentence: "This notification
> message MUST be encoded in a <notification> message as defined within
> [RFC5277], Section 4.  And per [RFC5277]'s "eventTime" object
> definition, the "eventTime" is populated with the event occurrence
> time."  As well as to the example compliant message in Figure 10.
> 
> While the text could be extracted and placed in the netconf-notif
> document, extracting it fully from SN would leave no baseline message
> format.

Things like this are inevitable when you have a transport-independent
specification.  Of course, you can require that all transports MUST
use the 5277 spec of a notification.  But then I think you rule out
transports like CoMI and possibly other future binary protocols.

> Perhaps others will have a problem if the SN solution does
> not have some pre-defined structure to hold event records?
> 
> I can move this if I hear no objection.  As you know, I am hoping that
> draft-ietf-netconf-notification-messages at some point displaces the
> <notification> message of [RFC5277], Section 4.  And this is easier to
> do with fewer references to RFC5277 within SN.

[...]

> > o  Section 4
> > 
> >           action reset {
> >             if-feature "configured";
> >             description
> >               "Allows the reset of this configured subscription receiver
> > 
> >   This action will be available also for dynamic subscription, if the
> >   server supports the "configured" feature.  Maybe add:
> > 
> >     when "../../../configured-subscription-state";
> 
> This would be nice to have.  But action doesn't support "when" as an
> allowed sub-statement.

Ha!  That's unfortunate... I wonder why we didn't allow this...



/martin

[Netconf] mbj's WGLC review of subscribed-notific… Martin Bjorklund
Re: [Netconf] mbj's WGLC review of subscribed-not… Eric Voit (evoit)
Re: [Netconf] mbj's WGLC review of subscribed-not… Martin Bjorklund
Re: [Netconf] mbj's WGLC review of subscribed-not… Eric Voit (evoit)
Re: [Netconf] mbj's WGLC review of subscribed-not… Andy Bierman
Re: [Netconf] mbj's WGLC review of subscribed-not… Eric Voit (evoit)
Re: [Netconf] mbj's WGLC review of subscribed-not… Martin Bjorklund
Re: [Netconf] mbj's WGLC review of subscribed-not… Andy Bierman
Re: [Netconf] mbj's WGLC review of subscribed-not… Martin Bjorklund
Re: [Netconf] mbj's WGLC review of subscribed-not… Martin Bjorklund
Re: [Netconf] mbj's WGLC review of subscribed-not… Eric Voit (evoit)
Re: [Netconf] mbj's WGLC review of subscribed-not… Eric Voit (evoit)
Re: [Netconf] mbj's WGLC review of subscribed-not… Martin Bjorklund
Re: [Netconf] mbj's WGLC review of subscribed-not… Reshad Rahman (rrahman)
Re: [Netconf] mbj's WGLC review of subscribed-not… Martin Bjorklund
Re: [Netconf] mbj's WGLC review of subscribed-not… Eric Voit (evoit)
Re: [Netconf] mbj's WGLC review of subscribed-not… Martin Bjorklund
Re: [Netconf] mbj's WGLC review of subscribed-not… Eric Voit (evoit)
Re: [Netconf] mbj's WGLC review of subscribed-not… Eric Voit (evoit)