Re: [Netconf] LC on subscribed-notifications-10

"Eric Voit (evoit)" <evoit@cisco.com> Thu, 26 April 2018 16:49 UTC

To: Kent Watsen <kwatsen@juniper.net>, Alexander Clemm <ludwig@clemm.org>, "netconf@ietf.org" <netconf@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/netconf/POANCeU4wWG5ZL2ek_awvSPxeik>

Hi Kent,

All changes below are also in the pending v13 draft at:
https://github.com/netconf-wg/rfc5277bis/blob/master/draft-ietf-netconf-subscribed-notifications-13.txt

From: Kent Watsen, April 25, 2018 6:07 PM


[further trimming]





 <Eric4>  Ahhh.  I got it now.  The two reasons are:

·       Insufficient resources (e.g., CPU)

·       Unsupportable volume (i.e., a bandwidth constraint)



I adjusted the diagram to:
                      .........
                      : start :
                      :.......:
                          |
                 establish-subscription
                          |
                          |   .------modify-subscription-------.
                          v   v                                |
                    .-----------.                        .-----------.
         .--------. | receiver  |-insufficient CPU, b/w->| receiver  |
     modify-       '|  ACTIVE   |                        | SUSPENDED |
     subscription   |           |<---CPU, b/w sufficient-|           |
         ---------->'-----------'                        '-----------'
                          |                                    |
               delete/kill-subscription                   delete/kill-
                          |                               subscription
                          v                                    |
                      .........                                |
                      :  end  :<-------------------------------'
                      :.......:



With the supporting bullet items under the diagram:



·       A publisher may choose to suspend a subscription when there is insufficient CPU or bandwidth available to service the subscription. This is notified to a subscriber with a "subscription-suspended" state change notification.



·       A suspended subscription may be modified by the subscriber (for example in an attempt to use fewer resources).  Successful modification returns the subscription to an active state.



·       Even without a "modify-subscription" request, a publisher may return a subscription to the active state should the resource constraints clear.  This is announced to the subscriber via the "subscription-resumed" subscription state change notification.
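As an illustration only, the diagram and the three bullets above can be sketched as a small state machine. The class and method names below are hypothetical and do not come from the draft; only the state names and the two notification names are taken from the text.

```python
from enum import Enum


class ReceiverState(Enum):
    ACTIVE = "active"
    SUSPENDED = "suspended"
    ENDED = "ended"


class Subscription:
    """Toy model of the receiver state diagram (all names are illustrative)."""

    def __init__(self):
        # establish-subscription leads straight to ACTIVE.
        self.state = ReceiverState.ACTIVE
        self.notifications = []

    def resources_insufficient(self):
        # Publisher lacks CPU or bandwidth: ACTIVE -> SUSPENDED.
        if self.state is ReceiverState.ACTIVE:
            self.state = ReceiverState.SUSPENDED
            self.notifications.append("subscription-suspended")

    def resources_sufficient(self):
        # CPU and bandwidth are sufficient again: SUSPENDED -> ACTIVE.
        if self.state is ReceiverState.SUSPENDED:
            self.state = ReceiverState.ACTIVE
            self.notifications.append("subscription-resumed")

    def modify(self):
        # A successful modify-subscription keeps an ACTIVE subscription
        # ACTIVE and returns a SUSPENDED one to ACTIVE.
        if self.state is ReceiverState.ENDED:
            raise ValueError("subscription has ended")
        self.state = ReceiverState.ACTIVE

    def delete_or_kill(self):
        # delete/kill-subscription ends the subscription from either state.
        self.state = ReceiverState.ENDED


sub = Subscription()
sub.resources_insufficient()   # publisher suspends the subscription
sub.modify()                   # subscriber asks for less; back to ACTIVE
```

The sketch deliberately omits failed modifications and any notification sent on modify, since the text above does not describe them.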



<KENT4> yes, this is it.  Just a couple nits, can you widen the diagram a couple characters to give more dashes (-) around the "insufficient" line?



<Eric5> Done.



Also, in the 3rd bullet, maybe replace "resource constraints clear" with "resource constraints become sufficient again" so that it binds with the words in the diagram?



<Eric5> Done.  I also inserted changes matching your “insufficient CPU, b/w” request into the Configured receivers Section 2.5.1.

<snip/>

<Eric3>  Added the descriptive paragraph requested in the middle of the three paragraphs below...

It is possible to place a start time on a configured subscription.  This enables streaming of logged information immediately after restart.

Replay of event records created since restart can be quite useful.  This allows event records generated before transport connectivity was supportable by a publisher to be passed to a receiver.  In addition, event records logged before restart are not sent.  This avoids the potential for accidental event record duplication.  Such duplication might otherwise be likely as a configured subscription’s identifier before and after the reboot is the same, and there may not be evidence to a receiver that a restart has occurred.  By establishing restart as the earliest potential time for event records to be included in notification messages, a well-understood timeframe for replay is defined.

Therefore, when configured replay subscription receivers first become ACTIVE, buffered event records (if any) will be sent immediately after the "subscription-started" notification.  And the leading event record sent will be the first event record subsequent to the latest of four different times: the "replay-log-creation-time", "replay-log-aged-time", "replay-start-time", or the most recent publisher boot time.
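To make the "latest of four different times" rule concrete, here is a minimal sketch. The function names are hypothetical, treating an unset time as None is an assumption, and "subsequent to" is read here as strictly later than the computed start.

```python
from datetime import datetime, timezone


def effective_replay_start(replay_log_creation_time, replay_log_aged_time,
                           replay_start_time, boot_time):
    """Return the latest of the four candidate times, skipping unset (None) ones."""
    candidates = [replay_log_creation_time, replay_log_aged_time,
                  replay_start_time, boot_time]
    return max(t for t in candidates if t is not None)


def records_to_replay(buffered, start):
    """Select buffered (event_time, payload) records strictly after `start`,
    in event-time order."""
    ordered = sorted(buffered, key=lambda record: record[0])
    return [record for record in ordered if record[0] > start]


def at(hour):
    # Helper for the example only: a UTC timestamp on the day of this mail.
    return datetime(2018, 4, 26, hour, tzinfo=timezone.utc)


# The publisher booted at 05:00, later than the other three times,
# so replay starts with the first record after 05:00.
start = effective_replay_start(at(1), None, at(2), at(5))
leading = records_to_replay([(at(6), "link-up"), (at(4), "link-down")], start)
```

In this example the record logged at 04:00 predates the boot time and is therefore not replayed, which is the behavior discussed in the paragraphs above.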

<Kent3> Hmmm, I'm having a negative reaction to the "event records logged before restart are not sent" bit.  I know what you are trying to do, but I worry that this behavior might drop important logs, perhaps to the advantage of an adversary.  Note that some devices implement an <edit-config> with a restart.  Maybe the solution should require publishers to maintain a per configured-subscription awareness of (roughly) which log was sent last?   - and notify the receiver when a restart has occurred, or when the replaying of events occurs, so that they can be aware that there might be some duplicates?

<Eric4>  The current solution guarantees no duplicates, and also informs the receiver of each new “start-time”.  This allows the receiver to attempt to reconstruct any gaps from the last event previously pushed, should they choose to attempt such reconstruction.   As a dynamic subscription has no such boundary constraints on replay and boot time, all a subsequent dynamic subscription needs to do is to request the events between the last event previously received from that configured subscription and the new replay-start-time.

<Kent4> So, the receiver is informed of each new "start-time" via the "subscription-started" control message, and then MUST do a short-lived dynamic subscription to scoop-up any possibly-missed logs, for which there may be none?   If we choose to keep this behavior, the draft should say this more clearly, perhaps in the Security Considerations section…

<Eric5> Added the following text as the last paragraph in the Implementation Considerations Section...

For configured replay subscriptions, the receiver is protected from duplicated events being pushed after a publisher is rebooted.  However, it is possible that a receiver might want to acquire event records which failed to be delivered just prior to the reboot.  Delivering these event records can be accomplished by leveraging the “eventTime” from the last event record received prior to the receipt of a “subscription-started” state change notification.  With this “eventTime” and the “replay-start-time” from the “subscription-started” notification, an independent dynamic subscription can be established which retrieves any event records which may have been generated but not sent to the receiver.
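A receiver-side sketch of that calculation might look like the following. The function name is hypothetical, and how the resulting window maps onto the parameters of an actual dynamic subscription request is left to the draft.

```python
from datetime import datetime, timezone


def gap_window(last_event_time, replay_start_time):
    """Return the (start, stop) window a gap-filling dynamic subscription
    would cover, or None when the two times leave no room for a gap."""
    if last_event_time >= replay_start_time:
        return None
    return (last_event_time, replay_start_time)


# eventTime of the last record received before the new "subscription-started".
last_seen = datetime(2018, 4, 26, 16, 40, tzinfo=timezone.utc)
# "replay-start-time" carried in that "subscription-started" notification.
new_start = datetime(2018, 4, 26, 16, 45, tzinfo=timezone.utc)

window = gap_window(last_seen, new_start)
```

A non-None window only says that records may have been missed. As noted later in the thread, the short-lived dynamic subscription may well return nothing.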



<Eric4>  Note that this solution acts identically for loss of events when the platform *doesn’t* reboot, and events are just lost due to some overflow.  See the Section 2.5.2 text:
   “However if events are lost (rather than just delayed) due to replay buffer overflow, a new "subscription-started" must be sent.  This new "subscription-started" indicates an event record discontinuity.”
I.e., this way the receiver doesn’t have to do forensics to attempt to determine the cause of a transient loss of events on a publisher.
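One way to picture the overflow case is the toy publisher-side buffer below. It is only a sketch: the class, its capacity policy, and the choice to drop the oldest unsent record are assumptions, not anything the draft specifies.

```python
from collections import deque


class ReplayBuffer:
    """Bounded buffer of unsent event records (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = deque()
        self.lost = False

    def append(self, record):
        if len(self.pending) == self.capacity:
            # Overflow: the oldest unsent record is lost, not just delayed.
            self.pending.popleft()
            self.lost = True
        self.pending.append(record)

    def drain(self):
        """Return what would be sent next, in order. A lost record forces a
        new "subscription-started" ahead of the surviving records."""
        out = []
        if self.lost:
            out.append("subscription-started")
            self.lost = False
        out.extend(self.pending)
        self.pending.clear()
        return out


buf = ReplayBuffer(capacity=2)
for record in ("event-1", "event-2", "event-3"):
    buf.append(record)
sent = buf.drain()
```

Whether the discontinuity came from an overflow like this or from a reboot, the receiver sees the same signal.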

<Kent4> okay, but note that this section refers to Section 2.4.2.1 (not 2.5.2).   I understand what you mean, but I think more text is needed to convey it to readers…

<Eric5> Added the sentence:

The most recent publisher boot time ensures that duplicate event records are not replayed from a previous time the publisher was booted.

In any case, tracking the last event sent to each receiver will be a pretty hard requirement to meet during a publisher crash.  It is simpler to just let the receiver attempt a reconstruction should they need to.



<Kent4> this I agree with, but I really don't like the fact that the receiver MUST do a short-lived dynamic subscription to scoop-up any possibly-missed logs, for which there may be none.  Perhaps we could add more values into the "subscription-started" notification message that would enable the receiver to make a local determination if such a dynamic subscription would be helpful?



<Eric5> I recommend against providing extra objects/reasons in the “subscription-started” at this time.  Publishers might not want to advertise a reboot, and they might not want to advertise why there was loss in event continuity.   All that should matter to a receiver is that such a discontinuity existed, and that they have a way to try to fill the event gap should they care.  If more data on the cause of the discontinuity turns out to be required, we can always augment here with future objects.







<Kent3> Going back to my original comment, the new paragraph helps, it certainly caught my attention regarding reboots wiping out the replay log buffer.



<Eric4>  There is no requirement that the reboot wipes out the buffer (the solution is agnostic to that).   The only requirement is that a configured subscription replay start no earlier than the last reboot time.



<Kent4> I'm glad to hear that the logs before restart aren't lost, but rather that there is no attempt to send them by default.   This wasn't all that obvious to me from what you wrote before.



<Eric4>  Tweaked a Section 2.4.2.1 sentence to say:



This document puts no restrictions on the size or form of the log, where it resides within the publisher, or when event record entries in the log are purged.



I suggest adding text that clarifies this, and details the need for a short-lived dynamic-subscription.



<Eric4> The tweak above, with the suggested text in the Implementation Considerations section above, hopefully covers this.





<snip/>



>   Re: the 6th paragraph, I'm surprised that requirements for transport-
>   bindings weren't discussed before in their own section.  It seems like
>   a new thing here, that a receiver's transport might not be secure.
>   I'm okay with and support this, btw, as it's sometimes better to
>   offload devices thru the use of a local collector node, for which
>   encryption may not be needed...



Agree with your comments.

<KENT> but where's the change?  Shouldn't this have been discussed previously in the draft somewhere?



<Eric2> The vast majority of transport binding discussions are addressed in the transport document.  So I see this as guidance to a documenter of a transport document.  Perhaps that is unnecessary for this document, and the paragraph should be removed.  I would be fine with that.



<Kent2> wait, I don't think you can offload transport-requirements to the transport-binding documents.   I think that this document needs to define the requirements and the transport-binding documents then show how they adhere to them.   Does this make sense?



<Eric3> With the varied transports of NETCONF, HTTP/RESTCONF, UDP, and CoAP already in drafts, my belief is that only a high-level subset of transport requirements spanning the universe of potential transports can be abstracted in this document.  The secure transport requirement is one such example, and that is a recommendation.  The Security Considerations section is a good place for that one.  Beyond the security recommendation there aren’t too many transport-independent possibilities.   I did just add one new transport requirement to the very end of the “Event Streams” section though (which perhaps wasn’t explicit enough elsewhere).  This requirement is:



“Event records MUST NOT be delivered to a receiver in a different order than they were placed onto an event stream.”



What other transport-independent transport requirements might there be which are not already documented?



Stepping back, I see the transport draft plus this draft providing the aggregate set of requirements for a full solution.  And I had thought it would be up to the draft authors plus WGs to validate that the sum of the documents is sufficient.





<Kent3> unsure.  For example, RFC 6241 has Section 2 (Transport Protocol Requirements) that the SSH and TLS binding drafts refer to.  It seems that this draft should have a similar section that highlights what MUST or MUST NOT be supported.  It could even include some additional text indicating that bindings MAY introduce additional requirements.



<Eric4> I re-read RFC 6241 Section 2 a couple times.  There are comparisons that can be made from that document to a subset of requirements currently in this document’s security section.  But I don’t see anything missing on the MUST and MUST NOT side of things.   FYI: the specific requirements I am thinking of are:



   For both configured and dynamic subscriptions the publisher MUST
   authenticate and authorize a receiver via some transport level
   mechanism before sending any updates.

   A secure transport is highly recommended and the publisher MUST
   ensure that the receiver has sufficient authorization to perform the
   function they are requesting against the specific subset of content
   involved.

   With configured subscriptions, one or more publishers could be used
   to overwhelm a receiver.  Notification messages SHOULD NOT be sent to
   any receiver which does not support this specification.  Receivers
   that do not want notification messages need only terminate or refuse
   any transport sessions from the publisher.



That is about it for common stuff.  Considering the wide variety of potential transports, and the ubiquity of the need for stream transports, I am simply not aware of any more common requirements.  If you need me to, I can extract these three requirements and put them under a separate transport requirements section.   But this seems excessive, especially as we have transport-specific documents with eyes on them from the WG.  But if you really do want this, I will place these into a new, separate section; and I will add your text: “bindings MAY introduce additional requirements.”



<Kent4> yes, this is what I'm thinking is needed, even if just for these 3 requirements + a statement that each transport MAY impose additional limitations (not so much a "requirement" as a "fact of life using said transport" I think)



<Eric5>  Moved to a new section just before Security Considerations.   Added the last sentence:



Additional transport requirements will be dictated by the choice of transport used with a subscription. For an example of such requirements with NETCONF transport, see [I-D.draft-ietf-netconf-netconf-event-notifications].



/Eric5



Kent // contributor