Re: [Netconf] LC on subscribed-notifications-10

"Eric Voit (evoit)" <evoit@cisco.com> Mon, 25 June 2018 21:39 UTC

From: "Eric Voit (evoit)" <evoit@cisco.com>
To: Kent Watsen <kwatsen@juniper.net>
CC: "netconf@ietf.org" <netconf@ietf.org>, Alexander Clemm <ludwig@clemm.org>
Date: Mon, 25 Jun 2018 21:39:15 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/netconf/Pv5nLDN_r3kHycy0tvDLaiHLS4g>
Subject: Re: [Netconf] LC on subscribed-notifications-10

And <Eric12>

From: Kent Watsen, June 25, 2018 4:29 PM

Please look for <Kent11> below.


<snip>






<Kent4> this I agree with, but I really don't like the fact that the receiver MUST do a short-lived dynamic subscription to scoop up any possibly-missed logs, of which there may be none.  Perhaps we could add more values into the "subscription-started" notification message that would enable the receiver to make a local determination of whether such a dynamic subscription would be helpful?



<Eric5> I recommend against providing extra objects/reasons in the “subscription-started” at this time.  Publishers might not want to advertise a reboot, and they might not want to advertise why there was a loss in event continuity.   All that should matter to a receiver is that such a discontinuity existed, and that they have a way to try to fill the event gap should they care.  If more data about the discontinuity and its cause turns out to be required, we can always augment here with future objects.



<KENT5> first, I'm still not 100% sure if this is just a reboot problem, or any time the subscription is restarted/resumed.



<Eric6> Per above: retrieving missing event records is not a reboot specific problem.  But unintentionally replicating event records is reboot specific.  (Otherwise the configured replay-start-time would drive a repeat of everything on each and every reboot.)
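A minimal sketch of the reboot concern in <Eric6>, assuming the publisher guards against it by never replaying records older than its most recent boot. The function name and the clamping rule are illustrative assumptions, not text from the draft.

```python
from datetime import datetime, timezone


def effective_replay_start(configured_start, boot_time):
    """Return the time from which a configured subscription replays.

    Assumption for illustration: the publisher never replays records
    older than its most recent boot, so a fixed configured start time
    does not resend the same history after every reboot.
    """
    if configured_start is None:
        return None  # no replay requested; start with the next event
    return max(configured_start, boot_time)


configured = datetime(2018, 6, 1, tzinfo=timezone.utc)
first_boot = datetime(2018, 5, 20, tzinfo=timezone.utc)
second_boot = datetime(2018, 6, 20, tzinfo=timezone.utc)

# Booted before the configured time: the configured time wins.
print(effective_replay_start(configured, first_boot))
# Rebooted after the configured time: replay starts at boot instead.
print(effective_replay_start(configured, second_boot))
```

Without the `max()`, the second boot would replay everything since June 1 again, which is the unintended replication described above.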



<Kent6> okay, I think I got it this time.  Having a *configurable* replay-start-time is so confusing.  Is it really worth having?



<Eric7>   Yes it is worth having.

(a) In many environments, reboot is very infrequent.  Without configurable start time, an operator setting up a configured subscription would not have the ability to designate what to send.  It could only send the full log (at whatever size).

(b) on-publisher security or troubleshooting diagnostics might identify a breach or some event where streaming recent historical event records is a MUST.  As a result, it might want to stream a subset of event records off a box going back in time to potential events which might have been evidence or contributing factors.



<Kent7> Let me come at this another way.  Assume we drop all support for *configurable* replay-start-time.  As such, configured subscriptions always start with the next-generated event (no replay at all).   This covers most use-cases, right?   For those receivers that really wanted the older logs, can't they just do a dynamic subscription to collect them, same as we've been discussing above?



<Eric8> Some reasons this might not always be practical:

(a) IoT devices just might want to passively listen to event streams of Telemetry.  (I.e., this would force configured receivers to support dynamic subscriptions.)

(b) This forces complexity onto applications which only ever need to track what has happened since boot.  (E.g., per above, continuous Integrity Measurement Architecture (IMA) boot log streaming and evaluation.)

(c) Publisher access permissions for who can use the establish-subscription RPC might have to be expanded to include lots of configured receivers.  This might open up a vector to control plane DDoS.  Right now the access permissions would just have to allow the receiver read access to the event records.

(d) A publisher may choose to firewall classes of receivers (or locations of receivers) into a listen-only mode without the ability to establish subscriptions.



<Kent8> This response seems to address the "can't they just do a dynamic subscription" aspect of my comment, but doesn't really address the "why is it important" (I paraphrase) part.  My contention is that the concept of a *configurable* replay-start-time seems confusing and of low value.   I acknowledge that there is some value, but it seems like the value is limited to a one-time start-up optimization that can be alternatively addressed by a dynamic subscription to fetch earlier events (assuming it's allowed, per your points b-d).   Additionally, FWIW, I've never seen such a feature implemented before, and logging mechanisms have been around for decades, so this makes me think that this is something that probably isn't worth having.



<Eric9> As you point out, the "can't they just do a dynamic subscription" question is covered, and we shouldn’t always assume away (b)-(d), as they can matter in some scenarios.  So if we want to support the use case of streaming log entries made after boot but before the transport session is available, the only alternative I see is to have a configured replay-flag rather than configuring a start-time.  Are you ok with a flag instead?  Or do you have an alternative suggestion?



<Kent9> see below.



In terms of using this configured replay capability, Cisco’s Integrity Verification application

https://www.cisco.com/c/dam/en/us/td/docs/cloud-systems-management/application-policy-infrastructure-controller-enterprise-module/1-5-x/integrity_verification/user-guide/Cisco_Integrity_Verification_Application_APIC-EM_User_Guide_1_5_0_x.pdf

does a shell-access fetch of the full event log after boot, and then incrementally fetches only the deltas of the log (based on log line numbers).  This application is interested in configured subscriptions subsequent to boot for this purpose.  So such incremental streaming of portions of syslog after boot seems like a typical/common need to me.



<Kent9> it might be a typical/common desire, but it's still once in the lifetime of the configured subscription.  It seems like, if the device supports dynamic subscriptions, after receiving subscription-started, the client could a) pause the configured subscription, b) use a dynamic subscription to fetch the missing logs, and then c) resume the flow of logs from the configured subscription.



<Eric10> Your proposal still precludes (b)-(d) above.   In addition, for your step a), there is no RPC or action which allows the event records from a configured (or dynamic) subscription to be paused.  The solution also adds complexity to the client, which has to recognize that early events might be missing, issue an establish-subscription, and then tie the results of the independent subscriptions together.



<Kent10> pausing can be implemented by the receiver not reading any more from the TCP socket, or something else.



<Eric11> There is no mechanism for a receiver to pause a single subscription without pausing the other subscriptions on the TCP session (as subscriptions would typically share a common TCP session).



<Kent11> Different "receivers" of different configured subscriptions pointing to the same underlying netconf or restconf call-home connection?



<Eric12> Yes
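The shared-session point can be illustrated with a toy model, assuming the session is a single in-order queue of (subscription, record) pairs. The queue contents and the `read_until_paused` helper are hypothetical; real transports expose bytes, not tagged records.

```python
from collections import deque


def read_until_paused(queue, paused_sub):
    """Deliver records in order until the next one belongs to paused_sub.

    Toy model of "pausing by not reading": because the session is one
    ordered stream, the receiver cannot skip past a record, so it has
    to stop reading altogether.
    """
    delivered = []
    while queue and queue[0][0] != paused_sub:
        delivered.append(queue.popleft())
    return delivered


# Records for two configured subscriptions interleaved on one session.
queue = deque([("sub-B", "b1"), ("sub-A", "a1"), ("sub-B", "b2")])

delivered = read_until_paused(queue, paused_sub="sub-A")
print(delivered)    # only the record ahead of sub-A's first record
print(list(queue))  # b2 is stuck behind a1 although sub-B was never paused
```

In this model the receiver that wanted to pause only sub-A also stalls sub-B, which is the side effect described in <Eric11>.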





How is it any more complex for the client/receiver than the following in the SN draft already?



   When a receiver of a configured subscription gets a new
   "subscription-started" message for a known subscription where it is
   already consuming events, the receiver SHOULD retrieve any event
   records generated since the last event record was received.  This can
   be accomplish by establishing a separate dynamic replay subscription
   with the same filtering criteria with the publisher", assuming the
   publisher supports the "replay" feature.
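A minimal sketch of the receiver behaviour in the quoted draft text, assuming the receiver tracks the time of the last event record per subscription. `ReceiverState`, `on_subscription_started`, and the `"filter"` key are hypothetical names; the returned dictionary only stands in for the parameters of a dynamic replay subscription and is not a real RPC encoding.

```python
from dataclasses import dataclass, field


@dataclass
class ReceiverState:
    # Time of the last event record seen, keyed by subscription id.
    last_event_time: dict = field(default_factory=dict)


def on_subscription_started(state, sub_id, stream_filter, replay_supported):
    """Return parameters for a gap-filling dynamic subscription, or None."""
    last_seen = state.last_event_time.get(sub_id)
    if last_seen is None:
        return None  # not yet consuming events, so there is no gap to fill
    if not replay_supported:
        return None  # publisher lacks the "replay" feature
    # Same filtering criteria, replaying from the last record received.
    return {"filter": stream_filter, "replay-start-time": last_seen}


state = ReceiverState()
# First "subscription-started" for this subscription: nothing to fetch.
print(on_subscription_started(state, 7, "/interfaces", True))

# Later, after events have been consumed, the subscription starts again.
state.last_event_time[7] = "2018-06-25T21:00:00Z"
request = on_subscription_started(state, 7, "/interfaces", True)
print(request)
```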



<Eric11> It is the same general process.  But it turns the SHOULD into a MUST for applications which need to know the events since boot.  It also doesn’t deliver the events in order to the application, delaying application event analysis.
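The ordering cost can be sketched as follows, assuming each record carries an event time and that the receiver buffers live records until the replay subscription completes. The record layout and `merge_in_order` are illustrative assumptions.

```python
import heapq


def merge_in_order(replayed, live):
    """Yield records from both subscriptions by event time, without duplicates.

    Both inputs must already be sorted by event time. Records that appear
    in both (the overlap around the restart) are delivered once.
    """
    seen = set()
    for record in heapq.merge(replayed, live, key=lambda r: r["event_time"]):
        key = (record["event_time"], record["body"])
        if key not in seen:
            seen.add(key)
            yield record


# Records fetched by the dynamic replay subscription.
replayed = [
    {"event_time": 1, "body": "boot"},
    {"event_time": 2, "body": "link-up"},
    {"event_time": 3, "body": "login"},
]
# Records that arrived on the configured subscription in the meantime.
live = [
    {"event_time": 3, "body": "login"},
    {"event_time": 4, "body": "config-change"},
]

ordered = list(merge_in_order(replayed, live))
print([r["event_time"] for r in ordered])
```

The application cannot consume the live records until the replayed ones have arrived, which is the delay in event analysis noted above.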



<Kent11> here's another question that might be good to raise to the WG level.   Please be sure to capture my general concern and also the availability of this workaround.  Thanks.



<Eric12>  You are welcome to take the question to the WG level.  I have no desire to waste people’s time with such an obvious question:

- The current solution does not add the extra complexity described above for configured subscription replay.

- The current solution supports deployment scenarios (b)-(d) above.

- The current solution has far less implementation complexity and error reconciliation states for the client.



/Eric12





Eric



/Kent11