[OPSAWG] Roman Danyliw's Discuss on draft-ietf-opsawg-ntf-11: (with DISCUSS and COMMENT)
Roman Danyliw via Datatracker <noreply@ietf.org> Wed, 01 December 2021 02:09 UTC
From: Roman Danyliw via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-opsawg-ntf@ietf.org, opsawg-chairs@ietf.org, opsawg@ietf.org, ludwig@clemm.org, ludwig@clemm.org
Reply-To: Roman Danyliw <rdd@cert.org>
Message-ID: <163832458350.9944.17802735924974626797@ietfa.amsl.com>
Date: Tue, 30 Nov 2021 18:09:44 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/opsawg/LB-7Wfm6YUOog4HeMeArtVCLfT4>
Subject: [OPSAWG] Roman Danyliw's Discuss on draft-ietf-opsawg-ntf-11: (with DISCUSS and COMMENT)
Roman Danyliw has entered the following ballot position for draft-ietf-opsawg-ntf-11: Discuss

When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)

Please refer to https://www.ietf.org/blog/handling-iesg-ballot-positions/ for more information about how to handle DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here: https://datatracker.ietf.org/doc/draft-ietf-opsawg-ntf/

----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Thank you for being responsive to the SECDIR review thread to improve the security considerations text. Specifically, https://mailarchive.ietf.org/arch/msg/secdir/GUvFWXP7n9IjXW8xlIdMS5ZE5u0/.

Even after these edits, there are a few straightforward ambiguities to clear up.

(a) Section 2. “When a network's endpoints do not represent individual users (e.g. in industrial, datacenter, and infrastructure contexts), network operations can often benefit from large-scale data collection without breaching user privacy.”

Is network telemetry architecture being restricted to such a limited applicability? To quote the original SECDIR thread, is this saying “The Network Telemetry Framework is not applicable to networks whose endpoints represent individual users, such as general-purpose access networks”? If so, I’d recommend being that explicit.

(b) Section 2.1. “To preserve user privacy, the user packet content should not be collected.”

This is a great principle, but extremely nuanced and potentially complicated to implement. Is this saying (using the words of this framework), “To preserve the privacy of end-users, no user packet content should be collected.
Specifically, the data objects generated, exported, and collected by the Network Telemetry Framework should not include any packet payload from traffic associated with end-user systems”?

(c) Section 2.5. Please use stronger and consistent language.

OLD
   Disclaimer: large-scale network data collection is a major threat to user privacy [RFC7258]. The network telemetry framework presented in this document should not be applied to collect and retain individual user data or any data that can identify end users without consent. Any data collection or retention using the framework must be tightly limited to protect user privacy.

NEW
   Large-scale network data collection is a major threat to user privacy and may be indistinguishable from pervasive monitoring [RFC7258]. The network telemetry framework presented in this document must not be applied to generating, exporting, collecting, analyzing or retaining individual user data or any data that can identify end users or characterize their behavior without consent.

The principles described in (a), (b) and (c) seem sufficiently important that they shouldn’t be scattered across the document. Please either make an applicability statement section early in the document or a dedicated privacy consideration section.

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

(Apologies if any of the below section numbers are wrong. I conducted most of my review on -10 and then -11 was published, which renumbered the document.)

Thanks to Alexey Melnikov for the SECDIR review.

I'm a bit confused by the framing of this document. It seems to me to be suggesting that “OAM” is tied to a series of static technologies and practices, and that a set of new practices called “network telemetry” is needed. I don’t disagree with the idea that network management practices need to evolve, and that the “networks of the future” will look different than today.
Relying on BCP 161 (RFC 6291), I took OAM to mean an evolving set of practices and technology. Using Section 3 of BCP 161, O + A + M seemed like a contextual set of operations that would be done now and still required in networks of the future. The document acknowledges that there is some ambiguity in “network telemetry”. I think it needs to equally acknowledge that the same is true of OAM, and that RFC7276 is not OAM.

In the aggregate, I don’t think the text realizes the clarity that it set out to provide by defining “key characteristics of network telemetry which set a clear distinction from the conventional network OAM and show that some conventional OAM technologies can be considered a subset of the network telemetry technologies.”

To be clear, I’m not raising an objection to many of the properties linked to network telemetry. Instead, I think the clarity of message is getting diluted because a very particular distinction is trying to be made (OAM vs. network telemetry) and it isn’t clear. See below for specifics.

** Section 1.
   … using a wide variety of techniques including machine learning, data analysis, and correlation.

ML, data analysis and correlation are unlike things. ML is a particular AI technique, data analysis is a generic description of an activity, and is correlation intended to be a statistical technique?

** Section 1
   Network telemetry extends beyond the historical network Operations, Administration, and Management (OAM) techniques and expects to support better flexibility, scalability, accuracy, coverage, and performance.

This seems hypothetical, depending on the definition of which technologies are considered in scope of network telemetry and OAM.

** Section 2.
   Today one can access advanced big data analytics capability through a plethora of commercial and open source platforms (e.g., Apache Hadoop), tools (e.g., Apache Spark), and techniques (e.g., machine learning).
   Thanks to the advance of computing and storage technologies, network big data analytics gives network operators an opportunity to gain network insights and move towards network autonomy.

In trying to contextualize this observation, where is this capability relative to Figure 1? In general, I would recommend using this reference architecture when assessing the ecosystem.

** Section 2.
   However, while the data processing capability is improved and applications are hungry for more data ...

What does it mean, and what applications are “hungry for more data”? Is a reference possible here?

** Section 2. Editorial. s/concerned in the context/relevant in this document/

** Section 2.1
   Less but higher quality data are often better than lots of low quality data.

This seems like a broad generalization that doesn’t consider the application and the cost of acquisition or processing.

** Section 2.2.
   The ultimate goal is to achieve the ideal security with no, or only minimal, human intervention.

What is “ideal” security?

** Section 2.2.
   While machine learning technologies can be used for root cause analysis, it up to the network to sense and provide the relevant diagnostic data which are either actively fed into, or passively retrieved by, machine learning applications.

This text is asymmetric with the other bullets since they don’t discuss specific techniques. Personally, it also seems odd to include this text as there are other ways to do root cause analysis beyond ML (to include other AI approaches).

** Section 2.3
   For a long time, network operators have relied upon SNMP [RFC3416], Command-Line Interface (CLI), or Syslog to monitor the network. Some other OAM techniques as described in [RFC7276] are also used to facilitate network troubleshooting.
   ...
   These challenges were addressed by newer standards and techniques (e.g., IPFIX/Netflow, PSAMP, IOAM, and YANG-Push) and more are emerging. These standards and techniques need to be recognized and accommodated in a new framework.
This section is an exemplar of the disconnect I noted in the definitions of OAM. The first paragraph presents a narrow view of currently used (albeit older) network monitoring technologies (SNMP, CLI, Syslog). However, in the closing paragraph, the text names more modern technologies I would also consider OAM, and these technologies could meet some of the challenges mentioned in this section. Furthermore, some of these “newer standards” are framed as things that need to be “recognized”. This is puzzling because my understanding was that technologies like IPFIX/Netflow have been very widely deployed for quite some time now. What’s the new framework needed?

** Section 2.4
   Network telemetry covers the conventional network OAM and has a wider scope.

Can the text be more specific about the way in which network telemetry is wider? I thought OAM was rather ambiguous.

** Section 2.4
   Hence, the network telemetry can directly trigger the automated network operation, while in contrast some conventional OAM tools are designed and used to help human operators to monitor and diagnose the networks and guide manual network operations.

I’m not sure if this is a fair generalization. Even “older technologies” like SNMP currently trigger automated responses based on the values they return.

** Section 2.4. Per “data fusion,” in which part of Figure 1 is this happening?

** Section 2.5.
   Network data analytics and machine-learning technologies are applied for network operation automation, relying on abundant and coherent data from networks.

-- What is the difference between a network data analytics system and ML technologies? Isn’t analytics a superset of ML?

-- What is coherent data?

** Section 2.5.
   In detail, such a framework would benefit application development for the following reasons:

It might be helpful to level set what an application is in this context. Is this the “network operations application” of Figure 1?
** Section 2.5
   All the use cases and applications are better to be supported uniformly and coherently under a single intelligent agent

-- Editorial. There is a missing word which leads to this sentence not parsing.

-- What’s the basis for asserting that a “single intelligent agent” is the best approach?

-- Maybe the issue is one of semantics: what is an “intelligent agent” in this context?

** Section 2.5.
   Network visibility presents multiple viewpoints
and
   Efficient data fusion is critical for applications to reduce the overall quantity of data and improve the accuracy of analysis.

Are these generalizations expected to be true across the broad use cases?

** Figure 2. For the management plane, the data model module has MIB and syslog listed, but the data encodings are listed as GPB, JSON and XML. These data models and encodings don’t line up (i.e., MIBs and syslog typically don’t rely on GPB, JSON or XML).

** Section 3.1. Where do network security applications such as WAFs, IDS/IPS/NGF, DLP, web-proxies, and pDNS fit into this taxonomy?

** Section 3.1.* These sections inconsistently describe properties/requirements for an architectural element and the challenges of (but no solutions or requirements for) a given element. As a result, I had trouble understanding what an implementer should understand about these components. It would have been clearer if the different modules had common and module-specific requirements.

** Section 3.1.1. Per the requirements of “Convenient Data Subscription”, “Structured Data”, etc., why wouldn’t those be desirable requirements for all four of the modules?

** Section 3.1.3. Providing “timely data” and “structured data” seem like restatements of Section 4.1.1’s “structure data” and “high speed transport”. Is this a common requirement?

** Section 3.1.3. Why wouldn’t it be desirable for all of the modules to support the incremental deployment noted here?

** Section 3.2.
   * Data Query, Analysis, and Storage: This component works at the application layer.
I need a bit of topological orientation. What would the application layer of, say, a “forwarding plane” or “external data” be? What are the other layers?

** Section 5. Recommend explicitly saying that this document doesn’t define specific technologies to shift the responsibility of specific considerations.

OLD
   Security considerations for networks that use telemetry methods may include:

NEW
   This document proposes a conceptual architecture for collecting, transporting, and analyzing a wide variety of data sources in support of network applications. The protocols, data formats, and configurations chosen to implement this framework will dictate the specific Security Considerations. These considerations may include:

** Section 5.

OLD
   * Telemetry data stores, storage encryption and methods of access;

NEW
   * Telemetry data stores, storage encryption, methods of access, and retention practices.
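
As an illustration only of the principle raised in DISCUSS item (b), one way an exporter might keep packet payload out of exported data objects is to build each record from an allowlist of fields rather than removing fields after the fact. This is a minimal sketch under stated assumptions: the `PacketObservation` type, its field names, and `to_export_record` are hypothetical and do not come from the draft or from any telemetry protocol.

```python
from dataclasses import dataclass, asdict


# Hypothetical observation captured on a device. Field names are
# illustrative assumptions, not taken from draft-ietf-opsawg-ntf.
@dataclass(frozen=True)
class PacketObservation:
    timestamp: float
    ingress_interface: str
    protocol: int
    length: int
    payload: bytes  # user packet content; must never be exported


# Allowlist of fields that may leave the device. Anything not named
# here (including the payload) is dropped by construction.
EXPORTABLE_FIELDS = ("timestamp", "ingress_interface", "protocol", "length")


def to_export_record(obs: PacketObservation) -> dict:
    """Return a record containing only allowlisted, non-payload fields."""
    full = asdict(obs)
    return {name: full[name] for name in EXPORTABLE_FIELDS}


obs = PacketObservation(1638324584.0, "eth0", 6, 1500, b"GET / HTTP/1.1")
record = to_export_record(obs)
```

An allowlist is used here so that a field added to the observation later is not exported unless someone deliberately lists it. Whether header fields such as addresses can identify end users is a separate question that this sketch does not address.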