[I2nsf] Benjamin Kaduk's Discuss on draft-ietf-i2nsf-nsf-monitoring-data-model-15: (with DISCUSS and COMMENT)
Benjamin Kaduk via Datatracker <noreply@ietf.org> Wed, 16 February 2022 07:26 UTC
Return-Path: <noreply@ietf.org>
X-Original-To: i2nsf@ietf.org
Delivered-To: i2nsf@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id 35A9A3A0E4A; Tue, 15 Feb 2022 23:26:04 -0800 (PST)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-i2nsf-nsf-monitoring-data-model@ietf.org, i2nsf-chairs@ietf.org, i2nsf@ietf.org, dunbar.ll@gmail.com, dunbar.ll@gmail.com
X-Test-IDTracker: no
X-IETF-IDTracker: 7.45.0
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <164499636418.12383.5348778569888235022@ietfa.amsl.com>
Date: Tue, 15 Feb 2022 23:26:04 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/i2nsf/1UI9tA1kZ8uQaTwvveIpkKW5I5E>
Subject: [I2nsf] Benjamin Kaduk's Discuss on draft-ietf-i2nsf-nsf-monitoring-data-model-15: (with DISCUSS and COMMENT)
X-BeenThere: i2nsf@ietf.org
X-Mailman-Version: 2.1.29
List-Id: "*I2NSF: Interface to Network Security Functions mailing list*" <i2nsf.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/i2nsf>, <mailto:i2nsf-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/i2nsf/>
List-Post: <mailto:i2nsf@ietf.org>
List-Help: <mailto:i2nsf-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/i2nsf>, <mailto:i2nsf-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 16 Feb 2022 07:26:05 -0000
Benjamin Kaduk has entered the following ballot position for draft-ietf-i2nsf-nsf-monitoring-data-model-15: Discuss When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.) Please refer to https://www.ietf.org/blog/handling-iesg-ballot-positions/ for more information about how to handle DISCUSS and COMMENT positions. The document, along with other ballot positions, can be found here: https://datatracker.ietf.org/doc/draft-ietf-i2nsf-nsf-monitoring-data-model/ ---------------------------------------------------------------------- DISCUSS: ---------------------------------------------------------------------- (1) I'm not sure I understand the motivation for recommending (in §6.3.4) that the HTTP Cookie header field be included in a notification about a Web Attack Event. In general, the cookie field can contain very sensitive information, including credentials, and it is very risky to be sending the cookies around outside of their primary protocol context. Perhaps, if we are fully confident that the NSF has correctly identified an attack, it might be useful to send the cookies around, but I think there are still some scenarios (e.g., a compromised end-user browser) where the cookies in an attack request are still confidential information that should not be disclosed. Could we say more about why it is recommended to always include the cookies or weaken the recommendation? (2) I'm not sure I understand the relationship between the different pieces of information listed in the information model (§6.7.1) for firewall counters. My understanding is that typically a firewall will function as if it were a "bump in the wire" on a particular wire, processing all traffic into and out of a given part of the network (at least on a particular interface), and that the internal network might contain multiple machines that reside on multiple network prefixes. So when we have the information model that looks to be the counters reported by a firewall security function, I don't know how to interpret fields like "Source IP address" and "Destination port", which are typically tied to a particular flow or machine, whereas the firewall covers many different flows and potentially many machines as well. Is the intent to report on firewall behavior on a per-flow granularity, akin to what IPFIX does? That seems likely to produce a very high volume of log information and it's not clear how useful it would be to the NSF data collector. The YANG data model uses a list with the policy name as the list key, indicating that perhaps the intent is to show what a given firewall policy has done, but (a) that should be made clear from the description in the information model, and (b) it still doesn't help relate the different 4-tuple components to each other. (3) The information model gives a list of DPI action types that's prefaced with "e.g.", indicating that it is giving examples only and is not comprehensive, but the dpi-type YANG typedef is modeled as an enumeration that does not allow extensibility for future values not listed here. This seems like an internal inconsistency that needs to be rectified, whether by claiming to be a comprehensive list or by switching the YANG to use extensible identities to represent the DPI action types. (4) Please confirm that we have achieved the intended level of consistency between the information model and the YANG data model, in light of the remarks in my COMMENT section around Sections 6.3.2 through 6.3.5, 6.4.3, and 6.5.1. ---------------------------------------------------------------------- COMMENT: ---------------------------------------------------------------------- I share the surprise of the directorate reviewer that we need to report on disk, CPU, and RAM, and interface statistics in the NSF-specific model (rather than using a more generic OAM functionality for those reports), but I guess it is harmless to duplicate the functionality. The way that the timestamp grouping is integrated into the data nodes seems quite unusual to me. Not least because it's repeated in each element of a list (rather than a single timestamp that applies to all entries, which "the time of the message" description would imply), but also because we typically think of YANG as providing a data model for state of the device being monitored, whereas this data node is more a function of when data is retrieved than a function of the state of the node. Section 2 * Subscription: An agreement initialized by the NSF data collector to receive monitoring information from an NSF. The method to subscribe follows the method explained in [RFC5277]. It looks like RFC 5277 is specific to NETCONF; do we want to reference RFC 8650 for RESTCONF notifications as well? Section 3 * The I2NSF User that is the security administrator can configure a policy that is triggered on a specific event occurring in the NSF or the network [RFC8329] [I-D.ietf-i2nsf-consumer-facing-interface-dm]. If an NSF data collector detects the specified event, it configures additional security functions as defined by policies. I wonder if it might be more appropriate to indicate that it is the security controller (which is often, but not required to be, the data collector) that would configure the additional security functions. Section 4.1 I2NSF Record: A record is defined as an item of information that is kept to be looked at and used in the future. Typically, records are information generated by a system entity (e.g., NSF) that is based on operational and informational data (i.e., various changes in system characteristics), and are generated at particular instants to be kept without any changes afterward. A set of records has an ordering in time based on when they are generated. (side note) I think you should probably keep this text as it is now, but in a certain sense, you could get a set of records (most notably when from different sources) and even though they all have timestamps attached, it may not always be possible to recover a strict ordering in time for the events, owing to clock skew, clock precision, and the possibility of parallel execution. But trying to describe that subtlety would detract from the point being made here and would (in my opinion) take more words than it's worth. Section 4.2 A specific task of an I2NSF User is to process I2NSF Policy Rules. If the User is a human, would it make more sense to say that they "provide" the policy rules, rather than to "process" them? The current "process" formulation seems to imply that the User is involved in applying or taking action based on the policy rules, and I'm not sure if that's the intent. Section 6 * Emission type: The cause type for the message to be emitted. It can be "on-change", "periodic", or "on-request". An "on-change" Does the emission type matter at all when the acquisition method is "query"? (That is, does it only apply when the aquisition method is "subscription"?) Section 6.1.3 * usage: Specifies the size of disk space used. The YANG instantiation of this represents 'usage' as percent, so we probably don't want to say "size", which implies a non-percentage amount. Section 6.2.4 It's not clear that the src-mac/dst-mac should always be included in the traffic flow event. They make sense in some situations, but (for example) if the NSF is in the middle of a provider network, the MAC addresses will just be from its neighboring routers in the provider and do not convey information about the customer traffic that is triggering the notification/report. * arrival-rate: Arrival rate of packets of the traffic flow in packet per second calculated from the beginning of the flow. * arrival-throughput: Arrival rate of packets of the traffic flow in bytes per second calculated from the beginning of the flow. Thank you for making a clear definition of how these rates are computed. That said, I am not sure that "calculated from the beginning of the flow" will be the most useful value, as a more instantaneous rate might also be of interest (e.g., measured over the past minute or ten minutes). Section 6.3.1 * attack-src-ip: The IP address of the source of the DDoS attack. Is this really going to always be a single IP address, even for a *distributed* denial of service attack? I see that the YANG models this as a leaf-list, which suggests that this should be described using the plural sense of the terms. However, it is still unclear that attempting to enumerate every IP addres seen participating in a DDoS attack is a useful thing to do. In other contexts (e.g., the DOTS WG), we limit ourselves to just picking an arbitrary sampling of the "top talkers" and conserving server resources by not trying to list out the long tail of attack IPs. * dst-port: The port number that the attack traffic aims at. Likewise, might an attack target a range of ports? Section 6.3.2, 6.3.3, 6.3.4 We do not list an "action" information element here, but there is a data element for the log-action in the YANG data model in §8. Section 6.3.3 We do not list information elements here for attack-rate or attack-throughput, but there are such data elements in the corresponding YAND model in §8. (I think that is likely to be an error in the YANG rather than an omission from here, though.) * protocol: The employed transport layer protocol. e.g., TCP or UDP. * app: The employed application layer protocol. e.g., HTTP or FTP. It might be worth considering whether QUIC would be listed as the transport or application layer protocol (or both). Section 6.3.4, 6.3.5 We do not list an information element here for "severity", but such an element existst in the corresponding YANG data model in §8. (I think that it is likely to be an error in the YANG rather than an omission from here, though.) Section 6.3.4 * req-user-agent: The HTTP User-Agent header field of the request. Perhaps no action is needed at this time, but there is some effort underway at the W3C to deprecate the user-agent header field in favor of something like RFC 8942 Client Hints (see, e.g., https://blog.chromium.org/2021/05/update-on-user-agent-string-reduction.html). The user-agent value may not remain a valuable piece of information for much longer. Section 6.4.3 We list "cause" as a potential additional information element here, but there does not seem to be a way to represent that information in the YANG data model in §8. Section 6.5.1 I'm slightly curious what the src-user information would be used for when processing the DPI logs. It doesn't seem inherently problematic to include this information, but I am not sure when it would be useful. We list "action" as an information model element here, but I don't see a way to represent that in the YANG data model in §8. It may have been misplaced in the ietf-nsf-detection-ddos container, which has such a leaf-list that is not reflected in the corresponding information model. Section 6.6.1 Would it make sense to have a "current rate/throughput" (computed over the past, e.g., 1 or 10 minutes as previously) to supplement the average and peak rates that are already listed? Section 6.7.1 As I remarked on §6.5.1, some clarity on why the name of the I2NSF User that generated the policy is worth reporting, would be useful. (This sentiment applies throughout the document, but I will stop repeating it.) Likewise, my remark from §6.6.1 about "current rate" seems to apply here as well. Section 7 It's interesting that the subsection structure under §6 doesn't quite match up to the YANG tree structure for the i2nsf-event notification's sub-event-type choice. (To be clear, it may or may not be problematic that they don't match; I just don't know the motivation for doing it this way as I read through from the top.) Section 8 I would consider using a more complicated grouping structure so that the 'message' leaf (and probably severity and timestamp as well, if not vendor-name and nsf-name) currently in common-monitoring-data does not need to appear under the lists in the tree of data nodes. The 'message' seems tailored for notifications only. I don't understand why there is no configuration leaf for the virus-detection and VoIP/VoCN notifications (e.g., the "enabled" and "dampening-period" that we have for the other feature-controlled functionality). Some of the nodes that are using uint32 might merit bumping up to a wider type. E.g., the ddos attack-rate is currently modeled as uint32, but we've apparently seen an 809 Mpps attack a couple years ago, which is close enough to values that are unrepresentable in uint32 to cause some worry about future growth in attack size. typedef log-action { type enumeration { I would suggest adding more description to most of the enum values. E.g., what scope does a "block-ip" or "block-service" block apply to? identity dampening-type { base characteristics; description "The type of message dampening to stop the rapid transmission of messages. The dampening types are on-repetition and no-dampening"; Making a claim like this that there are only two possible types, seems more aligned with a YANG enumeration than an identity-based scheme. (But I do not see a strong motivation to change it, at this time.) identity protocol { description "An identity used to enable type choices in leaves and leaflists with respect to protocol metadata. This is used to identify the type of protocol that goes through the NSF."; Why are we defining our own set of identities to identify protocols? Is there no suitable prior art we could import? grouping characteristics { description "A set of characteristics of a notification."; Note that this grouping is used in the system-interface, nsf-firewall, and nsf-policy-hits lists, so the description of being "of a notification" is not accurate at present. (The grouping's members like "dampening-type" don't seem to make much sense in a data node tree.) leaf acquisition-method { [...] leaf emission-type { Also, it's slightly surprising that the "acquisition-method" and "emission-type" are included in the notification payloads, when to some extent they must be known already from the context in which the notification is receieved. case i2nsf-system-detection-event { container i2nsf-system-detection-event { description "This notification is sent when a security-sensitive authentication action fails."; This description does not seem to match the name of the event/case here. list changes { key policy-name; description "Describes the modification that was made to the configuration. The minimum information that must be provided is the name of the policy that has been altered (added, modified, or removed). This list can be extended with the detailed information about the specific changes made to the configuration based on the implementation."; Should we say this is only applicable to the configuration-change events? case i2nsf-traffic-flows { container i2nsf-traffic-flows { [...] leaf protocol { type identityref { base protocol; } description "The protocol type for nsf-detection-intrusion notification"; but this isn't the nsf-detection-intrusion notification. case i2nsf-nsf-log-dpi { if-feature "i2nsf-nsf-log-dpi"; container i2nsf-nsf-log-dpi { (Per above,) it seems like this container is missing a "uses log-action". leaf end-time { type yang:date-and-time; description "The time stamp indicating when the attack ended. If the attack is still undergoing when sending out the notification, this field can be empty."; Empty or omitted? leaf-list attack-src-port { type inet:port-number; description "The transport layer source ports of the DDoS attack"; } leaf-list attack-dst-port { type inet:port-number; description "The transport layer destination ports of the DDoS attack"; We might say something about how not all ports will have been seen on all the corresponding src/dest IP addresses. leaf file-type { type string; description "The type of file virus code is found in (if applicable)."; reference "IANA Website: Media Types"; I don't think media type is a common reference for the notion of "file type", as might be reflected by the file's suffix. Section 10 While this text probably suffices to convey the needed requirements, I note that the REGEXT working group has a long-established formulation for expressing what seems to be essentially the same requirement. E.g., from RFC 9095: % This document uses the prefix "b-dn" for the namespace % "urn:ietf:params:xml:ns:epp:b-dn" throughout. Implementations cannot % assume that any particular prefix is used and must employ a % namespace-aware XML parser and serializer to interpret and output the % XML documents. It might be worth considering reusing this established formulation instead of creating a new one. Section 12 I suggest cautioning consumers of the input and output of the system access log to take care when processing those contents, in light of common shell vulnerabilities relating to quoting and wildcard expansion. Additionally, many of the data nodes in this YANG module such as containers "i2nsf-system-user-activity-log", "i2nsf-system-detection- event", and "i2nsf-nsf-detection-voip-vocn" are privacy sensitive. They may describe specific or aggregate user activity including associating user names with specific IP addresses; or users with specific network usage. Let's also add another couple sentences here: "They also may describe the specific commands that were run by users and the resulting output. Any sensitive information in that command input or output will be visible to the NSF data collector and potentially other entities, and care must be taken to protect the confidentiality of such data from unauthorized parties." NITS Section 1 This document defines an information model of an NSF monitoring interface that provides visibility into an NSF for the NSF data collector. Note that an NSF data collector is defined as an entity to collect NSF monitoring data from an NSF, such as Security Controller. It specifies the information and illustrates the methods that enable an NSF to provide the information required in order to be monitored in a scalable and efficient way via the NSF Monitoring Interface. [...] I think in the last quoted sentence the "it specifies" is referring again to "this document" rather than the NSF data collector mentioned in the preceding sentence. We might state "this document" again specifically, or put the "note" sentence inside parentheses (but not both), to clarify what the pronoun is referring to. Section 4 Every system entity creates information about some context with defined I2NSF monitoring data, and so every entity can be an I2NSF component. [...] We might want to add another few words to clarify that when we say "every (system) entity" we are referring to every entity within a certain scope. (I.e., what is that scope?) Section 4.1 something that happens which may be of interest. Examples for an event are a fault, a change in status, crossing a threshold, or an s/for/of/ I2NSF Record: A record is defined as an item of information that is kept to be looked at and used in the future. Typically, records are information generated by a system entity (e.g., NSF) that is based on operational and informational data (i.e., various changes s/that is/that are/ entity or NSF. The examples of records include as user activities, device performance, and network status. They are s/include as/include/ Section 4.2 In I2NSF monitoring, a notification is used to deliver either an event and a record via the I2NSF Monitoring Interface. The difference between the event and record is the timing by which the notifications are emitted. An event is emitted as soon as it happens I think s/event and a record/event or a record/, to match with the preceding "either". Section 6.1.5 * severity: The severity level of the message. There are total levels, i.e., critical, high, middle, and low. s/total/four/ Section 6.3.1 * attack-type: The type of DoS or DDoS Attack, i.e., SYN flood, ACK flood, SYN-ACK flood, FIN/RST flood, TCP Connection flood, UDP flood, ICMP flood, HTTPS flood, HTTP flood, DNS query flood, DNS reply flood, SIP flood, SSL flood, and NTP amplification flood. Please consider s/SSL/TLS/. Section 6.4.1 Access logs record administrators' login, logout, and operations on a device. By analyzing them, security vulnerabilities can be identified. [...] I'd suggest s/security vulnerabilities can be/some security vulnerabilities can be/. Section 8 enum block-service{ missing space before open curly brace. typedef dpi-type{ ditto Please look for more instances; there seem to be too many to list individually. identity application-protocol { base protocol; description "Base identity for Application protocol. Note that popular application protocols (e.g., HTTP, HTTPS, FTP, POP3, and IMAP) are handled in this YANG module, rather than all the existing application protocols."; I suggest saying "a subset of" rather than "popular", to avoid sparking arguments about what protocols are "popular". leaf src-ip { type inet:ip-address-no-zone; description "The source IPv4 (or IPv6) address of the flow"; } leaf dst-ip { type inet:ip-address-no-zone; description "The destination IPv4 (or IPv6) address of the flow"; I think it would be okay to omit the parentheses and just say "IPv4 or IPv6".
- [I2nsf] Benjamin Kaduk's Discuss on draft-ietf-i2… Benjamin Kaduk via Datatracker