[OPSAWG] Roman Danyliw's No Objection on draft-ietf-opsawg-ntf-12: (with COMMENT)

Roman Danyliw via Datatracker <noreply@ietf.org> Thu, 02 December 2021 14:45 UTC

MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
From: Roman Danyliw via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-opsawg-ntf@ietf.org, opsawg-chairs@ietf.org, opsawg@ietf.org, ludwig@clemm.org, ludwig@clemm.org
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Roman Danyliw <rdd@cert.org>
Message-ID: <163845632253.16885.11307038580196101361@ietfa.amsl.com>
Date: Thu, 02 Dec 2021 06:45:22 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/opsawg/Y2EqLA1I-82DclhkJ43gzftrg4s>
Subject: [OPSAWG] Roman Danyliw's No Objection on draft-ietf-opsawg-ntf-12: (with COMMENT)

Roman Danyliw has entered the following ballot position for
draft-ietf-opsawg-ntf-12: No Objection

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/blog/handling-iesg-ballot-positions/
for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-opsawg-ntf/



----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Thanks to Alexey Melnikov for the SECDIR review.

Thanks for addressing my DISCUSS point and some of my COMMENTs.

(Ballot note on -12: I wanted to quickly update my ballot before the telechat
to reflect that -12 resolved my discuss.  I need to more carefully review the
responses to the comments.  Where resolution could be quickly assessed from the
diff, I have already updated my ballot accordingly.  -12 may in fact address
more of the comments still noted below.)

(Ballot on -11):
I'm a bit confused by the framing of this document.  It seems to me to be
suggesting that “OAM” is tied to a series of static technologies and
practices, and a set of new practices called “network telemetry” are needed.  I
don’t disagree with the idea that network management practices need to evolve,
and that the “networks of the future” will look different than today.  Relying
on BCP 161 (RFC 6291), I took OAM to mean an evolving set of practices and
technology.  Using Section 3 of BCP 161, O + A + M seemed like a contextual set
of operations that would be done now and still required in networks of the
future.  The document acknowledges that there is some ambiguity in “network
telemetry”.  I think it needs to equally acknowledge that the same is true of
OAM, and that RFC7276 is not OAM.  In the aggregate, I don’t think the text
realizes the clarity that it set out to provide by defining “key
characteristics of network telemetry which set a clear distinction from the
conventional network OAM and show that some conventional OAM technologies can
be considered a subset of the network telemetry technologies.”.  To be clear,
I’m not raising an objection to many of the properties linked to network
telemetry.  Instead, I think the clarity of message is getting diluted because
the text is trying to make a very particular distinction (OAM vs. network
telemetry) and it isn’t clear.  See below for specifics.

** Section 1
   Network telemetry extends beyond the historical network Operations,
   Administration, and Management (OAM) techniques and expects to
   support better flexibility, scalability, accuracy, coverage, and
   performance.

This seems hypothetical, depending on which technologies are considered in
scope of network telemetry and OAM.

** Section 2.

Today one can access advanced big data analytics capability through a
   plethora of commercial and open source platforms (e.g., Apache
   Hadoop), tools (e.g., Apache Spark), and techniques (e.g., machine
   learning).  Thanks to the advance of computing and storage
   technologies, network big data analytics gives network operators an
   opportunity to gain network insights and move towards network
   autonomy.
In trying to contextualize this observation, where is this capability relative
to Figure 1?  In general, I would recommend using this reference architecture
when assessing the ecosystem.

** Section 2.

However, while the data processing capability is improved and
   applications are hungry for more data ...

What does this mean, and which applications are “hungry for more data”?  Is a
reference possible here?

** Section 2.3
   For a long time, network operators have relied upon SNMP [RFC3416],
   Command-Line Interface (CLI), or Syslog to monitor the network.  Some
   other OAM techniques as described in [RFC7276] are also used to
   facilitate network troubleshooting.
...
   These challenges were addressed by newer standards and techniques
   (e.g., IPFIX/Netflow, PSAMP, IOAM, and YANG-Push) and more are
   emerging.  These standards and techniques need to be recognized and
   accommodated in a new framework.

This section is an exemplar of the disconnect I noted in the definitions of
OAM.  The first paragraph presents a narrow view of currently used (albeit
older) network monitoring technologies (SNMP, CLI, Syslog).  However, in the
closing paragraph, the text names more modern technologies I would also
consider OAM, and these technologies could meet some of the challenges
mentioned in this section.  Furthermore, some of these “newer standards” are
framed as things that need to be “recognized”.  This is puzzling because my
understanding was that technologies like IPFIX/Netflow have been very widely
deployed for quite some time now.  What new framework is needed?

** Section 2.4
Network telemetry covers the conventional network OAM and
   has a wider scope.

Can the text be more specific about the way in which network telemetry is
wider?  I thought OAM was rather ambiguous.

** Section 2.4
Hence, the network telemetry can directly
   trigger the automated network operation, while in contrast some
   conventional OAM tools are designed and used to help human operators
   to monitor and diagnose the networks and guide manual network
   operations.

I’m not sure if this is a fair generalization.  Even “older technologies” like
SNMP currently trigger automated responses based on the values they return.

** Section 2.4.  Per “data fusion,” in which part of Figure 1 is this
happening?

** Section 2.5.

Network data analytics and machine-learning technologies are applied
   for network operation automation, relying on abundant and coherent
   data from networks.

What is coherent data?

** Section 2.5
All the use cases and
      applications are better to be supported uniformly and coherently
      under a single intelligent agent

-- Editorial.  There is a missing word which leads to this sentence not parsing.

-- What’s the basis for asserting that a “single intelligent agent” is the 
best approach?

-- Maybe the issue is of semantics, what is an “intelligent agent” in this
context?

** Section 2.5.

Network visibility presents multiple viewpoints

and

Efficient data fusion is critical for applications to reduce the
      overall quantity of data and improve the accuracy of analysis.

Are these generalizations expected to be true across the broad use cases?

** Figure 2.  For the management plane, the data model module has MIB and
syslog listed, but the data encodings are GPB, JSON and XML.  These data models
and encodings don’t line up (i.e., MIBs and syslog typically don’t rely on GPB,
JSON or XML).

** Section 3.1.  Where do network security applications such as WAFs,
IDS/IPS/NGF, DLP, web-proxies, and pDNS fit into this taxonomy?

** Section 3.1.* These sections inconsistently describe properties/requirements
for an architectural element and the challenges of (but no solutions or
requirements for) a given element.  As a result, I had trouble understanding
what an implementer should take away about these components.  It would have
been clearer if the different modules had common and module-specific
requirements.

** Section 3.1.1.  Per the requirements of “Convenient Data Subscription”,
“Structured Data”, etc., why wouldn’t those be desirable requirements for all
four of the modules?

** Section 3.1.3.  Providing “timely data” and “structured data” seem like
restatements of Section 4.1.1’s “structure data” and “high speed transport”.
Is this a common requirement?

** Section 3.1.3.  Why wouldn’t it be desirable for all of the modules to
support the incremental deployment noted here?
