[ippm] Benjamin Kaduk's Discuss on draft-ietf-ippm-ioam-data-15: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Mon, 11 October 2021 22:29 UTC

From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-ippm-ioam-data@ietf.org, ippm-chairs@ietf.org, ippm@ietf.org, Al Morton <acm@research.att.com>, acm@research.att.com
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <163399139337.5936.5073810021440741362@ietfa.amsl.com>
Date: Mon, 11 Oct 2021 15:29:53 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/ippm/B_F7xuPPsX6peDp2cMfKaKe-bGc>

Benjamin Kaduk has entered the following ballot position for
draft-ietf-ippm-ioam-data-15: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/blog/handling-iesg-ballot-positions/
for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-ippm-ioam-data/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Thanks for the many updates and email discussion about the relationship
between limited (network) domains, IOAM domains, IOAM namespaces, and
the like -- I think I do now have a pretty clear picture of how they're
expected to interact!  However, I think there may still be a couple
places in the document that need to get updated in order to match that
vision.  One point here, and some (more minor) instances in the COMMENT
section...

Section 5.2 has:

   The role of an IOAM-encapsulating, IOAM-transit or IOAM-decapsulating
   node is always performed within a specific IOAM-Namespace.  This
   [...]
   described above, that is added in a future revision.  An IOAM
   decapsulating node situated at the edge of an IOAM domain MUST remove
   all IOAM-Option-Types and associated encapsulation headers for all
   IOAM-Namespaces from the packet.

The "MUST remove [...] for all IOAM-Namespaces" at the end seems to
conflict with the notion of the role of IOAM-decapsulating node being
performed within a specific IOAM-Namespace.  Indeed, later on in Section
5.3 we see that namespace identifiers "allow devices which are IOAM
capable to determine: [...] o  whether IOAM-Option-Type(s) have to be
removed from the packet, e.g., at a domain edge or domain boundary."  If
a decapsulating node always had to remove IOAM options from all
namespaces, then the namespace identifier is irrelevant to whether
option type(s) are removed from the packet.
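
To make the tension concrete, here is a minimal Python sketch of the two
readings.  The IoamOption class, the option-type labels, and the
namespace values are hypothetical and not taken from the draft; the
only point is that under the "remove everything" reading the
Namespace-ID never influences the outcome.

```python
from dataclasses import dataclass


# Hypothetical, simplified model of an IOAM option carried in a packet.
@dataclass(frozen=True)
class IoamOption:
    namespace_id: int  # Namespace-ID carried in the option
    option_type: str   # illustrative label, e.g. "trace" or "e2e"


def decap_all(options):
    """Section 5.2 reading: strip every IOAM option, whatever its namespace."""
    return []


def decap_by_namespace(options, local_namespaces):
    """Section 5.3 reading: strip only options in namespaces this node serves."""
    return [o for o in options if o.namespace_id not in local_namespaces]


packet = [IoamOption(0x0001, "trace"), IoamOption(0x0002, "e2e")]
remaining_all = decap_all(packet)
remaining_scoped = decap_by_namespace(packet, local_namespaces={0x0001})
```

In the first function the namespace is never consulted; in the second,
the option in namespace 0x0002 survives decapsulation.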


[the following paragraph is retained unchanged from my ballot position
on the -12, since the topic seems to still be open.]

As foreshadowed in
https://mailarchive.ietf.org/arch/msg/last-call/Ak2NAIKQ7p4Rij9jfv123xeTXQY/
I think we need to have a discussion about the expectations and
provisions for cryptographic (e.g., integrity) protection of IOAM data.
From my perspective, IOAM is a new (class of) protocols that is seeking
publication on the IETF stream as Proposed Standard.  While we do make
exceptions for modifications to protocols that were developed before we
realized how important integrated security mechanisms are, it's
generally the case that new protocols are held to the current IETF
expectations for what security properties are provided; by default, a
protocol is expected to provide secure operation in the Internet threat
model of RFC 3552.  This draft currently only
provides a brief note in the security considerations that there exists
an individual draft (draft-brockners-ippm-ioam-data-integrity) that
might provide ways to protect the integrity of IOAM data fields.
Shouldn't the security mechanisms be getting developed side-by-side with
the protocol mechanisms, to ensure that they are properly integrated and
fit for purpose?  (This does not necessarily have to be in the same
document and could be part of a cluster of related documents, but I
don't think that an informative reference to a non-WG draft really
qualifies.)

[new discussion on this topic as of the -15]
The discussion on this topic was over a rather protracted timescale, for
which I share much of the blame.  I think that the latest message is
https://mailarchive.ietf.org/arch/msg/ippm/POycw2NpSl5cIruqSimTa_4WrwI/
where I make a proposal to have some text about how actual use of these
data fields in a protocol or encapsulation needs to provide some
(possibly optional) mechanism for cryptographic integrity protection,
which could be draft-brockners-ippm-ioam-data-integrity but could also
be native to the encapsulation format.  I think that such a construction
would allow this document to proceed to RFC without waiting for the
other one to be complete.
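
As a rough illustration of the kind of (possibly optional) integrity
mechanism meant here, the sketch below computes and checks an
HMAC-SHA256 tag over serialized IOAM-Data-Fields using Python's
standard library.  The key, the byte layout, and the function names are
assumptions made up for this example.  This is not the construction
from draft-brockners-ippm-ioam-data-integrity, and a real encapsulation
would also have to handle key management and fields that legitimately
change in transit.

```python
import hashlib
import hmac


def protect(key: bytes, ioam_fields: bytes) -> bytes:
    """Return an integrity tag over the serialized data fields."""
    return hmac.new(key, ioam_fields, hashlib.sha256).digest()


def verify(key: bytes, ioam_fields: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(protect(key, ioam_fields), tag)


key = b"example-shared-key"         # hypothetical pre-shared key
fields = bytes.fromhex("0001000a")  # hypothetical serialized fields
tag = protect(key, fields)
ok = verify(key, fields, tag)
tampered_ok = verify(key, fields + b"\x00", tag)
```

Any modification of the protected bytes makes verification fail, which
is the property the encapsulation would need to offer.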


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

All-new (okay, almost all) comments as of the -15.

Mentioning here for lack of a better place, I happened to follow the
reference to draft-brockners-opsawg-ioam-deployment, which seems to still
be using the old definition of transit node and should get updated to
match this document's definition.

Section 5.2

   IOAM is expected to be deployed in a specific domain.  The part of
   the network which employs IOAM is referred to as the "IOAM-Domain".

In light of the (new) text up in §5, we might consider rewording this
part as well, mostly to avoid mentioning "network" that risks confusion
about "network domain" (per previous discussion).

   An "IOAM transit node" read and/or write and/or process one or more
   of the IOAM-Data-Fields.  If both the Pre-allocated and the
   Incremental Trace Option-Types are present in the packet, each IOAM
   transit node based on configuration and available implementation of
   IOAM populates IOAM trace data in either Pre-allocated or Incremental
   Trace Option-Type but not both.  [...]

Since we redefined transit nodes to include nodes that only read, in
addition to nodes that modify, it doesn't seem 100% accurate to say that
"each transit node populates [one or the other but not both]" -- it
seems valid for a transit node to populate zero of the trace option
types.
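
A small sketch of the point, with entirely hypothetical names (the
TraceMode enum and populate function are not from the draft): under the
new definition, a read-only transit node is a legitimate third case
alongside the two writing modes.

```python
from enum import Enum


class TraceMode(Enum):
    READ_ONLY = "read-only"        # reads IOAM data, writes nothing
    PREALLOCATED = "pre-allocated"
    INCREMENTAL = "incremental"


def populate(mode, present):
    """Return the set of trace option types this node writes into.

    `present` is the set of trace option types found in the packet.
    A node writes into at most one type, and possibly none.
    """
    if mode is TraceMode.READ_ONLY:
        return set()
    return {mode.value} & present


both = {"pre-allocated", "incremental"}
written = {m: populate(m, both) for m in TraceMode}
```

The quoted "either ... but not both" wording covers only the last two
modes; the read-only node populates neither.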

Section 5.3

   An IOAM-Namespace can be associated to a subset or all of the the
   IOAM-Option-Types and their corresponding IOAM-Data-Fields.  IOAM-

This sentence seems confusing to me.  It talks about namespaces as if
they contain options, but if we go on to examine the actual data
structures each option has a field to indicate the associated namespace.
We even go on to say that the IOAM-Namespace field "MUST be included in
all future IOAM-Option-Types" (side note: might be worth calling that
out in the guidance in the IANA considerations, and also to specifically
say that it's the first field of the option).  So it's really not clear
to me what this sentence is adding -- would it be safe to just remove
it?

   o  IOAM-Namespaces can be used by an operator to distinguish
      different operational domains.  Devices at domain edges can filter
      on Namespace-IDs to provide for proper IOAM-Domain isolation.

I suggest tweaking the wording to clarify that the "different
operational domains" in the first sentence are precisely the
"IOAM-Domain"s of the second sentence.

      *  Assigning different IOAM Namespace-IDs to different sets of
         [...]

There are two instances of a bullet point that talks about assembling a
full trace from partial traces, but the text has substantial
differences.  I suspect that one is just an editing artifact and should
be removed, but am less sure which one has the intended text (I would
guess the latter, for what it's worth).

Section 5.4

   "IOAM tracing data" is expected to be collected at every IOAM transit
   node that a packet traverses to ensure visibility into the entire
   path a packet takes within an IOAM-Domain.  [...]

Since we redefined transit nodes to include nodes that only read, this
"is expected to be collected" doesn't seem entirely representative
anymore.

   [...].  If not all nodes within a domain support IOAM functionality
   as defined in this document, IOAM tracing information (i.e., node
   data, see below) will only be collected on those nodes which support
   IOAM functionality as defined in this document.  [...]

Similarly, I might s/will only/can only/ here for the same reason.


Section 5.4.2

   Some IOAM-Data-Fields defined below, such as interface identifiers or
   IOAM-Namespace specific data, are defined in both "short format" as
   well as "wide format".  "Short format" refers to an IOAM-Data-Field
   which comprises 4 octets.  "Wide format" refers to an IOAM-Data-Field
   which comprises 8 octets.  [...]

We have definition entries for short/wide format in Section 3, so these
clarification sentences may not be needed.

Section 5.5

draft-ietf-sfc-proof-of-transit might be undergoing significant changes
prompted by the results of its secdir review thread.  I don't object to
leaving this discussion in place but it may be prudent to think about
paring down what we say here.  In particular, there seems to be some
(admittedly, small) chance that the SFC doc will not have exactly two POT
types.

Section 6.3

It seems that in going from the -12 to -15 we lost the paragraph about
leap seconds here.  My understanding is that POSIX timestamps *are*
affected by leap seconds, and so it is correct to include such a
statement.  My ballot comment on the -12 was conditioned on the use of
TAI for the epoch, and thus my comment on the -12 is irrelevant.
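
For a concrete illustration, the Python sketch below uses the leap
second inserted at the end of 2016-12-31 (UTC).  POSIX time counts
every day as 86400 seconds, so the two timestamps differ by 1 even
though 2 SI seconds elapsed between those instants.  Only the standard
calendar module is used; the variable names are mine.

```python
import calendar

# 2016-12-31T23:59:59Z and 2017-01-01T00:00:00Z as POSIX timestamps.
before = calendar.timegm((2016, 12, 31, 23, 59, 59))
after = calendar.timegm((2017, 1, 1, 0, 0, 0))

posix_delta = after - before  # difference as seen by a POSIX clock
si_delta = posix_delta + 1    # real elapsed time: 23:59:60 was inserted
```

A delay computed from two POSIX timestamps that straddle a leap second
is therefore off by one second.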

Section 10

We should probably say that the opaque snapshot, namespace specific
data, etc., will have security considerations corresponding to their
defined data contents that should be described where those formats are
defined. [ed. this remains unchanged from my comment on the -12; I'm not
sure if the intent to add some text just got overlooked, but don't
intend to rehash any discussions that were already made.]

Since we clarified the definition of transit nodes to include read-only
transit nodes, we might want to say something about how transit nodes
that only implement support for one or the other trace option types (as
is clearly permitted) will have an incomplete picture of the trace in
cases where both trace option types are used for the same packet.  In
many cases that is innocuous, of course, but it does not seem guaranteed
to always be so.

   allowing attackers to collect information about network paths,
   performance, queue states, buffer occupancy and other information.

One possible application of such reconnaissance is to gauge the
effectiveness of an ongoing attack (e.g., if buffers and queues are
overflowing).  I don't know whether it's particularly useful to mention
that scenario here or not, though, and the lack of response the previous
time I made the comment suggests that it's not actually useful to
mention it.

                  Indeed, in order to limit the scope of threats
   mentioned above to within the current network domain the network
   operator is expected to enforce policies that prevent IOAM traffic
   from leaking outside of the IOAM domain, and prevent IOAM data from
   outside the domain to be processed and used within the domain.

On the -12, I said "it would be great if we could provide a bit more
detail on the scope of consequences if the operator fails to do so."
Some of the follow-up discussion suggested that
draft-brockners-opsawg-ioam-deployment would be a better home, which I
don't object to; I'm retaining this comment just in case there was
actual desire to put such content in this document.  No specific reply
is expected or required, either way.

NITS

Section 5.3

      controllers.  For example, the node identifier field (node_id, see
      below) does not need to be unique in a deployment (e.g., if an
      operator wishes to use different node identifiers for different
      IOAM layers, even within the same device; or node identifiers
      might not be unique for other organizational reasons, such as
      after a merger of two formerly separated organizations), the
      Namespace-ID can be used as a context identifier, such that the
      combination of node_id and Namespace-ID will always be unique.

This looks like a comma splice (maybe put a sentence break after the
long parenthetical?).
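
As an aside on what the quoted sentence is trying to say: the
uniqueness property amounts to keying node records on the pair
(Namespace-ID, node_id) rather than on node_id alone.  A minimal
sketch, in which the registry, the register function, and all values
are hypothetical:

```python
# Hypothetical node registry: (namespace_id, node_id) -> description.
nodes = {}


def register(namespace_id, node_id, description):
    """Register a node; the pair must be unique, node_id alone need not be."""
    key = (namespace_id, node_id)
    if key in nodes:
        raise ValueError(f"duplicate node {key}")
    nodes[key] = description


# Same node_id in two namespaces, e.g. after a merger: no clash.
register(0x0001, 42, "router A, formerly organization 1")
register(0x0002, 42, "router B, formerly organization 2")
```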

      *  Assigning different IOAM Namespace-IDs to different sets of
         nodes or network partitions and using the Namespace-ID as a
         selector at the IOAM encapsulating node, a full trace for a
         flow could be collected and constructed via partial traces in
         different packets of the same flow.  Example: An operator could

I think s/Assigning/By assigning/ (note the comment that indicates
there are "two copies" of this text).

Section 5.4

   o  Time of day when the packet was processed by the node as well as
      the transit delay.  Different definitions of processing time are
      feasible and expected, though it is important that all devices of
      an in-situ OAM domain follow the same definition.

I think we've been standardizing on the "IOAM domain" spelling.