[ippm] Benjamin Kaduk's Discuss on draft-ietf-ippm-multipoint-alt-mark-07: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Thu, 12 March 2020 02:02 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/ippm/9qhDf14Gy8ylUofXjaU424ytb_M>

Benjamin Kaduk has entered the following ballot position for
draft-ietf-ippm-multipoint-alt-mark-07: Discuss

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-ippm-multipoint-alt-mark/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Let's discuss whether [I-D.mizrahi-ippm-compact-alternate-marking] needs
to be a normative reference, to describe how the Hash selection method
works for multipoint.  This document alone does not even mention what is
used as input to the hash (though I think I have a good guess based on
the context).  Even if the intent is that RFC 5474 suffices (avoiding
the "dependency on individual document" issue), that is also listed only
as an informative reference.

Also, if the grouping procedure (section 6.1) does in fact require a
distinguished (but arbitrary?) choice of initial endpoint as I suspect
it does, that should be clarified.  (See COMMENT.)


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I echo the other reviewers' comments about clarity as to "multipoint"
vs. "multicast" and defining various terms.

I agree with Barry that there's a lot of editorial work to be done; I've
tried to note many instances in these comments, even when the proper
resolution is unclear to me.

Section 1

   This approach fits very well with the Intelligent Network and
   Software Defined Network (SDN) paradigm where the SDN Orchestrator
   and the SDN Controllers are the brains of the network and can manage
   flow control to the switches and routers and, in the same way, can
   calibrate the performance measurements depending on the necessity.

nit: "necessity" doesn't sound right

   An SDN Controller Application can orchestrate how deep the network
   performance monitoring is setup by applying the Multipoint Alternate
   Marking as described in this document.

nit: "how deep the network" doesn't sound right

Section 3

   In this way the flows to be monitored are selected into the
   monitoring points using packet selection rules, that can also change
   the pattern of the monitored network.

nit: I'm not sure what "this way" refers to.  It also seems like this
may be two independent thoughts needlessly joined together with a comma.

Section 4.1

   [I-D.ietf-ippm-route]).  In general there are different options: the
   monitoring network can be obtained by considering all the possible
   paths for the traffic or also by checking the traffic sometimes and
   update the graph consequently.

"also by checking the traffic sometimes and update the graph
consequently" feels pretty informal and under-specified.

Section 5

   Since all the packets of the considered flow leaving the network have
   previously entered the network, the number of packets counted by all
   the input nodes is always greater or equal than the number of packets
   counted by all the output nodes.

[I think I could imagine some exotic cases where this does not hold, but
none of them really seem topical for this work.]

   It is possible to define the Network Packet Loss of one monitored
   flow for a single period: <<In a packet network, the number of lost
   packets is the number of packets counted by the input nodes minus the
   number of packets counted by the output nodes>>.  This is true for
   every packet flow in each marking period.

As another reviewer implies, this discounts the difference between the
number of packets "still in the network" at the start and end of the
measurement period.
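
To make the quoted definition concrete, here is a minimal sketch of the
per-period calculation.  The function name and counter values are
hypothetical, not taken from the draft.  It only computes the stated
difference and assumes the counters are read after every packet of the
period has left the network; it does not model packets still in flight.

```python
def period_loss(input_counts, output_counts):
    """Packets lost in one marking period: the sum of the input-node
    counters minus the sum of the output-node counters."""
    return sum(input_counts) - sum(output_counts)


# Hypothetical counters for one period: two input nodes, two output nodes.
loss = period_loss([100, 50], [90, 57])
print(loss)  # 3
```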

      The flow definition is generalized here, indeed, as described
      before, a multipoint packet flow is considered and the
      identification fields of the 5-tuple can be selected without any
      constraints.

I don't think I understand what this means.  If identification is fully
general, how are we still limited to a 5-tuple?

Section 6

   In a completely monitored network (a network where every network
   interface is monitored), each network device corresponds to a Cluster
   and each physical link corresponds to two Clusters (one for each
   direction).

("what about unidirectional links?")

Section 6.1

I'm not following how the first step of this algorithm produces the
listed groups.  Are the links considered to be bidirectional?  Did we
have to pick some arbitrary starting node (R1), and decree that once a
link is in one group it cannot be in some other group?
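
To frame the question, here is a sketch of one possible reading of the
procedure.  It assumes (my assumption, which may not match the draft's
intent) that links are directed, that step one groups links by starting
node, and that step two merges groups sharing at least one ending node.
The node names and topology are hypothetical.  Under this reading no
distinguished starting node is needed; if the draft intends something
else, the sketch does not apply.

```python
from collections import defaultdict


def cluster_partition(links):
    """links: iterable of directed (start, end) pairs.

    Step 1: group links by starting node.
    Step 2: merge groups that share at least one ending node.
    Returns a list of clusters, each a sorted list of links.
    """
    by_start = defaultdict(list)
    for start, end in links:
        by_start[start].append((start, end))

    clusters = []  # each entry: (set of ending nodes, list of links)
    for group in by_start.values():
        ends = {end for _, end in group}
        merged_links = list(group)
        remaining = []
        for c_ends, c_links in clusters:
            if c_ends & ends:
                # Shares an ending node: absorb the existing cluster.
                ends |= c_ends
                merged_links = c_links + merged_links
            else:
                remaining.append((c_ends, c_links))
        remaining.append((ends, merged_links))
        clusters = remaining
    return [sorted(c_links) for _, c_links in clusters]


# Hypothetical topology: R1 feeds R2 and R3, which both feed R4.
links = [("R1", "R2"), ("R1", "R3"), ("R2", "R4"), ("R3", "R4")]
print(cluster_partition(links))
```

With these assumptions each directed link lands in exactly one cluster,
and the R2 and R3 groups merge because both end at R4.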

   In this way the calculation of packet loss can be made on Cluster
   basis.  Note that CIR(Committed Information Rate) and EIR(Excess
   Information Rate) can also be deduced on Cluster basis.

Do we use CIR and/or EIR anywhere else?  (Are they references to some
other body of work?)

   In this way in a very large network there is no need to configure
   detailed filter criteria to inspect the traffic.  You can check
   multipoint network and only in case of problems you can go deep with
   a step-by-step cluster analysis, but only for the cluster or
   combination of clusters where the problem happens.

nits: there is at least one missing article here ("a multipoint network"),
and I think something else here does not read quite right.

   In summary, once defined a flow, the algorithm to build the Cluster
   Partition considers all the possible links and nodes crossed by the

What does "once defined a flow" mean?

   information.  So, if the flow do not enter or traverse all the nodes,
   the counters has a non-zero value for the involved nodes, while a

nit: singular/plural mismatch "the flow"/"do not", "counters"/"has a
[...] value"

   The algorithm described above is an Iterative clustering algorithm,
   but it is also possible to apply a Recursive clustering algorithm by
   using the node-node adjacency matrix representation.

Is there a reference for this?  (Do I assume it's the
[IEEE-ACM-ToN-MPNPM] at the end of the next paragraph?)

Section 7

   Therefore, when we expand to multipoint-to-multipoint flows, we have
   to consider that all source nodes mark the traffic and this adds more
   complexity.

I'm not sure what chain of reasoning "therefore" is attempting to
indicate.

   But we should now consider an additional contribution.  Since all
   source nodes mark the traffic, the source measurement intervals can
   be of different lengths and with different offsets and this mismatch
   m can be added to d, as shown in figure.

Please define m and d.

   So the misalignment between the marking source routers gives an
   additional constraint and the value of m is added to d (that already
   includes clock error and network delay).

   Thus, three different possible constraints are considered: clock
   error between network devices, network delay between measurement
   points and the misalignment between the marking source routers.

Are these "constraints" or "contributions [to error/uncertainty]"?  RFC
8321 does not seem to use "constraint" in this fashion.

   In the end, the condition that must be satisfied to enable the method
   to function properly is that the available counting interval must be
   > 0, and that means: L - 2m - 2d > 0 for each measurement point on
   the multipoint path.  Therefore, the mismatch between measurement
   intervals must satisfy this condition.

Is it bad to just make L really large?
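
For reference, the quoted condition can be checked as below.  This is a
sketch only: I am taking L as the marking period length, m as the
misalignment between marking sources, and d as clock error plus network
delay, as the quoted text suggests; the function names and the numeric
values (in seconds) are hypothetical.

```python
def available_counting_interval(L, m, d):
    """Interval left for reading counters, per the quoted expression."""
    return L - 2 * m - 2 * d


def condition_holds(L, m, d):
    """True when the quoted condition L - 2m - 2d > 0 is satisfied."""
    return available_counting_interval(L, m, d) > 0


# Hypothetical values in seconds.
print(available_counting_interval(10, 1, 0.5))  # 7.0
print(condition_holds(2, 0.8, 0.3))             # False
```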

Section 8

   period and this is the time reference that can be used.  It is
   important to highlight that both delay and delay variation
   measurements make sense in a multipoint path.  The Delay Variation is
   calculated by considering the same packets selected for measuring the
   Delay.

Is the "variation" considered here just the variation across the
separate paths that the selected packets take, or over time, or
something else?

   o  Delay measurements on single packets basis means that you can use
      multipoint path just to easily couple packets between inputs and

nit: singular/plural mismatch "packets"/"basis" (also missing "a"?)
nit: "input" singular, I think.

Section 8.1.1

I don't understand what weights are used for the "weighted averages".

Section 8.2.1

   since they would not be representative of the entire flow.  The
   packets can follow different paths with various delays and in general
   it can be very difficult to recognize marked packets in a multipoint-
   to-multipoint path especially in case they are more than one per
   period.

nits: "the case when there is", comma before "and in general".

Section 8.2.2

   [I-D.mizrahi-ippm-compact-alternate-marking] introduces how to use
   the Hash method combined with alternate marking method for point-to-

Is it really appropriate for a WG document to depend on an individual
document to introduce a concept?

   In a multipoint environment the behaviour is similar to point-to
   point flow.  In particular, in the context of multipoint-to-
   multipoint flow, the dynamic hash could be the solution to perform

nits: both of these either need "flows" plural or the article "a".

   The management system receives the samples including the timestamps
   and the hash value from all the MPs, and this happens both for point-
   to-point and for multipoint-to-multipoint flow.  Then the longest

nit: "flows"

Also, this seems to be the first time we abbreviate "measurement
points"; it would be good to provide the abbreviation/expansion together
and consistently use the abbreviation.  (Interestingly
https://www.rfc-editor.org/materials/abbrev.expansion.txt only has
"multiprotocol" and "maintenance point".)

   to-point and for multipoint-to-multipoint flow.  Then the longest
   hash used by MPs is deduced and it is applied to couple timestamps of
   same packets of 2 MPs of a point-to-point path or of input and output
   MPs of a Cluster (or a Super Cluster or the entire network).  But

nit: there appear to be several missing words here.
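
As I understand the intent, the coupling step amounts to something like
the sketch below.  The sample layout (hash value, timestamp), the
function name, and the values are my assumptions, not the draft's.  The
sketch does not model how the "longest hash" is deduced; it simply
matches equal hash values and skips any hash seen more than once on
either side.

```python
def couple_by_hash(input_samples, output_samples):
    """Each argument is a list of (hash_value, timestamp) samples.

    Returns {hash_value: delay} for hashes that appear exactly once
    on the input side and exactly once on the output side.
    """
    def unique(samples):
        seen, duplicated = {}, set()
        for h, ts in samples:
            if h in seen:
                duplicated.add(h)
            seen[h] = ts
        return {h: ts for h, ts in seen.items() if h not in duplicated}

    ins, outs = unique(input_samples), unique(output_samples)
    return {h: outs[h] - ins[h] for h in ins.keys() & outs.keys()}


# Hypothetical samples, timestamps in milliseconds.
inputs = [(0xA1, 10), (0xB2, 12), (0xC3, 15)]
outputs = [(0xB2, 17), (0xA1, 14), (0xD4, 20)]
delays = couple_by_hash(inputs, outputs)
print(delays == {0xA1: 4, 0xB2: 5})  # True
```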

   In summary, the basic hash is logically similar to the double marking
   method, and in case of point-to-point path double marking and basic

I don't really recall any specific discussion of similarities between
basic hash and double marking, so calling this a "summary" is perhaps a
stretch.

Section 9

   An SDN Controller or a Network Management System (NMS) can calibrate
   Performance Measurements since it is aware of the network topology.
   It can start without examining in depth.  In case of necessity
   (packet loss is measured or the delay is too high), the filtering
   criteria could be immediately specified more in order to perform a
   partition of the network by using Clusters and/or different
   combinations of Clusters.  In this way the problem can be localized

nit: "specified more" seems incomplete; perhaps something about detail?

Section 11

I don't see much in RFC 8321 to note that "traffic observed by
measurement points may contain private information if not protected by
transport-layer security protocols, so measurement infrastructure should
be as equally protected/secured as routing hardware".  That said, it is
hopefully obvious, and not specific to this work, so I don't feel a
particular need to have it mentioned here.