[Bier] AD Review of draft-ietf-bier-pmmm-oam-05

Alvaro Retana <aretana.ietf@gmail.com> Wed, 26 June 2019 18:40 UTC

From: Alvaro Retana <aretana.ietf@gmail.com>
Date: Wed, 26 Jun 2019 14:40:00 -0400
Message-ID: <CAMMESsyKwB_ha85WfQJ9LiOcE3gWaAXLkocz4-f8U9jah7z=kA@mail.gmail.com>
To: draft-ietf-bier-pmmm-oam@ietf.org
Cc: BIER WG Chairs <bier-chairs@ietf.org>, BIER WG <bier@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bier/vXfLs0N0sK0jBPoRE4JOGHi2QUE>
Subject: [Bier] AD Review of draft-ietf-bier-pmmm-oam-05

Dear authors:

I just finished reading this document.  I was looking forward to reviewing
a short and straightforward document, but ended this one with many
questions and concerns.  Please take a look at the details.

I'm starting with some overall issues/concerns:

(1) This document uses the PNPM method described in rfc8321, which makes
that RFC a required Normative reference because it "must be read to
understand or implement the technology" [1].  Note that some of my comments
below are precisely about being explicit with the use of PNPM; the text
talks about the marking method in general...

(2) The use of PNPM means, then, that this document cannot be
on the Standards Track because rfc8321 is Experimental.  In general,
downward references are possible, but I don't think this is one of those
cases.  The Shepherd writeup for rfc8321 [2] states that "the measurement
utility of this extension still is to be demonstrated at a variety of
scales in a plurality of network conditions."  As far as I can tell, that
hasn't been demonstrated, nor was specific information about the completion
of the experiment included in the RFC text.  I didn't see the topic of the
document status discussed in the WG -- nor am I aware of discussions about
the maturity of rfc8321 in the ippm WG.  The result is then that this
document should be either Informational or Experimental.

(3) Before digging further into the status of rfc8321, I want to ask about
the applicability of PNPM to multicast traffic: is it applicable?  On one
hand, I see that rfc8321 reports that the "methodology has been used
experimentally in Telecom Italia's network and is applied to
multicast...".  On the other hand, draft-ietf-ippm-multipoint-alt-mark [*]
starts by saying that rfc8321 "can be applied only to point-to-point
flows".  I think that we could stretch the use of PNPM to monitor
(referring to Figure 2) both A-C-G and A-C-F using a single set of markings
at A...but that piece of the methodology is not specified in the document.

(4) Finally, what is the relationship between this document and
draft-ietf-bier-oam-requirements?  How does this document address the
requirements?  Why isn't draft-ietf-bier-oam-requirements even mentioned?


Given these issues (and others identified below), I am inclined to return
this document to the WG to consider the Status, applicability, the
relationship to other work items, etc.  I will wait for an initial response
to these comments before doing so.

Thanks!

Alvaro.


[*] draft-ietf-ippm-multipoint-alt-mark is an ippm WG item and one of the
authors is also an author of rfc8321 and of this document.

[1]
https://www.ietf.org/blog/iesg-statement-normative-and-informative-references/
[2]
https://datatracker.ietf.org/doc/draft-ietf-ippm-alt-mark/shepherdwriteup/




[Line numbers from idnits.]

...
15 Abstract

17   This document describes a hybrid performance measurement method for
18   multicast service over Bit Index Explicit Replication (BIER) domain.

[nit] s/over Bit Index Explicit Replication (BIER) domain/through a Bit
Index Explicit Replication (BIER) domain


...
70 1.  Introduction

72   [RFC8279] introduces and explains Bit Index Explicit Replication
73   (BIER) architecture and how it supports forwarding of multicast data
74   packets.  [RFC8296] specified that in case of BIER encapsulation in
75   MPLS network a BIER-MPLS label, the label that is at the bottom of
76   the label stack, uniquely identifies the multicast flow.  [RFC8321]
77   describes hybrid performance measurement method, per [RFC7799]
78   classification of measurement methods.  Packet Network Performance
79   Monitoring (PNPM), which can be used to measure packet loss, latency,
80   and jitter on live traffic.  Because this method is based on marking
81   consecutive batches of packets the method often referred to as
82   Marking Method (MM).

[nit] s/explains Bit Index Explicit/explains the Bit Index Explicit

[nit] s/encapsulation in MPLS network/encapsulation in an MPLS network

[nit] s/describes hybrid performance/describes a hybrid performance

[nit] s/[RFC7799] classification of measurement methods./RFC7799's
classification of measurement methods [RFC7799].

[nit] s/Packet Network Performance Monitoring (PNPM), which can be used/The
method, called Packet Network Performance Monitoring (PNPM), can be used

[nit] s/the method often referred/the method is often referred

[major] It's not clear to me whether PNPM is known as *the* Marking Method,
or if it is simply *a* marking method.  Please clarify.  I note that the
later mentions of "marking method" are all in lower case, which seems to
imply something generic -- if referring to PNPM, it would be better to do
it explicitly.

84   This document defines how marking method can be used on BIER layer to
85   measure packet loss and delay metrics of a multicast flow in MPLS
86   network.

[nit] s/used on BIER layer/used on the BIER layer

[nit] s/in MPLS network/in an MPLS network

88 2.  Conventions used in this document

90 2.1.  Terminology
...
99   MM: Marking Method

[minor] It looks like MM is only used in the Introduction...


...
111 3.  OAM Field in BIER Header

113   [RFC8296] defined the two-bit long field, referred to as OAM,
114   designated for the marking performance measurement method.  The OAM
115   field MUST NOT be used in defining forwarding and/or quality of
116   service treatment of a BIER packet.  The OAM field MUST be used only
117   for the performance measurement of data traffic in BIER layer.
118   Because the setting of the field to any value does not affect
119   forwarding and/or quality of service treatment of a packet, the
120   marking method in BIER layer can be viewed as the example of the
121   hybrid performance measurement method.

[major] "designated for the marking performance measurement method"
 rfc8296 clearly says that this document is an example of a document that
may define the non-default use of the bits.  It doesn't designate the use
of the bits in any way.

[major] "The OAM field MUST NOT be used in defining forwarding and/or
quality of service treatment of a BIER packet."  This sentence seems to
paraphrase rfc8296.  Is that the intent?  If so, that is not what rfc8296
says: it is not Normative in the same way.

If not, then I'm not sure what the Normative statement is.

In general, it seems to me that there is no value in that text in this
document.  Note that the paragraph ends with "because the setting of the
field to any value does not affect forwarding and/or quality of service
treatment of a packet...", which is a statement in line with rfc8296, so
there doesn't seem to be a need to include the Normative sentence at all...

[major] "The OAM field MUST be used only for the performance measurement of
data traffic in BIER layer."  What is the intended Normative action of this
sentence?  It seems to me that it wants to avoid other uses of the OAM
field...but without updating rfc8296, which says that the use of the field
"in other than the default manner is OPTIONAL".  IOW, this statement
contradicts rfc8296.

If you intend for the statement to only apply to implementations of this
document, then you don't need to even include it: the document itself is
about specifying the use of the OAM field (for nodes that support it).

[minor] "Because the setting of the field...the marking method in BIER
layer can be viewed as the example of the hybrid performance measurement
method."  I don't understand how the conclusion is drawn (based on the
setting of the field)...nor how this relates to the text in the
Introduction where it basically says that PNPM, which is a hybrid
performance measurement method, is known as MM... ??

123   The Figure 1 displays format of the OAM field

[major] Please be explicit in saying that this is how this document defines
the OAM field.

125    0
126    0   1
127   +-+-+-+-+
128   | L | D |
129   +-+-+-+-+

131                 Figure 1: OAM field of BIER Header format

133   where:

135   o  L - Loss flag;

137   o  D - Delay flag.

[minor] Please add a forward reference to where the meaning and use of
these flags is specified.

[minor] The name of these flags doesn't really represent loss/delay...
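
As a concrete reading of Figure 1, a minimal Python sketch of the two-bit
field follows.  It assumes that bit 0 (L) is the more significant of the
two bits and bit 1 (D) the less significant, which is how I read the
figure.  The names L_FLAG, D_FLAG, encode_oam and decode_oam are
hypothetical and not taken from the draft or from rfc8296.

```python
# Hypothetical helpers for the two-bit OAM field shown in Figure 1.
# Assumption: bit 0 (L) is the most significant bit, bit 1 (D) the least.
L_FLAG = 0b10
D_FLAG = 0b01

def encode_oam(l_set, d_set):
    """Return the 2-bit OAM value for the given L and D flag settings."""
    return (L_FLAG if l_set else 0) | (D_FLAG if d_set else 0)

def decode_oam(value):
    """Split a 2-bit OAM value into an (L, D) pair of booleans."""
    if not 0 <= value <= 0b11:
        raise ValueError("OAM field is two bits wide")
    return bool(value & L_FLAG), bool(value & D_FLAG)

print(encode_oam(True, False))  # 2
print(decode_oam(0b11))         # (True, True)
```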

139 4.  Theory of Operation

141   The marking method can be successfully used in the multicast
142   environment supported by BIER layer.  Without limiting any generality
143   consider multicast network presented in Figure 2.  Any combination of
144   markings, Loss and/or Delay, can be applied to a multicast flow by
145   any Bit Forwarding Router (BFR) at either ingress or egress point to
146   perform node, link, segment or end-to-end measurement to detect
147   performance degradation defect and localize it efficiently.

[nit] "The marking method can be successfully used in the multicast
environment supported by BIER layer."  Sounds like a marketing statement...

[major] "Any combination of markings...can be applied...by any Bit
Forwarding Router (BFR)..."  rfc8296 says that the "bits are set...by the
BFIR and are not modified by other BFRs".  What is the assumption?  Please
be clear!

149                           -----
150                         --| D |
151                 -----  /  -----
152               --| B |--
153              /  -----  \  -----
154             /           --| E |
155   -----    /              -----
156   | A |---                -----
157   -----    \            --| F |
158             \  -----   /  -----
159              --| C |--
160                -----   \  -----
161                         --| G |
162                           -----

164                        Figure 2: Multicast network

166   Using the marking method, a BFR creates distinct sub-flows in the
167   particular multicast traffic over BIER layer.  Each sub-flow consists
168   of consecutive blocks, consisting of identically marked packets, that
169   are unambiguously recognizable by a monitoring point at any BFR and
170   can be measured to calculate packet loss and/or packet delay metrics.
171   It is expected that the marking values be set and cleared at the edge
172   of BIER domain.  Thus for the scenario presented in Figure 2 if the
173   operator initially monitors A-C-G and A-B-D segments he may enable
174   measurements on segments C-F and B-E at any time.

[major] What are sub-flows?  This question was asked in the Shepherd
review, but no clarification made it into the document.  Note that §4.1
talks about "alternate flows" -- is that the same thing?  And §4.2 uses
"monitored flow"...

[nit] s/monitors A-C-G/monitors the A-C-G

[major] "...if the operator initially monitors A-C-G and A-B-D segments he
may enable measurements on segments C-F and B-E at any time."  How?  A
similar question was asked in the Shepherd's review, and the answer
included this: "the AltMark domain may be arbitrary and not identical to
the BIER domain. But from operational PoV, I believe, it is useful to apply
AltMark at BFIR and then clear them by removing BIER encapsulation at
BFERs." [3]  However, the document doesn't have any type of discussion
related to the existence of multiple types of domains, or their
congruence...much less operational guidance.  Please be explicit about the
potential different domains, their relationship to the specification in
rfc8296 and provide operational guidance in an Operational Considerations
section.

[3] https://mailarchive.ietf.org/arch/msg/bier/0rn7_VSjJQPRAOxSSfnGp-kFWBE


176 4.1.  Single Mark Enabled Measurement

[major] rfc8321 uses "Single-Marking" (not Single Mark).  Please be
consistent.

178   As explained in the [RFC8321], marking can be applied to delineate
179   blocks of packets based either on the equal number of packets in a
180   block or based on equal time interval.  The latter method offers
181   better control as it allows better account for capabilities of
182   downstream nodes to report statistics related to batches of packets
183   and, at the same time, time resolution that affects defect detection
184   interval.

[nit] s/in the [RFC8321]/in [RFC8321]

186   If the Single Mark measurement used to measure packet loss, then the
187   D flag MUST be set to zero on transmit and ignored by monitoring
188   point.

[nit] s/measurement used/measurement is used

[nit] s/ignored by monitoring/ignored by the monitoring

[major] In this document I don't see a way to discover (or detect from the
signaling) what methodology is in use.  All the nodes in the network (or at
least the ones doing measurement) MUST then be configured beforehand.  Is
that true?  Please be explicit about those types of requirements.  An
rfc5706-type Operational Considerations section would be ideal -- look
especially at §2.

190   The L flag is used to create alternate flows to measure the packet
191   loss by switching the value of the L flag every N-th packet or at
192   certain time intervals.  Delay metrics MAY be calculated with the
193   alternate flow using any of the following methods:

195   o  First/Last Packet Delay calculation: whenever the marking, i.e.
196      value of L flag changes, a BFR can store the timestamp of the
197      first/last packet of the block.  The timestamp can be compared
198      with the timestamp of the packet that arrived in the same order
199      through a monitoring point at downstream BFR to compute packet
200      delay.  Because timestamps collected based on order of arrival
201      this method is sensitive to packet loss and re-ordering of packets

[nit] s/at downstream BFR/at a downstream BFR

[major] "this method is sensitive to packet loss and re-ordering of
packets"  It will be important to point at the Considerations section from
rfc8321.  It would be ideal to do so in the Operational Considerations
section.
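
A minimal Python sketch of that sensitivity, with made-up timestamps: if
the first packet of a block is lost, the downstream point timestamps the
second packet as "first" and the delay is overestimated.  The function
name and all numbers are hypothetical.

```python
def first_packet_delay(tx_times, rx_times):
    """Delay estimated from the first timestamp seen in a block at each point."""
    return rx_times[0] - tx_times[0]

tx = [0.0, 1.0, 2.0, 3.0]            # send times at the upstream BFR (ms)
rx_no_loss = [t + 5.0 for t in tx]   # constant 5 ms path delay, no loss
rx_first_lost = rx_no_loss[1:]       # first packet of the block is dropped

print(first_packet_delay(tx, rx_no_loss))     # 5.0
print(first_packet_delay(tx, rx_first_lost))  # 6.0
```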

203   o  Average Packet Delay calculation: an average delay is calculated
204      by considering the average arrival time of the packets within a
205      single block.  A BFR may collect timestamps for each packet
206      received within a single block.  Average of the timestamp is the
207      sum of all the timestamps divided by the total number of packets
208      received.  Then the difference between averages calculated at two
209      monitoring points is the average packet delay on that segment.
210      This method is robust to out of order packets and also to packet
211      loss (only a small error is introduced).  This method only
212      provides a single metric for the duration of the block and it
213      doesn't give the minimum and maximum delay values.  This
214      limitation could be overcome by reducing the duration of the block
215      by means of a highly optimized implementation of the method.

[minor] "Then the difference between averages calculated at two monitoring
points is the average packet delay on that segment."  Maybe my math is
rusty, but I would have thought the average delay to be the average of the
averages, not the difference.  ??

[minor] "only a small error is introduced"  This seems to be a statement of
faith: no proof or justification.
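
A quick check with made-up numbers, in case it helps frame the answer to
the two comments above.  With no loss, the difference of the mean
timestamps at the two points comes out equal to the mean of the per-packet
delays.  With one lost packet the estimate shifts, and in this tiny block
the shift is not small.  All values and variable names are hypothetical.

```python
from statistics import mean

tx = [0.0, 1.0, 2.0, 3.0]        # timestamps at the upstream point (ms)
delays = [5.0, 7.0, 6.0, 6.0]    # per-packet delays on the segment (ms)
rx = [t + d for t, d in zip(tx, delays)]

# No loss: difference of mean timestamps equals the mean per-packet delay.
no_loss_estimate = mean(rx) - mean(tx)
print(no_loss_estimate, mean(delays))  # 6.0 6.0

# Last packet lost downstream: the upstream mean still includes its
# timestamp, so the estimate no longer matches the surviving packets.
loss_estimate = mean(rx[:-1]) - mean(tx)
print(loss_estimate)  # 5.5
```

The size of the error depends on where in the block the lost packet sat
and on how many packets the block holds, which is the kind of
justification I would expect the text to give.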

[major] "This method...[has a] limitation [which] could be overcome by
reducing the duration of the block by means of a highly optimized
implementation of the method."  What does that mean?  How is it done?

[major] What considerations should be taken into account when deciding the
length of time, or number of packets, to be used in each monitored flow?
Please include this information in the Operational Considerations section.

217 4.2.  Double Mark Enabled Measurement

[major] rfc8321 uses "Double-Marking" (not Double Mark).  Please be
consistent.

219   Double Mark method allows measurement of minimum and maximum delays
220   for the monitored flow but it requires more nodal and network
221   resources.  If the Double Mark method used, then the L flag MUST be
222   used to create the alternate flow, i.e. mark larger batches of
223   packets.  The D flag MUST be used to mark single packets to measure
224   delay jitter.

[major] "If the Double Mark method used, then the L flag MUST be used...The
D flag MUST be used..."   There's no Normative value in the use of MUST
here.  If appropriate, please use Normative language to indicate *how* the
flags are used instead.  s/MUST/must

226   The first marking (L flag alternation) is needed for packet loss and
227   also for average delay measurement.  The second marking (D flag is
228   put to one) creates a new set of marked packets that are fully
229   identified over the BIER network, so that a BFR can store the
230   timestamps of these packets; these timestamps can be compared with
231   the timestamps of the same packets on a second BFR to compute packet
232   delay values for each packet.  The number of measurements can be
233   easily increased by changing the frequency of the second marking.
234   But the frequency of the second marking must be not too high in order
235   to avoid out of order issues.  This method is useful to measure not
236   only the average delay but also the minimum and maximum delay values
237   and, in wider terms, to know more about the statistic distribution of
238   delay values.

[major] "the frequency of the second marking must be not too high"  What is
"too high"?

240 5.  IANA Considerations

242   This document requests IANA to register format of the OAM field of
243   BIER Header as the following:

[major] §3 calls these bits L and D.

[major] If you're assigning values to all the bits in the field, then a
registry is not needed.

[major] If you want to set up a registry to avoid others using the bits in
different ways, then that requires an Update to rfc8296.  The way I read
rfc8296 is that there could be multiple ways of using the field.  I also
didn't see this type of discussion in the WG archive.

245   +--------------+---------+--------------------------+---------------+
246   | Bit Position | Marking | Description              | Reference     |
247   +--------------+---------+--------------------------+---------------+
248   |      0       |    S    | Single Mark Measurement  | This document |
249   |      1       |    D    | Double Mark Measurement  | This document |
250   +--------------+---------+--------------------------+---------------+

252                     Table 1: OAM field of BIER Header

[major] §3 uses a different description, which seems in line with §4.  Also,
the use of a single bit doesn't indicate (according to §4) the type of
measurement done.

254 6.  Security Considerations

256   This document list the OAM requirement for BIER-enabled domain and
257   does not raise any security concerns or issues in addition to ones
258   common to networking.

[major] "This document list the OAM requirement for BIER-enabled domain..."
 That is not what this document does!

[major] "common to networking"  That's a very wide statement -- do you have
a reference?

[major] You should at least point to rfc8296.

[major] I would also like to see a pointer to rfc8321, and a discussion of
how the concerns there apply (or not) to the BIER application.


...
283 8.2.  Informative References
...
295   [RFC8321]  Fioccola, G., Ed., Capello, A., Cociglio, M., Castaldelli,
296              L., Chen, M., Zheng, L., Mirsky, G., and T. Mizrahi,
297              "Alternate-Marking Method for Passive and Hybrid
298              Performance Monitoring", RFC 8321, DOI 10.17487/RFC8321,
299              January 2018, <https://www.rfc-editor.org/info/rfc8321>.

[major] This reference must be Normative.