[Roll] AD Review of draft-ietf-roll-nsa-extension-10

Alvaro Retana <aretana.ietf@gmail.com> Thu, 17 March 2022 17:33 UTC

Dear authors:

Thank you for this work!  The draft is clear and to the point.  I am
sorry it took me so long to get to it due to too many other documents
in the queue.

I have a general comment that I will mention up-front and several
in-line comments/questions (below).  I will wait for at least one
revision before starting the IETF Last Call.

(1) Motivation -- are there other applications?

The document starts by prominently using PRE to justify the work.  The
Abstract even promises that it "details how to apply Packet
Replication and Elimination in RPL".  However, PRE is barely mentioned
beyond the Introduction.  The only significant mention comes in §6,
where implementation control is recommended -- but no guidance is
given on when or how to use it (see specific comments below).

This document aims to specify the CA OF and the PS NSA extension (as
the title mentions).  PRE may have been the original inspiration, but
only that.  IMHO, the document shouldn't be centered around it -- we
could even eliminate §6.

Are there other potential users of these extensions?  Thinking out
loud...  The AP can be used as a load-sharing path (at least until the
traffic reaches a CA).  The AP could also be used as a pre-calculated
backup path.

This comment is not a show stopper, but I think the document would
benefit from less focus on PRE, presenting it as one example among
potentially many applications.



[Line numbers from idnits.]

141	2.  Terminology

143	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
144	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
145	   document are to be interpreted as described in [RFC2119].

[major] Use the text in rfc8174.

147	   The draft uses the following Terminology:
156	   Parent Set (PS):  Given a RPL node, the set of its neighbor nodes
157	      which participate in the same RPL DODAG and which can potentially
158	      take the role of the node's preferred parent.

[minor] This term already exists in rfc6550.  It looks like the
definition is the same, but to avoid confusion or divergence please
point there instead.  For example: "The following terms from rfc6550
are used in this document: PS, ...".

169	3.  Common Ancestor AP Selection Policies

171	   In the RPL protocol, each node maintains a list of potential parents.
172	   For PRE, the Preferred Parent (PP) node is defined to be the same as
173	   the RPL DODAG Preferred Parent node.  Furthermore, to construct an
174	   alternative path toward the root, in addition to the PP node, each
175	   node in the network selects additional parent(s), called alternative
176	   parent(s), from its Parent Set (PS).

[minor] "For PRE, the Preferred Parent (PP) node is defined to be the
same as the RPL DODAG Preferred Parent node."

IOW, there isn't really a PRE PP, just a PP, as defined in rfc6550, right?

   PRE uses the RPL DODAG Preferred Parent node.

[** See related comment below in §4. **]

[nit] "additional parent(s)"  §2 uses "Additional Parents"
capitalized.  Please be consistent when using defined terms.

181	   All three policies defined perform AP selection based on common
182	   ancestors, named Common Ancestor Strict, Common Ancestor Medium, and
183	   Common Ancestor Relaxed, depending on how restrictive the selection
184	   process is.  A more restrictive policy will limit flooding but might
185	   fail to select an appropriate AP, while a less restrictive one will
186	   more often find an appropriate AP but might increase flooding.

[major] Besides the consideration of more or less flooding, what other
things should an operator keep in mind when selecting a policy?

[** I revisit this question in §4.1. **]

192	3.1.  Common Ancestor Strict
233	   node S can decide to use node B as its AP node, since PP(PP(S)) = Y =
234	   PP(B).

[nit] s/node S/Node S

[major] "node S can decide to use node B as its AP node"

What do you mean by "can decide"?  If Node S is using the CA Strict OF
then B *is* its AP node.  There's no room for other decisions.  Am I
missing something?

I can see the case where multiple nodes meet the condition -- for
example if a new node E also has PP(E) = Y -- then I can see how node
S "can decide" to use one or the other.  Are you implying that the
result of the policy is somehow optional?

[major] BTW, if there were more than one option, how would node S make
that tie-breaking decision?
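
To make the question concrete, here is a minimal sketch (Python, my
own illustration, not text from the draft) of the CA Strict check as I
read it from the example: a neighbor qualifies when its PP equals
PP(PP(S)).  The function name, the node names other than S, B, Y, and
E, and the topology itself are assumptions.

```python
def ca_strict_candidates(node, parent_set, pp):
    """Return every neighbor in parent_set that passes the CA Strict check.

    pp maps a node name to its Preferred Parent (PP).  A candidate passes
    when PP(candidate) == PP(PP(node)), as in the draft's example.
    """
    own_pp = pp[node]
    grandparent = pp[own_pp]  # PP(PP(node))
    return [n for n in parent_set
            if n != own_pp and pp.get(n) == grandparent]


# Assumed topology: C is PP(S); E is the extra node from the comment above.
pp = {"S": "C", "C": "Y", "B": "Y", "E": "Y", "A": "X"}
matches = ca_strict_candidates("S", ["A", "B", "C", "E"], pp)
print(matches)  # ['B', 'E']
```

Both B and E pass, so the policy alone does not determine the AP in
this case.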

236	3.2.  Common Ancestor Medium
251	   node S can decide to use node B or D as its AP node.

[nit] s/node S/Node S

[] Same questions about "can decide" and multiple nodes that satisfy
the condition.

253	3.3.  Common Ancestor Relaxed
269	   node S can decide to use node A, B or D as its AP node.

[nit] s/node S/Node S

[] Same questions about "can decide".

[major] How would node S select between A, B, and D?

271	4.  Common Ancestor Objective Function

273	   An OF which allows the multiple paths to remain correlated is
274	   detailed here.  More specifically, when using this OF a node will
275	   select an AP node close to its PP node to allow the operation of
276	   overhearing between parents.  For more details about overhearing and
277	   its use in this context see the "Complex Track with Replication and
278	   Elimination" in Section 4.5.3 of [I-D.ietf-6tisch-architecture].  If
279	   multiple potential APs match this condition, the AP with the lowest
280	   rank will be registered.

[minor] "a node will select an AP node close to its PP node to allow
the operation of overhearing between parents"

How is this used in the policies defined in §3?  Are you referring to
physical distance or inferring proximity based on the selection

[] "the AP with the lowest rank will be registered."

Is this the tie-breaker for the policies in §3?
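
If it is, I would expect something along these lines (a sketch only;
the function name and the Rank values are hypothetical, and the
fallback for equal Ranks is my assumption, since the draft does not
specify one):

```python
def pick_lowest_rank(candidates, rank):
    """Pick the candidate AP with the lowest Rank, or None if there is none.

    Equal Ranks fall back to name order here; that fallback is an
    assumption of this sketch, not something the draft defines.
    """
    if not candidates:
        return None
    return min(candidates, key=lambda n: (rank[n], n))


rank = {"B": 512, "E": 768, "D": 512}  # hypothetical Rank values
print(pick_lowest_rank(["B", "E"], rank))  # B
print(pick_lowest_rank(["D", "B"], rank))  # B (equal Rank, name order)
```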

282	   The OF described here is an extension of The Minimum Rank with
283	   Hysteresis Objective Function [MRHOF].  In general, this OF extends
284	   MRHOF by specifying how an AP is selected.  Importantly, the
285	   calculation of the rank of the node through each candidate neighbor
286	   and the selection of the PP is kept the same as in MRHOF.

[minor] "Importantly, the...selection of the PP is kept the same as in MRHOF."

§3 says that "For PRE, the Preferred Parent (PP) node is defined to be
the same as the RPL DODAG Preferred Parent node."  I made a comment
there about clarifying that the PP was selected according to rfc6550
-- but the text here talks about using rfc6719 instead.  Please

[major style nit] Personally, I prefer using references in the form of
"[RFCxxx]" instead of a name, for example "[MRHOF]".  In this case,
the reference to "MRHOF" is used both as a reference to rfc6719 and as
an abbreviation.  The result is that the rest of the text is not clear
as to whether you're extending the objective function defined in
rfc6719, or rfc6719 itself (as in a formal Update).

The RFC Style Guide (rfc7322) says this:

   3.6.  Abbreviation Rules

   Abbreviations should be expanded in document titles and upon first
   use in the document.  The full expansion of the text should be
   followed by the abbreviation itself in parentheses.

s/The Minimum Rank with Hysteresis Objective Function [MRHOF]/the
Minimum Rank with Hysteresis Objective Function (MRHOF) [RFC6719]

Please use "[RFC6719]" as the reference throughout.

[major] I am assuming that this document doesn't intend to formally
Update rfc6719.  Instead, it defines a new OF (with a different OCP)
based on MRHOF.  Is that correct?

288	   The ways in which the CA OF modifies MRHOF in a section-by-section
289	   manner follows in detail:

[minor] s/CA OF modifies MRHOF/CA OF differs from MRHOF

298	   [MRHOF], Section 3 "The Minimum Rank with Hysteresis Objective
299	   Function":
300	      Same as MRHOF extended to AP selection.  Minimum Rank path
301	      selection and switching applies correspondingly to the AP with the
302	      extra CA requirement of having some match between ancestors.

[] "extra CA requirement of having some match between ancestors"

You're referring to the policies defined in §3, right?  If so, please say so.

304	   [MRHOF], Section 3.1 "Computing the Path Cost":
305	      Same as MRHOF extended to AP selection.  If a candidate neighbor
306	      does not fulfill the CA requirement then the path through that
307	      neighbor SHOULD be set to MAX_PATH_COST, the same value used by
308	      MRHOF.  As a result, the node MUST NOT select the candidate
309	      neighbor as its AP.

[minor] s/the path...set to MAX_PATH_COST/the path cost...set to MAX_PATH_COST

[major] "SHOULD be set to MAX_PATH_COST...MUST NOT select"

When is it ok to not set the path cost to MAX_PATH_COST?  Why is this
action recommended and not required?

I realize that rfc6719 also only recommends the setting.  This is the
text from §3.1:

   If the selected metric is a link metric and the metric of the link to
   a neighbor is not available, the path cost for the path through that
   neighbor SHOULD be set to MAX_PATH_COST.  This cost value will
   prevent this path from being considered for path selection.

The difference is that the text in rfc6719 goes on to say what the
expected result of setting the cost to MAX_PATH_COST is -- in a
non-normative way.

OTOH, this document follows up by saying that "As a result, the node
MUST NOT select the candidate neighbor as its AP."  So -- the required
action to not select is contingent on the recommended action of
setting the cost.  What are the cases when it is ok to not use
MAX_PATH_COST?

[Even if you use the wording from rfc6719 I will still want to know
when it is ok to not use MAX_PATH_COST.]
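
The dependency can be seen in a small sketch (Python, illustration
only).  The function and parameter names are hypothetical, the value
32768 is the MAX_PATH_COST default recommended in rfc6719, and
treating a cost of MAX_PATH_COST as "not selectable" is my reading of
the rfc6719 text quoted above.

```python
MAX_PATH_COST = 32768  # default value recommended in rfc6719


def select_ap(path_cost, meets_ca, set_max_cost=True):
    """Return the neighbor with the smallest usable path cost, or None.

    path_cost:    neighbor -> computed path cost
    meets_ca:     neighbor -> True if the CA requirement is fulfilled
    set_max_cost: whether the implementation follows the SHOULD
    """
    effective = {}
    for neighbor, cost in path_cost.items():
        if not meets_ca[neighbor] and set_max_cost:
            cost = MAX_PATH_COST
        effective[neighbor] = cost
    usable = {n: c for n, c in effective.items() if c < MAX_PATH_COST}
    return min(usable, key=usable.get) if usable else None


costs = {"A": 300, "B": 400}      # hypothetical path costs
ca_ok = {"A": False, "B": True}   # A fails the CA requirement
print(select_ap(costs, ca_ok, set_max_cost=True))   # B
print(select_ap(costs, ca_ok, set_max_cost=False))  # A
```

In the second call the SHOULD is ignored and A, which fails the CA
requirement, is selected anyway, so the MUST NOT cannot hold unless
something else enforces it.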

319	   [MRHOF], Section 3.2.2 "Parent Selection Algorithm":
320	      Same as MRHOF extended to AP selection.  If the smallest path cost
321	      for paths through the candidate neighbors is smaller than
322	      cur_ap_min_path_cost by less than PARENT_SWITCH_THRESHOLD (the
323	      same variable as MRHOF uses), the node MAY continue to use the
324	      current AP.  Additionally, if there is no PP selected, there MUST
325	      NOT be any AP selected as well.  Finally, as with MRHOF, a node
326	      MAY include up to PARENT_SET_SIZE-1 additional candidate neighbors
327	      in its alternative parent set.  The value of PARENT_SET_SIZE is
328	      the same as in MRHOF.

[minor] s/MUST NOT be any AP selected as well/MUST NOT be any AP selected either
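
For reference, this is how I read the paragraph (a sketch under my own
assumptions, not the draft's algorithm): the function and argument
names are hypothetical, 192 is the PARENT_SWITCH_THRESHOLD default
recommended in rfc6719, and I assume the PP is never a candidate AP.

```python
PARENT_SWITCH_THRESHOLD = 192  # default recommended in rfc6719


def next_ap(pp, current_ap, cur_ap_min_path_cost, path_cost):
    """Apply MRHOF-style hysteresis to AP selection.

    pp:         the selected Preferred Parent, or None
    path_cost:  candidate neighbor -> path cost through that neighbor
    Returns the AP to use, or None.
    """
    if pp is None:
        return None  # no PP selected, so no AP either
    candidates = {n: c for n, c in path_cost.items() if n != pp}
    if not candidates:
        return None
    best = min(candidates, key=candidates.get)
    if current_ap is None:
        return best
    # Keep the current AP unless the improvement reaches the threshold
    # (this sketch always exercises the MAY in the quoted text).
    if cur_ap_min_path_cost - candidates[best] < PARENT_SWITCH_THRESHOLD:
        return current_ap
    return best


print(next_ap("C", "B", 1000, {"B": 1000, "D": 900}))  # B
print(next_ap("C", "B", 1000, {"B": 1000, "D": 700}))  # D
print(next_ap(None, "B", 1000, {"B": 1000, "D": 700}))  # None
```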

336	   [MRHOF], Section 3.5 "Working without Metric Containers":
337	      It is not possible to work without metric containers, since CA AP
338	      selection requires information from parents regarding their parent
339	      sets, which is transmitted via the NSA object in the DIO Mectric
340	      Container.

[major] "It is not possible to work without metric containers..."

What if the metric container is not present?  Is this one of the risks
that should also be mentioned in the Security section?

353	      Alternative parent set:  Corresponding to the MRHOF parent set.
354	         The size is defined by the same PARENT_SET_SIZE parameter as in
355	         MRHOF.  The Alternative parent set MUST be a non-strict subset
356	         of the parent set.

[major] "MUST be a non-strict subset of the parent set"

Maybe I don't understand the full meaning of non-strict, but the AP
set should at least not include the PP, right?  That means that the AP
set has to be a strict subset of the parent set.  What am I missing?
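
In set terms, using Python's operators as shorthand (`<=` is the
non-strict subset test, `<` the strict one; the node names are
hypothetical):

```python
parent_set = {"A", "B", "C", "D"}  # hypothetical PS of Node S
pp = "C"                           # hypothetical Preferred Parent

ap_set = parent_set - {pp}         # every parent except the PP

print(ap_set <= parent_set)  # True: non-strict subset (equality allowed)
print(ap_set < parent_set)   # True: strict subset, the PP is missing
print(parent_set <= parent_set, parent_set < parent_set)  # True False
```

A non-strict subset allows the AP set to equal the parent set, which
would put the PP in the AP set.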

358	      cur_ap_min_path_cost:  Corresponding to the MRHOF
359	         cur_min_path_cost variable.  To support the operation of the
360	         hysteresis function for AP selection.

[major] "Corresponding to the MRHOF cur_min_path_cost variable."

This is an additional variable which is similar to cur_min_path_cost,
right?  IOW, you're not replacing one with the other.

371	4.1.  Usage

373	   All OF policies apply their corresponding criterion to filter the
374	   list of candidate neighbours in the alternative parent set.  The AP
375	   is then selected from the alternative parent set based on Rank and
376	   using hysteresis as is done for the PP in MRHOF.  It is noteworthy
377	   that the OF uses the same Objective Code Point (OCP): TBD1 for all
378	   policies used.

[minor] s/All OF policies/All Common Ancestor AP Selection Policies (Section 3)

[minor] s/Objective Code Point (OCP): TBD1/Objective Code Point (OCP) (TBD1)

380	   The PS information can be used by any of the described AP selection
381	   policies or other ones not described here, depending on requirements.
382	   It is optional for all nodes to use the same AP selection policies.
383	   Different nodes may use different AP selection policies, since the
384	   selection policy is local to each node.  For example, using different
385	   policies can be used to vary the transmission reliability in each
386	   hop.

[] It is ok to not require the same policy on all nodes, but it
opens questions about policy selection and configuration (below).

[major] Operational Considerations: How should an operator choose
which policy to apply where?  I understand the difference with respect
to the strictness, but what makes a policy appropriate for a specific
situation?  For example, (just making this up) should stricter
policies be applied closer to the edge (leaves) of the network where
there's a greater probability of a common GP?

I later found Appendix B...  Please reference it from here.

[major] How are the different policies provisioned at different nodes?
In many instances the root decides about the behavior of the DODAG
and propagates that information (all nodes do the same).  But in this
case not all nodes will have the same configuration.  How is that
provisioned?

388	5.  Node State and Attribute (NSA) object type extension
455	   The structure of the DAG Metric Container data in the form of a Node
456	   State and Attribute (NSA) object with a TLV in the NSA Optional TLVs
457	   field is shown in Figure 3.  The first 32 bits comprise the DAG
458	   Metric Container header and all the following bits are part of the
459	   Node State and Attribute object body, as defined in [RFC6551].  This
460	   document defines a new TLV, which MAY be carried in the Node State
461	   and Attribute (NSA) object Optional TLVs field.  The TLV is named
462	   Parent Set and is abbreviated as PS in Figure 3.

[major] "MAY be carried"

It is an optional TLV.  However, in the context of this document it is
required to convey the PS.  I would prefer it if the text mentioned
that: "MUST be carried to indicate the PS...".

[related] If the network is using the CA OF, what does it mean for the
TLV to not be present?  Everyone has a parent...well, except for the

466	   PS Length:  The total length of the TLV value field (PS IPv6
467	         address(es)) in bytes.  The length is an integral multiple of
468	         16, the number of bytes in an IPv6 address.

[major] If my math is correct, the length cannot be larger than 240.
What should the receiver do if the length is not a multiple of 16 or
more than 240?

[minor] Is 0 a valid length?  What does it mean to not have parents?
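
A receiver-side check could look like the sketch below (Python, my
illustration).  The names are hypothetical; the 240 limit assumes a
one-octet length field, as in my math above; raising an error on a bad
length and accepting a length of 0 as an empty list are assumptions of
the sketch, since the draft does not specify either.

```python
import ipaddress

ADDR_LEN = 16     # bytes in an IPv6 address
MAX_PS_LEN = 240  # largest multiple of 16 that fits in one octet


def parse_ps_value(value: bytes):
    """Split a PS TLV value into IPv6 addresses, in order of preference.

    Raising ValueError on a bad length is an assumption of this sketch;
    the draft does not say what a receiver should do.
    """
    if len(value) % ADDR_LEN != 0 or len(value) > MAX_PS_LEN:
        raise ValueError(f"invalid PS Length: {len(value)}")
    return [ipaddress.IPv6Address(value[i:i + ADDR_LEN])
            for i in range(0, len(value), ADDR_LEN)]


value = (ipaddress.IPv6Address("2001:db8::1").packed
         + ipaddress.IPv6Address("2001:db8::2").packed)
print([str(a) for a in parse_ps_value(value)])
# ['2001:db8::1', '2001:db8::2']
```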

470	   PS IPv6 address(es)  One or more 128-bit IPv6 address(es) without any
471	         separator between them.  The field consists of one IPv6 address
472	         per parent in the parent set.  The parent addresses are listed
473	         in decreasing order of preference and not all parents in the
474	         parent set need to be included.  The selection of how many
475	         parents from the parent set are to be included is left to the
476	         implementation.  The number of parent addresses in the PS IPv6
477	         address(es) field can be deduced by dividing the length of the
478	         PS IPv6 address(es) field in bytes by 16, the number of bytes
479	         in an IPv6 address.

[] I am surprised that these addresses were not compressed.

481	5.1.  Usage

483	   The PS SHOULD be used in the process of parent selection, and
484	   especially in AP selection, since it can help the alternative path to
485	   not significantly deviate from the preferred path.  The Parent Set is
486	   information local to the node that broadcasts it.

[major] "PS SHOULD be used"

What are the cases where it should not be used, or when it is ok to
not use it?  In the context of this document, why is the use
recommended and not required?

The CA policy is defined per-node...and also (from above) the
propagation of the PS is not required.  Not propagating the PS may
undermine any effort by downstream nodes to figure out their GP...

488	   The PS is used only within NSA objects configured as a metric,
489	   therefore the DAG Metric Container field "C" MUST be 0.
490	   Additionally, since the information in the PS needs to be propagated
491	   downstream but it cannot be aggregated, the DAG Metric Container
492	   field "R" MUST be 1.  Finally, since the information contained is by
493	   definition partial, more specifically just the parent set of the DIO-
494	   sending node, the DAG Metric Container field "P" MUST be 1.

[major] What should a receiver do if the PS is included but the flags
are not set correctly?
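
The check itself is trivial, as in this sketch (Python; the class and
function names are hypothetical, and the flags are assumed to be
already decoded from the rfc6551 header).  What is missing from the
text is what the caller does when it fails.

```python
from dataclasses import dataclass


@dataclass
class McFlags:
    """Already-decoded DAG Metric Container flags (rfc6551)."""
    c: int  # 0 = routing metric, 1 = routing constraint
    r: int  # 1 = recorded along the path, 0 = aggregated
    p: int  # 1 = partial information


def ps_flags_ok(flags: McFlags) -> bool:
    """True when the flags match what the draft requires for the PS."""
    return (flags.c, flags.r, flags.p) == (0, 1, 1)


print(ps_flags_ok(McFlags(c=0, r=1, p=1)))  # True
print(ps_flags_ok(McFlags(c=1, r=1, p=1)))  # False
```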

496	   It is important that the PS does not affect the calculation of the
497	   rank through candidate neighbors.  It is only used with the CA OF to
498	   remove nodes which do not fulfill the CA OF criteria from the
499	   candidate neighbor list.

[] I'm not sure what you're trying to get to with the first sentence.
Are you saying that the PS shouldn't be used for anything else?  How
can you prevent that?  What if in the future there's a new use?

501	6.  Controlling PRE

503	   PRE is very helpful when the aim is to increase reliability for a
504	   certain path, however its use creates additional traffic as part of
505	   the replication process.  It is conceivable that not all paths have
506	   stringent reliability requirements.  Therefore, a way to control
507	   whether PRE is applied to a path's packets SHOULD be implemented.
508	   For example, a traffic class label can be used to determine this
509	   behavior per flow type as described in Deterministic Networking
510	   Architecture [RFC8655].

[major] "a way to control whether PRE is applied to a path's packets
SHOULD be implemented."

Why is this a recommendation and not a requirement?  Is it expected
that, if PRE is available on the node, it should apply to all
traffic?  Even if the configuration knob existed, what criteria should
the network operator use to decide whether to activate PRE or not?

512	7.  Security Considerations

[major] Please indicate that the security considerations from rfc6550,
rfc6551, and rfc6719 apply.

514	   The structure of the DIO control message is extended, within the pre-
515	   defined DIO options.  The additional information are the IPv6
516	   addresses of the parent set of the node transmitting the DIO.  This
517	   use of this additional information can have the following potential
518	   consequences:

520	   *  A malicious node that can receive and read the DIO can "see"
521	      further than it's own neighbourhood by one hop, learning the
522	      addresses of it's two hop neighbors.  This is a privacy / network
523	      discovery issue.

[] I would stay away from "privacy".  In general, similar information
can be gleaned by looking at the traffic and (in storing mode) holding
routes.  Topology discovery inside a domain is usually not an issue.

OTOH, if you have real privacy concerns, please take a look at rfc6973
and include a section on it.

532	8.  IANA Considerations

534	   This proposal requests the allocation of a new value TBD1 from the
535	   "Objective Code Point (OCP)" sub-registry of the "Routing Protocol
536	   for Low Power and Lossy Networks (RPL)" registry.

538	   This proposal also requests the allocation of a new value TBD2 for
539	   the "Parent Set" TLV from the "Routing Metric/Constraint TLVs" sub-
540	   registry of the "Routing Protocol for Low Power and Lossy Networks
541	   (RPL) Routing Metric/Constraint" registry.

[nit] s/This proposal/This document/g

569	10.2.  Informative references

571	   [I-D.ietf-6tisch-architecture]
572	              Thubert, P., "An Architecture for IPv6 over the TSCH mode
573	              of IEEE 802.15.4", Work in Progress, Internet-Draft,
574	              draft-ietf-6tisch-architecture-29, 27 August 2020,
575	              <https://tools.ietf.org/html/draft-ietf-6tisch-
576	              architecture-29>.

== Outdated reference: draft-ietf-6tisch-architecture has been published as
   RFC 9030

586	   [MRHOF]    Gnawali, O. and P. Levis, "The Minimum Rank with
587	              Hysteresis Objective Function", RFC 6719,
588	              DOI 10.17487/RFC6719, September 2012,
589	              <https://www.rfc-editor.org/info/rfc6719>.

[major] This reference has to be Normative.

596	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
597	              Requirement Levels", BCP 14, RFC 2119,
598	              DOI 10.17487/RFC2119, March 1997,
599	              <https://www.rfc-editor.org/info/rfc2119>.

[major] This reference, and the required one to rfc8174, have to be Normative.

[EoR -10]