[Roll] Benjamin Kaduk's Discuss on draft-ietf-roll-aodv-rpl-10: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Thu, 22 April 2021 22:48 UTC
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-roll-aodv-rpl@ietf.org, roll-chairs@ietf.org, roll@ietf.org, Ines Robles <mariainesrobles@googlemail.com>, aretana.ietf@gmail.com, mariainesrobles@googlemail.com
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <161913172006.16574.8625402788675096789@ietfa.amsl.com>
Date: Thu, 22 Apr 2021 15:48:40 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/roll/5uoqSV-wvmbJpahkTYUkoXcC57c>
Benjamin Kaduk has entered the following ballot position for
draft-ietf-roll-aodv-rpl-10: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-roll-aodv-rpl/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

My apologies for coming in with a late review, but I think there are
some serious internal inconsistencies in this document that leave me
unsure whether the document is in a reviewable form.  It might be
prudent to have the document return to the WG to fix the identified
issues and get additional review.

Specifically, there are several places in the document (most notably
Section 6.4) that provide steps for processing a RREP-DIO that refer to
the value of the "S bit".  There is no S bit in the RREP option as
defined in Section 4.2; indeed, there has never been an S bit in the
RREP option since it was introduced in the -03.  The -02 was proposing
changes directly in the DIO base object, which included an S bit, so in
that version of the document referring to an "S bit" in the reply
processing could have made sense.

There are also a few places that refer to using RREP (reply) processing
to relate to membership in or joining of the RREQ (request) DODAG.  I
assume that these are, in effect, typographical errors that should refer
to the RREP DODAG, but the one character has extreme significance to
protocol operations.

I also think that there is too much ambiguity relating to the processing
of RREPs in the symmetric vs asymmetric case (which returns to the
question of whether there is or should be an S bit in the RREP option).
In particular, the semantics of the Address Vector field (for the
source-routing case only, of course) vary.  In the symmetric case this
field is set by TargNode and propagated unchanged in the RREPs, but for
the asymmetric case each intermediate node needs to add its address in
the Address Vector.  We do cover these different behaviors in Sections
6.3.1 and 6.3.2, but leave it very unclear as to how an intermediate
node tells whether a received RREP is for the symmetric or asymmetric
case.  An explicit S bit would make this easy, of course, though it
seems like it *might* be possible to use whether the RREP was received
over a unicast or multicast address/interface as a stand-in.  However,
that technique would be complicated by the presence of gratuitous RREPs,
which are unicast in cases that do not quite line up with symmetric vs
asymmetric.  (Whether the processing behavior should reflect the "append
to address vector" or "propagate address vector unchanged" for the
gratuitous case is also not entirely clear to me.)
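
To make the two behaviors concrete, here is a minimal sketch of what an
intermediate node would do with the Address Vector when forwarding a
RREP-DIO.  The function name and the `symmetric` parameter are
hypothetical: `symmetric` stands for whatever signal (an explicit S bit,
or something inferred) the node would rely on, which is the very thing
the draft leaves unspecified.

```python
from typing import List


def next_address_vector(received: List[str], own_address: str,
                        symmetric: bool) -> List[str]:
    """Return the Address Vector to place in the forwarded RREP-DIO.

    `symmetric` is a hypothetical input; the draft does not say how an
    intermediate node learns it for a RREP-DIO.
    """
    if symmetric:
        # Symmetric case: TargNode set the vector; propagate it unchanged.
        return list(received)
    # Asymmetric case: each intermediate node appends its own address.
    return list(received) + [own_address]
```

Without a reliable value for `symmetric`, the node cannot choose between
the two branches, and the gratuitous RREP case has no obvious branch at
all.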

On a more minor note, I don't think the description of rollover in
Section 6.3.3 is correct.  More in the COMMENT, but in essence, even
though the shift is capped at 63, the instance ID can go up to 255 and
wrapping should occur at the instance ID boundary, not the shift
boundary.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

The Abstract and Introduction do not paint a very clear picture of what
is going to happen.  Section 3 helps a fair bit, but I would have
expected the introduction to mention that RREQ/RREP go in separate
(paired) RPL instances, and that instances are created (and destroyed?)
for each route being discovered.  (This would also be a great place to
clarify how AODV-RPL interacts with regular RPL, as was requested by
other ADs already.)

I would like to see a clearer picture of the relationship between the
lifetime of routes discovered using AODV-RPL and the lifetime of the
DODAGs used to build them.  The (non-infinite) DODAG lifetime options
are fairly short, and I would (perhaps naively) expect routes to have a
longer lifetime than that in many cases.  But it seems that the
information stored with a route includes the RPL InstanceID, and if the
route is to outlast the DODAG, then that information would become stale,
and I don't know what value there would be in keeping it around in that
case and risking collisions.  Is it expected that when routes are to be
long-lived, the L value of 0 is to be used?

Section 1

   (DAO) message of RPL.  AODV-RPL specifies a new MOP (Mode of
   Operation) running in a separate instance dedicated to discover P2P
   routes, which may differ from the point-to-multipoint routes
   discoverable by native RPL.  AODV-RPL can be operated whether or not

I don't really understand why we find it useful to make a comparison
between P2P routes and P2MP routes.

Section 2

   RREP-DIO message
      An AODV-RPL MOP DIO message containing the RREP option.  The
      RPLInstanceID in RREP-DIO is typically paired to the one in the

Typically, or actually (noting that §6.3.3 allows for the pairing
process to include a "Shift" count for cases where the value cannot
match exactly)?  Is this an attempt to reflect the symmetric case where
a DODAG is not built for symmetric routes?  If so, it's not clear that
we accurately portray what would be the "typical" case...but even in
that symmetric case we still have to populate the RPLInstance field in
the unicast RREP-DIO, and that still has the pairing logic.  So I'm back
to wondering when these would not be paired.

Section 3

   The routes discovered by AODV-RPL are not constrained to traverse a
   common ancestor.  AODV-RPL can enable asymmetric communication paths
   in networks with bidirectional asymmetric links.  For this purpose,

Can AODV-RPL function in networks with unidirectional links?

   to TargNode, and another from TargNode to OrigNode.  When possible,
   AODV-RPL also enables symmetric route discovery along Paired DODAGs
   (see Section 5).

In what circumstances is it not possible to do so?

Section 4.1

   OrigNode sets its IPv6 address in the DODAGID field of the RREQ-DIO
   message.  A RREQ-DIO message MUST carry exactly one RREQ option,
   otherwise it MUST be dropped.  (Similarly for RREP in §4.2.)

I suggest clarifying that other options are allowed (required, even).

Who sets the S bit, and can it change as the DODAG is being constructed?
("See Section 5" would be fine.)

   L
      2-bit unsigned integer determining the duration that a node is
      able to belong to the temporary DAG in RREQ-Instance, including
      the OrigNode and the TargNode.  Once the time is reached, a node
      MUST leave the DAG and stop sending or receiving any more DIOs for
      the temporary DODAG.

How do we account for time skew as the DIO propagates?  Does each node
just leave on its own timer?

   Address Vector
      A vector of IPv6 addresses representing the route that the RREQ-
      DIO has passed.  It is only present when the H bit is set to 0.
      The prefix of each address is elided according to the Compr field.

   TargNode can join the RREQ instance at a Rank whose integer portion
   is equal to the MaxRank.  Other nodes MUST NOT join a RREQ instance
   if its own Rank would be equal to or higher than MaxRank.  A router
   MUST discard a received RREQ if the integer part of the advertised
   Rank equals or exceeds the MaxRank limit.

Both of these descriptions might benefit from a bit more detail.  E.g.,
the latter paragraph doesn't say that TargNode can join if the rank is
less than MaxRank, only if it's equal.
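
A small sketch of the MaxRank paragraph, read literally, shows where the
text is silent.  The function and its parameters are hypothetical names
chosen for illustration; `None` marks a case for which the quoted text
gives no answer.

```python
from typing import Optional


def join_permitted(rank_int: int, max_rank: int,
                   is_targnode: bool) -> Optional[bool]:
    """Literal reading of the quoted MaxRank text (an interpretation).

    Returns True or False where the text gives an answer, and None
    where it says nothing.
    """
    if is_targnode:
        if rank_int == max_rank:
            return True   # explicitly allowed for TargNode
        return None       # any other Rank: not stated for TargNode
    # Other nodes MUST NOT join at a Rank equal to or higher than MaxRank.
    return rank_int < max_rank
```

Under this reading, `join_permitted(3, 5, True)` has no defined answer,
which is presumably not what is intended.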

Section 4.2

   H
      Requests either source routing (H=0) or hop-by-hop (H=1) for the
      downstream route.  It MUST be set to be the same as the H bit in
      RREQ option.

(editorial) I'd suggest putting the "MUST be the same" requirement as
the first sentence, and then the other sentence could be "determines
whether source routing (H=0) or hop-by-hop (H=1) is used for the
downstream route"

   L
      2-bit unsigned integer defined as in RREQ option.

Does L need to have the same value as in the triggering RREQ option?  If
not, when might TargNode choose a different value?

   Address Vector
      Only present when the H bit is set to 0.  For an asymmetric route,
      the Address Vector represents the IPv6 addresses of the route that
      the RREP-DIO has passed.  For a symmetric route, it is the Address
      Vector when the RREQ-DIO arrives at the TargNode, unchanged during
      the transmission to the OrigNode.

[ed. this was written before I made a discuss point about it, but I'm
leaving the text for the extra detail.  It's okay to just respond to the
discuss point and not here.]
If I understand correctly, the S bit indicating symmetric vs asymmetric
route is present only in the RREQ-DIO, and is not included in-band in
the RREP-DIO.  Does this require all nodes on the path to remember
whether a symmetric route is being constructed on the RREQ-DIO instance,
use the Shift in the RREP-DIO to correlate to the corresponding RREQ-DIO
and 'S' bit status, as part of the processing (to determine whether or
not to append to the Address Vector)?

Section 4.3

   Dest SeqNo

      In RREQ-DIO, if nonzero, it is the last known Sequence Number for
      TargNode for which a route is desired.  In RREP-DIO, it is the
      destination sequence number associated to the route.

The destination sequence number for the downstream route or the upstream
route?

Also, should we say that zero is used if there is no known information about
the sequence number of TargNode (and not otherwise)?

   r
      A one-bit reserved field.  This field MUST be initialized to zero
      by the sender and MUST be ignored by the receiver.

The secdir reviewer noted the mismatch between 'X' in the figure and 'r'
here; please fix.

   Prefix Length
      7-bit unsigned integer.  Number of valid leading bits in the IPv6
      Prefix.  If Prefix Length is 0, then the value in the Target
      Prefix / Address field represents an IPv6 address, not a prefix.

   Target Prefix / Address
      (variable-length field) An IPv6 destination address or prefix.
      The Prefix Length field contains the number of valid leading bits
      in the prefix.  The length of the field is the least number of
      octets that can contain all of the bits of the Prefix, in other
      words Floor((7+(Prefix Length))/8) octets.  The remaining bits in
      the Target Prefix / Address field after the prefix length (if any)
      MUST be set to zero on transmission and MUST be ignored on
      receipt.

Please specify how long the Address field is when Prefix Length is zero
(indicating that the last field is the Address variant).
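
Evaluating the quoted length formula makes the gap visible.  The
function name below is hypothetical; the arithmetic is exactly
Floor((7+(Prefix Length))/8).

```python
def target_field_octets(prefix_length: int) -> int:
    """Field length in octets according to the quoted formula."""
    if not 0 <= prefix_length <= 127:
        raise ValueError("Prefix Length is a 7-bit unsigned integer")
    return (7 + prefix_length) // 8


for n in (0, 1, 8, 64, 127):
    print(n, target_field_octets(n))
# 0 0
# 1 1
# 8 1
# 64 8
# 127 16
```

For Prefix Length 0 the formula yields zero octets, yet that value is
said to indicate a full IPv6 address, so the formula alone cannot be
what determines the field length in that case.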

Section 5

   Links are considered symmetric until additional information is
   collected.  [...]

What kinds of problems will arise if we start taking actions based on
this assumption before the "additional information" is available?
(That is to say, perhaps this is not a useful phrasing, since what we
actually do is get updates about the presence of asymmetric links as we
construct the route.)

   bit set to 1, then all the one-hop links on the route from the
   OrigNode O to this router meet the requirements of route discovery,

Re "the route", this would presumably be the one recorded in the Address
Vector of the RREQ in question?  (Multiple RREQs for the same route
computation can arrive at a given node with different address vectors,
right?)

Also, the way this is written implies that it does not say anything
about "non-one-hop links" on the route, but I don't really know what a
link that's not a one-hop link would be.  Can we just say "all the hops"
or "all the links"?

   and the route can be used symmetrically.

But does that matter for any routers other than TargNode (for any of the
AODV-RPL Target Options)?

   doesn't satisfy the Objective Function.  Based on the S bit received
   in RREQ-DIO, TargNode T determines whether or not the route is
   symmetric before transmitting the RREP-DIO message upstream towards
   the OrigNode O.

Does that determination affect the construction of the RREP-DIO in any
way?  (E.g., if there was an S bit.)

            Figure 5: AODV-RPL with Asymmetric Paired Instances

Some discussion of how the third(? second?) intermediate router detects
the asymmetry and clears the S bit might be appropriate.

Section 6.1

   link-local multicast.  The DIO MUST contain at least one ART Option
   (see Section 4.3).  The S bit in RREQ-DIO sent out by the OrigNode is
   set to 1.

I'd suggest saying that the required ART Option indicates the TargNode.

   OrigNode can maintain different RPLInstances to discover routes with
   different requirements to the same targets.  Using the RPLInstanceID
   pairing mechanism (see Section 6.3.3), route replies (RREP-DIOs) for
   different RPLInstances can be distinguished.

When using different RPLInstances for this purpose, what constitutes
"initiates a route discovery process" across those instances -- is it
permissible to only increment the sequence number once when initiating
multiple discovery processes on different instances?

Section 6.2.1

   Step 1:

      If the S bit in the received RREQ-DIO is set to 1, the router MUST
      determine whether each direction of the link (by which the RREQ-
      DIO is received) satisfies the Objective Function.  In case that
      the downward (i.e. towards the TargNode) direction of the link
      does not satisfy the Objective Function, the link can't be used
      symmetrically, thus the S bit of the RREQ-DIO to be sent out MUST
      be set as 0.  If the S bit in the received RREQ-DIO is set to 0,
      the router MUST determine into the upward direction (towards the
      OrigNode) of the link.

      If the upward direction of the link can satisfy the Objective
      Function, and the router's Rank would not exceed the MaxUseRank
      limit, the router joins the DODAG of the RREQ-Instance.  The
      router that transmitted the received RREQ-DIO is selected as the
      preferred parent.  Otherwise, if the Objective Function is not
      satisfied or the MaxUseRank limit is exceeded, the router MUST
      discard the received RREQ-DIO and MUST NOT join the DODAG.

The way this is written is confusing to me.  It seems to say that (1)
you only check the upward direction if the S bit in the received
RREQ-DIO is set to zero, and (2) the only time you join the DODAG is if
you're checking the upward direction.  So, when the received S-bit is 1,
do you just never join the DODAG?  I assume this is not the intent, but
that is how I interpret the words that are on the page.
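
One reading that would remove the ambiguity is sketched here.  It is an
assumption about the intent, not something the draft states: the upward
direction is evaluated regardless of the received S bit, and the S bit
only controls whether the downward direction is also checked.  All names
are hypothetical.

```python
from typing import NamedTuple


class Step1Result(NamedTuple):
    join: bool   # join the DODAG of the RREQ-Instance?
    s_out: int   # S bit for the RREQ-DIO to be sent (only used if join)


def step1(s_in: int, up_ok: bool, down_ok: bool,
          rank_ok: bool) -> Step1Result:
    """Assumed intent of Step 1; not confirmed by the draft text.

    up_ok / down_ok: whether that direction of the receiving link
    satisfies the Objective Function.  rank_ok: whether the router's
    Rank stays within the limit.
    """
    # The upward direction (towards OrigNode) gates joining in all cases.
    if not (up_ok and rank_ok):
        return Step1Result(join=False, s_out=0)  # discard, do not join
    # S stays 1 only if it arrived as 1 and the downward direction passes.
    s_out = 1 if (s_in == 1 and down_ok) else 0
    return Step1Result(join=True, s_out=s_out)
```

With this structure a router that receives S=1 can still join, which the
current wording does not clearly allow.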

      Sequence Number.  The Destination Address and the RPLInstanceID
      respectively can be learned from the DODAGID and the RPLInstanceID
      of the RREQ-DIO, and the Source Address is the address used by the
      local router to send data to the OrigNode.  The Next Hop is the

"Source Address is the address used by the local router to send data to
the OrigNode" seems like the definition of the source address in a route
table entry, not a procedure for how to set it.  Should this be the
address used by the local router to send data to the preferred parent?

Section 6.3.1

   implementation-specific and out of scope.  If the implementation
   selects the symmetric route, and the L bit is not 0, the TargNode MAY
   delay transmitting the RREP-DIO for duration RREP_WAIT_TIME to await
   a symmetric route with a lower Rank.  The value of RREP_WAIT_TIME is
   set by default to 1/4 of the time duration determined by the L bit.

There is no L *bit* in the RREQ option or the RFC 6550 DIO.  There is a
two-bit L field in the RREQ option, but even if I replace 'bit' with
'field', it's still not clear why having a DODAG with no lifetime limit
implies that delaying the RREP-DIO is not allowed.
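
The default can be written down as a short sketch, which also shows why
L == 0 needs explicit text.  The function name is hypothetical, and the
durations in the table are an assumption made only for illustration; the
authoritative values are in the draft's definition of the L field.

```python
from typing import Optional

# ASSUMED durations in seconds for L = 1, 2, 3, for illustration only.
# Check the draft's definition of the L field for the real values.
L_DURATION_S = {1: 16, 2: 64, 3: 256}


def default_rrep_wait_time(l_field: int) -> Optional[float]:
    """Default RREP_WAIT_TIME: 1/4 of the duration selected by L."""
    if l_field not in (0, 1, 2, 3):
        raise ValueError("L is a 2-bit field")
    if l_field == 0:
        # No lifetime limit: a quarter of an unbounded duration is
        # undefined, so some other default would have to be specified.
        return None
    return L_DURATION_S[l_field] / 4
```

Nothing in this calculation explains why a delay would be disallowed
when L is 0; it only shows that the stated default does not produce a
value there.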

Section 6.3.2

   When a RREQ-DIO arrives at a TargNode with the S bit set to 0, the
   TargNode MUST build a DODAG in the RREP-Instance rooted at itself in

I don't understand how the definite article is appropriate for "the
RREP-Instance rooted at itself" -- I thought there were multiple
(paired) instances corresponding to the various RREQ DODAGs that
requested routes to TargNode.

   RREP_WAIT_TIME to await a route with a lower Rank.  The value of
   RREP_WAIT_TIME is set by default to 1/4 of the time duration
   determined by the L bit.

("L bit" again, and no indication of what to do for L==0.)

   The settings of the fields in RREP option and ART option are the same
   as for the symmetric route, except for the S bit.

There is no S bit in the RREP.  What is this intending to say?

Section 6.3.3

   When preparing the RREP-DIO, a TargNode could find the RPLInstanceID
   to be used for the RREP-Instance is already occupied by another RPL
   Instance from an earlier route discovery operation which is still
   active.  In other words, it might happen that two distinct OrigNodes
   need routes to the same TargNode, and they happen to use the same
   RPLInstanceID for RREQ-Instance.  In this case, the occupied
   RPLInstanceID MUST NOT be used again.  [...]

A reminder might be helpful that the RPLInstanceID is a property of a
DODAG, and a DODAG is identified by the DODAGID, which in this case is
the address of the TargNode.  So that is why we need to avoid reusing
RPLInstanceID in the context of the RREP-DIO, whereas there is no
problem with collisions in RPLInstanceID across RREQ-DIOs (where the
DODAGID is the OrigNode address, that suffices to disambiguate).

   shift to be applied to original RPLInstanceID.  When the new
   RPLInstanceID after shifting exceeds 63, it rolls over starting at 0.

I thought RPLInstanceID was a full 8-bit field (even though Shift is
only six bits); wouldn't rollover happen after 255?

   For example, the original RPLInstanceID is 60, and shifted by 6, the
   new RPLInstanceID will be 2.  Related operations can be found in
   Section 6.4.

(So this example wouldn't actually show rollover.)
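
The difference between the two wrap points is easy to see numerically.
In this sketch the function name and the `modulus` parameter are
hypothetical; which modulus is correct is the open question.

```python
def paired_instance_id(original: int, shift: int, modulus: int) -> int:
    """Apply Shift to the original RPLInstanceID, wrapping at `modulus`."""
    if not 0 <= shift <= 63:
        raise ValueError("Shift is capped at 63")
    return (original + shift) % modulus


print(paired_instance_id(60, 6, 64))     # 2: wrap after 63, as in the draft
print(paired_instance_id(60, 6, 256))    # 66: no wrap with an 8-bit ID
print(paired_instance_id(250, 10, 256))  # 4: wrap after 255
```

If the instance ID really is a full 8-bit value, the draft's example of
60 shifted by 6 yields 66, and an example such as 250 shifted by 10 would
be needed to demonstrate rollover.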

Section 6.4

   Upon receiving a RREP-DIO, a router which does not belong to the
   RREQ-Instance goes through the following steps:

Do we care about RREQ-Instance membership or RREP-Instance membership,
for processing the RREP-DIO?

   Step 1:

      If the S bit is set to 1, the router MUST proceed to step 2.

There is no S bit in the RREP option!

      and the destination address is learned from the DODAGID.  The
      lifetime is set according to DODAG configuration (i.e., not the L
      bit) and can be extended when the route is actually used.  The

("L bit" again)

   Upon receiving a RREP-DIO, a router which already belongs to the
   RREQ-Instance SHOULD drop the RREP-DIO.

(RREQ-Instance vs RREP-Instance, again.)

Section 10

It seems like a malicious node that forges a gratuitous RREP could do
significant damage as well, so that might be worth mentioning.

   routing loop.  The TargNode MUST NOT generate a RREP if one of its
   addresses is present in the Address Vector.  An Intermediate Router
   MUST NOT forward a RREP if one of its addresses is present in the
   Address Vector.

These requirements seem important enough that I'd prefer to see them
imposed in the main body text that covers RREP handling, with the
security considerations here referring back to those handling
requirements.
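
The check itself is small, which is part of why it would fit naturally
in the RREP handling steps.  This is a minimal sketch with hypothetical
names; it assumes the addresses in the Address Vector have already been
expanded to a full, canonical text form so that plain string comparison
is meaningful.

```python
from typing import Iterable, Set


def rrep_allowed(own_addresses: Set[str],
                 address_vector: Iterable[str]) -> bool:
    """Return False if any of the node's addresses is in the vector.

    Assumes canonical, fully expanded address strings (an assumption of
    this sketch, since the draft allows prefix elision via Compr).
    """
    return own_addresses.isdisjoint(address_vector)
```

A TargNode would apply this before generating a RREP, and an
intermediate router before forwarding one.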