[Roll] Benjamin Kaduk's Discuss on draft-ietf-roll-aodv-rpl-11: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Fri, 29 October 2021 23:01 UTC


Benjamin Kaduk has entered the following ballot position for
draft-ietf-roll-aodv-rpl-11: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/blog/handling-iesg-ballot-positions/
for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-roll-aodv-rpl/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Thanks for the updates in the -11; they are a big improvement.

(1) I did make a point of looking into the procedures for determining
whether a route will be symmetric or asymmetric, and I'm running into
trouble for the case where an intermediate router determines that a link
cannot work as part of a symmetric route.

For concreteness, let's consider the following snippet of topology:

  +-------+                   +-------+                   +-------+
  |       |    asymmetric     |       |                   |       |
  |   A   +-------------------+   B   +===================+   C   |
  |       |       L1          |       |       L2          |       |
  +-------+                   +-------+                   +-------+

Suppose that an RREQ-DIO arrives at A with S=1 and H=0.  Per step 3 of
§6.2.1, if the link the RREQ-DIO arrived on satisfies the objective
function, the outgoing RREQ-DIO is also transmitted with S=1.
A also stores, as state associated with its RREQ-Instance, the value of
the S bit from the transmitted RREQ-DIO, i.e., 1.
That RREQ-DIO arrives at B, who performs the same check and determines
that L1 cannot satisfy the objective function.  Accordingly, B sends
out a RREQ-DIO to C with S=0 and stores the state S=0.  The RREQ-DIO
continues on through C to TargNode (or maybe C is TargNode; it may not
matter), and TargNode proceeds to follow asymmetric procedures,
initiating an RREP-DIO with an initially empty address vector.  That
RREP-DIO arrives at C, who has stored state S=0, so C joins the DODAG of
the RREP-Instance, adds its address to the address vector (per step 4 of
§6.4), and sends an RREP-DIO to B.  B likewise has stored state S=0,
adds its address to the address vector, and sends an RREP-DIO to A.  But
A has stored state S=1, so in step 4 of §6.4, A is looking for an
address in the address vector to use as the unicast target for A's
outgoing RREP-DIO.  But there is no such entry in the address vector,
because up to now the RREP-DIO has been using asymmetric procedures, and
has no data on how to get from A to OrigNode!
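
To make the sequence above concrete, here is a minimal sketch of the
A/B/C walk-through.  It is a toy model and not the draft's algorithm:
the Router class, the incoming_link_ok flag, and both helper functions
are hypothetical names, and the objective-function check is reduced to
a single boolean per router.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Router:
    name: str
    # Hypothetical flag: does the link the RREQ-DIO arrived on pass this
    # router's check in step 3 of section 6.2.1?
    incoming_link_ok: bool
    stored_s: Optional[int] = None  # S bit remembered for the RREQ-Instance


def forward_rreq(path: List[Router], s_in: int) -> int:
    """Pass an RREQ-DIO along `path`; each router stores the S bit it sends."""
    s = s_in
    for router in path:
        if s == 1 and not router.incoming_link_ok:
            s = 0
        router.stored_s = s
    return s


def return_rrep(path_back: List[Router]) -> Tuple[Optional[str], List[str]]:
    """Send an RREP-DIO that starts with an empty address vector back along
    `path_back`. Returns (router that cannot proceed or None, vector)."""
    vector: List[str] = []
    for router in path_back:
        if router.stored_s == 0:
            vector.append(router.name)  # asymmetric: append own address
        else:
            # Symmetric: the router expects the vector to name its next hop
            # toward OrigNode, but only hops on the TargNode side are present.
            return router.name, vector
    return None, vector


a = Router("A", incoming_link_ok=True)
b = Router("B", incoming_link_ok=False)  # L1 fails the check at B
c = Router("C", incoming_link_ok=True)

final_s = forward_rreq([a, b, c], s_in=1)
stuck, vector = return_rrep([c, b, a])
print(final_s, (a.stored_s, b.stored_s, c.stored_s), stuck, vector)
```

Under these assumptions A ends up holding S=1 while B and C hold S=0,
and the RREP-DIO reaches A carrying only the addresses of C and B.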

Now, it's certainly possible that I've made an error in the above.  But
even if I have, it still seems suggestive that this boundary behavior is
pretty complicated and hard to get right.  It seems like we should have
some similar discussion in the document to cover how this case does
actually work.

(And if I didn't make an error in the above, it seems to still be
salvageable, with B storing a sentinel value of "I changed from S=1 to
S=0" instead of just 1 or 0, and holding on to the (symmetric) address
vector from OrigNode to B.  Then B could perform translation from the
asymmetric to symmetric regime for the RREP and all routers on the path
would be able to install useful route entries.  But there's not anything
in the current text to suggest that B should be doing that.)

(2) I'm putting this in the Discuss section because I think it's
important for the authors/WG to produce an answer.  Since I've been
wrong about it at least once, I do not claim to know the correct answer,
and thus the Discuss point ought to be easy to resolve.

Section 6.3.3 says:

                          Instead, the RPLInstanceID MUST be replaced by
   another value so that the two RREP-instances can be distinguished.
   In RREP-DIO option, the Shift field of the RREP-DIO message(Figure 2)
   indicates the shift to be applied to original RPLInstanceID to obtain
   the replacement RPLInstanceID.  When the new RPLInstanceID after
   shifting exceeds 255, it rolls over starting at 0.  For example, if
   the original RPLInstanceID is 252, and shifted by 6, the new
   RPLInstanceID will be 2. [...]

I know that the use of 255 as the largest value here comes as a result
of my earlier review, but wanted to note that the resulting discussion
thread may not have fully concluded.
In particular, I now see that
https://datatracker.ietf.org/doc/html/rfc6550#section-5.1 does
indicate that only 6 bits of "usable" ID are present for local
RPLInstanceIDs, which seem to be the ones in use here.  Sorry to have
missed that in my initial review; I hope that we can figure out what the
actual correct boundary value is.
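
For illustration only, the two candidate boundaries can be compared with
a few lines of arithmetic.  This is a sketch, not a proposed resolution:
the function names are hypothetical, and the second function assumes
that a shift would apply only to the low 6 bits of a local
RPLInstanceID while leaving the top two bits untouched, which is just
one possible reading.

```python
def shift_full_octet(rpl_instance_id: int, shift: int) -> int:
    """Wrap over the whole 8-bit value, as in the quoted -11 text."""
    return (rpl_instance_id + shift) % 256


def shift_low_six_bits(rpl_instance_id: int, shift: int) -> int:
    """Assumed alternative: wrap only the 6-bit ID, keep the top two bits."""
    flags = rpl_instance_id & 0xC0
    local_id = rpl_instance_id & 0x3F
    return flags | ((local_id + shift) % 64)


print(shift_full_octet(252, 6))    # the draft's example: 252 shifted by 6
print(shift_low_six_bits(252, 6))  # same input under the 6-bit reading
```

The first call reproduces the draft's example result of 2; the second
yields a different octet because the top two bits are preserved, which
is why the choice of boundary matters on the wire.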


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Some of the text I comment on below is new in the -11, and I would have
hoped that WG review of the changes would have detected more of this
type of thing.  This leaves me uncertain what level of review the WG
actually performed, and I am considering balloting Abstain once my
discuss points are resolved.

Section 3

   to TargNode, and another from TargNode to OrigNode.  When possible,
   AODV-RPL also enables symmetric route discovery along Paired DODAGs
   (see Section 5).

(Modifying a comment I made on the -10) Perhaps we could say "AODV-RPL
also enables discovery of symmetric routes along Paired DODAGs when
symmetric routes are possible (see Section 5)"?

Section 4.2

   L
      2-bit unsigned integer defined as in RREQ option.

Per the discussion on my previous ballot thread, I suggest adding "The
lifetime of the RREP-Instance MUST be shorter than the lifetime of the
RREQ-Instance it is paired to" (or similar).

Section 4.3

   Target Prefix / Address
      (variable-length field) An IPv6 destination address or prefix.
      The Prefix Length field contains the number of valid leading bits
      in the prefix.  The Target Prefix / Address field contains the
      least number of octets that can represent all of the bits of the
      Prefix, in other words Ceil(Prefix Length/8) octets.  The initial
      bits in the Target Prefix / Address field preceding the prefix
      length (if any) MUST be set to zero on transmission and MUST be
      ignored on receipt.  If Prefix Length is zero, the Address field
      is 128 bits for IPv6 addresses.

This is a change from the -10 about where the target prefix is aligned
for prefix lengths that are not a multiple of 8.
I have no stance on which formulation is best, but it seems very
surprising to change the wire encoding in this manner at such a late
stage in the document lifecycle, without specific compelling reasoning.
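
As a small worked example of the length rule in the quoted text, the
sketch below computes how many octets the Target Prefix / Address field
occupies and how many padding bits that leaves.  The function names are
hypothetical, and the code deliberately takes no position on where the
padding bits sit within the field, since that alignment is the point in
question.

```python
def target_field_octets(prefix_length: int) -> int:
    """Octets in the Target Prefix / Address field per the quoted text."""
    if not 0 <= prefix_length <= 128:
        raise ValueError("prefix length must be in 0..128")
    if prefix_length == 0:
        return 16  # quoted text: the Address field is 128 bits
    return (prefix_length + 7) // 8  # Ceil(Prefix Length / 8)


def padding_bits(prefix_length: int) -> int:
    """Bits in the field that are not part of the prefix (0 when aligned)."""
    if prefix_length == 0:
        return 0
    return 8 * target_field_octets(prefix_length) - prefix_length


for length in (64, 61, 1, 128):
    print(length, target_field_octets(length), padding_bits(length))
```

Only prefix lengths that are not a multiple of 8 (61 and 1 here) have
padding bits, so those are the only encodings affected by the change.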

Section 6.2.1

   Step 1:

      The router MUST first determine whether to propagate the RREQ-DIO.
      It does this by determining whether or not the downstream
      direction of the incoming link satisfies the Objective Function
      (OF).  If not the RREQ-DIO MUST be dropped, and the following
      steps are not processed.  Otherwise, the router MUST join the
      RREQ-Instance and prepare to propagate the RREQ-DIO.  The upstream
      neighbor router that transmitted the received RREQ-DIO is selected
      as the preferred parent.
   [...]
   Step 3:

      If the S bit of the incoming RREQ-DIO is 0, then the route cannot
      be symmetric, and the S bit of the RREQ-DIO to be transmitted is
      set to 0.  Otherwise, the router MUST determine whether the
      downward (i.e., towards the TargNode) direction of the incoming
      link satisfies the OF.  If so, the S bit of the RREQ-DIO to be
      transmitted is set to 1.  Otherwise the S bit of the RREQ-DIO to
      be transmitted is set to 0.

The step 1 procedure checks the downstream direction of the incoming
link, and the step 3 procedure also checks the downstream direction of
the incoming link.  In order to assess whether the link works as a
symmetric link, I think that these checks need to be on different
directions of that link, but am not confident about which step should
check which direction.
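
One way to see the concern is to encode the two steps literally.  The
sketch below is a simplification with hypothetical function names: each
direction's Objective Function result is reduced to a boolean, and the
second function assigns the upstream check to step 3 purely for
illustration, since which step should check which direction is exactly
what is unclear.

```python
from itertools import product
from typing import Optional


def rreq_literal(s_in: int, downstream_ok: bool) -> Optional[int]:
    """Steps 1 and 3 as quoted: both test the downstream direction.
    Returns the outgoing S bit, or None if the RREQ-DIO is dropped."""
    if not downstream_ok:  # Step 1: drop
        return None
    if s_in == 0:          # Step 3: already asymmetric
        return 0
    return 1 if downstream_ok else 0  # Step 3: same check as step 1


def rreq_two_directions(s_in: int, downstream_ok: bool,
                        upstream_ok: bool) -> Optional[int]:
    """Illustrative variant: step 3 tests the other direction instead."""
    if not downstream_ok:
        return None
    if s_in == 0:
        return 0
    return 1 if upstream_ok else 0


# With the literal reading, a forwarded RREQ-DIO never changes its S bit.
for s_in, down in product((0, 1), (False, True)):
    print(s_in, down, rreq_literal(s_in, down))
```

In the literal reading the step 3 test can never fail once step 1 has
passed, so S=1 is never cleared; in the variant, a link that is usable
in only one direction turns S=1 into S=0.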

Section 6.3

                                                              If the
   implementation selects the symmetric route, and the L field is not 0,
   the TargNode MAY delay transmitting the RREP-DIO for duration
   RREP_WAIT_TIME to await a route with a lower Rank.  The value of

In the -10 the text allowing waiting was present in both the section for
the symmetric case and the section for the asymmetric case; is the
conditional "if the implementation selects the symmetric route" correct?

Section 6.3.2

   When a RREQ-DIO arrives at a TargNode with the S bit set to 0, the
   TargNode MUST build a DODAG in the RREP-Instance corresponding to the
   RREQ-DIO, rooted at itself in order to discover the downstream route

nit: this comma is misplaced (and may not be needed at all).

Section 6.4

   Upon receiving a RREP-DIO, a router performs the following steps:

   Step 1:

      If the Objective Function is not satisfied, the router MUST NOT
      join the DODAG; the router MUST discard the RREQ-DIO, and does not

s/RREQ/RREP/?

      If the S-bit of the RREQ-Instance is set to 0, the router MUST
      determine whether the downward direction of the link (towards the
      TargNode) over which the RREP-DIO is received satisfies the
      Objective Function, and the router's Rank would not exceed the
      MaxRank limit.  If so, the router joins the DODAG of the RREP-
      Instance.  The router that transmitted the received RREP-DIO is
      selected as the preferred parent.  Afterwards, other RREP-DIO
      messages can be received.

Please confirm whether "downward direction" is correct.  It seems to me
that in determining whether to join the RREP-Instance, we need to check
whether the "reply path" (from TargNode to OrigNode) is feasible, and
skip joining the instance if it's not a feasible path.  But the text
written here seems to be checking the feasibility of the "request path",
the same direction that was checked in §6.2.1 when deciding whether to
join the RREQ-Instance.

Section 10

Thanks for acting on my previous comment and moving the normative
requirements on nodes to not emit RREPs if they have an address in the
address vector already!  I still think it would be worth some text here
in the security considerations about what goes wrong if those checks are
skipped (I think a routing loop occurs, but that's something of a
guess).

It seems that if Compr is set too large, there is some risk of a node
failing to check that it shares that many bits of address prefix with
the address in the DODAGID and thus decompression would produce an
incorrect route.
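
A rough sketch of that risk follows.  It assumes that Compr counts
leading octets elided from each address-vector entry and that
decompression prepends the same number of octets from the DODAGID; the
function names and the example addresses (from the 2001:db8::/32
documentation prefix) are hypothetical.

```python
import ipaddress


def shares_prefix(address: ipaddress.IPv6Address,
                  dodagid: ipaddress.IPv6Address, compr: int) -> bool:
    """The check a node could make before eliding `compr` octets."""
    return address.packed[:compr] == dodagid.packed[:compr]


def decompress(dodagid: ipaddress.IPv6Address, compr: int,
               suffix: bytes) -> ipaddress.IPv6Address:
    """Rebuild a full address from the DODAGID prefix and the kept octets."""
    if len(suffix) != 16 - compr:
        raise ValueError("suffix length does not match Compr")
    return ipaddress.IPv6Address(dodagid.packed[:compr] + suffix)


dodagid = ipaddress.IPv6Address("2001:db8:0:1::1")
node = ipaddress.IPv6Address("2001:db8:0:2::42")  # differs in the 8th octet

# Compr = 8 without the check: the rebuilt address is not the node's own.
wrong = decompress(dodagid, 8, node.packed[8:])
# Compr = 7: the node really shares 7 octets, so the round trip is exact.
right = decompress(dodagid, 7, node.packed[7:])
print(shares_prefix(node, dodagid, 8), wrong)
print(shares_prefix(node, dodagid, 7), right)
```

With Compr set to 8 the node's entry decompresses to an address in the
DODAGID's /64 rather than its own, which is the kind of incorrect route
described above.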

   If a rogue router is able to forge a gratuitous RREP, significant
   damage might result.

Would this damage be in the form of traffic amplification, routing loop,
DoS of certain (key) nodes, ...?