[babel] Benjamin Kaduk's Discuss on draft-ietf-babel-rfc6126bis-12: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Wed, 07 August 2019 22:13 UTC

Return-Path: <noreply@ietf.org>
X-Original-To: babel@ietf.org
Delivered-To: babel@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id E84DC1200DF; Wed, 7 Aug 2019 15:13:18 -0700 (PDT)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-babel-rfc6126bis@ietf.org, Donald Eastlake <d3e3e3@gmail.com>, babel-chairs@ietf.org, d3e3e3@gmail.com, babel@ietf.org
X-Test-IDTracker: no
X-IETF-IDTracker: 6.100.0
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <156521599894.8313.13827924927219698158.idtracker@ietfa.amsl.com>
Date: Wed, 07 Aug 2019 15:13:18 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/babel/4t6jBbGUlh9XlHGZR8FAsNMrus4>
Subject: [babel] Benjamin Kaduk's Discuss on draft-ietf-babel-rfc6126bis-12: (with DISCUSS and COMMENT)
X-BeenThere: babel@ietf.org
X-Mailman-Version: 2.1.29
List-Id: "A list for discussion of the Babel Routing Protocol." <babel.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/babel>, <mailto:babel-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/babel/>
List-Post: <mailto:babel@ietf.org>
List-Help: <mailto:babel-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/babel>, <mailto:babel-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 07 Aug 2019 22:13:19 -0000

Benjamin Kaduk has entered the following ballot position for
draft-ietf-babel-rfc6126bis-12: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-babel-rfc6126bis/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

I don't think that all of the arithmetic specified in Section 3.2.1 is
well defined.  Specifcally, the formulations involving bitwise AND
assume that the input to the bitwise AND is nonnegative, which does not
seem to be implied by the other stated constraints.  (For example, an
"integer n" may well be negative.)  Some discussion of the
representation of negative integers would then be needed, and then
whether the mathematical operation is performed in an abstract
infinite-precision machine or in a realizable approximation, etc..  It
might be simpler to just use the modular arithmetic flavor and avoid any
of the issues that can arise when providing two alternative definitions
that are intended to be equivalent (since there is always a risk of edge
cases).

Section 3.5.2 needs to explicitly say that the c and m arguments to M()
are the local link cost and the advertised metric, e.g., "the function
M(c, m) used for computing a metric from a locally computed link cost c
and the metric m advertised by a neighbor".

Section 3.8.2.1 notes that "[d]ue to duplicate suppression, only a small
number of such requests will actually reach the source." (for seqno
requests intending to avoid starvation).  But Section 3.8.1.2 only has a
SHOULD-level requirement to suppress duplicate seqno requests, so I
think there is an internal inconsistency.

I think we may need to have a discussion about the feasibility of
multicast acknowledgment requests with only a 16-bit nonce.  With random
assignment of nonces the risk of birthday collisions becomes
uncomfortably large, and non-random assignments are likely to have worse
pathologies.  (A pointer to a previous discussion of this topic would,
of course, short-circuit a lot of it if not all of it.)  Are we willing
to make hard assumptions about the maximum size of a multicast domain
and the risk of collision we are willing to accept?

The discussion in Section 4.6.9 of computing the prefix from an Update
message (and parser state) seems a little underspecified when the prefix
length is not a multiple of 8 bits.  (Additionally, "Plen" is not
described as measuring bits, explicitly, for any of the PDU descriptions
that I remember.)  Specifically, the "Prefix" description does not
mention that any trailing bits must be set to zero, but the subsequent
discussion about the prefix is "computed as follows" refers to
assembling the prefix as a collection of octets, including trailing zero
octets, implying that the computed prefix is the full length of the
address type.

I appreciate that we have some discussion in Section 4.5 about the need
for a stateful parser for the babel packet body; this seems like one of
the riskiest areas of the protocol from the implementation perspective.
However, I think it would be even more helpful to explicitly call out
what pieces of state are needed, what protocol elements affect the
state, and what ordering requirements (or non-requirements) there are
for the interactions between the different protocol elements that affect
parser state.  Can we have a discussion about whether it's appropriate
to add some text along these lines?


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Should there be a "changes since RFC 6126" section that is retained in
the published RFC?  (I assume that Appendix F is going to be dropped.)

The secdir review has some good thoughts (e.g., tracking "link-local"
IPv4 addresses, discussion of non-protection from hostile insiders), but
I don't see a response to it.

We use the phrase "a small multiple of" a few times, but I don't
remember seeing any concrete guidance for what factor to use.  Is it
intended to be closer to 1.1 or to 4?

In a related vein, there are many places in the document where the
precise details of processing are left intentionally underspecified
(e.g., computing a link's cost).  I understand that due to the protocol
guarantees the needed routing will still be achieved even if nodes use
different parameters and algorithms in these cases, but do we expect the
details to be chosen on a per-implementation basis, or in profile
documents, or even left up to operator configuration on a per-node
basis?

Section 1

The introduction should mention obsoleting 6126 and 7557, in addition to
doing so in the abstract.

Section 1.1

   Finally, Babel is a hybrid routing protocol, in the sense that it can
   carry routes for multiple network-layer protocols (IPv4 and IPv6),
   whichever protocol the Babel packets are themselves being carried
   over.

nit: I think "regardless of which" is better than "whichever", since the
latter might erroneously imply that there is a consistency requirement
of the carried routes and the carrying protocol.

Section 1.2

   Second, unless the optional algorithm described in Section 3.5.5 is
   implemented, Babel does impose a hold time when a prefix is

Similarly to my comment on the applicability doc, I'm not sure if
there's one or two things in Section 3.5.5 that would match this
description.

Section 2

   Conceptually, Bellman-Ford is executed in parallel for every source
   of routing information (destination of data traffic).  In the
   following discussion, we fix a source S; the reader will recall that
   the same algorithm is executed for all sources.

Just to check my understanding: this "source S" is a source of routing
information, not a source of data-plane traffic being routed?

Section 2.4

Is there a reference for AODV?

   To show that this feasibility condition still guarantees loop-
   freedom, recall that at the time when A accepts an update from B, the
   metric D(B) announced by B is no smaller than FD(B); since it is
   smaller than FD(A), at that point in time FD(B) < FD(A).  Since this
   property is preserved when A sends updates, it remains true at all
   times, which ensures that the forwarding graph has no loops.

I'm trying to walk through this and missing a step or two.  "the metric
D(B) announced by B is no smaller than FD(B)" is pretty clear, since
FD(B) is just the minimum value of D(B) over time thus far.  But I'm not
sure I follow how A can preserve the property FD(B) < FD(A) when A sends
updates.  Clearly FD(B(T')) <= FD(B(T0)) for any time T' after T0, but
suppose FD(B) remains constant but A is off interacting with some other
node C and finds a great path via C, which correspondingly causes D(A)
to reduce.  Can I get into a situation where
D(A) < FD(B) <= D(A) + C(A,B) (and thus, the subsequent
FD(A) < FD(B) <= FD(A) + C(A,B)) if A does not interact with B during
that time?

Section 2.5

Using the minusculeu and majuscule forms of the same letter to mean
different things (e.g., source S and sequence number s) is something of
a readability anti-pattern.

Section 3.2.6

It would probably be helpful to readers to note that "neighbor that
advertised" and "next-hop" can be different due to being different
address families.  (For the same address family, they are generally
going to be the same, modulo weird network-layer technologies, right?)

Section 3.5.1

(side note: I got a bit confused reading this section and had to go
double-check several definitions, due to the qualitative difference
between the "metric" and "metric'" under comparison.  Namely, the
"metric" is for the path from neighbor to S, but the "metric'" is for
the path from the current node to S, and so in some sense they are
"measuring different things".  Perhaps using "FD" instead of "metric'"
would help disambiguate.  I understand that this a fairly common pattern
for routing protocols, though, so don't necessarily expect any change to
the text.)

   router-id.  Feasibility distances are maintained in the source table,
   the exact procedure is given in Section 3.7.3.

nit: this is a comma splice.

Section 3.5.2

   Note that while strict monotonicity is essential to the integrity of
   the network (persistent routing loops may arise if it is not
   satisfied), left distributivity is not: if it is not satisfied, Babel
   will still converge to a loop-free configuration, but might not reach
   a global optimum (in fact, a global optimum may not even exist).

I might even go so far as to say that a global optimum "will likely not
exist", though this is fairly qualitative/intuitive since we don't
define a configuration space or metric over it in which to evaluate the
probability.

Section 3.5.4

We don't seem to use the "link cost value equal to cost" anywhere in
this section, so maybe it is superfluous.

   If such an entry exists:

   o  if the entry is currently selected, the update is unfeasible, and
      the router-id of the update is equal to the router-id of the
      entry, then the update MAY be ignored;

I guess the idea is that we can keep the old one around until it would
time out, since the initial timeout value for it means it should still
be workable until our timer expires, but it's only a MAY in case we want
to be more proactive about noticing that the advertised metric is now
unfeasible?  It might be worth saying a bit about when we might/might
not want to heed the MAY.

Section 3.5.5

   o  sending a retraction with an acknowledgment request (Section 3.3)
      to every reachable neighbour that has not explicitly retracted
      prefix P and waiting for all acknowledgments.

nit(?): I'd suggest a comma before "and waiting for all
acknowledgments", since that's the final gating factor to achieve the
goal.

   The former option is simpler and ensures that at that point, any
   routes for prefix P pointing at the current node have expired.
   However, since the expiry time can be as high as a few minutes, doing
   that prevents automatic aggregation by creating spurious black-holes
   for aggregated routes.  The latter option is RECOMMENDED as it
   dramatically reduces the time for which a prefix is unreachable in
   the presence of aggregated routes.

nit: I don't think this "prevents automatic aggregation" at a technical
level, but rather that it "makes automatic aggregation rather unusable
in practice" since if automatic aggregation is used, any route
retraction will result in a spurious blackhole for the (minutes) expiry
time, which is unacceptable for most environments.

Section 3.7

   Additionally, in order to ensure that any black-holes are reliably
   cleared in a timely manner, a Babel node sends retractions (updates
   with an infinite metric) for any recently retracted prefixes.

Is the sending of retractions the one described by the SHOULDs in 3.7.2?
If so, I'm not sure that "a Babel node sends retractions for any
recently retracted prefixes" is quite accurate (since SHOULD is not a
mandatory requirement); "can send" or "will generally send" might be
better.

Section 3.7.1

   Every Babel speaker periodically advertises all of its selected
   routes on all of its interfaces, including any recently retracted
   routes.  Since Babel doesn't suffer from routing loops (there is no
   "counting to infinity") and relies heavily on triggered updates
   (Section 3.7.2), this full dump only needs to happen infrequently.

Part of the need for the full dump stems from the potential for
unreliable links, right?  Do we want to mention that relationship here,
(and that if there are particularly unreliable links the frequency may
need to be more often)?

Section 3.8.1.2

We haven't introduced "hop count" yet and just mention it in passing
here as "[if the] hop count is 2 or more".

Intuitively, it seems like the routr should send an update if the
router-ids match and the requested seqno is equal to the route entry's
seqno, but I don't see this case covered in the current text.

   o  otherwise, if the node has one or more (not necessarily feasible)
      routes to the requested prefix with a next hop that is not the

nit: I think the parenthetical can just be "not feasible", as any
feasible routes in question would have matched the previous bullet
point.

   neighbours.  However, if a seqno request is resent by its originator,
   the subsequent copies MAY be forwarded to a different neighbour than
   the initial one.

Is MAY the appropriate level of strength?  Trying the same neighbor
would be effective if the original was unsuccessful due to packet loss,
but is it possible for a routing pathology to occur that directs the
request in the "wrong direction" with respect to a link or node failure?

Section 3.8.2.4

Is it worth giving some informal guidance about not sending multicast
wildcard requests if a node observes others doing the same around the
same time (or similar) to avoid the "serious congestion" issues?

Section 4.2

   A Babel packet consists of a 4-octet header, followed by a sequence
   of TLVs (the packet body), optionally followed by a second sequence
   of TLVs (the packet trailer).

Without mention of the 'body length' field here, a reader might be
confused at what distinguishes the body TLVs from the trailer TLVs.

   The packet body and trailer are both sequences of TLVs.  The packet
   Ibody is the normal place to store TLVs; the packet trailer only
   contains specialised TLVs that do not need to be protected by
   cryptographic security mechanisms.

I think we need a more explicit statement that the body structure is
subject to change when security mechanisms are in use, to allow for
potential confidentiality-protecting cryptographic mechanisms.

Section 4.3

Length is still in octets, right?

Section 4.4

   Every TLV carries an explicit length in its header; however, most
   TLVs are self-terminating, in the sense that it is possible to
   determine the length of the body without reference to the explicit
   Length field.  If a TLV has a self-terminating format, then it MAY
   allow a sequence of sub-TLVs to follow the body.

This seems like a statement of fact, for which a lowercase "may" is
perfectly adequate.

   Sub-TLVs have the same structure as TLVs.  With the exception of
   PAD1, all TLVs have the following structure:

I was going to complain that it's somewhat unfortunate to use the same
name for a thing that's a TLV and a thing that's a sub-TLV, even if they
have identical encodings.  But then I noticed that in this (sub-TLV)
section we spell it "PAD1" and in the previous (TLV) section we spell it
"Pad1", which are different.  On the gripping hand, Sections 4.6.1 and
4.7.1 both spell it "Pad1", which are the same.  So a little bit of
effort rationalizing things would go a long way.

   The most-significant bit of the sub-TLV, called the mandatory bit,

Just to be clear: this is the MSB of the 'type' octet?

Also, for similar features in other protocols I've suggested the
clarifying language of "comprehension-mandatory" which seems to more
accurately reflect the corresponding behavior.

Section 4.5

   Since the parser state is separate from the bulk of Babel's state,
   and since for correct parsing it must be identical across
   implementations, it is updated before checking for mandatory TLVs:

nit: "mandatory sub-TLVs" (right?)

Section 4.6.2

   MBZ       Set to 0 on transmission.

Is it legal for a receiver to check and abort if any bits are nonzero?

Section 4.6.3

Sixteen bits of nonce does not provide much unguessability (I note that
LISP's rfc6830bis is recommending that their 24-bit nonce echo
functionality not be relied on for return-routability checks over the
public Internet).  However, since these acknowledgment exchanges are
only between direct neighbors, it seems that they are only needed for
correlating responses to requests and not for unguessability.  (In this
case it seems a sequence number would work just as well as a random
number, and we might want to discourage random assignment in the text to
avoid the risk of birthday collisions.)
On the other hand, multicast acknowledgment requests could be
problematic (and especially so when sequential nonces are used), and if
they are intended to be allowed then we may need to consider using a
larger and random nonce.

Section 4.6.6

I'm getting some sever cognitive dissonance between the "Rxcost" field
and the "carrying a link's transmission cost" statement.  Also, in

   Rxcost    The rxcost according to the sending node of the interface
             whose address is specified in the Address field.  The value
             FFFF hexadecimal (infinity) indicates that this interface
             is unreachable.

if I insert commas to get "The rxcost, according to the sending node [of
the TLV], of the interface whose address is specified in the Address
field", does that preserve the intended meaning?
nit/aside: It also feels like there's a bit of a mismatch here, in that
the "rxcost of the interface" probably means the local interface (from
the perspective of the sender), but that interface is being identified
by the *remote* address (again, from the perspective of the sender of
the TLV).  So maybe "whose remote address" could resolve the mismatch
I'm perceiving?  (Or maybe I'm completely misunderstanding, of course.)

   Interval  An upper bound, expressed in centiseconds, on the time
             after which the sending node will send a new IHU; this MUST
             NOT be 0.  [...]

To check my understanding: are the IHUs conceptually a reply to Hellos,
such that if the Hellos stopped arriving then the peer would stop
sending IHUs in response?  I understand that their intervals are set
completely independently, so there is not a direct causal relationship,
but I'm trying to check whether the quoted sentence is a strict
commitment by the sender of the IHU or could be rescinded due to
external events.

Section 4.6.9

   If the Metric field is finite, the router-id of the originating node
   for this announcement is taken from the prefix advertised by this
   Update if the Router-Id flag is set, computed as described above.
   Otherwise, it is taken either from the preceding Router-Id packet, or
   the preceding Update packet with the Router-Id flag set, whichever
   comes last, even if that TLV is otherwise ignored due to an unknown
   mandatory sub-TLV.

Both cases of "packet" here should be "TLV", right?  Otherwise we have
to scope what set of previous packets are applicable to this route
(since we get a lot of packets, for a lot of different routes).

Section 5

"Specification Required" also requires Expert Review.  What guidance can
we provide to the experts for making registration decisions?

Section 6

It's a little disappointing that we provide four different PDUs for
padding but then have no discussion of privacy considerations related to
(potentially encrypted) packet length, and when (else) one might want to
pad, and what padding policy might look like.  I understand that padding
policy remains something of an open research question, but even
acknowledging that can still be useful.

Is an attacker's capability limited to misdirecting traffic?  Can it
cause traffic to be blackholed or cause routing loops by falsifying
protocol data either modified in transit or originating false data?
What are the effects of an attacker completely or selectively dropping
protocol data?  In essence, please flesh out "completely insecure" with
a bit more detail.

   The information that a Babel node announces to the whole routing
   domain is often sufficient to determine a mobile node's physical
   location with reasonable precision.  The privacy issues that this
   causes can be mitigated somewhat by using randomly chosen router-ids
   and randomly chosen IP addresses, and changing them periodically.

"periodically" may not be the best advice; coupling such changes to
mobility events is likely to be more effective at preserving privacy.
(QUIC has discussed related topics quite extensively, though there's
enough traffic in the archives that I can neither point you at a
specific thread or recommend searching for it.)

Section 8.2

I think at least BABEL-HMAC needs to be normative, since it is
RECOMMENDED.

Section A.1

If we're talking about "appending bits" to the history fields, maybe
describing them as fixed-length queues or something makes more sense
than vectors.

If the field is maintained in a 16-bit integer, what is done for the
previously erased bits when we "undo history"?

   Whenever either Hello timer associated to a neighbour expires, the
   local node adds a 0 bit to this neighbour's Hello history, and

We keep two hello histories; we should clarify that the one in question
is the one corresponding to the timer that expired.

Section A.2.2

I don't understand the origin of the '256' in the MIN(1, 256/txcost)
formula (described as a probability estimate).

I think a lot more work is needed to convince me that the two given
formulae for "cost" are equivalent (especially given that 'rxcost' only
appears once in the entire section, in the second formula).

Section A.3.2

Is k "allowed to" (I know this section is just informative) vary on
non-external data, such as the route or link in question?

Appendix C

I could see this content in the main body of the document.