Re: [babel] Benjamin Kaduk's Discuss on draft-ietf-babel-rfc6126bis-12: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Wed, 14 August 2019 23:05 UTC

Date: Wed, 14 Aug 2019 18:05:14 -0500
From: Benjamin Kaduk <kaduk@mit.edu>
To: Juliusz Chroboczek <jch@irif.fr>
Cc: The IESG <iesg@ietf.org>, draft-ietf-babel-rfc6126bis@ietf.org, Donald Eastlake <d3e3e3@gmail.com>, babel-chairs@ietf.org, babel@ietf.org
Message-ID: <20190814230513.GD88236@kduck.mit.edu>
References: <156521599894.8313.13827924927219698158.idtracker@ietfa.amsl.com> <87v9v57pjm.wl-jch@irif.fr>
Archived-At: <https://mailarchive.ietf.org/arch/msg/babel/GJBwnL62je6LC3nU1DSl5ZupbVw>

On Sat, Aug 10, 2019 at 03:54:53AM +0200, Juliusz Chroboczek wrote:
> Dear Benjamin,
> 
> Thank you very much for your detailed review.
> 
> > ----------------------------------------------------------------------
> > DISCUSS:
> > ----------------------------------------------------------------------
> 
> > I don't think that all of the arithmetic specified in Section 3.2.1 is
> > well defined.  Specifically, the formulations involving bitwise AND
> > assume that the input to the bitwise AND is nonnegative, which does not
> > seem to be implied by the other stated constraints.  (For example, an
> > "integer n" may well be negative.)
> 
> I've added "nonnegative" (we never add a negative integer to a seqno, not
> even when we undo history in Appendix A.2).
> 
> (The formulation is correct even for negative numbers independently of
> precision if we assume two's complement.)

Agreed, and thanks.

> > It might be simpler to just use the modular arithmetic flavor
> 
> I agree that it would be simpler, but modular arithmetic is a common
> source of bugs, especially in languages where the modulo operation does
> not yield a nonnegative integer (grr).  The bitwise formulations
> constitute useful guidance for the implementer.

Okay.
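
For illustration, here is a minimal Python sketch of the two formulations
discussed above.  It assumes 16-bit seqnos and the "less than" test
0 < ((s' - s) AND 0xFFFF) < 0x8000; the function names are hypothetical and
are not taken from the draft.  The helper c_style_remainder imitates a
remainder operator that truncates toward zero, which is the kind of modulo
operation the reply warns about.

```python
SEQNO_MASK = 0xFFFF  # assumption: seqnos are 16-bit values


def seqno_add(s: int, n: int) -> int:
    """Return s + n as a seqno, using the bitwise-AND formulation."""
    return (s + n) & SEQNO_MASK


def seqno_less(s: int, s_prime: int) -> bool:
    """Return True when s precedes s_prime in modulo-2^16 ordering."""
    return 0 < ((s_prime - s) & SEQNO_MASK) < 0x8000


def c_style_remainder(a: int, m: int) -> int:
    """Remainder truncated toward zero (can be negative for negative a)."""
    return a - m * int(a / m)


# The AND form stays in range even when the intermediate value is negative,
# because Python integers behave like two's complement under bitwise AND.
wrapped = (0 - 0xFFFF) & SEQNO_MASK

# A truncating remainder does not: it returns a negative number here.
truncated = c_style_remainder(-1, 0x10000)
```

With these definitions, seqno_less(0xFFFF, 0) holds because the difference
wraps around to 1, while the truncating remainder of -1 is -1 rather than
0xFFFF.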

> > Section 3.5.2 needs to explicitly say that the c and m arguments to M()
> > are the local link cost and the advertised metric,
> 
> Done.
> 
> > Section 3.8.2.1 notes that "[d]ue to duplicate suppression, only a small
> > number of such requests will actually reach the source." (for seqno
> > requests intending to avoid starvation).  But Section 3.8.1.2 only has a
> > SHOULD-level requirement to suppress duplicate seqno requests, so I
> > think there is an internal inconsistency.
> 
> The idea here is that it's a pretty strong SHOULD -- you'd need to be
> really constrained for resources to not implement it.  I'll leave it as
> it stands, if that's okay with you.

It still feels like an internal inconsistency to me.  How would you feel
about s/will actually reach/are expected to actually reach/?

> > I think we may need to have a discussion about the feasibility of
> > multicast acknowledgment requests with only a 16-bit nonce.
> 
> Section 3.3.  An acknowledgment MUST be sent to a unicast destination.

I saw that, which is why I specified "acknowledgment requests".

> > The discussion in Section 4.6.9 of computing the prefix from an Update
> > message (and parser state) seems a little underspecified when the prefix
> > length is not a multiple of 8 bits.
> 
> Agreed, I've added the requirement to clear these bits.

Thanks.
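
As a rough sketch of what "clear these bits" could look like in code, the
Python function below rebuilds a full prefix from an Update's wire bytes and
the parser's default prefix, then zeroes any bits beyond Plen.  The function
name, the parameter names and the 16-octet default address length are
assumptions made for this example; only Plen (counted in bits) and the idea
of octets omitted from the front of the prefix come from the discussion.

```python
def reconstruct_prefix(default_prefix: bytes, omitted: int,
                       wire_bytes: bytes, plen: int,
                       addr_len: int = 16) -> bytes:
    """Rebuild a full-length prefix from compressed Update contents.

    default_prefix: prefix remembered by the parser from an earlier Update
    omitted:        number of leading octets taken from default_prefix
    wire_bytes:     prefix octets actually carried in this Update
    plen:           prefix length in bits
    addr_len:       address length in octets (16 is assumed for IPv6)
    """
    if not 0 <= plen <= addr_len * 8:
        raise ValueError("prefix length out of range")
    nbytes = (plen + 7) // 8  # octets needed to hold plen bits
    if omitted > nbytes or omitted > len(default_prefix):
        raise ValueError("too many omitted octets")
    if len(wire_bytes) != nbytes - omitted:
        raise ValueError("unexpected number of prefix octets")

    prefix = bytearray(default_prefix[:omitted] + wire_bytes)

    # Clear the bits past plen in the last octet when plen is not a
    # multiple of 8, so equal prefixes always compare equal.
    spare_bits = nbytes * 8 - plen
    if spare_bits:
        prefix[-1] &= (0xFF << spare_bits) & 0xFF

    # Pad with zero octets up to the full address length.
    prefix.extend(bytes(addr_len - nbytes))
    return bytes(prefix)


default = bytes.fromhex("20010db8000000000000000000000000")
full = reconstruct_prefix(default, 4, bytes([0xAB, 0xFF]), 44)
```

In the usage lines, a 44-bit prefix occupies six octets; four come from the
default prefix, two from the wire, and the low four bits of the sixth octet
are cleared.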

> > (Additionally, "Plen" is not described as measuring bits, explicitly,
> > for any of the PDU descriptions that I remember.)
> 
> Fixed.
> 
> > I appreciate that we have some discussion in Section 4.5 about the need
> > for a stateful parser for the babel packet body; this seems like one of
> > the riskiest areas of the protocol from the implementation perspective.
> 
> I fully agree.  I've always had serious misgivings about this encoding,
> and we did discuss deprecating it in 6126bis.  Here's some background.
> 
> On the one hand, the encoding is inelegant and easy to get wrong, and
> hinders the extensibility of the protocol.  On the other hand, it is
> dramatically effective in some kinds of networks (networks carrying large
> numbers of IPv6 host routes sharing a common prefix), leading to
> a reduction in the amount of data being sent, on the order of 40%.  
> Instead of carrying 40 prefixes in a packet, you carry 60.
> 
> We did consider an alternative, which was to start the packet with a set
> of common prefixes that the individual updates could refer to.  However,
> this turned out to have similar complexity at the parser, while making the
> formatter slightly more complex.
> 
> An on-list poll of the implementers active at the time (from memory, so
> don't hold me accountable):
> 
>   - Markus said "the stateful encoding is not that bad";
>   - Toke declared he's okay with parsing the encoding, but his
>     implementation is not going to send any compressed addresses;
>   - I don't remember if David expressed an opinion, but since he's into
>     wireless networks, I'd expect him to be in favour of keeping it.
> 
> So we kept it.
> 
> > However, I think it would be even more helpful to explicitly call out
> > what pieces of state are needed, what protocol elements affect the
> > state, and what ordering requirements (or non-requirements) there are
> > for the interactions between the different protocol elements that affect
> > parser state.  Can we have a discussion about whether it's appropriate
> > to add some text along these lines?
> 
> Sure, we may have a discussion.
> 
> > ----------------------------------------------------------------------
> > COMMENT:
> > ----------------------------------------------------------------------
> 
> > Should there be a "changes since RFC 6126" section that is retained in
> > the published RFC?  (I assume that Appendix F is going to be dropped.)
> 
> Is that a hard requirement?  We've updated all implementations, so it
> wouldn't be too useful to anyone, and it would be a fair amount of work.

This is the non-blocking Comment section, so no, it's not a hard
requirement.  My personal interest is to be able to get a summary of the
protocol's evolution, what worked and what didn't work, and such, but I
don't know what kind of other demand there is for such a thing.  On the
other hand, putting in the effort to do it now means it's done for
everyone, and we aren't stuck with each interested person having to do the
research themself.

> > The secdir review has some good thoughts (e.g., tracking "link-local"
> > IPv4 addresses, discussion of non-protection from hostile insiders), but
> > I don't see a response to it.
> 
> The review was sent to us off-list, and we replied in kind.  Which points
> exactly did you find useful, and how would you like to see them integrated
> in the document?

Huh, I got a copy on the list
(https://mailarchive.ietf.org/arch/msg/secdir/YiUUmWlDlmIGpMoo8cMHu1k8WaM#)
and it looks like it should have gone to you as well.
What I might want to see integrated in the document depends on the nature
of the discussion triggered by the review.  The points about doing
local-address (subnet?) tracking for IPv4 and the potential for discussion
about the potential harm caused by a hostile insider are most poignant to
me, and (relatedly) the risk of attack from anywhere in the v4 internet.
The points about integration of the DTLS mechanism with DNS/OCSP/etc. are
interesting, but of course probably a better fit for the DTLS document, and
the general topic of physical location probably merits some discussion (but
IIRC we have covered that fairly well already).

> > We use the phrase "a small multiple of" a few times, but I don't
> > remember seeing any concrete guidance for what factor to use.  Is it
> > intended to be closer to 1.1 or to 4?
> 
> It's 3.5, see Appendix B.  I've added refs at the relevant places.

Ah, thanks.
 
> > In a related vein, there are many places in the document where the
> > precise details of processing are left intentionally underspecified
> > (e.g., computing a link's cost).  I understand that due to the protocol
> > guarantees the needed routing will still be achieved even if nodes use
> > different parameters and algorithms in these cases,
> 
> Right.
> 
> > but do we expect the details to be chosen on a per-implementation basis,
> > or in profile documents, or even left up to operator configuration on
> > a per-node basis?
> 
> I liken the situation to that of BGP, where routing policies are left to
> the implementation or to the network administrator.  In the currently
> extant implementations, we use different algorithms depending on the link
> layer (Ethernet, WiFi or tunnel).
> 
> In this document, we take the following approach:
> 
>   - the normative text stresses the properties that the algorithms used
>     MUST meet;
>   - we give examples of algorithms that work well in Appendix A;
>   - we warn the implementer to be careful.
> 
> From an empirical point of view, this has worked pretty well: independent
> reimplementations interoperate.

Sure.  I'll leave it up to you whether it's worth putting in some text
about the expected granularity of policy specification.

> > Section 1
> 
> > The introduction should mention obsoleting 6126 and 7557, in addition to
> > doing so in the abstract.
> 
> Done.
> 
> > Section 1.1
> 
> >    Finally, Babel is a hybrid routing protocol, in the sense that it can
> >    carry routes for multiple network-layer protocols (IPv4 and IPv6),
> >    whichever protocol the Babel packets are themselves being carried
> >    over.
> 
> > nit: I think "regardless of which" is better than "whichever",
> 
> Agreed, done.
> 
> > Section 1.2
> 
> >    Second, unless the optional algorithm described in Section 3.5.5 is
> >    implemented, Babel does impose a hold time when a prefix is
> 
> > Similarly to my comment on the applicability doc, I'm not sure if
> > there's one or two things in Section 3.5.5 that would match this
> > description.
> 
> I'm not sure what's missing.  3.5.5 clearly states that either you wait
> for a fixed timeout, or you do something that doesn't involve waiting for
> a fixed timeout.

Grammatically, an "optional algorithm" is a single thing.  When I look at
3.5.5, I see two things and am thinking to myself "which one do I pick?".
It sounds like you want me to treat it as if there's a single algorithm,
and the algorithm is "arbitrarily choose one of these two things".  It
would be a lot easier for me to read if we said something like "the
optional algorithm for how long to maintain an infinite metric" or "one of
the two options described".  Maybe I'm the only one confused by this,
though; I can't say.

> > Section 2
> 
> >    Conceptually, Bellman-Ford is executed in parallel for every source
> >    of routing information (destination of data traffic).  In the
> >    following discussion, we fix a source S; the reader will recall that
> >    the same algorithm is executed for all sources.
> 
> > Just to check my understanding: this "source S" is a source of routing
> > information, not a source of data-plane traffic being routed?
> 
> Yes.  Added parenthetical.

Thanks!

> > Section 2.4
> 
> > Is there a reference for AODV?
> 
> Added.
> 
> >    To show that this feasibility condition still guarantees loop-
> >    freedom, recall that at the time when A accepts an update from B, the
> >    metric D(B) announced by B is no smaller than FD(B); since it is
> >    smaller than FD(A), at that point in time FD(B) < FD(A).  Since this
> >    property is preserved when A sends updates, it remains true at all
> >    times, which ensures that the forwarding graph has no loops.
> 
> > I'm trying to walk through this and missing a step or two.  "the metric
> > D(B) announced by B is no smaller than FD(B)" is pretty clear, since
> > FD(B) is just the minimum value of D(B) over time thus far.  But I'm not
> > sure I follow how A can preserve the property FD(B) < FD(A) when A sends
> > updates.  Clearly FD(B(T')) <= FD(B(T0)) for any time T' after T0, but
> > suppose FD(B) remains constant but A is off interacting with some other
> > node C and finds a great path via C, which correspondingly causes D(A)
> > to reduce.  Can I get into a situation where
> > D(A) < FD(B) <= D(A) + C(A,B) (and thus, the subsequent
> > FD(A) < FD(B) <= FD(A) + C(A,B)) if A does not interact with B during
> > that time?
> 
> We want to prove that the following property is preserved:
> 
>     P: NH(A) = B implies FD(B) < FD(A)
> 
> Assume that initially the following are true:
> 
>     NH(A) = B                  (i)
>     FD(B) < FD(A)              (ii)
> 
> A receives an announcement from C.  Then either
> 
>   - A doesn't switch its next hop, in which case D(A) doesn't change,
>     and so neither does FD(A); since FD(B) is nonincreasing, (ii) is still
>     true, and P is true; or
> 
>   - A sets NH(A) := C, so (i) becomes false, and P is true.

Okay.  So it sounds like I'm supposed to read "accepts an update from B"
and interpret that as meaning "that causes NH(A) to be B" as opposed to,
say, "this is valid metric data and I am recomputing my routing table in
response"?  I can get behind that conclusion, but don't know if that's just
a term of art I missed or some clarification should be added.

> > Section 2.5
> 
> > Using the minuscule and majuscule forms of the same letter to mean
> > different things (e.g., source S and sequence number s) is something of
> > a readability anti-pattern.
> 
> I agree.  We need Greek letters in RFCs.  (No Gothic, please.)

sadly, RFC 7997 doesn't really seem to support this cause :-/

> > Section 3.2.6
> 
> > It would probably be helpful to readers to note that "neighbor that
> > advertised" and "next-hop" can be different due to being different
> > address families.
> 
> They are completely different data structures.  The neighbour is (a
> reference to) an entry of the neighbour table, the NH is an IP address.

That's true.  Do we want to make it more clear to the reader that is going
through things quickly?

> > Section 3.5.1
> 
> > (side note: I got a bit confused reading this section and had to go
> > double-check several definitions, due to the qualitative difference
> > between the "metric" and "metric'" under comparison.  Namely, the
> > "metric" is for the path from neighbor to S, but the "metric'" is for
> > the path from the current node to S, and so in some sense they are
> > "measuring different things".
> 
> Yeah, it's tricky.  This is written with the implementer in mind, who's
> going to be manipulating, in C notation,
> 
>   update->metric   (metric)
>   source->metric   (metric')
> 
> Since this is the crucial part of the algorithm, it's written in a style
> that attempts to make it as easy as possible to check the implementation
> against the spec -- you can basically transliterate the RFC into your
> favourite programming language.
> 
> > Perhaps using "FD" instead of "metric'" would help disambiguate.
> 
> I think we're fairly consistent at using "metric" for a metric and
> "distance" for a pair (seqno, metric).  Let me know if you find any
> counter-examples.

I don't remember any, nor do I have any suggestions for improving
readability without breaking these desired properties.
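
To make the metric versus metric' distinction concrete, here is a small
Python sketch of a feasibility check in the "transliterate the spec" style
described above.  It assumes the usual statement of the condition: an update
is feasible if it is a retraction, if no source-table entry exists, if the
entry's seqno is older than the update's, or if the seqnos are equal and the
update's metric is strictly smaller.  The names and the tuple layout are
illustrative assumptions, not the draft's data structures.

```python
from typing import Optional, Tuple

INFINITY = 0xFFFF  # metric value used for retractions


def seqno_less(s: int, s_prime: int) -> bool:
    """Return True when s precedes s_prime in modulo-2^16 ordering."""
    return 0 < ((s_prime - s) & 0xFFFF) < 0x8000


def is_feasible(update_seqno: int, update_metric: int,
                source: Optional[Tuple[int, int]]) -> bool:
    """Check an update (seqno, metric) against a source-table entry.

    source is the feasibility distance (seqno', metric') stored for the
    same prefix and router-id, or None when no entry exists.
    """
    if update_metric == INFINITY:
        return True  # retractions are always feasible
    if source is None:
        return True  # nothing recorded yet for this source
    source_seqno, source_metric = source
    if seqno_less(source_seqno, update_seqno):
        return True  # the update carries a newer seqno
    # Same seqno: the advertised metric must be strictly smaller.
    return update_seqno == source_seqno and update_metric < source_metric
```

Here update_metric plays the role of update->metric (the neighbour's
advertised metric) and the second element of source plays the role of
source->metric (the stored feasibility distance).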

> >    router-id.  Feasibility distances are maintained in the source table,
> >    the exact procedure is given in Section 3.7.3.
> 
> > nit: this is a comma splice.
> 
> I've made it into a colon; I don't like semicolons.
> 
> > Section 3.5.2
> 
> >    Note that while strict monotonicity is essential to the integrity of
> >    the network (persistent routing loops may arise if it is not
> >    satisfied), left distributivity is not: if it is not satisfied, Babel
> >    will still converge to a loop-free configuration, but might not reach
> >    a global optimum (in fact, a global optimum may not even exist).
> 
> > I might even go so far as to say that a global optimum "will likely not
> > exist", though this is fairly qualitative/intuitive since we don't
> > define a configuration space or metric over it in which to evaluate the
> > probability.
> 
> I agree with your intuition.  I think it should be possible to give
> a proof for random graphs, but it's not obvious to me whether they are
> representative of real networks.  If you're interested, you should have
> a chat with Sobrinho.
> 
> > Section 3.5.4
> 
> > We don't seem to use the "link cost value equal to cost" anywhere in
> > this section, so maybe it is superfluous.
> 
> Good catch, thanks.  Fixed.
> 
> >    If such an entry exists:
> 
> >    o  if the entry is currently selected, the update is unfeasible, and
> >       the router-id of the update is equal to the router-id of the
> >       entry, then the update MAY be ignored;
> 
> > I guess the idea is that we can keep the old one around until it would
> > time out, since the initial timeout value for it means it should still
> > be workable until our timer expires, but it's only a MAY in case we want
> > to be more proactive about noticing that the advertised metric is now
> > unfeasible?
> 
> In this case, the local node needs to predict the future in order to make
> the optimal decision.  The minor details of predicting the future are left
> to the implementation.  However, the consequences of getting it wrong are
> harmless, so predicting the future is left at MAY level.  This is unlike
> Brexit.
> 
> For this case to trigger, the neighbour needs to have increased its metric
> enough to make us unfeasible (intuitively, the feasibility condition is
> able to buffer up to one hop of metric instability), but without itself
> becoming unfeasible (otherwise it would have sent us a new seqno).  That
> means that there's some instability exactly one hop upstream.  And now
> you need to predict the future:
> 
>   - either this is just a short-term fluctuation, the metric will decrease
>     at the next update, so it's best to stick to the current route;
> 
>   - or this is indicative of our current route getting bad, so it's better
>     to drop it and start hunting for a better one.
> 
> My intuition is that it's best to ignore the MAY unless your current route
> is the only route to the destination, but I don't have any hard data to
> back it.  At any rate, it's a fairly rare edge case, one that's not going
> to happen much in real networks, and both choices are correct.  The MAY is
> intended to communicate that the implementer shouldn't bother with this
> case, unless he knows better.

Thanks for the extra discussion; I agree with your conclusions.

> > Section 3.5.5
> 
> >    o  sending a retraction with an acknowledgment request (Section 3.3)
> >       to every reachable neighbour that has not explicitly retracted
> >       prefix P and waiting for all acknowledgments.
> 
> > nit(?): I'd suggest a comma before "and waiting for all
> > acknowledgments", since that's the final gating factor to achieve the
> > goal.
> 
> Ack.
> 
> >    The former option is simpler and ensures that at that point, any
> >    routes for prefix P pointing at the current node have expired.
> >    However, since the expiry time can be as high as a few minutes, doing
> >    that prevents automatic aggregation by creating spurious black-holes
> >    for aggregated routes.  The latter option is RECOMMENDED as it
> >    dramatically reduces the time for which a prefix is unreachable in
> >    the presence of aggregated routes.
> 
> > nit: I don't think this "prevents automatic aggregation" at a technical
> > level, but rather that it "makes automatic aggregation rather unusable
> > in practice" since if automatic aggregation is used, any route
> > retraction will result in a spurious blackhole for the (minutes) expiry
> > time, which is unacceptable for most environments.
> 
> From a technical point of view, you're right.  I'm leaving the current
> formulation, though, I want to be very clear that it doesn't work in
> practice (or at least I don't know how to make it work).

Okay.

> (It pains me.  I know of at least one (not public) application of Babel
> where automatic aggregation would be useful.  So if anyone has any ideas
> about how to make it work, I'm listening.)
> 
> > Section 3.7
> 
> >    Additionally, in order to ensure that any black-holes are reliably
> >    cleared in a timely manner, a Babel node sends retractions (updates
> >    with an infinite metric) for any recently retracted prefixes.
> 
> > Is the sending of retractions the one described by the SHOULDs in 3.7.2?
> > If so, I'm not sure that "a Babel node sends retractions for any
> > recently retracted prefixes" is quite accurate (since SHOULD is not a
> > mandatory requirement); "can send" or "will generally send" might be
> > better.
> 
> Agreed, tweaked.
> 
> > Section 3.7.1
> 
> >    Every Babel speaker periodically advertises all of its selected
> >    routes on all of its interfaces, including any recently retracted
> >    routes.  Since Babel doesn't suffer from routing loops (there is no
> >    "counting to infinity") and relies heavily on triggered updates
> >    (Section 3.7.2), this full dump only needs to happen infrequently.
> 
> > Part of the need for the full dump stems from the potential for
> > unreliable links, right?
> 
> My intuition was initially the same as yours, but unreliable links turn
> out to be the last of our problems.  We'd be using Acks more extensively
> otherwise.
> 
> The main issue is recovery after mobility: the node has moved away, it has
> lost all of its neighbours, you need to rediscover all routes.  If you're
> using link-quality estimation, you cannot easily detect this situation, so
> you cannot simply send a wildcard request.  Until you receive a full
> update, you're not going to switch to your new neighbours.

Ah, good to know.

> > Do we want to mention that relationship here, (and that if there are
> > particularly unreliable links the frequency may need to be more often)?
> 
> I've tried to clarify this in the new version of Appendix B.
> 
> > Section 3.8.1.2
> 
> > We haven't introduced "hop count" yet and just mention it in passing
> > here as "[if the] hop count is 2 or more".
> 
> Done.
> 
> > Intuitively, it seems like the router should send an update if the
> > router-ids match and the requested seqno is equal to the route entry's
> > seqno, but I don't see this case covered in the current text.
> 
> >    o  otherwise, if the node has one or more (not necessarily feasible)
> >       routes to the requested prefix with a next hop that is not the
> 
> > nit: I think the parenthetical can just be "not feasible", as any
> > feasible routes in question would have matched the previous bullet
> > point.
> 
> Done.
> 
> >    neighbours.  However, if a seqno request is resent by its originator,
> >    the subsequent copies MAY be forwarded to a different neighbour than
> >    the initial one.
> 
> > Is MAY the appropriate level of strength?  Trying the same neighbor
> > would be effective if the original was unsuccessful due to packet loss,
> > but is it possible for a routing pathology to occur that directs the
> > request in the "wrong direction" with respect to a link or node failure?
> 
> Yes, that's possible if multiple nodes become unfeasible simultaneously.
> (If that happens, then the mechanism in 3.8.2.2 will eventually clear the
> blackhole.)
> 
> From an implementation point of view, you route each seqno request
> independently.  The MAY in this section simply means that you don't need
> to keep track of the neighbour you previously sent the request to.

Hmm.  I might actually suggest a non-normative "may", then, if the idea is
to have the decisions be completely independent.  To me, the "MAY" suggests
a granting of permission to deviate from the expected baseline (the latter
being that it always gets sent to the same neighbor).

> > Section 3.8.2.4
> 
> > Is it worth giving some informal guidance about not sending multicast
> > wildcard requests if a node observes others doing the same around the
> > same time (or similar) to avoid the "serious congestion" issues?
> 
> This section has been removed.
> 
> > Section 4.2
> 
> >    A Babel packet consists of a 4-octet header, followed by a sequence
> >    of TLVs (the packet body), optionally followed by a second sequence
> >    of TLVs (the packet trailer).
> 
> > Without mention of the 'body length' field here, a reader might be
> > confused at what distinguishes the body TLVs from the trailer TLVs.
> 
> I'm leaving it as it stands, I think the text is clear.
> 
> >    The packet body and trailer are both sequences of TLVs.  The packet
> >    body is the normal place to store TLVs; the packet trailer only
> >    contains specialised TLVs that do not need to be protected by
> >    cryptographic security mechanisms.
> 
> > I think we need a more explicit statement that the body structure is
> > subject to change when security mechanisms are in use, to allow for
> > potential confidentiality-protecting cryptographic mechanisms.
> 
> > Section 4.3
> 
> > Length is still in octets, right?
> 
> Fixed.
> 
> > Section 4.4
> 
> >    Every TLV carries an explicit length in its header; however, most
> >    TLVs are self-terminating, in the sense that it is possible to
> >    determine the length of the body without reference to the explicit
> >    Length field.  If a TLV has a self-terminating format, then it MAY
> >    allow a sequence of sub-TLVs to follow the body.
> 
> > This seems like a statement of fact, for which a lowercase "may" is
> > perfectly adequate.
> 
> Fixed.
> 
> >    Sub-TLVs have the same structure as TLVs.  With the exception of
> >    PAD1, all TLVs have the following structure:
> 
> > I was going to complain that it's somewhat unfortunate to use the same
> > name for a thing that's a TLV and a thing that's a sub-TLV, even if they
> > have identical encodings.  But then I noticed that in this (sub-TLV)
> > section we spell it "PAD1" and in the previous (TLV) section we spell it
> > "Pad1", which are different.  On the gripping hand, Sections 4.6.1 and
> > 4.7.1 both spell it "Pad1", which are the same.  So a little bit of
> > effort rationalizing things would go a long way.
> 
> This is meant to be Pad1.  In practice, we have found that there is no
> confusion.
> 
> >    The most-significant bit of the sub-TLV, called the mandatory bit,
> 
> > Just to be clear: this is the MSB of the 'type' octet?
> 
> This has been clarified.
> 
> > Also, for similar features in other protocols I've suggested the
> > clarifying language of "comprehension-mandatory" which seems to more
> > accurately reflect the corresponding behavior.
> 
> I'm afraid it's too long, people won't use it in conversation.
> 
> > Section 4.5
> 
> >    Since the parser state is separate from the bulk of Babel's state,
> >    and since for correct parsing it must be identical across
> >    implementations, it is updated before checking for mandatory TLVs:
> 
> > nit: "mandatory sub-TLVs" (right?)
> 
> Fixed, thanks.
> 
> > Section 4.6.2
> 
> >    MBZ       Set to 0 on transmission.
> 
> > Is it legal for a receiver to check and abort if any bits are nonzero?
> 
> If we want to be consistent with the rest of Babel, it is legal to check
> and to ignore this TLV.  Since this TLV is ignored in any case, the
> distinction is somewhat uninteresting.
> 
> (I guess you're thinking about adding noise for security purposes.  Please
> define a new TLV if that's required, I prefer protocol extensions to be
> explicit about their intent.)

I was originally going for steganographic-like side channels, but that's an
option, too.  I'll wait until I can think up a use for the noise before I
write that draft, though ;)

> > Section 4.6.3
> 
> > Sixteen bits of nonce does not provide much unguessability (I note that
> > LISP's rfc6830bis is recommending that their 24-bit nonce echo
> > functionality not be relied on for return-routability checks over the
> > public Internet).  However, since these acknowledgment exchanges are
> > only between direct neighbors, it seems that they are only needed for
> > correlating responses to requests and not for unguessability.  (In this
> > case it seems a sequence number would work just as well as a random
> > number, and we might want to discourage random assignment in the text to
> > avoid the risk of birthday collisions.)
> > On the other hand, multicast acknowledgment requests could be
> > problematic (and especially so when sequential nonces are used), and if
> > they are intended to be allowed then we may need to consider using a
> > larger and random nonce.
> 
> Section 3.3:
> 
>    An acknowledgment MUST be sent to a unicast destination.

I understand that.  Nonetheless, any time that an attacker C (e.g., on a
shared link) can send a message to A that changes how A interacts with B,
that puts up a flag that we need to think about the interaction more
carefully.  In this case, *probably* B will ignore spurious acks from A,
but is that always true?  Is that the only case we need to consider?
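
One way an implementation could ignore spurious acknowledgments is to accept
an ack only when both the sender and the nonce match an outstanding request.
The Python sketch below illustrates that idea under stated assumptions: the
class and method names are hypothetical, neighbours are identified by an
address string, and nonces are assigned sequentially purely for simplicity.
Nothing in this thread confirms that existing implementations work this way.

```python
class AckTracker:
    """Track outstanding acknowledgment requests per neighbour."""

    def __init__(self) -> None:
        self._pending = set()  # set of (neighbour, nonce) pairs
        self._next_nonce = 0   # sequential 16-bit nonce (an assumption)

    def request(self, neighbour: str) -> int:
        """Record a new acknowledgment request and return its nonce."""
        nonce = self._next_nonce
        self._next_nonce = (self._next_nonce + 1) & 0xFFFF
        self._pending.add((neighbour, nonce))
        return nonce

    def on_ack(self, sender: str, nonce: int) -> bool:
        """Return True only if the ack matches an outstanding request."""
        key = (sender, nonce)
        if key in self._pending:
            self._pending.remove(key)
            return True
        return False  # unknown sender/nonce pair: treat as spurious


tracker = AckTracker()
nonce_for_b = tracker.request("fe80::b")
spoofed = tracker.on_ack("fe80::c", nonce_for_b)   # wrong sender
genuine = tracker.on_ack("fe80::b", nonce_for_b)   # matches the request
replayed = tracker.on_ack("fe80::b", nonce_for_b)  # already consumed
```

Keying on the sender as well as the nonce only helps if the sender's address
cannot be forged on the link, which is exactly the kind of interaction the
question above is probing.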

> > Section 4.6.6
> 
> > I'm getting some severe cognitive dissonance between the "Rxcost" field
> > and the "carrying a link's transmission cost" statement.  Also, in
> 
> >    Rxcost    The rxcost according to the sending node of the interface
> >              whose address is specified in the Address field.  The value
> >              FFFF hexadecimal (infinity) indicates that this interface
> >              is unreachable.
> 
> > if I insert commas to get "The rxcost, according to the sending node [of
> > the TLV], of the interface whose address is specified in the Address
> > field", does that preserve the intended meaning?
> > nit/aside: It also feels like there's a bit of a mismatch here, in that
> > the "rxcost of the interface" probably means the local interface (from
> > the perspective of the sender), but that interface is being identified
> > by the *remote* address (again, from the perspective of the sender of
> > the TLV).  So maybe "whose remote address" could resolve the mismatch
> > I'm perceiving?  (Or maybe I'm completely misunderstanding, of course.)
> 
> >    Interval  An upper bound, expressed in centiseconds, on the time
> >              after which the sending node will send a new IHU; this MUST
> >              NOT be 0.  [...]
> 
> > To check my understanding: are the IHUs conceptually a reply to Hellos,
> > such that if the Hellos stopped arriving then the peer would stop
> > sending IHUs in response?  I understand that their intervals are set
> > completely independently, so there is not a direct causal relationship,
> > but I'm trying to check whether the quoted sentence is a strict
> > commitment by the sender of the IHU or could be rescinded due to
> > external events.
> 
> This is used to set the IHU timer, described in 3.4.2:
> 
>    When a neighbour's IHU timer expires, the neighbour's txcost is set to
>    infinity.

Okay.  It sounds like my above description is incorrect, then, and
conceptually the IHU is a pure periodic beacon.  I'll go a bit further and
tell myself that IHU is for reachability confirmation and Hello is for
neighbor detection, though I'm sure that's not exactly right; IHU also has
the benefit of sending the rxcost values (which are themselves determined
by monitoring Hello receipt).

> > Section 4.6.9
> 
> >    If the Metric field is finite, the router-id of the originating node
> >    for this announcement is taken from the prefix advertised by this
> >    Update if the Router-Id flag is set, computed as described above.
> >    Otherwise, it is taken either from the preceding Router-Id packet, or
> >    the preceding Update packet with the Router-Id flag set, whichever
> >    comes last, even if that TLV is otherwise ignored due to an unknown
> >    mandatory sub-TLV.
> 
> > Both cases of "packet" here should be "TLV", right?
> 
> Fixed, thanks.
> 
> > Section 5
> 
> > "Specification Required" also requires Expert Review.  What guidance can
> > we provide to the experts for making registration decisions?
> 
> I don't think there's WG consensus on this subject.  I am in favour of
> being very liberal (since the alternative incurs the risk of people
> squatting our codepoints), but if memory serves at least one WG member
> argued in favour of "RFC required".
> 
> I'm not too worried, though.  Right now, there is only one nonofficial
> protocol extension used in production that I know of, and it's completely
> incompatible with the protocol (it doesn't use sub-TLVs, it savagely
> appends extra data to the Update TLV).  No self-respecting expert would
> approve such an extension.
> 
> I think we can deal with that issue when the problem occurs.

It's unlikely to be a timely response if we do that; I expect an RFC would
be needed in order to change registry policy/guidance, and that would take
at least a few months.

> > Section 6
> 
> This section has been completely rewritten.
> 
> > "periodically" may not be the best advice; coupling such changes to
> > mobility events is likely to be more effective at preserving privacy.
> > (QUIC has discussed related topics quite extensively, though there's
> > enough traffic in the archives that I can neither point you at a
> > specific thread nor recommend searching for it.)
> 
> Changed to "often enough".
> 
> > Section 8.2
> 
> > I think at least BABEL-HMAC needs to be normative, since it is
> > RECOMMENDED.
> 
> Done.
> 
> > Section A.1
> 
> > If we're talking about "appending bits" to the history fields, maybe
> > describing them as fixed-length queues or something makes more sense
> > than vectors.
> 
> I think the current formulation is clear.
> 
> > If the field is maintained in a 16-bit integer, what is done for the
> > previously erased bits when we "undo history"?
> 
> It doesn't matter, these are low-order bits, they're not going to
> contribute to any of the computed values.  0 is the obvious value.
> 
> >    Whenever either Hello timer associated to a neighbour expires, the
> >    local node adds a 0 bit to this neighbour's Hello history, and
> 
> > We keep two hello histories; we should clarify that the one in question
> > is the one corresponding to the timer that expired.
> 
> Done.
> 
> > Section A.2.2
> 
> > I don't understand the origin of the '256' in the MIN(1, 256/txcost)
> > formula (described as a probability estimate).
> 
> We scale the values to fit in a 16-bit integer field without excessive
> loss of precision.

Sorry; I'm still confused.  If alpha is a "probability estimate", shouldn't
it be between 0 and 1?  MIN(1, .) then serves to cap the top of the range,
so I want 256/txcost to be between 0 and 1, aka txcost to be larger than
256.  I see that we send the rxcost(/txcost) values in the 16-bit integer
IHU field, and use 0xffff to indicate infinity, but within that 0..2^16-2
range, assignment of costs is left fairly arbitrary.  So a scaling factor
would presumably also be arbitrary, but why is this
partial-scale-and-partial-cap procedure useful versus just scaling over the
full 16-bit range into a probability estimate from 0 to 1?
I also don't see how this operation is scaling "to fit in a 16-bit
integer"; presumably alpha is not stored as a 16-bit integer!
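
To make the partial-scale-and-partial-cap behaviour concrete, here is a
minimal sketch of how I read the formula.  The function name and the sample
txcost values are mine, chosen only for illustration; nothing here is taken
from the draft beyond MIN(1, 256/txcost).

```python
def alpha(txcost: int) -> float:
    """My reading of alpha = MIN(1, 256 / txcost); the name is illustrative."""
    if txcost <= 0:
        raise ValueError("txcost must be positive")
    return min(1.0, 256 / txcost)


# Sample values (assumed, not from the draft): anything at or below 256 is
# capped at 1, and only larger costs map into the open interval (0, 1).
for txcost in (96, 256, 512, 1024, 0xFFFE):
    print(txcost, alpha(txcost))
```

So every txcost in 1..256 collapses to the same estimate of 1, which is the
part I am having trouble squaring with "scaling".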

> > I think a lot more work is needed to convince me that the two given
> > formulae for "cost" are equivalent (especially given that 'rxcost' only
> > appears once in the entire section, in the second formula).
> 
> I've added explicit mention of the fact that rxcost = 256 / beta, which is
> possibly what you were missing.

Yup, that's the missing link.

>   256/(alpha * beta) = 256 / (MIN(1, 256 / txcost) * beta)
>                      = MAX(256 / 1, 256 / (256 / txcost)) * (rxcost / 256)
>                      = (MAX(txcost, 256) * rxcost) / 256
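
A quick way to sanity-check the derivation above is to evaluate both forms
with exact rational arithmetic.  This is only a sketch: the function names
are mine, and it assumes beta = 256 / rxcost, which is the relation the last
line of the derivation needs in order to come out as written.

```python
from fractions import Fraction


def cost_from_probabilities(txcost: int, rxcost: int) -> Fraction:
    """cost = 256 / (alpha * beta), with alpha and beta as assumed above."""
    alpha = min(Fraction(1), Fraction(256, txcost))
    beta = Fraction(256, rxcost)  # assumption: rxcost = 256 / beta
    return 256 / (alpha * beta)


def cost_from_costs(txcost: int, rxcost: int) -> Fraction:
    """cost = (MAX(txcost, 256) * rxcost) / 256."""
    return Fraction(max(txcost, 256) * rxcost, 256)


# Illustrative values on both sides of the txcost = 256 cap.
for txcost in (96, 256, 300, 1024):
    for rxcost in (256, 384, 4096):
        assert cost_from_probabilities(txcost, rxcost) == cost_from_costs(txcost, rxcost)
```

Using Fraction rather than floats keeps the comparison exact, so any
mismatch would be a real disagreement between the formulae and not a
rounding artifact.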
> 
> > Section A.3.2
> 
> > Is k "allowed to" (I know this section is just informative) vary on
> > non-external data, such as the route or link in question?
> 
> I've removed this paragraph.  Nobody's ever implemented this suggestion,
> the new appendix about filtering is more informative.
> 
> (Comma splice, I know.)
> 
> > Appendix C
> 
> > I could see this content in the main body of the document.
> 
> I've rewritten this appendix, and referenced it in a few more places in
> the body.  I've got no strong opinions either way, and unless there are any
> strong opinions, I'll leave it as it is.

I was deliberately expressing a non-strong opinion :)

Thanks,

Ben