[babel] Benjamin Kaduk's Discuss on draft-ietf-babel-hmac-08: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Wed, 07 August 2019 21:44 UTC

From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-babel-hmac@ietf.org, Donald Eastlake <d3e3e3@gmail.com>, babel-chairs@ietf.org, d3e3e3@gmail.com, babel@ietf.org
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <156521429138.8333.12124544758210076970.idtracker@ietfa.amsl.com>
Date: Wed, 07 Aug 2019 14:44:51 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/babel/jLsWURS4yov_etOXOP7B4aTiPM4>
Subject: [babel] Benjamin Kaduk's Discuss on draft-ietf-babel-hmac-08: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-babel-hmac-08: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-babel-hmac/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Are the HMAC keys required to be the hash function's block size or its
output size?  Section 3.1 says just "the length of each key is exactly
the hash size of the associated HMAC algorithm", and "hash size"
conventionally refers to the output length.  The referenced Section 2 of
RFC 2104 concerns itself with the hash's compression function's block
size B, which is generally different.
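
To make the block-size/output-size distinction concrete, here is a
minimal Python sketch using the standard hashlib module.  It is an
illustration only, not text from the draft, and the choice of SHA-256
and SHA-512 as examples is an assumption made for the comparison.

```python
import hashlib

# For each hash, record the compression-function block size (B in
# RFC 2104) next to the output (digest) size, both in octets.
sizes = {
    name: (hashlib.new(name).block_size, hashlib.new(name).digest_size)
    for name in ("sha256", "sha512")
}

for name, (block, digest) in sizes.items():
    print(f"{name}: block size B = {block} octets, output size = {digest} octets")
```

In both cases the block size is twice the output size, so "exactly the
hash size" yields a different key length depending on which of the two
is meant.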

Also in Section 3.1, if we are going to claim that a "random string of
sufficient length" suffices to initialize a fresh index, we need to
provide guidance on what constitutes "sufficient length" to achieve the
needed property.

Blake2s is a keyed MAC, but is not an HMAC construction.  If we are to
allow its usage for providing integrity protection of babel packets
directly, we therefore cannot refer to the protection scheme as "HMAC"
generically.  Fixing this will, unfortunately, be somewhat invasive to
the document, since we mention HMAC all over the place.  I believe that
"Keyed Message Authentication Code (Keyed MAC)" is an appropriate
replacement description.
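
The difference between the two constructions can be seen in a short
Python sketch with the standard hashlib and hmac modules.  The all-zero
key and the packet bytes are placeholders chosen for illustration; they
do not come from the draft.

```python
import hashlib
import hmac

key = bytes(32)                  # placeholder all-zero key, illustration only
packet = b"example packet body"  # placeholder data, not a real Babel packet

# HMAC construction (RFC 2104) layered over SHA-256.
hmac_tag = hmac.new(key, packet, hashlib.sha256).digest()

# BLAKE2s in its native keyed mode: a keyed MAC, but not an HMAC.
blake_tag = hashlib.blake2s(packet, key=key).digest()

# HMAC layered over BLAKE2s is a third, distinct computation.
hmac_blake_tag = hmac.new(key, packet, hashlib.blake2s).digest()

print(blake_tag != hmac_blake_tag)
```

Keyed BLAKE2s and HMAC over BLAKE2s produce different tags for the same
key and data, which is why the generic term "HMAC" does not cover the
former.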

The suggestion that the large challenge nonce size admits storage of
state in a secure "cookie" in the nonce is true; however, implementing
this properly presents some subtleties, and it seems like something of
an attractive nuisance to suggest that it is possible without giving
adequate guidance on how to do it safely.  Unfortunately, the best
reference I can think of, offhand, is the obsoleted RFC 5077.

Let's also have a discussion about whether 64 bits of randomness is
always sufficient; I left a longer note down in the Comment since I
don't expect this to end up being a blocking point.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Thanks for the clear introduction and applicability statement; they
really help to lay out a clear picture of the scenarios in which we
operate!

This is a symmetric-keyed scenario, so most attacks that involve a
compromised node will be "uninteresting", in that once the key is
exposed all guarantees are lost.  However, it may still be worth noting
that a compromised node can cause disruption on multi-access links
without detection since there is no "end of generation" signal when a
node changes its index.  That is, if node B reboots or otherwise resets
its index/pc, then compromised node C can spoof packets from B with the
previous index and honest node A will accept them, and B will be unable
to detect that it has been spoofed.  On the flip side, we may want to
discuss that B can watch for messages that spoof its source address to
detect compromised nodes.

Is there any need for initial PC value randomization?  Do we want to
recommend starting at 0 or 1 (or prohibit recipients from assuming
that the initial value for an index will be that)?

Section 1

"This document obsoletes RFC 7298" should be in the Introduction as well
as the abstract.

Is the capability for an attacker to modify/spoof Babel packets in
order to cause data to get dropped or cause a routing loop worth
mentioning here?

Section 1.2

   o  that the Hashed Message Authentication Code (HMAC) being used is
      invulnerable to pre-image attacks, i.e., that an attacker is
      unable to generate a packet with a correct HMAC;

I think it's more conventional to include the caveat "without [access
to/knowledge of] the secret key" for this sort of statement about HMAC.

   The first assumption is a property of the HMAC being used.  The
   second assumption can be met either by using a robust random number
   generator [RFC4086] and sufficiently large indices and nonces, by
   using a reliable hardware clock, or by rekeying whenever a collision
   becomes likely.

Does this rekeying option require an external operation/management actor
to trigger it?  It might be worth mentioning with some operational
considerations.

   o  among different nodes, it is only vulnerable to immediate replay:
      if a node A has accepted a packet from C as valid, then a node B
      will only accept a copy of that packet as authentic if B has
      accepted an older packet from C and B has received no later packet
      from C.

nit: I don't think "A has accepted a packet from C" is quite the right
precondition; it seems to be more like "A has received a valid packet
from C", since whether or not A (as an attacker) considers it valid is
irrelevant to whether (honest) B will.

Section 4.1

If we had identifiers for symmetric keys or HMAC algorithms, we could
include those identifiers in the pseudo-header and thereby gain some
protection from downgrade/HMAC-stripping attacks in the presence of a
weak keyed MAC algorithm.  (I think we have to include both what we are
sending and what we think the peer can do in order to get substantial
protection, though, which diminishes the appeal for multicast
scenarios.)

nit: I don't think the past tense is correct for "packet was carried
over IPvN", since we're talking about a pseudo-header used in
computations before the packet is sent.

Section 4.2

It might be worth reiterating that every time a packet goes on the wire,
it gets a fresh PC, regardless of whether it's a "retransmit" after a
timeout or a new message.

   interface MTU (Section 4 of [RFC6126bis]).  For an interface on which
   HMAC protection is configured, the TLV aggregation logic MUST take
   into account the overhead due to PC TLVs (one in each packet) and
   HMAC TLVs (one per configured key).

(per configured key, and also per packet, right?)

Does it matter whether the sender increments the PC before or after
inserting it in the PC TLV?  (I think the only potential impact would be
as it relates to the value sent in response to a challenge nonce, but
the "increment by a positive not-necessarily-one amount" property may
provide all the flexibility we need.)

Section 4.3

Validating the HMACs is the sort of operation that we tend to recommend
be done in constant time to avoid side-channel attacks.  I don't have a
concrete attack handy here at the moment, though.
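
As a sketch of what a constant-time check looks like in practice, the
Python standard library provides hmac.compare_digest.  The function
name verify_mac and the use of HMAC-SHA256 below are assumptions for
illustration, not something the draft specifies.

```python
import hashlib
import hmac

def verify_mac(key: bytes, data: bytes, received: bytes) -> bool:
    """Recompute the MAC and compare it without an early exit on mismatch."""
    expected = hmac.new(key, data, hashlib.sha256).digest()
    # compare_digest is designed so that its running time does not reveal
    # the position of the first differing octet, unlike a plain == test.
    return hmac.compare_digest(expected, received)

key = b"k" * 32          # placeholder key, illustration only
data = b"packet bytes"   # placeholder data
good = hmac.new(key, data, hashlib.sha256).digest()
tampered = bytes([good[0] ^ 1]) + good[1:]

print(verify_mac(key, data, good), verify_mac(key, data, tampered))
```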

      When a PC TLV is encountered, the enclosed PC and Index are saved
      for later processing; if multiple PCs are found (which should not
      happen, see Section 4.2 above), only the first one is processed,
      the remaining ones MUST be silently ignored.  If a Challenge

Any reason to not just drop the whole packet if there are multiple PCs
present?  I see this is not rfc7298bis but don't know what level of
breaking change is reasonable.

   o  The preparse phase above has yielded two pieces of data: the PC
      and Index from the first PC TLV, and a bit indicating whether the
      packet contains a successful Challenge Reply.  If the packet does
      not contain a PC TLV, the packet MUST be dropped and processing
      stops at this point.  If the packet contains a successful
      Challenge Reply, then the PC and Index contained in the PC TLV
      MUST be stored in the Neighbour Table entry corresponding to the
      sender (which already exists in this case), and the packet is
      accepted.

I'd suggest explicitly stating that if there is a challenge reply that
doesn't validate, the packet should be discarded.  Or are there
multicast scenarios where that is not the case? The key point being to
emphasize that just the presence of a challenge reply doesn't mean
anything, it has to be valid in order to have significance.

   o  At this stage, the packet contains no successful challenge reply
      and the Index contained in the PC TLV is equal to the Index in the
      Neighbour Table entry corresponding to the sender.  The receiver
      compares the received PC with the PC contained in the Neighbour
      Table; if the received PC is smaller or equal than the PC
      contained in the Neighbour Table, the packet MUST be dropped and
      processing stops (no challenge is sent in this case, since the
      mismatch might be caused by harmless packet reordering on the
      link).  Otherwise, the PC contained in the Neighbour Table entry
      is set to the received PC, and the packet is accepted.

Does this mean that if packet reordering is encountered, we will just
not process packets that get reordered later?  (AFAIK babel will still
work fine in such conditions, so I'm just checking my understanding.)
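
The quoted rule can be sketched as follows.  This is a minimal model of
the same-Index case only; NeighbourEntry and accept_pc are hypothetical
names, and the challenge path for a mismatched Index is deliberately
left out.

```python
from dataclasses import dataclass

@dataclass
class NeighbourEntry:
    index: bytes  # Index last stored for this neighbour
    pc: int       # highest PC accepted so far under that Index

def accept_pc(entry: NeighbourEntry, rx_index: bytes, rx_pc: int) -> bool:
    """Apply the PC check for a packet whose Index matches the stored one."""
    if rx_index != entry.index:
        raise ValueError("Index mismatch is handled by the challenge path")
    if rx_pc <= entry.pc:
        return False      # dropped; no challenge is sent in this case
    entry.pc = rx_pc      # strictly larger: record it and accept
    return True

# Packets sent with PCs 11, 12, 13, 14 arrive as 11, 13, 12, 14.
entry = NeighbourEntry(index=b"\x01", pc=10)
results = [accept_pc(entry, b"\x01", pc) for pc in (11, 13, 12, 14)]
print(results)
```

Under this model the packet with PC 12 is dropped because 13 was
already accepted, while the later packet with PC 14 is still accepted.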

   it MAY ignore a challenge request in the case where it it contained

nit: s/it it/it is/

   The same is true of challenge replies.  However, since validating a
   challenge reply is extremely cheap (it's just a bitwise comparison of
   two strings of octets), a similar optimisation for challenge replies
   is not worthwile.

Er, challenge reply validation still requires the HMAC validation step,
right?

Section 4.3.1.1

   When it encounters a mismatched Index during the preparse phase, a
   node picks a nonce that it has never used with any of the keys
   currently configured on the relevant interface, for example by
   drawing a sufficiently large random string of bytes or by consulting

(same comment as above about "sufficiently large")

Section 4.3.1.2

   buffered TLVs in the same packet as the Challenge Reply.  However, it
   MUST arrange for the Challenge Reply to be sent in a timely manner
   (within a few seconds), and SHOULD NOT send any other packets over
   the same interface before sending the Challenge Reply, as those would
   be dropped by the challenger.

I think this "SHOULD NOT" (or rather, "would be dropped by the
challenger") is predicated on the challenge request having not been a
replay, but I do not see anything requiring the recipient to do nonce
uniqueness validation.

Section 4.3.1.3

   neighbour that sent the Challenge Reply.  If no challenge is in
   progress, i.e., if there is no Nonce stored in the Neighbour
   Table entry or the Challenge timer has expired, the Challenge Reply
   MUST be silently ignored and the challenge has failed.

I think "the challenge has failed" is predicated on the challenge reply
being in response to a challenge sent by this node.  The previous
section's "send the Challenge Reply to the unicast address" seems to
imply that there are no multicast scenarios which would make that not
the case, but I just wanted to check my understanding.

Section 5

Do we need to say whether sub-TLVs are allowed in any of these TLVs?
(Presumably they are not, since the length is needed in order to
identify the length of the variable-length fields, but being explicit
can be useful.)

Section 5.1

   This [HMAC] TLV is allowed in the packet trailer (see Section 4.2 of
   [RFC6126bis]), and MUST be ignored if it is found in the packet body.

side note: Using "MUST ignore" vs. "discard the packet" has some
protocol evolution consequences -- it in practice then becomes an
alternative padding technique for use in packet bodies, and if ever used
as such then could lead to a way to fingerprint an implementation or be
used as a hidden channel for sending other data.  But, I see that
ignoring at the TLV level is something of a core babel design choice,
and I don't see any serious consequences that would merit revisiting
that decision.

Section 6

   This mechanism relies on two assumptions, as described in
   Section 1.2.  First, it assumes that the hash being used is

s/hash/MAC/

   enough entropy (64-bit values are believed to be large enough for all
   practical applications), or by using a reliably monotonic hardware

It would require a bit more thought to convince me that 64-bit indices
are sufficient for *all* cases.  Specifically, if we want full 64-bit
strength, then the 64-bit space cannot be controlled or affected by the
attacker to cause collisions.  But I think there will be a reasonable
risk that an attacker can cause a given node to need to regenerate its
index on demand (e.g., by triggering a bug that crashes it, or power
cycling it), at which point the random selection falls back to the
32-bit birthday bound on uniqueness over time.  Granted, the
consequences of that particular attack would be limited, as the attacker
could only replay the limited number of packets sent using the colliding
index the first time it was used, but this is only intended as an
example of ways in which 64-bit random values can be degraded to 32 bits
of security.
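
The birthday bound mentioned above can be checked numerically with the
standard approximation p = 1 - exp(-n(n-1)/2^(b+1)) for n uniformly
random b-bit values.  The function name and the sample values of n are
assumptions chosen for illustration.

```python
import math

def collision_probability(n: int, bits: int = 64) -> float:
    """Approximate chance that n uniformly random `bits`-bit values repeat."""
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-n * (n - 1) / 2 ** (bits + 1))

for n in (2**16, 2**24, 2**32):
    print(f"n = 2^{n.bit_length() - 1}: p = {collision_probability(n):.3g}")
```

With about 2^32 index regenerations the collision probability under
this approximation is already close to 0.4, which is the sense in which
a 64-bit random value offers roughly 32 bits of uniqueness when the
attacker can force regeneration.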

   present at the receiver.  If the attacker is able to cause the
   (Index, PC) pair to persist for arbitrary amounts of time (e.g., by
   repeatedly causing failed challenges), then it is able to delay the
   packet by arbitrary amounts of time, even after the sender has left
   the network.

I'd suggest adding another sentence describing the potential
consequences of selectively delayed input (i.e., messing up the
routing).

   protocol (the data structures described in Section 3.2 of
   [RFC6126bis] are conceptual, any data structure that yields the same
   result may be used).  Implementers might also consider using the fact

nit: that's a comma splice in the parenthetical; a semicolon would be better.