[Hipsec] Benjamin Kaduk's Discuss on draft-ietf-hip-dex-24: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Thu, 25 March 2021 07:19 UTC
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-hip-dex@ietf.org, hip-chairs@ietf.org, hipsec@ietf.org, Gonzalo Camarillo <gonzalo.camarillo@ericsson.com>, gonzalo.camarillo@ericsson.com
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <161665675208.21560.1037478413465388633@ietfa.amsl.com>
Date: Thu, 25 Mar 2021 00:19:12 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/hipsec/aB2_g0Ms1KpG6Lf3-LuFxqaG1pU>
Benjamin Kaduk has entered the following ballot position for
draft-ietf-hip-dex-24: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-hip-dex/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

I support Roman's Discuss.

I don't understand how the responder's HOST_ID is supposed to be
authenticated in the handshake.  In HIP-BEX, the HOST_ID is in R1 and
covered by the HIP_SIGNATURE_2, and it is *also* used as input to the
calculation of the HIP_MAC_2 in R2.  In HIP-DEX as currently specified,
the responder's HOST_ID is present in R1 (which has no cryptographic
protection applied) but not present in R2 at all (R2 being the only
message from the responder that is authenticated).  Since we are already
replicating most of R1 in R2 in order to allow the HIP_MAC on R2 to
substitute for the HIP_SIGNATURE_2 in the HIP-BEX version of R1, it
seems like it would be most straightforward to just include a copy of
the responder HOST_ID in R2 as well (thus covered by the main HIP_MAC),
but other options including HIP_MAC_2 are available.

Furthermore, that the lack of authentication for the responder's HOST_ID
could remain in the document for so long, even after multiple rounds of
review, causes me to question whether the cryptographic mechanisms of
this document have really seen an adequate level of review for the
Proposed Standard maturity level.

I also have concerns about the cryptographic analysis of the particular
CKDF construction that is given.  While the previous rounds of review
and response have convinced me that a CMAC-based analogue to HKDF is
safe and well-grounded, the current construction is not fully analogous
to HKDF and also uses a non-injective mapping for converting the
exchanged protocol parameters into CKDF inputs:

- My understanding of the principles behind HKDF is that there is no
  need for the "info" argument to the CKDF-Extract stage, and that using
  that data only in the CKDF-Expand stage is both safe and the expected
  usage.  (The Random #I provided by the responder matches to the HKDF
  salt as a random but non-secret value, and helps to churn the
  extraction of entropy from the IKM.  Adding the I_NONCE along with the
  Diffie-Hellman output allows for an additional source of contributory
  behavior for the initiator, but the Diffie-Hellman exchange itself is
  also supposed to give contributory behavior and the I_NONCE does not
  protect against attacks where the initiator might choose a key share
  that produces a DH output with particular properties, since the
  I_NONCE and initiator key share are produced at the same time.  I
  think we need to be more clear about what the I_NONCE actually does,
  which is to ensure that we get a new key if we have to repeat the
  static-static DH exchange due to (e.g.) state loss, etc.)

- Since we are using Kij | I_NONCE for both IKMm and IKMp, we need to
  ensure that the produced IKM<x> values are distinct by construction.
  The requirement that the encrypted values be at least 64 bits provides
  this property; however, we do not have injectivity because a given
  IKMp could be produced by dividing the "concatenated random values"
  between initiator and responder in different ways.  This introduces a
  risk of attack when the encrypted value of one party is chosen
  maliciously (the attack is easiest when it can be chosen after the
  other party's value is known, but this is not a strict requirement for
  enabling attacks).  So, I think we should either introduce length
  prefixes into the IKMp encoding or require a fixed length (i.e.,
  exactly 64 bits) for the random values.

- The description of the PRK input to CKDF-Expand() includes mention of
  a "case of no extract".  When does this case occur?  I think we need
  to have a clear procedure for when it is (and is not) used, or ideally
  to just always use extract.

- The intermediates T(n) used to generate the CKDF OKM appear to be an
  attempt to use the SP 800-108 "KDF in feedback mode" with optional
  counter, but the NIST version puts the counter directly after the
  previous iteration's output, i.e., before the additional data.  So in
  that sense we are not in a state that "follows the CMAC usage
  guidance" provided by the NIST references.

- The additional information passed to CKDF-Expand() does not provide
  for key separation of the output keys used for the pair-wise key SA
  based on what transport format the keys will be used for.
  (Including the selected transport format in the 'info' should be
  straightforward and resolves this issue.)
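
To make the injectivity point (second bullet above) concrete, here is a
minimal sketch.  It assumes that IKMp is formed by appending the two
decrypted random values to Kij | I_NONCE; the function names, the
placeholder inputs, and the 2-byte length prefix are all hypothetical and
not taken from the draft.

```python
def ikm_concat(kij: bytes, i_nonce: bytes, rand_i: bytes, rand_r: bytes) -> bytes:
    """Plain concatenation: the boundary between the two random values is lost."""
    return kij + i_nonce + rand_i + rand_r


def ikm_length_prefixed(kij: bytes, i_nonce: bytes, rand_i: bytes, rand_r: bytes) -> bytes:
    """Hypothetical fix: prefix each variable-length value with a 2-byte length."""
    def lp(value: bytes) -> bytes:
        return len(value).to_bytes(2, "big") + value
    return kij + i_nonce + lp(rand_i) + lp(rand_r)


kij, i_nonce = bytes(32), bytes(16)      # placeholder values, not real keys
stream = bytes(range(24))                # 192 bits of "concatenated random values"

# Two ways to split the same 24 bytes; each part is at least 64 bits long.
split_a = (stream[:8], stream[8:])       # 8 + 16 bytes
split_b = (stream[:16], stream[16:])     # 16 + 8 bytes

# Plain concatenation maps both splits to the same IKMp ...
collides = ikm_concat(kij, i_nonce, *split_a) == ikm_concat(kij, i_nonce, *split_b)
# ... while the length-prefixed encoding keeps them apart.
separated = (ikm_length_prefixed(kij, i_nonce, *split_a)
             != ikm_length_prefixed(kij, i_nonce, *split_b))
```

Requiring exactly 64 bits for each random value would achieve the same
separation without any change to the encoding.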

I do not see any justification for deviating from the existing RFC 7401
semantics of ECHO_RESPONSE_UNSIGNED in the I2 packet (Appendix B
suggests to use content other than the "unmodified opaque data copied
from the corresponding echo request").  If a two-factor authentication
method is desired, it seems like defining a new TLV pair to convey it is
straightforward and does not confuse the semantics of existing protocol
elements.

There is some text in Section 5.3 that indicates that the "UPDATE,
NOTIFY, CLOSE, and CLOSE_ACK packets are not covered by a signature and
purely rely on the HIP_MAC parameter for packet authentication".
However, the RFC 7401 NOTIFY packet contains only a HIP_SIGNATURE and
not a HIP_MAC.  I think we need to specify a complete NOTIFY message
structure that includes HIP_MAC, rather than attempting to rely on a
delta from RFC 7401 that just removes the HIP_SIGNATURE, most notably so
that we can clearly state what the MAC covers.

Section 6.3 suggests that the CKDF-Expand phase can be skipped for the
Pair-wise Key SA when the needed key is less than or equal to 128 bits,
but I don't see anything in [NIST.SP.800-56C] to suggest that such a
procedure follows the referenced guidance.  In particular, it removes
the opportunity to use the label/context data (known as the "info" in
the RFC 5869 terminology).
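
As a sketch of what is lost when Expand is skipped: the example below
uses HMAC-SHA256 in the RFC 5869 HKDF structure purely as a stand-in for
the CMAC-based CKDF.  It is not the draft's construction, and the salt,
IKM, and "transport=..." info labels are hypothetical placeholders.

```python
import hashlib
import hmac


def extract(salt: bytes, ikm: bytes) -> bytes:
    """RFC 5869 HKDF-Extract with HMAC-SHA256 (stand-in for CKDF-Extract)."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def expand(prk: bytes, info: bytes, length: int) -> bytes:
    """RFC 5869 HKDF-Expand: T(i) = HMAC(PRK, T(i-1) | info | i)."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


prk = extract(salt=b"puzzle-I-placeholder", ikm=b"ikm-placeholder")

# Shortcut: take the first 128 bits of the PRK directly.  No context enters
# the computation, so every use of this PRK ends up with the same key.
shortcut_key = prk[:16]

# With Expand, distinct (hypothetical) info labels give distinct 128-bit keys.
key_esp = expand(prk, b"transport=ESP", 16)
key_esp_tcp = expand(prk, b"transport=ESP-TCP", 16)
```

The same mechanism would also provide the per-transport-format key
separation asked for above, since the selected format can simply be part
of the info input.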

We have text in 5.3.2 that I managed to read as saying that the
initiator's (DH_GROUP_LIST) preference takes priority, but there is text
in Section 6.6 that I read as saying that the responder's preference
takes priority.  (See COMMENT for specific locations.)  It can only be
one of those, and we should be clear about which one it is, across the
board.
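
A small sketch of why the choice matters: the two selection rules below
can pick different groups from the same pair of lists.  The function
names and group labels are hypothetical, and the rules are generic
"first mutually supported entry" rules rather than the draft's exact
procedure.

```python
def pick_initiator_preference(initiator_list, responder_list):
    """First entry of the Initiator's list that the Responder also supports."""
    supported = set(responder_list)
    return next((g for g in initiator_list if g in supported), None)


def pick_responder_preference(initiator_list, responder_list):
    """First entry of the Responder's list that the Initiator also offered."""
    offered = set(initiator_list)
    return next((g for g in responder_list if g in offered), None)


# Hypothetical group labels, most preferred first.
initiator = ["curve25519", "p256", "p384"]
responder = ["p384", "p256", "curve448"]

by_initiator = pick_initiator_preference(initiator, responder)   # "p256"
by_responder = pick_responder_preference(initiator, responder)   # "p384"
```

If the two peers (or the two sections of the document) apply different
rules, each side can conclude that the other selected the wrong group.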

Section 9.3 discusses the risk of key extraction attack and the need to
validate the peer's public key.  But we say to enforce this in
processing I2 and R2 packets, when the responder's HOST_ID is present in
R1 (and not R2) and is used in the preparation of I2.  If we only
validate the peer's key when processing R2, it is too late and the
damage has already been done.

I think we need greater clarity on whether we are using X25519, or doing
ECDH on Curve25519.  Section 9.3 suggests that we are using X25519, but
only by implicit reference to "the corresponding functions defined in
[RFC7748]"; the rest of the document only discusses Curve25519.
ECDH-on-Curve25519 (or the related curve Wei25519) and X25519 are not
compatible operations; we must pick one.

Section 5.2.2

   The counter for AES-128-CTR MUST have a length of 128 bits.  The
   puzzle value #I and the puzzle solution #J (see Section 4.1.2 in
   [RFC7401]) are used to construct the initialization vector (IV) as
   FOLD(I | J, 112) which are the high-order bits of the CTR counter.  A
   16 bit value as a block counter, which is initialized to zero on
   first use, is appended to the IV in order to guarantee that a non-
   repeating nonce is fed to the AES-CTR encryption algorithm.

   This counter is incremented as it is used for all encrypted HIP
   parameters.  That is a single AES-128-CTR counter associated with the
   Master Key SA.

Is the FOLD output just the initial value of the counter (so that we can
use the full 128-bit space) or do we only get the 16 bits of usable
counter?

Relatedly, I still don't have much clarity on how the counter is
incremented/managed for the master key SA.
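
The two readings of the counter construction can be sketched as follows.
This assumes big-endian encoding and takes the 112-bit IV as an opaque
input (a placeholder for FOLD(I | J, 112)); the function names are
hypothetical.

```python
def counter_block(iv112: bytes, block_index: int) -> bytes:
    """Reading A: the IV fixes the high 112 bits; only the low 16 bits count."""
    if len(iv112) != 14:
        raise ValueError("IV must be 112 bits (14 bytes)")
    if not 0 <= block_index < 2**16:
        raise OverflowError("16-bit block counter exhausted")
    return iv112 + block_index.to_bytes(2, "big")


def counter_block_full(iv112: bytes, block_index: int) -> bytes:
    """Reading B: IV || 0x0000 is only the initial value of a 128-bit counter."""
    if len(iv112) != 14:
        raise ValueError("IV must be 112 bits (14 bytes)")
    initial = int.from_bytes(iv112 + b"\x00\x00", "big")
    return ((initial + block_index) % 2**128).to_bytes(16, "big")


iv = bytes(range(14))                      # placeholder for FOLD(I | J, 112)
same_start = counter_block(iv, 0) == counter_block_full(iv, 0)
last_a = counter_block(iv, 2**16 - 1)      # last usable block under reading A
next_b = counter_block_full(iv, 2**16)     # under reading B the carry enters the IV bits
```

Under reading A, a single Master Key SA can encrypt at most 65536 AES
blocks in total, which is why the answer matters for how the counter is
managed.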

Section 5.2.3

   HIP DEX HIs are serialized equally to the ECC-based HIs in HIPv2 (see
   Section 5.2.9. of [RFC7401]).  The Group ID of the HIP DEX HI is
   encoded in the "ECC curve" field of the HOST_ID parameter.  The
   supported DH Group IDs are defined in Section 5.2.1.

I don't think RFC 7401 actually specifies the serialization for the ECC
public keys (whether ECDSA or ECDH); that is deferred to the
corresponding references (and, furthermore, RFC 4754 seems to be
covering random groups, not the specific NIST groups).  We need an
actual reference for the serialization of the public key in order for
this to be implementable.  (If we're using X25519, this is very easy and
RFC 7748 does the hard work for us.)

Section 5.3.1

   Regarding the Responder's HIT, the Initiator may receive this HIT
   either from a DNS lookup of the Responder's FQDN (see [RFC8005]),
   from some other repository, or from a local table.  The Responder's
   HIT also MUST be of a HIP DEX type.  If the Initiator does not know
   the Responder's HIT, it may attempt to use opportunistic mode by
   using NULL (all zeros) as the Responder's HIT.  [...]

The "may attempt" seems in conflict with "MUST be of a HIP DEX type".


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

RFC 7401 says of secp160r1 (in 2015) that "Today, these groups
should be used only when the host is not powerful enough (e.g., some
embedded devices) and when security requirements are low (e.g.,
long-term confidentiality is not required)."  We might mirror the
"security requirements are low" portion ourselves as a requirement for
the use of DEX at all.

The use of the random #I from the puzzle as the CMAC key both for
solving the puzzle and for CKDF-Extract() is perhaps a bit
unconventional.  I don't know of a specific attack against it, though
(and the HKDF design allows an all-zeros key to be used for
HKDF-Extract() when no salt is available).

Section 1

   4.  The forfeiture of the use of digital signatures leaves the R1
       packet open to a MITM attack.  Such an attack is managed in the

We can't use the acronym MITM without expanding it (it's used five times
throughout), and "active on-path attack" is probably more useful a
description anyway.

Section 1.1

   An existing HIP association can be updated with the update mechanism
   defined in [RFC7401].  Likewise, the association can be torn down
   with the defined closing mechanism for HIPv2 if it is no longer
   needed.  Standard HIPv2 uses a HIP_SIGNATURE to authenticate the
   association close operation, but since DEX does not provide for
   signatures, the usual per-message MAC suffices.

Thank you for calling out the divergence from RFC 7401 HIPv2.
However, the conclusion here ("the per-message MAC suffices") is not
supported by the rest of the sentence.

Section 1.2.1

Is it useful to present the overall summary of operations from the
Responder's perspective as well?  I recognize that it is in some sense
similar and may not be worth the partial redundancy.

   Papers like [EfficientECC] show on the ATmega328P [ATmega328P] an
   EdDSA25519 signature generation of 19M cycles and verification of 31M
   cycles.  Thus the SIGMA Public Key operations come at a cost of 81M

It's probably worth noting that the [EfficientECC] implementation has
the additional constraint of targeting side-channel resistance.  That
said, the proposed deployment scenarios for HIP-DEX include those where
the same motivations presented in the paper for wanting
side-channel-resistance apply, so we cannot reasonably remove that
constraint and achieve lighter-weight implementation.

Section 2.3

   HI (Host Identity):  The static ECDH public key that represents the
      identity of the host.  In HIP DEX, a host proves ownership of the
      private key belonging to its HI by creating a HIP_MAC with the
      derived ECDH key (see Section 3) in the appropriate I2 or R2
      packet.

This definition is rather divergent from the RFC 7401 definition of Host
Identity.  Necessarily so, to some extent, since DEX doesn't have
signature keys, but I think we can do better at acknowledging the
divergence.  Perhaps something like "[RFC7401] defined this as the
public key of the signature algorithm that represents the identity of
the host.  Since DEX removes the signature operation, the static ECDH
public key is used to play the role of the identity of the host.  In HIP
DEX, a host [...]"?

My comment from the -13 about the HIP_MAC not directly proving ownership
of the private key also still applies, IMO.

   KEYMAT:  Keying material.  That is, the bit string(s) used as
      cryptographic keys.

This is also pretty divergent from RFC 7401's definition.  Do we want to
say something about "symmetric keys used for encryption and integrity
protection of HIP packets and encrypted user data packets"?

   RHASH (Responder's HIT Hash Algorithm):  In HIP DEX, RHASH is
      redefined as CMAC.  Still, note that CMAC is a message

We might also highlight the "from" part of the redefinition; something
like "Since HIP DEX does not use hash functions, an alternative
mechanism is needed for many of the places where RHASH is used.  To
match up with the HIP DEX design goals, CMAC is repurposed to perform
many of the functions where HIP-BEX uses RHASH.  Still, note that [...]"
might work.

   Security Association (SA):  An SA is a simplex "connection" that

I don't think I understand how an SA is "simplex", and RFC 7401 isn't
really enlightening me.  Help?

Section 3

   *  The HIT suite ID MUST only be a DEX HIT ID (see Section 5.2.4).

I don't think I understand where this restriction applies and what
exactly it's saying.  Section 5.2.4 covers the HIT_SUITE_LIST in R1, but
the reference seems to be made just for the specific ECDH/FOLD HIT Suite
ID (TBD2).  My current guess is that this is just writing down the
(near-)tautology that DEX HITs incorporate the ECDH/FOLD suite ID, in
which case there may not even be a need for a specific normative "MUST".

   Due to the latter property, an attacker may be able to find a
   collision with a HIT that is in use.  Hence, policy decisions such as

I think we should say that "it is assumed that an attacker can find a
collision with a HIT that is in use" rather than the current "may be
able to find" formulation.

Section 3.2.1

Is there anything useful to say about what mitigations are available if
an accidental collision occurs?  (Is just the full HOST_ID in the handshake
enough?  Would there be value in re-keying one of the nodes to not
collide?)

   Even without collision-resistance, it is not trivial to create
   duplicate FOLD generated HITs, as FOLD is starting out with a random
   input (the HI).  Although there is a set, {N}, of HIs that will have
   duplicate FOLD HITs, even randomly generating duplicate HITs is
   unlikely.  [...]

I don't think describing a single set of HIs is particularly useful; the
situation might be better described as there being a set of equivalence
classes under FOLD, or many sets where each set has the same FOLDed HIT.
(The note a couple sentences later about "size of set above" would be
adjusted accordingly.)
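
To illustrate the "many sets" view: the sketch below assumes that
FOLD(X, K) XORs together the K-bit segments of X (zero-padding the last
one) and that HITs fold the HI to 96 bits.  The implementation and the
placeholder inputs are hypothetical, and it ignores the requirement that
a colliding input also be a valid public key.

```python
def fold(data: bytes, k_bits: int) -> bytes:
    """XOR-fold `data` into k_bits, zero-padding the last segment.

    Assumed reading of FOLD(X, K); this sketch only handles whole-byte K.
    """
    if k_bits <= 0 or k_bits % 8:
        raise ValueError("this sketch only handles whole-byte segment sizes")
    seg = k_bits // 8
    acc = 0
    for offset in range(0, len(data), seg):
        chunk = data[offset:offset + seg].ljust(seg, b"\x00")
        acc ^= int.from_bytes(chunk, "big")
    return acc.to_bytes(seg, "big")


hi_a = bytes(range(36))                            # placeholder "HI", three 96-bit segments
hi_b = hi_a[12:24] + hi_a[:12] + hi_a[24:]         # same segments, different order
same_fold = fold(hi_a, 96) == fold(hi_b, 96)       # XOR does not depend on segment order
```

Every 96-bit output therefore has its own class of inputs that fold to
it, rather than there being one distinguished set {N}.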

Section 4.1.3.2

                                                        If the data
   transform does not specify its own KDF, the key derivation function
   defined in Section 6.3 is used.  Even though the concatenated input
   is randomly distributed, a KDF Extract phase may be needed to get the
   proper length for the input to the KDF Expand phase.

I'm reluctant to say "the concatenated input is randomly distributed"
since the constrained devices in question may not have particularly good
RNGs.  "Even if" might be safer.

Section 4.1.4

   The User Data Considerations in Section 4.5. of [RFC7401] also apply
   to HIP DEX.  There is only one difference between HIPv2 and HIP DEX.
   Loss of state due to system reboot may be a critical performance
   issue for resource-constrained devices.  Thus, implementors MAY
   choose to use non-volatile, secure storage for HIP states in order
   for them to survive a system reboot as discussed in Section 6.11.
   Using non-volatile storage will limit state loss during reboots to
   only those situations with an SA timeout.

IIUC this includes saving (e.g.) the pair-wise key SA state to
nonvolatile storage, which could affect the safety of user data
exchanged over the negotiated transport format.  That seems important to
note (though it should not be much of a surprise given the discussion
earlier in the document about lack of forward secrecy)!

Section 5.1

I think it would be appropriate to reiterate that indications of
Anonymity in the HIP Controls field are meaningless when DEX is used.

Section 5.2

   HIP DEX reuses the HIP parameters of HIPv2 defined in Section 5.2. of
   [RFC7401] where possible.  Still, HIP DEX further restricts and/or
   extends the following existing parameter types:

As a formal matter, how do we know that DEX is "in use" for a given
exchange and thus that these further restrictions are going to apply?
Is it just that it's the suite ID of the source HIT in the packet?

   *  PUZZLE, SOLUTION, and HIP_MAC parameter processing is altered to
      support CMAC in RHASH and RHASH_len (see Section 6.1 and
      Section 6.2).

I don't really follow how the processing needed to be altered for
RHASH_len.

Section 5.2.1, 5.2.x, etc.

                                                                 With
   HIP DEX, the DH Group IDs are restricted to:

Similarly to the previous comment, at a formal level, how do we know
that DEX is in use and these further restrictions apply?

Section 5.2.4

In RFC 7401 we note that HIT_SUITE_LIST is in the signed part of R1.
I think it would be appropriate to reiterate that for DEX there is no
authenticity protection on R1 (including the HIT_SUITE_LIST), so the
contents of R1 can only be used provisionally until verified by
comparing against the contents of the validated R2.

Section 5.2.5

   The ENCRYPTED_KEY parameter encapsulates a random value that is later

This is a cryptographic random value, right?  We should probably say so
(or that it's from a CSPRNG, etc.).

   used in the session key creation process (see Section 6.3).  This
   random value MUST have a length of at least 64 bits.  The HIP_CIPHER
   is used for the encryption.

The only defined HIP_CIPHER for DEX is AES-128-CTR.  Where does the
counter value get taken from for performing the encryption?

Section 5.3

   In the future, an optional upper-layer payload MAY follow the HIP
   header.  The Next Header field in the header indicates if there is
   additional data following the HIP header.

(This is unchanged from the situation for RFC 7401, right?  Maybe we
should preface it as such, e.g., "As is the case for HIP-BEX, ...")

Section 5.3.1

   first list element.  With HIP DEX, the DH_GROUP_LIST parameter MUST
   only include ECDH groups defined in Section 5.2.1.

As written, this could be interpreted as limiting DEX to the specific
groups enumerated in §5.2.1, as opposed to all ECDH groups (with ECDH
group as defined in §5.2.1).  Limiting to a hardcoded list is bad for
cryptographic algorithm agility, see BCP 201.

Section 5.3.2

I see that the TLVs in R1 are ordered differently than in RFC 7401 (when
they appear in both documents), and interestingly it is the RFC 7401
case that is not in numeric TLV type order!  Is that an erratum against
RFC 7401?

The prose paragraphs cover HIP_CIPHER and DH_GROUP_LIST in the opposite
order than they appear in the figure, though.

   The Initiator's HIT MUST match the one received in the I1 packet if
   the R1 is a response to an I1.  If the Responder has multiple HIs,
   the Responder's HIT MUST match the Initiator's request.  If the
   Initiator used opportunistic mode, the Responder may select among its
   HIs as described below.  See Section 4.1.8 of [RFC7401] for detailed
   information about the "HIP Opportunistic Mode".

The first two sentences don't seem very consistent with opportunistic
mode (but I recognize this is a preexisting situation with the RFC 7401
description as well).

   the current handshake.  Based on the received HIT_SUITE_LIST, the
   Initiator MAY decide to abort the current handshake and initiate a
   new handshake with a different mutually supported HIT suite.  This

Do we want to recommend this version-changing dance before the signal is
authenticated?  What is the harm for waiting for the R2 and only acting
on the authenticated list of versions?

   The HOST_ID parameter depends on the received DH_GROUP_LIST parameter
   and the Responder HIT in the I1 packet.  Specifically, if the I1
   [...]
   the R1 packet accordingly.  If the Responder however does not support
   the DH group required by the Initiator or if the Responder HIT in the
   I1 packet does not match the required DH group, the Responder selects
   [...]

I suggest adding some introductory material that sets the stage here,
noting that because DEX keys are static DH keys and not signature keys,
we have to come up with a procedure (with no analogue in BEX) to find
HIs that are in the same group, so that DH key-exchange is possible at
all.  In order to do this in a manner where tampering/downgrade can be
detected, we make the (essentially arbitrary, since HIP is basically a
symmetric protocol) choice to use initiator preference, and for a given
handshake, deem the first entry in the initiator's DH_GROUP_LIST to be
the "required" group for that handshake.  (Note that we define what the
"required group" is, which the current text does not.)  If the
initiator-selected responder HIT (if present) is useful and is the
required group, we use it, otherwise we provide rules for the responder
behavior that allow the initiator to detect the failed negotiation and
what steps are needed for the next attempt to succeed.  (The responder
HOST_ID includes the correct HIT and group, and the mismatch between
that group and the source HIT group, or the mismatch between HOST_ID and
HIT, indicates the negotiation failure.)

It's a little unfortunate that we have to act on an unauthenticated
signal here, but in case of group mismatch there is no way to achieve
authentication without signatures.

   payload protection.  The different format types are DEFAULT, ESP
   (Mandatory to Implement) and ESP-TCP (Experimental, as explained in
   Section 3.1 in [RFC6261]).

I see that RFC 6261 is an experimental document, but not how
specifically section 3.1 thereof explains that ESP-TCP is experimental.

Section 5.3.4

   The Responder repeats the DH_GROUP_LIST, HIP_CIPHER, HIT_SUITE_LIST,
   and TRANSPORT_FORMAT_LIST parameters in the R2 packet.  These
   parameters MUST be the same as included in the R1 packet.  The
   parameter are re-included here because the R2 packet is MACed and
   thus cannot be altered by an attacker.  For verification purposes,
   the Initiator re-evaluates the selected suites and compares the
   results against the chosen ones.  If the re-evaluated suites do not
   match the chosen ones, the Initiator acts based on its local policy.

I strongly suggest saving the TLV payloads from R1 and doing a literal
memcmp() of the R1 and R2 versions.  This is incredibly simple to
implement and hard to mess up; redoing the evaluation/negotiation seems
much more prone to error.
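
A minimal sketch of the "remember and compare" approach, assuming the
implementation has each parameter's raw TLV bytes available in a
dictionary keyed by parameter name (the dictionary layout and function
names are hypothetical):

```python
# Parameters the draft says are repeated in R2.
REPEATED = ("DH_GROUP_LIST", "HIP_CIPHER", "HIT_SUITE_LIST", "TRANSPORT_FORMAT_LIST")


def save_r1_params(r1_tlvs: dict) -> dict:
    """Keep the raw TLV bytes from the unauthenticated R1 for later comparison."""
    return {name: bytes(r1_tlvs[name]) for name in REPEATED}


def r2_matches_r1(saved: dict, r2_tlvs: dict) -> bool:
    """Byte-for-byte comparison; call only after the R2 HIP_MAC has verified."""
    for name in REPEATED:
        # A missing parameter counts as a mismatch.
        if r2_tlvs.get(name) != saved[name]:
            return False
    return True


# Placeholder TLV contents for illustration only.
r1 = {name: b"\x00\x01" for name in REPEATED}
saved = save_r1_params(r1)
tampered = dict(r1, HIP_CIPHER=b"\x00\x02")
```

Here r2_matches_r1(saved, r1) is true and r2_matches_r1(saved, tampered)
is false; no negotiation logic is re-run, so there is nothing to get
subtly wrong.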

   The ENCRYPTED_KEY parameter contains an Responder generated random
   value that MUST be uniformly distributed.  This random value is
   encrypted with the Master Key SA using the HIP_CIPHER encryption
   algorithm.

(Same comment about cryptographic strength as for the other initiator's
ENCRYPTED_KEY.)

   The I_NONCE parameter contains the nonce, supplied by the Initiator
   for the Master Key generation as shown in Section 6.3.  The Responder
   is echoing the value back to the Initiator to show it used the
   Initiator provided nonce.

This stated justification seems weak; if the Responder had used a
different value for the nonce, the derived key would not agree and the
MAC would fail to validate.  It seems to me, on first look, that the
role of repeating the nonce here is more the typical return-routability
check.  If you think that conveying it in the R2 payload itself plays a
different or additional role, please go into more detail about what and
how.
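
The argument can be sketched as follows, using HMAC-SHA256 as a stand-in
for both the key derivation and the MAC (not the draft's CKDF or CMAC);
all names and inputs are hypothetical placeholders.

```python
import hashlib
import hmac


def derive_key(shared_secret: bytes, i_nonce: bytes) -> bytes:
    """Stand-in KDF that mixes the nonce into the derived key."""
    return hmac.new(i_nonce, shared_secret, hashlib.sha256).digest()


def mac(key: bytes, packet: bytes) -> bytes:
    """Stand-in for HIP_MAC over the packet contents."""
    return hmac.new(key, packet, hashlib.sha256).digest()


shared = b"static-static DH output placeholder"
r2_packet = b"R2 contents placeholder"

initiator_key = derive_key(shared, b"nonce chosen by initiator")
responder_key = derive_key(shared, b"some other nonce")

# The Initiator recomputes the MAC with its own key and compares it with
# the one the Responder produced under a key derived from a different nonce.
r2_verifies = hmac.compare_digest(mac(initiator_key, r2_packet),
                                  mac(responder_key, r2_packet))
```

Since r2_verifies is false here, the MAC check alone already detects a
Responder that used a different nonce, without the nonce being echoed.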

   The MAC is calculated over the whole HIP envelope, excluding any
   parameters after the HIP_MAC, as described in Section 6.2.  The
   Initiator MUST validate the HIP_MAC parameter.

Should I be reading any particular meaning into the distinction between
"HIP envelope" (as used here) and "HIP packet" (as used in RFC 7401)?

Section 6.2

   5.  Set Checksum and Header Length fields in the HIP header to
       original values.  Note that the Checksum and Length fields
       contain incorrect values after this step.

I recognize that this is just mirroring the RFC 7401 discussion, but I
don't actually understand why these values are incorrect.  The process
of verifying the MAC doesn't remove the MAC from the packet, so AFAICT
the length and checksum could still be valid (provided there are no
parameters after HIP_MAC or they are restored "if they will be needed
later").

Section 6.5

   4.  If the implementation chooses to respond to the I1 packet with an
       R1 packet, it creates a new R1 according to the format described
       in Section 5.3.2.  It chooses the HI based on the destination HIT
       and the DH_GROUP_LIST in the I1 packet.  If the implementation

What is "the HI" that it chooses?

       does not support the DH group required by the Initiator or if the
       destination HIT in the I1 packet does not match the required DH
       group, it selects the mutually preferred and supported DH group

In line with my earlier comments, I suggest being more clear that it is
the initiator's preference that is respected (there is no well-defined
notion of "mutual preference"), assuming that my understanding is
correct.

       based on the DH_GROUP_LIST parameter in the I1 packet.  The
       implementation includes the corresponding ECDH public key in the
       HOST_ID parameter.  If no suitable DH Group ID was contained in
       the DH_GROUP_LIST in the I1 packet, it sends an R1 packet with
       any suitable ECDH public key.

What defines "suitable" here?

   Note that only steps 4 and 5 have been changed with regard to the
   processing rules of HIPv2.  The considerations about R1 management

Pedantically, step 1's directive changed from a "must" to a "MUST",
which may or may not be noteworthy.

Section 6.6

   6.   The system MUST check that the DH Group ID in the HOST_ID
        parameter in the R1 matches the first DH Group ID in the
        Responder's DH_GROUP_LIST in the R1 packet, and also that this
        Group ID corresponds to a value that was included in the
        Initiator's DH_GROUP_LIST in the I1 packet.  If the DH Group ID

This looks like it's describing a system where the responder's
preference takes priority.  The earlier discussion (I thought) indicated
that the initiator's preference took priority.  There can only be one...

Section 6.7

   The processing of I2 packets follows similar rules as HIPv2 (see
   Section 6.9 of [RFC7401]).  The main differences to HIPv2 are that
   HIP DEX introduces a new session key exchange via the ENCRYPTED_KEY
   parameter as well as an I2 reception acknowledgement for
   retransmission purposes.  [...]

So the lack of anonymity support and DH key generation are not "main"
differences? :)

   5.   If the system's state machine is in the R2-SENT state, the
        system MUST check to see if the newly received I2 packet is
        similar to the one that triggered moving to R2-SENT.  If so, it

How is "similar to" determined?

   6.   If the system's state machine is in the I2-SENT state, the
        system MUST make a comparison between its local and sender's
        HITs (similarly as in Section 6.3).  If the local HIT is smaller
        than the sender's HIT, it should drop the I2 packet, use the
        peer Diffie-Hellman key, ENCRYPTED_KEY keying material and nonce
        #I from the R1 packet received earlier, and get the local
        Diffie-Hellman key, ENCRYPTED_KEY keying material, and nonce #J
        from the I2 packet sent to the peer earlier.  Otherwise, the
        system processes the received I2 packet and drops any previously
        derived Diffie-Hellman keying material Kij and ENCRYPTED_KEY
        keying material it might have generated upon sending the I2
        packet previously.  The peer Diffie-Hellman key, ENCRYPTED_KEY,
        and the nonce #J are taken from the just arrived I2 packet.  The
        local Diffie-Hellman key, ENCRYPTED_KEY keying material, and the
        nonce #I are the ones that were sent earlier in the R1 packet.

We list the two ways to get Kij, nonce #I, and nonce #J here ... but we
don't say what to do with them once you get them.

   8.   If the system's state machine is in any state other than
        R2-SENT, the system SHOULD check that the echoed R1 generation
        counter in the I2 packet is within the acceptable range if the
        counter is included.  [...]

If the system is in R2-SENT, do we just re-send the same R2, or do we
have to continue with the rest of the calculations (and the risk of a
loophole that bypasses the R1 generation counter checks)?

   11.  The system MUST derive Diffie-Hellman keying material Kij based
        on the public value and Group ID in the HOST_ID parameter.  This
        keying material is used to derive the keys of the Master Key SA

Do we need to validate that this group is the same group as the HOST_ID
we sent in the R1?

Section 6.8

   4.  The system MUST re-evaluate the DH_GROUP_LIST, HIP_CIPHER,
       HIT_SUITE_LIST, and TRANSPORT_FORMAT_LIST parameters in the R2
       packet and compare the results against the chosen suites.

As mentioned previously, the "remember and memcmp()" option is probably
safer.  (Also, resolving a discuss point might require adding HOST_ID to
this list.)

   Note that step 4 (signature verification) from the original
   processing rules of HIPv2 has been replaced with a negotiation re-
   evaluation in the above processing rules for HIP DEX.  Moreover, step
   6 has been added to the processing rules.

I think that steps 5 and 7 have been added, not step 6.

Section 6.11

   Storing of the R1 generation counter values and ENCRYPTED_KEY counter
   (Section 5.2.5) MUST be configured by explicit HITs.

What is the ENCRYPTED_KEY counter?  The word "counter" does not appear
in Section 5.2.5.

Section 7

   If a Responder is not under high load, K SHOULD be 0.

I believe this SHOULD duplicates normative guidance already given
earlier.

Section 7.1

   ACL processing is applied to all HIP packets.  A HIP peer MAY reject
   any packet where the Receiver's HIT is not in the ACL.  The HI (in

The *Receiver's* HIT?  Not the sender's?

   the R1, I2, and optionally NOTIFY packets) MUST be validated as well,
   when present in the ACL.  This is the defense against collision and
   second-image attacks on the HIT generation.

I think "when present in the ACL" needs to be stricken, since we now
mandate the HIT,HI pairing (or just the HI) to be in the ACL.

Section 8

Why do we give guidance to wait for the retransmission timeout before
acting on I1 but not before acting on R1?

Section 9

   HIP DEX closely resembles HIPv2.  As such, the security
   considerations discussed in Section 8 of [RFC7401] similarly apply to
   HIP DEX.  HIP DEX, however, replaces the SIGMA-based authenticated
   Diffie-Hellman key exchange of HIPv2 with an exchange of random
   keying material that is encrypted with a Diffie-Hellman derived key.

IIUC the ENCRYPTED_KEY material is used only for the pair-wise SA, not
the master key SA.  So some further detail would be helpful here.

   Both the Initiator and Responder contribute to this keying material.
   As a result, the following additional security considerations apply
   to HIP DEX:

We do want to ensure that both parties contribute to the master key SA
as well (which I think they do, with I_NONCE and the puzzle's #i that is
used in CKDF), so we should say that more clearly.

   *  The strength of the keys for both the Master and Pair-wise Key SAs
      is based on the quality of the random keying material generated by
      the Initiator and the Responder.  As either peer may be a sensor
      or an actuator device, there is a natural concern about the
      quality of its random number generator.  Thus at least a CSPRNG
      SHOULD be used.

What is the "at least" intending to indicate here?  What would be
"better" than a CSPRNG?

   *  The R1 packet is unauthenticated and offers an adversary a new
      attack vector against the Initiator.  This is mitigated by only
      processing a received R1 packet when the Initiator has previously
      sent a corresponding I1 packet.  Moreover, the Responder repeats
      the DH_GROUP_LIST, HIP_CIPHER, HIT_SUITE_LIST, and
      TRANSPORT_FORMAT_LIST parameters in the R2 packet in order to
      enable the Initiator to verify that these parameters have not been
      modified by an attacker in the unprotected R1 packet as explained
      in Section 6.8.

[depending on the discuss resolution, HOST_ID might be needed here as
well.]

   *  It is critical to properly manage the ENCRYPTED_KEY counter
      (Section 5.2.5).  If non-volatile store is used to maintain HIP
      state across system resets, then this counter MUST be part of the
      state store.

[the unexplained "ENCRYPTED_KEY counter" again]

Section 9.2

   generate a single keystream.  The integration of AES-CTR into IPsec
   ESP (RFC 3686) used by HIP (and, thus, by HIP-DEX) improves on the

AFAICT this integration is used for the pair-wise SA but the master key
SA messages are using the ENCRYPTED parameter which behaves differently.

   situation by partitioning the 128-bit counter space into a 32-bit
   nonce, 64-bit IV, and 32-bits of counter.  The counter is incremented
   to provide a keystream for protecting a given packet, the IV is
   chosen by the encryptor in a "manner that ensures uniqueness", and
   the nonce persists for the lifetime of a given SA.  In particular, in
   this usage the nonce must be unpredictable, not just single-use.  In
   HIP-DEX, the properties of nonce uniqueness/unpredictability and per-
   packet IV uniqueness are defined in Section 5.2.2.

I don't see such descriptions in Section 5.2.2.
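
For reference, the partitioning that the quoted text describes can be
sketched as follows. This assumes the RFC 3686 layout (nonce, then IV,
then a big-endian block counter starting at one); the function name is
hypothetical, and nothing here says where HIP-DEX gets the nonce or IV
from, which is exactly the missing description.

```python
import struct


def rfc3686_counter_block(nonce: bytes, iv: bytes, block_counter: int) -> bytes:
    """Assemble the 128-bit AES-CTR counter block:
    32-bit nonce | 64-bit IV | 32-bit block counter."""
    if len(nonce) != 4:
        raise ValueError("nonce must be 4 octets")
    if len(iv) != 8:
        raise ValueError("IV must be 8 octets")
    if not 1 <= block_counter <= 0xFFFFFFFF:
        raise ValueError("block counter must fit in 32 bits and start at 1")
    # "!I" packs an unsigned 32-bit integer in network (big-endian) order.
    return nonce + iv + struct.pack("!I", block_counter)
```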

Section 9.3

   With the curves specified here, there is a straightforward key
   extraction attack, which is a very serious problem with the use of
   static keys by HIP-DEX.  Thus it is MANDATORY to validate the peer's
   Public Key.

Please provide more details and/or references so that readers not
already skilled in the art can figure out what is being referenced.

Section 10

   ECC Curve Label  This document specifies a new algorithm-specific
      subregistry named "ECDH Curve Label".  The values for this
      subregistry are defined in Section 5.2.1.  The complete list of

The values analogous to the existing "ECDSA Curve Label" registry seem
to appear in Section 5.2.3, not Section 5.2.1.

Section 13.2

Following the [NIST.SP.800-56C] reference gets a notice that it has been
replaced by rev1.

The way in which we reference RFC 7228, together with the MUST-level
requirements on the class of device that uses HIP DEX, could be
considered to make RFC 7228 a normative reference.

If we are actually using X25519, RFC 7748 needs to be normative.
Arguably it does even if we're using ECDH-on-{Curve25519,Wei25519}.

Appendix C

The content of this appendix seems stale (there are no SHOULDs in
Section 6.3 anymore, etc.)

NITS

Abstract

   The HIP DEX protocol is primarily designed for computation or memory-
   constrained sensor/actuator devices.  Like HIPv2, it is expected to
   be used together with a suitable security protocol such as the
   Encapsulated Security Payload (ESP) for the protection of upper layer

Per RFC 4303, ESP is the "Encapsulating" Security Payload.

Section 1.1

   HIP DEX does not have the option to encrypt the Host Identity of the
   Initiator in the I2 packet.  The Responder's Host Identity also is
   not protected.  Thus, contrary to HIPv2, HIP DEX does not provide for
   end-point anonymity and any signaling (i.e., HOST_ID parameter
   contained with an ENCRYPTED parameter) that indicates such anonymity
   should be ignored.

I think s/should/must/ -- attempting to rely on such signaling has no
value.

   Finally, HIP DEX is designed as an end-to-end authentication and key
   establishment protocol.  As such, it can be used in combination with
   Encapsulated Security Payload (ESP) [RFC7402] as well as with other

(ESP again)

Section 1.2

   to be a recurring part of the protocol.  Further, for devices
   constrained in this manner, a FS-enabled protocol's cost will likely
   provide little gain.  Since the resulting "FS" key, likely produced
   during device deployment, would typically end up being used for the
   remainder of the device's lifetime.  Since this key (or the
   information needed to regenerate it) persists for the device's
   lifetime, the key step of 'throw away old keys' in achieving forward
   secrecy does not occur, thus the forward secrecy would not be
   obtained in practice.

I think the last two sentences are redundant; they look like editing
remnants where one (the latter?) was supposed to replace the other.

   try a DEX HIT.  Note that such a downgrade (from BEX to DEX) offer
   approach is open to attack, requiring additional mitigation (e.g.
   ACL controls).

I'd suggest s/open to attack/open to attack by interfering with the
initial BEX offer/

Section 1.2.1

   b.  Key generation

          1 Diffie-Hellman ephemeral keypair generation, and

          1 Diffie-Hellman shared secret generation.

I think I often see DH shared-secret computation classified as a "public
key operation", so perhaps the division between bullets should be
signature scheme vs key agreement.

Section 2.2

   FOLD (X, K)  denotes the partitioning of X into n K-bit segments and
      the iterative folding of these segments via XOR.  I.e., X = x_1,
      x_2, ..., x_n, where x_i is of length K and the last segment x_n
      is padded to length K by appending 0 bits.  FOLD then is computed

I suggest s/0 bits/bits with value 0/
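
To make the padding question concrete, here is a minimal sketch of the
folding as I read the quoted definition. The quote is cut off before the
actual computation, so this assumes FOLD is simply the XOR of all
segments; it also assumes K is a multiple of 8 for simplicity. The
function name is hypothetical.

```python
def fold(x: bytes, k_bits: int) -> bytes:
    """XOR-fold x into a single k_bits-wide value.

    The last segment is padded to length K by appending bits with
    value 0, as in the quoted text."""
    if k_bits <= 0 or k_bits % 8 != 0:
        raise ValueError("this sketch only handles K as a positive multiple of 8")
    seg_len = k_bits // 8
    # Append zero-valued bits so that len(x) is a multiple of the segment size.
    padded = x + b"\x00" * ((-len(x)) % seg_len)
    acc = bytes(seg_len)
    for off in range(0, len(padded), seg_len):
        segment = padded[off:off + seg_len]
        acc = bytes(a ^ b for a, b in zip(acc, segment))
    return acc
```

For example, folding the three octets 01 02 03 with K = 16 pads the
second segment to 03 00 and yields 02 02.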

Section 2.3

   CMAC:  The Cipher-based Message Authentication Code with the 128-bit
      Advanced Encryption Standard (AES) defined in [NIST.SP.800-38B].

I suggest
NEW:
   CMAC:  The Cipher-based Message Authentication Code.  In this
      document, CMAC is instantiated using the 128-bit
      Advanced Encryption Standard (AES) defined in [NIST.SP.800-38B].

Section 3

   A compressed encoding of the HI, the Host Identity Tag (HIT), is used

To me a bare "compressed" suggests "reversibly compressed", but the HIT
generation procedure is lossy.  Maybe "reduced encoding"?

   *  The DEX HIT is not generated via a cryptographic hash.  Rather, it
      is a compression of the HI.

(ditto)
Likewise in §3.1, etc.

Section 4.1

   a MAC.  The R2 repeats the lists from R1 for signed validation to
   defend them against a MITM attack.

DEX has no signatures, so maybe "authenticated validation"?


We should probably expand "Trans" to "transport format" in the legend of
Figure 1, since it's not otherwise covered until Section 5.3.3 or so.

Section 4.1.1

               After a successful puzzle verification, the Responder can
   securely create session-specific state and perform CPU-intensive
   operations such as a Diffie-Hellman key generation.  [...]

In DEX, neither party does DH keypair generation in band, since only
static ECDH shares are used.  Maybe talking about DH shared-secret
computation is better?

Section 4.1.2.1

                                 To this end, the Responder MAY notify
   the Initiator about the anticipated delay once the puzzle solution
   was successfully verified that the remaining I2 packet processing
   will incur a high processing delay.  [...]

Pick one of "about the anticipated delay" and "that the remaining I2
packet processing will incur a high processing delay"; having both is
redundant.

                                        The NOTIFICATION parameter
   contains the anticipated remaining processing time for the I2 packet
   in milliseconds as two-octet Notification Data.  [...]

Should we say "network byte order"?
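
If the answer is yes, the encoding would look like the sketch below. It
assumes network byte order (which is the open question, not something
the draft states) and uses hypothetical function names.

```python
import struct


def encode_remaining_delay(milliseconds: int) -> bytes:
    """Encode the anticipated remaining processing time as two-octet
    Notification Data, assuming network (big-endian) byte order."""
    if not 0 <= milliseconds <= 0xFFFF:
        raise ValueError("delay must fit in two octets")
    return struct.pack("!H", milliseconds)


def decode_remaining_delay(data: bytes) -> int:
    """Inverse of encode_remaining_delay under the same assumption."""
    if len(data) != 2:
        raise ValueError("Notification Data must be exactly two octets")
    return struct.unpack("!H", data)[0]
```

Without a stated byte order, a little-endian peer would read 1500 ms
(05 DC) as 56325 ms.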

Section 4.1.3.1

   The Master Key SA is used to authenticate HIP packets and to encrypt
   selected HIP parameters in the HIP DEX packet exchanges.  Since only
   a small amount of data is protected by this SA, it can be long-lived
   with no need for rekeying.  [...]

I suggest "and in many cases will have no need to rekey", since we go on
to talk about the need for rekeying if the ESP sequence counter would
overflow.

Section 5.2

   *  DH_GROUP_LIST and HOST_ID are restricted to ECC-based suites.

Is "suites" or "algorithms" more appropriate here?

Section 5.2.2

s/AES-129-CTR/AES-128-CTR/

Section 5.3.2

   The DH_GROUP_LIST parameter contains the Responder's order of
   preference based on the Responder's choice the ECDH key contained in
   the HOST_ID parameter (see below).  [...]

The grammar is not right in this sentence, maybe around "choice the ECDH
key".

Section 5.3.3

   If present in the R1 packet, the Initiator MUST include an unmodified
   copy of the R1_COUNTER parameter into the I2 packet.

It seems that RFC 7401 had the (nonsensical) "if present in the I1
packet", which probably merits an errata report.  (I did not submit one
myself so as to preserve credit for whoever actually noticed it first;
I just looked at the diff and it stuck out.)

   The Solution contains the Random #I from the R1 packet and the
   computed #J value.  The low-order #K bits of the RHASH(I | ... | J)
   MUST be zero.

It seems that to be consistent with the rest of the document and RFC
7401 we should capitalize as "SOLUTION".
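
As an aside, the "low-order #K bits MUST be zero" check itself can be
sketched as below. This assumes the RHASH output is interpreted as a
big-endian integer, and it deliberately takes the already-computed
digest as input rather than guessing at the elided RHASH input. The
function name is hypothetical.

```python
def low_order_bits_are_zero(digest: bytes, k: int) -> bool:
    """Return True if the k low-order bits of digest are all zero,
    treating digest as a big-endian integer (an assumption)."""
    if not 0 <= k <= len(digest) * 8:
        raise ValueError("k must be between 0 and the digest length in bits")
    value = int.from_bytes(digest, "big")
    mask = (1 << k) - 1
    return (value & mask) == 0
```

Note that K = 0 accepts any digest, which matches the "K SHOULD be 0
when not under high load" guidance making the puzzle trivial.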

   The TRANSPORT_FORMAT_LIST parameter contains the single transport
   format type selected by the Initiator.  The chosen type MUST
   correspond to one of the types offered by the Responder in the R1
   packet.  The different format types are DEFAULT, ESP and ESP-TCP as
   explained in Section 3.1 in [RFC6261].

It seems like we could use consistent phrasing for the format types in
§5.3.2 and §5.3.3.

Section 5.4

   of the packet, it MAY respond with an ICMP packet.  Any such reply

I think the "any such replies" formulation in RFC 7401 is correct.

Section 6.1

The RFC 7401 procedure is not particularly clear that the responder
rejects if the received #I is not a saved one (which is needed in order
for the puzzle mechanism to revert to a "cookie-based DoS protection
mechanism" as claimed in §4.1.1).  We might consider rectifying that
here.

Section 6.2

   4.  Compute the CMAC using either HIP-gl or HIP-lg integrity key as
       defined in Section 6.3 and verify it against the received CMAC.

Just writing "either" reads as if it's an open choice; we might rather
say "the appropriate choice of" to indicate that the choice is
constrained.

Section 6.6

   3.   If the HIP association state is I1-SENT or I2-SENT, the received
        Initiator's HIT MUST correspond to the HIT used in the original
        I1 packet.  Also, the Responder's HIT MUST correspond to the one
        used in the I1 packet, unless this packet contained a NULL HIT.

I think s/unless this packet/unless that packet/ makes more sense, as
"this packet" would be the R1 packet we're responding to.

   10.  The system attempts to solve the puzzle in the R1 packet.  The
        system MUST terminate the search after exceeding the remaining
        lifetime of the puzzle.  If the puzzle is not successfully
        solved, the implementation MAY either resend the I1 packet
        within the retry bounds or abandon the HIP base exchange.

s/base/DEX/

   11.  The system computes standard Diffie-Hellman keying material
        according to the public value and Group ID provided in the
        HOST_ID parameter.  The Diffie-Hellman keying material Kij is
        used for key extraction as specified in Section 6.3.

In my experience "shared secret" is a more common term than "keying
material" in this context.

Section 6.8

   2.  The system MUST verify that the HITs in use correspond to the
       HITs that were received in the R1 packet that caused the
       transition to the I2-SENT state.

It looks like RFC 7401 had "transition to the I1-SENT state", which
seems worthy of an errata report.