[lisp] Benjamin Kaduk's Discuss on draft-ietf-lisp-rfc6833bis-24: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Thu, 07 February 2019 13:50 UTC

From: Benjamin Kaduk <kaduk@mit.edu>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-lisp-rfc6833bis@ietf.org, Luigi Iannone <ggx@gigix.net>, lisp-chairs@ietf.org, ggx@gigix.net, lisp@ietf.org
Message-ID: <154954743968.23471.9935733647283605722.idtracker@ietfa.amsl.com>
Date: Thu, 07 Feb 2019 05:50:39 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/lisp/RXymAATzjiba-QvAtBe7L8xlRls>
Subject: [lisp] Benjamin Kaduk's Discuss on draft-ietf-lisp-rfc6833bis-24: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-lisp-rfc6833bis-24: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-lisp-rfc6833bis/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

This document has normative dependencies on other WG drafts that are not
yet mature (one could perhaps define this as having completed IETF LC).  In
particular, I believe there is a nontrivial chance that either or both of
lisp-sec and 6834bis could require changes to this document in order to be
fit for purpose, and thus that this document cannot safely be approved for
publication until these normative dependencies are closer to publication.
In particular, I have done a fairly full review of lisp-sec and have
DISCUSS-worthy points with it (I have not done much review of 6834bis yet).

This document includes a mechanism to use HMAC keyed by a pre-shared key
to authenticate messages (Map-Register and Map-Notify*); it is directly
using the long-term PSK as the HMAC key.  This is not really consistent
with current IETF best practices (e.g., BCP 107), which tend to not use the
long-term key directly for keying messages, but rather to incorporate some
form of key derivation step, to protect the long-term key from
cryptanalysis and reduce the need to track long-term per-key data usage
limits.  It is probably not feasible to directly require all LISP
implementations to switch keying strategy, but it seems quite advisable to
define new algorithm ID types that include a key derivation step before the
HMAC, and to begin efforts to convert the ecosystem to the more sustainable
cryptographic usage.  I would like to discuss what actions are reasonable
to take at this time, on this front.
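
To make the suggestion concrete, here is a minimal sketch of a key
derivation step placed in front of the HMAC.  Everything specific in it is
an assumption for illustration and is not taken from the draft: the use of
HKDF (RFC 5869) with SHA-256, the "lisp-example-mac" label, the one-byte
message type, and the truncation to 128 bits.

```python
import hashlib
import hmac


def hkdf_sha256(psk: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) extract-then-expand, built on HMAC-SHA-256."""
    # Extract: an empty salt is replaced by HashLen zero octets.
    prk = hmac.new(salt or b"\x00" * 32, psk, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) | info | i)
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def message_mac(psk: bytes, msg_type: int, message: bytes) -> bytes:
    """MAC a message with a key derived from the PSK, never the PSK itself."""
    # Hypothetical label: binds the derived key to a single message type.
    info = b"lisp-example-mac" + bytes([msg_type])
    key = hkdf_sha256(psk, salt=b"", info=info)
    # Truncation to 128 bits is an assumption mirroring HMAC-SHA-256-128.
    return hmac.new(key, message, hashlib.sha256).digest()[:16]
```

With a construction along these lines the long-term PSK is only ever an
input to the derivation, and each message type is authenticated under a
distinct key.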

As implied by my previous discuss ballot position, I think Section 5.4
should grow a statement (akin to the one added in Section 5.6) that the
"Record" format is also used in the "Map-Reply Record" field of the
Map-Request message, and that the field definitions are reused wholesale
for the Map-Register message.

In Section 5.6, this text seems internally inconsistent:

      can continue using an incrementing nonce.  If the the ETR cannot
      support saving the nonce, then when it restarts it MUST use a new
      authentication key to register to the mapping system.  A Map-
      Server MUST track and save in persistent storage the last nonce
      received for each ETR xTR-ID that registers to it.  If a Map-
      Register is received with a nonce value that is not greater than
      the saved nonce, it drops the Map-Register message and logs the
      fact a replay attack could have occurred.

In order for a new key to be useful as stated, the Map-Server must do the
nonce tracking per <xTR-ID, key> pair and not just per xTR-ID.
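
A small sketch of the distinction, assuming an in-memory table (the
persistent storage the draft requires is omitted) and hypothetical names
throughout.  State is indexed by the <xTR-ID, Key ID> pair, so an ETR that
lost its nonce and restarts with a new key is not rejected because of the
value saved under the old key.

```python
from typing import Dict, Tuple


class NonceTracker:
    """Tracks the last accepted nonce per (xTR-ID, Key ID) pair."""

    def __init__(self) -> None:
        # A real Map-Server would keep this table in persistent storage.
        self._last: Dict[Tuple[bytes, int], int] = {}

    def accept(self, xtr_id: bytes, key_id: int, nonce: int) -> bool:
        """Return True if the Map-Register should be processed."""
        slot = (xtr_id, key_id)
        last = self._last.get(slot)
        if last is not None and nonce <= last:
            # Not greater than the saved nonce: drop, possible replay.
            return False
        self._last[slot] = nonce
        return True
```

If the table were keyed on xTR-ID alone, the last call in a sequence such
as accept(x, 1, 1000), accept(x, 2, 1) would be dropped as a replay even
though the ETR followed the MUST about switching keys.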

Also, guidance is needed on what scope of uniqueness is needed for the Key
ID to function properly -- unique per Map-Server?  Per <Map-Server,xTR>
pair?  Per LISP domain?

Also in Section 5.6:

                                             Implementations of this
      specification MUST include support for either HMAC-SHA-1-96
      [RFC2404] and HMAC-SHA-256-128 [RFC4868] where the latter is
      RECOMMENDED.

I don't think this sort of "mandatory to choose" is BCP 201-compliant.

I think there needs to be more description of Site-ID usage and scoping in
order to be fully interoperable (more in the COMMENT section).

There are multiple places where we talk about message contents being copied
from a corresponding request (e.g., from Map-Request to Map-Notify); we
need to explicitly state that the authentication data is recomputed to
match, e.g., the new message type.  I've tried to note these occurrences in
the COMMENT section.
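
The following sketch shows why the recomputation matters.  It uses a toy
layout (one type octet, a 16-octet authentication field, then the body)
that is an assumption for illustration and not the wire format in the
draft; only the type codes 3 (Map-Register) and 4 (Map-Notify) are taken
from the LISP specifications.

```python
import hashlib
import hmac

MAP_REGISTER = 3
MAP_NOTIFY = 4
MAC_LEN = 16  # toy choice, mirrors a 128-bit truncated HMAC


def build_message(msg_type: int, body: bytes, key: bytes) -> bytes:
    """MAC the whole message with the authentication field zeroed."""
    unsigned = bytes([msg_type]) + bytes(MAC_LEN) + body
    mac = hmac.new(key, unsigned, hashlib.sha256).digest()[:MAC_LEN]
    return bytes([msg_type]) + mac + body


def verify_message(message: bytes, key: bytes) -> bool:
    msg_type = message[0]
    mac = message[1:1 + MAC_LEN]
    body = message[1 + MAC_LEN:]
    expected = build_message(msg_type, body, key)[1:1 + MAC_LEN]
    return hmac.compare_digest(mac, expected)


def notify_from_register(register: bytes, key: bytes) -> bytes:
    """Copy the body of a Map-Register, but recompute the MAC for type 4."""
    body = register[1 + MAC_LEN:]
    return build_message(MAP_NOTIFY, body, key)
```

In this toy layout, copying the authentication field verbatim and changing
only the type yields a message that fails verification, since the MAC
covers the type octet.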

The condition for Map-Notify-Ack terminating Map-Notify retransmission
seems incomplete (more in the COMMENT).

In Section 8.2:

                           A Map-Register message includes
   authentication data, so prior to sending a Map-Register message, the
   ETR and Map-Server SHOULD be configured with a shared secret or other
   relevant authentication information.  [...]

We require authentication for Map-Register and do not provide any
alternative mechanism for key distribution, so why is this only a SHOULD?

                                                        As developers
   and operators gain experience with the mapping system, additional,
   stronger security measures may be added to the registration process.

This text does not add confidence to the "proposed standard" label.

In Section 9:

   A complete LISP threat analysis can be found in [RFC7835].  In what

As I have stated previously, the threat analysis in RFC 7835 is not
complete and it should not be referred to as such.

   3.  LISP-SEC [I-D.ietf-lisp-sec] MUST be implemented.  Network
       operartors should carefully weight how the LISP-SEC threat model
       applies to their particular use case or deployment.  If they
       decide to ignore a particular recommendation, they should make
       sure the risk associated with the corresponding threats is well
       understood.

I'm concerned enough about the risk of having an "ITR requests lisp-sec but
ETR didn't use it" case that causes complete breakage, that I want to talk
about this a bit more.  We currently in this document say that lisp-sec is
mandatory to implement (which presumably covers at least ITRs, ETRs,
Map-Resolvers, and Map-Servers).  LISP-SEC itself says that "and ETR that
supports LISP-SEC MUST set the S bit in its Map-Register messages".  Is it
possible that an ETR might "implement" but then not "support" LISP-SEC?  If
so, then we should consider the possibility that we need an authenticated
signal (from the mapping system to the ITR) that downgrading from lisp-sec
is allowed.  There seem to be several possibilities for how one might
construct such a signal; two that came to mind would be (1) to define a
new ACT value for "repeat without lisp-sec" that could be returned as a
negative Map-Response directly from the mapping system wherever the mapping
system is able to discern that the ETR in question does not support
lisp-sec (I don't actually know if this could happen at Map-Resolver or
would need to be delayed until the final Map-Server) and (2) to have an
optional Map-Request field that the ETR is required to copy unchanged to
the Map-Reply; this could then include a message HMAC'd in the ITR-OTK that
indicates lisp-sec non-support and binds to the nonce in the request.
Whether these are workable ideas seems to depend on aspects of the mapping
system to which I cannot speak.

                                                     The LISP-SEC
   protocol defines a mechanism for providing origin authentication,
   integrity, anti-replay, protection, and prevention of 'man-in-the-
   middle' and 'prefix overclaiming' attacks on the Map-Request/Map-
   Reply exchange.  [...]

Does LISP-SEC actually provide any additional anti-replay protection not
present in the base protocol?  I do not remember any such additional
protection.

   A complete LISP threat analysis has been published in [RFC7835].
   Please refer to it for more detailed security related details.

(1) you already said that above, (2) it's still not complete.

Section 11 ("Changes since RFC 6833") is inaccurate (see COMMENT).  I did
not check whether it is complete, but someone needs to do so before final
publication.


The following items were present in my original DISCUSS position and still
have not been resolved.  Note that I copy below the previous ballot text
even for some issues that are described above already in different words.

A 64-bit nonce is used, apparently as a request/response correlator, but
the actual (cryptographic?) properties required from the nonce in the
protocol are not clearly covered.  In some cryptographic contexts a 64-bit
nonce may be too short; I do not believe that this is the case here, but
without a clear picture of what the requirements are it's hard to say for
sure.
[ed. there was some previous discussion about 24-bit nonces that has been
removed from the text, but the core question of what properties the nonce
is required to provide remains unaddressed in the document text.  There is
also a field called 'Nonce' that is used as a sequence number, the
requirements for which are partially described in the new text.]

The layout of the document is somewhat confusing, in a way that could
arguably lead to noninteroperable implementations.  For example, the
section on the Map-Register message format includes descriptions of the
fields in the records and locators therein, and the section on Map-Notify
reuses that portion of the structure, incorporating the field descriptions
by reference.  But the Map-Register section does not indicate that its
descriptions are to apply in both cases, leading to confusing text that
talks about values being set or cases that are not possible for a
Map-Register (i.e., the section nominally being described).  It would be
most clear to have a dedicated subsection for the portion of the
structure(s) that is being reused, which would allow for the per-field
descriptions to clearly indicate in which scope they are defined.  But the
more minimal change of just indicating that the primary definition will be
"dual use" would probably suffice as well.
The Map-Reply record/locator descriptions are reused similarly; I made a
comment on section 5.4 that lists a specific instance, though I believe the
phenomenon is more general.
[ed. this was partially addressed, but the request to examine all data
structure reuse (note that "for example" was used) was not heeded]

Similarly, there are many instances (some noted in my Comment) where a
bidirectional interaction between two xTRs is described, yet the peers are
identified as "ITR" and "ETR".  This is very confusing when the entity
named as "ITR" is described as performing ETR functionality, or vice versa;
pedagogically, it would be much better to use non-role-based names for the
entities while describing these exchanges.
[ed. there was some improvement here; I still note some potential sites for
confusion in the COMMENT]

While I see that there is an entire document dedicated to Map-Versioning
and thus we do not need to fully cover everything here, I think it is
critically important to be clear that there are consistency requirements
attached to map versions, as relating to the stability of membership of
RLOCs in a given record, etc.  (I cannot be very clear here since I am not
entirely confident of the details of the consistency requirements yet.)

I think we need greater clarity on the 'E' and 'M' bits in the ECM format;
more in the Comment section.
[ed. the reader will need to consult the original ballot's COMMENT section
and not the current one]

Section 8.1 says:
   o  A Negative Map-Reply, with action code of "Natively-Forward", from
      a Map-Server that is authoritative for an EID-Prefix that matches
      the requested EID but that does not have an actively registered,
      more-specific ID-prefix.
This document provides no mechanism to establish that a Map-Server is
authoritative for a given EID-Prefix, so this entire case is
non-actionable.
[ed. I think there may have been some previous discussion on this (e.g.,
that might render it moot) but couldn't find it quickly]

Section 8.2 says:
   An ETR publishes its EID-Prefixes on a Map-Server by sending LISP
   Map-Register messages.  A Map-Register message includes
   authentication data, so prior to sending a Map-Register message, the
   ETR and Map-Server SHOULD be configured with a shared secret or other
   relevant authentication information.
This cannot be a SHOULD if things are to work properly; it has to be MUST.

Section 8.2 also says:
                                                        As developers
   and operators gain experience with the mapping system, additional,
   stronger security measures may be added to the registration process.
This kind of language for forward-looking guidance indicates that the
current security properties are not well-understood by the authors and is
inconsistent with Proposed Standard status.

I think the MUST and SHOULD requirements for implementing cryptographic
primitives are generally swapped; the more-secure ones (e.g.,
HMAC-SHA-256-128) should be MUST, and the legacy algorithms needed for
compatibility with existing deployments would be SHOULD.

Section 9 currently states:
   [a]s noted in Section 8.2, a Map-Server SHOULD verify that all EID-
   Prefixes registered by an ETR match the configuration stored on the
   Map-Server.
I think we need a MUST-level requirement for verifying authorization for a
given EID-Prefix, with one way of satisfying the requirement being checking
configuration, but allowing for other means as well.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Abstract

   This document describes the Control-Plane and Mapping Service for the
   Locator/ID Separation Protocol (LISP), implemented by two new types
   of LISP-speaking devices -- the LISP Map-Resolver and LISP Map-Server

This is a -bis document; is "new" really appropriate?  (It also appears in
the Introduction, of course.)

Section 1

   LISP is not intended to address problems of connectivity and scaling
   on behalf of arbitrary communicating parties.  Relevant situations
   are described in the scoping section of the introduction to
   [I-D.ietf-lisp-rfc6830bis].

It looks like we inline that text into this document as Section 1.1, below;
perhaps this paragraph is no longer needed, then?

Section 2

I don't think the "In many IETF documents...near the beginning of their
document" needs to be included, as the genart reviewer noted.

Section 4

   A Map-Server is a device that publishes EID-Prefixes in a LISP
   mapping database on behalf of a set of ETRs.  When it receives a Map
   Request (typically from an ITR), it consults the mapping database to

nit: isn't it typically from a Map-Resolver (or other
mapping-system-internal entity)?  It's originally from an ITR, of course,
but the flow assumed by this document is described as
ITR->Map-Resolver->mapping-system-internals->Map-Server->ETR.

   Note that while it is conceivable that a Map-Resolver could cache
   responses to improve performance, issues surrounding cache management
   will need to be resolved so that doing so will be reliable and

nit: s/will/would/?

   practical.  As initially deployed, Map-Resolvers will operate only in
   a non-caching mode, decapsulating and forwarding Encapsulated Map
   Requests received from ITRs.  Any specification of caching
   functionality is out of scope for this document.

I think it's better to say something like "In this specification," rather
than "As initially deployed".

Also, I've confused myself a couple times from this -- it's only the
Map-Resolver that doesn't cache; the ITR is free to cache.  It might be
helpful to call out that distinction here.

Section 5

I still think it's needlessly confusing to duplicate the IP/UDP header
layout figure here, at least without a prefacing comment noting that this
is the standard IP+UDP header with the IP addresses replaced by RLOCs.
But this is a non-blocking comment and the authors have already replied, so
feel free to ignore.

   Implementations MUST be prepared to accept packets when either the
   source port or destination UDP port is set to 4342 due to NATs
   changing port number values.

It's not entirely clear to me what this requirement is saying.

Section 5.2

   A: This is an authoritative bit, which is set to 0 for UDP-based Map-
      Requests sent by an ITR.  It is set to 1 when an ITR wants the
      destination site to return the Map-Reply rather than the mapping
      database system returning a Map-Reply.

Given that we've already disclaimed caching in the mapping system, aren't
all responses supposed to come from the destination site rather than the
mapping system?  Searching the rest of the document for the string
"authoritative" suggests that this is perhaps intended to avoid proxying
behavior from terminating Map-Servers (but when an ETR requests proxying is
it still guaranteed to be able to generate its own Map-Replies?), in which
case this could probably be phrased better.

   P: This is the probe-bit, which indicates that a Map-Request SHOULD
      be treated as a Locator reachability probe.  The receiver SHOULD
      respond with a Map-Reply with the probe-bit set, indicating that
      the Map-Reply is a Locator reachability probe reply, with the
      nonce copied from the Map-Request. [...]

Why are these only SHOULD?

I still think it is needlessly confusing to have bit labels that differ
only by letter case.  While this may not be confusing for the authors,
there are plenty of other people who could potentially be confused by it.
(Also, why are there two bits 'R' next to a 'Rsvd' field that all have the
same "reserved" semantics?)

   L: This is the local-xtr bit.  It is used by an xTR in a LISP site to
      tell other xTRs in the same site that it is part of the RLOC-set
      for the LISP site.  The L-bit is set to 1 when the RLOC is the
      sender's IP address.

At this point in the document, we haven't seen anything to suggest that an
xTR is going to be sending Map-Requests to other xTRs in the same site; a
forward reference is probably in order.

   D: This is the dont-map-reply bit.  It is used in the SMR procedure
      described in Section 6.1.  When an xTR sends an SMR Map-Request
      message, it doesn't need a Map-Reply returned.  When this bit is
      set, the receiver of the Map-Request does not return a Map-Reply.

nit: I'd suggest consolidating the behavior description and leaving the
explanations all at the end, so "This is the dont-map-reply bit.  When this
bit is set, the receiver of the Map-Request does not return a Map-Reply.
It is used in the SMR procedure described in Section 6.1; when an xTR
sends an SMR Map-Request message, it doesn't need a Map-Reply returned."

      Support for processing multiple EIDs in a single Map-Request
      message will be specified in a future version of the protocol.

I would suggest not using the assertive future tense here, as we cannot
really bind our future actions to guarantee its truth.  Wordings like "may
be specified" or "is left for a future version" are alternative options.

   EID mask-len:  This is the mask length for the EID-Prefix.

In bits, right? ...

   EID-Prefix:  This prefix address length is 4 octets for an IPv4
      address family and 16 octets for an IPv6 address family when the

... then maybe we shouldn't switch to bytes for just one sentence (and go
back to bits later in the paragraph)?

      entry, the EID-Prefix is set to the destination IP address of the
      data packet, and the 'EID mask-len' is set to 32 or 128 for IPv4
      or IPv6, respectively.  When an xTR wants to query a site about

Is this really an xTR-specific action, or does it apply to any ITR
functionality?

            This allows the ETR that will receive this Map-Request to
      cache the data if it chooses to do so.

OTOH, this one does seem to require an xTR.

Section 5.3

                           For the initial case, the destination IP
   address used for the Map-Request is the data packet's destination
   address (i.e., the destination EID) that had a mapping cache lookup
   failure.  [...]

This seems like a type mismatch between RLOC/EID -- per the headers, the
destination address should be an RLOC, but we are forced to use an EID in
this case.  The disparity should probably be called out and explained,
e.g., clarify that it's okay to use an EID as destination inside ECM
encapsulation (and, apparently, if we believe Section 5.8, that it's
required to do so).

                                        A successful Map-Reply, which is
   one that has a nonce that matches an outstanding Map-Request nonce,
   will update the cached set of RLOCs associated with the EID-Prefix
   range.

nit: A negative Map-Reply will match the nonce.  Will it also update the
cached set?  Is it still considered to be "successful"?

                                          If the ITR erroneously
   provides no ITR-RLOC addresses, the Map-Replier MUST drop the Map-
   Request.

I see we talked about this last time around; did you want to add some text
about how, despite the protocol message not definitionally allowing for
this detection, in practice it is still possible?

   Request.  When an ETR configured to accept and verify such
   "piggybacked" mapping data receives such a Map-Request and it does

So, "(i.e., it is also an ITR)"?

                       If the ETR (when it is an xTR co-located as an
   ITR) has a Map-Cache entry that matches the "piggybacked" EID and the
   RLOC is in the Locator-Set for the entry, then it MAY send the

nit: "cached entry" would help clarify the prerequisites here.

   source.  If the RLOC is not in the Locator-Set, then the ETR MUST
   send the "verifying Map-Request" to the "piggybacked" EID.  [...]

"send ... to the [...] EID" seems like a type mismatch again, since we only
can send Map-Requests to RLOCs.

Section 5.4

      Map-Request.  See RLOC-probing Section 7.1 for more details.  When
      the probe-bit is set to 1 in a Map-Reply message, the A-bit in
      each EID-record included in the message MUST be set to 1.

Do we want to specify any special handling if that MUST is disobeyed?

   S: This is the Security bit.  When set to 1, the following
      authentication information will be appended to the end of the Map-
      Reply.  The details of signing a Map-Reply message can be found in
      [I-D.ietf-lisp-sec].

Please do not use the word "signing" here; it is a term of art that is not
appropriate to the actual operation performed.

   Record TTL:  This is the time in minutes the recipient of the Map-
      Reply will store the mapping.  If the TTL is 0, the entry MUST be

I think "can" is more appropriate than "will"; generally a local cache can
safely be invalidated at will.

   Locator Count:  This is the number of Locator entries.  A Locator

Please scope this to "in the given Record".

   EID mask-len:  This is the mask length for the EID-Prefix.

(in bits, right?)

   ACT:  This 3-bit field describes Negative Map-Reply actions.  In any
      other message type, these bits are set to 0 and ignored on
      receipt.  These bits are used only when the 'Locator Count' field
      is set to 0.  The action bits are encoded only in Map-Reply
      messages.  [...]

This is the section on Map-Reply messages; why are we talking about other
message types?  Also, do we want to mention that the possible values are
managed by IANA?

   A: The Authoritative bit, when set to 1, is always set to 1 by an
      ETR.  When a Map-Server is proxy Map-Replying for a LISP site, the
      Authoritative bit is set to 0.  This indicates to requesting ITRs
      that the Map-Reply was not originated by a LISP node managed at
      the site that owns the EID-Prefix.

nit: This text is needlessly confusing.  How about "The authoritative bit
can only be set to 1 by an ETR (and not a Map-Server generating Map-Reply
messages as a proxy).  If this bit is set to 0, that indicates ..."?

Section 5.5

Please provide a link/reference for Data-Probe on first usage.

   For each Map-Reply record, the list of Locators in a Locator-Set MUST
   appear in the same order for each ETR that originates a Map-Reply
   message.  The Locator-Set MUST be sorted in order of ascending IP
   address where an IPv4 locator address is considered numerically 'less
   than' an IPv6 locator address.

IIUC, there is no need for "MUST appear in the same order" if you also
mandate a specific sorting function.
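
As an illustration, a deterministic sort key already gives every ETR the
same order regardless of how its Locators happen to be stored.  This is a
sketch using textual IPv4/IPv6 addresses and hypothetical function names;
LCAF-encoded locators are not considered.

```python
import ipaddress
from typing import List, Tuple


def locator_sort_key(addr: str) -> Tuple[int, int]:
    ip = ipaddress.ip_address(addr)
    # Version first, so any IPv4 locator sorts before any IPv6 locator;
    # then ascending numeric address within each family.
    return (ip.version, int(ip))


def canonical_locator_set(locators: List[str]) -> List[str]:
    return sorted(locators, key=locator_sort_key)
```

For example, canonical_locator_set(["2001:db8::1", "192.0.2.10",
"192.0.2.9"]) and the same call on any permutation of that list both
return ["192.0.2.9", "192.0.2.10", "2001:db8::1"].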

Section 5.6

   P: This is the proxy Map-Reply bit.  When set to 1, an ETR sends a
      Map-Register message requesting the Map-Server to proxy a Map-
      Reply.  [...]

nit: "just one?"

Do you want to give a mnemonic for the 'I' bit?

The "Nonce" field is acting as a sequence number, not just as a number used
once.  I strongly suggest changing the name accordingly.

   Authentication Data Length:  This is the length in octets of the
      'Authentication Data' field that follows this field.  The length
      of the 'Authentication Data' field is dependent on the MAC
      algorithm used.  The length field allows a device that doesn't
      know the MAC algorithm to correctly parse the packet.

Why does a device that won't be able to validate the authentication data
need to be able to parse the packet?  I thought all Map-Registers needed to
be authenticated.

   xTR-ID:  xTR-ID is a 128 bit field at the end of the Map-Register
      message, starting after the final Record in the message.  The xTR-
      ID is used to uniquely identify a xTR.  The same xTR-ID value MUST
      NOT be used in two different xTRs.

Globally, over all time?  Within a single LISP domain, over all time?
Please be specific.

   Site-ID:  Site-ID is a 64 bit field at the end of the Map- Register
      message, following the xTR-ID.  Site-ID is used to uniquely
      identify to which site the xTR that sent the message belongs.

Where is a (LISP) "site" formally defined?  Are there weird topologies or
edge cases that we need to consider when assigning numbers, risk of having
two IDs that might validly apply to a single xTR, etc.?

Section 5.7

(If Nonce is renamed above, it should be renamed here as well.)

   The fields of the Map-Notify are copied from the corresponding Map-
   Register to acknowledge its correct processing.  [...]

Is the authentication data recomputed?

                                      The fields of the Map-Notify-Ack
   are copied from the corresponding Map-Notify message to acknowledge
   its correct processing.

(ditto)

   The Map-Notify-Ack message has the same contents as a Map-Notify
   message.  It is used to acknowledge the receipt of a Map-Notify
   (solicited or unsolicited) and for the sender to stop retransmitting

So a normal exchange would include Map-Register, Map-Notify, and
Map-Notify-Ack?

   A Map-Server sends an unsolicited Map-Notify message (one that is not
   used as an acknowledgment to a Map-Register message) that follows the
   Congestion Control And Relability Guideline sections of [RFC8085].  A

This second clause ("that follows") is rather a non sequitur here.  And we
still don't know what purpose the unsolicited Map-Notify serves!

   Map-Notify is retransmitted until a Map-Notify-Ack is received by the
   Map-Server with the same nonce used in the Map-Notify message.  If a

Presumably we care about (e.g.) the key ID matching and the authentication
data validating, as well?

   Map-Notify-Ack is never received by the Map-Server, it issues a log
   message.  An implementation SHOULD retransmit up to 3 times at 3
   second retransmission intervals, after which time the retransmission
   interval is exponentially backed-off for another 3 retransmission

"exponentially" is not well defined unless the base of the exponent is
specified.
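
To show how much the base matters, here is a sketch that computes the
retransmission intervals for a given base.  The reading that the first
backed-off interval equals the initial interval multiplied by the base is
itself an assumption; the quoted text does not say.

```python
from typing import List


def retransmit_intervals(base: float, initial: float = 3.0,
                         fixed: int = 3, backed_off: int = 3) -> List[float]:
    """Intervals in seconds: `fixed` constant ones, then `backed_off` growing ones."""
    intervals = [initial] * fixed
    for i in range(1, backed_off + 1):
        intervals.append(initial * base ** i)
    return intervals
```

Under that reading, a base of 2 gives intervals of 3, 3, 3, 6, 12, and 24
seconds (51 seconds in total), while a base of 3 gives 3, 3, 3, 9, 27, and
81 seconds (126 seconds in total), so two implementations could differ by
more than a factor of two in how long they keep retrying.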

   attempts.  After this time, an xTR can only get the RLOC-set change
   by later querying the mapping system or by RLOC-probing one of the
   RLOCs of the existing cached RLOC-set to get the new RLOC-set.

What RLOC-set change?  This text doesn't seem to indicate what
functionality is going on here.

Section 5.8

   An Encapsulated Control Message (ECM) is used to encapsulate control
   packets sent between xTRs and the mapping database system.

Some of the flag bit descriptions appear to describe usages that are or can
be entirely within the mapping system.

   D:    This is the DDT-bit.  When set to 1, the sender is requesting a
         Map-Referral message to be returned.  The details of this
         procedure are described in [RFC8111].

E.g., here, the sender can be (IIUC) within the mapping system.

   E:    This is the to-ETR bit.  When set to 1, the Map-Server's
         intention is to forward the ECM to an authoritative ETR.

I'm not sure that "intention" is quite right, here -- as far as this
document is concerned, a Map-Server will always know whether it is sending
an ECM to an authoritative ETR.  Also, this bit does not seem to be used
for anything within this document, and no external reference is given.

Are the 'M' and 'E' bits mutually exclusive?  (Would we even care?)

I suggest adding more text about which sender/receiver pairs are permitted
(or allowed or expected) to set the D, E, and M bits.

         invoking Map-Request.  Port number 4341 MUST NOT be assigned to
         either port.  The checksum field MUST be non-zero.

This is the only place in this document that we disallow port 4341.  Should
we also be disallowing it from being used as the non-4342 port for other
exchanges?

   LCM:  The format is one of the control message formats described in
         this section.  [...]

nit: "this section" means 5.8; presumably we mean Section 5.

Section 6.1

I agree with Warren that the direct usage of mapping information included
in an SMR presents a substantial attack surface, both for DoS and
potentially for redirecting traffic wholesale (whether for snooping
purposes or use as volumetric DoS to a third-party target).  There is some
discussion of the risks of spoofing with this sort of "gleaning" behavior,
but I strongly suggest mentioning something like "this technique presents a
risk of off-path spoofing; see Section 9 for details" at each such
non-validated scheme for learning mapping information.

   Since ETRs are not required to keep track of remote ITRs that have
   cached their mappings, they do not know which ITRs need to have their
   mappings updated.  As a result, an ETR will solicit Map-Requests
   (called an SMR message) to those sites to which it has been sending
   LISP encapsulated data packets for the last minute.  In particular,
   an ETR will send an SMR to an ITR to which it has recently sent
   encapsulated data.  This can only occur when both ITR and ETR
   functionality reside in the same router.

I still think that this text is needlessly confusing about which action is
taken by which router, and could be improved as, e.g., "this can only occur
when the ETR also provides ITR functionality (that is, it is an xTR)".

   Both the SMR sender and the Map-Request responder MUST rate-limit
   these messages.  Rate-limiting can be implemented as a global rate-
   limiter or one rate-limiter per SMR destination.

What is the goal of this rate-limiting; how is the threshold determined?
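
For concreteness, the two options named in the quoted text (one global
rate-limiter versus one per SMR destination) could be sketched as below.
This is only an illustration: the class names, the choice of a token
bucket, and the `rate`/`burst` parameters are assumptions of this sketch,
not anything the draft specifies.  The draft leaves those values open,
which is what prompts the question above.

```python
import time


class TokenBucket:
    """Allow roughly `rate` messages per second, with bursts up to `burst`."""

    def __init__(self, rate, burst, now=time.monotonic):
        self.rate = rate            # hypothetical threshold, not from the draft
        self.burst = burst          # hypothetical burst size, not from the draft
        self.tokens = float(burst)
        self.now = now
        self.last = now()

    def allow(self):
        t = self.now()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class SmrRateLimiter:
    """Either a single global bucket or one bucket per SMR destination."""

    def __init__(self, rate, burst, per_destination, now=time.monotonic):
        self.rate, self.burst, self.now = rate, burst, now
        self.per_destination = per_destination
        self.global_bucket = TokenBucket(rate, burst, now)
        self.buckets = {}

    def allow(self, destination):
        if not self.per_destination:
            return self.global_bucket.allow()
        bucket = self.buckets.get(destination)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.burst, self.now)
            self.buckets[destination] = bucket
        return bucket.allow()
```

Note that the two modes behave quite differently under load: with a global
limiter, a burst of SMRs toward one destination can starve all the others,
while the per-destination variant needs state that grows with the number
of destinations.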

   The following procedure shows how an SMR exchange occurs when a site
   is doing Locator-Set compaction for an EID-to-RLOC mapping:

Where is locator-set compaction defined?

Throughout this whole example, "the site with the changed mapping" and "the
site that sent the Map-Request" are kind of clunky phrases; it might be
cleaner writing to give them names (like "site A" and "site B").

   2.  A remote ITR that receives the SMR message will schedule sending
       a Map-Request message to the source locator address of the SMR
       message or to the mapping database system.  [...]

How does the ITR decide which destination to send the Map-Request to?

       copied from the SMR message.  If the source Locator is the only
       Locator in the cached Locator-Set, the remote ITR SHOULD send a

just to double-check: this is the source Locator from the SMR?

       Map-Request to the database mapping system just in case the
       single Locator has changed and may no longer be reachable to
       accept the Map-Request.

Is this the only case in which the Map-Request would go to the mapping
system?

   3.  The remote ITR MUST rate-limit the Map-Request until it gets a
       Map-Reply while continuing to use the cached mapping.  When

nit: I suggest a comma after "Map-Reply" to avoid the misparse that the
Map-Reply must be received while the cached mapping is in use (and that the
rate limiting would continue indefinitely if the cached mapping expired in
the meantime).

   5.  The ETRs at the site with the changed mapping record the fact
       that the site that sent the Map-Request has received the new
       mapping data in the Map-Cache entry for the remote site so the
       Locator-Status-Bits are reflective of the new mapping for packets
       going to the remote site.  [...]

The Locator-Status-Bits in which direction?  (Probably should also give a
section ref to 6830bis for the definition.)

   For security reasons, an ITR MUST NOT process unsolicited Map-
   Replies.  To avoid Map-Cache entry corruption by a third party, a
   sender of an SMR-based Map-Request MUST be verified.  If an ITR

To be clear, the verification here is essentially return-routability
verification, aka proof that the sender actually owns the claimed address,
right?  I think it is appropriate to have some text noting the specific
behavior, and that this is not any sort of cryptographic or strongly
authenticated verification.

   receives an SMR-based Map-Request and the source is not in the
   Locator-Set for the stored Map-Cache entry, then the responding Map-
   Request MUST be sent with an EID destination to the mapping database
   system.  [...]

What is an "SMR-based Map-Request" (also appears in the next paragraph and
one other place)?  Is it just an SMR?  If it's some actual Map-Request, I'm
confused at why an *I*TR would be receiving it.
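
To make the questions in this section concrete, here is one possible
reading of the quoted Section 6.1 text on where the Map-Request triggered
by an SMR gets sent.  It is a sketch under assumptions, not a statement of
what the draft intends: the function name and the string constants are
hypothetical, and the final branch is exactly the case where the text does
not say how the ITR should choose.

```python
MAPPING_SYSTEM = "mapping-system"
SMR_SOURCE = "smr-source"


def map_request_destination(smr_source, cached_locator_set):
    """Pick the destination for the Map-Request triggered by an SMR.

    `smr_source` is the source locator of the received SMR and
    `cached_locator_set` is the Locator-Set of the stored Map-Cache entry.
    """
    if smr_source not in cached_locator_set:
        # "MUST be sent with an EID destination to the mapping database
        # system": the SMR sender is not a locator we already know.
        return MAPPING_SYSTEM
    if len(cached_locator_set) == 1:
        # "SHOULD send a Map-Request to the database mapping system":
        # the only cached locator may itself have changed.
        return MAPPING_SYSTEM
    # The text allows either destination here and gives no criterion;
    # choosing the SMR source is an arbitrary assumption of this sketch.
    return SMR_SOURCE
```

If this reading is right, sending the Map-Request back to the SMR source
only demonstrates that something at that locator answers, which is why the
verification looks like return-routability rather than authentication.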

Section 7

   3.  An ITR may receive an ICMP Port Unreachable message from a
       destination host.  This occurs if an ITR attempts to use
       interworking [RFC6832] and LISP-encapsulated data is sent to a
       non-LISP-capable site.

Is the ITR supposed to conclude that the RLOC is likely down in this
situation?

   When ITRs receive ICMP Network Unreachable or Host Unreachable
   messages as a method to determine unreachability, they will refrain
   from using Locators that are described in Locator lists of Map-
   Replies.  [...]

Is this really as precise as it can be?  It kind of sounds like it says
that all Map-Replies will be ignored when any ICMP Network/Host Unreachable
message is received.

            If it does not find one and BGP is running in the Default-
   Free Zone (DFZ), it can decide to not use the Locator even though the

Is running in the DFZ consistent with the reduced scope of running in a
single administrative domain?

   Optionally, an ITR can send a Map-Request to a Locator, and if a Map-
   Reply is returned, reachability of the Locator has been determined.

Is this describing the same flow as item (5) above and Section 7.1 below?
If so, it seems totally redundant and could be omitted.

   Obviously, sending such probes increases the number of control
   messages originated by Tunnel Routers for active flows, so Locators
   are assumed to be reachable when they are advertised.

I'm not sure what "advertised" is intended to mean, here.  Is it
"advertised into the mapping system"?  But that is not directly visible to
the ITR, only indirectly through the results of an actual mapping request
(and even then, the Map-Reply from an ETR could be invalid, e.g.,
overclaiming, unless LISP-SEC is used.

                               Both Requests and Replies MUST be rate-
   limited.  [...]

I believe this requirement duplicates requirements already made elsewhere;
the other locations also include more guidance on actual rates.

Section 7.1

                                                   A Map-Request used as
   an RLOC-probe is NOT encapsulated and NOT sent to a Map-Server or to
   the mapping database system as one would when soliciting mapping
   data.  [..]

I strongly suggest using a word other than "soliciting" to avoid confusion
with SMR.

   data.  The EID record encoded in the Map-Request is the EID-Prefix of
   the Map-Cache entry cached by the ITR or PITR.  The ITR MAY include a

Is it worth reminding the reader that the source EID here is zero-length
and source-EID-AFI set to zero?

   mapping data record for its own database mapping information that
   contains the local EID-Prefixes and RLOCs for its site.  [...]

To double-check: this mapping data record is included in the "Map-Reply
Record" field of the Map-Request message?  It would probably help the
reader to be consistent about this terminology.

Section 8.1

                                                  In particular, the ITR
   need not connect to the LISP-ALT infrastructure or implement the BGP
   and GRE protocols that it uses.

Why does LISP-ALT get a callout but not (e.g.) LISP-DDT?

Section 8.3

   In response to a Map-Request (received over the ALT if LISP-ALT is in
   use), the Map-Server first checks to see if the destination EID

I see no reason to mention LISP-ALT here.

                                      If the EID-prefix is registered or
   not registered and there is a authentication failure, then a Drop/

What is the authentication flow that would be failing here?  The
Map-Register for the corresponding prefix?

                                                If either of these
   actions result as a temporary state in policy or authentication then
   a Send-Map-Request action with 1-minute TTL MAY be returned to allow
   the requestor to retry the Map-Request.

How can an SMR have an associated TTL?  The message format is that of a
regular Map-Request, is it not?

Section 8.4

   Upon receipt of an Encapsulated Map-Request, a Map-Resolver
   decapsulates the enclosed message and then searches for the requested
   EID in its local database of mapping entries (statically configured
   or learned from associated ETRs if the Map-Resolver is also a Map-
   Server offering proxy reply service).

This seems to be the first time the document admits the possibility for a
local database of mapping entries on a Map-Resolver; this makes me wonder
if there was an incomplete removal of such functionality from the document,
especially given that local caching of responses on the Map-Resolver is
explicitly disclaimed in Section 4.

Section 8.4.1

   ETRs MAY have anycast RLOC addresses which are registered as part of
   their RLOC-set to the mapping system.  However, registrations MUST
   use their unique RLOC addresses or distinct authentication keys to
   identify security associations with the Map-Servers.

xTR-IDs cannot be used for this purpose?

Section 9

I think we should have some discussion here about how mapping information
gleaned from SMR messages does not necessarily benefit from the on-path
guarantee that the nonce provides for regular mapping-system exchanges.

   The 2-way LISP control-plane header nonce exchange can be used to
   avoid ITR spoofing attacks, but active on-path attackers (e.g 'man-

Do we really need to limit ourselves to "ITR spoofing" as opposed to
generic spoofing, here?

           The Map-Register message is vulnerable to replay attacks by a
   man-in-the-middle.  A compromised ETR can overclaim the prefix it
   owns and successfully register it on its corresponding Map-Server.
   To mitigate this and as noted in Section 8.2, a Map-Server SHOULD
   verify that all EID-Prefixes registered by an ETR match the
   configuration stored on the Map-Server.

The conversion of the Map-Register 'Nonce' field into a sequence number
provides some moderate remediation against the replay attack; that should
be included in this discussion.
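
As a minimal sketch of the kind of check meant here, assume the Map-Server
remembers the highest nonce it has accepted from each registering ETR and
refuses anything that does not advance it.  The class name, the `etr_id`
key, and the idea of keying state per ETR are assumptions made for
illustration; the draft may intend different bookkeeping.

```python
class MapRegisterReplayFilter:
    """Reject Map-Registers whose nonce does not exceed the last one seen.

    Assumes the message's authentication data has already been verified;
    this check only addresses replay of otherwise valid messages.
    """

    def __init__(self):
        # Hypothetical per-ETR state: identifier -> highest accepted nonce.
        self.last_nonce = {}

    def accept(self, etr_id, nonce):
        last = self.last_nonce.get(etr_id)
        if last is not None and nonce <= last:
            # Same or older sequence number: treat as a replay.
            return False
        self.last_nonce[etr_id] = nonce
        return True
```

The protection is only "moderate" because it depends on the Map-Server
keeping this state: if the state is lost (e.g., on restart), a previously
captured message would be accepted again.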

   Encrypting control messages via DTLS [RFC6347] or LISP-crypto
   [RFC8061] SHOULD be used to support privacy to prevent eavesdroping
   and packet tampering for messages exchanged between xTRs, xTRs and
   the mapping system, and nodes that make up the mapping system.

nit: "to support privacy to prevent eavesdropping and packet tampering"
doesn't read as grammatical; is the "to support privacy" still needed?

Section 10

Thank you for adding the Privacy Considerations section; it is important to
document this property of the system and let the operator make informed
decisions.

   As noted by [RFC6973] privacy is a complex issue that greatly depends
   on the specific protocol use-case and deployment.  As noted in
   section 1.1 of [I-D.ietf-lisp-rfc6830bis] LISP focuses on use-cases

Also Section 1.1 of this document.

Section 11

   o  The "m", "I", "L", and "D" bits are added to the Map-Request
      message.  See Section 5.3 for details.

Isn't this more a Section 5.2 thing than 5.3?  Also, I don't see "m" or "I"
bits described (though I do see "M").

   o  The "S", "I", "E", "T", "a", and "m" bits are added to the Map-
      Register message.  See Section 5.6 for details.

I see an "M" bit but not an "m" one.

Section 12.3

It feels a little weird to lump the ACT fields (which have a registry)
together in a section with the flag fields scattered throughout the
protocol (which do not).  Is it bad to have separate subsections for them
(especially when Section 12.6 already exists and does provide a registry
for other flag bits)?

Section 12.6

                        A sub-registry needs to be created per each
   message and record.  [...]

What is a "record" in this context?  (It does not seem like a mapping
record.)

I mostly expect IANA to ask for a listing of which bits/ranges are/aren't
allocatable.