[lisp] Benjamin Kaduk's Discuss on draft-ietf-lisp-rfc6833bis-25: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Fri, 30 August 2019 18:54 UTC

Benjamin Kaduk has entered the following ballot position for
draft-ietf-lisp-rfc6833bis-25: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-lisp-rfc6833bis/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Updating for the -25 by removing points that are fully addressed but leaving
points that I still want to discuss further.  It may be most expedient to
continue discussion on my -24 ballot thread.  There are a couple of items new
in the -25, which I attempt to call out as such (they appear right before the
"the following items were present in my original DISCUSS position" section).

Please also note that the COMMENT section was entirely refreshed for the -24,
and I make some additions for the -25.

This document has normative dependencies on other WG drafts that are not
yet mature (one could perhaps define this as having completed IETF LC).  In
particular, I believe there is a nontrivial chance that either or both of
lisp-sec and 6834bis could require changes to this document in order to be
fit for purpose, and thus that this document cannot safely be approved for
publication until these normative dependencies are closer to publication.
In particular, I have done a fairly full review of lisp-sec and have
DISCUSS-worthy points with it (I have not done much review of 6834bis yet).


Also in Section 5.6:

                                             Implementations of this
      specification MUST include support for either HMAC-SHA-1-96
      [RFC2404] and HMAC-SHA-256-128 [RFC4868] where the latter is
      RECOMMENDED.

I don't think this sort of "mandatory to choose" is BCP 201-compliant,
since we need to have at least one strong mandatory-to-implement (MTI)
algorithm, and this text did not pick one to be MTI.  Now (-25) we're at
"SHOULD include support for HMAC-SHA256-128+HKDF-SHA256", which is also not
quite MTI (but is definitely strong).  Of course, I personally
won't complain if we just go with the new HKDF stuff, but I recognize that
it would be a big change for implementations and deployments, and don't
think we need to make the spec completely disjoint from reality just to
check a box.  So we could make HMAC-SHA-256-128 MTI and leave the new
one as SHOULD, for example.

I think there needs to be more description of Site-ID usage and scoping in
order to be fully interoperable (more in the COMMENT section).
[ed. Even focusing on the scoping while leaving the detailed usage as
deployment-specific would be okay]

There are multiple places where we talk about message contents being copied
from a corresponding request (e.g., from Map-Request to Map-Notify); we
need to explicitly state that the authentication data is recomputed to
match, e.g., the new message type.  I've tried to note these occurrences in
the COMMENT section.
[ed. I think just from Map-Notify to Map-Notify-Ack is all that's left]

The condition for Map-Notify-Ack terminating Map-Notify retransmission
seems incomplete.  Specifically, we should only accept the
Map-Notify-Ack to stop retransmission if the authentication data validates
(and maybe that it uses the same Key-ID as the Map-Notify, though that
might be overkill).  So just "a Map-Notify-Ack is received by the
Map-Server with the same nonce" is not quite enough.
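
To make the stricter condition concrete, here is a minimal Python sketch.
It assumes HMAC-SHA-256 truncated to 128 bits as the MAC, and it invents a
PendingNotify record and the function names purely for illustration; the
way the authenticated bytes are passed in is likewise an assumption and not
the draft's wire format.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass
class PendingNotify:
    """Hypothetical state a Map-Server keeps for an unacknowledged Map-Notify."""
    nonce: bytes
    key_id: int
    key: bytes


def mac_128(key: bytes, message: bytes) -> bytes:
    # HMAC-SHA-256 truncated to the leftmost 128 bits (16 octets).
    return hmac.new(key, message, hashlib.sha256).digest()[:16]


def ack_stops_retransmission(pending, ack_nonce, ack_key_id, ack_body, ack_mac):
    """Return True only if the Map-Notify-Ack should end retransmission.

    ack_body is assumed to be the ack with its Authentication Data zeroed.
    """
    if ack_nonce != pending.nonce:
        return False
    if ack_key_id != pending.key_id:  # possibly overkill, as noted above
        return False
    expected = mac_128(pending.key, ack_body)
    # Constant-time comparison of the recomputed and received MACs.
    return hmac.compare_digest(expected, ack_mac)
```

The point of the sketch is only that a matching nonce is the first of
several checks, and that the MAC is recomputed over the ack itself rather
than copied from the Map-Notify.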

                                                     The LISP-SEC
   protocol defines a mechanism for providing origin authentication,
   integrity, anti-replay, protection, and prevention of 'man-in-the-
   middle' and 'prefix overclaiming' attacks on the Map-Request/Map-
   Reply exchange.  [...]

Does LISP-SEC actually provide any additional anti-replay protection not
present in the base protocol?  I do not remember any such additional
protection.
[ed. specifically, the nonce mechanism already in this document provides
a decent level of replay protection, so I am trying to nail down how
LISP-SEC does incrementally better than what's already here, for the
specific case of an attacker literally recording a Map-Reply and replaying
it, bit-for-bit, at a later time.]

Section 11 ("Changes since RFC 6833") is inaccurate (see COMMENT).  I did
not check whether it is complete, but someone needs to do so before final
publication.
[ed. Waiting to do this until all other changes are in is fine.]


New in the -25, there's an internal inconsistency between Section 5.6's
description of the Authentication Data procedure, which says implementations
"SHOULD include support for HMAC-SHA256-128+HKDF-SHA256", and Section 9's
"[a]n implementation MUST support HMAC-SHA256-128+HKDF-SHA256".

Not new in the -25, but IIRC not previously discussed, how does a
Map-Server pick a Nonce value for unsolicited Map-Notify messages?


The following items were present in my original DISCUSS position and still
have not been resolved.  Note that I copy below the previous ballot text
even for some issues that are described above already in different words.

I think we need greater clarity on the 'E' and 'M' bits in the ECM format;
more in the Comment section [of the ballot on -16], quoted here for clarity:
>   E:    This is the to-ETR bit.  When set to 1, the Map-Server's
>         intention is to forward the ECM to an authoritative ETR.
>
> I think this needs to say more about which message flows this bit is
> defined for.  Presumably the ITR will never use it for sending an
> encapsulated Map-Request to a Map-Resolver, but there seem to be plenty of
> places where ECM wrapping is used.
>
>   M:    This is the to-MS bit.  When set to 1, a Map-Request is being
>         sent to a co-located Map-Resolver and Map-Server where the
>         message can be processed directly by the Map-Server versus the
>         Map-Resolver using the LISP-DDT procedures in [RFC8111].
>
> How does the sender know that its configured Map-Resolver is also a
> Map-Server?  It's unclear to me why this needs a bit in the message as
> opposed to just happening based on the attributes of the receiving
> Map-Server.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

[moving from DISCUSS to COMMENT based on explanation of this usage
being a common historical usage in the LISP community.  It still seems
needlessly ambiguous and confusing to me, though.]
There are many instances (some noted in my Comment) where a
bidirectional interaction between two xTRs is described, yet the peers are
identified as "ITR" and "ETR".  This is very confusing when the entity
named as "ITR" is described as performing ETR functionality, or vice versa;
pedagogically, it would be much better to use non-role-based names for the
entities while describing these exchanges.


Abstract

   This document describes the Control-Plane and Mapping Service for the
   Locator/ID Separation Protocol (LISP), implemented by two new types
   of LISP-speaking devices -- the LISP Map-Resolver and LISP Map-Server

This is a -bis document; is "new" really appropriate?  (It also appears in
the Introduction, of course.)

Section 1

   LISP is not intended to address problems of connectivity and scaling
   on behalf of arbitrary communicating parties.  Relevant situations
   are described in the scoping section of the introduction to
   [I-D.ietf-lisp-rfc6830bis].

It looks like we inline that text into this document as Section 1.1, below;
perhaps this paragraph is no longer needed, then?

Section 4

   A Map-Server is a device that publishes EID-Prefixes in a LISP
   mapping database on behalf of a set of ETRs.  When it receives a Map
   Request (typically from an ITR), it consults the mapping database to

nit: isn't it typically from a Map-Resolver (or other
mapping-system-internal entity)?  It's originally from an ITR, of course,
but the flow assumed by this document is described as
ITR->Map-Resolver->mapping-system-internals->Map-Server->ETR.

   Note that while it is conceivable that a Map-Resolver could cache
   responses to improve performance, issues surrounding cache management
   will need to be resolved so that doing so will be reliable and

nit: s/will/would/?

   practical.  As initially deployed, Map-Resolvers will operate only in
   a non-caching mode, decapsulating and forwarding Encapsulated Map
   Requests received from ITRs.  Any specification of caching
   functionality is out of scope for this document.

I think it's better to say something like "In this specification," rather
than "As initially deployed".

Also, I've confused myself a couple of times over this -- it's only the
Map-Resolver that doesn't cache; the ITR is free to cache.  It might be
helpful to call out that distinction here.  Interestingly, in Section 1 we made
an analogy from DNS *caching* resolvers to Map-Resolvers; would an
analogy from DNS *recursive* resolvers have been more apt?

Section 5

I still think it's needlessly confusing to duplicate the IP/UDP header
layout figure here, at least without a prefacing comment noting that this
is the standard IP+UDP header with the IP addresses replaced by RLOCs.
But this is a non-blocking comment and the authors have already replied, so
feel free to ignore.

   Implementations MUST be prepared to accept packets when either the
   source port or destination UDP port is set to 4342 due to NATs
   changing port number values.

It's not entirely clear to me what this requirement is saying.
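
One possible reading, sketched in Python, is that a receiver accepts a
control packet as long as at least one of the two UDP ports is 4342.  This
is only a guess at the intent; the function name is made up for
illustration, and the draft may mean something narrower.

```python
LISP_CONTROL_PORT = 4342


def accept_control_packet(src_port: int, dst_port: int) -> bool:
    # Accept if either UDP port is the LISP control port; under this reading
    # a NAT may have rewritten the other one.
    return LISP_CONTROL_PORT in (src_port, dst_port)
```

If that is the intended behavior, saying so directly in the text would
remove the ambiguity.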

Section 5.2

   A: This is an authoritative bit, which is set to 0 for UDP-based Map-
      Requests sent by an ITR.  It is set to 1 when an ITR wants the
      destination site to return the Map-Reply rather than the mapping
      database system returning a Map-Reply.

Given that we've already disclaimed caching in the mapping system, aren't
all responses supposed to come from the destination site rather than the
mapping system?  Searching the rest of the document for the string
"authoritative" suggests that this is perhaps intended to avoid proxying
behavior from terminating Map-Servers (but when an ETR requests proxying is
it still guaranteed to be able to generate its own Map-Replies?), in which
case this could probably be phrased better.

   P: This is the probe-bit, which indicates that a Map-Request SHOULD
      be treated as a Locator reachability probe.  The receiver SHOULD
      respond with a Map-Reply with the probe-bit set, indicating that
      the Map-Reply is a Locator reachability probe reply, with the
      nonce copied from the Map-Request. [...]

Why are these only SHOULD?

I still think it is needlessly confusing to have bit labels that differ
only by letter case.  While this may not be confusing for the authors,
there are plenty of other people who could potentially be confused by it.
(Also, why are there two bits 'R' next to a 'Rsvd' field that all have the
same "reserved" semantics?)

   L: This is the local-xtr bit.  It is used by an xTR in a LISP site to
      tell other xTRs in the same site that it is part of the RLOC-set
      for the LISP site.  The L-bit is set to 1 when the RLOC is the
      sender's IP address.

At this point in the document, we haven't seen anything to suggest that an
xTR is going to be sending Map-Requests to other xTRs in the same site; a
forward reference is probably in order.

   D: This is the dont-map-reply bit.  It is used in the SMR procedure
      described in Section 6.1.  When an xTR sends an SMR Map-Request
      message, it doesn't need a Map-Reply returned.  When this bit is
      set, the receiver of the Map-Request does not return a Map-Reply.

nit: I'd suggest consolidating the behavior description and leaving the
explanations all at the end, so "This is the dont-map-reply bit.  When this
bit is set, the receiver of the Map-Request does not return a Map-Reply.
It is used in the SMR procedure described in Section 6.1; when an xTR
sends an SMR Map-Request message, it doesn't need a Map-Reply returned."

   Nonce:  This is an 8-octet random value created by the sender of the
      Map-Request.  This nonce will be returned in the Map-Reply.  The
      nonce is used as an index to identify the corresponding Map-
      Request when a Map-Reply message is received.  The nonce MUST be
      generated by a properly seeded pseudo-random (or strong random)
      source.  See [RFC4086] for advice on generating security-sensitive
      random data.

Since this value is just serving as a "number used once" (i.e., to match
replies to requests), this is a stronger requirement than we need -- it
merely needs to be unique per ITR.
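
The difference between the two requirements can be sketched in Python as
follows.  Both helpers are hypothetical names for illustration: the first
draws 8 octets from the operating system's CSPRNG, in the spirit of the
quoted MUST, while the second only guarantees per-sender uniqueness (and is
predictable).

```python
import itertools
import secrets


def random_nonce() -> bytes:
    # 8 octets from the OS CSPRNG.
    return secrets.token_bytes(8)


class CounterNonce:
    """Unique-per-sender nonces from a counter; unique but predictable."""

    def __init__(self, start: int = 0):
        self._counter = itertools.count(start)

    def next(self) -> bytes:
        # Values repeat only after 2**64 nonces have been issued.
        return (next(self._counter) % 2**64).to_bytes(8, "big")
```

Either approach is enough to match a Map-Reply to its Map-Request; whether
the unpredictability of the first is needed is a separate question from the
matching function described in the quoted text.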

   EID mask-len:  This is the mask length for the EID-Prefix.

In bits, right? ...

   EID-Prefix:  This prefix address length is 4 octets for an IPv4
      address family and 16 octets for an IPv6 address family when the

... then maybe we shouldn't switch to bytes for just one sentence (and go
back to bits later in the paragraph)?

      entry, the EID-Prefix is set to the destination IP address of the
      data packet, and the 'EID mask-len' is set to 32 or 128 for IPv4
      or IPv6, respectively.  When an xTR wants to query a site about

Is this really an xTR-specific action, or does it apply to any ITR
functionality?

            This allows the ETR that will receive this Map-Request to
      cache the data if it chooses to do so.

OTOH, this one does seem to require an xTR.

Section 5.3

                           For the initial case, the destination IP
   address used for the Map-Request is the data packet's destination
   address (i.e., the destination EID) that had a mapping cache lookup
   failure.  [...]

This seems like a type mismatch between RLOC/EID -- per the headers, the
destination address should be an RLOC, but we are forced to use an EID in
this case.  The disparity should probably be called out and explained,
e.g., clarify that it's okay to use an EID as destination inside ECM
encapsulation (and, apparently, if we believe Section 5.8, that it's
required to do so).

                                        A successful Map-Reply, which is
   one that has a nonce that matches an outstanding Map-Request nonce,
   will update the cached set of RLOCs associated with the EID-Prefix
   range.

nit: A negative Map-Reply will match the nonce.  Will it also update the
cached set?  Is it still considered to be "successful"?

                                          If the ITR erroneously
   provides no ITR-RLOC addresses, the Map-Replier MUST drop the Map-
   Request.

I see we talked about this last time around; did you want to add some text
about how, despite the protocol message not definitionally allowing for
this detection, in practice it is still possible?

   Request.  When an ETR configured to accept and verify such
   "piggybacked" mapping data receives such a Map-Request and it does

So, "(i.e., it is also an ITR)"?

                       If the ETR (when it is an xTR co-located as an
   ITR) has a Map-Cache entry that matches the "piggybacked" EID and the
   RLOC is in the Locator-Set for the entry, then it MAY send the

nit: "cached entry" would help clarify the prerequisites here.

   source.  If the RLOC is not in the Locator-Set, then the ETR MUST
   send the "verifying Map-Request" to the "piggybacked" EID.  [...]

"send ... to the [...] EID" seems like a type mismatch again, since we only
can send Map-Requests to RLOCs.

Section 5.4

      Map-Request.  See RLOC-probing Section 7.1 for more details.  When
      the probe-bit is set to 1 in a Map-Reply message, the A-bit in
      each EID-record included in the message MUST be set to 1.

Do we want to specify any special handling if that MUST is disobeyed?

   S: This is the Security bit.  When set to 1, the following
      authentication information will be appended to the end of the Map-
      Reply.  The details of signing a Map-Reply message can be found in
      [I-D.ietf-lisp-sec].

Please do not use the word "signing" here; it is a term of art that is not
appropriate to the actual operation performed.

   Record TTL:  This is the time in minutes the recipient of the Map-
      Reply will store the mapping.  If the TTL is 0, the entry MUST be

I think "can" is more appropriate than "will"; generally a local cache can
safely be invalidated at will.

   Locator Count:  This is the number of Locator entries.  A Locator

Please scope this to "in the given Record".

   EID mask-len:  This is the mask length for the EID-Prefix.

(in bits, right?)

   ACT:  This 3-bit field describes Negative Map-Reply actions.  In any
      other message type, these bits are set to 0 and ignored on
      receipt.  These bits are used only when the 'Locator Count' field
      is set to 0.  The action bits are encoded only in Map-Reply
      messages.  [...]

This is the section on Map-Reply messages; why are we talking about other
message types?  Also, do we want to mention that the possible values are
managed by IANA?

   A: The Authoritative bit, when set to 1, is always set to 1 by an
      ETR.  When a Map-Server is proxy Map-Replying for a LISP site, the
      Authoritative bit is set to 0.  This indicates to requesting ITRs
      that the Map-Reply was not originated by a LISP node managed at
      the site that owns the EID-Prefix.

nit: This text is needlessly confusing.  How about "The authoritative bit
can only be set to 1 by an ETR (and not a Map-Server generating Map-Reply
messages as a proxy).  If this bit is set to 0, that indicates ..."?

Section 5.5

Please provide a link/reference for Data-Probe on first usage.

   For each Map-Reply record, the list of Locators in a Locator-Set MUST
   appear in the same order for each ETR that originates a Map-Reply
   message.  The Locator-Set MUST be sorted in order of ascending IP
   address where an IPv4 locator address is considered numerically 'less
   than' an IPv6 locator address.

IIUC, there is no need for "MUST appear in the same order" if you also
mandate a specific sorting function.
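
As an illustration of why the sorting rule alone suffices, here is a small
Python sketch of the quoted ordering (ascending address, with any IPv4
locator sorting before any IPv6 locator).  The function name and the use of
textual addresses are assumptions for the example.

```python
import ipaddress


def sort_locators(locators):
    """Sort RLOC strings ascending, with every IPv4 address before any IPv6."""
    parsed = [ipaddress.ip_address(loc) for loc in locators]
    # Key on (version, numeric value): version 4 sorts before version 6.
    ordered = sorted(parsed, key=lambda a: (a.version, int(a)))
    return [str(a) for a in ordered]
```

Because the key is a total order over addresses, any two ETRs applying it
to the same Locator-Set produce the same sequence, regardless of the order
in which the locators were configured.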

Section 5.6

   P: This is the proxy Map-Reply bit.  When set to 1, an ETR sends a
      Map-Register message requesting the Map-Server to proxy a Map-
      Reply.  [...]

nit: "just one?"

Do you want to give a mnemonic for the 'I' bit?

The "Nonce" field is acting as a sequence number, not just as a number used
once.  I strongly suggest changing the name accordingly.

   Authentication Data Length:  This is the length in octets of the
      'Authentication Data' field that follows this field.  The length
      of the 'Authentication Data' field is dependent on the MAC
      algorithm used.  The length field allows a device that doesn't
      know the MAC algorithm to correctly parse the packet.

Why does a device that won't be able to validate the authentication data
need to be able to parse the packet?  I thought all Map-Registers needed to
be authenticated.

   xTR-ID:  xTR-ID is a 128 bit field at the end of the Map-Register
      message, starting after the final Record in the message.  The xTR-
      ID is used to uniquely identify a xTR.  The same xTR-ID value MUST
      NOT be used in two different xTRs.

Globally, over all time?  Within a single LISP domain, over all time?
Please be specific.

   Site-ID:  Site-ID is a 64 bit field at the end of the Map- Register
      message, following the xTR-ID.  Site-ID is used to uniquely
      identify to which site the xTR that sent the message belongs.

Where is a (LISP) "site" formally defined?  Are there weird topologies or
edge cases that we need to consider when assigning numbers, risk of having
two IDs that might validly apply to a single xTR, etc.?

Section 5.7

(If Nonce is renamed above, it should be renamed here as well.)

                                      The fields of the Map-Notify-Ack
   are copied from the corresponding Map-Notify message to acknowledge
   its correct processing.

Please note that the authentication data is recomputed, here.

   The Map-Notify-Ack message has the same contents as a Map-Notify
   message.  It is used to acknowledge the receipt of a Map-Notify
   (solicited or unsolicited) and for the sender to stop retransmitting

So a normal exchange would include Map-Register, Map-Notify, and
Map-Notify-Ack?

   A Map-Server sends an unsolicited Map-Notify message (one that is not
   used as an acknowledgment to a Map-Register message) that follows the
   Congestion Control And Relability Guideline sections of [RFC8085].  A

This second clause ("that follows") is rather a non sequitur here.  And we
still don't know what purpose the unsolicited Map-Notify serves!

   Map-Notify is retransmitted until a Map-Notify-Ack is received by the
   Map-Server with the same nonce used in the Map-Notify message.  If a

Presumably we care about (e.g.) the key ID matching and the authentication
data validating, as well?

   Map-Notify-Ack is never received by the Map-Server, it issues a log
   message.  An implementation SHOULD retransmit up to 3 times at 3
   second retransmission intervals, after which time the retransmission
   interval is exponentially backed-off for another 3 retransmission

"exponentially" is not well defined unless the base of the exponent is
specified.
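
To show how much the base matters, here is a Python sketch of the schedule
described in the quoted text.  The function and parameter names are
hypothetical, and the default base of 2 is purely an assumption, since the
draft does not give one.

```python
def retransmit_intervals(initial=3.0, fixed=3, backed_off=3, base=2.0):
    """Return the waits (in seconds) before each retransmission attempt.

    base is an assumption; the quoted text does not specify it.
    """
    # First phase: a fixed number of attempts at the initial interval.
    intervals = [initial] * fixed
    # Second phase: each further attempt multiplies the wait by base.
    for i in range(1, backed_off + 1):
        intervals.append(initial * base ** i)
    return intervals
```

With base 2 this gives waits of 3, 3, 3, 6, 12, and 24 seconds; with base 3
the last three become 9, 27, and 81 seconds, which is a very different
total retransmission window.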

   attempts.  After this time, an xTR can only get the RLOC-set change
   by later querying the mapping system or by RLOC-probing one of the
   RLOCs of the existing cached RLOC-set to get the new RLOC-set.

What RLOC-set change?  This text doesn't seem to indicate what
functionality is going on here.

Section 5.8

   An Encapsulated Control Message (ECM) is used to encapsulate control
   packets sent between xTRs and the mapping database system.

Some of the flag bit descriptions appear to describe usages that are or can
be entirely within the mapping system.

   D:    This is the DDT-bit.  When set to 1, the sender is requesting a
         Map-Referral message to be returned.  The details of this
         procedure are described in [RFC8111].

E.g., here, the sender can be (IIUC) within the mapping system.

   E:    This is the to-ETR bit.  When set to 1, the Map-Server's
         intention is to forward the ECM to an authoritative ETR.

I'm not sure that "intention" is quite right, here -- as far as this
document is concerned, a Map-Server will always know whether it is sending
an ECM to an authoritative ETR.  Also, this bit does not seem to be used
for anything within this document, and no external reference is given.

Are the 'M' and 'E' bits mutually exclusive?  (Would we even care?)

I suggest adding more text about which sender/receiver pairs are permitted
(or allowed or expected) to set the D, E, and M bits.

         invoking Map-Request.  Port number 4341 MUST NOT be assigned to
         either port.  The checksum field MUST be non-zero.

This is the only place in this document that we disallow port 4341.  Should
we also be disallowing it from being used as the non-4342 port for other
exchanges?

   LCM:  The format is one of the control message formats described in
         this section.  [...]

nit: "this section" means 5.8; presumably we mean Section 5.

Section 6.1

I agree with Warren that the direct usage of mapping information included
in an SMR presents a substantial attack surface, both for DoS and
potentially for redirecting traffic wholesale (whether for snooping
purposes or use as volumetric DoS to a third-party target).  There is some
discussion of the risks of spoofing with this sort of "gleaning" behavior,
but I strongly suggest mentioning something like "this technique presents a
risk of off-path spoofing; see Section 9 for details" at each such
non-validated scheme for learning mapping information.

   Since ETRs are not required to keep track of remote ITRs that have
   cached their mappings, they do not know which ITRs need to have their
   mappings updated.  As a result, an ETR will solicit Map-Requests
   (called an SMR message) to those sites to which it has been sending
   LISP encapsulated data packets for the last minute.  In particular,
   an ETR will send an SMR to an ITR to which it has recently sent
   encapsulated data.  This can only occur when both ITR and ETR
   functionality reside in the same router.

I still think that this text is needlessly confusing about which action is
taken by which router, and could be improved as, e.g., "this can only occur
when the ETR also provides ITR functionality (that is, it is an xTR)".

   The following procedure shows how an SMR exchange occurs when a site
   is doing Locator-Set compaction for an EID-to-RLOC mapping:

Where is locator-set compaction defined?

Throughout this whole example, "the site with the changed mapping" and "the
site that sent the Map-Request" are kind of clunky phrases; it might be
cleaner writing to give them names (like "site A" and "site B").

   messages.  It is RECOMMENDED that the SMR sender rate-limits Map-
   Request for the same destination RLOC to no more than one packet per
   3 seconds.  It is RECOMMENDED that the SMR responder rate-limits Map-
   Request for the same EID-Prefix to no more than once per 3 seconds.

Please double-check that "SMR sender"/"SMR responder" and "Map-Request"
usage are consistent/as-intended.
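
For reference, both recommendations amount to a per-key rate limiter, where
the key is the destination RLOC on one side and the EID-Prefix on the
other.  A minimal Python sketch, with a hypothetical class name and an
injectable clock so it can be exercised without sleeping:

```python
import time


class PerKeyRateLimiter:
    """Allow at most one event per key every min_interval seconds."""

    def __init__(self, min_interval=3.0, clock=time.monotonic):
        self._min_interval = min_interval
        self._clock = clock
        self._last_sent = {}

    def allow(self, key) -> bool:
        now = self._clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self._min_interval:
            return False  # too soon since the last message for this key
        self._last_sent[key] = now
        return True
```

Which message each side is limiting, and on which key, is exactly what the
quoted text needs to state unambiguously.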

   2.  A remote ITR that receives the SMR message will schedule sending
       a Map-Request message to the source locator address of the SMR
       message or to the mapping database system.  [...]

How does the ITR decide which destination to send the Map-Request to?

       copied from the SMR message.  If the source Locator is the only
       Locator in the cached Locator-Set, the remote ITR SHOULD send a

just to double-check: this is the source Locator from the SMR?

       Map-Request to the database mapping system just in case the
       single Locator has changed and may no longer be reachable to
       accept the Map-Request.

Is this the only case that the Map-Request would go to the mapping system?

   3.  The remote ITR MUST rate-limit the Map-Request until it gets a
       Map-Reply while continuing to use the cached mapping.  When

nit: I suggest a comma after "Map-Reply" to avoid the misparse that the
Map-Reply must be received while the cached mapping is in use (and that the
rate limiting would continue indefinitely if the cached mapping expired in
the meantime).

   5.  The ETRs at the site with the changed mapping record the fact
       that the site that sent the Map-Request has received the new
       mapping data in the Map-Cache entry for the remote site so the
       Locator-Status-Bits are reflective of the new mapping for packets
       going to the remote site.  [...]

The Locator-Status-Bits in which direction?  (Probably should also give a
section ref to 6830bis for the definition.)  It might also be appropriate to
discuss the interaction with the relevant map version.

   For security reasons, an ITR MUST NOT process unsolicited Map-
   Replies.  To avoid Map-Cache entry corruption by a third party, a
   sender of an SMR-based Map-Request MUST be verified.  If an ITR

To be clear, the verification here is essentially return-routability
verification, aka proof that the sender actually owns the claimed address,
right?  I think it is appropriate to have some text noting the specific
behavior, and that this is not any sort of cryptographic or strongly
authenticated verification.

   receives an SMR-based Map-Request and the source is not in the
   Locator-Set for the stored Map-Cache entry, then the responding Map-
   Request MUST be sent with an EID destination to the mapping database
   system.  [...]

What is an "SMR-based Map-Request" (also appears in the next paragraph and
one other place)?  Is it just an SMR?  If it's some actual Map-Request, I'm
confused at why an *I*TR would be receiving it.

Section 7

   3.  An ITR may receive an ICMP Port Unreachable message from a
       destination host.  This occurs if an ITR attempts to use
       interworking [RFC6832] and LISP-encapsulated data is sent to a
       non-LISP-capable site.

Is the ITR supposed to conclude that the RLOC is likely down in this
situation?

   When ITRs receive ICMP Network Unreachable or Host Unreachable
   messages as a method to determine unreachability, they will refrain
   from using Locators that are described in Locator lists of Map-
   Replies.  [...]

Is this really as precise as it can be?  It kind of sounds like it says
that all Map-Replies will be ignored when any ICMP Network/Host Unreachable
message is received.

            If it does not find one and BGP is running in the Default-
   Free Zone (DFZ), it can decide to not use the Locator even though the

Is running in the DFZ consistent with the reduced scope of running in a
single administrative domain?

   Optionally, an ITR can send a Map-Request to a Locator, and if a Map-
   Reply is returned, reachability of the Locator has been determined.

Is this describing the same flow as item (5) above and Section 7.1 below?
If so, it seems totally redundant and could be omitted.

   Obviously, sending such probes increases the number of control
   messages originated by Tunnel Routers for active flows, so Locators
   are assumed to be reachable when they are advertised.

I'm not sure what "advertised" is intended to mean, here.  Is it
"advertised into the mapping system"?  But that is not directly visible to
the ITR, only indirectly through the results of an actual mapping request
(and even then, the Map-Reply from an ETR could be invalid, e.g.,
overclaiming, unless LISP-SEC is used).

                               Both Requests and Replies MUST be rate-
   limited.  [...]

I believe this requirement duplicates requirements already made elsewhere;
the other locations also include more guidance on actual rates.

Section 7.1

                                                   A Map-Request used as
   an RLOC-probe is NOT encapsulated and NOT sent to a Map-Server or to
   the mapping database system as one would when soliciting mapping
   data.  [..]

I strongly suggest using a word other than "soliciting" to avoid confusion
with SMR.

   data.  The EID record encoded in the Map-Request is the EID-Prefix of
   the Map-Cache entry cached by the ITR or PITR.  The ITR MAY include a

Is it worth reminding the reader that the source EID here is zero-length
and source-EID-AFI set to zero?

   mapping data record for its own database mapping information that
   contains the local EID-Prefixes and RLOCs for its site.  [...]

To double-check: this mapping data record is included in the "Map-Reply
Record" field of the Map-Request message?  It would probably help the
reader to be consistent about this terminology.

Section 8.1

                                                  In particular, the ITR
   need not connect to the LISP-ALT infrastructure or implement the BGP
   and GRE protocols that it uses.

Why does LISP-ALT get a callout but not (e.g.) LISP-DDT?

Section 8.3

                                      If the EID-prefix is registered or
   not registered and there is a authentication failure, then a Drop/

What is the authentication flow that would be failing here?  The
Map-Register for the corresponding prefix?

                                                If either of these
   actions result as a temporary state in policy or authentication then
   a Send-Map-Request action with 1-minute TTL MAY be returned to allow
   the requestor to retry the Map-Request.

How can an SMR have an associated TTL?  The message format is that of a
regular Map-Request, is it not?

Section 8.4

   Upon receipt of an Encapsulated Map-Request, a Map-Resolver
   decapsulates the enclosed message and then searches for the requested
   EID in its local database of mapping entries (statically configured
   or learned from associated ETRs if the Map-Resolver is also a Map-
   Server offering proxy reply service).

This seems to be the first time the document admits the possibility for a
local database of mapping entries on a Map-Resolver; this makes me wonder
if there was an incomplete removal of such functionality from the document,
especially given that local caching of responses on the Map-Resolver is
explicitly disclaimed in Section 4.

Section 8.4.1

   ETRs MAY have anycast RLOC addresses which are registered as part of
   their RLOC-set to the mapping system.  However, registrations MUST
   use their unique RLOC addresses or distinct authentication keys to
   identify security associations with the Map-Servers.

xTR-IDs cannot be used for this purpose?

Section 9

I think we should have some discussion here about how mapping information
gleaned from SMR messages does not necessarily benefit from the on-path
guarantee that the nonce provides for regular mapping-system exchanges.

           The Map-Register message is vulnerable to replay attacks by a
   man-in-the-middle.  A compromised ETR can overclaim the prefix it
   owns and successfully register it on its corresponding Map-Server.
   To mitigate this and as noted in Section 8.2, a Map-Server SHOULD
   verify that all EID-Prefixes registered by an ETR match the
   configuration stored on the Map-Server.

The conversion of the Map-Register 'Nonce' field into a sequence number
provides some moderate remediation against the replay attack; that should
be included in this discussion.
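
As a rough illustration of the check I have in mind (a minimal sketch, not
text from the draft): the class and method names are hypothetical, keying
the state by a per-registrant identifier is my assumption, and the check
only helps if the nonce is covered by the message authentication data. A
Map-Server treating the nonce as a sequence number would reject any
Map-Register whose nonce is not strictly greater than the last one it
accepted from that registrant:

```python
class RegisterReplayGuard:
    """Tracks the highest Map-Register nonce accepted per registrant.

    Hypothetical helper for illustration; not an API from any LISP
    implementation.
    """

    def __init__(self):
        # registrant identifier -> last accepted nonce (sequence number)
        self._last_nonce = {}

    def accept(self, registrant_id, nonce):
        """Return True and record the nonce if it advances the sequence."""
        last = self._last_nonce.get(registrant_id)
        if last is not None and nonce <= last:
            # Equal or older nonce: treat as a replayed or stale message.
            return False
        self._last_nonce[registrant_id] = nonce
        return True


guard = RegisterReplayGuard()
first = guard.accept("etr-1", 10)   # fresh registration
replay = guard.accept("etr-1", 10)  # same message captured and resent
```

Note that this only blocks replays of messages at or below the last
accepted value, which is why I call the remediation "moderate".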

   Encrypting control messages via DTLS [RFC6347] or LISP-crypto
   [RFC8061] SHOULD be used to support privacy to prevent eavesdroping
   and packet tampering for messages exchanged between xTRs, xTRs and
   the mapping system, and nodes that make up the mapping system.

nit: "to support privacy to prevent eavesdropping and packet tampering"
doesn't read as grammatical; is the "to support privacy" still needed or maybe
just a missing "and"?
less-nit: this current wording makes it seem like LISP-crypto should be a
normative reference, but I think the intended semantics are more of "some
mechanism SHOULD be used to provide [privacy/confidentiality protection],
and here is an incomplete list of mechanisms that do so"

Section 10

Thank you for adding the Privacy Considerations section; it is important to
document this property of the system and let the operator make informed
decisions.

   As noted by [RFC6973] privacy is a complex issue that greatly depends
   on the specific protocol use-case and deployment.  As noted in
   section 1.1 of [I-D.ietf-lisp-rfc6830bis] LISP focuses on use-cases

Also Section 1.1 of this document.

Section 11

   o  The "m", "I", "L", and "D" bits are added to the Map-Request
      message.  See Section 5.3 for details.

Isn't this more a Section 5.2 thing than 5.3?  Also, I don't see "m" or "I"
bits described (though I do see "M").

   o  The "S", "I", "E", "T", "a", and "m" bits are added to the Map-
      Register message.  See Section 5.6 for details.

I see an "M" bit but not an "m" one.

Section 12.3

It feels a little weird to lump the ACT fields (which have a registry)
together in a section with the flag fields scattered throughout the
protocol (which do not).  Is it bad to have separate subsections for them
(especially when Section 12.6 already exists and does provide a registry
for other flag bits)?

Section 12.6

                        A sub-registry needs to be created per each
   message and record.  [...]

What is a "record" in this context?  (It does not seem like a mapping
record.)

I mostly expect IANA to ask for a listing of which bits/ranges are/aren't
allocatable.