[lisp] Benjamin Kaduk's Discuss on draft-ietf-lisp-rfc6830bis-26: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Thu, 07 February 2019 14:35 UTC

Return-Path: <kaduk@mit.edu>
X-Original-To: lisp@ietf.org
Delivered-To: lisp@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id 2E4821295EC; Thu, 7 Feb 2019 06:35:50 -0800 (PST)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
From: Benjamin Kaduk <kaduk@mit.edu>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-lisp-rfc6830bis@ietf.org, Luigi Iannone <ggx@gigix.net>, lisp-chairs@ietf.org, ggx@gigix.net, lisp@ietf.org
X-Test-IDTracker: no
X-IETF-IDTracker: 6.91.0
Auto-Submitted: auto-generated
Precedence: bulk
Message-ID: <154955015018.31108.12820143983477326611.idtracker@ietfa.amsl.com>
Date: Thu, 07 Feb 2019 06:35:50 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/lisp/HTKLlXPr9hse-qxK3TIXGhQji4g>
Subject: [lisp] Benjamin Kaduk's Discuss on draft-ietf-lisp-rfc6830bis-26: (with DISCUSS and COMMENT)
X-BeenThere: lisp@ietf.org
X-Mailman-Version: 2.1.29
List-Id: List for the discussion of the Locator/ID Separation Protocol <lisp.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/lisp>, <mailto:lisp-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/lisp/>
List-Post: <mailto:lisp@ietf.org>
List-Help: <mailto:lisp-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/lisp>, <mailto:lisp-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Feb 2019 14:35:50 -0000

Benjamin Kaduk has entered the following ballot position for
draft-ietf-lisp-rfc6830bis-26: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-lisp-rfc6830bis/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Section 3 still contains text:

   EID-to-RLOC Database:   The EID-to-RLOC Database is a global
      distributed database that contains all known EID-Prefix-to-RLOC

that indicates that the mapping database is a single, global, distributed
database; we had previously agreed that the target scope was much more
narrow.  I could perhaps charitably assume that this instance was missed as an
editing error because the phrase "global distributed database" spans a line
break, but given that this specific instance was called out in my previous
discuss position, it is fairly hard to do so.

Also in Section 3:
   Endpoint ID (EID):   An EID is a 32-bit (for IPv4) or 128-bit (for
      IPv6) value used in the source and destination address fields of
      the first (most inner) LISP header of a packet.  [...]

6833bis says (section 5.8) that the inner header can use either RLOC or EID
addresses in the header address fields, which contradicts this statement.

The various places where we mention "gleaming" or similar unauthenticated
(un-path-verified?) schemes for learning mapping information should all
mention at their description that they are susceptible to spoofing and link
to the security considerations.

I'm still concerned about the synchronization requirements between
map-version changes and LSB usage; with the currently described technology
it seems almost inevitable for race conditions around RLOC changes to cause
ITRs to make incorrect routing decisions due to misinterpreted status bits.
It's unclear whether it's even worth trying to tackle this problem before
the map-versioning document is more advanced along in the process, though.
(Several comments throughout are relevant, especially those on Section
13.1.)

I agree with Warren that clarity on whether traffic is buffered or dropped
during the lookup process is needed (e.g., in Section 6).

Also in the vein of Warren's comments, in Section 7.1 I was expecting (from
our previous discussions) that some text would be added about the
determination of L being something that is "performed once by the
administrator of the LISP deployment and treated as a constant across the
deployment".

The discussion of Instance IDs remains incomplete, with no discussion of
within what scope their values must be unique (as truncated to 24 bits).

Similarly (also Section 8),
                                                              Multiple
   Data-Planes can use the same 32-bit space as long as the low-order 24
   bits don't overlap among xTRs.

That's a pretty lousy property to have in a PS specification.

Section 13

   When a Locator record is removed from a Locator-Set, ITRs that have
   the mapping cached will not use the removed Locator because the xTRs
   will set the Locator-Status-Bit to 0.  So, even if the Locator is in
   the list, it will not be used.  For new mapping requests, the xTRs
   can set the Locator AFI to 0 (indicating an unspecified address), as
   well as setting the corresponding Locator-Status-Bit to 0.  This

I do not remember there being an ordering (or even consistency) requirement
on the ITR-RLOC entries in the Map-Request, so it's unclear that just
replacing one entry with an AFI-0 entry would convey this information.  I
suppose that using only a single ITR-RLOC entry, with AFI 0, would provide
a usable signal to the ETR, but that does not seem to be what is being
described here.  (Also, on a rhetorical point, please clarify that the "as
well as" is for setting the LSB to 0 in data packets; Map-Requests do not
include any LSBs.)

   If many changes occur to a mapping over a long period of time, one
   will find empty record slots in the middle of the Locator-Set and new
   records appended to the Locator-Set. At some point, it would be
   useful to compact the Locator-Set so the Locator-Status-Bit settings
   can be efficiently packed.

This text, implying that compactification must wait for some unspecified
later event, seems to be assuming some requirement to preserve order of
Locator-Set entries that I cannot find a description of in either 6830bis
or 6833bis.

Do RFCs 6831 and 8378 need to be normative references for how to do
multicast as an optional protocol feature (recalling that
https://www.ietf.org/blog/iesg-statement-normative-and-informative-references/
clarifies that references that are relevant only for optional features are
still classified as normative)?

In Section 16:

   A complete LISP threat analysis can be found in [RFC7835].  In what

RFC 7835 remains an incomplete analysis; please stop referring to it as
such.

   of time.  The goal is to convince the ITR that the ETR's RLOC is
   reachable even when it may not be reachable.  If the attack is

I think Warren is correct that there is also an attack that lies in
convincing an ITR that an ETR is not reachable even when it is reachable.


The following items were present in my original DISCUSS position and still
have not been resolved.  Note that I copy below the previous ballot text
even for some issues that are described above already in different words.

Section 4.1's Step (6) only mentions parsing "to check for format
validity".  I think it is appropriate to mention (and refer to) source
authentication checks as well, since bad Map-Reply data can allow all sorts
of attacks to occur.

There are some fairly subtle ordering requirements between the order of
entries in Map-Reply messages and the Locator-Status-Bits in data-plane
traffic (so that the semantic meaning of the status bits are meaningful),
which is only given a minimal treatment in the control-plane document.  The
need for synchronization in interpreting these bits should be mentioned
more prominently in the data-plane document as well.

The usage of the Instance ID does not seem to be adequately covered; from
what I've been able to pick up so far it seems that both source and
destination participants must agree on the meaning of an Instance ID, and
the source and destination EIDs must be in the same Instance.  This does
not seem like it is compatible with Internet scale, especially if there are
only 24 usable bits of Instance ID.

There seems to be a lot of intra-site synchronization requirements, notably
with respect to Map-Version consistency, the contents and ordering of
locator sets for EIDs in the site, etc.; the actual hard requirements for
synchronization within a site should be clearly called out, ideally in a
single location.

The security considerations attempt to defer substantially to the
threat-analysis in RFC 7835, which does not really seem like a complete
threat analysis and does not provide analysis as to what requirements are
placed on the boundaries between the different components of LISP (data
plane, control plane, mapping system, various extensions, etc.).  The
secdir reviewer had some good thoughts in this space.

I am not convinced that this protocol meets the current IETF requirements
for the security properties of Standards-Track Protocols without at least
LISP-SEC as a mandatory-to-implement component, and possibly additional or
stronger requirements.  (I did not do a full analysis of the system in the
presence of those security mechanisms, since that is not what is being
presented for review.)
[ed. even though LISP-SEC has been promoted to MTI, it remains difficult to
be confident in the results of a full system analysis due to the number of
other outstanding issues with the core documents.  Consider the risk Ekr
noted yesterday in email about tampering with the Map-Request causing
apparently-valid repsonses that convey incorrect results with respect to
the original query.]


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I note as a preface that in many places in this (and other) review, I ask
questions.  The ones that indicate I actually don't understand are
generally accompanied by text like "just to confirm" or provide some option
for possible interpretation.  Others are intended as rhetorical devices,
indicating that the text locally at this point in the document is unclear
and the question posed should be addressed in the document, in-line.

Section 1

                                                         LISP
   encapsulation uses a dynamic form of tunneling where no static
   provisioning is required or necessary.

Do I interpret this correctly as meaning that no static provisioning is
necessary on end-hosts?  It seems that ETRs and the mapping system
entities definitely need static configuration.  But do end sites need to
know what lisp site/administrative domain they are part of?

   LISP is an overlay protocol that separates control from Data-Plane,
   this document specifies the Data-Plane, how LISP-capable routers
   (Tunnel Routers) exchange packets by encapsulating them to the
   appropriate location.  [...]

nit: this is a comma splice

Section 3

   Anycast Address:  Anycast Address is a term used in this document to

How does this definition differ from the one in RFC 8499?

   Data-Probe:   A Data-Probe is a LISP-encapsulated data packet where
      the inner-header destination address equals the outer-header
      destination address used to trigger a Map-Reply by a decapsulating
      ETR.  [...]

nit: is this better as two sentences?  ("...is a LISP-encapsulated data
packet where the inner-header destination address equals the outer-header
destination address.  It is used to trigger...")

I would also say something in this paragraph about how this behavior blurs
the distinction between EID and RLOC.

                    When using Data-Probes, by sending Map-Requests on
      the underlying routing system, EID-Prefixes must be advertised.

I don't understand why the one follows from the other (in fact, there seem
to be three not-particularly-related concepts in this sentence).

(I also note that "Data-Probe" is not mentioned anywhere in this document
other than its own definition, which would suggest that it could be moved
to 6833bis, which does mention them.)

   EID-to-RLOC Database:   The EID-to-RLOC Database is a global
      distributed database that contains all known EID-Prefix-to-RLOC

I thought we had agreed to remove this "global distributed database"
language.

      addresses that are routable on the underlay.  Note that there MAY
      be transient conditions when the EID-Prefix for the site and
      Locator-Set for each EID-Prefix may not be the same on all ETRs.

nit: we haven't indicated a definite site for "the site" to be meaningful.

   EID-to-RLOC Map-Cache:   The EID-to-RLOC Map-Cache is generally
      short-lived, on-demand table in an ITR that stores, tracks, and is
      responsible for timing out and otherwise validating EID-to-RLOC
      mappings.  This cache is distinct from the full "database" of EID-
      to-RLOC mappings; it is dynamic, local to the ITR(s), and
      relatively small, while the database is distributed, relatively
      static, and much more global in scope to LISP nodes.

"global" caught my eye again here; perhaps "global in scope to a LISP
deployment"?

   EID-Prefix:   An EID-Prefix is a power-of-two block of EIDs that are
      allocated to a site by an address allocation authority.  EID-
      Prefixes are associated with a set of RLOC addresses.  EID-Prefix
      allocations can be broken up into smaller blocks when an RLOC set
      is to be associated with the larger EID-Prefix block.

nit: I think the causality here is backwards -- the breaking up does not
occur at the moment that you associate an RLOC set to a larger EID-Prefix
block.

   Endpoint ID (EID):   An EID is a 32-bit (for IPv4) or 128-bit (for
      IPv6) value used in the source and destination address fields of
      the first (most inner) LISP header of a packet.  [...]

6833bis says (section 5.8) that the inner header can use either RLOC or EID
addresses in the header address fields, which contradicts this statement.

      Specifically, when a service provider prepends a LISP header for
      Traffic Engineering purposes, the router that does this is also
      regarded as an ITR.  The outer RLOC the ISP ITR uses can be based
      on the outer destination address (the originating ITR's supplied
      RLOC) or the inner destination address (the originating host's

Er, that the ISP ITR uses as source or as destination?

   Negative Mapping Entry:   A negative mapping entry, also known as a
      negative cache entry, is an EID-to-RLOC entry where an EID-Prefix
      is advertised or stored with no RLOCs.  That is, the Locator-Set
      for the EID-to-RLOC entry is empty, one with an encoded Locator
      count of 0.  This type of entry could be used to describe a prefix
      from a non-LISP site, which is explicitly not in the mapping
      database.  There are a set of well-defined actions that are
      encoded in a Negative Map-Reply.

This bit about Negative Map-Replys comes out of the blue; is it really
needed in this paragraph?

   TE-ETR:   A TE-ETR is an ETR that is deployed in a service provider
      network that strips an outer LISP header for Traffic Engineering
      purposes.

nit: is it really the act of stripping the header that is done for TE
purposes?

Section 4.1

   2.  The LISP ITR must be able to map the destination EID to an RLOC
       of one of the ETRs at the destination site.  The specific method
       used to do this is not described in this example.  See
       [I-D.ietf-lisp-rfc6833bis] for further information.

   3.  The ITR sends a LISP Map-Request as specified in
       [I-D.ietf-lisp-rfc6833bis].  Map-Requests SHOULD be rate-limited.

At risk of repeating myself, isn't (3) just doing what (2) says the ITR
must be able to do?  I don't see why both items are needed.

   5.  The ETR looks at the destination EID of the Map-Request and
       matches it against the prefixes in the ETR's configured EID-to-
       RLOC mapping database.  This is the list of EID-Prefixes the ETR
       is supporting for the site it resides in.  If there is no match,
       the Map-Request is dropped.  Otherwise, a LISP Map-Reply is

Isn't this (1) something that 6833bis should be authoritative on, and (2)
a recipe for random hangs if the mapping system ever misdirects a
map-request?

   7.  Subsequent packets from host1.abc.example.com to
       host2.xyz.example.com will have a LISP header prepended by the

The use of "subsequent" implies that he original IP packet does not receive
this treatment.  But we don't say anywhere that it gets dropped, leaving us
guessing as to what is supposed to happen to it.

   9.  In order to defer the need for a mapping lookup in the reverse
       direction, an ETR can OPTIONALLY create a cache entry that maps
       the source EID (inner-header source IP address) to the source
       RLOC (outer-header source IP address) in a received LISP packet.
       Such a cache entry is termed a "glean mapping" and only contains

This gleaming seems susceptible to spoofing.

Section 5.3

   L: The L-bit is the 'Locator-Status-Bits' field enabled bit.  When
      this bit is set to 1, the Locator-Status-Bits in the second
      32 bits of the LISP header are in use.

What happens to those bits when the L-bit is set to zero?

   E: The E-bit is the echo-nonce-request bit.  This bit MUST be ignored
      and has no meaning when the N-bit is set to 0.  When the N-bit is
      set to 1 and this bit is set to 1, an ITR is requesting that the

Do we need to say what to do when N is 1 and E is 0?

   LISP Locator-Status-Bits (LSBs):  When the L-bit is also set, the
      [...]
      the ETRs at the same site.  When a site has multiple EID-Prefixes
      that result in multiple mappings (where each could have a
      different Locator-Set), the Locator-Status-Bits setting in an
      encapsulated packet MUST reflect the mapping for the EID-Prefix
      that the inner-header source EID address matches.  If the LSB for

I don't think there is guaranteed to be a single unique EID-Prefix that
matches the inner source EID; we need to say longest-match here.

   o  The outer-header 'Differentiated Services Code Point' (DSCP) field
      (or the 'Traffic Class' field, in the case of IPv6) SHOULD be
      copied from the outer-header DSCP field ('Traffic Class' field, in
      the case of IPv6) to the inner-header.

IPv6 is a first-class IP protocol; it should not be relegated to a
parenthetical.

   Note that if an ETR/PETR is also an ITR/PITR and chooses to re-
   encapsulate after decapsulating, the net effect of this is that the
   new outer header will carry the same Time to Live as the old outer
   header minus 1.

Where is the decrementing of the TTL documented?  I just see "copy" in the
above text.

Is the last paragraph adding anything that was not already said previously?
It seems pretty redundant on first read.

Section 6

             Finally, the Map-Cache also contains reachability
   information about EIDs and RLOCs, and uses LISP reachability
   information mechanisms to determine the reachability of RLOCs, see
   Section 10 for the specific mechanisms.

nit: The Map-Cache performs reachability discovery?  That seems incongruous with
a cache nature; perhaps it is better to say that it is updated with the
results of such procedures.

Section 7.1

                     Note that reassembly can happen at the ETR if the
   encapsulated packet was fragmented at or after the ITR.

Why would an ETR want to take on this additional state-keeping burden
instead of relegating it to the end host?

   When the 'DF' field of the IP header is set to 1, or the packet is an
   IPv6 packet originated by the source host, the ITR will drop the
   packet when the size is greater than L and send an ICMPv4 ICMP

Which size?

Section 9

   o  Either side (more likely the server-side ETR) decides not to send
      a Map-Request.  For example, if the server-side ETR does not send
      Map-Requests, it gleans RLOCs from the client-side ITR, giving the

[in the next paragraph we talk about sending Map-Requests to verify gleamed
mappings, which doesn't mesh very well with "does not send Map-Requests"
here]

      client-side ITR responsibility for bidirectional RLOC reachability
      and preferability.  Server-side ETR gleaning of the client-side
      ITR RLOC is done by caching the inner-header source EID and the
      outer-header source RLOC of received packets.  [...]

Isn't this susceptible to spoofing?

   Instead of using the Map-Cache or mapping system, RLOC information
   MAY be gleaned from received tunneled packets or Map-Request
   messages.  A "gleaned" Map-Cache entry, one learned from the source

To double-check, this is describing the same behavior described in the iast
bullet point?

   RLOC of a received encapsulated packet, is only stored and used for a
   few seconds, pending verification.  Verification is performed by
   sending a Map-Request to the source EID (the inner-header IP source
   address) of the received encapsulated packet.  A reply to this

As I commented on 6833bis, I'd appreciate mention that this
"verification" is akin to reverse-routability verification and does not
involve any cryptographic assurances.

                    This "gleaning" mechanism is OPTIONAL, refer to
   Section 16 for security issues regarding this mechanism.

nit: comma splice

Section 10

   2.  When an ETR receives an encapsulated packet from an ITR, the
       source RLOC from the outer header of the packet is likely to be
       reachable.

Liktely to, but not guaranteed, since that's a spoofable field and we rely
on the ITR to fill it with something useful.

   When determining Locator up/down reachability by examining the
   Locator-Status-Bits from the LISP-encapsulated data packet, an ETR
   will receive up-to-date status from an encapsulating ITR about
   reachability for all ETRs at the site.  [...]

This ("up-to-date") wording concerns me, relating to the interaction
between map-versioning, addition/removal of locators from a mapping, and
propagation to values in the LSBs.  I do not think there is currently a
strong consistency guarantee in place that would justify the "up-to-date"
wording, and 6834bis has not received IETF (or IESG) review yet.  My
current understanding is that the status information provided via this
mechanism could be characterized as "best-effort" or "accurate in the
steady-state".

   The security considerations of Section 16 related with data-plane
   reachability applies to the data-plane RLOC reachability mechanisms

nit: I think this is "related to", not "related with".

Section 10.1

   When this packet is received by the ETR, the encapsulated packet is
   forwarded as normal.  When the ETR is an xTR (co-located as an ITR),
   it next sends a data packet to the ITR (when it is an xTR co-located

nit: I don't think this is grammatical; maybe "when it next sends"?

   erroneously consider the Locator unreachable.  An ITR SHOULD only set
   the E-bit in an encapsulated data packet when it knows the ETR is
   enabled for echo-noncing.  This is conveyed by the E-bit in the RLOC-
   probe Map-Reply message.

Is this only in RLOC-probe Map-Reply messages (and not generic, including
mapping-driven, Map-Reply messages)?  If so, I think that 6833bis needs
some clarification in Section 5.4.  (If not, I think that "RLOC-probe"
should be removed from here.)

Section 13

   ignores them.  However, this can only happen for locator addresses
   that are lexicographically greater than the locator addresses in the
   existing Locator-Set.

(It might be apropos to explicitly remind the reader that the entries in the
locator-set are sorted by IP address.)

   When a Locator record is removed from a Locator-Set, ITRs that have
   the mapping cached will not use the removed Locator because the xTRs
   will set the Locator-Status-Bit to 0.  So, even if the Locator is in

Why will the xTRs set the Locator-Status-Bit to 0?  Won't they just use
the updated mapping and have a smaller number of LSBs in their data
packets?

   If many changes occur to a mapping over a long period of time, one
   will find empty record slots in the middle of the Locator-Set and new
   records appended to the Locator-Set. At some point, it would be

How can this happen when the Locator-Set is required to be sorted?  Or does
the sorting requirement not apply to the "Map-Reply Record" field of the
Map-Request?

Section 13.1

   A Map-Version Number can be included in Map-Register messages as
   well.  This is a good way for the Map-Server to assure that all ETRs
   for a site registering to it will be synchronized according to Map-
   Version Number.

This seems vastly underspecified not just to the detailed semantics (for
which the 6834bis reference is probably a suitable replacement) but also
the goals of the synchronization that is to be obtained.  It's unclear to
me that it's appropriate to mention Map-Register and versions at all in
this document; instead we might just note that 6834bs provides for version
synchronization for the ETRs within a site.  (If that's what it does; I
haven't yet read it in detail.)

Section 14

   Just like it would if the destination EID was a unicast address.

nit: this is a sentence fragment; I suggest joining to the previous
sentence with a comma.

Section 20.2

Maybe throw us a bone and include the string
"draft-iannone-openlisp-implementation" in the [OPENLISP] entry?