Re: [lisp] Benjamin Kaduk's Discuss on draft-ietf-lisp-rfc6830bis-20: (with DISCUSS and COMMENT)

"Joel M. Halpern" <jmh@joelhalpern.com> Thu, 27 September 2018 03:53 UTC

To: Benjamin Kaduk <kaduk@mit.edu>, The IESG <iesg@ietf.org>
Cc: draft-ietf-lisp-rfc6830bis@ietf.org, Luigi Iannone <ggx@gigix.net>, lisp-chairs@ietf.org, lisp@ietf.org
References: <153801986490.21574.14435994195001767765.idtracker@ietfa.amsl.com>
From: "Joel M. Halpern" <jmh@joelhalpern.com>
Message-ID: <739fae18-85a5-26c2-85a6-7d7c830fcd32@joelhalpern.com>
Date: Wed, 26 Sep 2018 23:53:02 -0400
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:52.0) Gecko/20100101 Thunderbird/52.9.1
MIME-Version: 1.0
In-Reply-To: <153801986490.21574.14435994195001767765.idtracker@ietfa.amsl.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/lisp/7MzdPspq6QTrf5qO391tIe-SSDI>

Is there text we can add about the scoping that will change your discuss 
into a series of useful comments?
If so, some indication of how you would like that phrased would help us 
address these.

If not, we seem to have a larger problem.
Yours,
Joel

On 9/26/18 11:44 PM, Benjamin Kaduk wrote:
> Benjamin Kaduk has entered the following ballot position for
> draft-ietf-lisp-rfc6830bis-20: Discuss
> 
> When responding, please keep the subject line intact and reply to all
> email addresses included in the To and CC lines. (Feel free to cut this
> introductory paragraph, however.)
> 
> 
> Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
> for more information about IESG DISCUSS and COMMENT positions.
> 
> 
> The document, along with other ballot positions, can be found here:
> https://datatracker.ietf.org/doc/draft-ietf-lisp-rfc6830bis/
> 
> 
> 
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
> 
> I have grave concerns about the suitability of LISP as a whole, in its
> present form, for advancement to the Standards-Track.  While some of my
> concerns are not specific to this document, as the core protocol
> (data-plane) spec, it seems an appropriate place to attach them to.
> 
> I am told, out of band, that the intended deployment model is no longer to
> cover the entire Internet (cf. the MISSREF-state
> draft-ietf-lisp-introduction's "with LISP, the edge of the Internet and the
> core can be logically separated and interconnected by LISP-capable
> routers", etc.), and that full Internet-scale operation is no longer a
> goal.  However, since that does not seem to be reflected in the current
> batch of documents up for IESG review, I am forced to ballot on them
> "as-is", namely as targeting global Internet deployment.  The requirements
> placed on the mapping system are so stringent as to be arguably
> unachievable at Internet-scale, though that has more of an
> interaction with the control-plane than the data-plane.  It's still in
> scope here, though, as part of the overall description of the protocol
> flow.
> 
> There are almost innumerable downgrade attacks possible, and
> the control-plane and data-plane security mechanisms are not normative
> dependencies of the current corpus of documents, and as such are not up for
> consideration as mitigating the security concerns with the core documents.
> 
> Section 3 defines the EID-to-RLOC Database:
> 
>     EID-to-RLOC Database:   The EID-to-RLOC Database is a global
>        distributed database that contains all known EID-Prefix-to-RLOC
>        mappings.  Each potential ETR typically contains a small piece of
>        the database: the EID-to-RLOC mappings for the EID-Prefixes
>        "behind" the router.  These map to one of the router's own
>        globally visible IP addresses.  Note that there MAY be transient
>        conditions when the EID-Prefix for the site and Locator-Set for
>        each EID-Prefix may not be the same on all ETRs.  This has no
>        negative implications, since a partial set of Locators can be
>        used.
> 
> No compelling architecture for a trustworthy global distributed database
> has been presented that I've seen so far, and LISP relies heavily on the
> mapping system's database for its functionality.  I am concerned that so
> many requirements are placed on the mapping system that it is in effect
> unimplementable, in which case it would seem that the architecture as a
> whole (that is, for a global Internet-scale system) is not fit for purpose.
> 
> Section 4.1's Step (6) only mentions parsing "to check for format
> validity".  I think it is appropriate to mention (and refer to) source
> authentication checks as well, since bad Map-Reply data can allow all sorts
> of attacks to occur.
> 
> There are some fairly subtle ordering requirements between the order of
> entries in Map-Reply messages and the Locator-Status-Bits in data-plane
> traffic (so that the status bits can be interpreted consistently),
> which are only given a minimal treatment in the control-plane document.  The
> need for synchronization in interpreting these bits should be mentioned
> more prominently in the data-plane document as well.
> 
> The usage of the Instance ID does not seem to be adequately covered; from
> what I've been able to pick up so far it seems that both source and
> destination participants must agree on the meaning of an Instance ID, and
> the source and destination EIDs must be in the same Instance.  This does
> not seem like it is compatible with Internet scale, especially if there are
> only 24 usable bits of Instance ID.
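
To put a number on the scale concern above, a minimal sketch. The 24-bit width is taken from the text; the function names and the same-Instance rule as coded are hypothetical and simply mirror the reading described above, not text from the draft.

```python
INSTANCE_ID_BITS = 24  # usable width cited in the review text


def instance_id_capacity(bits: int = INSTANCE_ID_BITS) -> int:
    """Number of distinct Instance IDs representable in `bits` bits."""
    return 1 << bits


def same_instance(src_iid: int, dst_iid: int) -> bool:
    """Assumed rule: source and destination EIDs must share an Instance ID."""
    return src_iid == dst_iid


print(instance_id_capacity())  # 16777216
```

Under this reading, every pair of communicating sites has to draw from, and agree on, one shared space of about 16.7 million values.
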
> 
> There seem to be a lot of intra-site synchronization requirements, notably
> with respect to Map-Version consistency, the contents and ordering of
> locator sets for EIDs in the site, etc.; the actual hard requirements for
> synchronization within a site should be clearly called out, ideally in a
> single location.
> 
> The security considerations attempt to defer substantially to the
> threat-analysis in RFC 7835, which does not really seem like a complete
> threat analysis and does not provide analysis as to what requirements are
> placed on the boundaries between the different components of LISP (data
> plane, control plane, mapping system, various extensions, etc.).  The
> secdir reviewer had some good thoughts in this space.
> 
> The security considerations throughout the LISP documents place a heavy
> focus on the risk of over-claiming for routing EID-prefixes.  This is a
> real concern, to be clear, but it should not overshadow the risk of an
> attacker who is able to move traffic around at will, strip security
> protections, cause denial of service, alter data-plane payloads, etc.
> Similarly, this document's security considerations call out denial of
> service as a risk from Map-Cache insertion/spoofing, but the risks from an
> attacker being able to read and modify the traffic, perhaps even without
> detection, seem a much greater threat to me.
> 
> I am not convinced that this protocol meets the current IETF requirements
> for the security properties of Standards-Track Protocols without at least
> LISP-SEC as a mandatory-to-implement component, and possibly additional or
> stronger requirements.  (I did not do a full analysis of the system in the
> presence of those security mechanisms, since that is not what is being
> presented for review.)
> 
> Having an EID that is associated to user-correlatable devices has severe
> privacy considerations, but I could not find this mentioned anywhere in all
> of the LISP documents I've read so far.
> 
> 
> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
> 
> I apologize for the somewhat scattered nature of these comments; there are
> a lot of them and I was focusing my time more on trying to understand the
> broader system, and the intended security posture, so they did not get as
> much clean-up as I would have liked.  (Most of my review was performed on the
> -18, though I have tried to update to the -20 as relevant.)
> 
> 
> The instance ID provides for organizational correlation, another privacy
> exposure.
> 
> Is there anything different between an "EID-to-RLOC Map-Request" and just a
> "Map-Request"?  (Same question for "Map-Reply", too.)
> 
> There's a lot of stuff that seems to work best if there is symmetric
> bidirectional traffic, with inline signalling of map version and
> reachability changes, though clearly everything is designed to also work
> with asymmetric connectivity or unidirectional traffic.  It would be nice
> to have a high-level summary in or near the introduction about what kinds
> of behavior/performance differences are expected for bidirectional vs.
> unidirectional traffic.
> 
> Section 2
> 
> That's not the 8174 boilerplate; it's more than just adding a cite to the
> 2119 boilerplate.
> 
> Section 3
> 
> nit: "An address family that pertains to the Data-Plane." is a sentence
> fragment.
> 
>     Ingress Tunnel Router (ITR):   An ITR is a router that resides in a
>        [...]
>        mapping lookup in the destination address field.  Note that this
>        destination RLOC MAY be an intermediate, proxy device that has
>        better knowledge of the EID-to-RLOC mapping closer to the
> 
> A 2119 MAY does not seem necessary here; this is rather a statement of
> fact, one that may not be known to the encapsulating ITR.
> 
>        Specifically, when a service provider prepends a LISP header for
>        Traffic Engineering purposes, the router that does this is also
>        regarded as an ITR.  The outer RLOC the ISP ITR uses can be based
>        on the outer destination address (the originating ITR's supplied
>        RLOC) or the inner destination address (the originating host's
>        supplied EID).
> 
> I'm confused here, perhaps in multiple ways.  Are there now *two* LISP
> headers on the packet?  Is the "outer RLOC the ISP ITR uses" the source
> RLOC or the destination RLOC?
> 
>     Negative Mapping Entry:   A negative mapping entry, also known as a
>        negative cache entry, is an EID-to-RLOC entry where an EID-Prefix
>        is advertised or stored with no RLOCs.  That is, the Locator-Set
>        for the EID-to-RLOC entry is empty or has an encoded Locator count
>        of 0.
> 
> Is "empty" a distinct representation from "locator count of zero"?
> 
> Perhaps something of an aside, but the check described for
> Route-Returnability is a somewhat weak check, and in some cases could still
> be spoofed.  (I don't expect this to surprise anyone, of course, but
> perhaps some more qualifiers could be added to the text.)
> 
> Section 4
> 
>     An additional LISP header MAY be prepended to packets by a TE-ITR
>     when re-routing of the path for a packet is desired.  A potential
>     use-case for this would be an ISP router that needs to perform
>     Traffic Engineering for packets flowing through its network.  In such
>     a situation, termed "Recursive Tunneling", an ISP transit acts as an
>     additional ITR, and the RLOC it uses for the new prepended header
>     would be either a TE-ETR within the ISP (along an intra-ISP traffic
>     engineered path) or a TE-ETR within another ISP (an inter-ISP traffic
>     engineered path, where an agreement to build such a path exists).
> 
> "the RLOC it uses for the new prepended header", again, this is as the
> destination RLOC (vs. source RLOC)?
> 
> Section 4.1
> 
>     o  Map-Replies are sent on the underlying routing system topology
>        using the [I-D.ietf-lisp-rfc6833bis] Control-Plane protocol.
> 
> Just to check my understanding: is the "underlying routing system topology"
> the same as the "underlay"?
> 
> Is step (3) just describing more of what step (2) says is "not described in
> this example"?
> 
> Section 5.3
> 
> The word "nonce" is normally used for something used exactly once.
> E.g., with some AEAD algorithms, if the same "nonce" input is used for
> different encryptions, the entire security of the system is compromised.
> It would be better to refer to this field with a different term, given
> that "the same nonce can be used for a period of time when encapsulating to
> the same ETR".  "Uniquifier" or "random value" might be reasonable choices.
> 
> Why is there no discussion of the Map-Version or Instance-ID fields
> in this section?
> 
> When doing ETR/PETR decapsulation:
> 
>     o  The inner-header 'Time to Live' field (or 'Hop Limit' field, in
>        the case of IPv6) SHOULD be copied from the outer-header 'Time to
>        Live' field, when the Time to Live value of the outer header is
>        less than the Time to Live value of the inner header.  Failing to
>        perform this check can cause the Time to Live of the inner header
>        to increment across encapsulation/decapsulation cycles.  This
>        check is also performed when doing initial encapsulation, when a
>        packet comes to an ITR or PITR destined for a LISP site.
> 
> Er, what is "this check" that is also performed for initial encapsulation?
> How are there multiple TTL values to compare?
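
One reading of the quoted decapsulation rule, as a minimal sketch. The function name is hypothetical, and only the "copy when the outer value is smaller" condition from the quoted text is coded; nothing here is taken from an implementation.

```python
def inner_ttl_after_decap(inner_ttl: int, outer_ttl: int) -> int:
    """Return the inner-header TTL (or Hop Limit) after decapsulation.

    The outer value is copied into the inner header only when it is
    smaller, so the inner TTL can never grow across a tunnel hop.
    """
    if outer_ttl < inner_ttl:
        return outer_ttl
    return inner_ttl
```

With this rule the result is always min(inner, outer). An unconditional copy would instead raise an inner TTL of 10 to 60 when the outer header arrives with 60, which is the increment the quoted text warns about.
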
> 
>     o  The inner-header 'Differentiated Services Code Point' (DSCP) field
>        (or the 'Traffic Class' field, in the case of IPv6) SHOULD be
>        copied from the outer-header DSCP field ('Traffic Class' field, in
>        the case of IPv6) to the inner-header.
> 
> nit: the first "inner-header" seems like an editing remnant?
> 
> Section 7.1
> 
> How is this stateless if it involves knowledge about the routers between
> the ITR and all possible ETRs (i.e., a set that could change over time)?
> 
> Section 8
> 
> This 32-bit vs 24-bit thing is pretty hokey for a standards-track
> specification (yes, I know that LISP-DDT is not standards track at the
> moment).
> 
> Section 9
> 
>     Alternatively, RLOC information MAY be gleaned from received tunneled
> 
> What is this an alternative to?  The list of four options above?
> 
>     packets or EID-to-RLOC Map-Request messages.  A "gleaned" Map-Cache
>     entry, one learned from the source RLOC of a received encapsulated
>     packet, is only stored and used for a few seconds, pending
>     verification.  Verification is performed by sending a Map-Request to
>     the source EID (the inner-header IP source address) of the received
>     encapsulated packet.
> 
> The source EID is some random end system, right?  So this relies on some
> magic in the ETR to detect that there's a Map-Request and reply directly
> instead of passing it on to the EID that won't know what to do with it?
> 
> Talking about the "R-bit" of the Map-Reply is detail from 6833bis and
> might benefit from an explicit section reference to the other document.
> 
> Section 10
> 
> What is the "CE" of "CE-based ITRs"?  Presumably Customer Edge, but it
> is not marked as well-known at
> https://www.rfc-editor.org/materials/abbrev.expansion.txt so expansion is
> probably in order.
> 
> Again, when we are talking about the internal structure of the Map-Reply, a
> detailed section reference to 6833bis is useful.
> 
> Modifying LSBs seems like a fine DoS attack vector for an on-path attacker.
> 
>     value of 1.  Locator-Status-Bits are associated with a Locator-Set
>     per EID-Prefix.  Therefore, when a Locator becomes unreachable, the
>     Locator-Status-Bit that corresponds to that Locator's position in the
>     list returned by the last Map-Reply will be set to zero for that
>     particular EID-Prefix
> 
> Doesn't this imply a stateful relationship between the ordering of
> Map-Replys and data-plane traffic?
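
The ordering dependence in question can be sketched as follows. This is a minimal illustration, assuming that bit i (counting from the least significant bit) refers to the i-th Locator in the list from the last Map-Reply; the function name and the documentation-range addresses are hypothetical.

```python
def usable_locators(locator_set: list[str], status_bits: int) -> list[str]:
    """Return the Locators whose Locator-Status-Bit is set to 1.

    Assumption: bit i (LSB-first) maps to locator_set[i], i.e. to the
    position in the list returned by the last Map-Reply.
    """
    return [
        rloc
        for i, rloc in enumerate(locator_set)
        if (status_bits >> i) & 1
    ]


sender_view = ["192.0.2.1", "198.51.100.1", "203.0.113.1"]
receiver_view = ["198.51.100.1", "192.0.2.1", "203.0.113.1"]  # different order

print(usable_locators(sender_view, 0b101))    # ['192.0.2.1', '203.0.113.1']
print(usable_locators(receiver_view, 0b101))  # ['198.51.100.1', '203.0.113.1']
```

The same bits select different Locators once the two sides hold differently ordered lists, so the bits are only meaningful relative to shared Map-Reply state.
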
> 
> Section 10.1
> 
>     Note that "ITR" and "ETR" are relative terms here.  Both devices MUST
>     be implementing both ITR and ETR functionality for the echo nonce
>     mechanism to operate.
> 
> Perhaps they could be given actual names so as to disambiguate which steps
> are performed with ITR vs. ETR role?
> 
>     The echo-nonce algorithm is bilateral.  That is, if one side sets the
>     E-bit and the other side is not enabled for echo-noncing, then the
>     echoing of the nonce does not occur and the requesting side may
>     erroneously consider the Locator unreachable.  An ITR SHOULD only set
>     the E-bit in an encapsulated data packet when it knows the ETR is
>     enabled for echo-noncing.  This is conveyed by the E-bit in the RLOC-
>     probe Map-Reply message.
> 
> Why is this even optional?  If it was mandatory to use, then there would
> not be a question.  But at least clarify that the "this" that is conveyed
> is whether the peer supports the echo-nonce algorithm.  (Also, subject to
> downgrade.)
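
The failure mode in the quoted paragraph can be sketched as a small truth table. This is a minimal illustration with hypothetical names; it models only the inference the requesting side draws, not packet formats or timers.

```python
from typing import Optional


def inferred_reachability(e_bit_set: bool, peer_echo_enabled: bool) -> Optional[bool]:
    """What the requesting side concludes about the Locator.

    None  -> no echo was requested, so nothing is inferred.
    True  -> the nonce came back, Locator considered reachable.
    False -> no echo seen, Locator considered unreachable.
    """
    if not e_bit_set:
        return None
    # The peer echoes the nonce only if echo-noncing is enabled on its side.
    return peer_echo_enabled


# The erroneous case from the quoted text: E-bit set, peer not enabled.
print(inferred_reachability(True, False))  # False
```

The last case returns False even though the Locator may be perfectly reachable, which is why the requesting side needs a trustworthy signal of peer support before setting the E-bit.
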
> 
> Section 13
> 
>     When a Locator record is removed from a Locator-Set, ITRs that have
>     the mapping cached will not use the removed Locator because the xTRs
>     will set the Locator-Status-Bit to 0.  So, even if the Locator is in
>     the list, it will not be used.  For new mapping requests, the xTRs
>     can set the Locator AFI to 0 (indicating an unspecified address), as
>     well as setting the corresponding Locator-Status-Bit to 0.  This
>     forces ITRs with old or new mappings to avoid using the removed
>     Locator.
> 
> The behavior described here seems like it would be better described as "when
> a Locator is taken out of service" than "removed from a Locator-Set", since
> if it is not in the set at all, it has no index, and no LSB or AFI to set.
> Should actually depopulating it like this be forbidden?
> 
> I guess the Map Versioning is supposed to help with this, but we need to
> nail down the semantics more and/or give a clearer reference to it.
> 
> Section 13.1
> 
>     An ITR, when it encapsulates packets to ETRs, can convey its own Map-
>     Version Number.  This is known as the Source Map-Version Number.
> 
> Consider replacing "its own Map-Version Number" with something like "the
> Map-Version number for the LISP site of which it is a part".  Writing this causes me to
> note that the semantics of the Map-Version are unclear, here -- what is it
> scoped to?  An EID-Prefix?  An RLOC?  Oh, you say that in the next
> paragraph (EID-Prefix).
> 
>     A Map-Version Number can be included in Map-Register messages as
>     well.  This is a good way for the Map-Server to assure that all ETRs
>     for a site registering to it will be synchronized according to Map-
>     Version Number.
> 
> Huh?  I must be confused how this works.  (Also, wouldn't this be better in
> the control plane document which covers Map-Register?)
> 
> Section 15
> 
>     o  When a tunnel-encapsulated packet is received by an ETR, the outer
>        destination address may not be the address of the router.  This
>        makes it challenging for the control plane to get packets from the
>        hardware.  This may be mitigated by creating special Forwarding
>        Information Base (FIB) entries for the EID-Prefixes of EIDs served
>        by the ETR (those for which the router provides an RLOC
>        translation).  These FIB entries are marked with a flag indicating
>        that Control-Plane processing SHOULD be performed.
> 
> I assume this is just my lack of background showing, but I'm confused how
> it makes sense to mark these for control-plane processing.  Isn't the
> control plane much slower, and surely we are not putting all of the LISP
> data-plane traffic onto the slow path?
> 
> Section 18
> 
>     o  Data-Plane gleaning for creating map-cache entries has been made
>        optional.  If any ITR implementations depend or assume the remote
>        ETR is gleaning should not do so.
> 
> nit: this is ungrammatical; "they should not" or "Any ITR implementations
> that depend on or assume that" would fix it.
> 
> Section 19.1
> 
> Presumably IANA also updated the reference column to point to this
> document?
> 
> 
>