Re: [lisp] Warren Kumari's Discuss on draft-ietf-lisp-rfc6830bis-26: (with DISCUSS and COMMENT)

Dino Farinacci <farinacci@gmail.com> Wed, 06 February 2019 17:48 UTC

From: Dino Farinacci <farinacci@gmail.com>
Message-Id: <E8FC2F26-A7F3-454C-ABBB-C3B47536EB58@gmail.com>
Content-Type: multipart/mixed; boundary="Apple-Mail=_7F48DF72-0693-4ABB-B1FB-1E54B770FE77"
Mime-Version: 1.0 (Mac OS X Mail 11.5 \(3445.9.1\))
Date: Wed, 06 Feb 2019 09:48:06 -0800
In-Reply-To: <154941971479.32132.7227582520612116720.idtracker@ietfa.amsl.com>
Cc: The IESG <iesg@ietf.org>, draft-ietf-lisp-rfc6830bis@ietf.org, Luigi Iannone <ggx@gigix.net>, lisp-chairs@ietf.org, lisp@ietf.org
To: Warren Kumari <warren@kumari.net>
References: <154941971479.32132.7227582520612116720.idtracker@ietfa.amsl.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/lisp/X1icB41g-5tX1_4DWR1R_MdC7SI>
Subject: Re: [lisp] Warren Kumari's Discuss on draft-ietf-lisp-rfc6830bis-26: (with DISCUSS and COMMENT)
Precedence: list

Warren, thanks for the review. I have changed the text to address your comments and nits below. A new diff file is enclosed at the end.

I want to send one message to the IESG. This is no reflection on Warren or his commentary. But the standards process seems to be at an all-time low from my perspective. As of this January 2019, I have been coming to the IETF for 30 years. So I think I’m not speaking off the cuff here; I speak from what I have experienced over that long time frame.

We have been trying to get these documents to move forward, and it seems that the new people who come to the IESG and do the reviews are not expert in the art, and hence we have to explain basic LISP. This has been happening for about 6 quarters now. We explain, people understand, a quarter passes, there is silence, new people come into the review process, and we explain again.

We have redone the text so many times that we have likely undone commentary from people experienced in the art who commented years ago. What if they come back now and say “you changed the text again”?

To the authors, this seems nonsensical, never-ending, and unproductive. One can see why open-source approaches are outcompeting the IETF process. I’ll stop there.

I’ll explain once again in the DISCUSS comments below because I know Warren put effort into this when he should have been resting.  ;-)

> On Feb 5, 2019, at 6:21 PM, Warren Kumari <warren@kumari.net> wrote:
> 
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
> 
> I read much of this on a plane while overtired, so it is entirely possible /
> probable that I've completely misunderstood something(s) obvious. Many of the
> below are probably simple to address, and either I simply need to be educated,
> or just there needs to be a bit more text / detail provided.

I will respond briefly and directly.

> 1: "3.  The ITR sends a LISP Map-Request as specified in
> [I-D.ietf-lisp-rfc6833bis].  Map-Requests SHOULD be rate-limited." What does
> the ITR do with the packet while waiting for the Map-Request to complete? Must
> it buffer the packets or can it discard them? If the former, for how long must
> it buffer? When you say "SHOULD be rate-limited", can you provide guidance on
> rates? 1 request per second? 1 million per second? Is this rate-limit per
> destination or per device? Apologies if this is clearly stated in RFC6833(bis)
> - I only skimmed it, and didn't see an answer there.

We have said drop or queue. We discourage queuing because one never knows which packets are the important ones to queue. It is very much the same as the ARP resolution issue.
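To illustrate the drop option together with rate-limited Map-Requests, here is a minimal Python sketch. It is not taken from the draft: the class name `Itr`, the exact-match cache lookup (a real Map-Cache does longest-prefix matching), and the one-second per-destination interval are all assumptions made for the example.

```python
class Itr:
    """Toy ITR: drop on a Map-Cache miss, rate-limit Map-Requests."""

    def __init__(self, min_interval=1.0):
        self.map_cache = {}        # dest EID -> RLOC (exact match, simplified)
        self.last_request = {}     # dest EID -> time of last Map-Request
        self.min_interval = min_interval  # hypothetical rate limit, seconds
        self.sent_requests = []    # record of Map-Requests "sent"

    def forward(self, dest_eid, now):
        rloc = self.map_cache.get(dest_eid)
        if rloc is not None:
            return ("encapsulate", rloc)
        # Cache miss: send a Map-Request only if the interval has elapsed.
        last = self.last_request.get(dest_eid)
        if last is None or now - last >= self.min_interval:
            self.last_request[dest_eid] = now
            self.sent_requests.append((dest_eid, now))
        # The packet itself is dropped rather than queued.
        return ("drop", None)


itr = Itr()
print(itr.forward("10.0.0.1", now=0.0))   # miss: Map-Request sent, packet dropped
print(itr.forward("10.0.0.1", now=0.5))   # miss: rate-limited, packet dropped
itr.map_cache["10.0.0.1"] = "192.0.2.1"   # pretend the Map-Reply arrived
print(itr.forward("10.0.0.1", now=0.6))   # hit: packet is encapsulated
```

Time is passed in explicitly so the behavior is easy to follow; a real implementation would use a clock and would also age out the rate-limit state.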

> 2: "6. ... Note that the Map-Cache is an on-demand cache. An ITR will manage
> its Map-Cache in such a way that optimizes for its resource constraints."
> Presumably I could cause this cache to thrash / overflow by looking at the RLOC
> database, and choosing EIDs to send traffic to which all require different
> cache entries, causing the cache to overflow (or, at least, causing maximum
> cache pressure). This seems like an ideal DoS vector. It seems that there
> should be more guidance provided on how to size the Map-Cache / the expected
> order of the cache size, even if it is ultimately an implementation issue (e.g:
> is a Map-Cache of 100 entries OK for an ITR? or should it be O(1000)? Or
> roughly size(database)/2? Having multiple devices with small caches, and a bot
> which does the above seems like a global risk).

What happens if you send 10,000,000 BGP routes in an update and the receiver can’t store them? It is the same problem. We have lots of research on this problem and there are pointers to it in the spec.

> I'm quite confused by much of the MTU / Fragmentation stuff -- I did read the
> documents on a plane after not getting much sleep, and so it is entirely
> possible / probable that I'm just being stupid, but there are bits which don't
> seem to make sense to me. 3: "2.  Define L to be the size, in octets, of the
> maximum-sized packet an ITR can send to an ETR without the need for the ITR or
> any intermediate routers to fragment the packet." How do I know what L is? The
> document "RECOMMENDS that L be defined as 1500" -- but 1500 isn't universally
> true (if it were, we would never have to do Path MTU). What happens when the
> *actual* MTU on the path is e.g 1476 because there is a tunnel on the path? The
> text also mentions "which is less than the ITR’s estimate of the path MTU
> between the ITR and its correspondent ETR" - this implies that the ITR is
> tracking / estimating the MTU, which a: doesn't align with the rest of the
> text, or b: sounds like the stateful solution below. I have reread this
> multiple times, but it still feels like it is avoiding the issue by defining it
> to not exist.

L is an architectural constant. If there tend to be tunnels between the ITR and ETR, then you choose L to be 1400. Or you run MTU discovery between the ITR and ETR to determine the effective MTU.
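As a rough illustration of how the choice of L plays out, the Python sketch below compares the encapsulated packet size against L. The function names are hypothetical, and the overhead figures assume an outer IP header without options (20 octets for IPv4, 40 for IPv6) plus 8 octets each for the UDP and LISP headers.

```python
# Encapsulation overhead: outer IP header + UDP header + LISP header.
OUTER_IP_HEADER = {4: 20, 6: 40}   # octets, no IPv4 options or IPv6 extensions
UDP_HEADER = 8
LISP_HEADER = 8


def encap_size(inner_len, outer_ip_version=4):
    """Size in octets of the packet after LISP encapsulation."""
    return inner_len + OUTER_IP_HEADER[outer_ip_version] + UDP_HEADER + LISP_HEADER


def fits(inner_len, L=1500, outer_ip_version=4):
    """True if the encapsulated packet can be sent without exceeding L."""
    return encap_size(inner_len, outer_ip_version) <= L


print(encap_size(1464))        # 1500 with an IPv4 outer header
print(fits(1464, L=1500))      # True
print(fits(1464, L=1400))      # False: too big once L allows for tunnels
print(fits(1364, L=1400))      # True
```

With L lowered to 1400 to leave room for tunnels on the path, the largest inner packet that still fits behind an IPv4 outer header is 1364 octets.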

> 4: "Note that reassembly can happen at the ETR if the encapsulated packet was
> fragmented at or after the ITR." - I think that there needs to be more text /
> description about resource constraints on routers performing reassembly of
> fragments - in most cases a router doesn't have to / isn't expected to have to
> reassemble transit packets from arbitrary sources on the Internet (things where
> routers may reassemble are aimed at the control plane which can be
> rate-limited, or are from expected source addresses). It seems that spoofing
> lots of initial fragments without the final one will be a tax on the router.

We have chosen 3 methods to deal with MTU issues because ETR reassembly is the worst approach. And I certainly wouldn’t recommend using it.

> 5: "Instead of using the Map-Cache or mapping system, RLOC information MAY be
> gleaned from received tunneled packets or Map-Request messages. A "gleaned"
> Map-Cache entry, one learned from the source RLOC of a received encapsulated
> packet, is only stored and used for a few seconds, pending verification." - it
> seems that this is ripe for abuse (or I'm missing in the cache expiration). I
> want to hijack traffic from Site X to well known Service Y, so I look up
> Service Y and save the TTL from the Map-Reply. I then start spoof packets
> listing myself as the ETR - eventually Site X will glean from my spoofed
> packets, and start sending traffic to me - yes, this will only work for a few
> seconds -- but as soon as I stop getting packets from site X, I know site X has
> verified the entry and discovered it is wrong... and that the TTL is now being
> deprecated. I start a timer, and second or two less than the TTL later I start
> spoofing packets again, knowing that site X will soon expire the cache entry
> and will once again be willing to accept mine again. A: I get some Site X to
> Site Y traffic for a few seconds every TTL seconds, and B: the loss of this
> traffic is a signal that TTL seconds again it will need to be refreshed.

It was a Google employee in the early days that wanted this feature (circa 2007). ;-) It was so return packets from the server side didn’t have to do the mapping system lookup. It is off by default and only used in trusted environments where you can control how the ITR and ETR behave.

> 6: "10.1.  Echo Nonce Algorithm" -- If I spoof lots of packets with the N- and
> E-bits set, the receiving ETR will need to keep false state, and presumably I
> can overfill a cache. This will cause the ETR to not be able to include the
> received nonce on legitimate traffic, and so the ITR on the far side will
> think this ETR is down. This seems like a fairly easy DoS. I'm guessing that
> this can be worked around by not setting the E bit in the RLOC-probe Map-Reply
> message, but this feels like a dangerous foot gun, and should at least be
> noted. Note that this is different to the "Note the attacker must guess a
> valid nonce the ITR is requesting to be echoed within a small window of time. 
> The goal is to convince the ITR that the ETR’s RLOC is  reachable even when it
> may not be reachable."  attack listed in the document in that a: it doesn't
> require any guessing, and b: makes an ETR appear down, not up.

You can’t overfill any cache. An xTR just remembers the last nonce that came with the E-bit set and when it returns packets it uses that nonce.
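A small Python sketch of that "remember only the last nonce" behavior follows. It is an illustration, not the draft's algorithm: the class and method names are made up, and the scope at which an xTR keeps this state is an assumption not covered here.

```python
class EchoNonceState:
    """Single-slot state: only the most recent echo-requested nonce is kept."""

    def __init__(self):
        self.pending_nonce = None

    def on_receive(self, n_bit, e_bit, nonce):
        # A packet with both N- and E-bits set asks for its nonce to be echoed.
        # Storing it overwrites whatever was there; nothing accumulates.
        if n_bit and e_bit:
            self.pending_nonce = nonce

    def nonce_to_echo(self):
        # Nonce to place in returning packets, or None if none was requested.
        return self.pending_nonce


state = EchoNonceState()
for nonce in range(1000):          # a flood of echo requests
    state.on_receive(n_bit=True, e_bit=True, nonce=nonce)

print(state.nonce_to_echo())       # only the last one is remembered
```

However many requests arrive, the state stays at one value, which is why there is no cache to overfill in this model.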

Yes, many implementations default to not advertising that they are echo-nonce capable in Map-Replies. So RLOC-probing tends to be used for RLOC reachability. Plus we added features to RLOC-probing that make it more useful now (lisp-crypto key exchange for one).

> The document does mention "... attack can be mitigated by preventing RLOC
> spoofing in the network by deploying uRPF BCP 38 [RFC2827]." - while that may
> be true for many of the above, BCP38 is far from being universally deployed,
> and this feels similar to solving world hunger by saying everyone must have
> enough food. :-)

An ETR can verify mappings if it chooses to. You can put as much overhead into the system for anti-spoofing as you want. Tradeoffs.

By the way food is everywhere, you just have to be willing to eat it.  ;-)

> Again, apologies if I've completely misunderstood something, clue-bat gladly
> accepted…

You did a pretty good job. Thanks.

> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
> 
> Comments:
> 1: "LISP Locator-Status-Bits (LSBs):  ... The field is 32 bits when the I-bit
> is set to 0 and is 8 bits when the I-bit is set to 1." - I think I'm missing
> something fairly fundamental here (and in Section 10, 13.1, and others) - if
> I'm using the I bit, it sounds like I can only have 8 ETRs? And with this clear
> I can only have 32? I feel like I must have missed something….

Right, 8 ETRs per EID-prefix. If you need more, you take the EID-prefix, cut it in half by increasing the mask-length by 1, and then use a different set of 8 for each more-specific EID-prefix. Otherwise, you use RLOC-probing, and the number of RLOCs can be up to 255 (the width of the RLOC-Count field in an EID-record).
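The prefix-splitting idea can be sketched in a few lines of Python using the standard `ipaddress` module. The EID-prefix, the RLOC names, and the way RLOCs are divided between the two halves are all hypothetical choices for the example.

```python
import ipaddress

MAX_LSB_RLOCS = 8  # ETRs per EID-prefix that 8 Locator-Status-Bits can track


def split_for_lsbs(prefix, rlocs, limit=MAX_LSB_RLOCS):
    """Halve the EID-prefix until each more-specific has at most `limit` RLOCs."""
    if len(rlocs) <= limit:
        return {prefix: rlocs}
    # Increasing the mask-length by 1 yields the two halves of the prefix.
    low, high = prefix.subnets(prefixlen_diff=1)
    mid = len(rlocs) // 2
    result = split_for_lsbs(low, rlocs[:mid], limit)
    result.update(split_for_lsbs(high, rlocs[mid:], limit))
    return result


eid_prefix = ipaddress.ip_network("10.1.0.0/16")   # example EID-prefix
rlocs = [f"rloc-{i}" for i in range(16)]           # 16 ETRs, too many for 8 LSBs

mappings = split_for_lsbs(eid_prefix, rlocs)
for prefix, locator_set in mappings.items():
    print(prefix, len(locator_set))
```

Here the /16 becomes two /17 more-specifics, each with its own set of 8 RLOCs.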

> 2: "When the ’DF’ field of the IP header is set to 1, or the packet is an IPv6
> packet originated by the source host, the ITR will drop the packet when the
> size is greater than L and send an ICMPv4..." I think this needs to say when
> the resulting (or encapsulated) packet is greater than L (otherwise it is
> unclear if you are referring to the original or resulting packet).

Yes, I will clarify. Great point.

> 3: "The server-side sets a Weight of zero for the RLOC subset list. In this
> case, the client-side can choose how the traffic load is  spread across the
> subset list." -- please insert a reference to Section 12 here. I wrote up a
> long comment on what happens of the load sharing delivers packets to different
> ETRs, before finding this section later on in the document.

Done.

> Nits:
> 1: " LISP does not require changes to either host protocol stack or to underlay
> routers. " -- I think either "to either the host protocol stack" or " to either
> host protocol stacks"
> 
> 2: "As an exmple of such attacks an off-path attacker can" -- typo for example.
> 
> 3: "it can protect itself from erroneious reachability attacks" -- typo for
> erroneous.

Fixed.

Thanks again,
Dino

Attachment: rfcdiff-rfc6833bis-27.html
