Re: [Int-area] I-D Action: draft-ietf-intarea-tunnels-05.txt

Joe Touch <> Wed, 03 May 2017 18:49 UTC

Hi, Fred,

Your response keeps raising the same issues that I think we agree upon:

    - PMTUD always has the possibility of creating black holes (whether
in the presence of multipath or not)

    - PLPMTUD can generate cases where the MIN isn't actually detected

and some claims with which I disagree:

    - that PLPMTUD probes travel different paths than the data

However, PLPMTUD probes are generated by transport or higher layers,
using the same transport protocol and port pairs as the data (they ARE
the data). Those probes should be used to determine an MTU only for a
given flow; the potential hazards of sharing that information across
flows are already discussed in RFC 4821. The entire point of PLPMTUD is
that the probes travel the same path as the data. And yes, any
multipath system could give false information, but it should eventually
converge on a useful minimum as long as data packets get through.
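
To make the per-flow point concrete, here is a minimal sketch of a probe
search kept separately for each flow. It is an illustration only: the
class and function names, the 1280/1500 search bounds, and the simple
binary search are assumptions for this example, not the procedure from
RFC 4821 or the draft. The "path" is simulated by a single fixed MTU, so
it does not model multipath.

```python
from dataclasses import dataclass


@dataclass
class FlowProbeState:
    """Hypothetical per-flow search state (names are illustrative)."""
    search_low: int = 1280   # largest size assumed to work (assumption)
    search_high: int = 1500  # largest size that might still work (assumption)

    def next_probe_size(self):
        """Return the next size to probe, or None once the search has converged."""
        if self.search_low >= self.search_high:
            return None
        # Round up so the probe is always strictly larger than search_low.
        return (self.search_low + self.search_high + 1) // 2

    def on_delivered(self, size):
        # The probe was data for this flow and it got through.
        self.search_low = max(self.search_low, size)

    def on_lost(self, size):
        # The probe was lost, so treat this size as too large.
        self.search_high = min(self.search_high, size - 1)


def discover(state, path_mtu):
    """Run the search against a simulated path that drops anything above path_mtu."""
    while (size := state.next_probe_size()) is not None:
        if size <= path_mtu:
            state.on_delivered(size)
        else:
            state.on_lost(size)
    return state.search_low


# One state object per flow, keyed by a hypothetical 5-tuple; the value is
# the simulated MTU of the path that flow happens to take.
flows = {
    ("10.0.0.1", 5000, "10.0.0.2", 80, "tcp"): 1400,
    ("10.0.0.1", 5001, "10.0.0.2", 80, "tcp"): 1500,
}
results = {flow: discover(FlowProbeState(), mtu) for flow, mtu in flows.items()}
```

Because each flow keeps its own state, a result learned on one flow is
never applied to another, which is the property argued for above.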

    - that tunnels behave differently than unicast links in this regard

I don't yet see any explanation as to why this would be true.

So I'm left with the following, which I propose as the way forward:

    - the text will be clear that multipath can make MTU discovery take
a long time to converge

    - the text will be clear about the potential issues for black-holing

However, in both cases, I don't see a good reason to flag how a tunnel
behaves as unique.

Does that work?

If not, why not?
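
For reference, the two ingress options in the quoted proposal below
(track PMTU per flow context, or merge feedback by taking the MIN) could
be sketched roughly as follows. All names here are hypothetical, and the
sketch assumes feedback actually arrives; it does nothing about paths
that never report, which is the black-hole case discussed in this thread.

```python
class PmtuCache:
    """Hypothetical ingress-side PMTU cache with two merge policies."""

    def __init__(self, link_mtu, per_flow):
        self.link_mtu = link_mtu   # starting value before any feedback
        self.per_flow = per_flow   # True: key by flow; False: single MIN
        self.by_flow = {}
        self.merged = link_mtu

    def record_feedback(self, flow_id, reported_mtu):
        """Fold one reported MTU value into the cache."""
        if self.per_flow:
            current = self.by_flow.get(flow_id, self.link_mtu)
            self.by_flow[flow_id] = min(current, reported_mtu)
        else:
            # Merge policy: keep the smallest value reported by any path.
            self.merged = min(self.merged, reported_mtu)

    def lookup(self, flow_id):
        """Return the MTU to use for a packet belonging to flow_id."""
        if self.per_flow:
            return self.by_flow.get(flow_id, self.link_mtu)
        return self.merged


per_flow_cache = PmtuCache(1500, per_flow=True)
min_cache = PmtuCache(1500, per_flow=False)
for cache in (per_flow_cache, min_cache):
    cache.record_feedback("flow-a", 1400)
    cache.record_feedback("flow-b", 1280)
```

The MIN policy needs only one value of state but penalizes every flow
for the narrowest reported path; the per-flow policy avoids that at the
cost of state that grows with the number of flows.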


On 5/3/2017 10:48 AM, Templin, Fred L wrote:
> Hi Joe,
> Sorry for the extended delay - see below for responses:
> Thanks - Fred
>> -----Original Message-----
>> From: Joe Touch []
>> Sent: Wednesday, March 29, 2017 3:06 PM
>> To: Templin, Fred L <>
>> Cc:
>> Subject: Re: [Int-area] I-D Action: draft-ietf-intarea-tunnels-05.txt
>> Hi, Fred,
>> I think we're agreed except for PMTUD and PLPMTUD. See below about the
>> latter (AFAICT, if it black holes, then PLPMTUD would detect a
>> failure to make forward progress). Black-holing in PMTUD is a known
>> problem that neither this doc nor any other is fixing, yet PMTUD
>> remains an established standard.
>> Joe
>> On 3/29/2017 2:18 PM, Templin, Fred L wrote:
>>> ...
>>>>> 10) Section 4.2.1, bulleted list toward bottom of section, tunnel
>>>>>    "atom" is a very strange word to me. Tunnel "cell"?
>>>> The concept of an atomic packet was defined in RFC 6864. This is derived
>>>> from that. Cell would be introducing a new term, one that is overloaded
>>>> with ATM, which we want to avoid.
>>> OK, but then please find a way to call it an "atomic packet".
>> Agreed.
> OK.
>>> ...
>>>>> 13) Section 4.2.2, final sentence is incorrect. RFC4821-style MTU
>>>>>     probing cannot be used by tunnels due to ECMP because the probe
>>>>>     packets may take a different path than the data packets. That
>>>>>     is why AERO no longer uses RFC4821 probing.
>>>> Regarding ECMP issues, I think we need to wrap this issue up. Here's
>>>> what I propose:
>>>> - point out that ECMP causes problems with PMTUD (and can cause problems
>>>> with PLPMTUD).
>>>> - an interface has two choices:
>>>>     - keep track of PMTU based on other packet context (flowID or next
>>>> header info)
>>>>     - merge PMTU feedback, taking the MIN of reported values
>>> The problem is that some paths in the multipath may fail to deliver
>>> the ICMPs. So there is no way to know whether the MIN has been
>>> determined.
>> That's no different from the multicast case, though.
> In terms of PMTUD, you are right that in a certain sense unicast destinations
> with ECMP are like multicast. In other words, a single destination with multiple
> paths - some of which may not deliver ICMPs.
> But (unlike unicast) the Internet community seems to be OK with the fact
> that some multicast group members might not receive multicasts due to
> an MTU black hole. For unicast destinations, it can be a real problem if
> there is an MTU black hole along one or more paths of the multipath.
>>> Also the ingress may be handling transit packets sourced
>>> by a very large number of original sources each of which produce
>>> a very large number of distinct flows. So, there is no way for the
>>> ingress to cache all of the flow information it handles.
>> Min requires maintaining the same information as any interface would keep.
> I was referring to PLPMTUD here. For PLPMTUD, the probes sent by the
> tunnel ingress may take a very different path of the multipath than most
> data packets will take. So, there is no way for the ingress to definitely
> determine the Min MTU.
>>>> There's no magic here. It's a lot like multicast - either keep track in
>>>> a way that you *think* correlates to the different PMTU feedback or take
>>>> the MIN.
>>> It only works if all paths on the multipath can be counted on to deliver
>>> the ICMPs. If any paths in the multipath fail to deliver the ICMPs, it
>>> black holes. And, this is a known problem.
>> Again, same as multicast, and frankly also the same as unicast when the
>> ICMPs are blocked. That's a known problem with PMTUD.
> As above, in some sense it is like multicast. But, the Internet community
> has deemed it OK for multicast to black hole for some group members.
> For unicast, the behavior has to be deterministic and neither PMTUD
> nor PLPMTUD can guarantee that for tunnels.
>>>> The current doc does need a scrub to make this point clearly and
>>>> consistently.
>>> It doesn't work, regardless of the amount of scrubbing.
>> If your point is that PMTUD doesn't work and should never be used,
>> that's clearly not accurate and unlikely to get WG consensus. You're
>> welcome to try, though. However, 1981bis is on its way to increased
>> standards maturity as we speak.
> Multipath really is problematic for both PMTUD and PLPMTUD *for
> tunnels* - that is aside from any considerations for promoting
> RFC1981bis to standards-track in the more general sense.
>>> ...
>>>>> 17) Section 4.2.3 cites RFC4821, but PLPMTUD cannot be used by
>>>>>     tunnels due to ECMP.
>>>> I disagree; it can, but the system needs to either take the MIN or have
>>>> a way to decouple discovered PMTUs in a way that can be trusted to
>>>> reasonably correspond to the ECMP splitting.
>>> It doesn't work in the generalized case. The ECMP might split into a
>>> multitude of distinct paths, and there is no way for the ingress to
>>> know which of the paths have been tested. And, all it takes is
>>> one un-tested path in the multipath and there is potential for a
>>> black hole.
>> If PLPMTUD thinks the protocol is making forward progress, then it is
>> not a black hole.
> "Making forward progress" over some paths of the multipath does not
> guarantee progress over all paths - some paths might black hole.
>>>>> 20) Section 4.3.3, fourth paragraph, "A multipoint tunnel MUST
>>>>>     have support for broadcast and multicast" - I think this
>>>>>     would be better as a "SHOULD". RFC2529 and AERO support
>>>>>     multicast, but RFC5214 does not, yet it is widely deployed.
>>>> Multicast or its equivalent. Otherwise, you can't support IPv6
>>>> multicast, which is a required capability of IPv6.
>>> Large NBMA links can connect many nodes - thousands or more.
>>> So, for link-scoped multicast, serialized multicast (i.e., multicast
>>> via iterative unicast) would not scale.
>> Serial multicast is not the only equivalent. LANE pushed broadcast ARPs
>> to a unicast ARP server.
> Yes, and NBMA had the MARS proposal.
>> And yes, that won't scale to millions of links, but then if/when it
>> doesn't scale, then you cease to be able to claim this is a valid IP
>> link. Multicast is not an optional protocol for IPv6 in particular.
> RFC5214 and AERO work fine with IPv6.
>>> That is why some large NBMA links (e.g., RFC5214, AERO) use unicast
>>> NS/NA/RS/RA instead of link-scoped multicast, as permitted by RFC4861.
>>> Link-scoped multicast service discovery (e.g., DHCPv6 discovery) is
>>> supported via multicast mapping to a unicast link-layer address.
>> Essentially like LANE.
> Unlike LANE, AERO links are link-local-only (i.e., there are no on-link
> subnets). AERO links connect only routers and/or hosts that act like
> routers.
>>>>> 23) Section 5.1, first sub-bullet under "Tunnels must obey core IP
>>>>>     requirements", Are you meaning to talk about IPv4 DF=1?
>>>> Yes, and that should be made more explicit. Also honoring the EMTU_R
>>>> limits until told otherwise.
>>> OK.
>>> One other comment. I agree with figures 12 and 13 but (and I think this is
>>> a crucial point) I think they need a supporting sentence or two explaining
>>> why the procedure is "fragment then encapsulate" and not "encapsulate
>>> then fragment".
>> Agreed.
> Good.
>>> This is the difference between tunnel fragmentation
>>> and ordinary outer fragmentation, where your document is correctly
>>> advocating tunnel fragmentation. To the best of my knowledge, this was
>>> first documented in Section 3.1.7 of RFC2764 and should be cited as such.
>>> At least, that is what Bob B. suggested to me about 10yrs ago.
>> I'll check that...
> Please see the final paragraph of Section 3.12 of the AERO spec for example
> text.
> Thanks - Fred