Re: [BEHAVE] PMTU Discovery and ICMPv6 filtering
"Templin, Fred L" <Fred.L.Templin@boeing.com> Fri, 05 February 2010 22:16 UTC
From: "Templin, Fred L" <Fred.L.Templin@boeing.com>
To: Ed Jankiewicz <edward.jankiewicz@sri.com>, Behave WG <behave@ietf.org>, "softwires@ietf.org" <softwires@ietf.org>
Date: Fri, 05 Feb 2010 14:17:04 -0800
Hi Ed,

> -----Original Message-----
> From: behave-bounces@ietf.org [mailto:behave-bounces@ietf.org] On Behalf Of Ed Jankiewicz
> Sent: Wednesday, February 03, 2010 9:21 AM
> To: Behave WG; softwires@ietf.org
> Subject: [BEHAVE] PMTU Discovery and ICMPv6 filtering
>
> One of my colleagues received a long comment on Path MTU Discovery recommendations his organization
> published and is seeking advice. I recall this has been discussed several times at IETF meetings,
> not sure which WG, so this may be redundant. I've tried to summarize the salient points below, and
> have two broad questions on this: Are these points already covered in RFCs (other than 4459, 4890)
> or current Internet-Drafts? If so, I would appreciate pointers. If not already covered by current
> publications, is there interest in documenting the problem and comparing the solutions/drawbacks?
>
> The commenter basically wrote:
>
> IPv4 and IPv6 treat packets exceeding MTU differently - IPv4 will fragment packets that are "too big"
> but IPv6 will drop the packet and respond with an ICMPv6 "too-big" error message. [The subject
> publication] recommends using the Path MTU Discovery Protocol to discover the end-to-end PMTU, which
> relies on ICMPv6 error messages. These may be blocked by various "filters" and IPsec gateways, which
> is the case in many operational networks.
>
> However, even when ICMPv6 is not blocked, IPsec gateways (in tunnel mode) add extra headers, and
> there can be more than one tunnel header involved (routers also create tunnels). When a "too-big"
> message is sent the router will put in its ICMPv6 message the value of the MTU on the next
> link at layer 2. The host receiving this MTU value in an ICMP message as part of the Path MTU
> Discovery Protocol has no way of knowing how many extra tunnel headers are added along the path, and
> so if it just takes the reported MTU value without allowing for these extra headers the process will
> keep on failing and will not recover.
> We have seen this behavior in our experiments.
>
> This can be prevented by ensuring that the maximum packet size sent by the host is smaller than the
> layer 2 limit: smaller by an amount estimated to be sufficient to allow room for extra headers to be
> added along the path. Several ways of achieving this are possible:
>
> (1) Set this reduced MTU value on the IPsec gateway LAN interface; the host then discovers
> this MTU through the PMTUD.
>
> (2) Statically configure this reduced MTU value into the host and switch off PMTUD.
>
> (3) Set a reduced MTU at the IPsec gateway WAN interface; the IPsec gateway acts as a host on this
> interface and so can do packet fragmentation.
>
> (4) Provide the capability in the IPsec gateway to discover the MTU on its WAN interface, subtract
> the maximum header size that this gateway will add to packets presented on its LAN interface, which
> the host can then discover through the PMTUD.
>
> Method (4) would be the best solution, but is not currently available in the IPsec gateway products.
> The next best solution is (1), which has been used [in commenter experiments]. This is not as good as
> (4) because it requires manual intervention, and an understanding of how to calculate the appropriate
> (reduced) MTU value.
> The next best solution is (2); the only disadvantage of this approach is that only one value can be
> set for all paths and so the worst case (lowest) value has to be used. In a complex network it may
> not always be obvious what the worst case path is, and so a conservative estimate may be necessary.
> Even so this could be preferable in some deployment scenarios since the path-MTU discovery protocol
> relies on the passage of ICMP messages which are sometimes blocked by firewalls and other security
> devices.
> Approach (3) is the worst solution since it will cause many IP packets to be fragmented which is
> inefficient (both because, unlike IPv4, the IPv6 header has to be extended to include the
> fragmentation offset field, and because it will result in the second fragment being very small, i.e. the
> ratio of user-data to IP header size will be poor).
>
> It is likely that for immediate use option (1) should be used although (4) would be better if it were
> supported in the relevant products.

(1) seems like a safe option at face value, but can lead to
undesirable inefficiencies. Consider for example the diagram below:

    L1    L2               L3
    |     |                |
W --|--R--|--GW1<====>GW2--|--Z
    |     |   (Internet)   |
       X--|   L4, L5,
       Y--|   L6, etc.

Here, we have a tunnel between 'GW1' and 'GW2' over the Internet to
connect two networks. 'GW1' sets a reduced MTU 'M' on its LAN
interface connected to link 'L2', and also advertises 'M' in the
Router Advertisements it sends on 'L2'. Hosts 'X' and 'Y' pick up
the reduced MTU from the RA and limit the size of the packets they
send to at most 'M' bytes. But, host 'W' connected to link 'L1'
does not see the MTU reduction, and hence will routinely send
packets larger than 'M' to any hosts beyond router 'R' such as 'X',
'Y' and 'Z'. These packets will be dropped with an ICMPv6 PTB
returned, then 'W' will be forced to reduce its packet size and
retransmit. The only way to prevent this is to drive the reduced
MTU 'M' deeply into the entire network stacked up behind 'GW1',
which may contain arbitrarily many additional routers and links.

Now, even if the reduced MTU 'M' were propagated deeply throughout
the 'GW1' network, if most communications remain localized, hosts
would only be able to use a packet size of at most 'M' even if
their links natively support a much larger MTU. Consider for
example that link 'L2' in the diagram has a native MTU of 9 KB, but
the effective MTU across the 'GW1'<====>'GW2' tunnel is only 1400.
'GW1' will advertise 1400 on link 'L2', and communications between
hosts 'X' and 'Y' will be restricted to using at most 1400 byte
packets when they could have used 9 KB.

There are a couple of factors to consider in terms of what might be
a better solution. First, what are the expected data rates over the
'GW1'<====>'GW2' tunnel, and second, what are the performance
characteristics of those gateways? If the data rates are such that
'GW1' and 'GW2' are already operating at their peak performance even
without taking on any additional processing overhead, then the best
solution would be to make sure that all links over which the tunnel
might travel (e.g., 'L4', 'L5', 'L6', etc.) are large enough to
"hide" the tunnel encapsulation artifact. For example, if all links
'L4', 'L5', 'L6', etc. configure a native MTU of no smaller than
1600 and the encapsulation overhead for the tunnel is 100 bytes,
then all routers and hosts on the LAN side of 'GW1' would be able to
happily use a 1500 MTU. If the MTUs of 'L4', etc. cannot be
controlled, however, then there is no recourse but to use option (1)
and cope with the inefficiencies.

On the other hand, if the data rates across the tunnel are nominal
and/or 'GW1' and 'GW2' have more than sufficient processing
capability to take on a modest amount of additional overhead, the
GWs can use tunnel fragmentation so that 'GW1' can present a solid
MTU on its LAN side interface that does not reflect the size of the
tunnel encapsulation headers. If the tunnel fragmentation could
accommodate an MTU of at least 1500 in this way, then 'GW1' would be
able to observe the "de facto Internet cell size" of 1500.
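
As a rough illustration of the arithmetic above, the sketch below computes the MTU that 'GW1' could offer on its LAN side without tunnel fragmentation, and checks whether the tunnel links are large enough to "hide" the encapsulation. The function names and the list-of-link-MTUs representation are assumptions made for this example, not part of any gateway product.

```python
def effective_tunnel_mtu(link_mtus, encap_overhead):
    """Largest inner packet that crosses the tunnel unfragmented.

    link_mtus: native MTUs (bytes) of the links the tunnel may cross,
               e.g. 'L4', 'L5', 'L6' in the diagram (hypothetical input).
    encap_overhead: bytes of tunnel headers added per packet.
    """
    # The smallest link on the path bounds the outer packet size;
    # the inner packet must also leave room for the tunnel headers.
    return min(link_mtus) - encap_overhead


def hides_encapsulation(link_mtus, encap_overhead, lan_mtu=1500):
    """True if the LAN side can keep using lan_mtu with no reduction."""
    return effective_tunnel_mtu(link_mtus, encap_overhead) >= lan_mtu


# The example from the text: 1600-byte links, 100 bytes of overhead.
print(effective_tunnel_mtu([1600, 1600, 1600], 100))
print(hides_encapsulation([1600, 1600, 1600], 100))

# Ordinary 1500-byte links with the same overhead: the LAN MTU would
# have to drop (option (1)) unless the gateways fragment in the tunnel.
print(effective_tunnel_mtu([1500, 1500, 1500], 100))
print(hides_encapsulation([1500, 1500, 1500], 100))
```

With 1600-byte links the LAN side keeps a 1500 MTU; with 1500-byte links the same 100 bytes of overhead leave only 1400, which is the reduced value 'M' that option (1) would advertise.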
If we further assume that the vast majority of hosts in the world
today either limit their packet sizes to no more than 1500 bytes or
are willing to assume the risk of silent loss of packets larger than
1500 due to MTU restrictions, then there is no need to place
artificial restrictions on the size of packets that can be used
within the 'GW1' network. This latter class of hosts (those that
send packets larger than 1500) would be best served to use their own
host-based MTU probing mechanisms for sending packets larger than
1500 in case the network is somehow silently dropping PMTUD
messages. RFC4821 was specifically designed for this purpose.

End result - whenever it is practically possible, tunnel routers
should use tunnel fragmentation and hosts should use RFC4821.

Fred
fred.l.templin@boeing.com

> --
> Ed Jankiewicz - SRI International
> Fort Monmouth Branch Office - IPv6 Research
> Supporting DISA Standards Engineering Branch
> 732-389-1003 or ed.jankiewicz@sri.com
>
> _______________________________________________
> Behave mailing list
> Behave@ietf.org
> https://www.ietf.org/mailman/listinfo/behave
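
To make the host-based probing idea concrete, here is a minimal sketch of probing for a usable packet size without relying on ICMPv6 "too-big" messages. It is only in the spirit of RFC4821, not an implementation of it: the binary search is a simplification chosen for this example, and `probe` and `make_simulated_path` are hypothetical helpers standing in for a real transport that confirms delivery of each probe.

```python
IPV6_MIN_MTU = 1280  # minimum link MTU that IPv6 requires


def probe_path_mtu(probe, low=IPV6_MIN_MTU, high=9000):
    """Search for the largest packet size that probe() reports as delivered.

    probe: hypothetical callback taking a size in bytes and returning
           True only if delivery of a packet of that size was confirmed
           (for example by an acknowledgement). A lost probe counts as
           "too big", so no ICMPv6 feedback is needed.
    Assumes a packet of `low` bytes always gets through.
    """
    best = low
    while low <= high:
        mid = (low + high) // 2
        if probe(mid):
            best = mid      # mid fits: try something larger
            low = mid + 1
        else:
            high = mid - 1  # mid was lost: try something smaller
    return best


def make_simulated_path(path_mtu):
    """Stand-in for a real network path: silently drops oversized packets."""
    return lambda size: size <= path_mtu


# A path whose tunnel silently limits packets to 1400 bytes.
print(probe_path_mtu(make_simulated_path(1400)))
```

A real implementation would also have to distinguish probe loss from congestion loss and re-probe over time as paths change; those parts are left out of this sketch.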