Re: [tsvwg] [Int-area] Fragmentation and Path MTU text in nvo3 dataplane reqts draft
"Templin, Fred L" <Fred.L.Templin@boeing.com> Fri, 16 May 2014 18:28 UTC
From: "Templin, Fred L" <Fred.L.Templin@boeing.com>
To: "Black, David" <david.black@emc.com>, "tsvwg@ietf.org" <tsvwg@ietf.org>, "tsv-area@ietf.org" <tsv-area@ietf.org>
Thread-Topic: [Int-area] Fragmentation and Path MTU text in nvo3 dataplane reqts draft
Thread-Index: AQHPcTSH/Asq9x/h/UWLdY5SldYekg==
Date: Fri, 16 May 2014 18:27:44 +0000
Message-ID: <2134F8430051B64F815C691A62D983181B2AD6AE@XCH-BLV-504.nw.nos.boeing.com>
References: <8D3D17ACE214DC429325B2B98F3AE712076C55B7B1@MX15A.corp.emc.com> <2134F8430051B64F815C691A62D983181B2AC94E@XCH-BLV-504.nw.nos.boeing.com>
In-Reply-To: <2134F8430051B64F815C691A62D983181B2AC94E@XCH-BLV-504.nw.nos.boeing.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [130.247.104.6]
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-TM-AS-MML: disable
Archived-At: http://mailarchive.ietf.org/arch/msg/tsvwg/xM4Y1bQ_uvY1n4oaQC6JLnUy7os
X-Mailman-Approved-At: Sat, 17 May 2014 09:01:04 -0700
Cc: Mark Townsley <townsley@cisco.com>, "int-area@ietf.org" <int-area@ietf.org>
Subject: Re: [tsvwg] [Int-area] Fragmentation and Path MTU text in nvo3 dataplane reqts draft
> That document should be the place to put generic recommendations for
> tunnel MTU handling that apply to all tunnel types.

In case you are wondering what I think "generic recommendations for
tunnel MTU handling" should look like, here is what I think:

The tunnel link Maximum Transmission Unit (MTU) is 64KB minus the
encapsulation overhead for IPv4 [RFC0791] and 4GB minus the
encapsulation overhead for IPv6 [RFC2675]. This is the most that IPv4
and IPv6 (respectively) can convey within the constraints of protocol
constants, but actual sizes available for tunneling will frequently
be much smaller.

The base tunneling specifications for IPv4 and IPv6 typically set a
static MTU on the tunnel ingress to 1500 bytes minus the encapsulation
overhead or smaller still if the tunnel is likely to incur additional
encapsulations on the path. This can result in path MTU related black
holes when packets that are too large to be accommodated over the
tunnel are dropped, but the resulting ICMP Packet Too Big (PTB)
messages are lost on the return path. As a result, tunnels use the
following MTU mitigations to accommodate larger packets.

Tunnels set their ingress MTU to the larger of the underlying
interface MTU minus the encapsulation overhead, and 1500 bytes.
Tunnels optionally cache per-egress MTU values in the underlying IP
path MTU discovery cache initialized to the underlying interface MTU.

Tunnels admit packets that are no larger than 1280 bytes minus the
encapsulation overhead (*) as well as packets that are larger than
1500 bytes into the tunnel without fragmentation, i.e., as long as
they are no larger than the tunnel ingress MTU before encapsulation
and also no larger than the cached per-egress MTU following
encapsulation. For IPv4, the ingress sets the "Don't Fragment" (DF)
bit to 0 for packets no larger than 1280 bytes minus the encapsulation
overhead (*) and sets the DF bit to 1 for packets larger than 1500
bytes.
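
As an illustration only, the admission rules above might be sketched
in Python roughly as follows. This is a reading of the text under
stated assumptions, not a definitive implementation: the names
(Action, ingress_mtu, classify) are hypothetical, dropping oversized
packets is assumed from the black-hole discussion above, and packets
in the remaining middle range are only marked as needing
fragmentation.

```python
from enum import Enum
from typing import Optional

LOW_THRESHOLD = 1280   # lower threshold from the text (see the (*) note)
HIGH_THRESHOLD = 1500  # upper threshold from the text


class Action(Enum):
    SEND_WHOLE_DF0 = "send whole, DF=0 (IPv4)"
    SEND_WHOLE_DF1 = "send whole, DF=1 (IPv4)"
    NEEDS_FRAGMENTATION = "middle range, needs fragmentation"
    DROP = "too big for the tunnel (assumed: drop)"


def ingress_mtu(underlying_mtu: int, overhead: int) -> int:
    """Larger of (underlying interface MTU - overhead) and 1500 bytes."""
    return max(underlying_mtu - overhead, HIGH_THRESHOLD)


def classify(inner_len: int, underlying_mtu: int, overhead: int,
             egress_mtu: Optional[int] = None) -> Action:
    """Classify an inner packet by its size before encapsulation."""
    if egress_mtu is None:
        # The per-egress cache starts at the underlying interface MTU.
        egress_mtu = underlying_mtu
    if inner_len > ingress_mtu(underlying_mtu, overhead):
        return Action.DROP
    if inner_len <= LOW_THRESHOLD - overhead:
        return Action.SEND_WHOLE_DF0
    if inner_len <= HIGH_THRESHOLD:
        return Action.NEEDS_FRAGMENTATION
    # Larger than 1500: must also fit the cached per-egress MTU
    # once the encapsulation overhead has been added.
    if inner_len + overhead <= egress_mtu:
        return Action.SEND_WHOLE_DF1
    return Action.DROP
```

For example, with a 9000 byte underlying interface and 40 bytes of
overhead, classify(4000, 9000, 40) selects SEND_WHOLE_DF1, while the
same packet with a cached per-egress MTU of 2000 is rejected.
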
If a large packet is lost in the path, the ingress may optionally
cache the MTU reported in the resulting PTB message or may ignore the
message, e.g., if there is a possibility that the message is spurious.

For packets admitted into the tunnel that are larger than 1280 bytes
minus the encapsulation overhead (*) but no larger than 1500 bytes,
the ingress uses IP fragmentation to fragment the encapsulated packet
into two pieces (where the first fragment contains 1024 bytes of the
fragmented inner packet) then sends the fragments to the egress. If
the outer protocol is IPv4, the node sends the fragments with DF set
to 0 and subject to rate limiting to avoid reassembly errors
[RFC4963][RFC6864].

For both IPv4 and IPv6, the ingress also sends a 1500 byte probe
message (**) to the egress, subject to rate limiting. To construct a
probe, the ingress prepares an ICMPv6 Neighbor Solicitation (NS)
message with trailing padding octets added to a length of 1500 bytes
but does not include the length of the padding in the IPv6 Payload
Length field. The ingress then encapsulates the NS in the outer
encapsulation headers (while including the length of the padding in
the outer length fields), sets DF to 1 (for IPv4) and sends the padded
NS message to the neighbor. If the egress returns an NA message, the
ingress may then send whole packets within this size range and (for
IPv4) relax the rate limiting requirement. (Note that for tunnels that
do not perform IPv6 neighbor discovery, an ICMP echo request message
can be used instead of NS.)

The egress MUST be capable of reassembling packets up to 1500 bytes
plus the encapsulation overhead length. It is therefore RECOMMENDED
that the egress be capable of reassembling at least 2KB.

(*) Note that if it is known without probing that the minimum Path MTU
to a tunnel egress is MINMTU bytes (where 1280 < MINMTU < 1500) then
MINMTU can be used instead of 1280 in the fragmentation threshold
considerations listed above.
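
To make the length bookkeeping above concrete, here is a rough Python
sketch of the two-fragment split and of the padded NS probe. It only
computes lengths and builds no packets. The function and field names
are hypothetical, and it assumes that the 1500 bytes refers to the
padded NS (IPv6 header included) before encapsulation, which is one
possible reading of the text.

```python
from dataclasses import dataclass
from typing import Tuple

IPV6_HEADER_LEN = 40         # fixed IPv6 header size in bytes
PROBE_SIZE = 1500            # probe size named in the text
FIRST_FRAGMENT_INNER = 1024  # inner-packet bytes in the first fragment


def split_inner(inner_len: int) -> Tuple[int, int]:
    """Inner-packet bytes carried by the first and second fragment."""
    if inner_len <= FIRST_FRAGMENT_INNER:
        raise ValueError("packet too small to need two fragments")
    return FIRST_FRAGMENT_INNER, inner_len - FIRST_FRAGMENT_INNER


@dataclass
class ProbeLengths:
    padding: int               # trailing padding octets appended to the NS
    inner_payload_length: int  # IPv6 Payload Length field (padding excluded)
    encapsulated_size: int     # padded NS plus encapsulation overhead


def probe_lengths(ns_packet_len: int, overhead: int) -> ProbeLengths:
    """Lengths for a padded NS probe.

    ns_packet_len is the unpadded NS including its 40-byte IPv6 header.
    """
    if not IPV6_HEADER_LEN < ns_packet_len <= PROBE_SIZE:
        raise ValueError("unpadded NS must fit within the probe size")
    return ProbeLengths(
        padding=PROBE_SIZE - ns_packet_len,
        # The inner Payload Length counts only the real NS payload.
        inner_payload_length=ns_packet_len - IPV6_HEADER_LEN,
        # The outer length fields cover the padding as well.
        encapsulated_size=PROBE_SIZE + overhead,
    )
```

The point of keeping the padding out of the inner Payload Length is
that the egress still parses an ordinary NS, while the path sees the
full padded size.
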
(**) It is RECOMMENDED that no probes smaller than 1500 bytes be used
for MTU probing purposes, since smaller probes may be fragmented if
there is a nested tunnel somewhere on the path to the egress. Probe
sizes larger than 1500 bytes MAY be used, but may be unnecessary since
original sources are expected to use [RFC4821] when sending large
packets.

I think this applies to all IP-in-(foo)-in-IP tunnel types, and could
go as a set of generic recommendations to be cited by other documents.
Comments?

Thanks - Fred
fred.l.templin@boeing.com

> -----Original Message-----
> From: Int-area [mailto:int-area-bounces@ietf.org] On Behalf Of Templin, Fred L
> Sent: Thursday, May 15, 2014 3:41 PM
> To: Black, David; tsvwg@ietf.org; tsv-area@ietf.org
> Cc: Mark Townsley; int-area@ietf.org
> Subject: Re: [Int-area] Fragmentation and Path MTU text in nvo3 dataplane reqts draft
>
> Hi,
>
> > -----Original Message-----
> > From: tsv-area [mailto:tsv-area-bounces@ietf.org] On Behalf Of Black, David
> > Sent: Wednesday, May 14, 2014 1:53 PM
> > To: tsvwg@ietf.org; tsv-area@ietf.org
> > Subject: Fragmentation and Path MTU text in nvo3 dataplane reqts draft
> >
> > <WG chair hat off>
> >
> > Over in the nvo3 WG, draft-ietf-nvo3-dataplane-requirements-03 contains
> > some text on dealing with the fragmentation and MTU effects of tunnels.
> > I thought I'd ask for some early review of this text, given recent IESG
> > excitement around fragmentation and Path MTU topics in another draft:
>
> All tunnels have trouble with path MTU, and in some cases have no choice
> but to fragment. However, they should strive to tune out fragmentation
> and forward whole packets whenever possible.
>
> Over in the intarea, there have been sporadic ongoing discussions about
> how to recommend generic MTU mitigations for tunnels.
> Joe Touch and Mark Townsley have been working for a long time on a
> document titled "Tunnels in the Internet Architecture":
>
> http://tools.ietf.org/id/draft-ietf-intarea-tunnels-00.txt
>
> That document should be the place to put generic recommendations for
> tunnel MTU handling that apply to all tunnel types.
>
> Tunnel MTU issues keep popping up in all places, and this is just
> another example. Is it time to revive Joe and Mark's document?
>
> Thanks - Fred
> fred.l.templin@boeing.
>
> > http://datatracker.ietf.org/doc/draft-ietf-ipsecme-ikev2-fragmentation/ballot/
> >
> > I believe that the nvo3 draft is in better shape in these areas. Nonetheless,
> > I've included its current text on fragmentation and path MTU below, and (on
> > behalf of the draft authors and nvo3 WG chairs) I'm looking for input on
> > what that text should say and why.
> >
> > In nvo3 terminology, an overlay network is an inner network that is tunneled
> > over an outer underlay network. The nvo3 WG also uses "Tenant System" as
> > the term for a sender/receiver of network traffic because multi-tenancy is
> > an important motivation for the WG's activities in network virtualization.
> >
> > --------------------------------------
> >
> > 3.5. Path MTU
> >
> > The tunnel overlay header can cause the MTU of the path to the
> > egress tunnel endpoint to be exceeded.
> >
> > IP fragmentation SHOULD be avoided for performance reasons.
> >
> > The interface MTU as seen by a Tenant System SHOULD be adjusted such
> > that no fragmentation is needed. This can be achieved by
> > configuration or be discovered dynamically.
> >
> > Either of the following options MUST be supported:
> >
> > o Classical ICMP-based MTU Path Discovery [RFC1191] [RFC1981] or
> >   Extended MTU Path Discovery techniques such as defined in
> >   [RFC4821]
> >
> > o Segmentation and reassembly support from the overlay layer
> >   operations without relying on the Tenant Systems to know about
> >   the end-to-end MTU
> >
> > o The underlay network MAY be designed in such a way that the MTU
> >   can accommodate the extra tunnel overhead.
> >
> > --------------------------------------
> >
> > </WG chair hat off>
> >
> > Thanks,
> > --David
> > ----------------------------------------------------
> > David L. Black, Distinguished Engineer
> > EMC Corporation, 176 South St., Hopkinton, MA 01748
> > +1 (508) 293-7953 FAX: +1 (508) 293-7786
> > david.black@emc.com Mobile: +1 (978) 394-7754
> > ----------------------------------------------------
> >
> _______________________________________________
> Int-area mailing list
> Int-area@ietf.org
> https://www.ietf.org/mailman/listinfo/int-area