Re: [BEHAVE] Fwd: IPv6 hosts sending <1280 byte packets

"Templin, Fred L" <Fred.L.Templin@boeing.com> Tue, 09 February 2010 17:51 UTC

From: "Templin, Fred L" <Fred.L.Templin@boeing.com>
To: Dan Wing <dwing@cisco.com>, 'Iljitsch van Beijnum' <iljitsch@muada.com>
Date: Tue, 09 Feb 2010 09:51:38 -0800
Cc: "behave@ietf.org" <behave@ietf.org>
Subject: Re: [BEHAVE] Fwd: IPv6 hosts sending <1280 byte packets

Dan,

About the placement of the translator, in the IPv6->IPv4
direction the packet incurs a size reduction due to the
substitution of an IPv4 header in place of the IPv6 header.
In the IPv4->IPv6 direction, however, the size of the packet
is inflated. So, e.g., if the translator receives a 1500-byte
IPv4 packet and has to translate it into a 1520-byte IPv6
packet, there is a chance that it may be dropped due to a
link MTU restriction somewhere in the IPv6 network and an
ICMPv6 PTB sent back toward the translator. Presumably, the
translator would translate that into an ICMPv4 PTB message
that it forwards on to the IPv4 host.
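
To make the arithmetic above concrete, here is a minimal sketch in
Python. It assumes a 20-byte IPv4 header with no options, a 40-byte
IPv6 header with no extension headers, and that the MTU reported in
the translated PTB is simply adjusted by the header size difference.
The function names are hypothetical and are not taken from any draft
or implementation.

```python
# Assumed header sizes: IPv4 without options, IPv6 without extension headers.
IPV4_HEADER_LEN = 20
IPV6_HEADER_LEN = 40
HEADER_DELTA = IPV6_HEADER_LEN - IPV4_HEADER_LEN  # 20 bytes of inflation


def translated_ipv6_size(ipv4_packet_len):
    """Size of the IPv6 packet produced from an IPv4 packet of this size."""
    return ipv4_packet_len + HEADER_DELTA


def ptb_mtu_for_ipv4_host(ipv6_path_mtu):
    """MTU to report in the ICMPv4 PTB, adjusted for the header difference."""
    return ipv6_path_mtu - HEADER_DELTA


def forward_v4_to_v6(ipv4_packet_len, ipv6_path_mtu):
    """Return what happens to an IPv4 packet sent through the translator.

    Either the translated packet fits the IPv6 path and is forwarded, or
    the IPv6 network reports PTB and the IPv4 host is told a reduced MTU.
    """
    v6_len = translated_ipv6_size(ipv4_packet_len)
    if v6_len <= ipv6_path_mtu:
        return ("forwarded", v6_len)
    return ("icmpv4_ptb", ptb_mtu_for_ipv4_host(ipv6_path_mtu))


if __name__ == "__main__":
    # A full-size IPv4 packet does not fit a 1500-byte IPv6 path.
    print(forward_v4_to_v6(1500, 1500))
    # A 1480-byte IPv4 packet translates to exactly 1500 bytes and fits.
    print(forward_v4_to_v6(1480, 1500))
```

The sketch only models the sizes. Whether the resulting ICMPv4 PTB
actually reaches the IPv4 host is a separate question.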

If I have that right, there may be a concern for whether
the ICMPv4 PTB would be dropped in the IPv4 network before
it arrives at the IPv4 host. I ran some tests and found
that path MTU discovery in IPv4 seems to be problematic
for a substantial number of the top 1000 websites:

http://www.ietf.org/mail-archive/web/rrg/current/msg05907.html

This suggests to me that path MTU discovery in the IPv4
network "works" simply because it is rarely invoked. In
other words, the IPv4 network seems to be getting by
based on the good fortune that most links in the Internet
core configure an MTU of 1500 or larger. But, this state
of affairs may be disrupted if more and more translators
are used that cause IPv4 packets to be inflated in size
after they are translated into IPv6 packets.

So, if the translator were located at the site border of
the IPv4 host (i.e., rather than at the site border of
the IPv6 host) the bulk of the path would be over IPv6
instead of IPv4, hence the incidence of ICMPv4 filtering
would be isolated to the end site of the IPv4 host.

Has the translator deployment analysis taken IPv4 path
MTU-related black holing into consideration?

Fred
fred.l.templin@boeing.com  

> -----Original Message-----
> From: Dan Wing [mailto:dwing@cisco.com]
> Sent: Monday, February 08, 2010 5:16 PM
> To: Templin, Fred L; 'Iljitsch van Beijnum'
> Cc: behave@ietf.org
> Subject: RE: [BEHAVE] Fwd: IPv6 hosts sending <1280 byte packets
> 
> 
> 
> > -----Original Message-----
> > From: behave-bounces@ietf.org [mailto:behave-bounces@ietf.org] On Behalf Of Templin, Fred L
> > Sent: Monday, February 08, 2010 5:11 PM
> > To: Dan Wing; 'Iljitsch van Beijnum'
> > Cc: behave@ietf.org
> > Subject: Re: [BEHAVE] Fwd: IPv6 hosts sending <1280 byte packets
> >
> > Dan,
> >
> > > -----Original Message-----
> > > From: behave-bounces@ietf.org [mailto:behave-bounces@ietf.org] On Behalf Of Dan Wing
> > > Sent: Monday, February 08, 2010 4:24 PM
> > > To: Templin, Fred L; 'Iljitsch van Beijnum'
> > > Cc: behave@ietf.org
> > > Subject: Re: [BEHAVE] Fwd: IPv6 hosts sending <1280 byte packets
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Templin, Fred L [mailto:Fred.L.Templin@boeing.com]
> > > > Sent: Monday, February 08, 2010 4:02 PM
> > > > To: Dan Wing; 'Iljitsch van Beijnum'
> > > > Cc: behave@ietf.org
> > > > Subject: RE: [BEHAVE] Fwd: IPv6 hosts sending <1280 byte packets
> > > >
> > > > The concern with setting DF=0 on the IPv4 side for packets
> > > > that are 1280 or smaller on the IPv6 side is the possibility
> > > > for RFC4963 data corruption if we let it run at line rate.
> > >
> > > At high speed with the same source/destination addresses, and when
> > > fragmentation on the IPv4 network occurs.  That is, sending lots of
> > > packets with DF=0 does not, by itself, cause a problem even if they
> > > are fragmented on the same link -- so long as they have different
> > > source or different destination addresses.  I agree we should make
> > > a note of that.
> >
> > Yes, the concern is for same src,dst operating at line
> > rate. IPv6 apps assume that there is no possibility for
> > fragmentation in the network, hence they may not be as
> > shy as IPv4 apps when it comes to sending large packets
> > at high data rates. Having the translator set DF=0 can
> > break that assumption.
> 
> There are three approaches discussed in new text in
> http://tools.ietf.org/html/draft-ietf-behave-v6v4-xlate-08#page-18,
> which only set DF=0 for "small packets" (packets that were
> less than 1280 on the IPv6 side).
> 
> I don't think that eliminates your concern, though.
> 
> > > Is there another approach that should be considered to
> > > avoid that concern?
> >
> > I still believe placement of the translator matters.
> > Placing the translator closer to the IPv4 host
> > reduces the portion of the path that is exposed to
> > IPv4 hops. This is true whether the DF=0 or DF=1
> > approach is used.
> 
> Agreed.
> 
> However, that isn't always possible.  For example,
> someone operating an IPv6-only network will have to
> place their IPv6/IPv4 translator at their network
> border.
> 
> But perhaps some text could be added to
> draft-ietf-behave-v6v4-xlate which highlighted
> this recommendation to place the translator as
> close to the IPv4 host as possible.  Or, similarly,
> to ensure the IPv4 network has an MTU greater than
> 1280 (minus whatever the IPv6/IPv4 header differences
> are).
> 
> -d
> 
> 
> > Fred
> > fred.l.templin@boeing.com
> >
> > > > There are certainly plenty of websites on the IPv4 Internet
> > > > that set DF=0 on every packet they send, and many even use
> > > > packet sizes larger than 1280. But, how many of them send
> > > > large packets at line rate?
> > >
> > > Content delivery networks?
> > >
> > > -d
> > >
> > >
> > > > Fred
> > > > fred.l.templin@boeing.com
> > > >
> > > > > -----Original Message-----
> > > > > From: behave-bounces@ietf.org [mailto:behave-bounces@ietf.org] On Behalf Of Dan Wing
> > > > > Sent: Monday, February 08, 2010 3:29 PM
> > > > > To: 'Iljitsch van Beijnum'
> > > > > Cc: behave@ietf.org
> > > > > Subject: Re: [BEHAVE] Fwd: IPv6 hosts sending <1280 byte packets
> > > > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Iljitsch van Beijnum [mailto:iljitsch@muada.com]
> > > > > > Sent: Monday, February 08, 2010 3:02 PM
> > > > > > To: Dan Wing
> > > > > > Cc: 'marcelo bagnulo braun'; behave@ietf.org
> > > > > > Subject: Re: [BEHAVE] Fwd: IPv6 hosts sending <1280 byte packets
> > > > > >
> > > > > > On 8 feb 2010, at 23:34, Dan Wing wrote:
> > > > > >
> > > > > > >> So essentially this means doing more work (keeping PMTUD
> > > > > > >> state for the IPv4 path) in order to do even more work later
> > > > > > >> (fragment). Not sure how that makes sense...
> > > > > >
> > > > > > > Yes, it means the translator is doing the effort of fragmenting
> > > > > > > rather than the router immediately in front of the small-MTU
> > > > > > > link.  However, by doing it this way, we can maintain end-to-end
> > > > > > > PMTUD for those IPv6 hosts that receive ICMPv6 PTB and send
> > > > > > > packets with the fragmentation header.
> > > > > >
> > > > > > No, the two issues are orthogonal.
> > > > > >
> > > > > > At some point, an IPv6 host sends a packet that is larger
> > > > > > than the IPv4 PMTU. A router sends back a too big message.
> > > > > > Issue A is how to handle this too big message. Issue B is
> > > > > > what happens to packets that the IPv6 host subsequently sends.
> > > > >
> > > > > A stateless translator cannot identify 'subsequent' packets,
> > > > > because it doesn't remember previous state.
> > > > >
> > > > > > I think there is agreement that A should be handled by simply
> > > > > > translating the IPv4 too big to IPv6 and adjusting it as per
> > > > > > the header size differences, but otherwise let it through
> > > > > > transparently. (I once argued for rewriting it to 1280, though.)
> > > > > >
> > > > > > Nothing has changed here.
> > > > > >
> > > > > > But now for part B. A stateless translator can't monitor the
> > > > > > IPv4 PMTU. Not because it's stateful (we're talking per
> > > > > > destination state, not per session state, it would be doable
> > > > > > to track this) but because there may be more than one
> > > > > > translator so any given translator may not see all packets.
> > > > > > So for stateless translation when the translator has a < 1280
> > > > > > packet with no fragment header we don't know whether this is
> > > > > > because PMTUD was successful and the packet size fits in the
> > > > > > PMTU, or the IPv6 host ignored the < 1280 too big message and
> > > > > > is omitting the fragment header in violation of the relevant
> > > > > > RFCs. In order to be compatible with the latter the only
> > > > > > thing we can do is set DF to 0.
> > > > > >
> > > > > > (I just realized that we don't know what happens when for
> > > > > > instance the PMTU is 1000 and the IPv6 host wants to send a
> > > > > > 1100-byte packet: does it include a fragment header or not?)
> > > > > >
> > > > > > Note that for part B, if there _is_ a fragment header, that
> > > > > > will also trigger the translator to set DF to 0. So in the
> > > > > > case where the IPv6 hosts are well behaved there is no change
> > > > > > to either parts A or B.
> > > > >
> > > > > So, I believe we are in agreement:  IPv6 packets less than
> > > > > 1280 need to be translated to IPv4 and sent with DF=0.
> > > > >
> > > > > > In the stateful case we know all packets flow through the
> > > > > > same translator so we get to track the IPv4 PMTU so when
> > > > > > there is an IPv6 packet that needs to be translated we know
> > > > > > whether it's bigger or smaller than the PMTU so we can send
> > > > > > packets that are bigger with DF=1 and packets that are
> > > > > > smaller with DF=0 and if we are to, also immediately fragment
> > > > > > them. However, there is no advantage to doing this extra work
> > > > > > because even if we get to set DF=1 on some packets that are
> > > > > > smaller than the PMTU that only lets us discover a reduction
> > > > > > in the PMTU, but that discovery doesn't buy us anything
> > > > > > because the IPv6 host isn't going to reduce its packet size.
> > > > >
> > > > > I agree the IPv6 host won't reduce its packet size below
> > > > > 1280.  But the IPv6 host becomes aware that the path MTU
> > > > > is smaller than 1280.  I don't know if the IPv6 host finds
> > > > > that useful/valuable to know the path MTU is less than the
> > > > > IPv6 minimum MTU.
> > > > >
> > > > > > Fragmenting at the translator also doesn't buy the translator
> > > > > > anything and it may even get the packet blocked or create
> > > > > > more work if there's a firewall with a > 1280 MTU before the
> > > > > > path hits the < 1280 MTU which would need to reassemble the
> > > > > > packet to observe the port numbers. And it could be an attack
> > > > > > vector: attackers could send too bigs with tiny MTUs to make
> > > > > > the translator work harder.
> > > > > >
> > > > > > > Certainly there was some perceived value in the IPv6 host
> > > > > > > knowing its packets are being fragmented.
> > > > > >
> > > > > > I don't think so. Even if the application somehow learns this
> > > > > > information, what is it supposed to do then?
> > > > >
> > > > > Don't know.  It is obviously easier for the 6/4 translator to
> > > > > simply fragment the packet (or translate and send with DF=0)
> > > > > without informing the IPv6 host, but there is long-standing text in
> > > > > both RFC2460 and its predecessor (RFC1883) that says to send back
> > > > > ICMPv6 PTB and the IPv6 host is then supposed to send packets
> > > > > with the fragmentation header.
> > > > >
> > > > > > > I agree that nearly 100% of PMTUD on IPv4 is with TCP.
> > > > > >
> > > > > > > But there isn't anything prohibiting PMTUD for UDP.
> > > > > >
> > > > > > With sufficient thrust pigs fly just fine...
> > > > > >
> > > > > > Generally, it's not easy for UDP applications to limit their
> > > > > > packet sizes. So the protocol would have to support changing
> > > > > > the packet size on the fly and then the application would
> > > > > > have to react to too big messages.
> > > > >
> > > > > draft-petithuguenin-behave-stun-pmtud does not rely on
> > > > > ICMP packet too big messages.
> > > > >
> > > > > > Not impossible, but I've never seen it happen.
> > > > >
> > > > > Me, neither.  I'm just trying to temper the "only TCP does
> > > > > PMTUD", because UDP can do PMTUD.  And of course there is value
> > > > > in sending the largest packets possible.
> > > > >
> > > > > > But if BitTorrent goes to UDP it's
> > > > > > likely they'll put this in in some way.
> > > > >
> > > > > As of version 2.x, the most popular BitTorrent client on
> > > > > Windows (uTorrent) has implemented uTP, for whatever that's
> > > > > worth.  I don't know if they're doing PMTUD, though.  Beta
> > > > > versions of uTorrent are available for OSX, too, and I expect
> > > > > they will soon support uTP (if they aren't already).
> > > > >
> > > > > > > I tend to think, though, we don't want to figure this out
> > > > > > > for stateful translators at this point in time.  If it is
> > > > > > > found useful/necessary, it can be specified later.
> > > > > >
> > > > > > Right.
> > > > >
> > > > > -d
> > > > >
> > > > > _______________________________________________
> > > > > Behave mailing list
> > > > > Behave@ietf.org
> > > > > https://www.ietf.org/mailman/listinfo/behave