Re: [BEHAVE] Fwd: IPv6 hosts sending <1280 byte packets

Iljitsch van Beijnum <iljitsch@muada.com> Mon, 08 February 2010 23:01 UTC

On 8 feb 2010, at 23:34, Dan Wing wrote:

>> So essentially this means doing more work (keeping PMTUD 
>> state for the IPv4 path) in order to do even more work later 
>> (fragment). Not sure how that makes sense...

> Yes, it means the translator is doing the effort of fragmenting
> rather than the router immediately in front of the small-MTU
> link.  However, by doing it this way, we can maintain end-to-end
> PMTUD for those IPv6 hosts that receive ICMPv6 PTB and send
> packets with the fragmentation header.

No, the two issues are orthogonal.

At some point, an IPv6 host sends a packet that is larger than the IPv4 PMTU. A router sends back a too big message. Issue A is how to handle this too big message. Issue B is what happens to packets that the IPv6 host subsequently sends.

I think there is agreement that A should be handled by simply translating the IPv4 too big to IPv6 and adjusting it as per the header size differences, but otherwise letting it through transparently. (I once argued for rewriting it to 1280, though.)
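As a rough illustration of the "adjusting it as per the header size differences" step, here is a minimal sketch. It assumes a 20-byte IPv4 header without options and the fixed 40-byte IPv6 header; the function name is made up for this example and does not come from any draft.

```python
# Hypothetical helper: MTU adjustment when translating an ICMPv4
# "fragmentation needed" message into an ICMPv6 "packet too big".
IPV4_HEADER_LEN = 20  # bytes, assuming no IPv4 options
IPV6_HEADER_LEN = 40  # bytes, fixed IPv6 header


def translate_ptb_mtu(ipv4_mtu: int) -> int:
    """Return the MTU value to report to the IPv6 host.

    A packet that is ipv4_mtu bytes on the IPv4 side is 20 bytes
    larger on the IPv6 side, so the reported MTU grows by the
    header size difference. The value is otherwise passed through
    unchanged (no clamping to 1280).
    """
    return ipv4_mtu + (IPV6_HEADER_LEN - IPV4_HEADER_LEN)


print(translate_ptb_mtu(1000))  # 1020
print(translate_ptb_mtu(1400))  # 1420
```

Note that with an IPv4 MTU of 1000 the IPv6 host is told 1020, which is below 1280. That is exactly the case part B is about.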

Nothing has changed here.

But now for part B. A stateless translator can't monitor the IPv4 PMTU. Not because that would require state (we're talking per-destination state, not per-session state, so tracking it would be doable) but because there may be more than one translator, so any given translator may not see all packets. So for stateless translation, when the translator has a < 1280 packet with no fragment header, we don't know whether this is because PMTUD was successful and the packet size fits in the PMTU, or because the IPv6 host ignored the < 1280 too big message and is omitting the fragment header in violation of the relevant RFCs. In order to be compatible with the latter, the only thing we can do is set DF to 0.

(I just realized that we don't know what happens when, for instance, the PMTU is 1000 and the IPv6 host wants to send a 1100-byte packet: does it include a fragment header or not?)

Note that for part B, if there _is_ a fragment header, that will also trigger the translator to set DF to 0. So in the case where the IPv6 hosts are well behaved there is no change to either part A or part B.
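The stateless rule for part B can be sketched as follows. This is only an illustration of the decision described above, not text from any specification: the function name and the exact comparison against 1280 are assumptions, and packets of 1280 bytes or more without a fragment header are assumed to keep DF=1.

```python
# Hypothetical sketch of the stateless translator's DF decision.
IPV6_MIN_MTU = 1280  # minimum IPv6 link MTU


def stateless_df_bit(ipv6_packet_len: int, has_fragment_header: bool) -> int:
    """Return the DF bit (0 or 1) for the translated IPv4 packet."""
    if has_fragment_header:
        # A well-behaved host that got a < 1280 too big adds this
        # header; the IPv4 side may then fragment the packet.
        return 0
    if ipv6_packet_len < IPV6_MIN_MTU:
        # Cannot tell successful PMTUD from a host that ignored the
        # too big message, so allow fragmentation to be safe.
        return 0
    # Assumption: larger packets keep DF=1 so PMTUD still works.
    return 1


print(stateless_df_bit(1100, False))  # 0
print(stateless_df_bit(1100, True))   # 0
print(stateless_df_bit(1500, False))  # 1
```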

In the stateful case we know all packets flow through the same translator, so we get to track the IPv4 PMTU. When there is an IPv6 packet that needs to be translated, we know whether it's bigger or smaller than the PMTU, so we can send packets that are smaller with DF=1 and packets that are bigger with DF=0 and, if we want to, also immediately fragment them. However, there is no advantage to doing this extra work: even if we get to set DF=1 on some packets that are smaller than the PMTU, that only lets us discover a reduction in the PMTU, and that discovery doesn't buy us anything because the IPv6 host isn't going to reduce its packet size. Fragmenting at the translator also doesn't buy the translator anything, and it may even get the packet blocked or create more work if there's a firewall with a > 1280 MTU before the path hits the < 1280 MTU, since that firewall would need to reassemble the packet to observe the port numbers. It could also be an attack vector: attackers could send too bigs with tiny MTUs to make the translator work harder.
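For comparison, the per-destination tracking that a stateful translator could do might look roughly like this. The class and method names are invented for this sketch, the 1500-byte default is an assumption, and the sketch takes the reading in which packets that fit the tracked PMTU keep DF=1 and larger ones get DF=0.

```python
# Hypothetical sketch of per-destination IPv4 PMTU tracking.
class StatefulPmtuTracker:
    def __init__(self, default_pmtu: int = 1500):
        self.default_pmtu = default_pmtu  # assumed initial path MTU
        self.pmtu = {}  # IPv4 destination address -> learned PMTU

    def on_ipv4_too_big(self, dest: str, mtu: int) -> None:
        """Record the MTU from an ICMPv4 too big; only ever lower it."""
        current = self.pmtu.get(dest, self.default_pmtu)
        self.pmtu[dest] = min(current, mtu)

    def df_for(self, dest: str, ipv4_packet_len: int) -> int:
        """DF=1 if the translated packet fits the known PMTU, else DF=0."""
        pmtu = self.pmtu.get(dest, self.default_pmtu)
        return 1 if ipv4_packet_len <= pmtu else 0


tracker = StatefulPmtuTracker()
tracker.on_ipv4_too_big("192.0.2.1", 1000)
print(tracker.df_for("192.0.2.1", 900))   # 1
print(tracker.df_for("192.0.2.1", 1100))  # 0
```

The on_ipv4_too_big method is also where the attack mentioned above would land: a forged too big with a tiny MTU lowers the stored value, so a real implementation would at least need some sanity checking there.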

> Certainly there was some
> perceived value in the IPv6 knowing its packets are being
> fragmented.

I don't think so. Even if the application somehow learns this information, what is it supposed to do then?

> I agree that nearly 100% of PMTUD on IPv4 is with TCP.

> But there isn't anything prohibiting PMTUD for UDP.

With sufficient thrust pigs fly just fine...

Generally, it's not easy for UDP applications to limit their packet sizes. So the protocol would have to support changing the packet size on the fly and then the application would have to react to too big messages. Not impossible, but I've never seen it happen. But if BitTorrent goes to UDP it's likely they'll put this in in some way.

> I tend to think, though, we don't want to figure this out
> for stateful translators at this point in time.  If it is
> found useful/necessary, it can be specified later.

Right.