Re: Path MTU Discovery
Dan.McDonald@eng.sun.com (Dan McDonald) Thu, 06 February 1997 20:54 UTC
Received: (from majordom@localhost) by portal.ex.tis.com (8.8.2/8.8.2) id PAA13942 for ipsec-outgoing; Thu, 6 Feb 1997 15:54:42 -0500 (EST)
From: Dan.McDonald@eng.sun.com
Message-Id: <199702062058.MAA22972@kebe.eng.sun.com>
Subject: Re: Path MTU Discovery
To: angelos@aurora.cis.upenn.edu
Date: Thu, 06 Feb 1997 12:58:32 -0800
Cc: ipsec@tis.com
In-Reply-To: <9702060420.AA79343@aurora.cis.upenn.edu> from "Angelos D. Keromytis" at Feb 5, 97 11:20:02 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Sender: owner-ipsec@ex.tis.com
Precedence: bulk
> Path-mtu discovery breaks in the presence of multiple IPsec
> encapsulation(*) (it might even break in the presence of ONE
> intermediate encapsulating entity).

Are you sure it totally breaks?  It doesn't work as well, for sure, but I
don't see total breakage.

An encrypting router often has a "tunnel interface" that'll have properties
like:

    tun0:  10.69.0.0/16 --> 10.9.1.25

Interfaces have MTU associated with them.  A combination of PathMTU
discovery from the router to the tunnel endpoint, and knowledge of the
algorithms used, etc. can give this tunnel interface a reasonable MTU
estimate.  Reasonable enough that an "MTU too large" message for datagrams
bound for 10.69.0.0/16 can be sent.

BTW, by PathMTU discovery to the endpoint, I mean that the router (because
it is originating packets now, from its address to the tunnel endpoint) has
a cache-entry/host-route/whatever for the tunnel endpoint.  That entry can
be a repository for intermediate Path MTU information.

    Incoming IP data                            Outgoing forward result
    ----------------                            -----------------------
    src 10.8.20.69    -->  ROUTER (ifaddr)  ->  src 10.10.20.20
    dst 10.69.21.12   <-(10.8.20.2)             dst 10.9.1.25
    proto=TCP              (10.10.20.20)->      proto=ESP (with IP inside)

    Figure 1: Demonstration of "originating packets".

Now let's say there's ANOTHER layer of encryption between the router above
and its tunnel endpoint of 10.9.1.25.  THAT router may send an ICMP toobig
to OUR router (10.10.20.20), saying the path to 10.9.1.25 has a smaller MTU
than what I think.  Now, because of the multiple nestings of IP, I can't
percolate that ICMP toobig all the way back to the originator in one step,
but it will eventually percolate back.  It will take N dropped messages for
N layers of tunnelling.  So if there's an intermediate node's worth of
encapsulation, the first message will be dropped, then when the first
subsequent message hits the cranked-down tunnel endpoint, THEN a toobig can
be sent back.
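To make the "N dropped messages for N layers" argument concrete, here is a
minimal sketch of the ratcheting in Python.  It is a toy model, not anyone's
implementation: the names (Tunnel, tunnel_mtu, send, discover) and the
40-byte per-layer overhead are assumptions picked for illustration, every
datagram is assumed to have DF set, and an ICMP toobig is assumed to reach
only the node that originated the (outer) datagram.

```python
from dataclasses import dataclass


@dataclass
class Tunnel:
    overhead: int  # bytes this layer adds (illustrative value, not real ESP sizing)
    mtu: int       # current MTU estimate of the tunnel interface


def tunnel_mtu(path_mtu_to_endpoint: int, overhead: int) -> int:
    """Largest inner datagram that still fits once this layer is added."""
    return path_mtu_to_endpoint - overhead


def send(size: int, tunnels: list, bottleneck_mtu: int):
    """Push one DF datagram through the nested tunnels (outermost last).

    Returns (delivered, mtu_reported_to_host).  A drop that only teaches
    an encapsulating router something returns (False, None).
    """
    for i, t in enumerate(tunnels):
        if size > t.mtu:
            if i == 0:
                # First encapsulator can answer the original host directly.
                return False, t.mtu
            # Otherwise the toobig goes to the previous encapsulator,
            # which ratchets its own tunnel interface down.
            prev = tunnels[i - 1]
            prev.mtu = min(prev.mtu, tunnel_mtu(t.mtu, prev.overhead))
            return False, None
        size += t.overhead
    if size > bottleneck_mtu:
        if not tunnels:
            return False, bottleneck_mtu
        # A router beyond the last encapsulator complains to that encapsulator.
        last = tunnels[-1]
        last.mtu = min(last.mtu, tunnel_mtu(bottleneck_mtu, last.overhead))
        return False, None
    return True, None


def discover(host_mtu: int, tunnels: list, bottleneck_mtu: int):
    """Resend until a datagram gets through.

    Returns (final host path MTU, number of drops that produced no ICMP
    back to the host).
    """
    silent_drops = 0
    while True:
        delivered, reported = send(host_mtu, tunnels, bottleneck_mtu)
        if delivered:
            return host_mtu, silent_drops
        if reported is None:
            silent_drops += 1
        else:
            host_mtu = reported


if __name__ == "__main__":
    # One layer of encapsulation, 1400-byte bottleneck behind it.
    print(discover(1460, [Tunnel(40, 1460)], 1400))
    # Two nested layers in front of the same bottleneck.
    print(discover(1460, [Tunnel(40, 1460), Tunnel(40, 1500)], 1400))
```

Under these assumptions the one-layer case loses one datagram silently
before the host hears a toobig, and the two-layer case loses two.  The
tunnel MTUs only ever move downward, so the loop terminates.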

Using the above figure 1 as an example, if a router says the path MTU to
10.9.1.25 is less, then the router's tunnel interface will ratchet down its
MTU.  The original packet from the source host 10.8.20.69 will just be
dropped, because the router doesn't really want to go digging deep for the
originator.  The next IP datagram from 10.8.20.69 will generate an ICMP
toobig from the router, because it now has the ratcheted-down MTU on its
tunneling interface, and so with one dropped datagram, the node 10.8.20.69
knows the whole path MTU.

If messages drop occasionally, that's fine.  This is IP.  Sure it's a
performance hit, but security and performance are sometimes (note my choice
of words) opposites in a tradeoff.

I don't think your solution about keeping SPIs helps a whole lot here.  It
seems to be unnecessary implementation cruft.  If there are any flaws in
what I've said, however, I'd certainly like to know.

> This still doesn't address the problem of the original TCP mtu (the
> mtu of the outgoing interface could be less than that reported on the
> kernel structure, depending on whether a packet will be IPsec'ed or
> not). But i doubt we can mandate a solution for that.

As for the original TCP MSS, which needs to be set, IP must be able to send
a hint to the particular TCP session indicating that IP security will lower
the effective MSS for this TCP connection.  I say it must only alter a
single TCP session because IP security should use per-endpoint security
properties where possible.  See Bellovin's USENIX Security '96 conference
paper for details on why.  See draft-mcdonald-simple-ipsec-api-00.txt for
how an application may exploit this.

> Also, there's the case of whether we accept as valid ICMPs from anyone
> in between (which means anyone) or just two encapsulating entities
> (e.g. two tunneling firewalls). The network-correct approach is
> anyone; the security correct is next enc entity.

Good point, and it applies to ICMP messages of all shapes, sizes, and flavors.
It's possible an intermediate router could send an ICMP with AH on it; that
way I have reasonable assurance it came from a router with that IP address.

> (*) Steve Kent replied that it shouldn't break for an end host;

He is right.

> however, the 4.4BSD TCP code checks the outgoing interface MTU
> directly to determine the size of the packets, if the route entry does not
> have an mtu (check tcp_input.c, tcp_mss()). This means that either TCP
> is patched, or fragmentation will happen.

Stock 4.4 is broken w.r.t. trying to perform Path MTU discovery.  FreeBSD
has one solution for doing this with IPv4.  The NRL IPv6 code has another
solution that it implements on the IPv6 side of things.

Dan