yet another MTU discovery scheme
Steve Deering <deering@pescadero.stanford.edu> Sun, 25 February 1990 06:58 UTC
Received: from decwrl.dec.com by acetes.pa.dec.com (5.54.5/4.7.34) id AA21472; Sat, 24 Feb 90 22:58:01 PST
Received: by decwrl.dec.com; id AA21625; Sat, 24 Feb 90 22:57:58 -0800
Received: by Pescadero.Stanford.EDU (5.59/25-eef) id AA13610; Sat, 24 Feb 90 22:57:53 PDT
Date: Sat, 24 Feb 1990 20:51:00 -0000
From: Steve Deering <deering@pescadero.stanford.edu>
Subject: yet another MTU discovery scheme
To: mtudwg
Message-Id: <90/02/24

My original RF-bit proposal assumes that it's OK for fragmentation and reassembly to occur once in a while, and uses intentional fragmentation to learn path-MTUs. If it is decided that fragmentation (actually, reassembly) is to be avoided at all costs, in order not to be vulnerable to Identifier wraparound at high speeds, my proposal could be modified to require that fragmented packets be discarded at the destination rather than reassembled (assuming they have the RF bit set and ICMP Fragment Received messages are sent back to the source). Compared to the original proposal, this has the drawback of wasting one round-trip time to learn the MTU of those paths whose MTU is less than MIN(first-hop-MTU, MSS). The time is "wasted" in the sense that the packets that are used to discover the path-MTU must be thrown away by the receiver.

Let me try out another idea -- actually, a variation on an old idea. The old idea is to have senders detect fragmentation by setting the Don't Fragment (DF) bit on all packets, reducing their packet size if they receive "ICMP Destination Unreachable: Fragmentation Needed and DF Set" messages (I'll call them "Can't Fragment" messages, for short). The problem with this approach is that the Can't Fragment messages do not tell the sender what the MTU of the next hop network is, so it may take several retransmissions to learn the MTU (perhaps even doing a binary search for the right packet size).

The variation I am proposing is to have the gateway that generates the Can't Fragment message include the recommended MTU (that is, the MTU of the next hop) in the 32-bit "unused" field of the Can't Fragment message. (I'll call that field the "Recommended MTU field".)

The sender behavior is as follows:

- start by assuming that the path-MTU is MIN(first-hop-MTU, MSS), and send all packets with the DF bit set.
- if a Can't Fragment message is received with a non-zero Recommended MTU value less than the currently-assumed path-MTU, adopt the recommended MTU as the new assumed MTU, and re-packetize and retransmit the packet identified by the Can't Fragment message (at the transport layer). Continue to set the DF bit in all packets.
- if a Can't Fragment message is received with a zero Recommended MTU value, that means the gateway that generated the message has not been upgraded to the new protocol. If the currently-assumed path-MTU is greater than 576, change it to 576. Re-packetize and retransmit the rejected packet. *Stop* setting the DF bit in outgoing packets.
- at fairly large intervals (10 - 20 minutes?), reset the assumed-MTU to MIN(first-hop-MTU, MSS) and start sending DF bits again, in order to learn if the path-MTU has increased.

No changes are required at the receiver, and gateways can be upgraded gradually. No new IP header bits are required. If a path's MTU shrinks at an unmodified gateway, the sender ends up reverting to the conservative 576 rule. As the hosts and gateways are upgraded to use this strategy, fragmentation is eliminated from the Internet (just as everyone is getting fast enough to run into the Identifier-wraparound problem).

In many cases, setting the DF bit will *not* trigger any Can't Fragment messages, now that the NSFnet backbone and most regionals support a 1500 byte MTU. If FDDI ever gets off the ground, its 4K packets will trigger Can't Fragment messages from the near-side gateway to any smaller-MTU subnet, allowing the sender to learn the correct MTU (or 576) in less than one RTT. In those rare cases where the MTU shrinks at more than one point in a path, it will require multiple retransmissions to learn the path-MTU, but that doesn't seem too serious.
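
To make the sender rules above concrete, here is a rough Python sketch of the per-destination state they imply, together with a gateway-side helper that places the next-hop MTU in the 32-bit unused field. All names here (PathMtuState, build_cant_fragment, parse_recommended_mtu, PROBE_INTERVAL) are made up for illustration. The 15-minute interval is just one value picked from the suggested 10 - 20 minute range, and ignoring a non-zero recommendation that is not smaller than the assumed MTU is an assumption, since the rules above do not cover that case.

```python
import struct

DEFAULT_MTU = 576          # conservative fallback named in the message
PROBE_INTERVAL = 15 * 60   # seconds; assumed value within the 10 - 20 minute range


def build_cant_fragment(recommended_mtu, original_datagram=b""):
    """Gateway side: ICMP type 3, code 4, with the MTU in the 32-bit unused field.

    The checksum is left at zero here; a real gateway would compute it.
    """
    return struct.pack("!BBHI", 3, 4, 0, recommended_mtu) + original_datagram


def parse_recommended_mtu(icmp_message):
    """Sender side: return the Recommended MTU field, or None for other ICMP messages."""
    icmp_type, code, _checksum, mtu = struct.unpack("!BBHI", icmp_message[:8])
    if (icmp_type, code) != (3, 4):
        return None
    return mtu


class PathMtuState:
    """Hypothetical per-destination cache entry kept at the sender."""

    def __init__(self, first_hop_mtu, mss, now=0.0):
        self.initial_mtu = min(first_hop_mtu, mss)
        self.assumed_mtu = self.initial_mtu
        self.set_df = True
        self.last_reset = now

    def on_cant_fragment(self, recommended_mtu):
        """Apply one Can't Fragment message; True means re-packetize and retransmit."""
        if recommended_mtu == 0:
            # Gateway has not been upgraded: fall back to 576 and stop setting DF.
            if self.assumed_mtu > DEFAULT_MTU:
                self.assumed_mtu = DEFAULT_MTU
            self.set_df = False
            return True
        if recommended_mtu < self.assumed_mtu:
            # Upgraded gateway: adopt its recommendation and keep setting DF.
            self.assumed_mtu = recommended_mtu
            return True
        # Assumption: a recommendation that is not smaller is ignored.
        return False

    def on_timer(self, now):
        """Periodically reset to the optimistic MTU to notice path-MTU increases."""
        if now - self.last_reset >= PROBE_INTERVAL:
            self.assumed_mtu = self.initial_mtu
            self.set_df = True
            self.last_reset = now
```

The transport layer would consult assumed_mtu and set_df when building each packet, and call on_cant_fragment with the value returned by parse_recommended_mtu whenever a Can't Fragment message arrives for that destination.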

Many of the implementation issues are the same as for the RF-bit scheme, such as caching of path-MTUs at the IP layer in the sender, participation of the transport layer in the sender, limiting the number of Can't Fragment messages generated in response to pipes-full of packets with DF bit set, etc.

Possible problem: gateways that do not send ICMP Can't Fragment messages when they should. Are there any such gateways?

Now, I'm not renouncing my original RF-bit proposal (yet). First, I'd like to hear some opinions on whether or not the Identifier-wraparound problem is something we should worry about.

Steve