some further musings
smb@research.att.com Tue, 05 December 1989 16:21 UTC
Received: from decwrl.dec.com by acetes.pa.dec.com (5.54.5/4.7.34) id AA20534; Tue, 5 Dec 89 08:21:33 PST
Received: by decwrl.dec.com; id AA18151; Tue, 5 Dec 89 08:21:29 -0800
From: smb@research.att.com
Message-Id: <8912051621.AA06557@hector.homer.nj.att.com>
Received: by hector.homer.nj.att.com id AA06557; Tue, 5 Dec 89 11:21:07 EST
To: mtudwg
Subject: some further musings
Date: Tue, 05 Dec 1989 11:21:05 -0500
>From: hector!smb
A few more thoughts occurred to me this morning: maybe we should re-examine our assumptions. I'm not sure these all make sense, and they're a bit contradictory; think of them as monkey wrenches cast upon the waters...

First: is it really necessary to reprobe for MTU changes, once the initial negotiation is complete? Route changes are comparatively infrequent, and many -- most? -- routing changes will not affect the MTU. For example, if an internal router fails, any new route to my external gateway will likely be via the same medium, i.e., Ethernet. For that matter, for the next few years the vast majority of hosts will be restricted by Ethernet's MTU; most other choices are rather uncommon or too expensive for workstations (e.g., FDDI). As long as our long-haul links have an MTU greater than 1500, few changes in route will affect the path MTU.

This has several implications. First, it favors any of the types of report-fragmentation. Minimum-MTU requires an active discovery process; if my hypothesis is correct, we want something that responds to decreases in MTU, but ignores increases.

A second assumption we have been operating under is that the sender should initiate the MTU discovery process. That's true if the 1063 option is used; it's not true with report-fragmentation. Suppose that receivers always generated a fragmentation ICMP message, though only for the first few offenses per connection. If the sender understands the ICMP message, it will adjust its behavior accordingly; if not, the message is mostly harmless except for the minor amount of extra traffic. I would define ``first few'' as some small integer (on the order of 10), or twice the round-trip time -- the receiver should have a decent idea of it, if only from the initial SYN-ACK handshake if TCP is used.

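The ``first few offenses'' rule above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name ReportLimiter, its parameters, and the choice to apply the count limit and the time window together are all hypothetical, since the text offers them as alternatives and does not say how to combine them.

```python
class ReportLimiter:
    """Decide whether a receiver should send a fragmentation report.

    Hypothetical sketch: limits reports per connection to a small
    count and to a window of a few round-trip times.
    """

    def __init__(self, max_reports=10, rtt_multiple=2.0):
        self.max_reports = max_reports    # "on the order of 10"
        self.rtt_multiple = rtt_multiple  # "twice the round-trip time"
        self._state = {}  # conn -> [time of first offense, reports sent]

    def should_report(self, conn, now, rtt):
        """Return True if a report should be sent for this fragmented arrival.

        conn is any hashable connection identifier, now is the current
        time in seconds, rtt is the receiver's round-trip estimate.
        """
        entry = self._state.setdefault(conn, [now, 0])
        first_seen, sent = entry
        if sent >= self.max_reports:
            return False  # count limit reached
        if now - first_seen > self.rtt_multiple * rtt:
            return False  # outside the time window
        entry[1] = sent + 1
        return True
```

A receiver would call should_report() each time a fragmented datagram arrives for a connection and send the ICMP message only when it returns True.
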
If we make the report an IP option instead of a separate message -- tagged onto an unsolicited ICMP ECHO REPLY or something else equally harmless -- we could also resend the report any time we retransmitted an ACK. (That violates layering a bit; the concept may need clearer definition to make it work for NFS and the like.)

Finally, if Jeff Mogul's recent proposal is considered useful despite my nitpicking objections, we can apply his observation to report-fragmentation: the last hop router knows the final MTU, so any router that fragments a packet should send back the ICMP message to the host. That gives us the advantage of putting the mechanism into the faster-evolving router population, and getting feedback to the host more quickly (i.e., from closer to the source).

The obvious objection here is that the router has no knowledge of connections, and hence wouldn't know when to stop sending these messages to an uncooperative host. I'm not convinced that that holds up. Conforming hosts today shouldn't be sending packets that need fragmentation as long as every hop accepts 576 or larger. This implies that most jumbograms are from hosts that are trying to do MTU discovery, in which case they'd understand the ICMP message. A comparatively small cache could be kept of source-destination pairs that had been sent such messages recently. Since this cache is soft state, it doesn't matter too much if it's lost on a reboot, unless the volume of extra ICMP messages gets to be very large. Does anyone have any current statistics on the frequency of non-local fragmentation? Routers that connect to tinygram networks might need a variant on this strategy, of course; their cache might be too large.

We could further embellish this scheme by using the cache when routes change. If the new route to a destination lowers the MTU, flush the cache for that destination -- you want to send new options.
If it raises the MTU, flag the cache entries; when any packet flows through for a source-destination pair that's in the cache -- and hence for which the sender has previously been advised of the proper MTU -- send it a new report to raise the MTU. This last may be overkill, of course...

--Steve Bellovin
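
The router-side cache described in the message could look roughly like the sketch below. It is an illustration under assumptions, not a worked-out design: the class name FragReportCache, its method names, the fixed capacity, and the least-recently-used eviction are all hypothetical choices that the message does not specify.

```python
from collections import OrderedDict


class FragReportCache:
    """Soft-state cache of (source, destination) pairs already advised."""

    def __init__(self, capacity=1024):
        self.capacity = capacity        # assumed bound; the text says only "small"
        self._entries = OrderedDict()   # (src, dst) -> flagged for a new report

    def on_packet(self, src, dst, needs_fragmentation):
        """Return True if the router should send a report to src."""
        key = (src, dst)
        if key in self._entries:
            self._entries.move_to_end(key)
            if self._entries[key]:
                # Route MTU rose since the last report: advise the sender again.
                self._entries[key] = False
                return True
            return False  # already advised recently; stay quiet
        if not needs_fragmentation:
            return False
        self._entries[key] = False
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used (assumption)
        return True

    def on_route_change(self, dst, old_mtu, new_mtu):
        """Flush entries for dst if the MTU dropped; flag them if it rose."""
        affected = [k for k in list(self._entries) if k[1] == dst]
        for key in affected:
            if new_mtu < old_mtu:
                del self._entries[key]     # senders must be told again
            elif new_mtu > old_mtu:
                self._entries[key] = True  # send a new report on the next packet
```

Because the cache is soft state, losing it on a reboot costs only some repeated reports: on_packet() simply starts advising senders again.
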