Re: Another proposal to think about

mogul (Jeffrey Mogul) Wed, 29 November 1989 18:33 UTC

Received: by acetes.pa.dec.com (5.54.5/4.7.34) id AA00675; Wed, 29 Nov 89 10:33:13 PST
From: mogul (Jeffrey Mogul)
Message-Id: <8911291833.AA00675@acetes.pa.dec.com>
Date: 29 Nov 1989 1033-PST (Wednesday)
To: Philippe Prindeville <philipp@gipsi.gipsi.fr>
Cc: MTU Discovery <mtudwg>
Subject: Re: Another proposal to think about
In-Reply-To: Philippe Prindeville <philipp@gipsi.gipsi.fr> / Wed, 29 Nov 89 11:48:27 -0100. <8911291048.AA05369@gipsi.gipsi.fr>

    > Your proposal to handle MTU discovery at the TCP level sounds
    > reasonable, except that it only works for TCP.  That means we would
    > have to make similar changes to any other "packetization" protocols,
    > such as TP4 or VMTP or NFS, in order for them also to take advantage
    > of MTU discovery.  
    
    Excuse me, but something is obviously wrong in someone's thinking
    here (possibly mine):  Why would one *want* to change the packet
    size of a VMTP or UDP packet?  They are record-oriented protocols
    and must preserve such boundaries.

Certainly one cannot change the size of a UDP packet once the next
layer up has sent it.  Just as certainly, however, a higher layer
cannot expect to be allowed to send an arbitrarily large UDP packet.
The purpose of MTU discovery is to allow the higher layers (such
as RPC) to know how big a packet may be sent without fear of fragmentation.
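
To make the arithmetic concrete, here is a minimal sketch of the number
a higher layer would want from MTU discovery.  It assumes IPv4 with no
IP options (a 20-byte header) and the 8-byte UDP header; the function
name is made up for illustration and is not part of any proposal.

```python
IPV4_HEADER = 20  # bytes; assumes an IPv4 header with no options
UDP_HEADER = 8    # bytes; the UDP header is a fixed size


def max_udp_payload(path_mtu: int) -> int:
    """Largest UDP payload that fits in one unfragmented IPv4 datagram.

    path_mtu is the smallest MTU along the path, in bytes, as a
    discovery mechanism might report it (hypothetical input).
    """
    payload = path_mtu - IPV4_HEADER - UDP_HEADER
    if payload <= 0:
        raise ValueError("path MTU too small for an IPv4/UDP datagram")
    return payload
```

For an Ethernet path with a 1500-byte MTU this gives 1472 bytes of UDP
payload; anything larger would be fragmented by IP.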

NFS (+ Sun RPC) provide a textbook example of both the possibility of
doing this right, and the dangers of doing this wrong.  Sun RPC loves
to send 8 KB UDP packets over an Ethernet (with a 1500-byte MTU).  This is
often a disaster when a gateway (or slow receiver interface) is 
involved.  However, NFS is perfectly capable of breaking up a file
write into chunks (it would have to no matter what size it preferred to
use) and so it can be told to use smaller chunks.  Nowadays, this
is done as a hand-configured mount-time option, but the same
information could be provided by MTU discovery (albeit with some
revision of the relevant NFS code) without changing the NFS spec.
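
A rough sketch of both halves of that example follows: how many IP
fragments a large UDP datagram turns into, and how a client could
instead split a write into chunks that each fit in one datagram.  The
assumptions are IPv4 without options, and an rpc_overhead of 200 bytes
that is purely a placeholder for the RPC and NFS headers (the real
figure depends on the call and its credentials).  Both function names
are hypothetical; this is not how any particular NFS client is written.

```python
import math

IPV4_HEADER = 20  # bytes; assumes an IPv4 header with no options
UDP_HEADER = 8    # bytes


def fragment_count(udp_payload: int, mtu: int) -> int:
    """Number of IP fragments needed to carry one UDP datagram."""
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil((udp_payload + UDP_HEADER) / per_fragment)


def split_write(data: bytes, path_mtu: int, rpc_overhead: int = 200):
    """Split a write into (offset, chunk) pairs, one per datagram.

    rpc_overhead is an assumed allowance for RPC/NFS headers, not a
    value taken from the NFS specification.
    """
    chunk = path_mtu - IPV4_HEADER - UDP_HEADER - rpc_overhead
    if chunk <= 0:
        raise ValueError("path MTU too small to carry any file data")
    return [(off, data[off:off + chunk])
            for off in range(0, len(data), chunk)]
```

With a 1500-byte MTU, fragment_count(8192, 1500) is 6, so losing any
one of six fragments costs the whole 8 KB datagram.  split_write on
the same 8192 bytes produces seven independent requests, each of which
can be retransmitted on its own.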

(An aside on protocol design: although UDP is required to preserve
datagram boundaries, the designer of a higher-level record-oriented
protocol should not necessarily expect to fit an entire record
into a single UDP datagram.  Doing so couples the parameters of the
higher-level protocol too tightly to the parameters of arbitrary
links in the Internet, if one wants reasonable performance.  If I
understand correctly [someone will no doubt correct me] in the OSI
model, the "presentation layer" is where the higher-level records
should be assembled.  It's nice if the transport layer provides
a way of marking the boundaries, but it can't/shouldn't be prohibited
from imposing additional boundaries.)

-Jeff