My comments

Keith Mc Cloghrie <sytek!kzm@HPLABS.HP.COM> Tue, 05 December 1989 13:48 UTC

From: Keith Mc Cloghrie <sytek!kzm@HPLABS.HP.COM>
Message-Id: <8912030914.AA21692@sytek.hls.hac.com>
Subject: My comments
To: mtudwg%decwrl.dec.com@RELAY.CS.NET
Date: Sun, 3 Dec 89 1:14:25 PDT
X-Mailer: ELM [version 2.2 PL0]

After reading all the messages, I have a number of points to make.

1. Steve D. asked:

> If a host is not sending datagrams large enough to be fragmented,
> why would it *care* what the path MTU is?  

One reason to care is to provide an accurate answer to the
GET_MAXSIZES procedure call of the Host Requirements RFC's
Internet/Transport Layer Interface.

2. I contend that it is architecturally wrong to use TCP's MSS 
to "discover the MTU".  MSS is the receiver's buffer size; this can 
be larger or smaller than the MTU.  The MTU is a property of a route;
it should be discovered at the IP layer, not independently on each 
TCP connection, and should also be available for use by UDP applications.
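To make point 2 concrete, here is a minimal sketch of what "discovered
at the IP layer" could look like: one per-destination MTU estimate that
both TCP and UDP consult.  All names here (RouteMtuCache,
tcp_segment_size, udp_max_payload) are hypothetical, and the header
sizes assume no IP or TCP options; the 576-byte default is likewise an
assumption for illustration.

```python
from dataclasses import dataclass, field

IP_HEADER = 20     # bytes, assuming no IP options
TCP_HEADER = 20    # bytes, assuming no TCP options
UDP_HEADER = 8     # bytes
DEFAULT_MTU = 576  # assumed conservative default when nothing is known


@dataclass
class RouteMtuCache:
    """Hypothetical IP-layer store: one MTU estimate per destination."""
    mtu_by_dest: dict = field(default_factory=dict)

    def path_mtu(self, dest):
        return self.mtu_by_dest.get(dest, DEFAULT_MTU)

    def update(self, dest, mtu):
        # Called by whatever discovery scheme is in use.
        self.mtu_by_dest[dest] = mtu


def tcp_segment_size(cache, dest, peer_mss):
    # MSS (the receiver's limit) and the path MTU are independent bounds;
    # TCP has to respect whichever is smaller.
    return min(peer_mss, cache.path_mtu(dest) - IP_HEADER - TCP_HEADER)


def udp_max_payload(cache, dest):
    # UDP has no MSS exchange, but can still use the same route estimate.
    return cache.path_mtu(dest) - IP_HEADER - UDP_HEADER
```

The same cached value serves every TCP connection and every UDP
application using that route, so nothing has to be rediscovered per
connection.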

3. Both the 1063 and the report-fragmentation schemes require that a
path be tested periodically to determine whether the MTU has increased.
The possibility of the loss of the reply to this periodic test *is* 
a factor in the comparison.  In 1063, the loss of the reply merely
causes the test to be repeated with no harm except a delay in discovery 
if the MTU has increased.  In report-fragmentation, the loss of an 
ICMP fragmentation-occurred message causes the sender's estimated 
MTU to remain set at too high a value for longer, which (potentially) 
causes more over-sized messages to be sent and suffer fragmentation.
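The difference in point 3 can be sketched as two small functions, one
per scheme, each modelling a single test for an MTU increase.  This is
my simplified reading, not a specification: the function names, the
trial_mtu and datagrams_per_round parameters, and the rule that a
report makes the sender fall back to its previous estimate are all
assumptions.

```python
def probe_1063(current_mtu, true_mtu, reply_arrives):
    """One 1063-style test; assumes current_mtu <= true_mtu.

    The option rides on a datagram of the current size, so nothing is
    fragmented.  Returns (new_estimate, fragmented_datagrams).
    """
    if reply_arrives:
        return true_mtu, 0
    # Reply lost: estimate unchanged, the test is simply repeated later.
    return current_mtu, 0


def probe_rf(current_mtu, trial_mtu, true_mtu, report_arrives,
             datagrams_per_round):
    """One report-fragmentation test at a larger trial size.

    Returns (new_estimate, fragmented_datagrams).
    """
    if trial_mtu <= true_mtu:
        # Nothing fragmented, so no report is sent; the increase stands.
        return trial_mtu, 0
    # About a round trip's worth of over-sized datagrams get fragmented.
    fragmented = datagrams_per_round
    if report_arrives:
        # Assumed behaviour: fall back to the previous estimate.
        return current_mtu, fragmented
    # Report lost: silence looks like success, so the estimate stays
    # too high and later datagrams keep being fragmented.
    return trial_mtu, fragmented
```

With a true path MTU of 1006, probe_rf(576, 1500, 1006, False, 8)
leaves the sender believing 1500, whereas probe_1063(576, 1006, False)
leaves it at a safe 576.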

4. With the spare bit in the IP header not available for our use,
and assuming use of MSS is not architecturally sound, the
remaining candidate schemes would seem to be 1063 and Steve B's
variant of report-fragmentation.  It also seems that because
it uses an IP option, Steve B's variant of report-fragmentation 
is much more like 1063 than was Steve D's original proposal.  In 
particular, I presume the please-report-if-fragmented IP option 
would be sent periodically rather than on every datagram, in which 
case the choice of period would probably be the same as for 1063.

5. I've tried to summarize a comparison between the two schemes 
in the following table:

                            1063                      RF option

   asking mechanism:    periodic IP option        periodic IP option
                        on next datagram          on larger datagram

   report mechanism:    reply IP option           extra ICMP msg
                        on next msg               - doesn't require msg
                        in reverse direction      in reverse direction

   sender overhead:     allow space in msg.s      allow space in msg.s
                        for additional IP option  for additional IP option

   receiver overhead:   allow space in msg.s      fragmentation on
                        for reply IP option       round-trip's worth of
                                                  msg.s & sending extra
                                                  ICMP msg

   gateway overhead:    processing msg with       processing msg with
                        IP option & updating it   IP option

   introduction cost:   cooperating hosts &       cooperating hosts only
                        every intermed. gateway

Assuming I've got the table right, there are only three issues where
there's a significant difference: the ones I called
"report mechanism", "receiver overhead" and "introduction cost".

The "report mechanism" difference concerns the sending of responses
back to the sender if there is no traffic in the reverse direction.
I suggest that the 1063 specification could easily be extended to have
IP generate some null-function ICMP message to carry the reply IP
option, if the reply has sat for a "long" time awaiting a datagram
with the correct destination.
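A minimal sketch of that extension, under my own assumptions: the
receiver queues the reply, attaches it to the next outgoing datagram
for that destination if there is one, and otherwise hands it over for
sending in a null-function ICMP message once it has waited max_wait
seconds.  The class and method names are hypothetical, and the actual
sending of the ICMP message is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class PendingReply:
    dest: str
    observed_mtu: int
    queued_at: float


class ReplyQueue:
    """Hypothetical holder for 1063 replies awaiting reverse traffic."""

    def __init__(self, max_wait):
        self.max_wait = max_wait  # the "long" time, in seconds
        self.pending = {}

    def queue(self, dest, observed_mtu, now):
        # A newer reply for the same destination replaces the older one.
        self.pending[dest] = PendingReply(dest, observed_mtu, now)

    def piggyback(self, dest):
        """Called when a datagram to dest is about to be sent.

        Returns the reply to attach as an IP option, or None.
        """
        return self.pending.pop(dest, None)

    def expire(self, now):
        """Return replies that have waited at least max_wait.

        The caller would send each one in a null-function ICMP message.
        """
        due = [r for r in self.pending.values()
               if now - r.queued_at >= self.max_wait]
        for r in due:
            del self.pending[r.dest]
        return due
```

The choice of max_wait trades extra ICMP traffic against delay in the
sender learning of an increased MTU.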

Thus, if I had to draw a conclusion at this stage, it would be 
the following:

  For MTU-Discovery, we have a choice of waiting for most of the 
  gateways to be updated to support the RFC-1063 IP option, or 
  of using report-fragmentation in which each test for an increase 
  in MTU causes a burst of fragmented messages.

Keith.