Re: Transport requirements for DNS-like protocols

Dave Crocker <> Sun, 30 June 2002 04:25 UTC

Date: Sat, 29 Jun 2002 21:25:05 -0700
From: Dave Crocker <>
Subject: Re: Transport requirements for DNS-like protocols
To: Michael Mealling <>

At 11:04 PM 6/29/2002 -0400, Michael Mealling wrote:
>The proposal I've been toying with was to limit the number of packets
>in the train to some number equating to *small*. Anything above that
>is responded to with an "I'm sorry, but the response is on the order of
>a file transfer instead of a simple response, so please requery via TCP".

interesting idea.
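
A minimal sketch of that cutoff, in Python, might look like the following. The names plan_response, MAX_PAYLOAD and MAX_TRAIN are hypothetical, the 512-byte per-packet payload is an assumption, and the cap of 4 packets is a placeholder only, since nobody in this thread knows the right number.

```python
import math

MAX_PAYLOAD = 512   # assumed payload bytes per UDP packet
MAX_TRAIN = 4       # hypothetical cap on packets per response; the real value is unknown


def plan_response(response, max_payload=MAX_PAYLOAD, max_train=MAX_TRAIN):
    """Decide how to answer: a short UDP packet train, or "requery via TCP".

    Returns ("udp", chunks) when the response fits within max_train packets,
    otherwise ("retry-tcp", []) so the client knows to come back over TCP.
    """
    count = max(1, math.ceil(len(response) / max_payload))
    if count > max_train:
        return ("retry-tcp", [])
    # An empty response still produces one (empty) chunk.
    chunks = [response[i:i + max_payload]
              for i in range(0, len(response), max_payload)] or [b""]
    return ("udp", chunks)
```

The only design choice here is that the server decides before sending anything, so an oversized answer costs one small "retry" packet rather than a partial train.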

>But sans some hard network testing across variously connected parts of
>the network I have no idea how to come up with that number.

worse, today's statistics don't necessarily predict tomorrow's.

>Yes. The fact that fragmentation is the exception rather than the rule
>means that the context switching required during the probable situation
>that the router is 'stressed' suggests that the large UDP packet solution
>is relying on the network feature most likely to fail instead of most
>likely to succeed.

Also it is bad in the receiving host.  With IP fragmentation, each fragment 
is held down in the IP layer, consuming a network buffer, until the rest of 
the datagram arrives or reassembly times out.

>  We have good evidence that UDP packets smaller than
>512 bytes succeed at a high rate. IMHO, we should optimize to what will
>succeed rather than what is an error correction.

exactly.  for all of the processing efficiencies of larger packets, quick 
DNS-like transactions might be the place to bias towards smaller chunks.

> > >That is, even in the
> > >absence of PMTU discovery, there is still a chance that a larger than
> > >minimum IP packet might still make it through the net unfragmented.
> >
> > But the fact that it might not is the problem.
>Correct. It's the recovery from the dropped fragments that causes additional
>network impact and increased delay due to timeouts for the user.

With a dropped fragment, the entire IP datagram must be resent.  An 
application-level, selective-retransmission scheme does not incur that.
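
To make the difference concrete, here is a rough sketch in Python. The function names and the idea of the client reporting which sequence numbers it holds are assumptions for illustration, not a worked-out protocol.

```python
def missing_chunks(total, received):
    """Sequence numbers (0-based) the receiver still needs."""
    have = set(received)
    return [seq for seq in range(total) if seq not in have]


def resend_cost(chunk_sizes, lost):
    """Bytes resent after a loss, as (whole_datagram, selective).

    whole_datagram: every chunk goes again, as with a lost IP fragment.
    selective: only the chunks listed in `lost` go again.
    """
    whole = sum(chunk_sizes)
    selective = sum(chunk_sizes[seq] for seq in lost)
    return whole, selective
```

For a four-chunk response of 512 bytes each with one chunk lost, resend_cost([512] * 4, [2]) gives 2048 bytes for the whole-datagram case against 512 for the selective one.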

Still, there is the minor issue of retaining state in the server application.

> > Perhaps the way to avoid this mistake is to make DNS use a transport
> > protocol that does not rely on the size of the underlying packets.  The
> > easiest way to do this is a thin layer ON TOP of UDP, that strings them
> > together.
>Beyond a simple sequence number and possible checksum in the first octet,
>would anything else be needed? The only thing I can think of is
>some additional octet in the first packet indicating the total number
>of packets sent.

Sounds about right.  The issue is not designing such a mechanism.  It's the 
impact on the servers that I would guess to be the major issue.
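
For what it is worth, a sketch of such a layer fits in a few lines of Python. Everything specific here is an assumption: the one-octet sequence number, the one-octet total carried in every packet (rather than only the first, so that losing the first packet does not hide the count), the CRC-32 checksum, and the names frame and reassemble.

```python
import struct
import zlib

# Assumed header: sequence number (1 octet), total packets (1 octet),
# CRC-32 of the payload (4 octets), all in network byte order.
HEADER = struct.Struct("!BBI")


def frame(response, max_payload=512):
    """Split a response into header-prefixed chunks, one per UDP packet."""
    chunks = [response[i:i + max_payload]
              for i in range(0, len(response), max_payload)] or [b""]
    if len(chunks) > 255:
        raise ValueError("response needs more than 255 packets")
    return [HEADER.pack(seq, len(chunks), zlib.crc32(chunk)) + chunk
            for seq, chunk in enumerate(chunks)]


def reassemble(packets):
    """Return the full response, or None if a chunk is missing or corrupt."""
    got = {}
    total = None
    for packet in packets:
        if len(packet) < HEADER.size:
            continue  # too short to carry a header; ignore
        seq, count, crc = HEADER.unpack_from(packet)
        payload = packet[HEADER.size:]
        if zlib.crc32(payload) != crc:
            continue  # checksum mismatch; treat as lost
        total = count
        got[seq] = payload
    if total is None or any(seq not in got for seq in range(total)):
        return None
    return b"".join(got[seq] for seq in range(total))
```

Note that reassemble has to hold the chunks it has seen until the rest arrive, which is exactly the per-transaction state in question.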

> > However, to add selective retransmission requires that the server 'assemble'
> > the DNS query and acknowledge its parts selectively.  Hence the server
> > becomes stateful.
>And increases server demands to the point that a cost of a truly reliable
>connect becomes the optimal solution, when in truth it ends up being

Not if we are faced with a true requirement for larger application data 
units, as it seems we are (and, frankly, have been for a number of years.)

The debate is how to provide those larger transaction units, not whether.


Dave Crocker <>
TribalWise, Inc. <>
tel +1.408.246.8253; fax +1.408.850.1850