Re: Transport requirements for DNS-like protocols

Dave Crocker <dhc2@dcrocker.net> Sun, 30 June 2002 04:25 UTC

Date: Sat, 29 Jun 2002 21:25:05 -0700
From: Dave Crocker <dhc2@dcrocker.net>
Subject: Re: Transport requirements for DNS-like protocols
In-reply-to: <20020629230426.J24592@bailey.dscga.com>
X-Sender: dhc2@jay.songbird.com
To: Michael Mealling <michael@neonym.net>
Cc: ietf-irnss@lists.elistx.com
Message-id: <5.1.1.2.2.20020629211753.020075f8@jay.songbird.com>
MIME-version: 1.0
X-Mailer: QUALCOMM Windows Eudora Version 5.1.1.3 (Beta)
Content-type: text/plain; format=flowed; charset=us-ascii
References: <5.1.1.2.2.20020629080646.02aa6730@jay.songbird.com> <199812050411.UAA00462@daffy.ee.lbl.gov> <vern@ee.lbl.gov> <5.1.1.2.2.20020629080646.02aa6730@jay.songbird.com>

At 11:04 PM 6/29/2002 -0400, Michael Mealling wrote:
>The proposal I've been toying with was to limit the number of packets
>in the train to some number equating to *small*. Anything above that
>is responded to with a "I'm sorry but the response is on the order of
>a file transfer instead of a simple response so please requery via TCP".

interesting idea.
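
A rough sketch of that server-side decision, in Python. The 512-byte payload ceiling comes from the UDP figure quoted further down; the train limit of 4 is purely a placeholder, since nobody in this thread has a measured value, and the names plan_response, MAX_PAYLOAD and MAX_TRAIN are made up for illustration.

```python
import math

MAX_PAYLOAD = 512  # assumed per-packet payload ceiling, in bytes
MAX_TRAIN = 4      # hypothetical "small" limit; no agreed value exists


def plan_response(response, max_payload=MAX_PAYLOAD, max_train=MAX_TRAIN):
    """Return ("udp", chunks) when the response fits in a short packet train,
    or ("retry-tcp", []) when the client should requery over TCP."""
    count = max(1, math.ceil(len(response) / max_payload))
    if count > max_train:
        return ("retry-tcp", [])
    chunks = [response[i:i + max_payload]
              for i in range(0, len(response), max_payload)] or [b""]
    return ("udp", chunks)


if __name__ == "__main__":
    kind, chunks = plan_response(b"x" * 1000)
    print(kind, [len(c) for c in chunks])
    print(plan_response(b"x" * 3000)[0])
```

The open question remains what MAX_TRAIN should be; the sketch only shows where the number would plug in.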

>But sans some hard network testing across variously connected parts of
>the network I have no idea how to come up with that number.

worse, today's statistics don't necessarily predict tomorrow's.


>Yes. The fact that fragmentation is the exception rather than the rule
>means that the context switching required during the probable situation
>that the router is 'stressed' suggests that the large UDP packet solution
>is relying on the network feature most likely to fail instead of most
>likely to succeed.

Also it is bad in the receiving host.  With IP fragmentation, each fragment
is held in the IP layer, consuming a network buffer until the datagram is
reassembled or the reassembly times out.


>  We have good evidence that UDP packets smaller than
>512 bytes succeed at a high rate. IMHO, we should optimize to what will
>succeed rather than what is an error correction.

exactly.  for all of the processing efficiencies of larger packets, quick 
DNS-like transactions might be the place to bias towards smaller chunks.


> > >That is, even in the
> > >absence of PMTU discovery, there is still a chance that a larger than
> > >minimum IP packet might still make it through the net unfragmented.
> >
> > But the fact that it might not is the problem.
>
>Correct. It's the recovery from the dropped fragments that causes additional
>network impact and increased delay due to timeouts for the user.

With a dropped fragment, the entire IP datagram must be resent.  An 
application-level, selective-retransmission scheme does not incur that.

Still, there is the minor issue of retaining state in the server application.
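
To put rough numbers on the retransmission difference, here is a back-of-the-envelope sketch in Python. It counts payload bytes only and ignores headers, acknowledgements and timers, so treat it as an illustration of the ratio rather than a model of real traffic. The function name resend_cost is made up.

```python
def resend_cost(total_bytes, chunk_bytes, lost_chunks):
    """Payload bytes retransmitted after losing `lost_chunks` pieces.

    Returns (ip_fragmentation, selective):
      ip_fragmentation: the whole datagram goes again if any fragment is lost
      selective: only the lost application-level chunks go again
    Simplifying assumption: every lost chunk is a full chunk_bytes long.
    """
    ip_fragmentation = total_bytes if lost_chunks > 0 else 0
    selective = min(lost_chunks * chunk_bytes, total_bytes)
    return ip_fragmentation, selective


if __name__ == "__main__":
    # One 512-byte piece lost out of a 4096-byte response.
    print(resend_cost(4096, 512, 1))
```

For a 4096-byte response with one 512-byte piece lost, the fragmented datagram resends all 4096 bytes while the selective scheme resends 512.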


> > Perhaps the way to avoid this mistake is to make DNS use a transport
> > protocol that does not rely on the size of the underlying packets.  The
> > easiest way to do this is a thin layer ON TOP of UDP, that strings them
> > together.
>
>Beyond a simple sequence number and possible checksum in the first octet,
>would anything else be needed? The only thing I can think of is
>some additional octet in the first packet indicating the total number
>of packets sent.

Sounds about right.  The issue is not designing such a mechanism.  It's the 
impact on the servers that I would guess to be the major issue.
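
For concreteness, one possible framing, sketched in Python. The layout is an assumption, not something proposed in this thread: one octet of sequence number, one octet of total count, and a CRC-32 of the payload, carried in every packet rather than only the first, to keep reassembly simple. The 500-byte payload keeps each packet under the 512-byte figure. The names split_message and reassemble are illustrative.

```python
import struct
import zlib

# Assumed header layout: sequence number, total packets, CRC-32 of payload.
HEADER = struct.Struct("!BBI")


def split_message(data, payload_size=500):
    """Cut data into packets, each prefixed with the assumed header."""
    chunks = [data[i:i + payload_size]
              for i in range(0, len(data), payload_size)] or [b""]
    if len(chunks) > 255:
        raise ValueError("message needs more packets than one octet can count")
    return [HEADER.pack(seq, len(chunks), zlib.crc32(chunk)) + chunk
            for seq, chunk in enumerate(chunks)]


def reassemble(packets):
    """Return (message, missing); message is None until all pieces are intact."""
    received = {}
    total = None
    for packet in packets:
        if len(packet) < HEADER.size:
            continue  # too short to carry a header
        seq, count, crc = HEADER.unpack_from(packet)
        payload = packet[HEADER.size:]
        if zlib.crc32(payload) != crc:
            continue  # treat a corrupt packet as lost
        total = count
        received[seq] = payload
    if total is None:
        return None, []
    missing = [seq for seq in range(total) if seq not in received]
    if missing:
        return None, missing
    return b"".join(received[seq] for seq in range(total)), []
```

The missing list is what a receiver would send back to request selective retransmission, and the received table is exactly the per-transaction state that has to live somewhere.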


> > However, adding selective retransmission requires that the server
> > 'assemble' the DNS query and acknowledge its parts selectively.  Hence
> > the server becomes stateful.
>
>And increases server demands to the point that the cost of a truly reliable
>connection becomes the optimal solution, when in truth it ends up being
>overkill.

Not if we are faced with a true requirement for larger application data 
units, as it seems we are (and, frankly, have been for a number of years.)

The debate is how to provide those larger transaction units, not whether.

d/


----------
Dave Crocker <mailto:dave@tribalwise.com>
TribalWise, Inc. <http://www.tribalwise.com>
tel +1.408.246.8253; fax +1.408.850.1850