Re: Transport requirements for DNS-like protocols

Michael Mealling <michael@neonym.net> Fri, 28 June 2002 13:48 UTC

Date: Fri, 28 Jun 2002 09:47:02 -0400
From: Michael Mealling <michael@neonym.net>
Subject: Re: Transport requirements for DNS-like protocols
In-reply-to: <20020628034133.D50C518AC@thrintun.hactrn.net>
To: Rob Austein <sra@hactrn.net>
Cc: ietf-irnss@lists.elistx.com
Reply-to: Michael Mealling <michael@neonym.net>
Message-id: <20020628094702.W24592@bailey.dscga.com>
References: <199812050411.UAA00462@daffy.ee.lbl.gov> <vern@ee.lbl.gov> <20020628034133.D50C518AC@thrintun.hactrn.net>

On Thu, Jun 27, 2002 at 11:41:33PM -0400, Rob Austein wrote:
> This is the piece on DNS transport requirements to which I referred in
> the previous message, written as input to a BOF in 1998 that I wasn't
> able to attend in person.  It could be construed as off-topic for
> IRNSS per se, but Michael did ask, so for what it's worth....

I wouldn't say it's off-topic, especially since you even discuss
something like IRNSS in there. I don't want to spend the entire
time on this point, but a short discussion might
help me with the document I'm writing...

> DNS has somewhat peculiar transport requirements (not news to you
> folks, obviously).  There are several interwoven factors at work here:
> 
> a) The relatively huge number of DNS clients compared to the relatively
>    small number of DNS servers, particularly as one gets close to the
>    root of the DNS tree (root zone itself, TLDs, big SLDs, etc).
> 
> b) The idempotence of normal DNS queries, and the relatively small
>    amount of work that a DNS server has to do in order to process a
>    normal query.

Which is something any designer of a Layer 2 service should keep in mind.

> c) The relatively low probability that any particular DNS response
>    message will be dropped by the network.

Is this due to its size? I.e., can that be rewritten as:

 c) The relatively low probability that any particular 512-byte UDP packet
    will be dropped by the network.


> Taken together, these factors suggest that DNS as we currently know it
> is a classic example of what should be a "stateless" protocol.  The
> server is a critical resource and expects to receive *lots* of
> queries, a vanishingly small number of which will be from any single
> client.  It's not a big deal for the server to perform the entire
> query operation again.  The result is that the aggregate cost to the
> network of using an unreliable transport and recomputing responses
> when retransmission is necessary is less than the cost would be for
> the server to maintain *any* kind of state on behalf of its clients,
> including the kind of state required for even the most lightweight of
> reliable transport protocols.  This has a number of implications,
> perhaps the most troubling of which is the relative uselessness of
> attempting to do conventional path-MTU discovery.

This was one of my conclusions as well. RESCAP also fits
this same model. The cost of re-querying the server is lower than
that of just about any other technique...
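
A rough way to put a number on that, as a sketch only: if each complete
query/response exchange is lost independently with probability p, the
number of attempts until one succeeds follows a geometric distribution,
so the client sends 1/(1-p) queries on average. The function name and
the independence assumption are mine, not anything from Rob's note.

```python
def expected_attempts(loss_probability: float) -> float:
    """Mean number of query attempts until one exchange succeeds.

    Assumes (hypothetically) that each attempt is lost independently
    with the same probability, i.e. a geometric distribution.
    """
    if not 0.0 <= loss_probability < 1.0:
        raise ValueError("loss_probability must be in [0, 1)")
    return 1.0 / (1.0 - loss_probability)
```

Under that assumption the extra load from re-querying stays small while
loss is low, and the server keeps no per-client state at all.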

> Therefore, horrifying as it may be to all right-thinking engineers,
> one can make a very strong case that the correct transport protocol
> for normal DNS queries is exactly what we have now, even though it
> implies that the correct way of handling bigger DNS response packets
> is via IP fragmentation.  This topic comes up regularly in the DNS
> working groups, but so far the consensus has been that while UDP is a
> terrible transport protocol for DNS, the known alternatives are worse.

Ok, this and the rest of the paragraphs assume IP fragmentation. The
question I have is this: since there is no congestion control at the
IP layer for fragments, and IP fragmentation seems to be acceptable,
why not do the fragmentation at the application layer? Limit the packet
size to the minimum MTU for that network and don't retransmit packets.
I'm not a transport guru, but it seems that all of the congestion
control issues come about because of packet retransmission. If you
just don't _do_ packet-level retransmission, then congestion control
becomes a non-issue.

In other words, instead of negotiating packet size in the first packet,
have it hardwired to 512 bytes but send multiple packets with
simple sequence numbers so the application can just piece 'em back
together. If it fails you can retry the query via TCP (or UDP
if you feel it might work the second time). But you never ask for individual 
packets to be retransmitted, and you never ACK. 

It's the same network profile as IP fragmentation, but with a much
better chance of success at message sizes greater than 1500 bytes.
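
To make the idea concrete, here is a minimal sketch of the
split-and-reassemble step. The 6-byte header layout (message id,
sequence number, fragment count) and the function names are
assumptions for illustration only; nothing here is a defined wire
format. Each datagram is capped at 512 bytes, there are no ACKs and no
per-packet retransmission, and an incomplete set simply yields None so
the caller can retry the whole query.

```python
import struct
from typing import Iterable, List, Optional

MAX_DATAGRAM = 512                         # hardwired datagram size
HEADER = struct.Struct("!HHH")             # assumed: msg id, seq, total
MAX_PAYLOAD = MAX_DATAGRAM - HEADER.size   # 506 bytes of data per datagram


def split_message(message_id: int, payload: bytes) -> List[bytes]:
    """Split a response into datagrams of at most 512 bytes each."""
    chunks = [payload[i:i + MAX_PAYLOAD]
              for i in range(0, len(payload), MAX_PAYLOAD)] or [b""]
    total = len(chunks)
    return [HEADER.pack(message_id, seq, total) + chunk
            for seq, chunk in enumerate(chunks)]


def reassemble(datagrams: Iterable[bytes], message_id: int) -> Optional[bytes]:
    """Rebuild the response from datagrams received in any order.

    Returns None if any fragment is missing; the caller then retries
    the whole query (over TCP or UDP) instead of asking for one packet.
    """
    parts = {}
    total = None
    for datagram in datagrams:
        if len(datagram) < HEADER.size:
            continue                       # too short to carry a header
        mid, seq, count = HEADER.unpack_from(datagram)
        if mid != message_id or count == 0 or seq >= count:
            continue                       # other message or bad header
        if total is None:
            total = count
        elif count != total:
            continue                       # inconsistent fragment count
        parts[seq] = datagram[HEADER.size:]
    if total is None or len(parts) != total:
        return None
    return b"".join(parts[seq] for seq in range(total))
```

A client would collect datagrams for a given message id until a timeout,
call reassemble(), and on None fall back to re-sending the query.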

> In the long term, if we ever get a real white pages protocol and
> people stop caring about having cute DNS names, we're still going to
> need an underlying system for associating long-term identifiers with
> IP addresses.  At a technical level, such a system will probably look
> an awful lot like the DNS, but perhaps it'll be sufficiently
> decentralized that a reliable transport protocol wouldn't be such a
> burden on the servers.

Awfully prescient!

-MM

-- 
--------------------------------------------------------------------------------
Michael Mealling	|      Vote Libertarian!       | urn:pin:1
michael@neonym.net      |                              | http://www.neonym.net