Re: [DNSOP] Application level DNS message fragmentation
Paul Vixie <paul@redbarn.org> Tue, 09 December 2014 10:58 UTC
Message-ID: <5486D5DC.1050902@redbarn.org>
Date: Tue, 09 Dec 2014 02:58:36 -0800
From: Paul Vixie <paul@redbarn.org>
To: Mukund Sivaraman <muks@isc.org>
References: <20141208083212.GA13206@totoro.home.mukund.org>
In-Reply-To: <20141208083212.GA13206@totoro.home.mukund.org>
Archived-At: http://mailarchive.ietf.org/arch/msg/dnsop/y-W87_7gJoNKX-jyBJqVVCEl56o
Cc: dnsop@ietf.org
Subject: Re: [DNSOP] Application level DNS message fragmentation

the idea is in my opinion relevant, and worthy of pursuit. i have
questions of clarification.

> Mukund Sivaraman <mailto:muks@isc.org>
> Monday, December 08, 2014 12:32 AM
> ...
>
> When a server determines that the response doesn't fit into a single
> datagram (512 or the client's message size), the server splits the reply
> into multiple fragment datagrams (512 or some discovered PMTU that
> works) such that:

there is no reason to support this in non-EDNS. if someone won't upgrade
to EDNS, then (1) we have no responsibility toward improving their DNS
experience, and (2) they probably will not upgrade to this multi-message
proposal either. (arguments of the form, "we want this to work when both
endpoints can do EDNS but the middlebox forbids EDNS", are answered by
noting that middleboxes probably would not permit multi-message DNS,
either.)

i think there is no PMTU that works. marka has fought this battle for a
long time, and he's currently suggesting 1280-(headersize) for IPv6 and
1500-(headersize) for IPv4, period. my hope is that any recommendation
for application-level fragmentation for DNS on UDP/53 would say
"MAX(reliably determined PMTU, MIN(client's offered buffer size, 1500 or
1280 depending on transport protocol))." you might also allow the
initiator to offer an ideal local-interface maximum fragment size and
MIN() against that also..

>
> 1. Each datagram is a DNS reply message with identical header field
> values (except for section counts) and TC=1 in each of them. The ID
> field has the same value among all reply fragments.
>
> 2. Each datagram contains part of the RRs that form the complete reply,
> split on RR boundaries. The DNS header contains the appropriate section
> counts for that datagram. The datagrams need not be equal in size.

splitting an RR-set across messages makes my skin itch. i know it's the
right thing to do and i'm not objecting. just letting you know, somebody
will some day not recognize the OPT code that describes this as a
multi-message transaction, and cache a partial RR-set, and we'll google
the message i am now typing to show them the error of their ways.

>
> 3. An additional RR (plain DNS) or pseudo RR (inside OPT) called
> FRAGMENT is present in every datagram with 2 16-bit fields containing
> the count of fragments, and current fragment. (Though a DNS message is
> limited to 1<<16 octets and a DNS datagram can be at least 512 octets
> long, 16-bit fields are better for fragment count as the datagrams can
> be of different sizes.)

i think the absence of ACK-based timing means that packet trains longer
than 256 packets are too dangerous to contemplate. even with some kind
of application-layer inter-record-gap that's a lot of packets to inject
without needing to hear an OK signal from the remote end. therefore i
suggest two 8-bit fields.

>
> 4. A client that doesn't know about this scheme notices TC=1 and retries
> with TCP. Datagrams other than the first one should be ignored as they
> are duplicate replies with the same message ID.

i think that wastes end-to-end bandwidth, and should be avoided, by
having the initiator solicit (for QUERY) or probe (for UPDATE) using an
EDNS OPT, rather than letting the responder just spew.

>
> 5. A client that is aware of this scheme finds TC=1 and the FRAGMENT RR
> and does reassembly (similar to IP fragment reassembly such as RFC 815),
> DNS messages being limited to 1<<16 octets too.

referencing your later message on this thread, i don't think compression
pointers can be allowed to point out-of-message. so, each message will
form its own string dictionary. if that's what you meant to say then i'm
sorry for misunderstanding you.

>
> This scheme still restricts the size of a single RR to the datagram
> size. Reassembly (unlike IP fragments) doesn't require offsets such as
> used in RFC 815 as RRs are wholly contained inside one datagram.
>
> TSIG can also be made to work with such a scheme on fragment by fragment
> basis.
>
> ----
>
> This scheme is not for replacing TCP. As mentioned above, if a TXT RR
> containing multiple character-strings doesn't fit in a single datagram
> for example, and truncation happens, it'll require TCP. It's not for
> replacing EDNS's large datagram sizes too. But it is possible for EDNS
> replies to overflow path MTU causing loss of replies, and when loss is
> noted, on second attempt, truncation could occur as the message no
> longer fits in reduced datagram size.
>
> Some things can still be served by UDP where possible (without involving
> all the baggage of TCP.. roundtrips for starting SYN/ACK, for most DNS
> requests having the connection remain in slow-start phase, etc.) As an
> example, with a fragment datagram max size of 512, replies could
> traverse a firewall that blocked large replies.
>
> This scheme should be backwards compatible with (ignored by) existing
> implementations. Client implementations of this scheme can also signal
> support with FRAGMENT 0 0.

i'd like to see this coupled to the cookie proposal, so that if cookies
aren't used, then this option is not available.

i'm in moderate support.

-- 
Paul Vixie
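
Two points from the message above can be sketched in code: the suggested
fragment-size rule, and reassembly on RR boundaries using 8-bit
count/index fields. This is only an illustration under assumptions, not
anything specified in the thread: the function names, the tuple layout
(count, index, rrs), 0-based fragment numbering, and the choice to apply
the ceilings without subtracting header sizes are all hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

# Ceilings taken from the formula as quoted in the message; IP/UDP header
# overhead is NOT subtracted here (an assumption of this sketch).
CEILING = {"ipv4": 1500, "ipv6": 1280}


def max_fragment_size(transport: str, client_bufsize: int,
                      pmtu: Optional[int] = None,
                      local_max: Optional[int] = None) -> int:
    """MAX(reliable PMTU, MIN(client buffer size, transport ceiling)),
    optionally MIN()'d against an initiator-offered local maximum."""
    size = min(client_bufsize, CEILING[transport])
    if pmtu is not None:          # only when reliably determined
        size = max(pmtu, size)
    if local_max is not None:     # hypothetical initiator-offered limit
        size = min(size, local_max)
    return size


def reassemble(fragments: Iterable[Tuple[int, int, List[str]]]
               ) -> Optional[List[str]]:
    """Join RRs from (count, index, rrs) fragments in index order.

    Returns None while fragments are still missing. Both fields are
    checked against an 8-bit range; indices are assumed 0-based.
    """
    by_index = {}
    expected = None
    for count, index, rrs in fragments:
        if not (0 < count <= 255 and 0 <= index < count):
            raise ValueError("fragment fields out of range")
        if expected is None:
            expected = count
        elif count != expected:
            raise ValueError("inconsistent fragment count")
        by_index.setdefault(index, rrs)   # ignore duplicate datagrams
    if expected is None or len(by_index) < expected:
        return None
    return [rr for i in range(expected) for rr in by_index[i]]
```

Because RRs are wholly contained in one datagram, reassembly here needs
no byte offsets: fragments are ordered by index and their RR lists are
concatenated. Compression pointers are not modelled, since each message
would carry its own dictionary.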