Re: [DNSOP] [Ext] partial glue is not enough, I-D Action: draft-ietf-dnsop-glue-is-not-optional-00.txt

Shane Kerr <shane@time-travellers.org> Fri, 03 July 2020 08:17 UTC

To: dnsop@ietf.org
References: <20200702011816.D4B0D1C3CD10@ary.qy> <2843010.V8yvLItfke@linux-9daj> <alpine.OSX.2.22.407.2007020949360.96330@ary.qy> <6402649.6cnN7U4pX5@linux-9daj> <20200702154841.GA83916@jurassic.vpn.mukund.org> <F71B8055-415E-4DF4-8089-04AA1445269D@icann.org> <CAH1iCirMLsLmohChQCvqiS6ra0MYK40eJsDm_B5pMXAgXRnEpg@mail.gmail.com>
From: Shane Kerr <shane@time-travellers.org>
Message-ID: <6a311259-33f8-8c4b-4c99-6d8d4af6cc73@time-travellers.org>
Date: Fri, 3 Jul 2020 10:17:27 +0200
In-Reply-To: <CAH1iCirMLsLmohChQCvqiS6ra0MYK40eJsDm_B5pMXAgXRnEpg@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/KMoKW4ss9l8mhtajjkpl4027800>

Brian,

Thanks for the interesting idea. Apologies for the rambling response below.

On 02/07/2020 19.16, Brian Dickson wrote:
> 
> 
> On Thu, Jul 2, 2020 at 9:14 AM Paul Hoffman <paul.hoffman@icann.org> wrote:
> 
>     The interpretation of whether a partial RRset is allowed by
>     1035/2181 made by JohnL, PaulV, and MukundS are all plausible and
>     conflicting. RFC 1035 and RFC 2181 are unclear about whether an
>     RRset that is required in a reply can be partial.
> 
>     draft-ietf-dnsop-glue-is-not-optional as it stands is probably not
>     the best place to update the understanding of the standards-level
>     relationship between partial RRsets, the TC bit, and what parts of a
>     response are required. Doing so is adventurous, time-consuming, and
>     will almost certainly cause multiple current implementations to be
>     out of compliance.
> 
>     It is probably still worth doing, albeit carefully. A bad outcome
>     would be finishing the document due to exhaustion instead of consensus.
> 
> 
> I agree with Paul H in this regard.
> 
> Here's how I see it: the update to 1035 is necessary, but not sufficient 
> (probably).
> 
> Note that 1035 itself predates EDNS, so advice on TC alone is good, but 
> for the population of DNS implementations doing EDNS, perhaps we could 
> take advantage of its existence?
> 
> There are a whole bunch of unused bits in the core element of the OPT RR 
> (the place where the DO bit exists). That would be an excellent (IMHO) 
> place to signal the situation here (partial glue truncation).
> 
> Such a bit would hopefully disambiguate the cases where TC=1 is set, 
> allowing for graceful handling (try to use the available glue, prepare 
> for the failure case and retry over TCP if it does fail).
> 
> Thoughts?

I guess a resolver can infer that it has a full set of glue if there are
A and AAAA records for every NS; otherwise the glue _might_ be
truncated. This is kind of suckful because many servers (most?) don't
have IPv6, and some (I run a few, .biz used to have one) don't have IPv4.

Anyway the basic approach of a resolver would be:

1. If I get back a response that has what appears to be a full set of 
glue (A and AAAA for every in-bailiwick NS) then I don't have to worry 
about missing glue.

2. If I get back a response that might be missing glue and there is an 
EDNS option "I support glue TC" that is not set, then I don't have to 
worry about missing glue.

3. If I get back a response that might be missing glue and there is an 
EDNS option "I support glue TC" that is set, then:
3.a. I can optionally try the servers that are in the glue that I did 
get first.
3.b. At some point I can try to get more glue using TCP.

4. If I get back a response that might be missing glue and there is no
"I support glue TC" EDNS option, then I am basically in the same
place as #3, except that I might be wasting that TCP retry because it
might not return additional glue.
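
To make the four cases above concrete, here is a rough sketch in Python.
Everything in it is hypothetical: the "I support glue TC" option does not
exist, and the names GlueTc, has_full_glue and decide are made up for
illustration. Glue is modelled as a dict from NS name to the set of
address types seen in the additional section.

```python
from enum import Enum


class GlueTc(Enum):
    """State of the hypothetical "I support glue TC" EDNS option."""
    ABSENT = 0    # option not present in the response (case 4)
    NOT_SET = 1   # option present but not set (case 2)
    SET = 2       # option present and set (case 3)


def _norm(name):
    # Compare names case-insensitively and without the trailing dot.
    return name.rstrip(".").lower()


def in_bailiwick(ns_name, zone):
    ns, z = _norm(ns_name), _norm(zone)
    return z == "" or ns == z or ns.endswith("." + z)


def has_full_glue(zone, ns_names, glue):
    """True if every in-bailiwick NS has both A and AAAA glue (case 1)."""
    seen = {_norm(name): types for name, types in glue.items()}
    return all(
        {"A", "AAAA"} <= seen.get(_norm(ns), set())
        for ns in ns_names
        if in_bailiwick(ns, zone)
    )


def decide(zone, ns_names, glue, signal):
    """Return what the resolver should do next, following cases 1 to 4."""
    if has_full_glue(zone, ns_names, glue):
        return "done"                       # case 1: nothing can be missing
    if signal is GlueTc.NOT_SET:
        return "done"                       # case 2: server says no truncation
    if signal is GlueTc.SET:
        return "use-glue-then-tcp"          # case 3: TCP will return more glue
    return "use-glue-then-tcp-unsure"       # case 4: TCP retry may be wasted
```

The only difference between the last two outcomes is how confident the
resolver can be that the TCP retry is worth it.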

Because it is so hard for a resolver to know if there is actually glue
missing, I think there _might_ be benefit to such an option. It could
_potentially_ save a lot of wasted queries, mostly looking for
non-existent AAAA records.

OTOH since this is about an EDNS option, it seems simpler to just
realize that you've got at least 1280 bytes in the reply and that it
will mostly not need truncation. It takes a lot of NS/A/AAAA records to
fill up even a 1280 byte response (the root servers have 13 NS with A &
AAAA for each, and the response is between 811 and 1103 bytes, depending
on who you ask and which of their server instances you get at any given
time):

a -> 811
b -> 839
c -> 839
d -> 811
e -> 811
f -> 839
g -> 839
h -> 811
i -> 851
j -> 811
k -> 823
l -> 1003
m -> 811

I suppose it is possible that someone uses something like:

foo.example NS really-long-name.foo.example
             NS different-but-still-long.foo.example
             NS filling-up-dat-packet-tho.foo.example
                  .
                  .
                  .

And exceeds the space for glue.

Or maybe dnscurve:

bar.example NS uz5bcx1nh80x1r17q653jf3guywz7cmyh5jv0qjz0unm56lq7rpj8l.bar.example
             NS kxmvjtvks4rr576myswk7c6hdyedarthudoorz673crgktq5n6nmaq.bar.example
             NS tn2krco3pfhogkgwgswvrvp6dkuawmezyplgnu4f7wlajqv2gppp5t.bar.example
                  .
                  .
                  .

Even in those cases you need a lot of NS though.
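
For a rough idea of how many NS "a lot" is, here is a back-of-the-envelope
estimate in Python. It is only a sketch under several assumptions of mine:
the query name is the zone name itself, every NS is a single label directly
under the zone, name compression is ideal, each NS has exactly one A and
one AAAA glue record, the response carries an OPT RR with no options, and
there are no DNSSEC records. The function names are made up.

```python
HEADER = 12     # fixed DNS header
RR_FIXED = 10   # TYPE (2) + CLASS (2) + TTL (4) + RDLENGTH (2)
POINTER = 2     # compression pointer
OPT_RR = 11     # root owner name (1) + fixed fields (10), empty RDATA


def wire_len(name):
    """Uncompressed wire length of a domain name."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(1 + len(label) for label in labels) + 1


def referral_size(zone, host_labels, edns=True):
    """Estimated size of a referral with A and AAAA glue for every NS."""
    size = HEADER + wire_len(zone) + 4           # question: QNAME, QTYPE, QCLASS
    for label in host_labels:
        target = 1 + len(label) + POINTER        # host label + pointer to zone
        size += POINTER + RR_FIXED + target      # NS record
        size += POINTER + RR_FIXED + 4           # A glue
        size += POINTER + RR_FIXED + 16          # AAAA glue
    if edns:
        size += OPT_RR
    return size


def min_ns_to_exceed(zone, label_len, limit=1280):
    """Smallest number of NS (with labels of label_len) that exceeds limit."""
    count = 1
    while referral_size(zone, ["x" * label_len] * count) <= limit:
        count += 1
    return count
```

Under those assumptions, min_ns_to_exceed("foo.example", 16) gives 17 for
names the length of "really-long-name", and
min_ns_to_exceed("bar.example", 54) gives 11 for 54-character
DNSCurve-style labels. Real responses will differ, but the order of
magnitude should hold.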

Anyway, I guess this is basically all to convince myself that:

1. We probably don't need an EDNS option to signal just glue truncation
2. We should probably set TC when we do truncate glue

Cheers,

--
Shane