Re: [DNSOP] partial glue is not enough, I-D Action: draft-ietf-dnsop-glue-is-not-optional-00.txt

Paul Vixie <paul@redbarn.org> Thu, 02 July 2020 07:47 UTC

From: Paul Vixie <paul@redbarn.org>
To: dnsop@ietf.org
Cc: John Levine <johnl@taugh.com>
Date: Thu, 02 Jul 2020 07:47:36 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/7FareWuVRxMzlmcdDTbgEeGlS0A>

On Thursday, 2 July 2020 01:18:16 UTC John Levine wrote:
> In article <9056955.dJ39pTEj9z@linux-9daj> you write:
> >On Wednesday, 1 July 2020 09:41:49 UTC Jan Včelák wrote:
> >> ...
> >
> >i think if you're using round robin or random selection, a subset is fine.
> >if we had to codify this practice, i'd ask that at least two address
> >records of each available kind be included (so, two AAAA's, two A's) or
> >else set TC=1.

> I really don't like this. If you do that, you're going to have
> failures when there are working servers but none of their addresses
> happen to be in the glue subset in the response, and without TC=1
> there's no hint that there's more glue if you retry.

this is the draft where that issue would be decided, so it's good we're 
talking about it. there are subtleties to the proposal you quoted, such that:

1. it is not possible to fetch the glue directly from the server that sent you 
the silently truncated delegation; all you can get from it is another referral.

2. you'll get a different subset of the available glue each time you retry, 
due to random ordering or round-robin.

3. even without TC=1 you will know there's under-zonecut glue you didn't 
receive, because you saw the NS RR, and the only path to the address RR is 
through that NS RRset.

any full resolver who doesn't like the conditions detected in #3 above is free 
to either retry as in #2 above, or just fetch with TCP, since the silent 
truncation is unambiguously implied.
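
a minimal sketch of the check in #3 and the choice that follows it, in python. 
the function names, the dict shape used for the additional section, and the 
returned strings are all assumptions made for illustration, not anything from 
the draft. note that it only catches NS names that arrived with no address at 
all; a name that arrived with only some of its addresses looks complete here.

```python
from typing import Dict, List


def labels(name: str) -> List[str]:
    """split a domain name into lowercase labels; the root yields []."""
    s = name.lower().rstrip(".")
    return s.split(".") if s else []


def is_at_or_below(name: str, cut: str) -> bool:
    """true if `name` is at or below the zone cut `cut`."""
    n, c = labels(name), labels(cut)
    return not c or n[-len(c):] == c


def ns_missing_glue(cut: str, ns_names: List[str],
                    glue: Dict[str, List[str]]) -> List[str]:
    """NS targets under the cut for which the referral carried no address.

    `glue` maps owner name -> addresses seen in the additional section
    (hypothetical shape, chosen for this sketch).
    """
    have = {".".join(labels(owner)) for owner, addrs in glue.items() if addrs}
    return [ns for ns in ns_names
            if is_at_or_below(ns, cut) and ".".join(labels(ns)) not in have]


def next_step(tc: bool, missing: List[str], some_server_answered: bool) -> str:
    """one possible initiator policy; the strings are illustrative only."""
    if tc:
        return "retry over tcp"
    if not missing or some_server_answered:
        return "use the glue received"
    # silent truncation is implied: ask again (a new subset) or go to tcp
    return "re-query (udp for a different subset, or tcp)"
```

for a referral at example.com. listing ns1 and ns2 under the cut but carrying 
an address only for ns1, ns_missing_glue() returns ns2, and next_step() asks 
for a re-query only once nothing received so far has answered.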

> If a response with TC=1 has at least one record in the additional
> section, that tells the client that the missing records are all glue.

true to form, BIND4/8 did that, and no harm came of it, but it's difficult to 
standardize, and it is not an optimization; it's a definite signal pattern meant 
to convey meaning. while RFC 1035 does describe response construction order, that 
would have to be reiterated in case some responder has been adding things 
column-wise rather than row-wise, and has never had a complaint about it until 
the day somebody starts to depend on actual RFC 1035 RRset stuffing order. this 
feels like pushing on a rope, but i know we have to do that sometimes.

> So I think it would be OK in that case for the client to use what it's
> got, but remember that if it can't contact any of the NS with the
> A/AAAA it's got, it can go back and get the rest.

...with TCP, i hope you meant.

> Remember, if it's glue, there's no other way to get it. If it's worth
> returning glue at all, it's worth providing all of it.

until someone invents faster-than-light travel, round trips and remote state 
will be the second and third most expensive things on the internet. (the most 
expensive thing is complexity.) i think we can usefully discuss whether to set 
TC=1 if the only thing that won't fit is glue, but some glue did fit. but our 
goal should be to allow smart initiators to avoid retrying with TCP out of 
reflex. my recommendation of TC=0 is to suppress reflexive TCP retries.
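
the responder side of that trade-off might look like the sketch below. it is 
one reading of the policy under discussion, not text from the draft: the 
floor of two address records per kind comes from the earlier message quoted 
above, and the byte costs assume every glue owner name compresses to a 2-byte 
pointer (10 bytes of fixed RR fields plus 4 or 16 bytes of rdata). pick_glue 
and its parameters are hypothetical names.

```python
import random
from typing import List, Optional, Tuple

A_COST = 2 + 10 + 4      # compressed owner + fixed RR fields + IPv4 rdata
AAAA_COST = 2 + 10 + 16  # compressed owner + fixed RR fields + IPv6 rdata


def pick_glue(a: List[str], aaaa: List[str], room: int,
              min_per_kind: int = 2,
              rng: Optional[random.Random] = None
              ) -> Tuple[List[Tuple[str, str]], bool]:
    """return (glue records to send, TC bit) for `room` bytes of space."""
    rng = rng or random.Random()
    a, aaaa = list(a), list(aaaa)
    rng.shuffle(a)       # random selection, so retries see other subsets
    rng.shuffle(aaaa)

    need_a = min(min_per_kind, len(a))
    need_aaaa = min(min_per_kind, len(aaaa))
    used = need_a * A_COST + need_aaaa * AAAA_COST
    if used > room:
        # the floor does not fit: signal truncation explicitly
        # (this sketch sends no glue at all in that case)
        return [], True

    chosen = ([("A", x) for x in a[:need_a]] +
              [("AAAA", x) for x in aaaa[:need_aaaa]])
    # spend whatever room is left on extra records, still with TC=0
    for kind, cost, rest in (("AAAA", AAAA_COST, aaaa[need_aaaa:]),
                             ("A", A_COST, a[need_a:])):
        for addr in rest:
            if used + cost <= room:
                chosen.append((kind, addr))
                used += cost
    return chosen, False
```

with four A and four AAAA records available and 100 bytes of room, the floor 
costs 88 bytes, nothing more fits, and the response goes out with a subset 
and TC=0; with 80 bytes of room the floor itself does not fit and TC is set.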

-- 
Paul