Re: [DNSOP] Working Group Last Call for Negative Caching of DNS Resolution Failures

Joe Abley <jabley@strandkip.nl> Fri, 30 June 2023 21:33 UTC

Date: Fri, 30 Jun 2023 21:32:43 +0000
To: Tim Wicinski <tjw.ietf@gmail.com>
From: Joe Abley <jabley@strandkip.nl>
Cc: dnsop <dnsop@ietf.org>, dnsop-chairs <dnsop-chairs@ietf.org>
Message-ID: <BFF54FE3-4B0F-4940-8069-BE89F2CDBBAD@strandkip.nl>
In-Reply-To: <CADyWQ+GFhD0_pdC--SJfaQuGL-yJ29okOyRKStLpF31PerQ=HA@mail.gmail.com>
References: <CADyWQ+GFhD0_pdC--SJfaQuGL-yJ29okOyRKStLpF31PerQ=HA@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/xTRg9XUY4eN7yzWqLfZHMH0TAbg>
Subject: Re: [DNSOP] Working Group Last Call for Negative Caching of DNS Resolution Failures

On Jun 21, 2023, at 11:00, Tim Wicinski <tjw.ietf@gmail.com> wrote:

> This starts a Working Group Last Call for draft-ietf-dnsop-caching-resolution-failures
>
> Current versions of the draft is available here:
> https://datatracker.ietf.org/doc/draft-ietf-dnsop-caching-resolution-failures/
>
> The Current Intended Status of this document is: Proposed Standard/Standards Track
>
> Please review the draft and offer relevant comments.
> If this does not seem appropriate please speak out.
> If someone feels the document is *not* ready for publication, please speak out with your reasons.
>
> This starts a two week Working Group Last Call process, and ends on: 5 July 2023

I have read -04. I like it. I think it's useful and sensible and it should be published. Whether this particular rev is ready for publication, I would say, depends on whether the authors disagree with all the pedantic nonsense that follows. They should feel free not to agree.

There are some nits that the authors might want to address or prepare to be outraged at the suggestion of.

The last paragraph in section 1.3 defines terms that are not used in this document except in that paragraph, I think? Perhaps they are vestigial.

Section 2.1 says "Authoritative servers, and more specifically secondary servers, return server failure responses when they don't have any valid data for a zone." I don't think secondary servers are special from the perspective of a client sending a query; such clients cannot usually distinguish between primary and secondary servers, and there is a wealth of well-used authoritative servers deployed that don't use zone transfers anyway. I suggest removing the qualifying "and more specifically secondary servers".

In section 2.1 the phrase "server failure responses" is used in place of SERVFAIL. In section 2.2 return codes are referred to as RCODEs and REFUSED is used in place of "refused return codes" or something. I don't think it matters too much which is used, but it seems like the document should be consistent. I like the all-caps representations, personally, but if you go with that there's an argument that they should be defined, e.g. by means of a reference to whatever the latest dns-terminology draft is in section 1.3.

I wonder whether another subsection of section 2 would be useful to discuss transactions that don't time out, but whose transports return positive indications of failure, e.g. a TCP handshake failure or RST, or a TLS negotiation failure when using DoT or DoH. These are not timeouts, but they also lack RCODEs (since they lack responses). Is it worth suggesting that failures that are transport-specific be cached, e.g. to record that server 192.0.2.1 doesn't respond on TCP so don't bother bombarding it with SYNs? You talk about this in 3.1; perhaps a forward reference from section 2 would be helpful.
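To illustrate the kind of transport-specific caching suggested above, here is a minimal sketch in Python. It assumes a cache keyed on (server address, transport) with a fixed lifetime per entry. The class name, the method names, the transport labels and the 5-second default are all hypothetical and are not taken from the draft.

```python
import time
from typing import Dict, Optional, Tuple

# Hypothetical key: (server address, transport label such as "udp", "tcp", "dot").
Key = Tuple[str, str]


class TransportFailureCache:
    """Remembers that a (server address, transport) pair failed recently."""

    def __init__(self, ttl_seconds: float = 5.0) -> None:
        # ttl_seconds is an illustrative default, not a value from the draft.
        self.ttl_seconds = ttl_seconds
        self._expires_at: Dict[Key, float] = {}

    def record_failure(self, address: str, transport: str,
                       now: Optional[float] = None) -> None:
        # Called on a positive failure indication, e.g. a TCP RST or a
        # failed TLS negotiation, rather than on a timeout.
        now = time.monotonic() if now is None else now
        self._expires_at[(address, transport)] = now + self.ttl_seconds

    def is_failed(self, address: str, transport: str,
                  now: Optional[float] = None) -> bool:
        # True while a cached failure for this exact pair is still fresh.
        now = time.monotonic() if now is None else now
        key = (address, transport)
        expiry = self._expires_at.get(key)
        if expiry is None:
            return False
        if now >= expiry:
            # The entry has aged out, so the transport may be tried again.
            del self._expires_at[key]
            return False
        return True


cache = TransportFailureCache(ttl_seconds=5.0)
cache.record_failure("192.0.2.1", "tcp", now=100.0)
tcp_blocked = cache.is_failed("192.0.2.1", "tcp", now=101.0)
udp_blocked = cache.is_failed("192.0.2.1", "udp", now=101.0)
tcp_blocked_later = cache.is_failed("192.0.2.1", "tcp", now=106.0)
```

Keying on the transport as well as the address means a refused TCP connection to 192.0.2.1 suppresses further SYNs for a while without stopping UDP queries to the same address.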

Section 2 talks at some length about RCODEs like SERVFAIL and REFUSED but is silent on the existence of the extended error option. This may be ok, e.g. since the EDE spec is quick to specify that it "does not change the processing of RCODEs" and since the purpose of your draft is presumably to deal with retry behaviour regardless of whether EDE is supported, but it feels odd not to mention it at all even if it's just to clarify why it's ok for this specification to deal just with RCODEs.

Section 3.1 uses the phrase "a server's transport". I stared at that for quite a bit and I'm not sure I know what it means. I think you're talking about the number of successive transmissions of the same query, using the same transport, that are sent to the same server address. Perhaps that's obvious to other people (or perhaps my interpretation illustrates that there is some ambiguity). Later in this section when you say "timeout value" I think you're talking about how long to wait before identifying a timeout. Again, perhaps that's obvious.

I think the lack of prescriptive direction in section 3.2 concerning precisely how software should implement the required caching is very sensible.

I like the broadening of RFC 4697's advice in section 3.3. I think that's a mistake in 4697. It's good to correct it.

I think the update to RFC 4035 in section 3.4 is similarly sensible.

The recommended sentence for Section 4 according to RFC 8126 is "This document has no IANA actions."

Section 8 contains the sensible instruction "remove [...] before publication". Section 9's instruction is "remove [...] upon publication". I do not know what that means. Hopefully the RFC editor does.

I had forgotten about draft-muks-dnsop-dns-thundering-herd. It's a shame that has expired. If the authors need a volunteer to help bring it back to life, I could help, but it looks pretty good as-is to me. Perhaps it just needs a poll for adoption?

Joe