[DNSOP] Quick review of draft-dwmtwc-dnsop-caching-resolution-failures-00
Mukund Sivaraman <muks@mukund.org> Tue, 12 July 2022 15:24 UTC
Date: Tue, 12 Jul 2022 20:54:04 +0530
From: Mukund Sivaraman <muks@mukund.org>
To: dnsop@ietf.org
Message-ID: <Ys2SFN8QJkrRAyAz@d1>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/rzxIZw7th1zRZ01FLsHGZa_GIVw>
Subject: [DNSOP] Quick review of draft-dwmtwc-dnsop-caching-resolution-failures-00

Some comments quickly browsing this draft, as we're handling a quirky
issue around NS timeouts and it looked relevant.

Firstly, some resolver implementations do cache upstream NS timeouts in
various non-standard ways. The resolver I work on has at least 3-4
different mechanisms within the same codebase. Documentation on how
timeouts should be handled seems good, so I support this draft.

> Internet Engineering Task Force                          D. Wessels
> Internet-Draft                                           W. Carroll
> Intended status: Standards Track                          M. Thomas
> Expires: 17 July 2022                                      Verisign
>                                                     13 January 2022

>               Negative Caching of DNS Resolution Failures
>            draft-dwmtwc-dnsop-caching-resolution-failures-00

[snip]

>    [RFC4697] is a Best Current Practice that documents observed
>    resolution misbehaviors.  It describes a number of situations that
>    can lead to excessive queries from recusrive resolvers. including:

There's a spelling mistake in "recusrive", and the period after
"resolvers." should be removed.

[snip]

> 3.2.  TTLs

>    Resolvers MUST cache resolution failures for at least 5 seconds.
>    Resolvers SHOULD employ an exponential backoff algorithm to increase
>    the amount of time for subsequent resolution failures.  For example,
>    the initial negative cache TTL is set to 5 seconds.  The TTL is

I am guessing the authors meant to write "timeout cache TTL" here
instead of negative cache TTL. In any case, the phrase "negative cache
TTL" has a well-understood meaning per RFC 2308, and should not be
overloaded/reused to indicate timeout cache TTL.

[snip]

> 3.3.  Scope

>    Resolution failures MUST be cached against the specific query tuple
>    <query name, type, class, server IP address>.

Have you considered the effect of caching the timeout against just an
upstream server's IP address? I'm not saying you should, but wondering
if any of the other tuple fields are relevant to have separate
more-specific timeout cache entries.

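To make the two scoping choices concrete, here is a minimal sketch that
is not taken from the draft or from any resolver. The names
FailureCache, tuple_key and server_key are hypothetical, and the fixed
5 second TTL simply reuses the draft's stated minimum with no backoff.

```python
from typing import Callable, Dict, Hashable, Tuple

# (query name, type, class, server IP address), as in the draft's section 3.3.
Query = Tuple[str, str, str, str]

MIN_FAILURE_TTL = 5.0  # seconds; the draft's minimum, used here as a fixed TTL


def tuple_key(q: Query) -> Hashable:
    """Scope a failure to the full query tuple."""
    return q


def server_key(q: Query) -> Hashable:
    """Scope a failure to the upstream server's IP address only."""
    return q[3]


class FailureCache:
    """Hypothetical timeout cache; key_fn selects the caching scope."""

    def __init__(self, key_fn: Callable[[Query], Hashable],
                 ttl: float = MIN_FAILURE_TTL) -> None:
        self._key_fn = key_fn
        self._ttl = ttl
        self._expiry: Dict[Hashable, float] = {}

    def record_timeout(self, q: Query, now: float) -> None:
        # Remember the failure until now + ttl.
        self._expiry[self._key_fn(q)] = now + self._ttl

    def is_suppressed(self, q: Query, now: float) -> bool:
        # True while an unexpired failure entry covers this query.
        expiry = self._expiry.get(self._key_fn(q))
        return expiry is not None and now < expiry


a_query = ("example.org.", "A", "IN", "10.0.0.1")
type65_query = ("example.org.", "TYPE65", "IN", "10.0.0.1")

per_tuple = FailureCache(tuple_key)
per_tuple.record_timeout(type65_query, now=0.0)

per_server = FailureCache(server_key)
per_server.record_timeout(type65_query, now=0.0)
```

With these assumptions, a TYPE65 timeout recorded in per_tuple leaves
the A query to the same server unaffected, whereas the same timeout
recorded in per_server suppresses every query to 10.0.0.1 until the
entry expires.
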
In other words, is it necessary for there to be a distinction among
timeouts for:

(1) example.org., A, IN, 10.0.0.1
(2) example.org., TYPE65, IN, 10.0.0.1
(3) example.com., A, IN, 10.0.0.1

Traditionally, a resolver's upstream RTTs and timeouts are tracked
against the nameserver IP address. A failure to respond has been
considered as a property of the NS (implementation) or path to that NS.

My colleagues are handling an issue where an authoritative nameserver
does not respond to TYPE65 queries, but responds to queries for common
query types such as address records. In this case, without mitigating
with controls, the resolver is a little stumped and keeps attempting to
contact the upstream NS because it receives some responses from it. The
queries for which there are no responses eventually end up waiting for
the maximum timeout limit because the resolver keeps trying to talk to
it. On a busy resolver, these queries consume resources.

We could consider the upstream NS as "bad" if it appears to respond to
some queries but doesn't respond to others with some response. But
one-off or transient timeouts can occur sometimes due to network packet
loss.

In our case, if the resolver were to block this zone's upstream NSs as
bad, it wouldn't be able to respond to any queries within that zone
(even address records). It appears to be a popular country-level zone,
and it's unlikely the upstream operators will fix it to respond to
TYPE65 queries in the short-term. In such cases, a heavy-handed approach
may not be practical.

		Mukund