Re: [DNSOP] opportunistic semi-authoritative caching (Re: DNSOP Call for Adoption - draft-tale-dnsop-serve-stale)

Joe Abley <jabley@hopcount.ca> Fri, 08 September 2017 03:08 UTC

From: Joe Abley <jabley@hopcount.ca>
X-Mailer: iPad Mail (14G60)
In-Reply-To: <59B1F467.9010308@redbarn.org>
Date: Thu, 07 Sep 2017 23:08:43 -0400
Cc: dnsop@ietf.org
Content-Transfer-Encoding: quoted-printable
Message-Id: <FAC87A99-5558-4369-ADC0-57E2B7BF0429@hopcount.ca>
References: <59B1F467.9010308@redbarn.org>
To: Paul Vixie <paul@redbarn.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/nhPnoQajgbx8R86ShWiP4_4e7sA>
Subject: Re: [DNSOP] opportunistic semi-authoritative caching (Re: DNSOP Call for Adoption - draft-tale-dnsop-serve-stale)

Apologies in advance for iPad MIME-crime. See below for crimes committed by the workman rather than the tools.

> On Sep 7, 2017, at 21:37, Paul Vixie <paul@redbarn.org> wrote:
> 
> note, there's a proposal contained here.
> 
> Jared Mauch wrote:
>>> On Thu, Sep 07, 2017 at 01:29:47PM -0700, Paul Vixie wrote:
>>> if the draft being considered was clear on two points, i'd support adoption.
>>> 
>>> ...
>> 
>>   Would you see the querying application informing you of intent via
>> option code saying "If I'm unable to talk to you once TTL expires, I may serve
>> your last known good answer"?
> 
> i don't think so. if it was "may i serve your last good answer?" then yes. but with it as "i may" and the ? outside the quotes as shown above, then no.

There's a recursive operator with whom Jared and tale may be familiar that some time ago had a feature called "pinning" whereby particular names that were known to be availability-sensitive (their non-availability caused great disturbance in the helpdesk) could be "pinned" -- that is, in the recursive server, they were configured never to expire from the cache. They could be refreshed, but they would not expire.

I remember a certain personal conflict when I learned about this. The implementation I referred to was edge-centric, in the sense that its goal was to reduce the opportunity for end-users to see breakage due to DNS resolution failure. The recursive operator (and, by proxy, the access provider that was fielding the support calls) could probably find out in short order whether this was a safe thing to do for particular domains, and they chose the domains that should be pinned, but I thought and still think that that was a less desirable workflow than the zone manager signalling what the behaviour should be in the event that authoritative servers couldn't be reached. The implementation was pragmatically good, but I thought it could be better.

The conversation I have seen so far is edge-centric; a recursive operator might serve stale data because it believes that a stale answer is better than no answer in all cases. A more granular approach might allow a recursive server to cache a record with multiple TTLs, indicating (for example) that ideally you would refresh this response after N seconds, but if you can't do that then perhaps M seconds would be fine. A large M would reasonably approximate "quite stale". From the point of view of an authority server operator, telling recursive servers what to do when authority servers are unreachable seems better than having to deal with a variety of behaviours between you and the end-user. This is different from a mechanism whereby a stub or downstream recursive resolver would signal tolerance for stale data in responses.
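
To make the two-TTL idea concrete, here is a minimal sketch in Python of how a recursive server might decide what to do with a cached record. Everything in it is an assumption for illustration: the names `CachedRecord`, `refresh_ttl` (N), `stale_ttl` (M) and `decide` are hypothetical, and I have assumed that both N and M are measured from the moment the answer was cached, which the text above does not specify.

```python
from dataclasses import dataclass


@dataclass
class CachedRecord:
    rdata: str          # the cached answer
    stored_at: float    # time (seconds) at which the answer was cached
    refresh_ttl: int    # N: ideally refresh after this many seconds
    stale_ttl: int      # M: upper bound if no refresh is possible (assumed M >= N)


def decide(record: CachedRecord, now: float, authority_reachable: bool) -> str:
    """Return what the recursive server should do with this cache entry.

    Hypothetical policy, with both TTLs counted from stored_at:
      "fresh"   - answer from cache, no refresh needed yet
      "refresh" - N has passed and the authority servers answer, so re-query
      "stale"   - N has passed, authority unreachable, but still within M
      "fail"    - M has passed and no refresh was possible
    """
    age = now - record.stored_at
    if age < record.refresh_ttl:
        return "fresh"
    if authority_reachable:
        return "refresh"
    if age < record.stale_ttl:
        return "stale"
    return "fail"
```

With N = 300 and M = 86400, a record cached at time 0 would be served from cache until 300 seconds, refreshed after that whenever the authority servers respond, and served stale for up to a day if they do not.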

This could perhaps be implemented as an EDNS option used in the request and corresponding response between a recursive server and an authority server. Queries for names in zones whose authority servers were not configured to supply the option, or queries sent by recursive servers that were not prepared to do the necessary, would be handled conventionally.
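
As a rough illustration only, the response side of such an option might be encoded as below. The generic EDNS option layout (16-bit OPTION-CODE, 16-bit OPTION-LENGTH, then OPTION-DATA) is from RFC 6891; everything else is an assumption. The option code 65001 is simply taken from the range RFC 6891 reserves for local/experimental use, the payload format (a single 32-bit M value in seconds) is invented, and the function names are hypothetical. The request side, where a recursive server would signal that it understands the option, is not shown.

```python
import struct
from typing import Optional

# Hypothetical option code, from the RFC 6891 local/experimental range.
STALE_TTL_OPTION_CODE = 65001


def encode_stale_ttl_option(stale_ttl: int) -> bytes:
    """Encode one EDNS option carrying M as a 32-bit unsigned integer."""
    data = struct.pack("!I", stale_ttl)
    # OPTION-CODE and OPTION-LENGTH are both 16-bit, network byte order.
    return struct.pack("!HH", STALE_TTL_OPTION_CODE, len(data)) + data


def decode_stale_ttl_option(opt_rdata: bytes) -> Optional[int]:
    """Walk the options in an OPT record's RDATA and return M if present.

    Returns None when the option is absent or malformed, in which case the
    caller would handle the response conventionally.
    """
    offset = 0
    while offset + 4 <= len(opt_rdata):
        code, length = struct.unpack_from("!HH", opt_rdata, offset)
        offset += 4
        value = opt_rdata[offset:offset + length]
        offset += length
        if code == STALE_TTL_OPTION_CODE and len(value) == 4:
            return struct.unpack("!I", value)[0]
    return None
```

Returning `None` for a missing option mirrors the fallback described above: if either side does not take part, nothing changes from ordinary TTL handling.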


Joe