Re: [DNSOP] DNSOP Call for Adoption - draft-tale-dnsop-serve-stale

Marek Vavruša <mvavrusa@cloudflare.com> Mon, 11 September 2017 18:52 UTC

In-Reply-To: <CA+nkc8CaJ+4_SCwm5Nbvd8r5SaKcTFhRq4jNV8RHp91Mrt5BhA@mail.gmail.com>
References: <CADyWQ+FHDHcmq-mr0BCHS5A8yvaOQmhTjve1_DmZN6vAc=BKyA@mail.gmail.com> <20170907154234.3z2zbju2sciiy7wr@nic.fr> <ybltw0emmvh.fsf@wu.hardakers.net> <8295055.TIQDDEhZcU@localhost.localdomain> <20170907221241.GA1031@puck.nether.net> <20170908020710.0170284A3195@rock.dv.isc.org> <CA+nkc8CaJ+4_SCwm5Nbvd8r5SaKcTFhRq4jNV8RHp91Mrt5BhA@mail.gmail.com>
From: Marek Vavruša <mvavrusa@cloudflare.com>
Date: Mon, 11 Sep 2017 11:52:05 -0700
Message-ID: <CAC=TB12Y8nAPZAFy3WiXskuAzb+3cwR-Kqp_WoG8VPu+6AfuBg@mail.gmail.com>
To: Bob Harold <rharolde@umich.edu>
Cc: Mark Andrews <marka@isc.org>, IETF DNSOP WG <dnsop@ietf.org>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/gRzPomdqcuYE6m_nlEy7EDeNPQI>
Subject: Re: [DNSOP] DNSOP Call for Adoption - draft-tale-dnsop-serve-stale

I support the adoption of this document. Was there a discussion of any
actual downsides besides "I'd like to know if it's stale" and
monitoring?

On Mon, Sep 11, 2017 at 11:11 AM, Bob Harold <rharolde@umich.edu> wrote:
>
> On Thu, Sep 7, 2017 at 10:07 PM, Mark Andrews <marka@isc.org> wrote:
>>
>>
>> Part of the problem is that we have one TTL value for both "freshness"
>> and "don't use beyond".
>>
>> This is fixable.  It is possible to specify two timer values.  It
>> does require adding signaling between recursive servers and
>> authoritative servers, on zone transfers and update requests.
>>
>> You basically add an additional timer field to every record immediately
>> after the TTL field.  This is only returned if the client has
>> signalled support for the extended field.  I suggest using the last
>> DNS header bit for this as you can determine how you will parse the
>> response based on whether the bit is set in the response or not.
>> This field is used to expire records from the cache and its value
>> is set to the TTL field if the server has learnt the record from a
>> server that doesn't support the extension.
>>
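A rough sketch of the parsing rule described above, in Python. The bit mask is an assumption: I am reading "the last DNS header bit" as the one remaining reserved (Z) bit, mask 0x0040 in the 16-bit flags word. The function name and the choice of a 32-bit field for the second timer are also hypothetical; nothing in the proposal fixes them.

```python
import struct

# Assumption: the signalling bit is the remaining reserved (Z) header bit.
EXT_TIMER_BIT = 0x0040


def parse_rr_fixed(flags, buf, offset=0):
    """Parse the part of a resource record that follows the owner name.

    Returns (rtype, rclass, ttl, expire, rdata).  The extra 32-bit
    timer is only read when the response has the signalling bit set.
    """
    rtype, rclass, ttl = struct.unpack_from("!HHI", buf, offset)
    offset += 8
    if flags & EXT_TIMER_BIT:
        # Extended format: second timer sits immediately after the TTL.
        (expire,) = struct.unpack_from("!I", buf, offset)
        offset += 4
    else:
        # Record learnt from a server without the extension:
        # the expiry timer is set to the TTL.
        expire = ttl
    (rdlength,) = struct.unpack_from("!H", buf, offset)
    offset += 2
    rdata = buf[offset:offset + rdlength]
    return rtype, rclass, ttl, expire, rdata
```

The point of keying the layout on a header bit is that the parser knows which record format to expect before it reads the first record.
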
>> The existing TTL field is used for freshness checking.  When a query
>> comes in after that value has expired a freshness check is performed
>> similar to the existing prefetches that happen today.  A TTL of 1
>> is returned unless the original TTL was 0 in which case 0 is returned.
>>
>> New client - new recursive server - new auth servers
>>
>>         example.com. 300 86400 IN A 1.2.3.4
>>
>>                 +300 seconds
>>
>>         example.com. 1 86100 IN A 1.2.3.4
>>          (background query is in process)
>>
>> Old client - new recursive server - new auth servers
>>
>>         example.com. 300 IN A 1.2.3.4
>>
>>                 +300 seconds
>>
>>         example.com. 1 IN A 1.2.3.4
>>          (background query is in process)
>>
>> New client - new recursive server - old auth servers
>>
>>         example.com. 300 300 IN A 1.2.3.4
>>
>>                 +300 seconds
>>          (record has expired from cache,
>>           new query is performed)
>>
>>         example.com. 300 300 IN A 1.2.3.4
>>
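To check my reading of the three cases above, here is a minimal Python sketch of the cache-side logic. The names (CacheEntry, lookup, client_extended) are made up for illustration, and whole-second ages are an assumption.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    ttl: int          # freshness timer, seconds, as received
    expire: int       # "don't use beyond" timer, seconds, as received
    stored_at: float  # time the record entered the cache


def lookup(entry, now, client_extended):
    """Return (ttl, expire, needs_refresh), or None once the entry has expired.

    expire is None for clients that did not signal the extended field.
    """
    age = int(now - entry.stored_at)
    if age >= entry.expire:
        return None  # past "don't use beyond": drop it and do a new query
    remaining_expire = entry.expire - age
    if age < entry.ttl:
        ttl, needs_refresh = entry.ttl - age, False
    else:
        # Freshness timer has run out: answer with TTL 1 (or 0 if the
        # original TTL was 0) and refresh in the background.
        ttl = 0 if entry.ttl == 0 else 1
        needs_refresh = True
    return ttl, (remaining_expire if client_extended else None), needs_refresh
```

With ttl=300 and expire=86400, a lookup at +300 seconds gives (1, 86100, True) for a new client, matching the first example. With expire=300 (old auth servers) the same lookup gives None, so a new query is performed.
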
>> For UPDATE a replacement opcode would be cleanest way to signal the
>> new format is being used.  NOTIMP should be returned by servers
>> that don't support the new opcode.
>>
>> There will be a few broken servers that just echo back the new
>> header bit.
>>
>> This way the authoritative servers still control how long records
>> are stored for.  Dead servers will get a little bit of traffic until
>> the refresh completes.  If the authoritative servers are under
>> attack the clients still see an answer.
>>
>> The alternative is to perform the refresh query and if it fails to
>> complete within X milliseconds return the cached data rather than
>> returning the cached data and doing the refresh in the background.
>>
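The alternative ordering (refresh first, fall back to the cache after X milliseconds) could look roughly like this in Python. This is only a sketch: the function names are hypothetical, the thread pool stands in for whatever the resolver really uses, and treating a failed refresh the same as a slow one is my assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_pool = ThreadPoolExecutor(max_workers=4)


def answer_with_deadline(cached, refresh, timeout_ms):
    """Try the refresh first; return (answer, is_stale).

    If the refresh has not completed within timeout_ms, return the cached
    data.  The refresh keeps running in the pool in the background.
    """
    future = _pool.submit(refresh)
    try:
        return future.result(timeout=timeout_ms / 1000.0), False
    except FutureTimeout:
        return cached, True
    except Exception:
        # Assumption: a refresh that fails outright also falls back to cache.
        return cached, True


def demo_slow_refresh():
    """Stand-in for an upstream query that takes 300 ms."""
    time.sleep(0.3)
    return "fresh answer"
```

Calling answer_with_deadline("cached answer", demo_slow_refresh, 20) returns the cached data marked as stale, because the stand-in query takes longer than the 20 ms budget.
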
>> Mark
>>
>> --
>> Mark Andrews, ISC
>> 1 Seymour St., Dundas Valley, NSW 2117, Australia
>> PHONE: +61 2 9871 4742                 INTERNET: marka@isc.org
>
>
> While I like the idea of a  "don't use beyond" timer, I think it will be a
> very long time before it is widely deployed (and actually configured by zone
> owners), and therefore won't solve our immediate need.  It would be great if
> clients could opt-in, but again I don't see that happening anytime soon.  So
> I would start with resolver-operators deciding what seems best for their
> clients (which is what is happening whether we like it or not).  Adding
> client opt-out/opt-in would be good.   Signalling to say that a response is
> stale would be good.  Adding the second timer (both per-RR and as a zone
> default value, like TTL is handled) would be good.
>
> On a related note - the SOA "expire" timer tells a slave how long to keep
> serving "stale" zone data when the master cannot be reached.  Would that be
> a reasonable default value for how long a resolver should serve "stale" data
> when the authoritative servers cannot be reached?   (Currently I think most
> people set a very high value compared to the TTL.)
>
> --
> Bob Harold