Re: [DNSOP] comments on draft-tale-dnsop-serve-stale-00

Dave Lawrence <tale@dd.org> Tue, 27 June 2017 17:15 UTC

Message-ID: <22866.37576.448814.454933@gro.dd.org>
Date: Tue, 27 Jun 2017 13:15:52 -0400
From: Dave Lawrence <tale@dd.org>
To: dnsop <dnsop@ietf.org>
In-Reply-To: <CAJE_bqebjKFEvWEQbHM49sr_BgFEf8PtrnFWWPphSttFU+aQ8A@mail.gmail.com>
References: <CAJE_bqebjKFEvWEQbHM49sr_BgFEf8PtrnFWWPphSttFU+aQ8A@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/HELn35wRU09MuRAjC2bu7er7H9A>
Subject: Re: [DNSOP] comments on draft-tale-dnsop-serve-stale-00

Thank you for the feedback, Jinmei!  Now that Akamai finally posted
its IPR notice we can work on moving it forward.  We plan on asking
for WG adoption now that -01 is up.

https://datatracker.ietf.org/doc/draft-tale-dnsop-serve-stale/

Jinmei wrote:
> - I suspect it should include 'Updates: 1035 (if approved)' in the
> top boilerplate.

Done.  1034 too.

>   The recommended value of the client response timer is 1.8 seconds,
>   so end clients will see this amount of delay for queries for which
>   this technique is needed (most notably while the corresponding
>   authoritative servers are under a DoS attack and unreachable).  I
>   wonder whether this is really acceptable in terms of user
>   experience.  According to the draft this implementation has been
>   actually used in the field (correct?).  If so, were the end users
>   okay with the delay?

It's hard for me to address directly what "okay" is.  I can only say
that they got service when otherwise there would have been none.

We're certainly open to discussing timer values.  In fact, Warren has
previously told me that he doesn't think the values are right either.
We might end up with some different recommendations based on
particular use-cases.  At any rate, they're all "SHOULD"s, so people can
adjust them for whatever their own reasons are.
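
To make the timer discussion concrete, here is a minimal sketch of how a
client response timer could gate the fallback to stale data.  It is only
an illustration, not the implementation described in the draft: the
function name answer_with_fallback, the refresh callable, and the
thread-pool approach are all assumptions made for this example.  Only
the 1.8 second default comes from the draft.

```python
import concurrent.futures

# Recommended value discussed in this thread; the draft makes it a SHOULD.
CLIENT_RESPONSE_TIMER = 1.8  # seconds

# Hypothetical worker pool used to run refresh attempts in the background.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def answer_with_fallback(refresh, stale_answer, timer=CLIENT_RESPONSE_TIMER):
    """Return (answer, is_stale).

    refresh      -- hypothetical callable that queries the authoritative
                    servers and returns a fresh answer
    stale_answer -- expired cache data for the name, or None if none is held
    timer        -- how long the client waits before stale data is used
    """
    future = _pool.submit(refresh)
    try:
        return future.result(timeout=timer), False
    except (concurrent.futures.TimeoutError, OSError):
        # The timer fired or the refresh failed outright.  On a timeout the
        # refresh keeps running in its worker thread; only the client
        # response stops waiting for it.
        if stale_answer is not None:
            return stale_answer, True
        raise
```

With the default timer, a client whose name can only be answered from
stale data waits the full 1.8 seconds before getting a response, which
is the delay being asked about above.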

>   Also, it's not clear to me why the TTL is set to 1 second.  Since
>   it's actually expired, a zero TTL seems to be a more sensible choice
>   here (a similar feature of unbound uses a zero TTL).  If there's a
>   specific reason to avoid 0, it would be better to explain it
>   explicitly.

Added to document:

"1 second was chosen because historically 0 second TTLs have been
problematic for some implementations.  It not only sidesteps those
potential problems with no practical negative consequence, it would
also rate limit further queries from any client that is honoring the
TTL, such as a forwarding resolver."

Also, as noted with the other timers, it's still a SHOULD.
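
As a rough sketch of the TTL choice, the following shows one way a
resolver could pick the TTL to put on a response.  The function and
parameter names are hypothetical and are not taken from the draft or
from any particular implementation; only the value of 1 second for
stale answers reflects the text above.

```python
# Value discussed above for answers served from expired data.  The draft
# makes it a SHOULD, so operators may choose differently.
STALE_ANSWER_TTL = 1  # seconds


def ttl_for_response(expires_at, now, stale_ttl=STALE_ANSWER_TTL):
    """Return the TTL, in whole seconds, to send with a cached record.

    expires_at -- time at which the cached record's original TTL runs out
    now        -- current time, in the same units (seconds)
    """
    remaining = int(expires_at - now)
    if remaining > 0:
        # Record is still fresh: count the TTL down as usual.
        return remaining
    # Record has expired (or has under a second left): never emit 0,
    # use the stale TTL instead.
    return stale_ttl
```

For example, a record that expired 150 seconds ago would go out with a
TTL of 1, so a forwarder honoring the TTL would ask again at most about
once per second.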

It doesn't seem worth it to me to add even more text cataloguing the
problems that some people have encountered with 0 TTL, such as
informally described in
http://mark.lindsey.name/2009/03/never-use-dns-ttl-of-zero-0.html

I am, however, still open to further discussion on whether there is a
compelling reason to use 0 instead of 1.  I realize 1035 did have a
pretty explicit definition of what it thought 0 *should* mean, but then
went on to describe sending 0 TTL SOAs, which no one does.

> - Section 4
> 
>> Canonical Name (CNAME) records mingled in the expired cache with
>> other records at the same owner name can cause surprising results.
[...]
>   I suspect this is quite specific to internal implementation details
>   of BIND, [...] it's probably better to clarify it's specific to a
>   particular implementation architecture.

I struggled with how to incorporate this feedback, because I felt like
it was already pretty clear that I was discussing BIND specifically.
In the end the only change I made specifically about this was to say
"The version of BIND" instead of just "BIND", because I'm not even
sure whether it would be an issue in the latest versions.  I also
swapped the sequence of events around to match the real incident that
happened (A was received first, then later CNAME, versus the original
doc saying CNAME then A).

If people think additional wording improvements are still needed
here, please suggest them.