[DNSOP] comments on draft-tale-dnsop-serve-stale-00

神明達哉 <jinmei@wide.ad.jp> Wed, 24 May 2017 17:58 UTC

Sender: jinmei.tatuya@gmail.com
From: 神明達哉 <jinmei@wide.ad.jp>
Date: Wed, 24 May 2017 10:57:55 -0700
Message-ID: <CAJE_bqebjKFEvWEQbHM49sr_BgFEf8PtrnFWWPphSttFU+aQ8A@mail.gmail.com>
To: draft-tale-dnsop-serve-stale@ietf.org
Cc: dnsop <dnsop@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/jlWplNP-ZOtfcOLwq5z4AdUyCcA>

I've read draft-tale-dnsop-serve-stale-00.  Overall I think we need
something like this in practice.  Even if, technically, it violates
the current protocol standards, the background motivation is a real
operational issue and I believe we should provide some
standard-compliant mitigation.  Of course, the end result may be very
different from what's currently described in this draft, but I think
this is a good start toward that goal.

I have a few minor comments on the current version:

- I suspect it should include 'Updates: 1035 (if approved)' in the top
  boilerplate.

- Section 3

   If the answer has not been completely determined by the time the
   client response timer has elapsed, the resolver SHOULD then check its
   cache to see whether there is expired data that would satisfy the
   request.  If so, it adds that data to the response message and SHOULD
   set the TTL of each expired record in the message to 1 second.

  The recommended value of the client response timer is 1.8 seconds,
  so end clients will see this amount of delay for queries for which
  this technique is needed (most notably while the corresponding
  authoritative servers are under a DoS attack and unreachable).  I
  wonder whether this is really acceptable in terms of user
  experience.  According to the draft, this implementation has
  actually been used in the field (correct?).  If so, were the end
  users okay with the delay?

  Also, it's not clear to me why the TTL is set to 1 second.  Since
  it's actually expired, a zero TTL seems to be a more sensible choice
  here (a similar feature of unbound uses a zero TTL).  If there's a
  specific reason to avoid 0, it would be better to explain it
  explicitly.
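
  The behavior quoted above can be sketched roughly as follows, in
  Python.  This is only an illustration of the two points raised here
  (the 1.8-second wait and the choice of stale TTL), not the draft's or
  any resolver's actual code.  The names answer, CacheEntry and the
  resolve callback are hypothetical; resolve is assumed to return
  (rdata, ttl), or None if nothing arrived within the timer.

```python
from dataclasses import dataclass

CLIENT_RESPONSE_TIMER = 1.8  # seconds; value recommended by the draft
STALE_TTL = 1                # seconds; the draft's SHOULD (0 is the alternative)


@dataclass
class CacheEntry:
    rdata: str
    expires_at: float  # absolute time at which the original TTL runs out


def answer(cache, name, resolve, now, stale_ttl=STALE_TTL):
    """Return (rdata, ttl) for name, or None if nothing can be served."""
    entry = cache.get(name)
    if entry is not None and entry.expires_at > now:
        # Unexpired data: serve it with its remaining TTL.
        return entry.rdata, int(entry.expires_at - now)

    # Hypothetical callback: waits at most CLIENT_RESPONSE_TIMER seconds.
    fresh = resolve(name, CLIENT_RESPONSE_TIMER)
    if fresh is not None:
        rdata, ttl = fresh
        cache[name] = CacheEntry(rdata, now + ttl)
        return rdata, ttl

    # Timer elapsed without an answer: fall back to expired data, if any.
    if entry is not None:
        return entry.rdata, stale_ttl
    return None
```

  In this sketch the stale path is reached only after resolve has
  waited out the full timer, which is the delay asked about above, and
  stale_ttl is the single knob that a 0-versus-1 decision would change.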

- Section 4

   Canonical Name (CNAME) records mingled in the expired cache with
   other records at the same owner name can cause surprising results.
   This was observed with an initial implementation in BIND, where a
   hostname changed from having a CNAME record to an IPv4 Address (A)
   record.  BIND does not evict CNAMEs in the cache when other types are
   received, which in normal operations is not an issue.  However, after
   both records expired and the authorities became unavailable, the
   fallback to stale answers returned the older CNAME instead of the
   newer A.

  I suspect this is quite specific to internal implementation details
  of BIND, specifically that the RRsets of a name are maintained in a
  singly linked list, newer RRsets are prepended to the list, and on
  lookup the last one found is used if the list contains both a CNAME
  and the exact type (A in this example).  Is my guess correct?  If
  so, while this is really an interesting topic and probably worth
  sharing, it's probably better to clarify that it's specific to a
  particular implementation architecture.
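
  The guess above can be modeled with a short Python sketch.  It
  illustrates only the guessed structure (singly linked list, prepend
  on insert, last match wins on lookup); it is not BIND's code, and
  the names Node and NameEntry are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    rrtype: str
    rdata: str
    next: Optional["Node"] = None


class NameEntry:
    """All cached RRsets for one owner name, newest first."""

    def __init__(self):
        self.head = None

    def add(self, rrtype, rdata):
        # Newer data is prepended; an existing CNAME is not evicted.
        self.head = Node(rrtype, rdata, self.head)

    def lookup(self, qtype):
        # Walk the whole list and keep the last node that matches either
        # the queried type or CNAME, so the oldest match wins.
        found = None
        node = self.head
        while node is not None:
            if node.rrtype in (qtype, "CNAME"):
                found = node
            node = node.next
        return found


entry = NameEntry()
entry.add("CNAME", "target.example.")  # older record
entry.add("A", "192.0.2.1")            # newer record, prepended
stale = entry.lookup("A")
```

  Under these assumptions, stale is the older CNAME rather than the
  newer A, matching the behavior the draft describes.  A cache that
  evicted the CNAME on insert, or that returned the first match, would
  not show the problem.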

--
JINMEI, Tatuya
