[DNSOP] Unexpected REFUSED from BIND when using example config from RFC7706

Bjørn Mork <bjorn@mork.no> Thu, 06 April 2017 09:03 UTC


Hello,

We are currently trying out the configuration recommended by RFC7706,
serving a copy of the root zone on a loopback.  We are using BIND 9.10
and our configuration is copied directly from the example in appendix
B.1, even down to the actual loopback address used, as we needed a
dedicated one for this instance anyway.

Recently I noticed a side effect of this configuration which I consider
unwanted and unexpected: it changes how BIND replies to requests without
the RD bit set.  BIND will normally answer such requests with a "best
possible redirection", using any matching NS set it has in its cache,
which often will be the root NS.  But using the RFC7706 example config,
BIND will REFUSE all requests without RD set.

For the record, my expectations are: Absolutely no visible effects to
any DNS clients compared to a normal caching resolver, using the real
root servers directly.



Using the RFC7706 appendix B.1 example config, this is what I see:


bjorn@miraculix:~$ dig . soa @::1 +norecur

; <<>> DiG 9.10.3-P4-Debian <<>> . soa @::1 +norecur
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 49721
;; flags: qr ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;.                              IN      SOA

;; Query time: 0 msec
;; SERVER: ::1#53(::1)
;; WHEN: Thu Apr 06 10:25:30 CEST 2017
;; MSG SIZE  rcvd: 28


Snooping on the loopback interface shows that the query is refused
without being referred to the local root instance (which sort of makes
sense, as that would be recursion...):

    1 0.000000000           ::1 → ::1          Standard query 0xc239 SOA <Root> OPT 92 DNS
    2 0.000093346           ::1 → ::1          Standard query response 0xc239 Refused SOA <Root> OPT 92 DNS


Removing the local root zone, letting BIND fall back to its built-in
hint zone:

   view recursive {
       dnssec-validation auto;
       allow-recursion { any; };
       recursion yes;
       /*
       zone "." {
           type static-stub;
           server-addresses { 127.12.12.12; };
       };
       */
   };


changes this behaviour.  BIND now replies with NOERROR and an AUTHORITY
section from its hints (or actually from the cache. Note the AD flag):

bjorn@miraculix:~$ dig . soa @::1 +norecur

; <<>> DiG 9.10.3-P4-Debian <<>> . soa @::1 +norecur
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29060
;; flags: qr ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;.                              IN      SOA

;; AUTHORITY SECTION:
.                       518378  IN      NS      a.root-servers.net.
.                       518378  IN      NS      g.root-servers.net.
.                       518378  IN      NS      m.root-servers.net.
.                       518378  IN      NS      j.root-servers.net.
.                       518378  IN      NS      h.root-servers.net.
.                       518378  IN      NS      e.root-servers.net.
.                       518378  IN      NS      f.root-servers.net.
.                       518378  IN      NS      k.root-servers.net.
.                       518378  IN      NS      d.root-servers.net.
.                       518378  IN      NS      l.root-servers.net.
.                       518378  IN      NS      i.root-servers.net.
.                       518378  IN      NS      c.root-servers.net.
.                       518378  IN      NS      b.root-servers.net.

;; Query time: 0 msec
;; SERVER: ::1#53(::1)
;; WHEN: Thu Apr 06 10:18:43 CEST 2017
;; MSG SIZE  rcvd: 239



Snooping on "any" reveals that there is no recursion in this case
either, as expected.  The data source is the cache:

    1 0.000000000           ::1 → ::1          Standard query 0x7184 SOA <Root> OPT 92 DNS
    2 0.000213074           ::1 → ::1          Standard query response 0x7184 SOA <Root> NS a.root-servers.net NS g.root-servers.net NS m.root-servers.net NS j.root-servers.net NS h.root-servers.net NS e.root-servers.net NS f.root-servers.net NS k.root-servers.net NS d.root-servers



The problem is that configuring a "static-stub" root zone changes how
BIND handles non-RD requests.  Note that this affects *any* such
request, not only requests for root.  Example with the "static-stub"
root zone enabled:

bjorn@miraculix:~$ dig ns mork.no @::1 +norecur

; <<>> DiG 9.10.3-P4-Debian <<>> ns mork.no @::1 +norecur
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 34742
;; flags: qr ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;mork.no.                       IN      NS

;; Query time: 0 msec
;; SERVER: ::1#53(::1)
;; WHEN: Thu Apr 06 10:32:45 CEST 2017
;; MSG SIZE  rcvd: 36

    1 0.000000000           ::1 → ::1          Standard query 0x87b6 NS mork.no OPT 100 DNS
    2 0.000121759           ::1 → ::1          Standard query response 0x87b6 Refused NS mork.no OPT 100 DNS



BIND will normally provide the root hint from its cache without the
"static-stub" root zone configuration:


bjorn@miraculix:~$ dig ns mork.no @::1 +norecur

; <<>> DiG 9.10.3-P4-Debian <<>> ns mork.no @::1 +norecur
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 54393
;; flags: qr ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;mork.no.                       IN      NS

;; AUTHORITY SECTION:
.                       518395  IN      NS      a.root-servers.net.
.                       518395  IN      NS      h.root-servers.net.
.                       518395  IN      NS      g.root-servers.net.
.                       518395  IN      NS      d.root-servers.net.
.                       518395  IN      NS      f.root-servers.net.
.                       518395  IN      NS      e.root-servers.net.
.                       518395  IN      NS      m.root-servers.net.
.                       518395  IN      NS      j.root-servers.net.
.                       518395  IN      NS      k.root-servers.net.
.                       518395  IN      NS      i.root-servers.net.
.                       518395  IN      NS      c.root-servers.net.
.                       518395  IN      NS      b.root-servers.net.
.                       518395  IN      NS      l.root-servers.net.

;; Query time: 1 msec
;; SERVER: ::1#53(::1)
;; WHEN: Thu Apr 06 10:34:27 CEST 2017
;; MSG SIZE  rcvd: 247

    1 0.000000000           ::1 → ::1          Standard query 0xd479 NS mork.no OPT 100 DNS
    2 0.000255371           ::1 → ::1          Standard query response 0xd479 NS mork.no NS a.root-servers.net NS h.root-servers.net NS g.root-servers.net NS d.root-servers.net NS f.root-servers.net NS e.root-servers.net NS m.root-servers.net NS j.root-servers.net NS k.root-servers.net NS i.root-servers.net NS c.root-servers.net NS b.root-servers.net NS l.root-servers.net OPT 311 DNS




These examples are all without setting RD.  I will not bore you with the
results having RD set.  Those all work as expected, giving identical
replies with or without the "static-stub" root zone.


Experimenting a bit more, I found that changing the RFC7706 example
config to use a "forward" zone instead makes BIND behave as expected
while keeping the advantages of the local root zone instance:

   view recursive {
       dnssec-validation auto;
       allow-recursion { any; };
       recursion yes;
       zone "." {
           type forward;
           forwarders { 127.12.12.12; };
       };
   };


With this configuration I get similar results to the examples without a
local root zone, while still having all requests for root served by the
local instance.  So it looks like a good solution to me.

I've tried googling a bit to see if this issue has been discussed
before, or if I could find any background for the selection of a
"static-stub" zone type for the RFC7706 example.  But I have not been
able to find any.  Which is why I decided to ask here.  Is this a known
and expected issue?  Why was "static-stub" used?  Was "forward" ever
considered?

I realize that using "forward" will change the behaviour in case of a
failing local instance, but that can also be seen as a feature.  Or it
can be prevented by simply adding "forward only" to the config, if
complete failure is preferred over falling back to the real root
servers.
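
A minimal, untested sketch of that "forward only" variant, assuming the
same view and loopback address as in the forward zone config above
(when the option is omitted, BIND's default is "forward first"):

```
   view recursive {
       dnssec-validation auto;
       allow-recursion { any; };
       recursion yes;
       zone "." {
           type forward;
           // Assumption: fail outright if the local root instance is
           // unreachable, instead of falling back to the real root
           // servers.
           forward only;
           forwarders { 127.12.12.12; };
       };
   };
```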

Finally, if any of you wonder why I (or anyone else) would send queries
without RD set to a caching resolver:  I discovered the issue because
"dig +trace" started failing.  Quoting from dig(1):

   "Recursion is automatically disabled when the +nssearch or +trace
    query options are used."


This is a real use case for me.  And I like to think that I'm not
that special, so there are probably other users who would consider this
behavioural change a bug as well.

Any advice is appreciated.  Our choices are basically down to either
 a) disable the local root instance, or
 b) go for the "forward" variant


But I hesitate to select option b) without knowing the background for the
"static-stub" recommendation.  Option a) seems a lot safer at the
moment...




Bjørn