[DNSOP] Benjamin Kaduk's No Objection on draft-ietf-dnsop-no-response-issue-19: (with COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Tue, 07 April 2020 16:04 UTC

Return-Path: <noreply@ietf.org>
X-Original-To: dnsop@ietf.org
Delivered-To: dnsop@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id 7B54F3A0DA1; Tue, 7 Apr 2020 09:04:22 -0700 (PDT)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-dnsop-no-response-issue@ietf.org, dnsop-chairs@ietf.org, dnsop@ietf.org, Tim Wicinski <tjw.ietf@gmail.com>, tjw.ietf@gmail.com
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <158627546208.13317.5016687312680560092@ietfa.amsl.com>
Date: Tue, 07 Apr 2020 09:04:22 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/QniaiutrJ3S982-mgkogvkr79o0>
Subject: [DNSOP] Benjamin Kaduk's No Objection on draft-ietf-dnsop-no-response-issue-19: (with COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-dnsop-no-response-issue-19: No Objection

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-dnsop-no-response-issue/



----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Someone (maybe the RFC Editor) will end up tweaking a lot of commas.
I didn't try to list them all.

I didn't see a response to the secdir reviewer's question (though I'm
also not sure that there's an easy answer to it).

Section 1

   The existence of servers which fail to respond to queries results in
   developers being hesitant to deploy new standards.  Such servers need

nit: it feels like a slight mismatch to have "developers" that
"deploy" new standards (vs. "developers that implement" or "operators
that deploy").

   indication that the server is under attack.  Parent zone operators
   are advised to regularly check that the delegating NS records are
   consistent with those of the delegated zone and to correct them when
   they are not [RFC1034].  Doing this regularly should reduce the
   instances of broken delegations.

I can't tell if this 1034 reference is for the recommendation to
regularly check or the definition of "consistent" or something else; if
the recommendation is new, then would BCP 14 keywords be appropriate?

Section 2

   o  The AD flag bit in a response cannot be trusted to mean anything
      as some servers incorrectly copy the flag bit from the request to
      the response [RFC1035], [RFC4035].

Would it be worth a 6840 ref here as well (to catch setting AD in a
request, even though that's not exactly what's being mentioned)?
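
To make the bit in question concrete, here is a minimal Python sketch (not
taken from the draft) that reads the AD flag out of a raw DNS header and
reports whether a response carries the same AD value as its request.  The
function names are hypothetical.  A match is only a hint, not proof of a
bug: per RFC 6840 a validating server may legitimately set AD in a
response to a query that had AD=1.

```python
import struct

AD_BIT = 0x0020  # Authentic Data bit in the 16-bit DNS header flags word


def header_flags(message: bytes) -> int:
    """Return the flags word from a raw DNS message header."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    _msg_id, flags = struct.unpack("!HH", message[:4])
    return flags


def ad_is_set(message: bytes) -> bool:
    """True if the AD bit is set in the message header."""
    return bool(header_flags(message) & AD_BIT)


def ad_matches_request(request: bytes, response: bytes) -> bool:
    """Hypothetical helper: True if request and response agree on AD.

    A server that blindly copies the flag will always return True here,
    so repeated matches across AD=0 and AD=1 probes are worth a closer
    look.
    """
    return ad_is_set(request) == ad_is_set(response)
```

Probing the same question once with AD=0 and once with AD=1 (both without
DO) and comparing the two results is one way to use this.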

Section 3.1.2

(Do we want to remind the reader on the NOERROR vs. NXDOMAIN rules? "No"
is probably acceptable.  I see we do so later, in Section 7, so even a
forward reference might suffice.)

Where's the first reference/mention of Meta-RRs?  I see RFC 2929
(obsoleted, transitively, by 6895) that we cite for the "range reserved
for private use" but not for terminology.  Even RFC 8499 (which we don't
cite) only has "meta-RR" in a parenthetical in the description of OPT.

Section 3.1.5

micro-nit: I guess firewalls don't exactly count as "nameservers", which
seems to be the claimed scope for this document.

Section 3.2.1

This section threw me a bit, at first, as the 3.1.x had led me to expect
"nameservers should behave in this way", but this section is "here is
how to tell if a nameserver is misbehaving".  That's not necessarily a
problem, just a ... comment :)

Section 3.2.6

   Some nameservers fail to copy the DO bit to the response despite
   clearly supporting DNSSEC by returning an RRSIG records to EDNS
   queries with DO=1.

I'm not sure if we also want an explicit "nameservers should copy the
DO bit to the response when they support DNSSEC".
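
For reference, a small Python sketch (mine, not the draft's) of where the
DO bit lives.  RFC 6891 packs the extended RCODE, the EDNS version, and
the flags into the 32-bit TTL field of the OPT record, with DO as the top
bit of the 16-bit flags.  The helper names are hypothetical.

```python
DO_BIT = 0x8000  # DNSSEC OK: top bit of the 16-bit EDNS flags field


def unpack_opt_ttl(ttl: int) -> tuple:
    """Split an OPT record TTL into (extended_rcode, version, flags)."""
    extended_rcode = (ttl >> 24) & 0xFF
    version = (ttl >> 16) & 0xFF
    flags = ttl & 0xFFFF
    return extended_rcode, version, flags


def do_copied(request_ttl: int, response_ttl: int) -> bool:
    """Hypothetical check: False only when the request had DO=1 and the
    response came back with DO=0."""
    request_do = bool(unpack_opt_ttl(request_ttl)[2] & DO_BIT)
    response_do = bool(unpack_opt_ttl(response_ttl)[2] & DO_BIT)
    return (not request_do) or response_do
```

A server that returns RRSIG records but fails this check is showing the
behaviour the quoted text describes.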

Section 3.2.7

[similarly an affirmative statement of what nameservers should do might
be appropriate here.]

Section 4

   Firewalls and load balancers can affect the externally visible
   behaviour of a nameserver.  Tests for conformance should to be done
   from outside of any firewall so that the system is tested as a whole.

(Are these conformance tests run by the nameserver's own operator, or
externally driven tests as well?)

   However, there may be times when a nameserver mishandles messages
   with a particular flag, EDNS option, EDNS version field, opcode, type
   or class field or combination thereof to the point where the
   integrity of the nameserver is compromised.  Firewalls should offer
   the ability to selectively reject messages using an appropriately
   constructed response based on all these fields while awaiting a fix
   from the nameserver vendor.

I would suggest reiterating that this is "with a response" vs. "drop the
packet silently".
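
As a rough illustration of "with a response", the Python sketch below
builds a reply from the query instead of dropping it.  The choice of
REFUSED as the RCODE and the function names are my assumptions; the draft
only says "an appropriately constructed response".  It handles a single,
uncompressed question and discards everything after it.

```python
import struct

QR = 0x8000           # response bit
OPCODE_MASK = 0x7800  # opcode is echoed back unchanged
RD = 0x0100           # recursion desired is echoed back unchanged
REFUSED = 5           # assumed RCODE for this sketch


def question_end(message: bytes, offset: int = 12) -> int:
    """Return the offset just past the first (uncompressed) question."""
    while True:
        length = message[offset]
        offset += 1
        if length == 0:
            break
        if length & 0xC0:
            raise ValueError("unexpected label type in question name")
        offset += length
    end = offset + 4  # QTYPE and QCLASS, two bytes each
    if end > len(message):
        raise ValueError("truncated question")
    return end


def reject_with_response(query: bytes, rcode: int = REFUSED) -> bytes:
    """Build an error reply that echoes the ID, opcode, RD, and question."""
    if len(query) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    msg_id, flags, qdcount = struct.unpack("!HHH", query[:6])
    reply_flags = QR | (flags & OPCODE_MASK) | (flags & RD) | (rcode & 0xF)
    if qdcount == 1:
        question = query[12:question_end(query)]
    else:
        question = b""
    header = struct.pack("!HHHHHH", msg_id, reply_flags,
                         1 if question else 0, 0, 0, 0)
    return header + question
```

The point is only that the client gets an answer it can act on right
away, rather than having to wait for a timeout.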

Section 5

   Ideally, Operators should run these tests against a packet scrubbing
   service to ensure that these tests are not seen as attack vectors.

It feels like maybe the most we can say here is "not seen as attack
vectors during normal operation".  We can't exclude the possibility that
some actor decides to generate a flood of messages that happens to match
the test behavior (whether by accident or design), which seems fairly
likely to lead to blocking of the test-behavior traffic as collateral
damage.

Section 7

   If the server does not support EDNS at all, FORMERR is the expected
   error code.  That said a minimal EDNS server implementation requires
   parsing the OPT records and responding with an empty OPT record in
   the additional section in most cases.  There is no need to interpret
   any EDNS options present in the request as unsupported EDNS options
   are expected to be ignored [RFC6891].  Additionally EDNS flags can be
   ignored.  The only part of the OPT record that needs to be examined
   is the version field to determine if BADVERS needs to be sent or not.

It seems like there's an implied "so providing minimal EDNS support is
pretty trivial and you ought to do so already" in here; do we want to
make such sentiment explicit?
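
To show how little is involved, here is a Python sketch of the minimal
handling the quoted paragraph describes: look only at the version field,
answer BADVERS if it is higher than what is supported, and otherwise
return an empty OPT record.  The function name and the advertised UDP
size of 1232 are assumptions of mine, and the sketch sets no EDNS flags
(so it models a server without DNSSEC support).

```python
import struct

OPT_TYPE = 41          # OPT pseudo-RR type code (RFC 6891)
BADVERS = 16           # extended RCODE for an unsupported EDNS version
SUPPORTED_VERSION = 0  # highest EDNS version this sketch implements


def minimal_edns_reply(request_opt_ttl: int, udp_size: int = 1232) -> tuple:
    """Return (header_rcode, opt_rr_bytes) for a request's OPT TTL field.

    Options and flags in the request are ignored; only the version byte
    is examined.
    """
    version = (request_opt_ttl >> 16) & 0xFF
    rcode = BADVERS if version > SUPPORTED_VERSION else 0
    header_rcode = rcode & 0xF  # low 4 bits go in the DNS header
    extended_rcode = rcode >> 4  # upper 8 bits go in the OPT TTL
    reply_ttl = (extended_rcode << 24) | (SUPPORTED_VERSION << 16)
    # Root owner name, TYPE=OPT, CLASS=UDP payload size, TTL, RDLENGTH=0.
    opt_rr = struct.pack("!BHHIH", 0, OPT_TYPE, udp_size, reply_ttl, 0)
    return header_rcode, opt_rr
```

The whole OPT record in the reply is 11 bytes, which supports the reading
that minimal EDNS support is cheap to provide.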

Section 8

   Testing is divided into two sections.  "Basic DNS", which all servers
   should meet, and "Extended DNS", which should be met by all servers
   that support EDNS (a server is deemed to support EDNS if it gives a
   valid EDNS response to any EDNS query).  If a server does not support
   EDNS it should still respond to all the tests.

Is this "respond to all the tests, albeit with [error responses]"?

   The tests below use dig from BIND 9.11.0.

I guess this version could become important if some future version
starts setting a new flag by default (that would need to be suppressed
if that version of dig was used for many of these tests).

Section 8.1.2

   Ask for the TYPE1000 RRset at the configured zone's name.  This query
   is made with no DNS flag bits set and without EDNS.  TYPE1000 has
   been chosen for this purpose as IANA is unlikely to allocate this
   type in the near future and it is not in a range reserved for private
   use [RFC6895].  Any unallocated type code could be chosen for this
   test.

Is there a risk that since we document TYPE1000 like this some server
will implement "respond to TYPE1000" without implementing the actual
desired behavior?
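
For concreteness, a Python sketch of the query this test describes: one
question for TYPE1000 in class IN, all header flag bits clear, and no OPT
record.  This is my own illustration rather than the draft's dig
invocation, and the function names are hypothetical.  Passing a different
unallocated type code exercises the same path, which is one way to catch
a server that special-cases TYPE1000.

```python
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as uncompressed DNS wire-format labels."""
    wire = b""
    for label in name.rstrip(".").split("."):
        if not label:
            continue  # the root name has no labels
        raw = label.encode("ascii")
        if len(raw) > 63:
            raise ValueError("label longer than 63 octets")
        wire += bytes([len(raw)]) + raw
    return wire + b"\x00"


def build_plain_query(zone: str, qtype: int = 1000, msg_id: int = 0) -> bytes:
    """Build a query with no flags set and no EDNS (ARCOUNT is zero)."""
    header = struct.pack("!HHHHHH", msg_id, 0, 1, 0, 0, 0)
    question = encode_name(zone) + struct.pack("!HH", qtype, 1)  # class IN
    return header + question
```

The resulting bytes can be sent over UDP to port 53 of the server under
test with the standard socket module.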

Section 8.1.3.2

   AD use in queries is defined in [RFC6840].

(Knowing this would have been helpful up in the top-level Section 8, where
we talk about one or both of AD=1 and DO=1 being a signal to expect AD=1.)

Section 8.2.3, 8.2.6

[Same comment about option code 100 as for TYPE1000 above; the same
response is assumed.]

Section 9

   When notification is not effective at correcting problems with a
   misbehaving name server, parent operators can choose to remove NS
   record sets (and glue records below) that refer to the faulty server
   until the servers are fixed.  This should only be done as a last
   resort and with due consideration, as removal of a delegation can
   have unanticipated side effects.  [...]

I have mixed feelings about recommending "cut you off until you fix your
bugs" as an option, but not strongly enough to override WG consensus.