[DNSOP] Benjamin Kaduk's Discuss on draft-ietf-dnsop-dns-tcp-requirements-13: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Mon, 25 October 2021 22:51 UTC

From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-dnsop-dns-tcp-requirements@ietf.org, dnsop-chairs@ietf.org, dnsop@ietf.org, Suzanne Woolf <suzworldwide@gmail.com>, suzworldwide@gmail.com
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <163520226600.2076.6225006958067294469@ietfa.amsl.com>
Date: Mon, 25 Oct 2021 15:51:06 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/d5hrKxZAaVrcdj2na2b0tzjSD9c>
Subject: [DNSOP] Benjamin Kaduk's Discuss on draft-ietf-dnsop-dns-tcp-requirements-13: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-dnsop-dns-tcp-requirements-13: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/blog/handling-iesg-ballot-positions/
for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-dnsop-dns-tcp-requirements/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

(1) This should be pretty easy to resolve, but this text from §4.4
does not seem to match up with the referenced document:

   The use of TLS places even stronger operational burdens on DNS
   clients and servers.  Cryptographic functions for authentication and
   encryption requires additional processing.  Unoptimized connection
   setup takes two additional round-trips compared to TCP, but can be
   reduced with TCP Fast Open, TLS session resumption [RFC8446] and TLS
   False Start [RFC7918].

Two additional round trips were needed for TLS 1.2 and prior versions, but
as of TLS 1.3 the application data from the client can be sent after
only 1 round trip, accompanying the client Finished (and authentication
messages, if in use).  Given the nature of the rest of the sentence, we
might want to specifically mention TLS 1.3 as an improvement over TLS
1.2, but there are probably a number of ways that we could fix it.  Note
additionally that for TLS 1.3, session resumption is not a reduction in
the number of round trips unless 0-RTT data is used (but AFAIK there is
not a published application profile specifying acceptable DNS content
for TLS 0-RTT data, so use of TLS 0-RTT data for DNS is forbidden), but
is still an efficiency gain due to the reduced number of cryptographic
operations (including certificate validation).
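
As a rough illustration of the round-trip counts described above (a
sketch only; the scenario labels are mine and not terms from the draft),
the number of round trips completed before the client can send its first
DNS query can be tallied like this:

```python
# Round trips completed before the client can send its first DNS query.
# Handshake costs follow the discussion above; the scenario names are
# illustrative labels, not terms from the draft.
TCP_HANDSHAKE_RTT = 1  # SYN / SYN-ACK before data can flow (no TFO)

TLS_HANDSHAKE_RTT = {
    "TLS 1.2 full handshake": 2,
    "TLS 1.3 full handshake": 1,
    # Resumption without 0-RTT saves cryptographic work, not round trips.
    "TLS 1.3 resumption without 0-RTT": 1,
}


def rtts_before_first_query(tls_mode: str) -> int:
    """Return TCP plus TLS round trips before application data is sent."""
    return TCP_HANDSHAKE_RTT + TLS_HANDSHAKE_RTT[tls_mode]


for mode, extra in TLS_HANDSHAKE_RTT.items():
    total = rtts_before_first_query(mode)
    print(f"{mode}: {extra} additional, {total} total")
```

The "two additional round-trips" wording in the draft matches only the
first row of that table.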

(2) Trivial to address, but the section heading for Appendix A.8
references RFC 3326 (The Reason Header Field for the Session Initiation
Protocol (SIP)), not RFC 3226 (DNSSEC and IPv6 A6 aware server/resolver
message size requirements).


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

This document targets BCP status, while proposing an Updates:
relationship with an Internet Standard and an Informational document.
As mentioned in
https://mailarchive.ietf.org/arch/msg/last-call/oMhqRigr4nbRSMll5NY4zXpZu20/
there is some motivation for having updates to standards-track documents
occur via other standards-track documents, but given that this document
is mostly about giving operational guidance for DNS-over-TCP usage,
there seems to be some argument that BCP is a more appropriate status
for it.  However, I do wonder if there is some existing BCP that this
document would become part of, or if it would need to be given a new BCP
number.  It's not entirely clear how much scope there is for future
additions that would become part of that same BCP, and whether a BCP
number is truly needed in that scenario.  It would be good to hear
others' thoughts on this topic, especially in the form of references to
previous WG discussion.

I made a pull request on GitHub with some editorial suggestions, most of
which by volume are classifying the "Standards Track" documents in
Appendix A properly as Internet Standard or Proposed Standard (there
don't seem to be any lingering Draft Standards in the list):
https://github.com/jtkristoff/draft-ietf-dnsop-dns-tcp-requirements/pull/10

Section 2.4

   headers.  Unfortunately, it is quite common for both ICMPv6 and IPv6
   extension headers to be blocked by middleboxes.  According to
   [HUSTON] some 35% of IPv6-capable recursive resolvers were unable to
   receive a fragmented IPv6 packet.  [...]

I looked through [HUSTON] and wasn't able to find this "35%" figure.
There is a remark that "37% of endpoints used IPv6-capable DNS resolvers
that were incapable of receiving a fragmented IPv6 response", but that
seems to be a somewhat different statement in addition to a somewhat
different percentage value.  In particular, the sampling domain for the
[HUSTON] statement as written appears to be "all endpoints", but the
statement in this draft uses the sampling domain of "IPv6-capable
recursive resolvers".  However, the corresponding calculation in
[HUSTON] looks to be using "IPv6-capable resolvers" as the sampling
domain, just as this document states, which suggests that the error in
phrasing is in [HUSTON] and not in this document.
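
To show why the sampling domain matters, here is a small sketch with
made-up counts (none of these numbers come from [HUSTON]; they are
chosen only to make the effect visible).  The same population yields
very different percentages depending on whether resolvers or the
endpoints behind them are counted:

```python
# Made-up counts chosen only to show the effect; none come from [HUSTON].
# Each tuple: (endpoints served by the resolver, can it receive fragments?)
resolvers = [
    (9000, True),
    (500, False),
    (300, False),
    (200, True),
]

total_resolvers = len(resolvers)
failing_resolvers = sum(1 for _, ok in resolvers if not ok)
total_endpoints = sum(n for n, _ in resolvers)
affected_endpoints = sum(n for n, ok in resolvers if not ok)

# Sampling domain: resolvers.
resolver_fraction = failing_resolvers / total_resolvers
# Sampling domain: endpoints that use those resolvers.
endpoint_fraction = affected_endpoints / total_endpoints

print(f"resolvers unable to receive fragments: {resolver_fraction:.0%}")
print(f"endpoints behind such resolvers:       {endpoint_fraction:.0%}")
```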

Section 3

As the directorate review noted, it's a little surprising to see the
OLD/NEW formulation for one of the updates to §6.1.3.2 of RFC 1123 but
not the others.

In particular, using OLD/NEW would allow us to implicitly reiterate that
the "MUST send a UDP query first" requirement for non-zone-transfers
is no longer present (by virtue of the update made by RFC 7766).

   *  Recursive servers (or forwarders) MUST support and service all TCP
      queries so that they do not prevent large responses from a TCP-
      capable server from reaching its TCP-capable clients.

This might benefit from a bit of unpacking.  I think that "MUST support
and service ... queries" refers to the stub/recursive side of things,
while "responses from a TCP-capable server" would refer to the
recursive/authoritative side.  But the rest of the sentence seems to be
assuming that if TCP is used on one side then it is used on the other
side, so that limitations of TCP use on one side do not carry over to
the other.  However, I didn't think that there was a requirement on
recursives to go forward with TCP for queries received over TCP, so I'm
not entirely sure what the actual guidance here is intended to be.

Section 4.2

   DNS server software SHOULD provide a configurable limit on the total
   number of established TCP connections.  If the limit is reached, the
   application is expected to either close existing (idle) connections
   or refuse new connections.  Operators SHOULD ensure the limit is
   configured appropriately for their particular situation.

I think that one of the directorate reviews touched on this topic, but I
wonder if we can give more guidance on what factors of a particular
situation might be relevant for determining what is appropriate.  In
this case, that might include the number of requests the hardware is
capable of serving and the number of requests expected from legitimate
clients; we do seem to provide a bit more detail in the following
paragraph (not quoted here) regarding "number and diversity of users".
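
For concreteness, the "close existing (idle) connections or refuse new
connections" behavior could look roughly like the sketch below.  The
class and its parameter names (max_connections, idle_timeout) are
hypothetical and not taken from the draft or from any DNS server; the
point is only that both knobs are things an operator would have to size
for their situation.

```python
import time
from collections import OrderedDict


class ConnectionLimiter:
    """Track established connections against a configurable cap.

    Hypothetical helper for illustration; not taken from any DNS server.
    """

    def __init__(self, max_connections: int, idle_timeout: float):
        if max_connections < 1:
            raise ValueError("max_connections must be at least 1")
        self.max_connections = max_connections
        self.idle_timeout = idle_timeout
        # conn_id -> time of last activity, least recently active first
        self._last_active = OrderedDict()

    def touch(self, conn_id, now=None):
        """Record activity on an already-tracked connection."""
        now = time.monotonic() if now is None else now
        if conn_id in self._last_active:
            self._last_active[conn_id] = now
            self._last_active.move_to_end(conn_id)

    def admit(self, conn_id, now=None):
        """Decide what to do with a new connection.

        Returns (accepted, evicted), where evicted is the id of an idle
        connection the caller should close, or None.
        """
        now = time.monotonic() if now is None else now
        evicted = None
        if len(self._last_active) >= self.max_connections:
            oldest_id, last_seen = next(iter(self._last_active.items()))
            if now - last_seen < self.idle_timeout:
                return False, None  # nothing idle: refuse the newcomer
            del self._last_active[oldest_id]
            evicted = oldest_id
        self._last_active[conn_id] = now
        return True, evicted

    def release(self, conn_id):
        """Forget a connection that has been closed."""
        self._last_active.pop(conn_id, None)
```

A server loop would call admit() on each accepted socket, close the
evicted connection if one is returned, and call touch() whenever a query
arrives on an existing connection.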

Section 8

There's a lot of good advice interspersed in the main body text already;
thank you for that!

The discussion in §4.1 recommends ("SHOULD") sharing a TFO server key
amongst servers in a server farm, but this introduces the usual security
considerations for a group-shared symmetric key.  The highlights are
that any member of the group can impersonate any other member, and
compromise of one machine compromises all members' use of the key.
While there's not a great fully generic treatment of these issues in the
RFC series that I know of (yet, at least), I've seen RFC 4046 cited for
it sometimes, and draft-ietf-core-oscore-groupcomm has a section on
"security of the group mode" that also has some overlap with the
relevant considerations for sharing TFO keys.
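
A minimal sketch of why the shared key cuts both ways follows.  It
assumes a cookie derived from the client IP address under the server
key; HMAC-SHA256 truncated to 8 bytes is used here as a stand-in, which
is my substitution for illustration and not the construction any
particular TFO implementation uses.

```python
import hashlib
import hmac
import ipaddress
import secrets


def make_cookie(server_key: bytes, client_ip: str) -> bytes:
    """Derive a TFO-style cookie bound to the client's IP address.

    Stand-in construction (truncated HMAC-SHA256), for illustration only.
    """
    packed = ipaddress.ip_address(client_ip).packed
    return hmac.new(server_key, packed, hashlib.sha256).digest()[:8]


def cookie_is_valid(server_key: bytes, client_ip: str, cookie: bytes) -> bool:
    """Check a presented cookie against the one this key would issue."""
    return hmac.compare_digest(make_cookie(server_key, client_ip), cookie)


shared_key = secrets.token_bytes(16)  # one key distributed to the farm
other_key = secrets.token_bytes(16)   # a server that was given its own key

# Cookie issued by one member of the farm to a client at 192.0.2.10.
cookie = make_cookie(shared_key, "192.0.2.10")

# Any server holding shared_key accepts it, which is the operational goal.
print(cookie_is_valid(shared_key, "192.0.2.10", cookie))
# A server with a different key rejects it.
print(cookie_is_valid(other_key, "192.0.2.10", cookie))
```

The same property that lets every farm member accept the cookie means
that anyone who extracts shared_key from a single machine can mint
cookies that all of the other members will accept.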

In a similar vein, in §6 we again SHOULD-level recommend that
applications capturing network packets do TCP segment reassembly in
order to defeat obfuscation techniques involving TCP segmentation.  I am
happy to see that we go on to caution against resource exhaustion
attacks while doing so, but have two related comments: first, that
caution might merit mention again here, and second, that we should note
(either here or there) that when applying resource limits, there's a
tradeoff between allowing service and allowing some attacks to succeed.
Giving up on segment reassembly due to resource usage means that a
potential attack could succeed, but dropping streams where segment
reassembly uses excess resources might deny legitimate service.
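
The tradeoff can be made concrete with a small sketch of a capture-side
reassembler.  It relies on the two-octet length prefix that DNS over TCP
uses (RFC 1035) and, as a simplifying assumption, on payloads arriving
in order with no gaps.  The class name and the max_pending knob are
hypothetical.

```python
import struct


class DnsTcpStream:
    """Reassemble length-prefixed DNS messages from in-order TCP payloads.

    Simplification: segments are assumed to arrive in order with no gaps;
    a real capture tool would also need sequence-number handling.
    """

    def __init__(self, max_pending: int = 4096):
        self.max_pending = max_pending  # per-stream budget for partial data
        self.abandoned = False
        self._buf = bytearray()

    def feed(self, payload: bytes) -> list:
        """Add one segment's payload and return any completed messages."""
        if self.abandoned:
            return []
        self._buf += payload
        messages = []
        while len(self._buf) >= 2:
            (length,) = struct.unpack_from("!H", self._buf)
            if len(self._buf) < 2 + length:
                break  # message still incomplete; wait for more segments
            messages.append(bytes(self._buf[2:2 + length]))
            del self._buf[:2 + length]
        if len(self._buf) > self.max_pending:
            # Budget exceeded: stop tracking this stream and free the memory.
            self.abandoned = True
            self._buf.clear()
        return messages
```

With a small max_pending, a legitimate large response that is split
across segments is abandoned and never inspected; with a large one, an
attacker can hold that much memory on each of many streams.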

We might also consider reiterating that the core DNS over TCP security
considerations (RFC 1035, ???) continue to apply.

Clients that keep state about whether a given server supports TCP (per
discussion in §4.1) might be susceptible to an attacker that is on-path
in one location disrupting TCP in that location and causing the client
to store state that a given server does not support TCP, when TCP
connections from a different location, where the attacker is not on
path, would succeed.

   short-lived DNS transactions over TCP may pose challenges.  In fact,
   [DAI21] details a class of IP fragmentation attacks on DNS
   transactions if the IP ID field can be predicted and a system is
   coerced to fragment rather than retransmit messages.  [...]

I suggest more detail on the "IP ID field" (including IPv4/v6
differences).

Section 9

   being queried).  DNS over TLS or DTLS is the recommended way to
   achieve DNS privacy.

Is it really the (sole) recommended way?  It certainly suffices, but
what is the status of DoH/DoQ?  Perhaps "DNS over TLS or DTLS serves to
provide DNS privacy" optionally followed by a note about DoH or "other
mechanisms" in general.  (May be superseded by Roman's Discuss.)

Section 11.1

I recommend re-review of the classification of the references.
Just because a reference is on the standards-track does not mean that we
must reference it normatively -- e.g., RFC 1995/1996 are mentioned only
in the listing of "other standards related to DNS transport over TCP",
but they are not required reading in order to understand and implement this
document.  See
https://www.ietf.org/about/groups/iesg/statements/normative-informative-references/
.

Section 11.2

I'm having a really hard time seeing how RFC 1123 is classified as
informative given that we have an Updates: relationship to it -- that
would seem to require some familiarity with its contents in order to
know how to use this document properly.  The same question might be
asked of RFC 1536 as well.

The link for [FRAG_POISON] did not lead me to the desired document, just
a generic homepage.

I couldn't locate anything about [DAI21]; is there additional
information that can be provided to assist in locating it?

It is slightly surprising that RFC 1034 does not need to be normative,
but it seems that the only direct references are to RFC 1035 rather
than RFC 1034, so informative would be correct.

The "SHOULD enable TFO" might make RFC 7413 into a normative reference
(see the IESG statement linked above).

Since [TDNS] is available at
https://www.isi.edu/~johnh/PAPERS/Zhu15b.html should we include that
link?

Likewise, [TOYAMA] seems to refer to
https://archive.nanog.org/meetings/nanog32/presentations/toyama.pdf, and
[VERISIGN] to
https://indico.dns-oarc.net/event/20/contributions/270/attachments/249/465/rtt_Oarc_ThomasWessels_v1.pdf

Appendix A.2

   This Informational document [RFC1536] states UDP is the "chosen
   protocol for communication though TCP is used for zone transfers."
   That statement should now be considered in its historical context and
   is no longer a proper reflection of modern expectations.

That statement is explicitly updated by this document, which might be
worth noting.

NITS

Section 3

   of the Internet DNS as ever, if not more so.  Furthermore, there has
   been serious research that argues connection-oriented DNS
   transactions may provide security and privacy advantages over UDP
   transport [TDNS].  In fact, the standard for DNS over TLS [RFC7858]
   is just this sort of specification.  [...]

This might be some editing remnants -- "just this sort of specification"
seems to be trying to refer to some previous discussion of a type of
specification, but there isn't any in the preceding text anymore.
([TDNS] does propose a specification for a DNS-over-TLS type scheme,
from a very quick glance, or maybe there is an intent to connect
"specification" with "[of a] connection-oriented DNS transaction
[protocol]".)