Re: [DNSOP] Benjamin Kaduk's Discuss on draft-ietf-dnsop-dns-tcp-requirements-13: (with DISCUSS and COMMENT)

"Wessels, Duane" <dwessels@verisign.com> Mon, 08 November 2021 22:35 UTC

From: "Wessels, Duane" <dwessels@verisign.com>
To: Benjamin Kaduk <kaduk@mit.edu>
CC: The IESG <iesg@ietf.org>, "draft-ietf-dnsop-dns-tcp-requirements@ietf.org" <draft-ietf-dnsop-dns-tcp-requirements@ietf.org>, "dnsop-chairs@ietf.org" <dnsop-chairs@ietf.org>, "dnsop@ietf.org" <dnsop@ietf.org>, Suzanne Woolf <suzworldwide@gmail.com>
Date: Mon, 08 Nov 2021 22:35:21 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/_jxdeYWkvmmWw6fJqVCqxacs_YI>

Hi Ben, thank you for the detailed review.  It has taken me a while to work through
all of your comments and suggestions, but hopefully this addresses them sufficiently.



> On Oct 25, 2021, at 3:51 PM, Benjamin Kaduk via Datatracker <noreply@ietf.org> wrote:
> 
> 
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
> 
> (1) This should be pretty easy to resolve, but this text from §4.4
> does not seem to match up with the referenced document:
> 
>   The use of TLS places even stronger operational burdens on DNS
>   clients and servers.  Cryptographic functions for authentication and
>   encryption requires additional processing.  Unoptimized connection
>   setup takes two additional round-trips compared to TCP, but can be
>   reduced with TCP Fast Open, TLS session resumption [RFC8446] and TLS
>   False Start [RFC7918].
> 
> Two additional round trips was true of TLS 1.2 and prior versions, but
> as of TLS 1.3 the application data from the client can be sent after
> only 1 round trip, accompanying the client Finished (and authentication
> messages, if in use).  Given the nature of the rest of the sentence, we
> might want to specifically mention TLS 1.3 as an improvement over TLS
> 1.2, but there are probably a number of ways that we could fix it.  Note
> additionally that for TLS 1.3, session resumption is not a reduction in
> the number of round trips unless 0-RTT data is used (but AFAIK there is
> not a published application profile specifying acceptable DNS content
> for TLS 0-RTT data, so use of TLS 0-RTT data for DNS is forbidden), but
> is still an efficiency gain due to the reduced number of cryptographic
> operations (including certificate validation).

Is this better?

   The use of TLS places even stronger operational burdens on DNS
   clients and servers.  Cryptographic functions for authentication and
   encryption require additional processing.  Unoptimized connection
   setup with TLS 1.3 [RFC8446] takes one additional round-trip compared
   to TCP.  Connection setup times can be reduced with TCP Fast Open
   and TLS False Start [RFC7918].  TLS session resumption does not
   reduce round-trip latency because no application profile for use of
   TLS 0-RTT data with DNS has been published at the time of this
   writing.  However, TLS session resumption can reduce the number of
   cryptographic operations.


> (2) Trivial to address, but the section heading for Appendix A.8
> references RFC 3326 (The Reason Header Field for the Session Initiation
> Protocol (SIP)), not RFC 3226 (DNSSEC and IPv6 A6 aware server/resolver
> message size requirements)

Thanks, this has been corrected.


> 
> 
> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
> 
> This document targets BCP status, while proposing an Updates:
> relationship with an Internet Standard and an Informational document.
> As mentioned in
> https://mailarchive.ietf.org/arch/msg/last-call/oMhqRigr4nbRSMll5NY4zXpZu20/
> there is some motivation for having updates to standards-track documents
> occur via other standards-track documents, but given that this document
> is mostly about giving operational guidance for DNS-over-TCP usage,
> there seems to be some argument that BCP is a more appropriate status
> for it.  However, I do wonder if there is some existing BCP that this
> document would become part of, or if it would need to be given a new BCP
> number.  It's not entirely clear how much scope there is for future
> additions that would become part of that same BCP, and whether a BCP
> number is truly needed in that scenario.  It would be good to hear
> others' thoughts on this topic, especially in the form of references to
> previous WG discussion.
> 
> I made a pull request on github with some editorial suggestions, most of
> which by volume are classifying the "Standards Track" documents in
> Appendix A properly as Internet Standard or Proposed Standard (there
> don't seem to be any lingering Draft Standards in the list):

This has been merged.

> 
> Section 2.4
> 
>   headers.  Unfortunately, it is quite common for both ICMPv6 and IPv6
>   extension headers to be blocked by middleboxes.  According to
>   [HUSTON] some 35% of IPv6-capable recursive resolvers were unable to
>   receive a fragmented IPv6 packet.  [...]
> 
> I looked through [HUSTON] and wasn't able to find this "35%" figure.
> There is a remark that "37% of endpoints used IPv6-capable DNS resolvers
> that were incapable of receiving a fragmented IPv6 response", but that
> seems to be a somewhat different statement in addition to a somewhat
> different percentage value.  In particular, the sampling domain for the
> [HUSTON] statement as written appears to be "all endpoints", but the
> statement in this draft uses the sampling domain of "IPv6-capable
> recursive resolvers".  However, the corresponding calculation in
> [HUSTON] looks to be using "IPv6-capable resolvers" as the sampling
> domain, just as this document states, which suggests that the error in
> phrasing is in [HUSTON] and not in this document.

The 35% figure is derived from this statement in the reference:

   We saw 10,115 individual IPv6 addresses used by IPv6-capable recursive
   resolvers.  Of this set of resolvers, we saw 3,592 resolvers that
   consistently behaved in a manner that was consistent with being unable
   to receive a fragmented IPv6 packet.

3592/10115 is 35.51% and I took the liberty of rounding down rather than up.
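
For anyone who wants to check the arithmetic, here is a minimal sketch in
Python.  The two counts are the ones quoted above from [HUSTON]; the
variable names are illustrative only and do not come from the reference.

```python
# Counts quoted above from [HUSTON]; the variable names are illustrative.
total_resolvers = 10115    # IPv6 addresses used by IPv6-capable recursive resolvers
fragment_failures = 3592   # resolvers apparently unable to receive fragmented IPv6

share = 100 * fragment_failures / total_resolvers
print(f"{share:.2f}%")     # share to two decimal places
print(f"{int(share)}%")    # truncated (rounded down) to a whole percent
```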


> 
> Section 3
> 
> As the directorate review noted, it's a little surprising to see the
> OLD/NEW formulation for one of the updates to §6.1.3.2 of RFC 1123 but
> not the others.
> 
> In particular, using OLD/NEW would allow us to implicitly reiterate that
> the "MUST send a UDP query first" requirement for non-zone-transfers
> is no longer present (by virtue of the update made by RFC 7766).

This has been changed as noted in my reply to John Scudder.

> 
>   *  Recursive servers (or forwarders) MUST support and service all TCP
>      queries so that they do not prevent large responses from a TCP-
>      capable server from reaching its TCP-capable clients.
> 
> This might benefit from a bit of unpacking.  I think that "MUST support
> and service ... queries" refers to the stub/recursive side of things,
> with "responses from a TCP-capable server" would refer to the
> recursive/authoritative side.  But the rest of the sentence seems to be
> assuming that if TCP is used on one side then it is used on the other
> side, so that limitations of TCP use on one side do not carry over to
> the other.  However, I didn't think that there was a requirement on
> recursives to go forward with TCP for queries received over TCP, so I'm
> not entirely sure what the actual guidance here is intended to be.

Agreed, hopefully this is better:

   o  Authoritative servers MUST support and service TCP for receiving
      queries, so that resolvers can reliably receive responses that are
      larger than what fits in a single UDP packet.

   o  Recursive servers (and forwarders) MUST support and service TCP
      for receiving queries, so their TCP-capable clients can reliably
      receive responses that are larger than what fits in a single UDP
      packet.

   o  Recursive servers (and forwarders) MUST support TCP for sending
      queries, so that they can retry truncated UDP responses as
      necessary.
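
As a rough illustration of the third bullet, the sketch below shows one way
a client might retry a truncated UDP response over TCP.  It is not taken
from the draft or from any particular implementation: the function names
are hypothetical, it assumes an IPv4 server on port 53, and it omits the
checks a real resolver needs (matching the query ID, validating the source
of the reply, retrying on timeout).

```python
import socket
import struct

TC_FLAG = 0x0200  # TC (truncation) bit in the DNS header flags word, RFC 1035


def is_truncated(message: bytes) -> bool:
    """Return True if the TC bit is set in a DNS message header."""
    if len(message) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & TC_FLAG)


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with the two-byte length field used over TCP."""
    return struct.pack("!H", len(message)) + message


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed mid-message")
        buf += chunk
    return buf


def query(server: str, wire_query: bytes, timeout: float = 3.0) -> bytes:
    """Send over UDP first; retry over TCP if the reply is truncated."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.settimeout(timeout)
        udp.sendto(wire_query, (server, 53))
        reply, _ = udp.recvfrom(65535)
    if not is_truncated(reply):
        return reply
    # TC bit set: repeat the same query over TCP to get the full response.
    with socket.create_connection((server, 53), timeout=timeout) as tcp:
        tcp.sendall(frame_for_tcp(wire_query))
        (length,) = struct.unpack("!H", _recv_exact(tcp, 2))
        return _recv_exact(tcp, length)
```

The retry only works if the server on the other end accepts TCP, which is
the point of the first two bullets.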


> 
> Section 4.2
> 
>   DNS server software SHOULD provide a configurable limit on the total
>   number of established TCP connections.  If the limit is reached, the
>   application is expected to either close existing (idle) connections
>   or refuse new connections.  Operators SHOULD ensure the limit is
>   configured appropriately for their particular situation.
> 
> I think that one of the directorate reviews touched on this topic, but I
> wonder if we can give more guidance on what factors of a particular
> situation might be relevant for determining what is appropriate.  In
> this case, that might include the number of requests the hardware is
> capable of serving and the number of requests expected from legitimate
> clients; we do seem to provide a bit more detail in the following
> paragraph (not quoted here) regarding "number and diversity of users"

How about this?

   DNS server software SHOULD provide a configurable limit on the total
   number of established TCP connections.  If the limit is reached, the
   application is expected to either close existing (idle) connections
   or refuse new connections.  Operators SHOULD ensure the limit is
   configured appropriately for their particular situation, which
   includes factors such as the number of users or clients, typical
   traffic levels, and hardware resource constraints.

   DNS server software MAY provide a configurable limit on the number of
   established connections per source IP address or subnet.  This can be
   used to ensure that a single or small set of users cannot consume all
   TCP resources and deny service to other users.  Operators SHOULD
   ensure this limit is configured appropriately, based on their number
   and diversity of users, and whether users connect from unique IP
   addresses or through a shared Network Address Translator.
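
To make the two limits concrete, here is a minimal sketch of how server
software might track them.  The class and method names are hypothetical
and the structure is an assumption, not a description of any existing DNS
server.  It counts per source IP address only; a per-subnet limit would
need to map each address to its prefix first.

```python
from collections import Counter


class ConnectionLimiter:
    """Track a global and a per-source cap on established TCP connections."""

    def __init__(self, max_total: int, max_per_source: int):
        self.max_total = max_total
        self.max_per_source = max_per_source
        self.per_source = Counter()  # source IP -> open connection count
        self.total = 0

    def try_accept(self, source_ip: str) -> bool:
        """Return True and count the connection if both limits allow it."""
        if self.total >= self.max_total:
            return False  # global limit reached: refuse (or close idle ones)
        if self.per_source[source_ip] >= self.max_per_source:
            return False  # this source already holds its share
        self.per_source[source_ip] += 1
        self.total += 1
        return True

    def release(self, source_ip: str) -> None:
        """Record that a connection from source_ip has closed."""
        if self.per_source[source_ip] > 0:
            self.per_source[source_ip] -= 1
            self.total -= 1
            if self.per_source[source_ip] == 0:
                del self.per_source[source_ip]
```

Note that clients behind a shared Network Address Translator all appear as
one source here, which is why the per-source value needs to reflect how
users actually connect.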


> 
> Section 8
> 
> There's a lot of good advice interspersed in the main body text already;
> thank you for that!
> 
> The discussion in §4.1 suggests ("SHOULD") to share a TFO server key
> amongst servers in a server farm, but this introduces the usual security
> considerations for a group-shared symmetric key.  The highlights are
> that any member of the group can impersonate any other member, and
> compromise of one machine compromises all members' use of the key.
> While there's not a great fully generic treatment of these issues in the
> RFC series that I know of (yet, at least), I've seen RFC 4046 cited for
> it sometimes, and draft-ietf-core-oscore-groupcomm has a section on
> "security of the group mode" that also has some overlap with the
> relevant considerations for sharing TFO keys.

I feel like this point should’ve been brought up in the TFO RFC (7413),
rather than this document.  Section 6.3.4 talks about server farms but doesn’t
mention security concerns about sharing keys.

Perhaps it would be appropriate in this document to say that server clusters
should either use the same TFO server key (as recommended by 7413 sec 6.3.4),
or just disable TFO?


> 
> In a similar vein, in §6 we again SHOULD-level recommend that
> applications capturing network packets do TCP segment reassembly in
> order to defeat obfuscation techniques involving TCP segmentation.  I am
> happy to see that we go on to caution against resource exhaustion
> attacks while doing so, but have two related comments: first, that
> caution might merit mention again here, and second, that we should note
> (either here or there) that when applying resource limits, there's a
> tradeoff between allowing service and allowing some attacks to succeed.
> Giving up on segmentation reassembly due to resource usage means that a
> potential attack could succeed, but dropping streams where segmentation
> recovery uses excess resources might deny legitimate service.

Is this sort of what you had in mind?

   As mentioned in Section 6, applications that implement TCP stream
   reassembly need to limit the amount of memory allocated to connection
   tracking.  A failure to do so could lead to a total failure of the
   logging or monitoring application.  Imposition of resource limits
   creates a tradeoff between allowing some stream reassembly to
   continue and allowing some evasion attacks to succeed. 
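
The tradeoff can be illustrated with a small sketch of a memory-bounded
reassembly buffer.  This is an assumption-laden toy, not how any capture
tool is known to work: the names are hypothetical, payloads are assumed to
arrive in order, and TCP sequence numbers are ignored.  When the byte
budget is exceeded, the least recently updated stream is dropped.

```python
from collections import OrderedDict


class ReassemblyBuffer:
    """Hold per-stream payload bytes under a fixed total memory budget."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.used = 0
        self.streams = OrderedDict()  # stream key -> bytearray, oldest first
        self.evicted = 0              # streams dropped to stay within budget

    def add_segment(self, key, payload: bytes) -> None:
        """Append payload to a stream, evicting old streams if over budget."""
        buf = self.streams.pop(key, None)
        if buf is None:
            buf = bytearray()
        buf += payload
        self.streams[key] = buf       # re-insert as most recently updated
        self.used += len(payload)
        while self.used > self.max_bytes and self.streams:
            _, dropped = self.streams.popitem(last=False)
            self.used -= len(dropped)
            self.evicted += 1
```

Every evicted stream is one the monitor can no longer inspect, so a smaller
budget protects the monitoring application at the cost of letting more
segmented traffic go unexamined.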


> 
> We might also consider reiterating that the core DNS over TCP security
> considerations (RFC 1035, ???) continue to apply.

1035 doesn’t have a lot to say, but maybe you are thinking about what’s
in section 4.2.2?

Even so, this document is meant to be operational requirements and I suspect
you are maybe thinking of protocol/implementation requirements, which are
covered by RFC 7766?


> 
> Clients that keep state about whether a given server supports TCP (per
> discussion in §4.1) might be susceptible to an attacker that is on-path
> in one location disrupting TCP in that location and causing the client
> to store state that a given server does not support TCP, when TCP
> connections from a different location, where the attacker is not on
> path, would succeed.

The opening paragraph of section 4.1 has been updated due to other comments
on this same topic.  It now reads:

   Resolvers and other DNS clients should be aware that some servers
   might not be reachable over TCP.  For this reason, clients MAY track
   and limit the number of TCP connections and connection attempts to a
   single server.  Reachability problems can be caused by network
   elements close to the server, close to the client, or anywhere along
   the path between them.  Mobile clients that cache connection failures
   MAY do so on a per-network basis, or MAY clear such a cache upon
   change of network.

Does that address your concern?
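
As a sketch of the per-network caching described in that paragraph, a
client could key its record of TCP failures on both the network and the
server.  Everything here is hypothetical: the class name, the
`network_id` strings, and in particular the expiry time, which the draft
text does not specify.

```python
import time


class TcpFailureCache:
    """Remember TCP connection failures per (network, server) pair."""

    def __init__(self, ttl_seconds: float = 300.0, clock=time.monotonic):
        self.ttl = ttl_seconds        # assumed expiry; not from the draft
        self.clock = clock
        self._failures = {}           # (network_id, server) -> expiry time

    def record_failure(self, network_id: str, server: str) -> None:
        self._failures[(network_id, server)] = self.clock() + self.ttl

    def tcp_believed_unreachable(self, network_id: str, server: str) -> bool:
        expiry = self._failures.get((network_id, server))
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self._failures[(network_id, server)]
            return False
        return True

    def clear(self) -> None:
        """Alternative to per-network keys: forget everything on network change."""
        self._failures.clear()
```

With this shape, a failure caused by an on-path attacker on one network
does not stop the client from trying TCP to the same server from another
network.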


> 
>   short-lived DNS transactions over TCP may pose challenges.  In fact,
>   [DAI21] details a class of IP fragmentation attacks on DNS
>   transactions if the IP ID field can be predicted and a system is
>   coerced to fragment rather than retransmit messages.  [...]
> 
> I suggest more detail on the "IP ID field" (including IPv4/v6
> differences).

Thanks, is this better?

   In fact,
   [DAI21] details a class of IP fragmentation attacks on DNS
   transactions if the IP Identifier field (16 bits in IPv4 and 32 bits
   in IPv6) can be predicted and a system is coerced to fragment rather
   than retransmit messages.


> 
> Section 9
> 
>   being queried).  DNS over TLS or DTLS is the recommended way to
>   achieve DNS privacy.
> 
> Is it really the (sole) recommended way?  It certainly suffices, but
> what is the status of DoH/DoQ?  Perhaps "DNS over TLS or DTLS serves to
> provide DNS privacy" optionally followed by a note about DoH or "other
> mechanisms" in general.  (May be superseded by Roman's Discuss.)

Updated:

   A number of protocols have recently been developed
   to provide DNS privacy, including DNS over TLS [RFC7858], DNS over
   DTLS [RFC8094], and DNS over HTTPS [RFC8484], with even more on the way.


> 
> Section 11.1
> 
> I recommend re-review of the classification of the references.
> Just because a reference is on the standards-track does not mean that we
> must reference it normatively -- e.g., RFC 1995/1996 are mentioned only
> in the listing of "other standards related to DNS transport over TCP",
> but they are not required reading in order to understand and implement this
> document.  See
> https://www.ietf.org/about/groups/iesg/statements/normative-informative-references/

Thanks, I’ve moved the references only mentioned in the appendix to the Informative section.

DW