[dns-privacy] Benjamin Kaduk's No Objection on draft-ietf-dprive-rfc7626-bis-06: (with COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Wed, 07 October 2020 04:16 UTC

From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: "The IESG" <iesg@ietf.org>
Cc: draft-ietf-dprive-rfc7626-bis@ietf.org, dprive-chairs@ietf.org, dns-privacy@ietf.org, Brian Haberman <brian@innovationslab.net>
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <160204419484.9519.7742091087612533203@ietfa.amsl.com>
Date: Tue, 06 Oct 2020 21:16:34 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/dns-privacy/dDscXNUvw1g1AMxhxCbQ49KitwY>
Subject: [dns-privacy] Benjamin Kaduk's No Objection on draft-ietf-dprive-rfc7626-bis-06: (with COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-dprive-rfc7626-bis-06: No Objection

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-dprive-rfc7626-bis/



----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Section 1

   At the time of writing, almost all this DNS traffic is currently sent
   in clear (i.e., unencrypted).  However there is increasing deployment

nit: I think that "in the clear" is the term of art (add "the").

   Today, almost all DNS queries are sent over UDP [thomas-ditl-tcp].

It looks like
(https://mailarchive.ietf.org/arch/msg/dns-privacy/1pZL1FA57hzE1e09mQ2HMg2aWYY/)
Sara was going to follow up with the DITL authors to try and ascertain
whether "almost all queries" is still accurate for the "UDP" aspect,
though the IETF mailarchive search doesn't seem to find any more recent
traffic on that topic.  Do we know if anyone actually heard back about
this (or the "sent in [the] clear" a few lines previously)?
I do not pretend to have the expertise needed to judge how the changes
deployed by major browsers affect the statistics for "all DNS traffic"
(which presumably includes both stub-to-resolver and
resolver-to-authoritative).

   This has practical consequences when considering encryption of the
   traffic as a possible privacy technique.  Some encryption solutions
   are only designed for TCP, not UDP and new solutions are still
   emerging [I-D.ietf-quic-transport] [I-D.huitema-quic-dnsoquic].

[It looks like dnsoquic became draft-huitema-dprive-dnsoquic.]

Section 3

   multiple dynamic contexts of each device.  This document does not
   attempt such a complex analysis, instead it presents an overview of
   the various considerations that could form the basis of such an
   analysis.

nit: looks like a comma splice.

Section 4.1

   authentication or authorization of the client (resolver).  Due to the
   lack of search capabilities, only a given QNAME will reveal the
   resource records associated with that name (or that name's non-
   existence).  In other words: one needs to know what to ask for, in

I agree with Warren that this statement ("only [...] will reveal [...]
or that name's non-existence") is overly strong.

Section 4.2

   The DNS request includes many fields, but two of them seem
   particularly relevant for the privacy issues: the QNAME and the
   source IP address. "source IP address" is used in a loose sense of
   "source IP address + maybe source port number", because the port

In other contexts I've seen this combination referred to as the
"transport address".

   The QNAME is the full name sent by the user.  It gives information
   about what the user does ("What are the MX records of example.net?"
   means he probably wants to send email to someone at example.net,
   which may be a domain used by only a few persons and is therefore
   very revealing about communication relationships).  [...]

(editorial) something like not-a-secret-cabal.example might make the
example more visceral than example.net does.

   create more problems for the user.  Also, sometimes, the QNAME embeds
   the software one uses, which could be a privacy issue.  For instance,
   _ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.example.org.

(nit) I trust that this can be made into a complete sentence while
addressing Warren's more-substantive comment.

   There are also some BitTorrent clients that query an SRV record for
   _bittorrent-tracker._tcp.domain.example.

In a similar vein, I'm not sure what domain.example is supposed to
represent here -- the domain of the author of the BitTorrent client?

   Therefore, all the issues and warnings about collection of IP
   addresses apply here.  For the communication between the recursive

I mostly assume that this is intended to be a reference to the generic
concerns about "IP addresses are PII", etc., that one is ambiently
exposed to by reading enough about the Internet.  (There does not seem
to be previous discussion of "collection of IP addresses" in this
document, which would seem to indicate that it is not trying to refer
back to previous text.)  If so, perhaps an extra word or two would help
("all the standard issues and warnings", "all the generic issues and
warnings", etc.) clarify the intent of the reference.

   However, hiding does not always work.  Sometimes EDNS(0) Client
   subnet [RFC7871] is used (see its privacy analysis in
   [denis-edns-client-subnet]).  [...]

(nit) The wording here ("its privacy analysis") suggests that the
referenced document is an authoritative/official IETF position, but it
seems to be a blog post by a single individual.  Using "one" or "a"
rather than "its" would convey a less-authoritative connotation.

                                       In both cases, the IP address
   originating queries to the authoritative server is as sensitive as it
   is for HTTP [sidn-entrada].

I don't see how [sidn-entrada] supports the claim that end-user-adjacent
DNS client IP addresses are as sensitive as HTTP client IP
addresses; it mentions "sensitive" only twice (as "privacy-sensitive",
admittedly, applying to such IP addresses, but as an assertion without
justification) and "http" only in URLs (mostly in the references) and as
an example request.  It would feel more natural to use an IETF reference
here, as well -- e.g., RFC 7624 discusses correlating client IP
addresses with end users, RFC 7239 clearly covers privacy considerations
for sending client IP addresses in the "forwarded" header field, and
there are no doubt others -- though I do note the contents of the
paragraph after this one.

                     However, for both IPv4 and IPv6 addresses, it is
   important to note that source addresses are propagated with queries
   and comprise metadata about the host, user, or application that
   originated them.

(This "propagated with queries" is still contingent on EDNS(0) Client
Subnet from the previous paragraph, right?)

Section 4.2.1

   cache poisoning attacks by off-path attackers.  It is noted, however,
   that they are designed to just verify IP addresses (and should change
   once a client's IP address changes), they are not designed to
   actively track users (like HTTP cookies).

nit: comma splice.

Section 5.1

   not be.  When other protocols will become more and more privacy-aware
   and secured against surveillance (e.g., [RFC8446],
   [I-D.ietf-quic-transport]), the use of unencrypted transports for DNS
   may become "the weakest link" in privacy.  It is noted that at the
   time of writing there is on-going work attempting to encrypt the SNI
   in the TLS handshake [I-D.ietf-tls-sni-encryption].

This mention of encrypted "SNI" (now encrypted ClientHello) comes as a
bit of a non sequitur.  I suggest a bit of transition such as an
additional clause at the end of the sentence: ", which is one of the
last remaining non-DNS cleartext identifiers of a connection target".
(While the actual work itself has progressed to encrypting the entire
ClientHello, I think it's okay to focus the exposition here on the SNI,
as the relevant attribute.)

                                                         It can be noted
      that if the user selects a single resolver with a small client
      population (even when using an encrypted transport) it can
      actually serve to aid tracking of that user as they move across
      network environment.

I wonder if it is worth adding another clause at the end: ", and that an
attacker in a position to observe the moving user is likely also able to
observe the likely-unencrypted DNS queries from the resolver to the
authoritative servers".
Also, nit: "environments" plural.

Section 5.2

   Traffic analysis of unpadded encrypted traffic is also possible
   [pitfalls-of-dns-encryption] because the sizes and timing of
   encrypted DNS requests and responses can be correlated to unencrypted
   DNS requests upstream of a recursive resolver.

We could (but don't have to) note that effective padding policies remain
an open area of research.
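
For concreteness, here is a minimal sketch of one published policy: the
block-length padding recommended in RFC 8467 rounds each message up to a
multiple of a fixed block size (128 octets for queries, 468 for
responses).  The helper name padded_length is hypothetical, and the
sketch ignores the few octets of overhead added by the EDNS(0) Padding
option itself.

```python
# Block sizes recommended by RFC 8467 (block-length padding).
QUERY_BLOCK = 128
RESPONSE_BLOCK = 468


def padded_length(message_length: int, block: int) -> int:
    """Round message_length up to the next multiple of block.

    Hypothetical helper for illustration; it does not account for the
    overhead of the EDNS(0) Padding option header.
    """
    if message_length < 0 or block <= 0:
        raise ValueError("length must be >= 0 and block must be > 0")
    # Ceiling division without floating point.
    return -(-message_length // block) * block


if __name__ == "__main__":
    for size in (30, 100, 129):
        print(size, "->", padded_length(size, QUERY_BLOCK))
```

Under this policy, queries of 30 and 100 octets are indistinguishable by
size, since both are sent as 128 octets.  Timing is not hidden at all,
which is part of why the choice of policy is still an open question.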

Section 6.1.1.2

   o communicate clearly the change in default to users

I think this is intending to say "when the default application resolver
changes away from the system resolver", but the present text is perhaps
a little unclear about what "the change" is referring to.

Section 6.1.2

                                                                Even if
   encrypted DNS such as DoH or DoT is used, unless the client has been
   configured in a secure way with the server identity, an active
   attacker can impersonate the server.  [...]

More than the server identity is needed -- the credentials or trust
anchor needed to authenticate a peer as that identity are also needed.
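
As a rough illustration of the distinction, here is a sketch using
Python's standard ssl module.  The function names, the optional CA
bundle path, and the use of port 853 for DoT are assumptions made for
the example, not anything taken from the draft.  The client needs both
the name it expects to reach and the trust anchors against which a
certificate for that name is validated.

```python
import socket
import ssl
from typing import Optional


def make_dot_context(cafile: Optional[str] = None) -> ssl.SSLContext:
    """Build a TLS context that validates the chain and checks the name.

    cafile is the trust anchor set; None falls back to the system store.
    """
    ctx = ssl.create_default_context(cafile=cafile)
    # Both settings are already the defaults of create_default_context();
    # they are repeated here to show that both halves are required.
    ctx.check_hostname = True
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def connect_dot(ctx: ssl.SSLContext, server_ip: str, server_name: str,
                port: int = 853) -> ssl.SSLSocket:
    """Connect to server_ip and authenticate the peer as server_name."""
    sock = socket.create_connection((server_ip, port), timeout=5)
    try:
        # server_name is the configured identity; the handshake fails if
        # the certificate does not chain to a trust anchor or does not
        # match that name.
        return ctx.wrap_socket(sock, server_hostname=server_name)
    except Exception:
        sock.close()
        raise
```

Knowing only the server name does not help if the trust anchors can be
supplied by the attacker, and having the trust anchors does not help if
no name is checked.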

Section 6.1.3

   User privacy can also be at risk if there is blocking (by local
   network operators or more general mechanisms) of access to remote
   recursive servers that offer encrypted transports when the local
   resolver does not offer encryption and/or has very poor privacy
   policies.  [...]

I suggest adding "e.g." before "when the local resolver" to avoid giving
the impression that this is an exhaustive list.

   This is a form of Rendezvous-Based Blocking as described in
   Section 4.3 of [RFC7754].  Such blocklists often include servers know
   to be used for malware, bots or other security risks.  In order to
   prevent circumvention of their blocking policies, some networks also
   block access to resolvers with incompatible policies.

Perhaps this is touching too much on the controversial topic, but it
seems to me that the networks in question "attempt to block access";
whether or not they fully and reliably succeed at doing so is not clear.
(See also the near-impossibility of closing covert channels in
protocols.)

   It is also noted that attacks on remote resolver services, e.g., DDoS
   could force users to switch to other services that do not offer
   encrypted transports for DNS.

nit: comma after DDoS.

Section 6.1.4.2

   Some implementations have, in fact, chosen to restrict the use of the
   'User-Agent' header so that resolver operators cannot identify the
   specific application that is originating the DNS queries.

With similar disclaimer as previously, perhaps "trivially identify"?
There are other fingerprinting techniques possible even at, e.g., the TLS
layer (that we discussed previously in this document!), which still
apply to DoH.

Section 6.2

   This "protection", when using a large resolver with many clients, is
   no longer present if ECS [RFC7871] is used because, in this case, the
   authoritative name server sees the original IP address (or prefix,
   depending on the setup).

(side note) this has always been a bit confusing to me -- ECS is "client
subnet", not "client address", and I don't really understand why someone
would set the prefix length to the full 128 (or 32) bits of the address.
Are there really a lot of non-truncated client addresses being sent
around like this?  How did that happen?
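
For reference, this is a minimal sketch of the truncation I would have
expected a resolver to apply before filling in the ECS option, using
Python's standard ipaddress module.  The function name ecs_prefix is
hypothetical; the /24 and /56 limits are the maximum source prefix
lengths that RFC 7871 recommends for IPv4 and IPv6 respectively.

```python
import ipaddress

# Maximum source prefix lengths recommended by RFC 7871.
MAX_V4_PREFIX = 24
MAX_V6_PREFIX = 56


def ecs_prefix(client_address: str) -> str:
    """Return the truncated network that would be sent in ECS.

    Hypothetical helper for illustration only.
    """
    addr = ipaddress.ip_address(client_address)
    prefix_len = MAX_V4_PREFIX if addr.version == 4 else MAX_V6_PREFIX
    # strict=False masks off the host bits instead of raising an error.
    net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
    return str(net)


if __name__ == "__main__":
    print(ecs_prefix("192.0.2.57"))
    print(ecs_prefix("2001:db8:1234:5678::1"))
```

With truncation like this the authoritative server sees only
192.0.2.0/24 or 2001:db8:1234:5600::/56, not the full client address.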

                                                                    So,
   requests to a given ccTLD may go to servers managed by organizations
   outside of the ccTLD's country.  End users may not anticipate that,
   when doing a security analysis.

(Is this a "for example"?  It seems plausibly relevant for non-cc TLDs
as well.)

Section 7.1

   The IAB privacy and security program also have a work in progress
   [RFC7624] that considers such inference-based attacks in a more
   general framework.

I do not really think the final RFC constitutes a "work in progress"
anymore.

Section 8

   Passive DNS systems [passive-dns] allow reconstruction of the data of
   sometimes an entire zone.  They are used for many reasons -- some
   good, some bad.  Well-known passive DNS systems keep only the DNS
   responses, and not the source IP address of the client, precisely for
   privacy reasons.  Other passive DNS systems may not be so careful.

Perhaps not so well-intentioned, either...

   The revelations from the Edward Snowden documents, which were leaked
   from the National Security Agency (NSA) provide evidence of the use

nit: comma after "(NSA)".

Section 9

   To our knowledge, there are no specific privacy laws for DNS data, in
   any country.  Interpreting general privacy laws like
   [data-protection-directive] or GDPR [10] applicable in the European
   Union in the context of DNS traffic data is not an easy task, and we
   do not know a court precedent here.  See an interesting analysis in
   [sidn-entrada].

This text is essentially unchanged since RFC 7626; did we do much of a
search for whether the past five years have brought about changes in the
legal landscape?