Re: [Doh] some privacy ponderings wrt HTTPs and plain DNS

Sara Dickinson <sara@sinodun.com> Tue, 19 June 2018 15:04 UTC

From: Sara Dickinson <sara@sinodun.com>
Message-Id: <AC7EF4EF-17DA-4181-B123-D2F82BBDF1C9@sinodun.com>
Content-Type: multipart/alternative; boundary="Apple-Mail=_3AC9852F-CC1F-4182-A0A2-5EAA7264A86F"
Mime-Version: 1.0 (Mac OS X Mail 11.4 \(3445.8.2\))
Date: Tue, 19 Jun 2018 16:03:54 +0100
In-Reply-To: <CAOdDvNrKhV83ZmCX=KWHx49PtFVO2eTzY+GOxjEzEVd6Auj4Nw@mail.gmail.com>
Cc: nusenu <nusenu-lists@riseup.net>, DoH WG <doh@ietf.org>, bert hubert <bert.hubert@powerdns.com>
To: Patrick McManus <pmcmanus@mozilla.com>
References: <20180618112116.GB9195@server.ds9a.nl> <d137a136-d456-8de2-b682-512edd86b1f7@riseup.net> <E4082C8A-8D16-4F13-82ED-C9F68F66A2A1@sinodun.com> <CAOdDvNrnfxxQ__G_kKn4Fe4jcwcQUZfOb4aNAE6+bjvSrfLcmA@mail.gmail.com> <0D08F629-1719-440D-B4B4-A474CF90B865@sinodun.com> <CAOdDvNrKhV83ZmCX=KWHx49PtFVO2eTzY+GOxjEzEVd6Auj4Nw@mail.gmail.com>
X-Mailer: Apple Mail (2.3445.8.2)
X-BlackCat-Spam-Score: 4
Archived-At: <https://mailarchive.ietf.org/arch/msg/doh/BFYfWTrnxU0iHeGVqvbIMlGJ6lA>
Subject: Re: [Doh] some privacy ponderings wrt HTTPs and plain DNS


> On 18 Jun 2018, at 19:04, Patrick McManus <pmcmanus@mozilla.com> wrote:
>> On 18 Jun 2018, at 15:25, Patrick McManus <pmcmanus@mozilla.com> wrote:
>> 
>> In the interest of moving forward, given that we've completed wglc a couple of times, I'm going to incorporate this as a ietf last-call comment and include text as part of that cycle.. there is surely something useful that can be written
> 
> Erm, did you mean IESG not IETF LC?
> 
> 
> I think we're talking about the same thing. After the WG shepherd submits the doc to the iesg, they will issue a ietf-wide last call. We can make the additions we're talking about here as part of those inevitable updates.

Ok, but my point was going to be: doesn’t the decision on when this issue gets dealt with lie with the WG and chairs, not the document authors?

<snip>

> So I think what I am hearing here is that the use of DoH effectively comes at the price of accepting that additional overhead and all its potential privacy/tracking issues because in practice it is rather impossible and/or impractical to have ‘bare’ DoH that transmits only as much information about the user as, for example, typical DNS-over-TLS?
> 
> 
> more impractical than impossible - at the extreme, you would be building a mere tunnel rather than really building an HTTP application. But its totally appropriate to highlight that there are various tradeoffs that can be made by implementations. e.g. firefox will not accept server cookies right now on DoH transactions nor will it allow authentication. But, in a different circumstance, a DoH client might want to use cookie-drive auth totally reasonably (imagine a subscription based DoH service) - so its not something the protocol should prohibit but we can mention..

OK - so essentially, in this specification, DoH inherits the privacy qualities of HTTPS; it is not attempting to maintain or impose those of existing DNS over UDP/TCP/TLS wrt user identifiability.

Given that, here is some text as a proposed starting point for a Privacy Considerations section - it is intended mainly as an ‘analysis’ of the current situation and I fully expect the recommendations at the end to be the subject of debate :-)



Privacy Considerations
----------------------------

When considering how the use of DoH affects user privacy (for example, compared with DNS over UDP, TCP or TLS) it is helpful to follow the analysis in both RFC7626 (DNS Privacy Considerations) and RFC6973 (Privacy Considerations for Internet Protocols). 

With reference to Section 2.4 of RFC7626 “On the wire”:

The privacy expectations of a user of DoH are relatively straightforward:

+ DoH encrypts DNS traffic and requires authentication of the server. This clearly mitigates passive surveillance and active attacks attempting to divert DNS traffic to rogue servers.
+ The use of port 443 by default and the ability to intermingle DNS traffic with HTTP traffic on the same connection can prevent on-path devices from interfering with DNS operations and make DNS traffic analysis more difficult.

With reference to Section 2.5 of RFC7626 “In the Servers”:

HTTP and DNS are very different protocols. There exists a natural tension between 
* the wide practice in HTTP to use various headers to optimise HTTP connections, functionality and behaviour (which can facilitate user identification and tracking)
* and the fact that currently DNS is very tightly encoded and contains no standardized user identifiers (since they are acknowledged as compromising user privacy). 

DNS-over-TLS, for example, would normally contain no client identifiers in the DNS messages, and a resolver would see only a stream of DNS queries originating from a client IP address. By contrast, if DoH clients commonly include several HTTP headers in a DoH request (e.g. user-agent and accept-language), the resolver could attribute individual DNS requests not only to a specific end user device but to a specific application.
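
To make the identifiability point concrete, here is a minimal sketch of how a resolver could group requests by the headers it receives. The function name, the header values and the two example applications are all hypothetical, and hashing is just one simple way to compare header sets; no real resolver is claimed to work this way.

```python
import hashlib
from typing import Mapping


def client_fingerprint(headers: Mapping[str, str]) -> str:
    """Hash the header names and values a resolver would see on one request."""
    # Sort by lower-cased header name so ordering and case do not matter.
    canonical = "\n".join(
        f"{name.lower()}:{value}"
        for name, value in sorted(headers.items(), key=lambda kv: kv[0].lower())
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Two hypothetical applications on the same device (same client IP address).
browser = {
    "User-Agent": "ExampleBrowser/61.0",
    "Accept-Language": "en-GB,en;q=0.8",
    "Accept": "application/dns-message",
}
updater = {
    "User-Agent": "ExampleUpdater/2.3",
    "Accept": "application/dns-message",
}

# Two clients that send only the media type they accept.
bare_a = {"Accept": "application/dns-message"}
bare_b = {"Accept": "application/dns-message"}

# Rich headers separate the two applications; bare requests look alike.
distinguishable = client_fingerprint(browser) != client_fingerprint(updater)
indistinguishable = client_fingerprint(bare_a) == client_fingerprint(bare_b)
```

With only the client IP address to go on, the browser and the updater would be one stream of queries; with the headers, they fall into separate buckets.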

Additionally, depending on the client architecture, isolation of DNS queries from other HTTP traffic may or may not be feasible or desirable.  Depending on the use case, isolation of DNS queries from other HTTP traffic may or may not increase privacy. 

The picture for privacy considerations and user expectations here is complex and will require a more detailed analysis for each particular use case. At one extreme, there may be use cases that attempt to achieve parity with DNS-over-TLS from a privacy perspective at the cost of using no identifying headers; at the other, there may be use cases that provide feature-rich data flows where the low-level origin of the DNS query is easily identifiable.

As guidance, implementors should consider the following:

1.  RFC6973 discusses data minimisation in detail and says:

“Data minimization mitigates the following threats: surveillance, stored data compromise, correlation, identification, secondary use, and disclosure.”

2. Implementors should evaluate what identifiers could be omitted or be made less identifying while still fulfilling the protocol's goals. 

  a. Specifically, implementors SHOULD NOT use non-essential HTTP headers in DoH messages and SHOULD set the user agent string to ‘DoH client’.

  b. Implementations SHOULD only accept cookies or allow client authentication when it is required to fulfil the protocol's goals (e.g. a subscription service with user opt-in).
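
As a rough illustration of what 2a and 2b could look like in a client, the sketch below builds a GET request that carries only an Accept header and the fixed user agent string, and drops any cookies a server offers. It assumes the `dns` query parameter (base64url without padding) and the application/dns-message media type from the DoH draft. The endpoint URL and all function names are hypothetical, and nothing is sent on the network.

```python
import base64
import struct
from typing import Dict, Mapping, Tuple

DOH_ENDPOINT = "https://doh.example/dns-query"  # hypothetical server


def build_dns_query(name: str, qtype: int = 1) -> bytes:
    """Encode a single-question DNS query in wire format (default type A)."""
    # ID is 0, flags 0x0100 sets only RD, QDCOUNT is 1, other counts are 0.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    labels = [l for l in name.rstrip(".").encode("ascii").split(b".") if l]
    qname = b"".join(bytes([len(l)]) + l for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 is IN


def build_minimal_get(name: str) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) for a DoH GET carrying no optional headers."""
    wire = build_dns_query(name)
    dns_param = base64.urlsafe_b64encode(wire).rstrip(b"=").decode("ascii")
    headers = {
        "Accept": "application/dns-message",
        "User-Agent": "DoH client",  # the fixed string proposed in 2a
    }
    return f"{DOH_ENDPOINT}?dns={dns_param}", headers


def drop_cookies(response_headers: Mapping[str, str]) -> Dict[str, str]:
    """Discard Set-Cookie so no server-issued state is stored or replayed (2b)."""
    return {k: v for k, v in response_headers.items() if k.lower() != "set-cookie"}


url, request_headers = build_minimal_get("www.example.com")
kept = drop_cookies({"Content-Type": "application/dns-message", "Set-Cookie": "id=abc123"})
```

A client for a subscription service with user opt-in would presumably skip `drop_cookies` and add its authentication header; the point is only that the default path sends and stores nothing beyond what the query needs.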



Sara.