Re: [Doh] some privacy ponderings wrt HTTPs and plain DNS

Sara Dickinson <sara@sinodun.com> Mon, 18 June 2018 16:36 UTC

From: Sara Dickinson <sara@sinodun.com>
Message-Id: <0D08F629-1719-440D-B4B4-A474CF90B865@sinodun.com>
Content-Type: multipart/alternative; boundary="Apple-Mail=_9403FD35-809D-4F4B-A55D-44EDE8A7685B"
Mime-Version: 1.0 (Mac OS X Mail 11.4 \(3445.8.2\))
Date: Mon, 18 Jun 2018 17:36:11 +0100
In-Reply-To: <CAOdDvNrnfxxQ__G_kKn4Fe4jcwcQUZfOb4aNAE6+bjvSrfLcmA@mail.gmail.com>
Cc: nusenu <nusenu-lists@riseup.net>, DoH WG <doh@ietf.org>, bert hubert <bert.hubert@powerdns.com>
To: Patrick McManus <pmcmanus@mozilla.com>
References: <20180618112116.GB9195@server.ds9a.nl> <d137a136-d456-8de2-b682-512edd86b1f7@riseup.net> <E4082C8A-8D16-4F13-82ED-C9F68F66A2A1@sinodun.com> <CAOdDvNrnfxxQ__G_kKn4Fe4jcwcQUZfOb4aNAE6+bjvSrfLcmA@mail.gmail.com>
X-Mailer: Apple Mail (2.3445.8.2)
X-BlackCat-Spam-Score: 4
Archived-At: <https://mailarchive.ietf.org/arch/msg/doh/hPxDDOYJ7lJliGFMhnvGnCARXWE>


> On 18 Jun 2018, at 15:25, Patrick McManus <pmcmanus@mozilla.com> wrote:
> 
> In the interest of moving forward, given that we've completed WGLC a couple of times, I'm going to incorporate this as an IETF last-call comment and include text as part of that cycle. There is surely something useful that can be written.

Erm, did you mean IESG not IETF LC?

> 
> in the big picture, this is something that I would like to see handled by BCP56-bis, but doesn't seem to be addressed at all there. That's probably a good comment for that draft.
> 
> some more specific things to consider when writing text that come to my mind:
> * content negotiation is something DoH clients will want to do as a normal part of HTTP, and most of this information is related, e.g. accept-language
> * many headers leak state but are tied to http features that clients might choose to participate in anyhow (e.g. cache revalidation headers)
> * versioning information helps prevent bugs from becoming de facto features forever.
> * authentication is a normal thing to do
> * there are lots of headers that we don't know about - and that's ok. Let's provide advice rather than be a whitelisting firewall.
> * entities other than DoH clients and DoH servers participate in a DoH exchange (proxies, lb, etc) so be aware that there is an asymmetry between send and recv in this spec. (i.e. you can say a DoH client cannot send Foo, but you cannot say that a DoH server receiving Foo is a protocol error because generic HTTP in between might add it.)

Thanks for the list… it certainly provides context to understand that HTTP as a substrate brings a significant overhead (as well as all the benefits of such bells and whistles).

So I think what I am hearing is that using DoH effectively comes at the price of accepting that additional overhead, and all its potential privacy/tracking issues, because in practice it is impossible or impractical to have ‘bare’ DoH that transmits only as much information about the user as, for example, typical DNS-over-TLS?

I see the trade-offs here; I just want to make sure I understood correctly.
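
To make ‘bare’ concrete, here is a rough sketch of what a DoH GET request might look like if the client sent nothing beyond what the exchange needs. It is illustrative only. The function names and the "/dns-query" path are assumptions made for the example, and the generic "DoH client" agent string is simply the suggestion quoted further down this thread. The encoding follows the DoH draft (later published as RFC 8484): a wire-format query with ID 0, base64url-encoded without padding in the "dns" parameter.

```python
import base64
import struct
from typing import Dict, Tuple


def build_dns_query(name: str, qtype: int = 1) -> bytes:
    """Encode a single-question DNS query in wire format (RFC 1035)."""
    # Header: ID=0, flags=0x0100 (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A by default) followed by QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)


def build_minimal_doh_get(resolver_path: str, name: str) -> Tuple[str, Dict[str, str]]:
    """Return (request target, headers) for a DoH GET with no optional headers.

    Hypothetical helper for illustration; it builds the request but sends nothing.
    """
    wire = build_dns_query(name)
    # base64url without padding, as used for the "dns" query parameter.
    encoded = base64.urlsafe_b64encode(wire).rstrip(b"=").decode("ascii")
    headers = {
        "Accept": "application/dns-message",
        # Generic agent string (an assumption, taken from the quoted suggestion).
        "User-Agent": "DoH client",
    }
    # Deliberately absent: Cookie, Accept-Language, Referer, Authorization.
    return f"{resolver_path}?dns={encoded}", headers


if __name__ == "__main__":
    target, headers = build_minimal_doh_get("/dns-query", "www.example.com")
    print(target)
    print(headers)
```

Even this minimal form still exposes whatever the TLS and HTTP layers reveal on their own, and anything an intermediary adds in transit, so it should be read as a lower bound on the metadata rather than a guarantee.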

> 
> wrt Sara's specific comment about the differences between connection contexts - that's not a difference the core semantic layer of HTTP presents. From that pov HTTP is stateless. It's often not something the client interface exposes very well, and is certainly not necessarily an end-to-end property. So any text in that neighborhood probably needs to be along the lines of fyi.
> 
> So I think DoH will need to say something fairly generic along the lines of "think about the metadata tradeoffs and explicitly enable features you need".

So from a privacy point of view there is no clear way at the protocol level to separate DoH traffic from other HTTPS traffic? In other words any considerations of data minimisation would have to be framed in that context (i.e. DoH traffic can’t necessarily be isolated)?

Sara. 


> 
> 
> On Mon, Jun 18, 2018 at 9:47 AM, Sara Dickinson <sara@sinodun.com> wrote:
> 
> 
>> On 18 Jun 2018, at 13:49, nusenu <nusenu-lists@riseup.net> wrote:
>> 
>>> As noted, I don't know if this has a place in the draft, but I'd recommend
>>> DoH clients to:
>>> 
>>> * Set their Agent to 'DoH client', no matter what browser/library
>>> * Do not pass non-essential HTTP headers (like language)
>>> * Do not allow the DoH server to set cookies
>>> * Ponder TLS sessions resumption data settings
>>> * Think about all other ways in which HTTP can be tracked (HSTS?)
>>> 
>>> Thoughts?
>> 
>> Thanks for this, I find it rather important to avoid introducing new data 
>> collection opportunities (at the resolver) that haven't been there on plain DNS 
>> and support your recommendations to minimize data exposure.
>> 
>> Ideally all DoH clients look the same from the DoH server perspective.
>> This will never be completely the case but we should try to minimize
>> the DoH client fingerprintability.
>> 
>> I hope the minimization of the DoH client fingerprint becomes a mandatory part
>> in the document.
> 
> +1 on this. DNSOP and DPRIVE have both acknowledged that client identifiers in DNS messages directly compromise user privacy and have attempted to minimise or mitigate their use (for example see RFC7626, RFC7871 or draft-tale-dnsop-edns0-clientid). The DoH WG should be doing the same (or at least no less).
> 
>> 
>> This is also in-line with the spirit of RFC6973
>> Privacy Considerations for Internet Protocols
>> 
>> specifically sections 6.1 and 7.1
>> https://tools.ietf.org/html/rfc6973#section-6.1
>> https://tools.ietf.org/html/rfc6973#section-7.1
>> 
>> "What identifiers could be omitted or be made less
>> identifying while still fulfilling the protocol's goals?"
>> is always a good question to ask.
> 
> And given that the charter says 
> “The working group will analyze the security and privacy issues that
> could arise from accessing DNS over HTTPS.”
> 
> it suddenly strikes me that this draft doesn’t contain a Privacy Considerations section. I would suggest that one is added to address this issue and offer to help with text on that. 
> 
> On a technical note do we have 2 use cases to deal with?
> - one where dedicated connections are used for DoH (i.e. where only DoH requests are made)
> - one where DoH requests are intermingled on the same connection with existing traffic (which will most likely include headers already identifying the client)
> 
> Sara. 
> 
> 
> _______________________________________________
> Doh mailing list
> Doh@ietf.org
> https://www.ietf.org/mailman/listinfo/doh