[Doh] some privacy ponderings wrt HTTPs and plain DNS

bert hubert <bert.hubert@powerdns.com> Mon, 18 June 2018 11:21 UTC

To: doh@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/doh/vHjITrOMhWSdrozGFe4-eGNMEJc>
List-Id: DNS Over HTTPS <doh.ietf.org>

Hi everyone,

I'm happy with the progress in -11 and hope we can see this in standardized
production soon!

One thing I would like to note, though I'm not sure it has a place in the
draft: the difference in privacy properties between plain DNS and DNS over
HTTPS.

DNS is encoded so tightly that there is almost no variation in queries
coming from stub resolvers.  Your esp8266, fridge, microwave and iPhone X
all send out bit for bit identical queries, and this is a wonderful thing.
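
To make the "tightly encoded" point concrete, here is a minimal Python
sketch of a stub-style A query in RFC 1035 wire format. The function name
build_query and the fixed ID of 0 are assumptions for illustration; real
stubs pick a random 16-bit ID and some add EDNS options, so "identical"
holds for everything except those few fields.

```python
import struct

def build_query(name: str, qtype: int = 1, qid: int = 0) -> bytes:
    """Encode a single-question DNS query (RFC 1035 wire format)."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A) and QCLASS (1 = IN)
    return header + qname + struct.pack("!HH", qtype, 1)

# Two different "devices" asking the same question emit the same bytes.
fridge = build_query("example.com")
phone = build_query("example.com")
print(len(fridge), fridge == phone)
```

There is simply no room in these few dozen bytes for a vendor, an OS or a
version string.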

HTTPS however can be highly identifiable.  Over the course of a week, over
4000 different User-Agent strings visited https://ds9a.nl/ and
https://powerdns.org/, for example (and this is excluding bots). This allows
DoH servers to tell apart the devices in a household easily, since most of
them will have a unique string.
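
As a rough illustration of how little effort this takes on the server side,
the following Python sketch groups queries by (IP, User-Agent) instead of by
IP alone. The log entries, names and the split_by_device helper are all
made up for the example; they are not taken from any real server.

```python
from collections import defaultdict

# Hypothetical DoH server log entries: (client IP, User-Agent, queried name).
log = [
    ("192.0.2.10", "Mozilla/5.0 (iPhone; CPU iPhone OS 11_4 like Mac OS X)",
     "news.example"),
    ("192.0.2.10", "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Firefox/60.0",
     "bank.example"),
    ("192.0.2.10", "Mozilla/5.0 (iPhone; CPU iPhone OS 11_4 like Mac OS X)",
     "chat.example"),
]

def split_by_device(entries):
    """Group queried names by (IP, User-Agent) rather than by IP alone."""
    devices = defaultdict(list)
    for ip, agent, qname in entries:
        devices[(ip, agent)].append(qname)
    return dict(devices)

per_device = split_by_device(log)
addresses = {ip for ip, _, _ in log}
print(len(addresses), "address,", len(per_device), "distinguishable devices")
```

With plain DNS the same three queries would all collapse into the single
household address.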

In addition, HTTP can carry further identifying headers like accepted
languages (Accept-Language), HSTS settings (perhaps) or even cookies.

It has also been observed that TLS session resumption data provides a way to
semi-persistently track individual TLS originators (but I don't know for how
long).
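
For what it is worth, a client written in Python could at least decline
session tickets, as sketched below. This assumes the standard ssl module;
make_context is a hypothetical helper name. The Python documentation
describes OP_NO_TICKET as preventing the client side from requesting a
session ticket, but whether that covers every resumption mechanism (TLS 1.3
in particular) would need checking against the OpenSSL documentation.

```python
import ssl

def make_context() -> ssl.SSLContext:
    """Return a verifying client TLS context with OP_NO_TICKET set."""
    ctx = ssl.create_default_context()
    # Ask OpenSSL not to request session tickets for this context.
    ctx.options |= ssl.OP_NO_TICKET
    return ctx

ctx = make_context()
```

The price is a full handshake on every new connection, so this is a
trade-off between latency and trackability, not a free win.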

With all this, there is a clear risk that DoH, intended as a privacy
feature, will at first mainly succeed in giving DNS operators more detailed
insight into per-device browsing habits, something they may not be shy to
monetize.

I have discussed this on Twitter with various HTTP users and they opined
that since they were supplying these headers already anyhow, the privacy
impact is minimal. After some discussion & dumping of headers, I think they
agreed it is not necessary to send out a user's language preferences with a
DNS request, nor the CPU, the operating system or the exact browser
version.

As noted, I don't know if this has a place in the draft, but I'd recommend
that DoH clients:

* Set their User-Agent to 'DoH client', no matter what browser/library
* Do not pass non-essential HTTP headers (like Accept-Language)
* Do not allow the DoH server to set cookies
* Ponder their TLS session resumption settings
* Think about all other ways in which HTTP can be tracked (HSTS?)
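
A minimal sketch of what the first three points could look like in a Python
client, using only urllib from the standard library. The endpoint URL and
the build_doh_request helper are hypothetical; the application/dns-message
media type is the one used for DoH POST requests. The request is only
built here, not sent.

```python
import urllib.request

DOH_URL = "https://doh.example/dns-query"  # hypothetical endpoint

def build_doh_request(wire_query: bytes,
                      url: str = DOH_URL) -> urllib.request.Request:
    """Build a DoH POST that carries only the headers the exchange needs."""
    return urllib.request.Request(
        url,
        data=wire_query,
        method="POST",
        headers={
            # Fixed string, regardless of the library or browser in use
            "User-Agent": "DoH client",
            "Accept": "application/dns-message",
            "Content-Type": "application/dns-message",
        },
    )

# A query for example.com (type A, class IN) with the ID set to 0
query = (b"\x00\x00\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00"
         b"\x07example\x03com\x00"
         b"\x00\x01\x00\x01")
req = build_doh_request(query)
for key, value in req.header_items():
    print(key, value)
```

Passing req to urllib.request.urlopen() would send it. urllib adds a few
transport headers of its own (Host, Content-Length and the like), and its
default opener installs no cookie handler, so cookies set by the server are
not stored or replayed.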

Thoughts?

	Bert