[Model-t] Web Tracking
Eric Rescorla <ekr@rtfm.com> Mon, 17 February 2020 17:32 UTC
From: Eric Rescorla <ekr@rtfm.com>
Date: Mon, 17 Feb 2020 09:31:45 -0800
Message-ID: <CABcZeBN-HNe-j2japnCT5HR49__mxR7jiFAJ4NO27CdpuvirXw@mail.gmail.com>
To: model-t@iab.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/model-t/qRU8naXGsWHPZgHutSdfkmwnkK0>
Subject: [Model-t] Web Tracking

And here is our text on Web Tracking.

One of the biggest threats to user privacy on the Web is ubiquitous third party tracking. This takes advantage of HTTP Cookies [RFC6265] in what is called a "third party context". The basic idea is that whenever a resource is loaded from a server, that server can set a cookie, which the browser will send back to that server on future loads. This includes situations where the resource is loaded as a "subresource" on a page (e.g., an image or a piece of JavaScript). In addition, those loads include a Referer header which identifies the top-level page that the subresource is being loaded from.

The combination of these features makes it fairly straightforward to build a system which tracks the user across the Web. The tracker convinces a number of content sites ("first parties") to include a subresource from the tracker site. Sometimes this subresource also performs some other function, such as displaying an ad or providing analytics to the first party site, but sometimes it is simply a tracker. Then, whenever the user visits one of those content sites, the tracker receives the pair of (1) the Referer header and (2) the cookie, which is the same for a given browser client regardless of which site the tracker is embedded on. Together these allow the tracker to build up a picture of the user's browsing history. This can then be used for various purposes, but is most commonly used for ad targeting.

This capability itself constitutes a major threat to user privacy. However, there are a number of practices which increase the threat:

* Cookie syncing: any given tracker may not be on all sites, which gives the tracker incomplete coverage. However, trackers often collude (a practice called "cookie syncing") to link their different tracking cookies for the same user.
* Identifier correlation: sometimes trackers will be embedded on a site which collects a user identifier (e.g., an e-mail address), in which case the site can inform the tracker of the address, which allows the tracker to tie it to the cookie.
* Fingerprinting: cookies are a form of explicit state, which allows browsers to block or erase them. However, it is also possible to use characteristics of the browser to track the user. For instance, features such as User-Agent string, plugin and font support, screen resolution, and timezone can yield a fingerprint that is sometimes unique to a single user [0] and which persists beyond cookie deletion. Even in cases where this fingerprint is not unique, the anonymity set may be sufficiently small that the fingerprint, when coupled with yet more data, yields a unique, per-user identifier. Fingerprinting of this type is more prevalent on systems and platforms where the feature set is flexible, such as desktops, where plugins are more commonly in use. Fingerprinting prevention is an active research area; see [1] for more information.

A number of browsers have started adding anti-tracking technologies. This is a rapidly moving field and so it is difficult to characterize here, but there are several basic ideas:

* Blocking any communication with known trackers.
* Identifying trackers and suppressing their ability to store and access cookies and other state.
* "Double keying", in which each third party load on a different first party site is treated as a different context, thereby isolating cookies and other state (e.g., TLS-layer information).

[0] Gómez-Boix, A., Laperdrix, P., and Baudry, B., "Hiding in the crowd: an analysis of the effectiveness of browser fingerprinting at large scale", Proceedings of the 2018 World Wide Web Conference, 2018.
[1] https://amiunique.org
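
To make the cookie-plus-Referer mechanism and the effect of double keying described above a bit more concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not a description of how any particular browser or tracker is implemented: the names Browser, Tracker, and visit_sites and the *.example hosts are all hypothetical, and the Referer is simplified to the first party's site name rather than a full page URL.

```python
from collections import defaultdict


class Tracker:
    """Hypothetical third party that logs (cookie, Referer) pairs."""

    def __init__(self, host):
        self.host = host
        self.next_id = 0
        # cookie value -> list of first party sites seen with that cookie
        self.history = defaultdict(list)

    def handle(self, cookie, referer):
        # No cookie presented: mint a fresh identifier for this client.
        if cookie is None:
            cookie = f"id-{self.next_id}"
            self.next_id += 1
        self.history[cookie].append(referer)
        return cookie  # stands in for a Set-Cookie response header


class Browser:
    """Toy cookie jar; double_keyed=True partitions it by first party."""

    def __init__(self, double_keyed=False):
        self.double_keyed = double_keyed
        self.jar = {}

    def _key(self, first_party, third_party):
        # Unkeyed: one jar entry per third party, shared across all sites.
        # Double keyed: a separate entry per (first party, third party).
        if self.double_keyed:
            return (first_party, third_party)
        return third_party

    def load_subresource(self, first_party, tracker):
        key = self._key(first_party, tracker.host)
        # The request carries any stored cookie plus the Referer.
        cookie = tracker.handle(cookie=self.jar.get(key), referer=first_party)
        self.jar[key] = cookie


def visit_sites(browser, tracker, sites):
    """Visit each site once; every site embeds the tracker's subresource."""
    for site in sites:
        browser.load_subresource(site, tracker)
    return dict(tracker.history)


if __name__ == "__main__":
    sites = ["news.example", "shop.example", "health.example"]
    print(visit_sites(Browser(), Tracker("tracker.example"), sites))
    print(visit_sites(Browser(double_keyed=True),
                      Tracker("tracker.example"), sites))
```

In the unkeyed case the tracker ends up with a single identifier associated with all three sites, which is the browsing-history picture described above. With double keying it sees three unrelated identifiers, each tied to one site. The sketch says nothing about fingerprinting or cookie syncing, which can re-link those identifiers.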