Re: [ietf-privacy] Privacy of CLIENTID for IMAP/SMTP

Nick Doty <npdoty@ischool.berkeley.edu> Mon, 19 August 2019 18:57 UTC

Thanks for sharing the proposal and the implementation ideas.

I can definitely see security advantages to this kind of approach. When a client sends an additional identifier along with a username and password, the server can more reliably recognize clients it has seen before; that is a useful heuristic and mitigates some of the risk of password disclosure. I think app-specific passwords took a similar approach, but required overloading the password field.

Adding any client identifier, even one transmitted only over a secure channel, does potentially disclose information about the user to the server. I agree that the server can learn the number of the user’s devices or something about their habits. That might be an acceptable trade-off for many users, especially because some of that information is likely already available to the server, which can use heuristics like IP address, or particularities of how different email clients work, to distinguish clients.

But I am especially concerned about the suggestions for different types of identifiers in these drafts. Sending a permanent or semi-permanent hardware identifier (like the MAC address or other hardware ID) is especially dangerous:
* it’s necessarily shared between accounts and users of the same device, disclosing correlation between accounts;
* it’s less useful for security purposes because, if it’s disclosed to the attacker, it can’t easily be changed;
* the user can’t easily change it when they want to clear their identity information;
* it allows for collusion between servers to identify two accounts at two different servers as connected to the same device;
* it unnecessarily reveals other information about the user’s device, like the hardware vendor, and sensitive information that may be used as an authentication signal in other protocols or for other purposes.
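
To make the collusion point concrete, here is a minimal sketch of how two servers could join their login records on a shared identifier. Everything in it is hypothetical and for illustration only: the MAC address, the account and server names, and the helper functions are not taken from the drafts.

```python
import secrets

# Hypothetical hardware address, identical for every account used on the device.
MAC = "00:1a:2b:3c:4d:5e"

# Random identifiers, one per (account, server) pair, kept in memory for the sketch.
_random_ids = {}


def hardware_id(account, server):
    """Identifier scheme that sends the same hardware ID to every server."""
    return MAC


def random_id(account, server):
    """Identifier scheme that sends an unrelated random value per account/server pair."""
    return _random_ids.setdefault((account, server), secrets.token_hex(16))


def server_log(id_scheme, logins):
    """What a server would record: (account, client identifier) for each login."""
    return [(account, id_scheme(account, server)) for account, server in logins]


def linked_accounts(log_a, log_b):
    """Pairs of accounts that two colluding servers can link by identical identifier."""
    by_id = {}
    for account, client_id in log_a:
        by_id.setdefault(client_id, []).append(account)
    return [(a, b) for b, client_id in log_b for a in by_id.get(client_id, [])]


logins_a = [("alice@a.example", "a.example")]
logins_b = [("anon123@b.example", "b.example")]

# With a hardware identifier, the two accounts are trivially linked.
print(linked_accounts(server_log(hardware_id, logins_a),
                      server_log(hardware_id, logins_b)))

# With per-pair random identifiers, the join finds nothing.
print(linked_accounts(server_log(random_id, logins_a),
                      server_log(random_id, logins_b)))
```

The same join works within a single server for two accounts on one device, which is the first bullet above.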

While these drafts don’t require hardware identifiers, recommending them as a type of clientid is dangerously unhelpful. License keys also seem like a poor suggestion: they aren’t necessarily one-to-one with devices, they reveal extra information about the user’s software, they’re unlikely to change and they’re shared across servers. It’s not clear why different types of identifiers are useful at all; allowing them certainly makes the privacy and security properties harder to analyze. The drafts even suggest disclosing information (like the vendor) in the named type of the identifier! Standardizing on a single type of identifier could improve interoperability as well as user privacy.

The privacy threats posed by identifiers, and potential mitigations, are well described in RFC 6973 [0], and I believe additional I-Ds on numeric identifiers are also under development.

We could spend more time reviewing what persistence, scope and user control of identifiers are appropriate. Off the top of my head: a pseudorandom (and therefore “unintelligent”: revealing no other information about the user) unique number generated by the client for every user/account-server pair; resettable by the user whenever they choose; reset whenever the user rotates other client identifiers; stored by the client on the local device and not synced to other devices.
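
As a rough illustration of the properties listed above, a client-side store might look something like the following. This is only a sketch under my own assumptions: the class name, the JSON file format, and the 128-bit identifier length are hypothetical and do not come from either draft.

```python
import json
import secrets
from pathlib import Path


class ClientIdStore:
    """Hypothetical local store: one random identifier per (account, server) pair."""

    def __init__(self, path=None):
        # With no path the store is in-memory only; with a path it persists
        # to a local file that is never synced to other devices.
        self._path = Path(path) if path is not None else None
        self._ids = {}
        if self._path is not None and self._path.exists():
            self._ids = json.loads(self._path.read_text(encoding="utf-8"))

    @staticmethod
    def _key(account, server):
        # JSON-encode the pair so the key is unambiguous and serializable.
        return json.dumps([account, server])

    def _save(self):
        if self._path is not None:
            self._path.write_text(json.dumps(self._ids), encoding="utf-8")

    def get(self, account, server):
        """Return the identifier for this pair, generating it on first use."""
        key = self._key(account, server)
        if key not in self._ids:
            # 128 random bits: carries no information about user or device.
            self._ids[key] = secrets.token_hex(16)
            self._save()
        return self._ids[key]

    def reset(self, account, server):
        """User-initiated reset for one pair; a fresh value is generated on next use."""
        self._ids.pop(self._key(account, server), None)
        self._save()

    def reset_all(self):
        """Reset everything, e.g. when the user clears other client identifiers."""
        self._ids.clear()
        self._save()
```

For example, `ClientIdStore("clientids.json").get("alice@example.org", "imap.example.org")` would return the same value on every connection to that server for that account, and an unrelated value for any other account or server, until the user resets it.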

Forgive my Web-centricness, but HTTP cookies (and more specifically, newer proposals to replace them [1]) have given some experience in the privacy advantages of limiting the scope to an origin (maybe roughly analogous to the mail server) and limiting persistence and intelligence.

Hope this helps,
Nick

[0] https://tools.ietf.org/html/rfc6973#section-6.1
[1] https://tools.ietf.org/html/draft-west-http-state-tokens-00


> On Aug 19, 2019, at 12:07 PM, Kai Engert <kaie@kuix.de> wrote:
> 
> Hello,
> 
> I would like to ask for feedback on potential privacy concerns related
> to the following drafts, which describe a CLIENTID feature for the IMAP
> and SMTP protocols:
> * https://tools.ietf.org/html/draft-yu-imap-client-id
> * https://tools.ietf.org/html/draft-storey-smtp-client-id
> 
> The Thunderbird project is currently evaluating a request to support the
> CLIENTID feature and enable it by default.
> 
> We'd like to ensure that we appropriately consider all potential privacy
> issues that could arise as a consequence, prior to making decisions
> about inclusion or default behavior.
> 
> Here is my quick summary of the CLIENTID feature:
> * an email client creates a random client side identifier for itself,
>  and stores it locally.
> * at the time a client starts a connection with an IMAP or SMTP server,
>  which the user has configured for accessing or sending mail, the
>  server may ask the client to send a client identifier. If the client
>  supports it, and if the connection is encrypted, then the client
>  sends its client side identifier.
> * the client ID doesn't replace the regular login credentials,
>  but shall be treated as additional information about the client
>  accessing the server.
> * servers want to compare the client ID that is received on a
>  connection, and draw the conclusion that the current client is
>  the same client that has connected to the server in the past.
> * the intention of the authors of the drafts is to make it easier
>  for the server side to detect fraudulent login attempts.
> 
> We have already identified that reusing a single identifier across
> multiple email accounts could allow a server to learn that those email
> accounts are controlled by the same entity, and we don't want to allow
> that. Consequently, Thunderbird would use a different client side
> identifier for each account.
> 
> What are additional privacy concerns?
> 
> If multiple client computers are used to access the same email account,
> this feature allows the server to distinguish the different computers.
> 
> For example, a user might regularly use two different computers to
> access an email account, one computer at location A, and the other at
> location B.
> 
> If both locations use a dynamic Internet connection with changing IP
> addresses, as of today, the server probably cannot distinguish which
> location was used to send an email. However, with the CLIENTID feature,
> the server potentially could.
> 
> The server could use the CLIENTID feature to learn about habits. For
> example, if emails from location A are primarily sent during daytime,
> and emails from location B are primarily sent outside regular office
> hours, the server could learn that the computer at location A is at an
> office, and the computer at location B is in a private apartment.
> 
> Consequently, the server operator could draw conclusions where the user
> was located at the time an email was sent.
> 
> Can you think of additional negative consequences for the user's privacy
> with CLIENTID enabled?
> 
> Thanks in advance
> Kai