Re: [ietf-privacy] Privacy of CLIENTID for IMAP/SMTP

Thanks for sharing the proposal and the implementation ideas.

I can definitely see security advantages to this kind of approach. When a client sends an additional identifier along with a username and password, the server can more reliably recognize clients it has seen before; that is a useful heuristic and mitigates some of the risk of password disclosure. I think app-specific passwords were a similar approach, but required overloading the password field.

Adding any client identifier, even one transmitted only over a secure channel, does potentially disclose information about the user to the server. I agree that the server can learn the number of the user’s devices or something about their habits. That might be an acceptable trade-off for many users, especially because some of that information is likely already available to a server that uses heuristics like IP address, or the particularities of how different email clients behave, to tell clients apart.

But I am particularly concerned about the different types of identifiers suggested in these drafts. Sending a permanent or semi-permanent hardware identifier (like the MAC address or other hardware ID) is especially dangerous:
* it’s necessarily shared between accounts and users of the same device, disclosing correlation between accounts;
* it’s less useful for security purposes because, if it’s disclosed to the attacker, it can’t easily be changed;
* the user can’t easily change it when they want to clear their identity information;
* it allows for collusion between servers to identify two accounts at two different servers as connected to the same device;
* it unnecessarily reveals other information about the user’s device, like the hardware vendor, and sensitive information that may be used as an authentication signal in other protocols or for other purposes.

While these drafts don’t require hardware identifiers, recommending them as a type of clientid is unhelpful and dangerous. License keys also seem like a poor suggestion: they aren’t necessarily one-to-one with devices, they reveal extra information about the user’s software, they’re unlikely to change and they’re shared across servers. It’s not clear why different types of identifiers are useful at all; having several certainly makes the privacy and security properties harder to analyze. The drafts even suggest disclosing information (like the vendor) in the named type of the identifier! Standardizing on a single type of identifier could improve interoperability as well as user privacy.

The privacy threats posed by identifiers, and potential mitigations, are well described in RFC 6973 [0], and I believe I-Ds on numeric identifiers are also under development.

We could spend more time reviewing what persistence, scope and user control are appropriate for these identifiers. Off the top of my head: a pseudorandom (and therefore “unintelligent”: revealing no other information about the user) unique number generated by the client for every user/account-server pair; resettable by the user whenever they choose; reset whenever the user rotates other client identifiers; stored by the client on the local device and not synced to other devices.
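
As a minimal sketch of what I mean (not something the drafts specify): the class name ClientIdStore, the JSON file, the 128-bit length and the example accounts are all my own assumptions, chosen only to show the scoping and reset behavior.

```python
import json
import secrets
import tempfile
from pathlib import Path


class ClientIdStore:
    """Hypothetical local store: one random ID per (account, server) pair."""

    def __init__(self, path: Path):
        # The file lives on the local device only; it is not meant to be synced.
        self.path = path
        self.ids = json.loads(path.read_text()) if path.exists() else {}

    @staticmethod
    def _key(account: str, server: str) -> str:
        # JSON-encode the pair so the key is unambiguous.
        return json.dumps([account, server])

    def get(self, account: str, server: str) -> str:
        """Return the ID for this pair, generating it on first use."""
        key = self._key(account, server)
        if key not in self.ids:
            # 16 bytes from the OS CSPRNG: carries no information about the user.
            self.ids[key] = secrets.token_urlsafe(16)
            self._save()
        return self.ids[key]

    def reset(self, account: str, server: str) -> None:
        """Forget one ID; the next get() generates a fresh one."""
        self.ids.pop(self._key(account, server), None)
        self._save()

    def reset_all(self) -> None:
        """Forget every ID, e.g. when the user clears other client identifiers."""
        self.ids.clear()
        self._save()

    def _save(self) -> None:
        self.path.write_text(json.dumps(self.ids))


# Usage with made-up accounts and servers.
with tempfile.TemporaryDirectory() as tmp:
    store = ClientIdStore(Path(tmp) / "clientids.json")
    work = store.get("alice@example.org", "imap.example.org")
    personal = store.get("alice@example.net", "imap.example.net")
    again = store.get("alice@example.org", "imap.example.org")
    store.reset("alice@example.org", "imap.example.org")
    rotated = store.get("alice@example.org", "imap.example.org")
```

The same pair always yields the same value until the user resets it, and different pairs yield unrelated values, so two servers comparing notes have nothing to match on.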

Forgive my Web-centricness, but HTTP cookies (and more specifically, newer proposals to replace them [1]) have provided some experience with the privacy advantages of limiting the scope to an origin (maybe roughly analogous to the mail server) and limiting persistence and intelligence.

Hope this helps,
Nick

[0] https://tools.ietf.org/html/rfc6973#section-6.1
[1] https://tools.ietf.org/html/draft-west-http-state-tokens-00

> On Aug 19, 2019, at 12:07 PM, Kai Engert <kaie@kuix.de> wrote:
> 
> Hello,
> 
> I would like to ask for feedback on potential privacy concerns related
> to the following drafts, that describe a CLIENTID feature for the IMAP
> and SMTP protocols:
> * https://tools.ietf.org/html/draft-yu-imap-client-id
> * https://tools.ietf.org/html/draft-storey-smtp-client-id
> 
> The Thunderbird project is currently evaluating a request to support the
> CLIENTID feature and enable it by default.
> 
> We'd like to ensure that we appropriately consider all potential privacy
> issues that could arise as a consequence, prior to making decisions
> about inclusion or default behavior.
> 
> Here is my quick summary of the CLIENTID feature:
> * an email client creates a random client side identifier for itself,
>  and stores it locally.
> * at the time a client starts a connection with an IMAP or SMTP server,
>  which the user has configured for accessing or sending mail, the
>  server may ask the client to send a client identifier. If the client
>  supports it, and if the connection is encrypted, then the client
>  sends its client side identifier.
> * the client ID doesn't replace the regular login credentials,
>  but shall be treated as additional information about the client
>  accessing the server.
> * servers want to compare the client ID that is received on a
>  connection, and draw the conclusion that the current client is
>  the same client that has connected to the server in the past.
> * the intention of the authors of the drafts is to make it easier
>  for the server side to detect fraudulent login attempts.
> 
> We have already identified that reusing a single identifier across
> multiple email accounts could allow a server to learn that those email
> accounts are controlled by the same entity, and we don't want to allow
> that. Consequently, Thunderbird would use a different client side
> identifier for each account.
> 
> What are additional privacy concerns?
> 
> If multiple client computers are used to access the same email account,
> this feature allows the server to distinguish the different computers.
> 
> For example, a user might regularly use two different computers to
> access an email account, one computer at location A, and the other at
> location B.
> 
> If both locations use a dynamic Internet connection with changing IP
> addresses, as of today, the server probably cannot distinguish which
> location was used to send an email. However, with the CLIENTID feature,
> the server potentially could.
> 
> The server could use the CLIENTID feature to learn about habits. For
> example, if emails from location A are primarily sent during daytime,
> and emails from location B are primarily sent outside regular office
> hours, the server could learn that the computer at location A is at an
> office, and the computer at location B is in a private apartment.
> 
> Consequently, the server operator could draw conclusions where the user
> was located at the time an email was sent.
> 
> Can you think of additional negative consequences for the user's privacy
> with CLIENTID enabled?
> 
> Thanks in advance
> Kai