Re: [ietf-privacy] [saag] Fwd: WGLC for draft-ietf-tzdist-service-05
Daniel Kahn Gillmor <dkg@fifthhorseman.net> Fri, 30 January 2015 02:13 UTC
From: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
To: Daniel Migault <mglt.ietf@gmail.com>, saag@ietf.org, ietf-privacy@ietf.org, Eliot Lear <lear@cisco.com>
In-Reply-To: <CADZyTkkCrvTam_ba7Tq6A-cHAVZn+ktKqwWsr_PNQaz2jyTkUQ@mail.gmail.com>
References: <CADZyTkkLu6qQ9LCqDkTHA9o+-YVvQuaUp33kqkAt=PRaQS-Jew@mail.gmail.com> <CADZyTkkCrvTam_ba7Tq6A-cHAVZn+ktKqwWsr_PNQaz2jyTkUQ@mail.gmail.com>
User-Agent: Notmuch/0.18.2 (http://notmuchmail.org) Emacs/24.4.1 (x86_64-pc-linux-gnu)
Date: Thu, 29 Jan 2015 21:13:36 -0500
Message-ID: <874mr9aucv.fsf@alice.fifthhorseman.net>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg="pgp-sha512"; protocol="application/pgp-signature"
Archived-At: <http://mailarchive.ietf.org/arch/msg/ietf-privacy/UkOg4bm4_9KHQnTUdbqpiDkCUqA>
Cc: Time Zone Data Distribution Service <tzdist@ietf.org>
Subject: Re: [ietf-privacy] [saag] Fwd: WGLC for draft-ietf-tzdist-service-05

Hi Daniel and Eliot--

On Wed 2015-01-28 14:24:28 -0500, Daniel Migault wrote:
> Our document describing Time Zone Data Distribution Service
> <http://tools.ietf.org/html/draft-ietf-tzdist-service-05> [1] is close to
> be finalized and we would like to proceed to cross area review.
>
> We would greatly appreciate to get review by February 11.
[...]
> [1] http://tools.ietf.org/html/draft-ietf-tzdist-service-05

Thanks for your work on this.  This is the first time i've seen this
draft; apologies for not looking at it earlier.  I'm only subscribed to
saag@ietf.org (and ietf-privacy, which is idle lately, but i've included
here because some of my review touches on privacy), so this post might
not make it through to tzdist@ietf.org -- feel free to forward it as
needed.

I did a quick skim here with my security and privacy hats on, and have a
few comments:

(privacy) Privacy Considerations section is missing
===================================================

There is *no* "Privacy Considerations" section in the draft at all.
Please read RFC 6973 for guidance in conducting a privacy review of the
protocol.

The act of querying these servers leaks something about the location of
the person doing the query, at least, and may leak information about
other locations that they're interested in.  It's also possible that
regular attempts to query this information will provide a linkable trail
of the user, which could then be (mis)used without their knowledge or
permission.

Here's an attempt at a quick analysis, though i haven't thought through
the protocol in detail.
I hope you'll do your own analysis, and you're welcome to take any of
mine:

Implausibly: if the average user is interested in 5 timezones, and there
are 774 known zones ("find /usr/share/zoneinfo -type f | wc"), and those
interests were evenly distributed across the zones for every user, then
the set of requests to update an individual's preferred timezones yields
nearly 50 bits of entropy, far more than enough to distinguish every
individual human from each other.

More plausibly: timezone interest is probably less than 5 for most
people, and it isn't evenly distributed: the people who are interested
in America/New_York are more likely to be interested in
America/Los_Angeles than in Arctic/Longyearbyen.  But anyone with an
unusual set of TZs can probably be identified (perhaps uniquely) by any
provider they talk to just by what TZs they ask for.

Since §4.1.4 says "Clients SHOULD poll for changes, using an appropriate
conditional request, at least once a day", a malicious provider intent
on surveilling its users and with a mechanism to do so would have a
daily checkin.  I imagine this as some kind of background system service
looking for updates.  the daily checkin could be used to track a user's
movements around the network, if their device is not stationary.

The time of checkin could also be used as a linking mechanism, if the
machine polls with rigid regularity.

Are there strategies that someone interested in preserving their
anonymity from a tzdata provider should take to remain anonymous?  If
so, what are they?

(privacy) HTTP pipelining?
==========================

Clients requesting multiple unusual TZs together are more easily
identifiable to servers than clients who request only one.  Should
clients request all their interested TZs at once, or spread out their
polling updates over time?  HTTP pipelining is clearly more efficient;
but what are the privacy implications if you have a system service that
does this?

(privacy) HTTP Cookies?
=======================

The choice of HTTP transport also allows for servers to set cookies in
clients -- should clients accept and re-transmit cookies from the
server?  What are the privacy implications?

(privacy) Tracking via ETag?
============================

Also, conditional requests seem to be encouraged via the use of an ETag
header.  It looks to me like a provider who wants to track its users
individually (even in the absence of cookies) could use a cache of
personalized ETags to do so.

For example, the first time any client requests TZ X (with no
If-None-Match request header), the server mints a new ETag Y, generates
a new client ID Z, and records:

 * Client ID Z
 * the requested TZ X
 * the new ETag Y
 * the time of issuance
 * the IP address
 * any other interesting metadata

When a request comes in for TZ X with an If-None-Match: Y header, the
server can link the two requests and record them both with client ID Z.

When the underlying data for the TZ actually changes, the server mints a
new ETag (for the new version of TZ X), but associates it with the same
client ID Z.

(privacy) Logging policy for distribution servers?
==================================================

There is also no mention of recommended logging policy for the servers,
no attempt to address data minimization or the risks to trackable users
based on normal server logs.

(privacy) Authenticated clients are trackable
=============================================

the Security Considerations section says:

   Servers MAY require some form of authentication or authorization of
   clients (including secondary servers) to restrict which clients are
   allowed to access their service, or provide better identification of
   errant clients.  As such, servers MAY require HTTP-based
   authentication as per [RFC7235].

Clients who make authenticated connections to servers are eminently
trackable by those servers.  What are the privacy implications for those
clients?
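
The ETag linking described under "Tracking via ETag?" above can be
sketched roughly as follows.  This is only an illustration of the idea,
not anything taken from the draft: the class name TrackingProvider, the
publish() and handle() methods, and the in-memory tables are all
hypothetical, and a real server would sit behind an HTTP stack.

```python
import secrets
import time


class TrackingProvider:
    """Hypothetical malicious provider that gives each client its own ETag."""

    def __init__(self):
        self.version = {}     # TZ name -> current data version number
        self.etag_owner = {}  # ETag -> (client ID, TZ name, version)
        self.log = []         # (client ID, TZ name, IP address, timestamp)

    def publish(self, tz):
        """Record that the underlying data for tz has changed."""
        self.version[tz] = self.version.get(tz, 0) + 1

    def _mint(self, client_id, tz):
        # A fresh random ETag that is remembered as belonging to client_id.
        etag = '"' + secrets.token_hex(8) + '"'
        self.etag_owner[etag] = (client_id, tz, self.version[tz])
        return etag

    def handle(self, tz, ip, if_none_match=None):
        """Return (status, etag, client_id) for one GET of an already
        published tz."""
        known = self.etag_owner.get(if_none_match)
        if known is not None and known[1] == tz:
            # The presented ETag identifies a client we have seen before.
            client_id, _, seen_version = known
        else:
            # No usable ETag: treat this as a new client.
            client_id, seen_version = secrets.token_hex(4), None
        self.log.append((client_id, tz, ip, time.time()))
        if seen_version == self.version[tz]:
            return 304, if_none_match, client_id
        # New client or changed data: issue a new ETag for the same client ID.
        return 200, self._mint(client_id, tz), client_id
```

In this sketch a client that moves between networks is still logged
under one client ID, because the ETag it echoes back in If-None-Match
acts as the identifier, with no cookie involved.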

(privacy) network observers tracking clients
============================================

Someone passively observing the network could also potentially track the
clients of a given server via traffic analysis, even if the server is
not cooperating.

First, the attacker could get a stash of all the data that the server
has, noting the size of each zone under each supported format.  When a
new request is made for a zone, the attacker can observe the size of the
query and the size of the response and guess with high probability which
zone was requested.

If the clients poll once a day on a schedule (i.e. exactly every 86400
seconds) then the network observer may be able to track updates and
determine when a client interested in a particular zone does an update.

What mechanisms could a client (and server?) use to frustrate such a
network-based attacker to keep a given client's identity anonymous?

(security/privacy) HTTP redirection
===================================

What if the server sends an HTTP redirection (e.g. via HTTP response 301
or 302) -- should the client follow it?  What if it is to a cleartext
HTTP resource?  What are the security and privacy consequences of
following these redirections off-origin?

(security) Consequences of accepting bad TZ updates?
====================================================

I'm glad that the Security Considerations recognizes that reliable TZ
data is vital -- but no example is given of what a data compromise might
look like.  Is it worth providing a couple of examples of bad outcomes?
are we talking about missed appointments?  or crashing software?  or
something else?

(security) why not require TLS on both sides?
=============================================

you've got that the service MUST operate over https, but the clients
only SHOULD try https first.  Why allow for cleartext access at all?
Why not say that both clients and servers MUST support HTTPS?

I see https://tools.ietf.org/wg/tzdist/trac/ticket/7 suggests that there
is consensus that you don't want "mandatory to use", but i don't know
where the discussion is, or why you don't want it.

(security) Provider-to-Provider TLS
===================================

Connections between "Secondary Providers" and "Root Providers" seem
different from the connections between Clients and Providers.  If you
can't mandate HTTPS for all clients for some reason, what about at least
mandating that the caching infrastructure requires TLS for all
provider-to-provider connections?  The secondary provider will need a
TLS stack anyway (as a server), so it should be able to do TLS on the
upstream side.

(security) DNS compromise leaves only cleartext
===============================================

If a network-based attacker can filter network traffic, they can simply
drop all outbound _timezones._tcp.example.com DNS queries, and then when
the client gives up, they can allow through (or provide their own, if
DNSSEC isn't involved) responses to _timezone._tcp.example.com.  This
immediately puts the network attacker in the position of being able to
dictate timezone information to a client willing to fall back to
cleartext.

(security) no-DNSSEC fallback checks are ambiguous
==================================================

The Security Considerations currently say:

   In the absence of a secure DNS option, clients SHOULD check that the
   target FQDN returned in the SRV record matches the original service
   domain that was queried.  If the target FQDN is not in the queried
   domain, clients SHOULD verify with the user that the SRV target FQDN
   is suitable for use before executing any connections to the host.

What does "matches" mean here?  the second sentence suggests that it
means "shares some sort of a suffix with" -- but which part?  If i query
for an SRV of _timezones._tcp.tz.example.com, and it replies with an
FQDN of bar.example.com, is that OK?  what about x.y.z.bar.example.com?
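
To make the ambiguity concrete, here is a small sketch that applies two
possible readings of "matches" to the names above.  The helper names
labels() and within() are made up for illustration, and the choice of
example.com as the "shared suffix" is an assumption hard-coded here; the
draft does not say how a client would derive it.

```python
def labels(name):
    """Split a DNS name into lowercase labels, ignoring a trailing dot."""
    return name.rstrip(".").lower().split(".")


def within(target, domain):
    """True if target equals domain or is a subdomain of it."""
    t, d = labels(target), labels(domain)
    return len(t) >= len(d) and t[-len(d):] == d


# Reading 1: the target must be inside the queried service domain.
service_domain = "tz.example.com"  # from _timezones._tcp.tz.example.com
# Reading 2: the target only has to share a parent suffix (assumed here).
parent_domain = "example.com"

for target in ("bar.example.com", "x.y.z.bar.example.com", "tz.example.net"):
    print(target,
          "reading 1:", within(target, service_domain),
          "reading 2:", within(target, parent_domain))
```

Under the first reading both bar.example.com and x.y.z.bar.example.com
are rejected; under the second both are accepted.  The comparison is
done label by label so that a name like notexample.com is not mistaken
for a subdomain of example.com.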

If DNSSEC isn't available, the attacker can still point this response to
any IP address of their choice, right?

What does "verify with the user" mean if this is a TZdata service, which
is presumably running automatically on the computer to keep this
information up-to-date?  most such services have no user interaction at
all.  If there is a UI, what options would the user be given in such a
case?  Is this a popup dialog box that says "you asked for timezone data
updates from tz.example.com -- is it ok to get it from whatever.example
instead?"  What users can make sense of this dialog?  What information
would a fully-technically-cognizant user (a deep wizard) use to answer
it sensibly?  What would a normal user use?

If DNSSEC *is* available, is it OK if the record points outside the
zone?  what if it points to a non-signed zone?

(security) Conflicts between Providers?
=======================================

The draft implies that a client might fetch data from multiple
providers.  What should the client do if two providers provide
conflicting information about the same TZ?

(security) use examples of certificate validation
=================================================

The combination of SRV records and X.509 certificate validation and
(maybe) DNSSEC is a tricky subject.  you've referenced RFC 6125, but i
don't think that's enough.  Do you mean to suggest that the certificate
should use a SRVName subjectAltName (RFC4985)?  or should it use a
DNSName subjectAltName with the name sent in the SRV query?  or a
DNSName subjectAltName with the FQDN returned in the SRV response?

Providing an example would make it clearer what you mean.  For example:

   If a client looks up SRV for _timezones._tcp.example.com, and gets a
   response of tz.example.net, then the certificate should (a) be
   valid, and (b) have either a subjectAltName DNSName of
   tz.example.net or a subjectAltName SRVName of
   _timezones._tcp.example.com (or both).
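
Part (b) of that example rule could be sketched as below.  This assumes
the certificate chain has already been validated and its subjectAltName
entries extracted into (type, value) pairs by some TLS library; the
function name san_acceptable() and the pair format are hypothetical.
The SRVName is compared against the string exactly as written in the
example above; as far as i can tell RFC 4985 writes SRVName values in a
"_Service.Name" form, so real certificates may differ.

```python
def san_acceptable(sans, srv_query, srv_target):
    """Check part (b) of the example rule.

    sans is an iterable of (type, value) pairs taken from a certificate
    that has already passed chain validation (part (a)).
    """
    def norm(name):
        # DNS names compare case-insensitively; ignore a trailing dot.
        return name.rstrip(".").lower()

    for kind, value in sans:
        if kind == "DNSName" and norm(value) == norm(srv_target):
            return True
        if kind == "SRVName" and norm(value) == norm(srv_query):
            return True
    return False


query = "_timezones._tcp.example.com"
target = "tz.example.net"
print(san_acceptable([("DNSName", "tz.example.net")], query, target))
print(san_acceptable([("DNSName", "www.example.net")], query, target))
```

Wildcard DNSName entries are deliberately not handled here, which is one
more thing an example in the draft would need to settle.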

(please adjust to taste, i don't mean to tell you what the right choice
is here, it's an ugly problem)

(security) Statically-signed data vs. transport security
========================================================

The security of the transmission process seems to rely entirely on
transport security.  If there is a compromise in transmission between
the Root provider and the secondary provider, or a compromise of any
provider, the client has no way of knowing that they're getting bad
data.

tzdata changes infrequently enough that it seems like it could be signed
with an offline key, making compromise of running systems much less
fruitful.  But this only works if the client can verify the offline
signature.  Have you considered any mechanism that the client could use
to verify the tz update based on data itself, without depending solely
on transport security?

I see this question tangentially raised here:

  https://www.ietf.org/mail-archive/web/tzdist/current/msg00102.html

but it's answered only in the "we still need TLS" way (which i agree
with).  Is any work done (or planned) on providing signed/verifiable
data?

(security) TLS best-practices?
==============================

I'm glad that you've got TLS as a MUST for servers.  Is it worth making
a normative reference to the UTA's TLS best-practices document?

  https://tools.ietf.org/html/draft-ietf-uta-tls-bcp

Sorry this got long, and that this is more in the form of questions than
patches.  I hope i haven't repeated too much of what the tzdist WG has
already discussed -- please feel free to point me to relevant
discussions that i may have missed.

        --dkg