[dnsext] Re: Privacy vs EDNS Client IP...

Nicholas Weaver <nweaver@ICSI.Berkeley.EDU> Tue, 02 February 2010 20:52 UTC

References: <7c31c8cc1001271556w4918093er6e94e07cb92c4dc4@mail.gmail.com> <139D0D6A-5A31-4EE8-88B9-3CACE933187B@icsi.berkeley.edu> <6e04e83a1002010944q7abfabc6h892ce4cbb1bddcbf@mail.gmail.com> <973B1F15-E822-491E-89BF-F09FC7E67509@ICSI.Berkeley.EDU> <6e04e83a1002011109u1cd55c99k8b584648184cdc73@mail.gmail.com> <162E0DB1-EC86-4206-AB36-6FEFA786B24C@ICSI.Berkeley.EDU> <6e04e83a1002011402u395f599g74180d28fdbe5707@mail.gmail.com> <D8848FB8-3523-4580-A93F-764494531788@ICSI.Berkeley.EDU> <6e04e83a1002011640t1b637e30gd7d0150eeb0fae8d@mail.gmail.com> <E9A13A5C-73A7-4F66-9617-482551A9BA84@ICSI.Berkeley.EDU> <6e04e83a1002021155kcb908b1v71d362e03e7c4002@mail.gmail.com>
In-Reply-To: <6e04e83a1002021155kcb908b1v71d362e03e7c4002@mail.gmail.com>
Mime-Version: 1.0 (Apple Message framework v1077)
Content-Type: text/plain; charset="us-ascii"
Message-Id: <AB78D628-8A01-4742-B32A-90FC6806201E@ICSI.Berkeley.EDU>
Content-Transfer-Encoding: quoted-printable
Cc: Nicholas Weaver <nweaver@ICSI.Berkeley.EDU>, John Payne <john@sackheads.org>, Roy Arends <roy@nominet.org.uk>, Wilmer van der Gaast <wilmer@google.com>, namedroppers@ops.ietf.org
From: Nicholas Weaver <nweaver@ICSI.Berkeley.EDU>
Subject: [dnsext] Re: Privacy vs EDNS Client IP...
Date: Tue, 02 Feb 2010 12:45:20 -0800
To: Ted Hardie <ted.ietf@gmail.com>
X-Mailer: Apple Mail (2.1077)
Sender: owner-namedroppers@ops.ietf.org
Precedence: bulk
List-ID: <namedroppers.ops.ietf.org>
List-Unsubscribe: To unsubscribe send a message to namedroppers-request@ops.ietf.org with
List-Unsubscribe: the word 'unsubscribe' in a single line as the message text body.
List-Archive: <http://ops.ietf.org/lists/namedroppers/>

On Feb 2, 2010, at 11:55 AM, Ted Hardie wrote:
> Again, I think you're conflating the terms of service for a particular
> actor with the protocol semantics here.  You may be right that some
> 3rd party agreements allow one of the parties to set the ravish-me bit,
> but that doesn't make it okay to assume that all DNS users have agreed
> to the disclosure of this data.  This wasn't possible before and they
> have not opted-in.  You and I may disagree about whether or not they
> should, but let's at least be clear that they have not.
> 
> Separately, I agree with Stephane's comments on the method for opting out
> in the draft requiring work and having potential deployment difficulties.

No, I am saying that in the cases where this would be used, the users have already opted in to the use of a system like this, and extensions like this are already covered in the terms of service.

Do you NOT consider explicit selection of a third-party resolver a significant opt-in action?

Should DNS be the only real protocol with an opt-in privacy ability?  HTTP sure doesn't....

> "Authoritative servers wish to provide localized results based on network
> proximity.  What is the best way for interested recursive resolvers and
> authoritative servers to manage information about these mappings?"
> 
> does not automatically imply this solution nor any solution that requires
> divulging which client subnets originated specific queries.

Actually, it does. Here's why:

Network proximity of any decent quality, be it AS number, hash of AS number, subnet mask, truncated AS path, magic-pixie-number, etc., will leak similar amounts of information.  It has to, because it always has to say "the user is in this part of the network space".  It's just a matter of how to say it and with what precision: be it subnet, most specific AS number, or pixie-dust, they all say "where you are".
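To make the "degrees of precision" point concrete, here is a minimal sketch using Python's standard ipaddress module. The function name and the documentation-range address 192.0.2.77 are only illustrative assumptions, not anything taken from the draft: the same client address is reported at three prefix lengths, and each one still says which part of the network the user sits in.

```python
import ipaddress

def truncate_to_prefix(addr: str, prefix_len: int) -> str:
    """Return the network that contains addr at the given prefix length.

    Hypothetical helper for illustration only; the draft under discussion
    defines its own wire format, which is not reproduced here.
    """
    # strict=False lets us pass a host address and have the host bits masked off.
    net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
    return str(net)

if __name__ == "__main__":
    # Same client, three levels of precision.
    for bits in (32, 24, 16):
        print(bits, truncate_to_prefix("192.0.2.77", bits))
```

A shorter prefix only widens the region being named; it does not change the kind of statement being made.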

And, like GeoIP, or Wi-Fi MAC address -> location, or whatever, there will be databases made to unify them all, so we might as well call them roughly equivalent to start with and save ourselves the headache of every alternate form getting the same objections from different people.


Hashing-type tricks don't help either, because the search space is way too small.  With a network describable by 10K-100K+ points of locality, there simply isn't enough search space to make hashing tricks work.
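A rough sketch of why the small search space defeats hashing, under stated assumptions: the labels "AS1" through "AS100000" stand in for the points of locality, and unsalted SHA-256 stands in for whatever hash one might propose. Neither comes from the draft. Anyone who can enumerate the candidate labels can precompute every hash and invert an observed value with a dictionary lookup.

```python
import hashlib

def h(label: str) -> str:
    """Hash a locality label (SHA-256 chosen arbitrarily for the sketch)."""
    return hashlib.sha256(label.encode("ascii")).hexdigest()

def build_reverse_table(n_points: int) -> dict:
    """Precompute hash -> label for every candidate locality label.

    The label scheme "AS<n>" is a hypothetical stand-in for any identifier
    drawn from a space of roughly 10K-100K values.
    """
    return {h(f"AS{n}"): f"AS{n}" for n in range(1, n_points + 1)}

if __name__ == "__main__":
    table = build_reverse_table(100_000)   # cheap to build on any laptop
    observed = h("AS64500")                # what an observer would see on the wire
    print(table[observed])                 # recovers the original label
```

A secret salt would not rescue the scheme either, since the authority has to be able to map the value back to a location in order to use it.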

QED:  Network proximity is privacy sensitive by your definition, in any form, because they are all roughly equivalent with the only difference being degrees of precision.  

IP subnet just happens to be a very convenient one, because the CDNs are already optimized around that as the metric.  But most specific AS or AS-path fragment would be effectively equivalent in information leakage.



> We seem to be talking at cross purposes here.  Let me try again.  This proposal
> would have the option be present whenever a recursive resolver talks to an
> authoritative server if it is globally set.  That means that this information
> passes to every authoritative server, whether the authoritative server is
> localizing information or not.  That means the universe of leaked data is not
> "Users of 3rd party DNS requesting info about localized servers" but
> potentially "users of 3rd party DNS requesting info from anyone".  We don't
> disagree about whether someone running their own recursive resolver or using a
> local one is already disclosing (at least I don't think we do).
> 
> If this is not set globally, then the recursive resolver has to maintain a
> table or set of rules that notes when it should be sent and when it should
> not.  I'm not sure how it gets knowledge of which services are localized, so it
> is my expectation that anyone who turned it on would leave it on for all
> queries to authoritative servers.  I could be wrong, of course, and some other
> pattern might easily emerge.

And you seem to be missing my point:  For users of 1st party resolvers, this information or something semantically equivalent is already being leaked to ALL authorities.  

Just because only a few authorities are USING CDN-like tricks doesn't mean that all authorities aren't already receiving the same network locality information.  They just are not bothering to infer anything from it.

> 
> SOCKS 4 did not; SOCKS 5 added support for it, but in many installations
> it requires explicitly directing the DNS traffic through the tunnel.  SOCKS
> can leak DNS in cases where the tunnel is set up per application and DNS/UDP is
> not explicitly redirected.  Some other forms of proxy don't handle UDP at all.

That does not mean we should work around bugs in proxies that can't handle UDP.  Heck, tunnel your DNS over TCP then, if your proxy doesn't support UDP, to pick the route you want your DNS queries to take.
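For reference, carrying DNS over TCP changes very little on the wire: per RFC 1035 section 4.2.2, each message is simply prefixed with a two-byte length field. The sketch below builds a minimal A query and frames it for TCP. The function names, the fixed transaction ID, and the example.com name are illustrative assumptions, and no network I/O is performed.

```python
import struct

def build_query(name: str, qtype: int = 1, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query message (one question, class IN).

    qtype=1 is an A record; txid is fixed here only to keep the sketch
    deterministic, a real client would randomize it.
    """
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label preceded by its length, terminated by a zero byte.
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)
    return header + question

def frame_for_tcp(message: bytes) -> bytes:
    """Prefix the message with its length as a 16-bit big-endian integer."""
    return struct.pack("!H", len(message)) + message

if __name__ == "__main__":
    msg = build_query("example.com")
    framed = frame_for_tcp(msg)
    print(len(msg), len(framed))
```

The framed bytes can then be written to any TCP stream, including one that a TCP-only proxy is willing to carry.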

> Think of it in VPN terms for a moment.  The VPN set-up directs certain
> IP subnets through the tunnel interface, which can have effects from pointing
> default through the tunnel to setting up a small number of networks which
> are routed through the tunnel but leaving all others to be routed outside the
> tunnel.  The resources available through the tunnel can include
> local file servers, smtp servers, and other services which are not available
> from outside because of firewall restrictions.  The client needs to pass at
> least the DNS traffic related to those services through the tunnel because of
> split DNS.  It is easier to pass *all* DNS through that tunnel if the set of
> resources the client is interested in is {globally available services, private
> services available through the tunnel}.  When it includes {globally available
> services, private services available through the tunnel, and private services
> available locally but not globally} I have to actually associate which domains
> are served where, and the default can go either to local or tunnel, depending
> on configuration.

And if you do so, how does this draft really affect you?  

Unless your VPN'ing institution also uses a third party managed DNS service rather than its own DNS resolver, the information all leaks out anyway to the authorities.

And if your institution does not want this information to leak out, why is it leaking it en masse to the third party provider which explicitly says it's allowed to use aggregates of that information?

And no matter what, your CDN performance will be painful since your data IP and query IP are very different, but that's just the limitation of DNS-based CDNs no matter the information.



>> And the problem is, ANY "network mapping of requestor" will probably violate your privacy scruples:
> 
> No, I'm fine with any mapping that the requester agrees to (a non-starter in
> your option)

You don't seem to consider "the user specified this non-default resolver" as agreement, however.


> *or* that is actually a network mapping that doesn't disclose
> which IP address or subnet generated the request for the mapping.  

See above: that's impossible short of noise injection, and noise injection would be pointless, because you're trying to hide things from the authorities of sites you want to talk to!

Else why are you looking up the names at all?

The "This option is not exported to the TLD servers" part means only the authorities YOU are going to talk to receive this information at all.


>> The only OTHER option would be to have the authority's response contain netmask rules and force the CDN processing onto the recursive resolver, which would be a huge shift in burden.
>> 
> 
> Is the real problem here a shift in burden of, presumably, processing power
> (since the network traffic might actually be less) or of CDN secret sauce?
> Because if a zone transfer-like mechanism of mapping info can accomplish
> this there is zero privacy implication to my mind, and I would be interested
> to see which of the recursive server operators would say yes to using it.

Such zone-transfer-like mechanisms would undoubtedly be constrained by pairwise agreements between authorities and third-party resolvers, because there are BOTH secret sauce issues and load issues.

So do you really want to ingrain contractual behavior that affects performance into DNS!?!?  Do you really want contractual barriers to entry for 3rd party DNS services?
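To illustrate what pushing the CDN processing onto the recursive resolver would involve, here is a minimal sketch of a resolver choosing an answer from netmask rules by longest-prefix match. The rule table, its format, and the addresses (all from documentation ranges) are hypothetical; no existing mechanism for shipping such rules is implied.

```python
import ipaddress

# Hypothetical rule set an authority might hand to a resolver:
# client prefix -> address of the server that should be returned.
RULES = {
    "0.0.0.0/0": "203.0.113.1",          # default answer
    "198.51.100.0/24": "203.0.113.10",
    "198.51.100.128/25": "203.0.113.20",
}

def select_answer(client: str, rules: dict) -> str:
    """Return the answer whose prefix is the most specific match for client."""
    addr = ipaddress.ip_address(client)
    best = None
    for cidr, answer in rules.items():
        net = ipaddress.ip_network(cidr)
        # Keep the matching rule with the longest prefix seen so far.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, answer)
    if best is None:
        raise LookupError(f"no rule covers {client}")
    return best[1]

if __name__ == "__main__":
    for client in ("198.51.100.200", "198.51.100.5", "192.0.2.1"):
        print(client, "->", select_answer(client, RULES))
```

Even in this toy form, the resolver has to hold and search the authority's full mapping for every localized zone, which is where both the load and the secret sauce concerns come from.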



>> And if you are really concerned about privacy, I'd look at web analytics, that is far far FAR FAR more evil in spraying information around and there IS no opt-out other than technical countermeasures!  You are worried about a paper cut (subnet of requester in a DNS message using third party DNS infrastructures) when there is arterial bleeding going on.
>> 
> And this is unrelated to the work going on here.

If your objection to something that would greatly improve the ability of people to use DNS service providers other than those of the ISP only comes down to protecting privacy, scale matters.  So it is related.