Re: [Ideas] Addressing the privacy issues exposed by IDEAS

Robert Moskowitz <rgm-ietf@htt-consult.com> Wed, 18 October 2017 16:29 UTC

To: Tom Herbert <tom@herbertland.com>
Cc: "ideas@ietf.org" <ideas@ietf.org>
References: <9155d3fe-cbe2-ae2d-9c59-f3dee85b1409@htt-consult.com> <CALx6S371UYq027pvVYTS2F0UE8kknd7LmTk-0z7KAQwu8=q5=w@mail.gmail.com>
From: Robert Moskowitz <rgm-ietf@htt-consult.com>
Message-ID: <f0600b4d-7b51-07b8-b5e1-9dc20dafdee2@htt-consult.com>
Date: Wed, 18 Oct 2017 12:28:51 -0400
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1
MIME-Version: 1.0
In-Reply-To: <CALx6S371UYq027pvVYTS2F0UE8kknd7LmTk-0z7KAQwu8=q5=w@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 7bit
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/ideas/ntwuLgJTuBXPRDpl86ueYkRmxKw>
Subject: Re: [Ideas] Addressing the privacy issues exposed by IDEAS

Tom,

Excellent analysis.  Thanks for this.

Bob

On 10/18/2017 12:18 PM, Tom Herbert wrote:
> On Wed, Oct 18, 2017 at 6:04 AM, Robert Moskowitz
> <rgm-ietf@htt-consult.com> wrote:
>> I chose the subject line carefully as you will see by my analysis of the
>> privacy issue(s).  I have discussed this with Padma before bringing this up
>> to the list.
>>
>> Here is the privacy attack, as I see it:
>>
>> It is fairly well established that web sites collect a lot of personal
>> information and information about the device(s) connected to that personal
>> information.  IP addresses, even the actual addresses of NATed clients, are
>> part of the harvest.  Mal and his cousins are busy stealing this information
>> and putting it all together in their own big data pile.
>>
>> Meanwhile, Eve and her cousins are busy watching the network and seeing
>> which IP addresses are communicating with other IP addresses.  Eve and
>> cohorts put their data together with Mal and cohorts' data and then are able
>> to note that:  "Hey look, Alice is talking directly with Barb."  Oh, look,
>> they both moved to new addresses, but we can see it is still Alice and Barb.
>>
>> What is going on here?
>>
>> ID/Loc technologies, enhanced with IDEAS technology, will make Peer-to-Peer
>> communications without any triangular routing achievable.  As long as these
>> P2P communications use the same IP addresses as used in web Client/Server
>> communications, the linkage is there to the privacy leakage occurring through
>> those websites.
>>
>> Three things have to happen to protect the privacy of P2P communications
>> from the swamp of privacy leakage in C/S communications.
>>
>> Identities need to be masked/hidden by both the ID/Loc technologies and
>> IDEAS.
>>
>> Identifiers of all ilk, both in the control channel and the data channel
>> need to change with each move using some Perfect Forward Secrecy (PFS)
>> technology.
>>
>> Multiple IP addresses MUST be used, at least separating the P2P from C/S
>> communications.  Different addresses for different P2P connections is wise.
>>
> Bob,
>
> It's more than just using multiple addresses. Today carriers are
> assigning multiple addresses giving /64s so that a UE is getting 2^64
> addresses. The problem is that this is done by a prefix assignment for
> each device which means the device is easily tracked by that. What we
> want are multiple addresses with some specific properties for privacy.
>
> Here are the properties of addresses that I came up with:
>
>        o They are composed of a global routing prefix and a suffix that
>          is internal to an organization or provider. This is the same
>          property for IP addresses [RFC3513].
>
>        o The registry and organization of an address can be determined by
>          the network prefix. This is true for any global address.
>
>        o The organizational bits in the address should have minimal
>          hierarchy to prevent inferences. It might be reasonable to have
>          an internal prefix that divides identifiers based on broad
>          geographic regions, but detailed information such as location,
>          department in an enterprise, or device type should not be
>          encoded in a globally visible address.
>
>        o Given two addresses and no other information, the
>          desired properties of correlating them are:
>
>           o It can be inferred if they belong to the same organization and
>             registry. This is true for any two global IP addresses.
>
>           o It may be inferred that they belong to the same broad
>             grouping, such as a geographic region, if the information is
>             encoded in the organizational bits of the address.
>
>           o No other correlation can be established. For example, it
>             cannot be inferred that the IP addresses address the same
>             node, the addressed nodes reside in the same subnet, rack, or
>             department, or that the nodes for the two addresses have any
>             geographic proximity to one another.
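A minimal sketch of the correlation properties listed above, assuming a hypothetical organization prefix (2001:db8::/32, the documentation range) and helper names of my own (random_address, shared_prefix_len). It only illustrates that two addresses drawn at random below the routing prefix reveal the organization and little else, whereas two addresses from one per-device /64 share at least 64 leading bits.

```python
import ipaddress
import secrets

# Hypothetical organization prefix (documentation address space, an assumption).
ORG_PREFIX = ipaddress.ip_network("2001:db8::/32")


def random_address(org=ORG_PREFIX):
    """Return an address whose bits below the routing prefix are random."""
    host_bits = org.max_prefixlen - org.prefixlen
    suffix = secrets.randbits(host_bits)
    return ipaddress.IPv6Address(int(org.network_address) | suffix)


def shared_prefix_len(a, b):
    """Number of leading bits two IPv6 addresses have in common."""
    diff = int(a) ^ int(b)
    return 128 - diff.bit_length()


if __name__ == "__main__":
    a, b = random_address(), random_address()
    # The organization can be inferred from the routing prefix...
    print(a in ORG_PREFIX and b in ORG_PREFIX)
    # ...but the common prefix rarely extends much past it.
    print(shared_prefix_len(a, b))

    # Two addresses from one per-device /64 are trivially linkable.
    c = ipaddress.IPv6Address("2001:db8:1:2::1")
    d = ipaddress.IPv6Address("2001:db8:1:2::2")
    print(shared_prefix_len(c, d))
```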
>
>> Note that if IDEAS-ID/Loc does everything to hide and confuse
>> Identity/Identifier, it is all for naught if multiple IP addresses are not
>> used.  At this point I should mention that TLS 1.3 may have a similar
>> privacy risk, but that is for a different soapbox.
>>
>> Action plan:
>>
>> The IDEAS charter should say something like:
>>
>> "IDEAS will act as an enabling technology for the various ID/Loc
>> technologies currently specified within the IETF.  As such it will result in
>> a wider deployment of mobile, Peer-to-Peer communications.  Care will be
>> taken in the design of the IDEAS technology not to enable the privacy
>> leakage attacks in current Client/Server (predominantly web-based)
>> communications to be linked to these P2P communications."
>>
>> This means that whatever technology we come up with for IDEAS will mask/hide
>> PII/Identity/Identifier.  So that Eve is in the dark and we need only defend
>> the IDEAS data store from Mal.
>>
>> Each ID/Loc technology (and this means ME with HIP) will need revisions to
>> both its control and data plane (this means ESP for HIP) to change how
>> Identity and Identifiers are handled to break privacy tracking by Eve.
>> This may require using IDEAS as an enabler of privacy functions (I suspect I
>> will need it in HIP to deal with the HI in the R1 packet).  TLS 1.3 may also
>> need revisions with its zero-RTT method.
>>
>> The final, and potentially big one that is outside the IETF's control is
>> that OSs and ISPs MUST enable support for multiple addresses per host and
> ISP support requires a protocol to do bulk address assignment. This is
> supported with DHCP, although it would be nice to have a method to
> compress addresses in a response to 64 bits (identifiers) assuming
> they all have a common 64 bit prefix. Of course Android doesn't
> support DHCPv6 so they're going to need to be convinced that /128
> address assignments are a leap forward.
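A rough sketch of the compression idea above: when every assigned address shares one 64-bit prefix, a response can carry the prefix once followed by the 64-bit identifiers. The wire layout and the function names (compress, decompress) are hypothetical illustrations, not an existing DHCPv6 option.

```python
import ipaddress
import struct

MASK64 = (1 << 64) - 1


def compress(addresses):
    """Pack addresses sharing one /64 as an 8-byte prefix plus 8-byte identifiers."""
    if not addresses:
        raise ValueError("need at least one address")
    ints = [int(ipaddress.IPv6Address(a)) for a in addresses]
    prefix = ints[0] >> 64
    if any(i >> 64 != prefix for i in ints):
        raise ValueError("addresses do not share a common 64-bit prefix")
    # Network byte order, one unsigned 64-bit word per field.
    return struct.pack(f"!{len(ints) + 1}Q", prefix, *(i & MASK64 for i in ints))


def decompress(blob):
    """Rebuild the full 128-bit addresses from the packed form."""
    prefix, *idents = struct.unpack(f"!{len(blob) // 8}Q", blob)
    return [ipaddress.IPv6Address((prefix << 64) | i) for i in idents]


if __name__ == "__main__":
    addrs = ["2001:db8:0:1::a", "2001:db8:0:1:dead:beef:0:1"]
    blob = compress(addrs)
    # 8 bytes of prefix + 8 bytes per address, instead of 16 bytes per address.
    print(len(blob), [str(a) for a in decompress(blob)])
```

For N addresses this costs 8 + 8N bytes rather than 16N, so the saving approaches one half as N grows.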
>
> OSes support multiple addresses to be configured on an interface
> (order of 1000s). But the use of addresses needs to change to support
> privacy. The concept of different address per outgoing connection
> needs to be implemented. The semantics of INADDR_ANY need to be
> modified to restrict the addresses allowed for incoming connections
> (this is already being worked on for container virtualization). There
> are also a few "philosophical" questions relating to expected uses of any
> assigned address-- like how to deal with ICMP. For instance, should
> all of the addresses assigned to a device respond to ping?
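One way the "different address per outgoing connection" idea could look at the application level, as a sketch only. The SourceAddressPool class and connect_from helper are hypothetical names; the pool simply cycles through addresses already configured on the host, whereas a privacy-preserving implementation would presumably retire an address after use rather than reuse it.

```python
import itertools
import socket


class SourceAddressPool:
    """Hands out a different local address for each outgoing connection."""

    def __init__(self, addresses):
        # Assumption: these addresses are already configured on an interface.
        self._cycle = itertools.cycle(list(addresses))

    def next_source(self):
        return next(self._cycle)


def connect_from(pool, dest_host, dest_port, timeout=5.0):
    """Open a TCP connection whose source address is taken from the pool."""
    src = pool.next_source()
    # source_address makes the socket bind to (src, any port) before connecting.
    return socket.create_connection(
        (dest_host, dest_port), timeout=timeout, source_address=(src, 0)
    )


if __name__ == "__main__":
    pool = SourceAddressPool(["2001:db8::1", "2001:db8::2"])
    print(pool.next_source(), pool.next_source(), pool.next_source())
```

The incoming side (restricting what INADDR_ANY accepts) needs kernel support and is not covered by this sketch.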
>
>> let technologies within the hosts (like ID/Loc) get addresses to provide
>> privacy separation.  This ALSO extends to MAC addresses!  Eve could be
>> tapping into those IPFIX flows (now there is a BIG privacy leakage attack
>> that no one is talking about) and getting all the MAC/IP address mappings!
>>
> RFC4941 talks about the problem of embedding IEEE identifiers into
> IPv6 addresses. That practice is no longer considered acceptable. In
> some sense, identifier-locator takes this to its logical extreme where
> the "identifier" used to create addresses changes at the time
> granularity of every new connection.
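To illustrate why embedding IEEE identifiers is a problem, here is a small sketch of the modified EUI-64 mapping (insert ff:fe in the middle of the MAC and flip the universal/local bit) together with its inverse. The function names are mine, and random_iid is a simplification that just draws 64 random bits; it is not the RFC4941 generation procedure.

```python
import secrets


def eui64_iid(mac):
    """Modified EUI-64 interface identifier derived from a 48-bit MAC."""
    b = bytes.fromhex(mac.replace(":", ""))
    # Flip the universal/local bit and insert ff:fe between the two halves.
    full = bytes([b[0] ^ 0x02]) + b[1:3] + b"\xff\xfe" + b[3:]
    return int.from_bytes(full, "big")


def mac_from_iid(iid):
    """Recover the MAC from an EUI-64 derived identifier (what Eve can do)."""
    b = iid.to_bytes(8, "big")
    mac = bytes([b[0] ^ 0x02]) + b[1:3] + b[5:]
    return ":".join(f"{x:02x}" for x in mac)


def random_iid():
    """An identifier with no relation to the hardware (simplified)."""
    return secrets.randbits(64)


if __name__ == "__main__":
    iid = eui64_iid("00:25:96:12:34:56")
    print(f"{iid:016x}", mac_from_iid(iid))
    print(f"{random_iid():016x}")
```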
>
>> One caveat that makes the multiple address not so big of a challenge is that
>> ISPs are already providing some level of multiple address support by
>> allowing hotspot usage on the mobile devices.  The IP address seen on the
>> network MAY be from a given device or a device using it as a gateway.  This
>> will become increasingly more common with automotive hotspots.  But this is
>> NOT something we should count on as a mitigation of this privacy attack.
>>
> I was thinking about this problem. The normal way to implement a hot
> spot is to give a device a prefix and delegate addresses from that
> prefix. But that means the prefix is encoded in addresses which breaks
> the address privacy properties above. I think the alternative is
> just to assign a hot spot a whole bunch of /128 addresses and let
> them do what they please with them. They can delegate addresses to
> their tethered clients.  So devices in the identifier-locator network
> may each be assigned 1000s of addresses, and devices that are hot spots
> for many clients may end up needing 100s of thousands or more. The net
> result is that the mapping system is going to need to scale to very
> large numbers; I am assuming the system will need to track more than
> 1T identifiers at scale. Not going to be easy :-)
>
> Tom
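A back-of-the-envelope check of the scale mentioned above. The device count and the per-entry size are my own assumptions (one billion devices, a 128-bit identifier address mapped to a 128-bit locator with no other state); only the "1000s of addresses" per device and the 1T total come from the message.

```python
# All inputs except ADDRS_PER_DEVICE are illustrative assumptions.
DEVICES = 1_000_000_000        # assumed number of devices
ADDRS_PER_DEVICE = 1_000       # "1000s of addresses" per device
BYTES_PER_ENTRY = 16 + 16      # assumed: 128-bit identifier + 128-bit locator

identifiers = DEVICES * ADDRS_PER_DEVICE
table_bytes = identifiers * BYTES_PER_ENTRY

if __name__ == "__main__":
    print(f"{identifiers:.1e} identifiers")
    print(f"about {table_bytes / 1e12:.0f} TB of raw mapping state")
```

Hot spots needing 100s of thousands of addresses would push the total higher still, and the figure ignores replication and indexing overhead.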