Re: [arch-d] Possible IAB Adoption of draft-kpw-iab-privacy-partitioning

Mirja Kuehlewind <ietf@kuehlewind.net> Tue, 13 December 2022 16:33 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/architecture-discuss/9dqLyHDDhncnJawrXwanbwcXqYw>

Hi Vittorio,

I see that we already have an issue open that, I believe, addresses your main point. I would still like to comment on some points below, as I don't think you interpreted the discussion in the draft entirely correctly.

> On 17. Nov 2022, at 17:08, Vittorio Bertola <vittorio.bertola=40open-xchange.com@dmarc.ietf.org> wrote:
> 
>> On 16/11/2022 19:31 CET IAB Executive Administrative Manager <execd@iab.org> wrote:
>> 
>> Feedback about this draft can be sent in response to this mail on architecture-discuss@ietf.org, or to the IAB directly at iab@iab.org.
> 
> I will repeat and detail what I said when this document was presented in London.
> 
> The statement that this model is "a means to improve the privacy by separating user identity from user data" only stands under some conditions.
> 
> One is the lack of collusion between intermediaries that should sit in independent contexts. The draft acknowledges this in section 5.1, yet the mitigations suggested seem actually counterproductive.
> 
> Contractual agreements make entities more interdependent, not more independent; if both companies live off user profiling and big data collection, they will always have a vested interest in sharing data and the fact that they have a (confidential and unverifiable) contract swearing that they will never do so is only as good as the trust you are willing to put in these companies. Moreover, it is unclear why any party would support a significant technical cost (how much does it cost to route all that traffic?) for no clear reason and, in the absence of user data monetization, no clear revenue stream, except maybe direct payment for this service, which is generally not a popular business model for consumer Internet services.

The text says "Policy and contractual agreements between entities involved in partitioning". This means, for example, that the client might have a contractual relationship with the operator of the relay (e.g. the relay could be operated by the access network provider). In this case the client already trusts that entity, and the operator has an incentive to maintain this trust and provide a good service.

> 
> The other suggestion is to just add more and more intermediaries, and this is really baffling. First, privacy is about reducing the number of entities that have a chance to access your data, not increasing it. Second, I am lost at what is now the vision of the IETF's technical leadership on intermediaries; I've seen continuous bashing of the very idea of in-network intermediation for years, including drafts that propose principles to remove or at least reduce existing intermediaries as far as possible (e.g. https://datatracker.ietf.org/doc/draft-thomson-tmi/ ), and here is one that actually proposes to take traffic that was happily scattered throughout the Internet and make it all go through new intermediaries.

Yes, this draft, and the protocols under development in the IETF that it discusses, will add more intermediaries. However, one important point is that those intermediaries are explicitly involved by the client's action (in contrast to NATs or transparent proxies). Further, this document does not propose to spread all your information to all entities. Instead, it assumes a certain set of information that is worth sharing (because the client wants to use a certain service) and proposes to divide/partition that information among multiple entities, so that each entity sees less than it would if all the information were shared with a single entity. If you share the same information with multiple or all entities, this will decrease your privacy, but that is not what the draft says. Of course you can also partition your data in a (wrong) way that reduces privacy; that is definitely the hard part, as discussed in the section on limits.
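
As a rough illustration of the partitioning idea described above, here is a minimal sketch. It is not taken from the draft: the `Request` fields, the two-party relay/target layout, and all function names are assumptions made for this example only. It compares a partitioned split, where no single party sees both the client identity and the payload, with the "wrong" case where the same information goes to every party.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    # Hypothetical fields, chosen only for illustration.
    client_ip: str    # identifies the user
    target_host: str  # where the request is going
    payload: str      # what the user is asking for


def partition(req: Request) -> dict:
    """Split one request into the views seen by a relay and by the target."""
    relay_view = {"client_ip": req.client_ip, "target_host": req.target_host}
    target_view = {"target_host": req.target_host, "payload": req.payload}
    return {"relay": relay_view, "target": target_view}


def broadcast(req: Request) -> dict:
    """The 'wrong' split: every party receives the full request."""
    full = {
        "client_ip": req.client_ip,
        "target_host": req.target_host,
        "payload": req.payload,
    }
    return {"relay": dict(full), "target": dict(full)}


def links_identity_to_data(view: dict) -> bool:
    """True if a single party can associate the user with the payload."""
    return "client_ip" in view and "payload" in view


def is_partitioned(views: dict) -> bool:
    """True if no single party sees both identity and payload."""
    return not any(links_identity_to_data(v) for v in views.values())


req = Request("192.0.2.10", "example.org", "GET /health-advice")
partitioned_ok = is_partitioned(partition(req))
broadcast_ok = is_partitioned(broadcast(req))
```

In this sketch the property only holds as long as the relay and the target do not combine their views, which is the collusion concern discussed in this thread.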

> 
> In the end, I think that collusion will always be a very significant risk, as the economic incentives will generally be aligned in favour of it, and this looks like a fundamental flaw of the idea.
> 
> But then, there is a second condition that could address all of this and make the model viable: that we would realistically have a very big number of intermediate proxy providers spread across the globe, and that users would be free to pick and to change the ones they want to use, or even distribute their traffic across many at the same time.
> 
> In this case, we could assume that users would make sure that the services they pick are actually independent and unlikely to collude, and could look for the ones that do not have user data monetization as a business model. This would also meet the requirements of most privacy legislations, which put the user in general control of who gets to see their personal data.
> 
> However, I really see no discussion of this problem anywhere, let alone the recognition that user control is the best antidote against misuses of this scheme. Also, the early implementations are often the exact opposite of this, as, while being opt-in, they do not seem to give any control to users, not even hidden under several layers of configuration menus.
> 
> So, if designed this way, this architecture seems like another big step forward for centralization, this time by routing the entire traffic of each user through a limited set of giant proxy platforms run by the only few companies that can afford to deploy them.
> 
> Actually, the document often seems to consider this expected consolidation as a given, for example when, in section 6 point 2, it dismisses concerns about performance losses by assuming a likely "highly optimized nature of proxy-to-proxy paths", i.e. the proxies being in limited numbers and sitting at the core of the fastest global networks.
> 
> If we want to prevent this, then how to facilitate the easy deployment of a big number of independent proxies, and how to allow users to discover and choose among them, should be a main concern of the document; and also, how could the system be designed so that the economic incentives are against centralization and against collusion, and not in favour.

I would actually say that the problem of centralisation is really separate from what this document is trying to do. For me, the value of this document is to spell out a principle that we have already observed in use in many current IETF protocols. By giving it a name, I believe the community will be better able to apply it correctly and analyse its implications.

However, as explained above, a strong assumption in this document is that the client has control. How this is implemented as an interface to the user is a different question and, I agree, a general problem that we don't have a great solution for. Also, as you note, how much choice there actually is depends on the deployment. But in the end the goal is to actually expose less sensitive data, as also explained above. So in most of the examples in the draft, the relay operator cannot learn sensitive data about the user, which also makes it less critical who exactly operates this relay (e.g. one big player or many small ones). The more important part is trusting the relay operator not to collude, and I guess that if you as a user already have a contractual relationship with that entity, the trust is higher.

I guess we could add a point on centralisation to the section on impacts. However, again, it is not the principle we spell out in the draft that drives centralisation, but other effects of how our protocols are deployed and used. I agree with you that discovery might be an important point, but that is true for all new protocols we develop.

> 
> Hope this is useful; I'd recommend that the IAB clarifies the vision and the objectives before adopting the document.

I guess we could add a sentence saying that the scope of this document is limited to the definition of the principle, but I really wouldn't want to expand the scope to discuss intermediaries or centralisation. We do have separate documents for that.

Mirja



> 
> -- 
> Vittorio Bertola | Head of Policy & Innovation, Open-Xchange
> vittorio.bertola@open-xchange.com 
> Office @ Via Treviso 12, 10144 Torino, Italy