[saag] Liking Linkability

Henry Story <henry.story@bblfish.net> Mon, 08 October 2012 17:01 UTC

From: Henry Story <henry.story@bblfish.net>
Resent-From: Henry Story <henry.story@bblfish.net>
Date: Sat, 06 Oct 2012 15:49:12 +0200
Resent-Date: Mon, 08 Oct 2012 19:01:16 +0200
Message-Id: <88F98DFD-EF7D-4444-A9C2-FB8E15F5DA89@bblfish.net>
Resent-To: saag@ietf.org
To: "public-webid@w3.org" <public-webid@w3.org>, "public-identity@w3.org" <public-identity@w3.org>, public-privacy@w3.org
Cc: "public-philoweb@w3.org" <public-philoweb@w3.org>
Subject: [saag] Liking Linkability

Notions of unlinkability of identities have recently been deployed 
in ways that, I would like to argue, are often much too simplistic 
and in fact harmful to wider issues of privacy on the web.

I would like to show this in two stages:
1. Show that linkability of identity is essential to electronic privacy 
   on the web
2. Take an example of an argument by Harry Halpin relating to 
   linkability and, by pulling it apart, show how careful one has 
   to be about taking such arguments at face value

Because privacy is the context in which the linkability or non-linkability
of identities is important, I would like to start with a simple working 
definition of what constitutes privacy with the following minimal 
criterion [0] that I think everyone can agree on:

"A communication between two people is private if the only people 
who are party to the conversation are the two people in question. 
One can easily generalise to groups: a conversation between groups 
of people is private (to the group) if the only people who can 
participate/read the information are members of that group"

Note that this does not deal with issues of people who were privy to 
the conversation later leaking information voluntarily. We cannot 
technically legislate good behaviour, though we can make it possible 
for people to express context. [1]


1. On the importance of linkability of identities to privacy 
============================================================

A. Issues of Centralisation
---------------------------

We can illustrate this with the following thought experiment, which I put
to Ben Laurie recently [0].

First imagine that we all are on one big social network, where 
all of our home pages are at the same URL. Nobody could link
to our profile page in any meaningful way. The bigger the network
the more different people that one URL could refer to. People 
that were part of the network could log in, and once logged in
communicate with others in their unlinkable channels. 

But this would not necessarily give users of the network privacy, 
simply because the network owner would be party to the conversation 
between any two people or any group of people. People who do not 
wish the network owner to be party to their conversations
cannot work within that framework. 
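
The working definition of privacy given earlier can be applied directly
to this thought experiment. The following is only a minimal sketch in
Python: the function name and the participant names are illustrative
assumptions, and "being party to a conversation" is reduced to set
membership.

```python
def is_private(readers, intended_group):
    """A conversation is private to a group if everyone who can
    read it is a member of that group (the working definition above)."""
    return set(readers) <= set(intended_group)


# Hypothetical participants, for illustration only.
alice_and_bob = {"alice", "bob"}

# On the single big network the owner can read every conversation.
centralised_readers = {"alice", "bob", "network-owner"}

# On boxes that Alice and Bob each control, nobody else can read along.
direct_readers = {"alice", "bob"}

print(is_private(centralised_readers, alice_and_bob))  # False
print(is_private(direct_readers, alice_and_bob))       # True
```

The unlinkable channels of the big network do not help here: the owner
is in the set of readers whatever identifiers the users go by.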

At the level of our planet it is clear that there will always be a 
huge number of agents that cannot for legal or other reasons allow one 
global network owner to be party to all their conversations. We are 
therefore socio-logically forced into the social web.

B. Linkability and the Social Web
---------------------------------

Secondly imagine that we now all have Freedom Boxes [4], where
each of us has full control over the box, its software, and the
data on it. (We take this extreme individualistic case to emphasise
the contrast, not because we don't acknowledge the importance of
many intermediate cases as useful.) Now we want to create a 
distributed social network - the social web - where each of us can 
publish information and through access control rules limit who can 
access each resource. We would like to limit access to groups such
as:

 - friends 
 - friends of friends
 - family
 - business colleagues
 - ... 

Limiting access means that, when a resource is accessed, we need to 
determine who is accessing it. For this we need a global identifier,
so that we can check, with the information available to us, whether the 
referent of that identifier is indeed a member of one of those 
groups. We can't have a local identifier, for that would require
that the person we were dealing with had an account on our private
box - which will be extremely unlikely. We therefore need a way 
to identify agents - pseudonymously if need be - in a global space.

Take the following example. Imagine you come to the WebID TPAC
meeting [6] and I take a picture of everyone present. I would like
to first restrict access to the picture to only those members who
were present. Clearly if I only used local identifiers, I would have
to get each one of you to first create an account on my machine. But 
how would I then know that the accounts created on the FBox correspond
to the people who were at the party? It is much easier if we could
create a party members group and publish it like this:

  http://www.w3.org/2005/Incubator/webid/team.n3

Then I could drag and drop this group on the access control panel
of my FBox admin console to restrict access to only those members.
This shows how through linkability I can restrict access and 
increase privacy by making it possible to link identities in a distributed
web. It would be quite possible furthermore for the above team.n3
resource to be protected by access control.
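
The access control step described above can be sketched as follows. 
This is only an illustration under stated assumptions: the WebID URIs, 
the resource path and the function name are hypothetical, and a real 
group document such as team.n3 would be RDF that has to be fetched 
and parsed rather than a Python set.

```python
# Hypothetical global identifiers (WebIDs) of the people in the picture.
MEETING_GROUP = frozenset({
    "https://alice.example/card#me",
    "https://bob.example/profile#i",
})

# Access control list on my box: resource path -> group allowed to read it.
ACL = {"/photos/tpac-2012.jpg": MEETING_GROUP}


def may_read(resource, webid, acl=ACL):
    """Allow access only if the authenticated WebID is a member of the
    group attached to the resource; deny by default."""
    group = acl.get(resource)
    return group is not None and webid in group


print(may_read("/photos/tpac-2012.jpg", "https://alice.example/card#me"))
print(may_read("/photos/tpac-2012.jpg", "https://mallory.example/#me"))
```

No account on my box is needed for any member of the group: the check 
works only because the identifiers are global and can be linked to.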


2. Example of how Unlinkability can be used to spread FUD 
=========================================================


So here I would like to show how fears about linkability can
then bring intelligent people like Harry Halpin to make some seemingly
plausible arguments. Here is an example [2] of Harry arguing against
W3C WebID CG's http://webid.info/spec/ 

[[
Please look up "unlinkability" (which is why I kept referencing the 
aforementioned IETF doc [sic [3] below it is a draft] which I saw 
referenced earlier but whose main point seemed missed). Then explain 
how WebID provides unlinkability. 

Looking at the spec - to me, WebID doesn't as it still requires 
publishing your public key at a URI and then having the relying party go 
to your identity provider (i.e. your personal homepage in most cases, 
i.e. what it is that hosts your key) in order to verify your cert, which 
must provide that URI in the SAN in the cert. Thus,  WebID does not 
provide unlinkability. There's some waving of hands about guards and 
access control, but that would not mediate the above point, as the HTTP 
GET to the URI for the key is enough to provide the "link".

In comparison, BrowserID provides better privacy in terms of 
unlinkability by having the browser in between the identity provider and 
the relying party, so the relying party doesn't have to ping the 
identity provider for identity-related transactions. That definitely 
helps provide unlinkability in terms of the identity provider not 
needing to knowing every time the user goes to a relying party.
]]

If I can rephrase, the point seems to be the following: a WebID verification 
requires that the site you are authenticating to ( the Relying Party ) verify
your identity by dereferencing ( let me add: anonymously ) your profile 
page, which might publicly contain as little as your public key. This is the 
yellow box in the picture here:

 http://www.w3.org/2005/Incubator/webid/spec/#the-webid-protocol

The leakage of information then would not be towards the Relying Party - the
site you are logging into - because that site is the one you just wilfully 
sent a proof of your identity to. The leakage of information is (drum roll) 
towards your profile page server! That server might discover ( presumably 
through IP address sniffing ) which sites you might be visiting. 

One reasonable answer to this problem would be for the Relying Party to fetch 
this information via Tor, which would remove the IP address sniffing problem.
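
The flow just described, and what the profile server gets to see, can be 
simulated in a few lines. This is a simplified sketch and not the WebID 
protocol itself: keys are plain strings, the TLS handshake that proves 
possession of the private key is left out, the profile is a dictionary 
rather than an RDF document, and all class, function and host names are 
hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileServer:
    """Stands in for the server hosting the profile page."""
    profiles: dict  # WebID URI -> set of public keys listed in the profile
    access_log: list = field(default_factory=list)

    def get(self, webid, client_address):
        # The only thing the server learns is who fetched the profile.
        self.access_log.append(client_address)
        return self.profiles.get(webid, set())


def verify_webid(server, claimed_webid, presented_key, fetch_from):
    """Relying Party side: dereference the WebID and check that the key
    presented by the client is listed in the profile."""
    return presented_key in server.get(claimed_webid, fetch_from)


webid = "https://alice.example/card#me"
server = ProfileServer({webid: {"alice-public-key"}})

ok_direct = verify_webid(server, webid, "alice-public-key", "shop.example")
ok_relayed = verify_webid(server, webid, "alice-public-key", "anonymising-relay")
rejected = verify_webid(server, webid, "mallory-key", "shop.example")

print(ok_direct, ok_relayed, rejected)  # True True False
print(server.access_log)
```

When the Relying Party fetches directly, its address ends up in the 
profile server's log. When it fetches through a relay ( Tor in the 
suggestion above ), the log only shows the relay.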

But let us develop the picture of who we are (potentially) losing 
information to. There are a number of profile server scenarios: 

A. Profile on My Freedom Box [4]

The FreedomBox is a personal machine that I control, running
free software that I can inspect. Here the only person who has
access to the Freedom Box is me. So if I discover that I logged
in somewhere, that should come as no surprise to me. I might even
be interested in this information as a way of keeping track
of where I logged in - and perhaps also of whether anything had been 
logging in somewhere AS me. (Sadly it looks like it might be
difficult to get much good information there as things stand 
currently with WebID.)

B. Profile on My Company/University Profile Server

As a member of a company, I am part of a larger agency, namely the 
Company or University that is backing my identity as a member of that
institution. A profile on a University web site can mean a lot more
than a profile on some social network, because it is in part backed
by that institution. Of course as members of that institution we
are part of a larger agenthood. And so it is not clear that the institution
and I are, in that context, that different. This is also why it is 
often legally required that one not use one's company identity for
private business.

C. A Social Network ( Google+, Facebook, ... )

It is a bit odd that people who are part of these networks, and who
are "liking" pretty much everything on the web in a way that is clearly
visible and is encouraged by those networks to be visible to the 
network, would have an issue with those sites perhaps knowing (if the 
RP does not use Tor or a proxy) where they are logging in. It is certainly
not the way that OAuth, OpenID or other protocols that are in extremely 
wide use now have been developed and are used by those sites.

If we look then at BrowserID [7], now Mozilla Persona, the only real difference 
from WebID ( apart from it not being decentralised until crypto in the
browser really works ) is that the certificate is updated at short notice 
- once a day - and that relying parties verify the signature. Nor, of course,
can the relying party get many interesting attributes this way; if it did,
the whole of the unlinkability argument would collapse immediately.


3. Conclusion
=============

Talking about privacy is like talking about security. It is a breeding ground 
for paranoia, which tends to make it difficult to notice important
solutions to the problems we actually have. Linkability and unlinkability as defined in
draft-hansen-privacy-terminology-03 [3] come with complicated definitions,
and are I suppose meant to be applied carefully. But the choice of "unlinkable"
as a word tends to help create rhetorical short cuts that are apt to hide the 
real problems of privacy. By trying too hard to make things unlinkable we are moving 
inevitably towards a centralised world where all data is in big brother's hands. 

I want to argue that we should all *Like* Linkability. We should
do so aware that we can protect ourselves with access control (and Tor), 
and realise that we don't need to reveal anything more than anyone knew 
beforehand in our linkable profiles.

To create a Social Web we need a Linkable ( and likeable ) social web.
We may need other technologies for running Wikileaks type set ups, but
they clearly cannot be the basis for an architecture of privacy - even
if they are an important element in the political landscape.

Henry

[0] This is from a discussion with Ben Laurie
    http://lists.w3.org/Archives/Public/public-webid/2012Oct/att-0022/privacy-def-1.pdf
[1] Oshani's Usage Restriction paper
    http://dig.csail.mit.edu/2011/Papers/IEEE-Policy-httpa/paper.pdf
[2] http://lists.w3.org/Archives/Public/public-identity/2012Oct/0036.html
[3] https://tools.ietf.org/html/draft-hansen-privacy-terminology-03
[4] http://www.youtube.com/watch?v=SzW25QTVWsE
[6] http://www.w3.org/2012/10/TPAC/
[7] A Comparison between BrowserId and WebId
    http://security.stackexchange.com/questions/5406/what-are-the-main-advantages-and-disadvantages-of-webid-compared-to-browserid


Social Web Architect
http://bblfish.net/