Re: [privacydir] Privacy Terminology: What are useful terms?

Nick Mathewson <nickm@torproject.org> Sat, 16 July 2011 04:14 UTC

MIME-Version: 1.0
Sender: nick.a.mathewson@gmail.com
In-Reply-To: <E3350ABE-A2A1-42BB-B446-54A90A0A64BC@gmx.net>
References: <5821BF1F-0FEF-4C6C-89A5-3A33BDE4F843@gmx.net> <CAKDKvuy80Rg4S8Pju2LqU7ew27oN2MNN_Z+FjWFVDiF=aGV7aA@mail.gmail.com> <E3350ABE-A2A1-42BB-B446-54A90A0A64BC@gmx.net>
Date: Sat, 16 Jul 2011 00:14:54 -0400
Message-ID: <CAKDKvuyGcY50RV=VNn8vR81K6mQWS7gHKXD0bvARNF5nEvmzVw@mail.gmail.com>
From: Nick Mathewson <nickm@torproject.org>
To: Hannes Tschofenig <hannes.tschofenig@gmx.net>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
Cc: privacydir@ietf.org
Subject: Re: [privacydir] Privacy Terminology: What are useful terms?
Precedence: list

On Wed, Jul 13, 2011 at 5:39 AM, Hannes Tschofenig
<hannes.tschofenig@gmx.net> wrote:
>> We don't use "undetectability" or "unobservability".
>>
>>
> I have seen the unobservability term being used in security protocols when the traffic characteristics shall be hidden (via padding).
> Aren't you doing something similar in Tor? If so, what do you call that property?

Well, we don't actually do enough padding right now to make many
interesting characteristics unobservable.  When we talk about padding,
we usually talk less about making the real traffic volume unobservable
and more about making the padded view of a traffic stream unlinkable
to an unpadded view.

When we're talking in general about a particular unobservable
property, we tend to use the active voice.  Rather than say (for
example) that stream open and close events are unobservable by an
attacker watching the client, we tend to say that the attacker can't
tell when streams open and close.

(This is my sense from skimming and grepping our recent design
documentation; it may be that I'm a bit off.  Also, again, I'm not
suggesting that our practice is better than the proposed
terminology--just that it seems to be what we're doing.)

 [...]
>>   * We talk about one kind of item being "distinguishable" from
>> another.  (For example, a protocol is "indistinguishable" from HTTPS
>> to the extent that an attacker can't tell instances of that protocol
>> from regular HTTPS connections.)
>
> Could be useful.

 [...]
>>  * We use "profiling" to mean learning information about an anonymous
>> subject's activities without necessarily linking them to any specific
>> transaction.  For example, if an attacker concludes that I play WoW,
>> read reddit.com, and upload videos, then my activities have been
>> profiled, even if the attacker is unable to identify which connections
>> or accounts are mine.
>
> Profiling may be a useful term to add.
> Btw, I searched through the Tor documents and couldn't really find a definition.

I think the closest you'd find might be in a discussion of guard
nodes.  But I don't think we ever came up with a good formal
definition.

>> Some additional terminology that I think might be idiosyncratic:
>>
>>   * We use "linkable session" to refer to a set of actions by a
>> subject that the system makes no effort to render unlinkable from one
>> another.
>
> Could you provide an example?

Sure. Take a session on a website.  The user's browser might request
resources from a number of different hosts, and might do so not only
for one but for several pages.  Nonetheless, it might make sense to
treat all the TCP connections as part of a single "session", since
each is already linked to all the rest semantically.

For a more trivial example, suppose a user is logging in to a service
pseudonymously.  Given this behavior, there's no point in trying to
make different logins to the same pseudonymous account unlinkable by
the service: the service can already link them through their use of
the same pseudonym.
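
As a rough illustration of the "linkable session" idea, here is a
minimal Python sketch.  It assumes each connection has already been
tagged with whatever makes it semantically linked (the site being
browsed, or the pseudonym being used).  The field names "session_key"
and "dest", and the sample data, are hypothetical and not taken from
any Tor design document.

```python
from collections import defaultdict

def group_into_linkable_sessions(connections):
    """Group connections that are already linked to each other.

    Each connection is a dict with a hypothetical "session_key" (the
    thing that links it semantically, such as a site visit or a
    pseudonym) and a "dest" (the host contacted).
    """
    sessions = defaultdict(list)
    for conn in connections:
        sessions[conn["session_key"]].append(conn["dest"])
    return dict(sessions)

# Made-up example: one page view pulling from several hosts, plus two
# logins under the same pseudonym.
connections = [
    {"session_key": "visit:news-site", "dest": "news.example"},
    {"session_key": "visit:news-site", "dest": "cdn.example"},
    {"session_key": "visit:news-site", "dest": "ads.example"},
    {"session_key": "pseudonym:alice42", "dest": "forum.example"},
    {"session_key": "pseudonym:alice42", "dest": "forum.example"},
]

sessions = group_into_linkable_sessions(connections)
for key, dests in sessions.items():
    print(key, dests)
```

Under this sketch, a system would make no effort to separate the
connections within one group, and would only try to keep different
groups unlinkable from one another.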

>>   * We use "linking identifier" to refer to any parameter P that an
>> attacker can observe about an IOI and use to link it to similar IOIs
>> that have similar values for P.  For example, the window size header
>> transmitted in a typical HTTP request is a linking identifier.
>>
> I wasn't aware that the window size header has such a characteristic.
> Is there a paper you could recommend to learn more about this aspect?

I'd check out the findings from the EFF's Panopticlick project for
recent results in identifying web browsers by header.  For older
results, I'll need to look around a bit. Window size alone isn't
enough to split off traffic by users (since people resize windows) and
isn't enough to isolate users (since lots of people can share a single
window size), but it helps plenty with statistical linking.
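
One way to make "helps plenty with statistical linking" concrete is
to ask how surprising a given value is within a population.  The
Python sketch below computes that in bits for each distinct value.
The sample window sizes are made up for illustration, and the function
name is hypothetical; this is not the method of any particular study.

```python
import math
from collections import Counter

def surprisal_bits(observations):
    """Return -log2(share of observations) for each distinct value.

    A common value yields few bits (many users share it); a rare
    value yields more bits and so narrows the candidate set further.
    """
    counts = Counter(observations)
    total = len(observations)
    return {value: -math.log2(n / total) for value, n in counts.items()}

# Made-up sample: window sizes observed across 8 requests.
window_sizes = (
    ["1280x720"] * 4 + ["1920x1080"] * 2 + ["1366x768", "1024x600"]
)

bits = surprisal_bits(window_sizes)
for size, b in sorted(bits.items(), key=lambda kv: kv[1]):
    print(f"{size}: {b:.1f} bits")
```

In this toy sample no single value isolates a user on its own, but
each observation shrinks the set of candidates, and combining several
such parameters shrinks it further.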

cheers,
-- 
Nick
