Re: [perpass] Traffic analysis

Ned makes a number of excellent points here. The real elephant in the room
is the question of under what terms and conditions the ongoing collection of
metadata about IP communications, in any form, is actually needed and in
fact absolutely necessary.

There are perfectly good reasons to collect this stuff.   Though the ongoing
concern of this list is clearly the Snowden revelations, some of us actually
want that data to prevent and investigate real fraud and abuse within the
communications systems, to optimize network transport, etc.

First, if you take a little stroll over to the IETF STIR problem statement
you will see that fraudulent voice communications are becoming a huge
problem for national regulators and law enforcement.   In the US, the failure
of calls to complete in certain rural areas now requires the US carriers to
maintain ever larger CDR records in order to preserve the integrity of the
PSTN itself.

http://www.fcc.gov/document/fcc-acts-combat-call-completion-problems-rural-america

Consumers are totally outraged by the violations of THEIR PRIVACY... the
right to be left alone... by malicious robocallers who ignore the various
laws about Do Not Call lists etc.   E-mail spam has not gone away by any
account, and logs and records are still needed to attempt to track criminal
activity.  We want to hunt these people down and shut down their operations.

The issue is appropriate safeguards on those records and there is
essentially nothing the IETF can do about that.  

It is useful to talk about strengthening key length and understanding the
underlying architectural reasons no one really wants to deploy secure
communications.

I totally agree with this statement: "I think there are small technical
changes around the edges that can help, but I really see the solutions for
the metadata problem as more political and social than technical.
Concentrating on making encryption really, really easy to use would go a lot
further at this time than messing with deep changes, because people are not
even using what is already available."

Though I would not have used Tony's precise language in a public mail, I'm
afraid I agree with the underlying sentiment.

There is more than one joke running around Washington DC about actually
wanting the NSA to keep the CDR records if they would actually use them to
stop robocalls and call spoofing.

-----Original Message-----
From: perpass-bounces@ietf.org [mailto:perpass-bounces@ietf.org] On Behalf
Of ned+perpass@mrochek.com
Sent: Monday, October 28, 2013 2:30 PM
To: Joe St Sauver
Cc: perpass@ietf.org; huitema@huitema.net; stephen.farrell@cs.tcd.ie
Subject: Re: [perpass] Traffic analysis

> Hi,

> Stephen Farrell <stephen.farrell@cs.tcd.ie>:

> #Not quite sure, but I think we might get some benefit at the moment
> #from considering how specific fields in real protocols undermine
> #privacy (e.g. as Christian's draft does with the Received header
> #fields in mail messages) even if/when TLS or other existing security
> #mechanisms are properly used.

> My concern is that many traffic analytic approaches tend to be
> exceedingly robust to "protocol improvements." Protocol tweaks may
> accomplish little when it comes to practically improving privacy if
> the underlying protocol's architecture and operational practice go
> unchanged.

> For example, when it comes to email, shouldn't section 6 of 
> http://huitema.net/papers/draft-huitema-perpass-analthreat-00.txt
> basically say, "if you want to avoid traffic analytic approaches in 
> the case of email, deploy and use Mixmaster anonymous remailers"?
> ( https://en.wikipedia.org/wiki/Anonymous_remailers#Untraceable_remailers )

And good luck with that, at least on any kind of scale.

But your underlying point is very well taken: The section on email in this
draft focuses on irrelevancies and fails to take note of the real issues.

I hate to sound like a broken record, but folks really need to have some
familiarity with present-day email as it is actually deployed before making
these sorts of assessments.

Again, present-day email usage is increasingly concentrated in a fairly
small number of large ISPs and MSPs. (Small ISPs and enterprise setups are
shifting to using cloud services, and while the Snowden revelations may have
slowed this trend, they haven't stopped it.)

With regard to traffic analysis, this is in some ways a good thing. If the
connections from user clients to the ISP/MSP servers are secured at the
transport layer - and I have demonstrated that a lot of them are - then we
gain a lot by securing the streams between the large providers at the
transport level.

But the elephant in the corner is logging. Service providers maintain very
extensive logs of email traffic, if for no other reason than as a support
tool. These logs provide every possible detail needed for traffic analysis.
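
To illustrate how little is needed, here is a minimal sketch in Python. It
assumes a hypothetical, simplified log of (timestamp, sender, recipient)
records; real provider logs differ in format and carry far more detail. The
names LOG and contact_counts are made up for this example. It only shows that
counting messages per pair of correspondents already yields a contact graph,
without any access to message content.

```python
from collections import Counter

# Hypothetical, simplified log records: (timestamp, sender, recipient).
# This layout is an assumption for illustration, not any provider's format.
LOG = [
    ("2013-10-28T14:01:07", "alice@example.org", "bob@example.net"),
    ("2013-10-28T14:03:52", "bob@example.net", "alice@example.org"),
    ("2013-10-28T14:09:30", "alice@example.org", "carol@example.com"),
    ("2013-10-28T15:12:11", "alice@example.org", "bob@example.net"),
]


def contact_counts(records):
    """Count messages exchanged per unordered pair of correspondents."""
    pairs = Counter()
    for _timestamp, sender, recipient in records:
        # frozenset makes the pair direction-independent.
        pairs[frozenset((sender, recipient))] += 1
    return pairs


counts = contact_counts(LOG)
for pair, n in counts.most_common():
    print(sorted(pair), n)
```

Adding the timestamps back in would give frequency and timing patterns as
well, which is the sense in which the logs alone suffice.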

Of course one of the earliest Snowden revelations was that the NSA is
collecting these logs from US providers on a massive scale. And hopefully
everyone is aware of Smith v. Maryland, which essentially says that metadata
is not constitutionally protected.

But before Europeans and others get all smug about this, speaking as someone
who has seen quite a few RFPs for mail systems, the only substantive
difference I see between the US and elsewhere is that the US approaches this
in a less organized and systematic way and generally has fewer auditing and
data protection requirements. The data is still being collected, and most
likely shared.

And as for practical and deployable measures that can be undertaken to
address this, I'm at something of a loss to suggest anything. Shifting back
to a more decentralized model sounds nice, but seems a bit outside the
purview of a standards process to try and make that happen.

And even if a completely decentralized model were practical, in a
peer-to-peer world the metadata that would accrue from watching the
connections themselves would be a fair substitute.

As for mixed models, look at what happened to Lavabit.

> And if we *are* talking about that sort of approach, then I think 
> inevitably we also need to talk about how we simultaneously manage to 
> allow *wanted* private traffic while simultaneously preventing or 
> managing *unwanted traffic* (e.g., spam).

Yep. It's a daunting problem. And it is far from the only one.

> An awful lot of current anti-spam technology depends upon either 
> reputation (which is obviously not present in the case of 
> anonymous/non-attributable traffic), or content analysis (which is 
> also obviously problematic, at least if we presume use of end-to-end 
> encryption (at least until the content is decrypted on the end-user's 
> device)).

You basically have to push the content checks to the client. This has not
proven to be a terrific solution in practice.

> I also think that if you're serious about email privacy, you really 
> can't keep the discussion just at the level of sanitizing headers. You 
> need to get into the format of the content that's allowed as well. For 
> example, it's well known that non-plain text email content (e.g., 
> HTML-formatted email) is potentially a serious threat to privacy due 
> to potential use of things like tracking gifs included in 
> HTML-formatted email.

I think we can do a lot to make it harder to snoop on email content,
although ironically what we're likely to be able to accomplish under the
"prism-proof" rubric is unlikely to do much of anything about the data
collection the actual Prism program performs.

But traffic analysis... unless the fact that those logs are likely to only
be accessible to state entities offers some consolation, I don't think
there's going to be much happiness here.

				Ned
_______________________________________________
perpass mailing list
perpass@ietf.org
https://www.ietf.org/mailman/listinfo/perpass