[perpass] On the perfect passive adversary, men at the end, and configuration versus mechanism

Brian Trammell <trammell@tik.ee.ethz.ch> Sun, 18 August 2013 18:45 UTC

Greetings, all,

On the floor of my home directory, there are notes for a paper we never finished called "The Perfect Passive Adversary". It was intended as a survey of the network security and measurement literature and practice, explaining what is possible if one assumes an adversary that (1) can observe every packet of every communication at every point within a network, but (2) has no access to the endpoints beyond the network interface (i.e. no keyloggers, and no bulk access to user data at a service provider), and (3) cannot influence packet forwarding. It grew to be rather too wide in scope and too paranoid in tone to have a realistic chance at publication, so we dropped it, which in retrospect seems a silly choice.

I would propose that we consider this as one of the threat models we address. The exercise I propose is to examine the protocols we're familiar with and think about (1) what's in the payload, (2) what can be inferred from identifying metadata (IP addresses and port numbers, and application-layer addresses such as email addresses and URLs), and (3) what can be inferred from non-identifying metadata (data volume and packet counts, interarrival times, control information such as sequence numbers, and so on).
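As a rough illustration of that exercise, here is a minimal Python sketch that sorts the observable fields of a protocol into the three categories above. The field names and the FIELD_EXPOSURE table are hypothetical placeholders of mine, not taken from any real protocol specification; a real audit would fill the table in per protocol.

```python
from enum import Enum


class Exposure(Enum):
    PAYLOAD = "payload"
    IDENTIFYING = "identifying metadata"
    NON_IDENTIFYING = "non-identifying metadata"


# Hypothetical field names, chosen only to mirror the examples in the text.
FIELD_EXPOSURE = {
    "payload": Exposure.PAYLOAD,
    "src_ip": Exposure.IDENTIFYING,
    "dst_port": Exposure.IDENTIFYING,
    "email_address": Exposure.IDENTIFYING,
    "url": Exposure.IDENTIFYING,
    "byte_count": Exposure.NON_IDENTIFYING,
    "packet_count": Exposure.NON_IDENTIFYING,
    "interarrival_time": Exposure.NON_IDENTIFYING,
    "tcp_sequence_number": Exposure.NON_IDENTIFYING,
}


def audit(fields):
    """Group the fields an on-path observer can see by exposure category.

    Raises KeyError for a field that has not been classified yet, so that
    unreviewed fields are not silently ignored.
    """
    groups = {category: [] for category in Exposure}
    for name in fields:
        groups[FIELD_EXPOSURE[name]].append(name)
    return groups
```

Calling audit(["payload", "src_ip", "packet_count"]) returns one list per category; anything left in the two metadata lists after encryption is applied is what the adversary described above still gets to work with.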

It looks like this conversation is already well underway on the email side of the house, which is good. :) What follows are some further musings on the limitations of this approach.

In general, increasing use of end-to-end transport encryption[1] mitigates the threat of eavesdropping by a man in the middle; indeed, threat models involving intermediate eavesdropping are at this point extremely well understood, and well defended against, barring problems with the crypto protocols that I'm not a good enough cryptographer to understand.  A perfect passive adversary generally won't even bother trying to break this link: it's too strong, a testament to the attention that's been paid to it over the decades. Such an adversary could look for unprotected side-channels (e.g., SSH keystroke timing attacks, static HTTPS content size/timing fingerprints), or could give up on payload altogether and attempt to make inferences just from the flow keys (this being the point of large-scale metadata collection).

There are defenses against this: we can certainly work to increase the cost of inference, through guidelines to inject randomness into non-identifying metadata (at the cost of latency and throughput) in protocols where the identifying metadata and content are already effectively cryptographically protected. End systems and end users still need to be identified by enough of an address to route packets and messages to them, though, and these addresses can generally be mapped to some set of real-world entities as well as a set of real-world locations, so there are limits to how far we can go within the present architecture. (Randy's suggestion to encrypt BGP to make passive network topology observation more difficult might help here -- though the prospect dismays me as a network researcher with an interest in keeping data as open as possible, it's intriguing from a privacy and security standpoint.)
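To make "injecting randomness into non-identifying metadata" a bit more concrete, here is a minimal Python sketch of two common tricks: padding message sizes up to a bucket boundary plus a random number of extra buckets, and adding random jitter to send times. The function names, bucket size, and jitter bound are assumptions picked for illustration, not a recommendation and not part of any existing protocol.

```python
import secrets


def padded_length(payload_len: int, bucket: int = 256,
                  max_extra_buckets: int = 2) -> int:
    """Return the on-the-wire length for a payload of payload_len bytes.

    The length is rounded up to a multiple of `bucket`, then 0 to
    `max_extra_buckets` additional buckets are added at random. This costs
    throughput in exchange for blurring size fingerprints.
    """
    if payload_len < 0 or bucket <= 0 or max_extra_buckets < 0:
        raise ValueError("lengths and bucket counts must be non-negative")
    rounded = -(-payload_len // bucket) * bucket  # ceiling to bucket size
    extra = secrets.randbelow(max_extra_buckets + 1) * bucket
    return rounded + extra


def jittered_delay(base_delay_s: float, max_jitter_s: float = 0.05) -> float:
    """Return base_delay_s plus a random delay in [0, max_jitter_s].

    This costs latency in exchange for blurring interarrival times.
    """
    return base_delay_s + (secrets.randbelow(1001) / 1000) * max_jitter_s
```

Both knobs show the trade-off directly: larger buckets and more jitter raise the cost of inference, and raise the cost in bandwidth and latency along with it.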

However, being a perfect passive adversary is terribly expensive and doesn't work nearly as well as just compromising the end systems, whether through the traditional phishing, social engineering, rootkits and keyloggers or through cooperation and court orders, as recent revelations have shown. This illustrates the first limitation:

(1) The most serious threats today reside outside the scope of the network and its protocols. It's not the men in the middle we should be worried about, it's the men at the end. It's not clear that we can do anything about this at all.

The tools of monitoring themselves are also applicable to operations and management, which is where I got started in the IETF, working on the protocols that are perhaps most directly relevant to network monitoring at the scale required for pervasive passive surveillance: IPFIX (RFC5101 and draft-ietf-ipfix-protocol-rfc5101bis) and PSAMP (RFC5474-5477). Here, we've defined protocols that by _necessity_ enable large-scale passive data collection. That's what they were designed to do, though of course they were designed with a rather more limited applicability in mind, focusing on billing, capacity planning, troubleshooting, and forensics and security monitoring on enterprise networks.

It was through working on these protocols that I remember that the IETF made a policy statement on passive surveillance in the form of wiretapping (full content delivery to a third party) thirteen years ago. RFC 2804 states, in essence, that requirements for wiretapping are not to be considered in the creation and maintenance of IETF standards, though the reasoning presented behind this is motivated rather more by practical concerns than by an inherent desire to protect communications.

RFC 2804 has been generally cited as "we don't do wiretapping here", to the extent that the security considerations of RFC 5476 (PSAMP) state that "[t]he PSAMP Device SHOULD NOT export the full payload of conversations, as this would mean wiretapping [RFC2804].  The PSAMP Device MUST respect local privacy laws" -- in effect, though this is not the _spirit_ of the statement, we say that a device in a jurisdiction with weak privacy laws with a full-payload sampling probability of 0.99 is compliant, while one with a sampling probability of 1.00 is not. This illustrates a second issue, which I think can be generalized (but maybe I only think this because I'm a measurement geek):

(2) Most (perhaps all) technologies that can be applied to passive network measurement for operations and management purposes can be applied to pervasive passive network surveillance: the only difference is one of configuration and deployment.
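The sampling-probability example above can be made concrete with a small Python sketch. This is a toy Bernoulli packet sampler of my own, not the PSAMP selection process as specified; the function name and the fixed seed are assumptions for illustration. The point is that the same code is a measurement tool or very nearly a wiretap depending on a single configuration value.

```python
import random


def count_sampled(n_packets: int, probability: float, seed: int = 0) -> int:
    """Count how many of n_packets an independent per-packet sampler selects.

    Each packet is selected with the given probability. random.random()
    returns values in [0.0, 1.0), so probability 1.0 selects every packet
    and probability 0.0 selects none.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    rng = random.Random(seed)
    return sum(1 for _ in range(n_packets) if rng.random() < probability)
```

With probability 1.00 the sampler exports everything; with 0.99 it exports roughly 99 packets in every 100. Nothing in the mechanism distinguishes the two cases, only the configured number.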

To me, this indicates we'll probably end up focusing more on configuration and implementation guidance than on changing protocols, except in those cases where protocols need specific support for cryptography which they presently lack. This is probably a good thing.

We spent a good deal of time thinking about these issues a few years ago in the FP7-PRISM project (http://fp7-prism.eu) (no, not that PRISM; yes, I appreciate the irony). PRISM essentially placed all configuration decisions for measurement in the hands of a "privacy-preserving controller", which would issue certificates bound to configuration operations only if these passed a verification of the operation, identity of the requestor, and purpose for which the request was made. It also had the ability to degrade passive measurement configurations (i.e. "you want full flow data, but have stated no purpose for which that is allowed; here are 5-minute bins grouped by AS instead"). The main problem with making something like this work is that you need a formal language to define all the possible things you can do with passively-obtained measurement data so that you can do the verification. You also need a trusted privacy-preserving controller which can sanction a requestor for stating a false purpose -- in PRISM, we presumed this would be a national data protection authority. But the architectural principles may be applicable to "privacy by default" management infrastructures.
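For a sense of what "degrading" a measurement configuration means in practice, here is a minimal Python sketch that reduces per-flow records to 5-minute bins grouped by source and destination AS. It is not PRISM's actual implementation; the Flow record, the ip_to_asn lookup table, and the use of AS 0 for unmapped addresses are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Flow(NamedTuple):
    start_epoch_s: int  # flow start time, seconds since the epoch
    src_ip: str
    dst_ip: str
    octets: int


def degrade(flows: Iterable[Flow], ip_to_asn: Dict[str, int],
            bin_s: int = 300) -> Dict[Tuple[int, int, int], int]:
    """Aggregate flows into (bin start, source AS, destination AS) totals.

    Individual addresses and exact timestamps are discarded; only octet
    totals per bin and AS pair remain. Unmapped addresses fall into AS 0.
    """
    totals: Dict[Tuple[int, int, int], int] = defaultdict(int)
    for flow in flows:
        bin_start = flow.start_epoch_s - flow.start_epoch_s % bin_s
        key = (bin_start,
               ip_to_asn.get(flow.src_ip, 0),
               ip_to_asn.get(flow.dst_ip, 0))
        totals[key] += flow.octets
    return dict(totals)
```

The output is still useful for capacity planning, but it no longer says which host talked to which host or exactly when, which is the kind of trade a controller like the one described above would be making on the requestor's behalf.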

As for the impact of management protocols on passive surveillance, we can certainly issue more concrete statements about the IETF's position on the use of such protocols, a la RFC 2804, saying that we do not endorse configurations supporting surveillance activities. At the same time, it would be hard to redefine these protocols in a way that makes their misuse for surveillance more difficult without fundamentally disabling them for their legitimate purposes. People are free to roundly ignore any configuration advice we give, and generally will when following it brings no interoperability benefit, so it's not clear that this would have a great deal of impact.

In general, though, focusing on protocols where necessary, configuration where possible, and advice everywhere else seems like a good approach.

Best regards,

Brian


[1] On one of the networks I'm doing research on, I was happily surprised to see Port 443 flows very slightly outnumber Port 80 flows for the first time in April of this year: HTTPS everywhere seems to be having its intended effect. :)