Re: [openpgp] Reducing the meta-data leak

Daniel Kahn Gillmor <> Tue, 05 January 2016 00:43 UTC

From: Daniel Kahn Gillmor <>
To: Ben McGinnes <>, "Neal H. Walfield" <>, Derek Atkins <>
User-Agent: Notmuch/0.21+39~gd2ae295 ( Emacs/24.5.1 (x86_64-pc-linux-gnu)
Date: Mon, 04 Jan 2016 19:43:35 -0500
Cc: IETF OpenPGP <>
Subject: Re: [openpgp] Reducing the meta-data leak

On Fri 2016-01-01 23:06:13 -0500, Ben McGinnes wrote:
> On 4/11/2015 1:37 am, Neal H. Walfield wrote:
>> Bryan Ford proposed getting rid of all unencrypted meta-data.  In
>> particular, he wanted to get rid of the recipients / number of
>> recipients.
>> There are some practical difficulties with this approach,
>> which I mentioned above.
>> My proposal is a blue sky idea to avoid having to try to decrypt a
>> message with every secret key while (hopefully) making it more
>> difficult to get at the list of recipients.
> While I don't doubt the good intentions, I fail to see how this has
> any real value.  Specifically because of the significantly larger
> amounts of meta-data which already leaks from every SMTP exchange
> ever.  That's the real threat and that inevitably leads to this
> question:
> * In what scenario has someone gone to the effort of disguising all
>   their SMTP traffic (remailers, tor, whatever), but not selected an
>   alias on the OpenPGP key they're using?

fwiw, there is effort going into protecting some of the SMTP/RFC822
metadata (see the discussions in), which would make this
kind of work within OpenPGP more valuable than it currently is in the
full-metadata-wrapped OpenPGP e-mail use case.

It's worth thinking about this problem from the point of view of a
single message encrypted to a single recipient.  We already have the
mechanisms to deal with multi-recipient messages: A sender who wants to
craft a single message to N people without indicating the number of
people should probably re-build the message N times, with a single
PK-ESK on each.  If they really care that an observer can't tell that
each recipient is getting the same message, they should presumably
choose a separate SK for each version, and pad each message to different
sizes as well.  We can do all of the above except padding in OpenPGP as
it stands, and padding can often be done in whatever format is being
wrapped by OpenPGP (e.g. a text/plain MIME part consisting of all
whitespace).  Incidentally, the single-pkesk-per-message approach
addresses the smartcard UI/UX concern, and mitigates the
multiple-secret-key UI/UX concern somewhat (the multiple-secret-key
UI/UX concern can be further mitigated by recipient-side configuration).
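
As a rough illustration of the per-recipient approach described above,
here is a minimal Python sketch.  It only plans the copies: it does no
OpenPGP encryption, and the function name, the dict layout, the
32-byte session key and the trailing-space padding are all assumptions
made for the example (a real sender might instead pad with a separate
all-whitespace MIME part).

```python
import secrets


def per_recipient_copies(body: str, recipients: list[str], max_pad: int = 64):
    """Plan one single-recipient message per recipient (hypothetical helper).

    Each copy gets its own fresh session key and a different amount of
    trailing-space padding, so the copies differ in both key and length.
    """
    if len(recipients) > max_pad + 1:
        raise ValueError("not enough distinct padding lengths")
    # Draw distinct padding lengths (without replacement) from a CSPRNG.
    rng = secrets.SystemRandom()
    pad_lengths = rng.sample(range(max_pad + 1), len(recipients))
    copies = []
    for recipient, pad in zip(recipients, pad_lengths):
        copies.append({
            "recipient": recipient,                  # sole PK-ESK target of this copy
            "session_key": secrets.token_bytes(32),  # fresh SK per copy (size assumed)
            "padded_body": body + " " * pad,         # whitespace padding (assumed form)
        })
    return copies
```

Each returned entry would then be encrypted and sent as its own
message, with exactly one PK-ESK, by whatever OpenPGP implementation
the sender uses.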

Implementation work or guidance on avoiding creation of multi-recipient
messages would be good.

With regard to the bloom filter proposal here, i think Tom's concerns
about its viability are worth heeding.  In a single-recipient message,
the bloom filter is effectively just a smaller hash of the key.  The
current PKESK itself already doesn't provide an unambiguous answer to
who a message is encrypted to, since it provides only 64 bits of the
fingerprint, and we know that there are pairs of keys that share their
lower 64 bits of fingerprint.
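
To make the single-recipient point concrete, here is a small Python
sketch of a generic Bloom filter.  This is not the encoding from the
proposal under discussion: the filter width (64 bits), the number of
hash functions (3) and the SHA-256-based position derivation are
assumptions picked for illustration.  With one key inserted, the
filter is a deterministic function of that key with at most k bits
set, which is why it behaves like a short hash of the key.

```python
import hashlib


def bloom_positions(fingerprint: bytes, m_bits: int = 64, k: int = 3) -> set[int]:
    """Bit positions a fingerprint sets in an m_bits-wide filter (assumed scheme)."""
    positions = set()
    for i in range(k):
        # One derived hash per index i; the i-byte prefix separates them.
        digest = hashlib.sha256(bytes([i]) + fingerprint).digest()
        positions.add(int.from_bytes(digest[:8], "big") % m_bits)
    return positions


def bloom_filter(fingerprints, m_bits: int = 64, k: int = 3) -> int:
    """Build the filter as an integer bit field."""
    bits = 0
    for fpr in fingerprints:
        for pos in bloom_positions(fpr, m_bits, k):
            bits |= 1 << pos
    return bits


def might_contain(bits: int, fingerprint: bytes, m_bits: int = 64, k: int = 3) -> bool:
    """True if every position for this fingerprint is set (false positives possible)."""
    return all((bits >> pos) & 1 for pos in bloom_positions(fingerprint, m_bits, k))
```

An observer who holds a candidate public key can run the same
membership test, so for a one-element filter the ambiguity comes only
from other keys that happen to map to the same positions.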

We could simply reduce the size of that further to get more ambiguity
(the distribution should already be uniform) but the ambiguity is
already on the edge of being a problem for UI (it's not clear whether a
UI should announce "this message claims to be for key XXXXX, which we
have, but cannot be decrypted with it" or "this message claims to be for
keyid XX, and while we have key XXXXX which can't decrypt it, the sender
might have meant some other key").  Yuck.
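
The UI ambiguity can be sketched in a few lines of Python.  The
example assumes v4 keys, where the Key ID is the low-order 64 bits of
the fingerprint; the function names and the idea of carrying a
shortened hint are hypothetical, used only to show how the candidate
set grows as the hint shrinks.

```python
def key_id(fingerprint_hex: str) -> str:
    """v4 Key ID: the low-order 64 bits (last 16 hex digits) of the fingerprint."""
    return fingerprint_hex.upper()[-16:]


def candidates(keyring: list[str], hint_hex: str) -> list[str]:
    """Local fingerprints whose Key ID ends with the (possibly shortened) hint."""
    hint = hint_hex.upper()
    return [fpr for fpr in keyring if key_id(fpr).endswith(hint)]


# Made-up 160-bit fingerprints, for illustration only.
KEYRING = [
    "A" * 24 + "0123456789ABCDEF",
    "B" * 24 + "FEDCBA98765432EF",
    "C" * 24 + "0000000000001234",
]
```

With the full 64-bit hint "0123456789ABCDEF" only the first made-up
key matches; with the hint cut down to "EF" the first two both match,
and the UI can no longer say which one the sender meant without trying
them.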

Removing the metadata of who a message is for seems likely to require

 a) trial decryption on the recipient side (problematic for smartcard
    and multiple-secret-key setups, as Neal and Werner pointed out), or

 b) some sort of ratcheted shared state between sender and recipient
    (e.g. a briar- or axolotl-style esk, which might provide other nice
    features, like "deletable" ("forward-secret") messages)

While (b) is out of scope for us here until we get 4880bis sorted, if
someone wanted to experiment with that and report back, i'm sure it
would be interesting to several people on the list.
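
For anyone who wants a starting point for such an experiment, here is
a toy Python sketch of the ratcheted-shared-state idea in (b).  It is
a plain symmetric hash ratchet, not the briar or axolotl design, and
the class name, the HMAC labels and the 8-byte tag length are
assumptions.  Each message would carry a one-time tag in place of a
key ID; both sides derive the same tag from shared state and then
advance that state.

```python
import hashlib
import hmac


def _kdf(state: bytes, label: bytes) -> bytes:
    """Derive a value from the current state under a fixed label."""
    return hmac.new(state, label, hashlib.sha256).digest()


class TagRatchet:
    """Toy symmetric ratchet producing one-time recipient tags (illustrative only)."""

    def __init__(self, shared_secret: bytes):
        self._state = shared_secret

    def next_tag(self) -> bytes:
        # The tag identifies the next message to the peer without naming a key.
        tag = _kdf(self._state, b"tag")[:8]
        # Advance the state one way; the new state does not reveal the old one.
        self._state = _kdf(self._state, b"advance")
        return tag
```

A sender and a recipient initialised with the same secret produce the
same sequence of tags, so the recipient can recognise the next message
without trial decryption, while successive tags look unrelated to an
observer who lacks the state.  Handling lost or reordered messages is
left out of the sketch.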

Or maybe there's a (c) option?