Re: [openpgp] Reducing the meta-data leak

Bryan Ford <brynosaurus@gmail.com> Mon, 09 November 2015 16:16 UTC

Thanks Ian for the feedback! (and the patience :) )

On Nov 8, 2015, at 1:02 PM, ianG <iang@iang.org> wrote:
> On 7/11/2015 11:33 am, Bryan Ford wrote:
>> […]
>> So we basically have three important blobs of data to place in the file:
>> (a) the salt for the password-hashing scheme, (b) the AEAD-encrypted
>> session key encrypted using the password-hasher’s output, and (c) the
>> AEAD-encrypted file content encrypted using the session key.  Item (a)
>> needs to be encoded in the file in “cleartext” since it needs to be
>> available to the decryptor before it can decrypt anything, but
>> fortunately the salt can just be a uniformly random blob anyway (of a
>> length fixed by this well-known scheme).  So for the moment let’s just
>> put it at the very beginning of the encoded file.  Then place the
>> AEAD-encrypted session key blob (b) immediately afterwards, whose size
>> can also easily be fixed for this scheme.  This fixed-length session-key
>> blob may contain encrypted metadata in addition to the session key, such
>> as the file offset of the AEAD-encrypted file content, the (possibly
>> padded) total size of the AEAD-encrypted blob, and perhaps the size of
>> the “useful payload” within that blob after removing any padding.
> 
> I'd suggest also:
> 
> * the version number of the OpenPGP format, like v4 or whatever it is supposed to be - thus causing a stab at how we handle rollover of this format, at least of the following file content.

Indeed, that should be added (inside the blob) and is easy to add.  And the stuff inside the blob could look as much like a conventional PGP packet-stream as we like.
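As a rough illustration of what the plaintext inside the fixed-length session-key blob might contain, here is a minimal Python sketch. The field order, the field widths, and the names `BLOB_FORMAT`, `pack_blob`, and `unpack_blob` are assumptions made up for this example; nothing here is a proposed wire format.

```python
import os
import struct

# Hypothetical fixed layout (big-endian, no alignment padding):
#   1 byte   format version
#   32 bytes session key
#   8 bytes  file offset of the AEAD-encrypted payload
#   8 bytes  padded total size of the AEAD-encrypted payload blob
#   8 bytes  size of the useful payload once padding is removed
BLOB_FORMAT = ">B32sQQQ"
BLOB_SIZE = struct.calcsize(BLOB_FORMAT)


def pack_blob(version, session_key, offset, padded_size, useful_size):
    """Serialize the metadata that would then be AEAD-encrypted as one blob."""
    if len(session_key) != 32:
        raise ValueError("session key must be 32 bytes in this sketch")
    if useful_size > padded_size:
        raise ValueError("useful payload cannot exceed the padded size")
    return struct.pack(BLOB_FORMAT, version, session_key,
                       offset, padded_size, useful_size)


def unpack_blob(data):
    """Parse the decrypted blob back into named fields."""
    if len(data) != BLOB_SIZE:
        raise ValueError("unexpected blob length")
    version, key, offset, padded, useful = struct.unpack(BLOB_FORMAT, data)
    return {"version": version, "session_key": key, "offset": offset,
            "padded_size": padded, "useful_size": useful}


if __name__ == "__main__":
    blob = pack_blob(4, os.urandom(32), 1024, 4096, 3000)
    print(len(blob), unpack_blob(blob)["version"])
```

Because every field has a fixed width, the blob length does not depend on its contents, which is what lets the encrypted form keep a fixed size.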

> * a self-MAC on the blob.  This is the proof that you've found the right password, and can proceed.  This would be calculated over the blob with the self-MAC field set to all zeroes.  Extra points if the calculation also includes the salt (a).

Since both the session-key blobs and the main file/payload blob are AEAD-encrypted, those AEAD-encrypted blobs have MACs attached to them by the AEAD algorithm.  Thus, the decryptor knows he’s got the right password when the AEAD decryption algorithm (applied to the correct session-key blob at the correct range of bytes in the file) successfully checks whatever MAC the AEAD scheme defines and returns “All’s well.”
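To make the "right password means the AEAD check passes" idea concrete, here is a small sketch using only the Python standard library. It is an assumption-laden toy: the stdlib has no AEAD, so `seal` and `open_sealed` are a hand-rolled encrypt-then-MAC stand-in (SHA-256 in counter mode plus HMAC-SHA256), not a real AEAD scheme, and `derive_key` uses PBKDF2 with a low iteration count purely for illustration. The salt is passed as associated data, which also covers Ian's "extra points" suggestion.

```python
import hashlib
import hmac
import os

NONCE_LEN, TAG_LEN = 16, 32


def derive_key(password, salt):
    # Illustrative parameters only; a real scheme would pick a proper
    # password hasher and cost settings.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000)


def _subkeys(key):
    # Separate keys for encryption and authentication.
    return (hmac.new(key, b"enc", hashlib.sha256).digest(),
            hmac.new(key, b"mac", hashlib.sha256).digest())


def _keystream(key, nonce, length):
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def _tag(mac_key, aad, nonce, ciphertext):
    msg = len(aad).to_bytes(8, "big") + aad + nonce + ciphertext
    return hmac.new(mac_key, msg, hashlib.sha256).digest()


def seal(key, plaintext, aad=b""):
    """Toy stand-in for AEAD encryption: returns nonce || ciphertext || tag."""
    enc_key, mac_key = _subkeys(key)
    nonce = os.urandom(NONCE_LEN)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    return nonce + ciphertext + _tag(mac_key, aad, nonce, ciphertext)


def open_sealed(key, blob, aad=b""):
    """Return the plaintext, or None if the tag does not verify."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        return None
    nonce, ciphertext, tag = blob[:NONCE_LEN], blob[NONCE_LEN:-TAG_LEN], blob[-TAG_LEN:]
    enc_key, mac_key = _subkeys(key)
    if not hmac.compare_digest(tag, _tag(mac_key, aad, nonce, ciphertext)):
        return None
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


if __name__ == "__main__":
    salt = os.urandom(16)
    session_key = os.urandom(32)
    blob = seal(derive_key("correct horse", salt), session_key, aad=salt)
    print(open_sealed(derive_key("correct horse", salt), blob, aad=salt) == session_key)
    print(open_sealed(derive_key("wrong guess", salt), blob, aad=salt))
```

A wrong password yields a different key, so the tag comparison fails and the decryptor learns it has the wrong password without any separate self-MAC field.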

Of course all this could be defined just as well by separating the encryption from the MAC-checking, but I just thought it was easiest to go with the AEAD-based definition.

> * any primary MACs over the file data including the padding, pending those other discussions on integrity checking.

Yes.  If any payload padding gets changed, the AEAD’s MAC check fails.

There’s a second-order subtlety, regarding how strongly we would want to protect against a (very) active attacker using selective corruption to “probe” the size and shape of the header region _before_ the padded payload even begins.  If some of that region is just random bits (e.g., unused hash table entries) or symmetric-key blobs for “other” recipients, then in my scheme’s basic formulation, the attacker can corrupt bits in those regions and the decryptor might still accept the file, whereas the decryptor will refuse to accept it if “its” particular symmetric-key blob (or anything in the payload) gets corrupted.  Thus, a (very) active attacker who can use a decryptor as a “like/don’t-like” oracle can effectively do “corruption tomography” to learn the shape of the header area, thereby possibly learning back a bit of the metadata that we’re trying to hide.  I know of a way to enable the decryptor to check the whole header for corruption as well, but it’s a bit scarily complex and creates other tradeoffs, so I haven’t decided if it’s worth the effort.  (Basically, it involves making the header-generation scheme deterministic, such that the decryptor can re-run the header-layout code and scream if anything is other than the way the encryptor “should have” done it, including the values of the [pseudo-]random bits in unused hash table entries and such.)
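The probing attack can be simulated with a toy model, sketched below in Python. Everything in it is an assumption for illustration: the header is a flat array of `N_SLOTS` fixed-size slots rather than a hash table, the slot and payload use plain HMAC tags instead of a real AEAD, the payload is left unencrypted because only the accept/reject behaviour matters here, and the names `build_file`, `accepts`, and `probe` are made up.

```python
import hashlib
import hmac
import os

SALT_LEN, KEY_LEN, TAG_LEN, N_SLOTS = 16, 32, 32, 4
SLOT_LEN = KEY_LEN + TAG_LEN
HEADER_END = SALT_LEN + N_SLOTS * SLOT_LEN


def mac(key, data):
    return hmac.new(key, data, hashlib.sha256).digest()


def pw_key(password, salt):
    return hashlib.pbkdf2_hmac("sha256", password, salt, 1000)


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def build_file(password, payload, used_slot):
    """salt || N_SLOTS slots (one real, the rest random filler) || payload || tag."""
    salt = os.urandom(SALT_LEN)
    key = pw_key(password, salt)
    session_key = os.urandom(KEY_LEN)
    masked = xor(session_key, mac(key, b"mask"))      # toy key wrapping
    slots = [os.urandom(SLOT_LEN) for _ in range(N_SLOTS)]
    slots[used_slot] = masked + mac(key, salt + masked)
    return salt + b"".join(slots) + payload + mac(session_key, payload)


def accepts(password, data):
    """Decryptor as a like/don't-like oracle: True if the file verifies."""
    salt = data[:SALT_LEN]
    key = pw_key(password, salt)
    for i in range(N_SLOTS):
        start = SALT_LEN + i * SLOT_LEN
        masked = data[start:start + KEY_LEN]
        tag = data[start + KEY_LEN:start + SLOT_LEN]
        if hmac.compare_digest(tag, mac(key, salt + masked)):
            session_key = xor(masked, mac(key, b"mask"))
            payload, ptag = data[HEADER_END:-TAG_LEN], data[-TAG_LEN:]
            return hmac.compare_digest(ptag, mac(session_key, payload))
    return False


def probe(oracle, data):
    """Flip one bit at each byte offset; record whether the oracle still accepts."""
    results = []
    for pos in range(len(data)):
        corrupted = bytearray(data)
        corrupted[pos] ^= 0x01
        results.append(oracle(bytes(corrupted)))
    return results


if __name__ == "__main__":
    data = build_file(b"hunter2", b"hello world", used_slot=2)
    tolerated = probe(lambda d: accepts(b"hunter2", d), data)
    for i in range(N_SLOTS):
        start = SALT_LEN + i * SLOT_LEN
        print("slot", i, "corruption tolerated:", all(tolerated[start:start + SLOT_LEN]))
```

In this model, corruption in the salt, the recipient's slot, or the payload is rejected, while corruption in the filler slots goes unnoticed, so the accept/reject map reveals which slot is actually in use. The deterministic-header idea would close this gap by letting the decryptor regenerate and compare the filler as well.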

>> This
>> metadata will of course appear as uniform random bits to a non-recipient
>> as long as the AEAD encryption scheme is doing its job.  Finally, place
>> the AEAD-encrypted file content (c), including any padding, after the
>> encrypted session-key blob as the rest of the file.
> 
> 
> Yup.
> 
> On the remaining SM stuff, I'd like to hear that there is widespread support for this before subjecting myself to the pain.

Understandable, sorry if my text was a bit impenetrable - I realize I need to work on some better examples and diagrams. :)

B