Re: [openpgp] [Cfrg] streamable AEAD construct for stored data?

Andy Lutomirski <> Sat, 07 November 2015 01:54 UTC

From: Andy Lutomirski <>
Date: Fri, 6 Nov 2015 17:54:17 -0800
To: Bryan Ford <>
List-Id: "Ongoing discussion of OpenPGP issues." <>

On Fri, Nov 6, 2015 at 5:46 PM, Bryan Ford <> wrote:
> To return to this thread - DKG brought up one important potential
> functionality goal for the next OpenPGP message format (streaming-mode
> integrity protection); then the thread diverged into a different and I think
> orthogonal - though equally interesting - potential functionality goal
> (namely random-access capability via Merkle trees as in Tahoe-LAFS).
> I included a slide on this topic in my OpenPGP WG presentation
> (
> slide 10) and was hoping to solicit discussion but there wasn’t time, so
> perhaps we can continue here?
> To be clear, there are two separate use-cases, each of which makes sense
> without the other and requires a different technical solution (but could also
> make sense together):
> 1. Streaming-mode integrity protection:  We want to make sure OpenPGP can be
> used Unix filter-style on both encryption and decryption sides, to process
> arbitrarily large files (e.g., huge backup tarballs), while satisfying the
> following joint requirements:
> (a) Ensure that neither the encryptor nor decryptor ever has to buffer the
> entire stream in memory or any other intermediate storage.
> (b) Ensure that the decryptor integrity-checks everything it decrypts BEFORE
> passing it on to the next pipeline stage (e.g., un-tar).
> 2. Random-access: Once a potentially-huge OpenPGP-encrypted file has been
> written to some random-access-capable medium, allow a reader to decrypt and
> integrity-check parts of that encrypted file without (re-)processing the
> whole thing: i.e., support integrity-protected random-access reads.
> Let’s call these goals #1 and #2, respectively.
> Achieving either goal will require dividing encrypted files into chunks of
> some kind, but the exact metadata these chunks need to have will vary
> depending on which goal we want to achieve (or both).
> To achieve goal #1 properly, it appears that what we need is not only a MAC
> per chunk but a signature per chunk.  If the encryptor only signs a single
> aggregate MAC at the end, then the decryptor needs to process its input all
> the way to that signature at the end before it can be certain that any (even
> the first) bytes of the decrypted data are valid.  If the encryptor produces
> a Merkle tree at the end and signs its root as in Tahoe-LAFS (e.g., in
> pursuit of goal #2), the decryptor still needs to read to the end of its
> input before being able to integrity-check anything, and hence still fails
> to achieve goal #1.
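
To make goal #1(b) concrete, here is a minimal Python sketch of a
decryptor that releases a chunk only after that chunk's own tag
verifies.  Everything in it is an assumption for illustration: the
names seal, open_stream and tag are hypothetical, encryption is left
out, and an HMAC under a shared key stands in for the per-chunk
signature discussed above (a real design would use a public-key
signature or an AEAD).  The chunk index and a final-chunk flag are
bound into each tag so that reordering and truncation are detected.

```python
import hashlib
import hmac
import struct


def tag(key, index, is_last, chunk):
    # Bind the chunk position and the "final chunk" flag into the tag.
    header = struct.pack(">QB", index, 1 if is_last else 0)
    return hmac.new(key, header + chunk, hashlib.sha256).digest()


def seal(key, chunks):
    """Yield (is_last, chunk, tag) records using one chunk of lookahead."""
    it = iter(chunks)
    prev = next(it, b"")  # an empty input becomes one empty final chunk
    index = 0
    for cur in it:
        yield (False, prev, tag(key, index, False, prev))
        prev = cur
        index += 1
    yield (True, prev, tag(key, index, True, prev))


def open_stream(key, records):
    """Yield each chunk only after its own tag has been checked."""
    done = False
    for index, (is_last, chunk, t) in enumerate(records):
        if done:
            raise ValueError("data after final chunk")
        if not hmac.compare_digest(t, tag(key, index, is_last, chunk)):
            raise ValueError("chunk %d failed authentication" % index)
        done = is_last
        yield chunk
    if not done:
        raise ValueError("stream truncated before final chunk")
```

Neither side buffers more than one chunk.  The cost is that a
truncated stream is only detected at the point of truncation, after
the earlier (valid) chunks have already been passed downstream.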


> 1. How important is the ability to achieve goal #1 above in the OpenPGP
> format (streaming-mode integrity-checking)?

Are you willing to accept a format that allows streaming decryption
but not streaming encryption?  If so, then you'd only need one
signature if you organize your Merkle tree correctly.  In fact:

> 2. How important is the ability to achieve goal #2 above in the OpenPGP
> format (random-access integrity-checking)?

It's fairly easy to imagine a format that allows both streaming
verification and random-access verification with minimal size
overhead.  You could even create the thing in a semi-streamy manner,
where you'd stream out the bulk portion with blanks where the internal
nodes go and then write the internal nodes after the fact.
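
A rough Python sketch of the idea, under stated assumptions: the
function names, the SHA-256 leaf/node prefixes and the "promote an
unpaired node" rule are all made up for illustration and are not a
proposed wire format.  The writer needs the whole input to build the
tree (so encryption is not streaming), but a reader that has checked
one signature over the root can then verify any chunk, in order or at
random, from the chunk plus its sibling hashes.

```python
import hashlib


def leaf_hash(chunk):
    return hashlib.sha256(b"\x00" + chunk).digest()


def node_hash(left, right):
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(chunks):
    """Return all tree levels, leaves first; needs at least one chunk."""
    level = [leaf_hash(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(node_hash(level[i], level[i + 1]))
            else:
                nxt.append(level[i])  # unpaired node is promoted unchanged
        level = nxt
        levels.append(level)
    return levels


def auth_path(levels, index):
    """Sibling hashes from leaf to root as (sibling_is_left, hash) pairs."""
    path = []
    for level in levels[:-1]:
        sib = index ^ 1
        if sib < len(level):
            path.append((sib < index, level[sib]))
        index //= 2
    return path


def verify_chunk(root, chunk, path):
    h = leaf_hash(chunk)
    for sib_is_left, sib in path:
        h = node_hash(sib, h) if sib_is_left else node_hash(h, sib)
    return h == root


def stream_verified(root, records):
    """Yield chunks from (chunk, path) records, checking each one first."""
    for chunk, path in records:
        if not verify_chunk(root, chunk, path):
            raise ValueError("chunk does not match the signed root")
        yield chunk
```

The sketch assumes the root was already authenticated by the single
signature.  It does not check chunk order or detect truncation; a
real format would also have to bind the chunk index and total count.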

The best of all worlds might be to treat the Merkle data and the
signature as a detached file.  I bet that one could streamily encrypt
and sign a big file and produce *two* output streams: the bulk data
and a detached serialization of intermediate nodes, where there's a
single signature at the end.  A reader with access to both files could
either random-access the bulk data, or seek around in the detached
file a bit and then stream the bulk file.
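
Here is a minimal sketch of that two-stream writer in Python, as an
illustration only.  The class name TwoStreamWriter, the hash prefixes
and the node ordering in the detached stream are assumptions, not a
proposed format, and encryption and the actual signature are left
out.  The writer keeps only a stack of completed subtree hashes
(logarithmic in the number of chunks), writes chunks to the bulk
stream as they arrive, and writes every tree node to the detached
stream as soon as it is known.  The root is the last hash written,
which is where the single signature would go.

```python
import hashlib
import io


def leaf_hash(chunk):
    return hashlib.sha256(b"\x00" + chunk).digest()


def node_hash(left, right):
    return hashlib.sha256(b"\x01" + left + right).digest()


class TwoStreamWriter:
    def __init__(self, bulk, detached):
        self.bulk = bulk
        self.detached = detached
        self.stack = []  # (height, hash) of completed subtrees, left to right

    def write_chunk(self, chunk):
        self.bulk.write(chunk)
        height, h = 0, leaf_hash(chunk)
        self.detached.write(h)
        # Merge equal-height neighbours, like carries in a binary counter.
        while self.stack and self.stack[-1][0] == height:
            _, left = self.stack.pop()
            height, h = height + 1, node_hash(left, h)
            self.detached.write(h)
        self.stack.append((height, h))

    def finish(self):
        """Fold leftover subtrees right to left and return the root."""
        if not self.stack:
            raise ValueError("no chunks written")
        _, h = self.stack.pop()
        while self.stack:
            _, left = self.stack.pop()
            h = node_hash(left, h)
            self.detached.write(h)
        return h  # also the final 32 bytes of the detached stream


# Usage: in-memory buffers stand in for the two output files.
bulk_out, detached_out = io.BytesIO(), io.BytesIO()
writer = TwoStreamWriter(bulk_out, detached_out)
for piece in (b"aa", b"bb", b"cc"):
    writer.write_chunk(piece)
root_to_sign = writer.finish()
```

A reader would verify the signature over the trailing root, then use
the stored node hashes to check chunks of the bulk file as it reads
them.  The reader side and the mapping from chunk index to node
offsets are not shown.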