Re: [openpgp] AEAD Chunk Size

Benjamin Kaduk <> Sat, 30 March 2019 15:04 UTC

Date: Sat, 30 Mar 2019 10:04:38 -0500
From: Benjamin Kaduk <>
To: Bart Butler <>
Cc: Jon Callas <>, "" <>, Justus Winter <>, "Neal H. Walfield" <>, Jon Callas <>, Peter Gutmann <>
Subject: Re: [openpgp] AEAD Chunk Size

My apologies for being only an occasional participant in this thread (and
it will likely take me another week before I can reply again), but there
are a few points I would like to make.

On Sat, Mar 30, 2019 at 02:17:55AM +0000, Bart Butler wrote:
> Hi Jon,
> As others have noted, there is a lot of confusion on this thread, some of which you touched in your AEAD Conundrum message, like when we say AEAD should not release unauthenticated plaintext, do we mean the entire message or the chunk?

It's really quite something to have gone through a week's worth of messages
all in one go.  There are many people writing out careful descriptions of
how they see things, and yet we still seem to be talking past each other at
times.

I propose that we use "plaintext corresponding to non-modified ciphertext"
for the non-malleability protection that is provided by an AEAD
authentication tag on a single chunk, and "fully authenticated complete
plaintext" for the output after processing an entire message (i.e., all
chunks) with a guarantee of non-truncation.  (Are there other cases in
between that we care about?)
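
As a rough illustration of the two terms above, here is a minimal Python
sketch.  It is not the OpenPGP AEAD construction: it uses HMAC-SHA256 from
the standard library as a stand-in for an AEAD tag and does no encryption
at all, and the names (seal, open_stream, open_complete, ModifiedChunk,
Truncated) are hypothetical.  The point is only to show the difference
between releasing chunks as each one verifies and releasing nothing until
the whole message, including its end, has verified.

```python
import hashlib
import hmac


class ModifiedChunk(Exception):
    pass


class Truncated(Exception):
    pass


def _tag(key, index, final, chunk):
    # Stand-in for an AEAD tag.  The chunk index and a "final chunk" flag
    # are bound into the tag so reordering and truncation are detectable.
    header = index.to_bytes(8, "big") + (b"\x01" if final else b"\x00")
    return hmac.new(key, header + chunk, hashlib.sha256).digest()


def seal(key, data, chunk_size):
    # Split data into chunks and tag each one; empty input still gets one
    # (empty) final chunk so that the end of the message is authenticated.
    chunks = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)] or [b""]
    last = len(chunks) - 1
    return [(c, _tag(key, i, i == last, c)) for i, c in enumerate(chunks)]


def open_stream(key, sealed):
    # "Plaintext corresponding to non-modified ciphertext": each chunk is
    # yielded only after its own tag verifies.  A caller may already have
    # consumed earlier chunks when a later one fails.
    saw_final = False
    for index, (chunk, tag) in enumerate(sealed):
        if saw_final:
            raise ModifiedChunk("data after the final chunk")
        if hmac.compare_digest(tag, _tag(key, index, False, chunk)):
            pass
        elif hmac.compare_digest(tag, _tag(key, index, True, chunk)):
            saw_final = True
        else:
            raise ModifiedChunk("chunk %d failed verification" % index)
        yield chunk
    if not saw_final:
        raise Truncated("input ended before the final chunk")


def open_complete(key, sealed):
    # "Fully authenticated complete plaintext": nothing is returned unless
    # every chunk verified and the final chunk was seen.
    return b"".join(open_stream(key, sealed))
```

With open_stream, a modified second chunk means the caller has seen the
first chunk and then gets an error; with open_complete the caller gets
only the error.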

> Another piece of confusion is that Efail isn't a single vulnerability, it was several vulnerabilities related (at best) thematically.
> So to be very specific, for the purpose of the following discussion, the advantage of smaller AEAD chunks is specifically to prevent Efail-style ciphertext malleability/gadget attacks, and the prohibition on releasing unauthenticated plaintext is applied to individual chunks, which is sufficient to foil this kind of attack in email.
> The kind of attack we are talking about is fundamentally about exfiltration of plaintext data to an attacker-controlled endpoint. Borrowing from your AEAD Conundrum message, if the first chunk passes and is released, and the second chunk fails, that is OK, at least for email, because the part that was modified (the second chunk) is never released, so you get a truncated message and an error, but the truncated message without the modifications isn't going to exfiltrate itself.

One concern that I have (and one that is only tangentially related to this
quoted part) is that I want to make it easy for implementations to "do the
right thing" when ciphertext is modified, i.e., return an error, and
specifically to return an error without releasing any plaintext that
originates from the modified ciphertext.  The current openpgp ecosystem
does not seem to be very compliant with that desired behavior, and part of
that may be due to a lack of philosophical support/help from the spec.

> Now if releasing ANY authenticated chunk of a message that hasn't been fully authenticated (in an AEAD sense) is a real problem for your application, I'd argue that you're trying to make AEAD do something it's not suited for and you should enforce this in your application if it applies to you, probably by not streaming.
> So to recap, small-chunk AEAD provides specific value in preventing ciphertext malleability/gadget attacks, particularly in HTML email, which is a common use case.
> What value does large-chunk AEAD actually provide? What I'm getting from the AEAD Conundrum message is that it's a way for the message encrypter to leverage the "don't release unauthenticated chunks" prohibition to force the decrypter to decrypt the whole message before releasing anything. Why do we want to give the message creator this kind of power? Why should the message creator be given the choice to force her recipient to either decrypt the entire message before release or be less safe than she would have been with smaller chunks?
> Coming back to Neal's point, it's really hard to see any sort of value in really large AEAD chunks, because the performance overhead is negligible at that point and the only security 'benefit' that I can see is the encrypter trying to use the spec to force the decrypter to not stream, which does not seem like something at all desirable.

I'm still not sure I understand the point of very large chunks, since once
they get really big an implementation is choosing between streaming
plaintext from potentially modified ciphertext and returning an error
without even attempting to process the chunk.  I'm not convinced that the
second will win out in implementations if we allow very large chunks.

Some other notes, not relating to anything specifically quoted from this
message (but derived from other parts of the thread):

TLS allows for arbitrarily variable-length chunks because it is
a synchronous transport for higher-level application streams and the
application may have arbitrary message sizes.  OpenPGP is used in an
asynchronous model, where a message generator can be modelled to make all
its actions before the receiver processes anything, and there is only
one-directional communication within the OpenPGP format.  So there does not
seem to be much demand for "take all the bytes that you have so far and
send them right now", and AFAICT the message generator can just wait until
end of data arrives or enough data to make a complete chunk arrives.  So
from that point of view, there is not much argument in favor of varying
the chunk size within a single message, and possibly even across messages
(i.e., this line of reasoning would be okay with a single chunk size fixed
for everyone as a protocol constant).  There are of course other factors
that may come into play, like constrained systems and such, but we can
treat those separately.
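
To make the "just wait until a complete chunk or end of data arrives"
behavior concrete, here is a small Python sketch.  The function name
fixed_chunks and its parameters are hypothetical, and the chunk size is an
arbitrary illustration rather than a proposed value.

```python
def fixed_chunks(pieces, chunk_size):
    # pieces: an iterable of byte strings of arbitrary sizes, as they
    # arrive from the data source.  Nothing is emitted until a complete
    # chunk has accumulated; only the last chunk may be short.
    buf = bytearray()
    for piece in pieces:
        buf += piece
        while len(buf) >= chunk_size:
            yield bytes(buf[:chunk_size])
            del buf[:chunk_size]
    if buf:
        yield bytes(buf)


# Input arrives in irregular pieces; output uses one fixed chunk size.
print(list(fixed_chunks([b"ab", b"cdefg", b"", b"hi"], 4)))
```

Unlike a TLS record layer, nothing here ever needs to flush a partial
chunk early, because the receiver is not waiting on the other end.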

I also have a use case for authentication of large chunks of data at rest:
they allow me to use a cheap bulk storage service that provides
(best-effort) replication and archiving but has poor physical security.  So
I encrypt my data to myself and put it in storage, but when I get it back
I need to know that it's valid.  I can imagine at least one case where
knowing exactly which chunk was corrupted would save effort; it may be a
toy example but perhaps it is illustrative of a broader case.  Note that
there are algorithms to compute pi to arbitrary precision, and even to
compute the Nth digit thereof without computing the previous digits.  If I
need to have random-access inquiries into the value of pi, I could
precompute using software I trust and do this self-encryption thing, and
when a chunk is bad I can recompute only that chunk and still trust that I
only ever use values generated by my trusted implementation.
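
A sketch of that recovery loop in Python, under loud assumptions: the
per-chunk tag is again HMAC-SHA256 standing in for an AEAD tag (no
encryption is shown), and trusted_compute is a hypothetical placeholder
for the trusted software, here computing something trivial rather than
digits of pi.

```python
import hashlib
import hmac


def tag(key, index, chunk):
    # Stand-in for a per-chunk AEAD tag, bound to the chunk's position.
    msg = index.to_bytes(8, "big") + chunk
    return hmac.new(key, msg, hashlib.sha256).digest()


def store(key, compute, n_chunks):
    # Precompute every chunk with the trusted implementation and tag it.
    chunks = [compute(i) for i in range(n_chunks)]
    return [(c, tag(key, i, c)) for i, c in enumerate(chunks)]


def fetch_and_repair(key, stored, compute):
    # Verify each chunk independently; recompute only those that fail.
    # Returns the verified-or-recomputed chunks and the repaired indexes.
    good, repaired = [], []
    for i, (chunk, t) in enumerate(stored):
        if not hmac.compare_digest(t, tag(key, i, chunk)):
            chunk = compute(i)
            repaired.append(i)
        good.append(chunk)
    return good, repaired


def trusted_compute(index):
    # Hypothetical placeholder for an expensive trusted computation.
    return str(index * index).encode()
```

If the whole file carried a single tag instead, one corrupted byte would
force recomputing everything; with per-chunk tags only the failing chunk
is redone.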

And finally, there is no openpgp Working Group; all we have here is a bunch
of folks interested in a topic talking amongst each other on a public
mailing list hosted at the IETF.  There are no WG chairs and no expectation
of Area Director supervision (i.e., I don't feel obligated to read the
messages here).  That said, I'm happy to see that we're staying calm and
civil, and AFAICT everyone is honestly trying to understand everyone else's
position and come to a consensus.  Let's try to keep focusing on the
technical details and what use cases we need to cover.