Re: [openpgp] AEAD Chunk Size

Jon Callas <joncallas@icloud.com> Thu, 28 March 2019 21:27 UTC



> On Mar 28, 2019, at 5:30 AM, Justus Winter <justuswinter@gmail.com> wrote:
> 

[…]

> In the context of processing OpenPGP data, currently there is no
> relation between the size of the encrypted message and the size of the
> decrypted message.  This is due to compression.  

This isn’t precisely true. Certainly, compression is the biggest factor here, but it is not the only one. Many things make it hard to know the finished size of an OpenPGP message a priori, including ASCII armor and TEXT mode plaintext. It only gets worse inside the plaintext, where emails typically contain quoted-printable, further base64, or both, along with other bits of brain damage caused by the accretion of many things that were good ideas at the time.

> 
> For me, using an unbounded amount of memory is not an option for a
> component processing OpenPGP data if we want to build robust systems
> on top.

Okay, but this problem existed prior to this working group, when there was running code without a consensus, rough or not. Even with compression, PGP 2 ran on DOS machines with a max of 640K of RAM. There are many similarly constrained systems that run OpenPGP implementations.

> 
> Therefore, we need to process OpenPGP data in bounded space.  Since
> there can be no relation between encrypted and decrypted message size
> due to compression, the only option I see is to provide a streaming
> API, which lets us process data in constant space.

I’m not quite sure what you mean when you say “bounded space” because one interpretation of that is obviously false. OpenPGP has always supported being able to process messages where the encryptor does not know the size ahead of time. That’s why we have indeterminate lengths and chunking.

I presume you mean that the implementation has to have constraints on its resources. This is certainly true; there are no unconstrained systems. It’s also true that there are going to be messages that your implementation can’t process well. For example, RFC 4880 allows a partial body length (a chunk) to be 2^30 octets, and that could be irritating to handle.

One of the (perhaps unstated) goals of OpenPGP is that it allow for highly constrained implementations. This was a huge consideration in both 2440 and 4880. There are things that are designed the way they are because the working group felt strongly that things have to be one-pass. There were many debates about the MDC that boil down to it. This is also the reason why HMAC wasn’t used, but there’s more history there, including that while HMAC existed when the MDC was designed, we did not yet have a proof of security for it.

> 
> [Now, when I say constant space, implementations could still decide to
> use, say, 30 megabytes of buffer space.  Then, most emails will fit
> into this buffer, and we can detect truncated messages before we hand
> out one byte to the downstream application.  This is what we do in
> Sequoia.  Note, however, that the consumer decides how much data to
> buffer before releasing the first data, and not the producer.  If we
> decide to even allow 128 megabyte chunks, then the producer can
> *force* the consumer to allocate 128 megabytes, or either not process
> the message or do it unsafely.]

Or the consumer could return an error and say it can’t decode it.

> 
> Now, as efail demonstrated, we need to protect against ciphertext
> modifications, and we need to do it in a way that does not bring back
> the problems with requiring unbounded space that we're trying to
> address with streaming in the first place.

Efail is primarily a problem with MIME encoding and layering violations. It works just as much with S/MIME as OpenPGP. Perhaps I’m missing something, but I don’t see how Efail is relevant to resource bounds.

> 
> Therefore, we need to use chunking and authenticate message prefixes.
> We need to use chunks that are reasonably small, and this size should
> preferably not be configurable.  We should consider performance
> constraints and pick one suitable size.  Configurable chunk sizes
> bring complexity and increase the attack surface, as was pointed out
> in this thread.
> 

I’m with you on a lot of this, but I don’t know what you mean by “configurable”. Do you mean that there should be one chunk size only? If so, what size do you propose? 32 Meg or thereabouts (2^25 is in that ball park)? If so, would that mean that all messages smaller than your chunk size would be a single chunk?

> The only argument for a configurable chunk size that came out of this
> thread is to be able to fit the entire message into one chunk.

That’s not the way I understand the discussion. The way I understand it, there are people who desire to have single-chunk messages of a rather large size. At present, the non-AEAD chunks can be any power of 2 up to 2^30 (but the first one has to be at least 2^9). I don’t see the request for variable (is that the same thing as configurable?) chunk sizes to be anything other than the analogue of the present situation.
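
For reference, the existing non-AEAD chunk sizes mentioned above can be sketched in a few lines. This assumes the partial body length encoding from RFC 4880, section 4.2.2.4 (a single octet in the range 224..254 whose low five bits are the power of two); the function names are illustrative only and do not come from any implementation.

```python
def partial_body_length(octet: int) -> int:
    """Decode an RFC 4880 partial body length octet (224..254)."""
    if not 224 <= octet <= 254:
        raise ValueError("not a partial body length octet")
    # The low five bits hold the exponent, so sizes run from 2^0 to 2^30.
    return 1 << (octet & 0x1F)


def first_partial_body_length(octet: int) -> int:
    """Same decoding, plus the rule that the first partial is >= 512 octets."""
    length = partial_body_length(octet)
    if length < 512:  # 512 == 2^9
        raise ValueError("first partial body length must be at least 512 octets")
    return length
```

With this encoding, octet 233 decodes to 2^9 = 512 octets and octet 254 to 2^30 octets.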

> 
> I appreciate the desire to protect against truncation.  But,
> truncation is pretty common when we transmit data, so I'd argue that
> application developers are more likely to expect and gracefully deal
> with truncated data than with ciphertext being manipulated or the PGP
> implementation consuming unbounded amounts of memory.

Does this mean that you think that message truncation is an error that OpenPGP doesn’t need to guard against?

That’s the way I interpret the first line in the paragraph above (“I appreciate … But,…”). If so, that’s counter to the long-standing consensus of the working group. It’s the whole reason we have MDCs, the reason they were pushed aggressively in implementations, and the reason implementations still emitting non-MDC packets were browbeaten into doing MDCs. See the non-normative discussion in section 5.13 of 4880.

> 
> Now, you may say that even if the PGP implementation doesn't buffer
> the plaintext, the downstream consumer must buffer it in order to
> detect truncation.  But that is not always true.  As pointed out in
> this thread, you can use some kind of transaction scheme to only
> commit data once it has been confirmed to be not truncated.

I think I understand. Are you noting that because of the one-pass nature of OpenPGP, it’s possible to process arbitrary amounts of data and not know that there’s an error until the end? This is certainly true of MDCs, because of the one-pass desire. If you make an implementation that has AEAD chunks, it’s possible that you could be processing correct chunks for an indefinite amount of time, and then get an AEAD failure that calls into question the integrity of the whole stream that led up to that.

Is that what you’re pointing out?

> 
> 
> I have implemented AEAD in Sequoia, and I have evaluated the
> implementations in GnuPG and RNP.  Every implementation is either
> unsafe, not robust, or does not implement the proposal.

Tell us more. What problems did you find?

> 
> What is proposed in RFC4880-bis06 can not be implemented safely.  If the
> working group produces a standard that cannot be implemented safely, I
> consider that a grave failure of the standardization effort.

Okay, you’ve lost me.

What can’t be implemented safely and why?

In my reading of this, I think I have identified two points you’re making.

(1) It’s possible for a chunk to be larger than reasonable processing resources.
(2) It’s possible for a long stream to have an error in the last chunk that signals an error wayyyyyy in the past.

Handling (1) is reasonably easy. Return an error. This situation exists today. It’s possible to make partial bodies of a gigabyte each, and an implementation may not be able to handle that. Return an error.
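
A minimal sketch of (1), assuming the consumer learns the declared chunk size before it allocates anything. The 32 MiB limit, the exception name, and the function name are hypothetical; a real implementation would pick its own policy.

```python
class ChunkTooLarge(Exception):
    """Raised when a declared chunk exceeds what this consumer will buffer."""


# Hypothetical consumer-side policy: refuse chunks above 2^25 octets (32 MiB).
MAX_CHUNK_OCTETS = 1 << 25


def checked_chunk_size(declared: int, limit: int = MAX_CHUNK_OCTETS) -> int:
    """Return the declared size, or raise before any buffer is allocated."""
    if declared < 0:
        raise ValueError("chunk size cannot be negative")
    if declared > limit:
        raise ChunkTooLarge(
            f"chunk of {declared} octets exceeds limit of {limit} octets"
        )
    return declared
```

The point of the sketch is that the limit belongs to the consumer: a producer can declare a gigabyte chunk, but it cannot force the allocation.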

Handling (2) is also easy: you return an error. This might be unsatisfying, because the error might be in the past, and lots of stuff has already been handled. Is this your objection?
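
One way to live with (2) is the transaction scheme mentioned in the quoted text: stream plaintext to a temporary location and commit it only once the whole stream has verified. The sketch below assumes a decryptor that yields plaintext chunks and raises on an integrity failure; `IntegrityError` and `write_transactionally` are hypothetical names, not part of any OpenPGP library.

```python
import os
import tempfile
from typing import Iterable


class IntegrityError(Exception):
    """Raised by the (hypothetical) decryptor when a chunk or final tag fails."""


def write_transactionally(chunks: Iterable[bytes], dest: str) -> None:
    """Write chunks to a temp file and rename it to dest only on full success."""
    directory = os.path.dirname(os.path.abspath(dest))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as out:
            for chunk in chunks:  # the iterator may raise at any point
                out.write(chunk)
        # Commit: reached only if every chunk, including the last, verified.
        os.replace(tmp_path, dest)
    except BaseException:
        # Roll back: discard the partial plaintext and re-raise the error.
        os.unlink(tmp_path)
        raise
```

Memory use stays bounded by the chunk size, and nothing appears at the destination path if the last chunk fails, however far in the past the bad data was written.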

	Jon