Re: [openpgp] Message padding in OpenPGP

Jon Callas <joncallas@icloud.com> Wed, 25 September 2019 15:24 UTC

From: Jon Callas <joncallas@icloud.com>
In-Reply-To: <CA+t5QVs7aoyBotbApmGQBGO9otLeB9knccAV8w9MacjrcE_51w@mail.gmail.com>
Date: Wed, 25 Sep 2019 11:24:04 -0400
Cc: Jon Callas <joncallas@icloud.com>, openpgp@ietf.org
Content-Transfer-Encoding: quoted-printable
Message-Id: <B59B17A7-2F1C-4635-BBC9-1244D5C868F1@icloud.com>
References: <CA+t5QVsZoWEuDWEzGn+mWNsx+giJsq+9pYptt3TfffASBVoGsw@mail.gmail.com> <8994782B-12D6-4B91-BA7A-1BF6BF4E7951@icloud.com> <CA+t5QVs7aoyBotbApmGQBGO9otLeB9knccAV8w9MacjrcE_51w@mail.gmail.com>
To: Justus Winter <justuswinter@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/openpgp/7lX_Vz_-DXp40nqLJwwoPLNmoH0>
Subject: Re: [openpgp] Message padding in OpenPGP


> On Sep 25, 2019, at 5:03 AM, Justus Winter <justuswinter@gmail.com> wrote:
> 
> On Tue, Sep 24, 2019 at 11:00 PM Jon Callas <joncallas@icloud.com> wrote:
>> Am I correct in understanding that you're proposing adding in decoy traffic to pad out compressed data to its uncompressed length?
> 
> No.  I'm proposing not to compress the data at all, and then add some
> padding data according to some policy.  The compression container is
> only a means to add the padding within the constraints of the current
> ecosystem.

Okay. Thanks. I think that clarifies. (Meaning I'm still slightly puzzled, but okay.)

> 
>> If I'm missing something, what problem are you trying to solve with this?
> 
> There is a correlation between the size of the encrypted message and
> the size of the plaintext.  On first sight, compression helps with
> that, but that makes the size dependent on the entropy of the
> plaintext, which also leads to problems as discussed previously.
> Padding alleviates this problem, the tradeoff being an increased
> message size.

Well, if you don't compress, sure there is. The size is going to be <message-size> + <overhead> and the overhead is generally easily computed or guessed.

I understand the vague concern, but to me the proposal is security stone soup. You're throwing some stuff together and it vaguely meets the vague concern.

I think there's a way forward and that might be something like:

* Describe the actual threat. I can imagine threats, but I don't have a handle on what you're trying to do exactly.
* Describe how the solution (padding) helps and give quantification as to how it helps. There are plenty of places where padding doesn't help, it just shifts a bunch of things around. There are also places where padding ends up hurting. I've seen this in constant-traffic networks where the padding makes traffic analysis easier. (Hand-waved explanation: the padding makes it easier to recover a timing side channel, and that side channel allows you to statistically remove the padding; you end up knowing the aggregate of padding to a statistical confidence level, and on a data stream, that's good enough.)
* Look at what might be second-order effects and discuss them at the least. Costs in terms of networking and storage vs benefit need to be in here.

A few years ago, I was looking at a very similar problem, and that was removing size-based traffic analysis from cloud storage systems. We looked at padding things out, and in many cases padding didn't really help. We looked at padding with thresholds -- e.g. round everything up to the nearest chunk size, where a chunk is something like a power of two in the 4K to 1M range. It turns out that a lot of information gets leaked anyway. For example, you can easily guess that something is likely to be a selfie, because they're all in a reasonably narrow band of sizes.
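To make the threshold idea concrete, here is a minimal sketch of one reading of it: pick a fixed chunk size that is a power of two between 4 KiB and 1 MiB, and round every object up to the next multiple of that chunk. The function name `padded_size`, the exact bounds, and the choice to pad an empty object to one full chunk are assumptions made for illustration, not details of the system described above.

```python
MIN_CHUNK = 4 * 1024       # assumed lower bound: 4 KiB
MAX_CHUNK = 1024 * 1024    # assumed upper bound: 1 MiB


def padded_size(size: int, chunk: int) -> int:
    """Return size rounded up to the next multiple of chunk.

    chunk must be a power of two within [MIN_CHUNK, MAX_CHUNK].
    An empty object is still padded to one full chunk (an assumption).
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    if not (MIN_CHUNK <= chunk <= MAX_CHUNK) or chunk & (chunk - 1):
        raise ValueError("chunk must be a power of two from 4 KiB to 1 MiB")
    chunks_needed = max(1, -(-size // chunk))  # ceiling division
    return chunks_needed * chunk
```

An observer still learns the number of chunks, so objects that cluster in a narrow band of sizes land in a narrow band of padded sizes as well.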

The larger your chunk is, the better you blur, but the obvious downside (that now you're writing a lot of extra data in the vast majority of cases) has such an effect on wasted networking and storage space that A Reasonable Person would likely decline to pad. Padding small enough for the cost to be insignificant gives a correspondingly insignificant benefit. We ended up doing some chunking, but we knew that the benefits were so small that we didn't even really talk about it. It was far too easy for someone to think they were getting a huge benefit that they weren't getting. (A similar situation is the way people overestimate how much private browsing or even Tor help them.)
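The tradeoff can be put in rough numbers with a sketch like the following. It assumes the fixed-chunk rounding policy (round up to a multiple of the chunk size) and reports two crude metrics: the fraction of extra bytes written, and how many distinct padded sizes an observer can still tell apart. The helper name `evaluate` and the sample sizes are hypothetical, chosen only for illustration, and the distinct-size count is a stand-in for a real leakage measure.

```python
def evaluate(sizes, chunk):
    """Return (overhead_ratio, distinct_padded_sizes) for one chunk size.

    overhead_ratio is the extra bytes written divided by the original bytes.
    distinct_padded_sizes counts the size classes still visible to an observer.
    """
    padded = [max(1, -(-s // chunk)) * chunk for s in sizes]  # ceiling division
    overhead_ratio = (sum(padded) - sum(sizes)) / sum(sizes)
    return overhead_ratio, len(set(padded))


# Hypothetical object sizes in bytes, not measurements from any real system.
sample_sizes = [1_000, 20_000, 300_000, 2_500_000, 2_600_000]

for chunk in (4 * 1024, 64 * 1024, 1024 * 1024):
    overhead, classes = evaluate(sample_sizes, chunk)
    print(f"chunk={chunk:>8}  overhead={overhead:.1%}  size classes={classes}")
```

On this sample, the small chunk costs almost nothing but leaves every object in its own size class, while the large chunk merges objects into fewer classes at the price of a much larger overhead.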

Summing up, it's interesting, but I think a cost-benefit discussion should follow, along with at least a hand wave of metric-ish things.

	Jon