Re: [openpgp] Deprecating compression support

Jon Callas <joncallas@icloud.com> Mon, 18 March 2019 20:45 UTC

I like the basic proposal. I think that deprecation is better than banning, and consequently we ought to be doing it with SHOULDs and SHOULD NOTs rather than MUSTs, since MUSTs would be banning rather than deprecating.

However, I want to add that implementations today can deprecate it on their own, and no changes to the standard need be there. If an implementation creates a key with compression preferences set to no compression, then any other implementation is bound by the standard not to compress. I advocate this as well. As OpenPGP sits today, any implementer can unilaterally not only deprecate, but eliminate compression. In fact, I’ll go so far as to advocate that all implementers, even GnuPG, ought to just start making keys by default that eschew compression. More on this below.

My rationale for getting rid of compression is different. I think that the security-based reasons in this thread are largely unconvincing and often just not quite correct. I don’t want to discuss that in this thread, because I agree with the destination (let’s move away from compression), and I am afraid that if we debate the reasons for it we will distract from the result we mutually want. We should deprecate compression, not because of security but because of simplicity: in short, a case of “that was then, this is now.”

OpenPGP has many things in it that are needlessly complex, and compression is one of them. (Another needless complexity, among others, is the way that packet sizes are computed.) Much of this comes from its heritage as a utility, and a utility that was designed for DOS and BBSes. The considerations of networking and communications in 1991 are not at all what we should have for 2021.

One of these differences is that zip-style compression is just not as useful now. Obviously, the wins for compression are on the things that are the largest. Most of the large things we transfer today are already compressed, and thus compressing them again is mostly a waste of CPU cycles. Some file formats have zip-style compression built in. For example, GIF and PNG graphics are explicitly compressed. Other formats like JPEG use a different sort of compression but share the characteristic that their output is close to pseudo-random and not readily compressed further. Ditto for audio formats like MP3 and AAC, while FLAC and ALAC are like GIF and PNG in that they’re losslessly compressed.

Today, the large things we send are media, and therefore there’s no need to compress them. The things we send that are easily compressible tend to be smaller.
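
To make the point concrete, here is a minimal Python sketch using the standard zlib module. The word list, the sizes, and the use of random bytes as a stand-in for media files are all my own assumptions for illustration; real JPEG or MP3 data is only approximately pseudo-random.

```python
import os
import random
import zlib

def deflate_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 6)) / len(data)

# Hypothetical vocabulary; pseudo-text built from it stands in for the
# small, easily compressed things we send (mail bodies, logs, source).
WORDS = ["key", "packet", "cipher", "message", "sign", "verify", "armor",
         "session", "prefs", "subkey", "trust", "hash", "salt", "tag",
         "length", "literal"]
rng = random.Random(0)  # fixed seed so the run is repeatable
text = " ".join(rng.choice(WORDS) for _ in range(20000)).encode("ascii")

# Random bytes approximate already-compressed media (an assumption).
media_like = os.urandom(len(text))

# The same text after one pass through DEFLATE.
already_compressed = zlib.compress(text, 6)

print(f"plain text:         {deflate_ratio(text):.3f}")
print(f"media-like bytes:   {deflate_ratio(media_like):.3f}")
print(f"compressed twice:   {deflate_ratio(already_compressed):.3f}")
```

The first ratio should come out well below 1, while the other two should sit at about 1: the second pass spends CPU and saves nothing.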

In the cases where large files are sent that can be compressed, the best solution is for that system to compress the file itself. Remember, OpenPGP’s history comes from DOS programs on small computers and they didn’t even have a way to pipe a tar or zip command to an encryption program. That’s not the case today.

Historically, compression has been the largest share of the time it takes to process a file with an OpenPGP implementation, and thus we are doing something by default that burns CPU for little gain. We should stop doing it by default. Again, the best way to do this is to create keys that have a compression preference of none. We can do this now.
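
As a rough sketch of why a key preference is enough, here is how a sender might pick a compression algorithm from recipient preferences. The algorithm IDs and the default of [ZIP, Uncompressed] for absent preferences are from RFC 4880; the function name, the tie-breaking by the first recipient's order, and the fallback to uncompressed are my assumptions, not anything the standard or any particular implementation prescribes.

```python
from typing import Iterable, Optional, Sequence

# Compression algorithm IDs as assigned in RFC 4880, section 9.3.
UNCOMPRESSED, ZIP, ZLIB, BZIP2 = 0, 1, 2, 3

# RFC 4880 treats absent preferences as [ZIP, UNCOMPRESSED].
DEFAULT_PREFS = (ZIP, UNCOMPRESSED)

def choose_compression(
    recipient_prefs: Iterable[Optional[Sequence[int]]],
    supported: Sequence[int] = (UNCOMPRESSED, ZIP, ZLIB, BZIP2),
) -> int:
    """Pick an algorithm that appears in every recipient's preference list.

    Hypothetical helper: candidates are tried in the first recipient's
    order, and the fallback is UNCOMPRESSED, on the assumption that every
    implementation can read a message that was not compressed.
    """
    prefs = [tuple(p) if p is not None else DEFAULT_PREFS
             for p in recipient_prefs]
    if not prefs:
        return UNCOMPRESSED
    for algo in prefs[0]:
        if algo in supported and all(algo in p for p in prefs):
            return algo
    return UNCOMPRESSED

# A key that lists only "uncompressed" forces the sender's hand.
print(choose_compression([[UNCOMPRESSED]]))
# A key with no stated preference falls back to the RFC 4880 default.
print(choose_compression([None]))
```

With a selection rule like this, a single recipient key whose only preference is uncompressed switches compression off for every message sent to it.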

Over a decade ago, the PGP program had a feature of the “PGP Zip” file. This was literally nothing more than a tar file that was run through gzip, and then encrypted with OpenPGP without compression. PGP itself had some nice viewers that let someone manipulate the container just as other directory compression systems did, but any unix system could handle the file by just decrypting with gnupg and piping that to tar. Later on, as I remember (correct me if I’m wrong, Derek, or someone), we shifted to bz2. The same strategy can be improved as compression tools improve.
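
The same compress-first, encrypt-second idea can be sketched in Python. This is not how PGP Zip was implemented; it is an illustration under assumptions. The helper names, the recipient, and the output path are hypothetical, and the GnuPG invocation relies on the --compress-algo none option to keep gpg from compressing a second time.

```python
import io
import subprocess
import tarfile
from typing import Dict, List

def make_tar_gz(files: Dict[str, bytes]) -> bytes:
    """Build a gzip-compressed tar archive in memory from name -> contents."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def gpg_encrypt_argv(recipient: str, output: str) -> List[str]:
    """Command line asking GnuPG to encrypt stdin without compressing it."""
    return ["gpg", "--batch", "--compress-algo", "none",
            "--recipient", recipient, "--output", output, "--encrypt"]

def encrypt_archive(files: Dict[str, bytes], recipient: str, output: str) -> None:
    """Compress outside OpenPGP, then hand the archive to gpg on stdin.

    Needs a gpg binary and the recipient's public key; not called here.
    """
    subprocess.run(gpg_encrypt_argv(recipient, output),
                   input=make_tar_gz(files), check=True)

archive = make_tar_gz({"notes.txt": b"compress first, encrypt second\n"})
print(len(archive), gpg_encrypt_argv("alice@example.org", "notes.tgz.gpg"))
```

On the receiving side, the equivalent of what I described is decrypting with gpg and piping the result to tar. Swapping gzip for a newer compressor changes only the archive step and leaves the OpenPGP step alone.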

The compression inside of OpenPGP is also hard to implement correctly. The default compression, the “DEFLATE” option, is a modified implementation of ZIP-style programs from the era of the late 1980s. As those programs coalesced into default (not precisely standard, merely ubiquitous) implementations, they went in one direction and that era’s PGP stayed with the variant. When we did RFC 2440, we added in ZLIB (RFC 1950) compression so that an implementation wouldn’t have to hack the compression software. This is another reason to move away from compression inside OpenPGP. RFC 4880 added in an option for BZ2 (it was the new hotness at the time), and these days the cool kids are using 7-Zip as it's even better than bzip2. The best way to keep up with advances in compression is piping from a compressor to some OpenPGP implementation, rather than continuing to chase advancements.

The underlying reality is even worse. Go look at section 5.6 of RFC 2440, where there are interoperability hints for working with PGP2 because it had further limitations on internal table sizes. This is another reason to get rid of it for simplicity. It flat isn’t needed.
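
A small Python sketch shows the kind of detail an implementer has to get right. It assumes, going by my reading of section 5.6, that the ZIP option means raw RFC 1951 DEFLATE blocks, that ZLIB means the RFC 1950 wrapper, and that the PGP2 hint amounts to a 13-bit window; the function names are hypothetical.

```python
import zlib

def zip_compress(data: bytes, window_bits: int = 15) -> bytes:
    """Raw DEFLATE (RFC 1951): no header, no checksum.

    A negative wbits value tells zlib to omit the RFC 1950 wrapper.
    window_bits=13 is the smaller window the PGP2 interoperability
    hint is about (an assumption based on my reading of the RFC).
    """
    comp = zlib.compressobj(6, zlib.DEFLATED, -window_bits)
    return comp.compress(data) + comp.flush()

def zip_decompress(blob: bytes) -> bytes:
    """Decode raw DEFLATE with the largest window, which reads any size."""
    decomp = zlib.decompressobj(-15)
    return decomp.decompress(blob) + decomp.flush()

def zlib_compress(data: bytes) -> bytes:
    """RFC 1950 stream: 2-byte header, DEFLATE data, 4-byte Adler-32."""
    return zlib.compress(data, 6)

sample = b"OpenPGP compressed data packet body. " * 200

raw = zip_compress(sample)
raw_pgp2 = zip_compress(sample, window_bits=13)
wrapped = zlib_compress(sample)

# Both raw variants decode with the same raw decoder.
print(zip_decompress(raw) == sample, zip_decompress(raw_pgp2) == sample)
# Stripping the RFC 1950 header and checksum leaves raw DEFLATE.
print(zip_decompress(wrapped[2:-4]) == sample)
```

The two formats carry the same DEFLATE payload and differ only in framing and window size, yet a library that offers only the wrapped form cannot produce or read the raw one without being hacked. That is exactly the sort of chore that goes away when compression moves outside OpenPGP.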

Simply not having compression aids implementation, as well. A number of years ago, I was putting together a JavaScript implementation and there were no good compression libraries at the time, so we just created keys that said “no compression, please” and ran with it. Every implementation can handle decoding an OpenPGP object that is not compressed, so there is little backwards-compatibility burden in transitioning away from it.

Lastly, I’ll note that everything I’ve said here could be applied to banning compression in 4880-bis, as well. I am not opposed to a ban, but I think deprecation is better, particularly since every implementation can just stop doing compression by default and everything will work just fine. Repeating myself, I even advocate this. Everyone who makes some OpenPGP program can stop today, and likely should. I’ll even promote that to SHOULD.

(The biggest reason against an outright ban is that there are a lot of systems that use OpenPGP in the midst of some internal process, like moving around large, sensitive files. There are thus likely a whole lot of shell scripts somewhere that someone’s got to find and change, and I wouldn’t be surprised if something breaks if there’s a sudden shift. If we simply start creating keys with preference of no compression, it gets maximum quick uptake, and can even be improved with some options in gnupg.conf or equivalent.)

So to sum up, I completely support deprecating compression, because it improves and simplifies the standard. Today, compression by default is of limited benefit, and it simplifies implementation and understanding of OpenPGP to phase it out. I am an enthusiastic supporter of implementations just making the changes now, without a document change. I’m also in favor of deprecating. I’m not opposed to a ban, but I think it’s unneeded.

	Jon