Re: [ietf-smtp] quoted-unprintable ?

Ned Freed <ned.freed@mrochek.com> Mon, 22 March 2021 14:07 UTC

Cc: ietf-smtp@ietf.org
Message-id: <01RWXI9W6HXE0085YQ@mauve.mrochek.com>
Date: Sun, 21 Mar 2021 12:18:41 -0700
From: Ned Freed <ned.freed@mrochek.com>
In-reply-to: "Your message dated Sun, 21 Mar 2021 13:41:02 -0400" <20210321174103.BB02470D9908@ary.qy>
References: <20210321174103.BB02470D9908@ary.qy>
To: John Levine <johnl@taugh.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-smtp/c_3lnwgPgbKAkIVZBdL3YYjkuCY>
Subject: Re: [ietf-smtp] quoted-unprintable ?

> Over on another list we have been musing about CHUNKING and BINARYMIME.

> CHUNKING is easy to implement (I added it to my MTA in about an hour) but BINARYMIME
> is extremely painful on systems where the native line ending is not \r\n because you
> have to parse MIME bodies to figure out when to change line endings and when not to.

> The advantage of BINARYMIME over base64 is that base64 is 33% bigger since it encodes
> six bits per octet rather than 8.  It occurs to me that since everyone these days
> supports 8BITMIME, one could invent a quoted-unprintable encoding that encodes only
> the characters that are special, CR LF NUL.  (To play it safe I'd also encode 0xff).
> This gets you about a 2% size increase and stays compatible with 8BITMIME.

It's actually kind of tricky if you want to avoid pathological cases. Combining
the encoding with compression is the simplest way to avoid that, since it's
unlikely in the extreme that the compression scheme will spit out a high
percentage of any specific character.
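
To make the pathological case concrete, here is a minimal sketch of such an
escape encoding. It is not the encoding from any draft. The '=' escape byte
and the +64 shift are assumptions borrowed from yEnc, the 998-octet line
length is the usual message line limit, and the function names are
hypothetical. On uniformly random input about 5 octets in 256 get escaped
(roughly 2%), but an input made entirely of CRs doubles in size unless it is
deflated first.

```python
import os
import zlib

ESC = 0x3D  # '=' as the escape byte (assumption, borrowed from yEnc)
SPECIAL = {0x00, 0x0A, 0x0D, 0xFF, ESC}  # NUL, LF, CR, 0xff, and the escape itself


def encode(data: bytes, line_len: int = 998) -> bytes:
    """Escape the special octets, then wrap into CRLF-terminated lines."""
    out = bytearray()
    for b in data:
        if b in SPECIAL:
            out.append(ESC)
            out.append((b + 64) & 0xFF)  # the shifted value is never special
        else:
            out.append(b)
    lines = [bytes(out[i:i + line_len]) for i in range(0, len(out), line_len)]
    return b"".join(line + b"\r\n" for line in lines)


def decode(text: bytes) -> bytes:
    """Drop line breaks (raw CR/LF never occur in the payload), then unescape."""
    raw = text.replace(b"\r", b"").replace(b"\n", b"")
    out = bytearray()
    it = iter(raw)
    for b in it:
        out.append((next(it) - 64) & 0xFF if b == ESC else b)
    return bytes(out)


def overhead(original: bytes, encoded: bytes) -> float:
    """Relative size increase of the encoded form over the original."""
    return len(encoded) / len(original) - 1.0


random_data = os.urandom(100_000)
worst_case = b"\r" * 100_000  # every octet needs escaping

print(f"random input:     {overhead(random_data, encode(random_data)):+.1%}")
print(f"all-CR input:     {overhead(worst_case, encode(worst_case)):+.1%}")
print(f"all-CR, deflated: {overhead(worst_case, encode(zlib.compress(worst_case))):+.1%}")
```

Deflating first means the escaper sees near-random octets, so the overhead
stays close to the random-input figure whatever the original content was.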

> This seems totally obvious.  Has anyone proposed it before?

At least five times that I know of. Probably a bunch more that I don't. One
scheme that's actually used on the News side of things is yEnc:

  http://www.yenc.org/

I wrote a draft at one point for one that combined the encoding part of yEnc
with deflate:

  https://tools.ietf.org/html/draft-freed-mime-newenc-00

It never went anywhere.

> I realize that there is a severe chicken and egg problem here since you wouldn't use it
> unless you knew your recipient could handle it.  I suppose one could add an EHLO keyword,
> but MTAs don't downgrade on the fly any more, particularly since that means they would have
> to redo DKIM and ARC signatures.

First, there are still MTAs that downgrade, just not for 8BITMIME. It's pretty
much a requirement if you're going to try to deploy SMTPUTF8. However, as long
as the downgrade simply ignores encodings it doesn't recognize, it shouldn't
be a problem.

Second, an EHLO keyword isn't useful, since what matters is the *recipient*'s
ability to decode. The same goes for antivirus scanners, which won't like not
being able to see into all the parts of a message.

And this is pretty much why none of these schemes ever went anywhere.

Anyway, I would be fine with reviving the draft if people either have a way around
the chicken and egg problem, or think it can be ignored.

> PS: I also realize that the sensible way to send a message with a giant attachment is
> to put the attachment on a web server and use message/external-body, but that doesn't seem
> all that well supported, either.

Antivirus scanners don't like message/external-body either, because the body can
be changed after scanning. The few that support it do things like replace the
URL with one that redirects through a server that either does the scan every
time the object is fetched or has a checksum to make sure the object hasn't
changed. But all of this is pretty high overhead and, for reasons I don't fully
understand, doesn't work all that well in practice.
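
The checksum variant can be sketched roughly as follows. This is only an
illustration: SHA-256 and the function name are assumptions, since nothing
above says which checksum such scanners actually record.

```python
import hashlib
import hmac


def body_unchanged(fetched: bytes, recorded_sha256_hex: str) -> bool:
    """Return True if the fetched object still matches the digest recorded at scan time."""
    actual = hashlib.sha256(fetched).hexdigest()
    return hmac.compare_digest(actual, recorded_sha256_hex.lower())


# Digest recorded when the scanner first fetched and checked the object.
scanned = b"attachment bytes as seen by the scanner"
recorded = hashlib.sha256(scanned).hexdigest()

print(body_unchanged(scanned, recorded))         # True
print(body_unchanged(scanned + b"!", recorded))  # False: object changed after the scan
```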

				Ned