Re: [ietf-smtp] quoted-unprintable ?

Viktor Dukhovni <ietf-dane@dukhovni.org> Fri, 26 March 2021 07:33 UTC

On Thu, Mar 25, 2021 at 04:32:25PM -0700, Ned Freed wrote:

> Any competent encryption scheme produces something close to uniformly
> distributed output, which is about as good as you're going to get.
> 
> The second proposal actually analyzes the input, shifts characters around
> to avoid the problematic cases, and selects a good quoting character.

Nothing nearly that sophisticated is needed, if one is willing to keep it
simple and accept ~4.2% total cost (output folded to 78 bytes + CRLF).
In particular encryption really feels much too heavyweight.  And there
is a malicious counter-example in the form of binary data that happens
to be the decryption of a run of NULs. :-)

Instead, the COBS scheme generalises easily.  The encoding can be more
efficient if we e.g. chunk the output as 998 bytes + CRLF, in which case
the worst-case expansion is 1.8% for an input stream entirely devoid of
NUL, LF or CR bytes.  This is the absolute worst case, not just for
likely inputs but for all possible inputs, and an input in which every
line is 63 bytes or less is not expanded at all!
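
For anyone who wants to check the two percentages, here is a rough
sketch of the arithmetic.  It assumes the figures come from one code
byte per 63 literal bytes, multiplied by the cost of a CRLF after every
output line; the function name and parameters are mine, purely for
illustration.

```python
def total_expansion(literals_per_code: int, line_len: int) -> float:
    """Fractional size increase: encoding overhead times folding overhead."""
    # Assumption: one code byte is spent for every `literals_per_code`
    # literal bytes when the input contains no NUL, LF or CR.
    encoding = (literals_per_code + 1) / literals_per_code
    # Assumption: every `line_len` output bytes are followed by CRLF.
    folding = (line_len + 2) / line_len
    return encoding * folding - 1.0


print(f"{total_expansion(63, 78):.1%}")   # 4.2%
print(f"{total_expansion(63, 998):.1%}")  # 1.8%
```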

On Thu, Mar 25, 2021 at 02:58:53PM -0400, Viktor Dukhovni wrote:
> 
> There's an easy generalisation to 3 forbidden code points: use 6 bits
> for the chunk length, with 2 high bits for which of the 3 forbidden
> sequences terminates the chunk.
> 
>     01 - NUL |
>     10 - NL  | 6-bit byte count
>     11 - CR  |
> 
> A non-zero value (n) of the 6-bit count indicates a run of (n-1) literal
> bytes none of which are the reserved bytes NUL, LF or CR, followed by the
> indicated reserved byte.  When (n == 0) the 2 high bits signal, respectively,
> 63, 126 or 189 literal bytes without a following reserved byte.
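
To make the quoted layout concrete, here is a minimal sketch of an
encoder and decoder.  It is only an illustration, not a specification.
The quoted text does not say how a trailing run of literals is ended,
so this sketch borrows the classic COBS trick of appending a phantom
NUL that the decoder strips; that choice, and every name below, is an
assumption.  Folding the output into lines with CRLF is left out.

```python
# 2-bit tags from the quoted table: 01 = NUL, 10 = LF, 11 = CR.
TAG_OF = {0x00: 1, 0x0A: 2, 0x0D: 3}
BYTE_OF = {tag: byte for byte, tag in TAG_OF.items()}


def encode(data: bytes) -> bytes:
    """Encode so that the output contains no NUL, LF or CR bytes."""
    out = bytearray()
    run = bytearray()  # pending literal (non-reserved) bytes
    # Assumption: a phantom NUL terminates the final run.
    for b in data + b"\x00":
        tag = TAG_OF.get(b)
        if tag is None:
            run.append(b)
            if len(run) == 189:
                out.append(3 << 6)  # count 0, tag 3: 189 bare literals
                out += run
                run.clear()
            continue
        # A reserved byte ends the run; first shed 126 or 63 bare
        # literals so that at most 62 remain for the final chunk.
        if len(run) >= 126:
            out.append(2 << 6)
            out += run[:126]
            del run[:126]
        elif len(run) >= 63:
            out.append(1 << 6)
            out += run[:63]
            del run[:63]
        out.append((tag << 6) | (len(run) + 1))  # n = literals + 1
        out += run
        run.clear()
    return bytes(out)


def decode(enc: bytes) -> bytes:
    """Inverse of encode(); raises ValueError on malformed input."""
    out = bytearray()
    i = 0
    while i < len(enc):
        tag, n = enc[i] >> 6, enc[i] & 0x3F
        i += 1
        if tag == 0:
            raise ValueError("code byte with tag 00")
        count = 63 * tag if n == 0 else n - 1
        chunk = enc[i:i + count]
        if len(chunk) != count:
            raise ValueError("truncated chunk")
        out += chunk
        i += count
        if n != 0:
            out.append(BYTE_OF[tag])  # restore the reserved byte
    if not out or out[-1] != 0x00:
        raise ValueError("missing phantom NUL")
    del out[-1]
    return bytes(out)
```

Every code byte has a non-zero tag in its top two bits, so it is at
least 0x40 and can never collide with NUL, LF or CR.  With the phantom
NUL, a 62-byte line plus CRLF (64 bytes) encodes to 65 bytes.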

-- 
    Viktor.