Re: [dispatch] Zstandard compression

Sean Leonard <> Wed, 15 November 2017 04:34 UTC

From: Sean Leonard <>
Date: Wed, 15 Nov 2017 12:33:57 +0800
To: Harald Alvestrand <>
List-Id: DISPATCH Working Group Mail List <>

> On Nov 15, 2017, at 8:25 AM, Harald Alvestrand <> wrote:
> On 11/13/2017 10:25 AM, Martin J. Dürst wrote:
>> On 2017/11/13 15:18, Murray S. Kucherawy wrote:
>>> On Mon, Nov 13, 2017 at 2:01 PM, Sean Leonard <> wrote:
>>>> Working on this. It strikes me as either…odd…or just not really that
>>>> useful for the purpose. It is a format, but it’s just a format for a
>>>> stream of arbitrary bytes. There is no way to label it internally
>>>> beyond the implicit “application/octet-stream” type.
>>> Don't you need a hint as to how to decode an octet stream that's
>>> encoded in some way?
>> Yes. But I think what Sean is saying is that above and beyond that,
>> you'd (in many if not most use cases) also want to know what the
>> decoded octet stream is about. Otherwise, this information has to be
>> transmitted on a side channel.
> One (horribly ugly) way of doing this is to have something like a
> "contains" parameter:
> mime-type: application/zstandard; contains="text/plain; charset=utf-8"
> That's one form of side channel for passing the data around. Not sure if
> anyone would use it.
> But it's such an ugly way of doing things that it should definitely be
> suggested on the media-types list, not here.

Yep, we talked offline at the conference, and that was one of the possibilities discussed: add a parameter that records the inner content type. It has advantages and disadvantages. The main question is whether this is a media type that is intended to encapsulate other content, or not. The main purpose of the document appears to be to describe the Zstd compression algorithm, not to create a container format.
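
For what it's worth, here is a rough sketch of how a receiver might unpack such a parameter, using only Python's standard email package. The "contains" parameter is just the hypothetical one from Harald's message, not anything registered, and split_contains is a helper name I made up for illustration.

```python
from email.message import Message


def split_contains(value):
    """Split a Content-Type value carrying a hypothetical "contains" parameter.

    Returns (outer_type, inner_type, inner_charset); the inner values are
    None when no "contains" parameter is present.
    """
    outer = Message()
    outer["Content-Type"] = value
    outer_type = outer.get_content_type()

    # "contains" is NOT a registered parameter; it is the idea from this thread.
    inner_value = outer.get_param("contains")
    if inner_value is None:
        return outer_type, None, None

    # The inner value is itself a media type with parameters, so parse it again.
    inner = Message()
    inner["Content-Type"] = inner_value
    return outer_type, inner.get_content_type(), inner.get_param("charset")


label = 'application/zstandard; contains="text/plain; charset=utf-8"'
print(split_contains(label))
```

Note the double parsing: the inner type has to be quoted because it carries its own semicolon and parameters, which is a good part of why this approach feels ugly.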

Since it’s an algorithm, it is broadly applicable to lots of different container formats. For example, it could be used with .ZIP (application/zip) by registering a new “compression method”: see Section 4.4.5 of the ZIP APPNOTE. ZIP already supports BZIP2, LZMA, WavPack, and a handful of others, so Zstandard would be just one more method to support.

(The same principle applies to CMS CompressedData, which is why I suggested it.)
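
To make the ZIP point concrete, here is a small sketch using Python's standard zipfile module (assuming the bz2 and lzma modules are available in the build). It shows that the compression method is a per-entry field of the container; it does not use Zstandard itself, and the entry names are made up. A Zstandard method would simply be one more value in that same field.

```python
import io
import zipfile

payload = b"an arbitrary stream of bytes " * 50

# Each entry records its own compression method in the container.
# zipfile's constants are the ZIP method IDs: deflate=8, bzip2=12, LZMA=14.
entries = {
    "deflate.txt": zipfile.ZIP_DEFLATED,
    "bzip2.txt": zipfile.ZIP_BZIP2,
    "lzma.txt": zipfile.ZIP_LZMA,
}

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for name, method in entries.items():
        zf.writestr(name, payload, compress_type=method)

# Read the archive back: the reader picks the decompressor from the
# per-entry method field, not from any external label.
with zipfile.ZipFile(buf) as zf:
    methods = {info.filename: info.compress_type for info in zf.infolist()}
    roundtrip_ok = all(zf.read(name) == payload for name in entries)

print(methods, roundtrip_ok)
```

The container carries the labeling (file name, method), and the algorithm just handles the bytes, which is the division of labor I have in mind.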

Another thing to consider is whether it is a candidate for a structured syntax suffix, namely +zstd (which was suggested and discussed offline). Think about it, and ask the media-types@ folks. In this case, I do not find it to be a strong candidate, because Zstandard is not a “structured” syntax in the way that JSON, XML, etc. are “structured” (into arbitrary trees of constituent elements or parts).

Best regards,