Re: [Cbor] Packing CBOR (draft-bormann-cbor-packed-00)

Carsten Bormann <cabo@tzi.org> Wed, 29 July 2020 05:19 UTC

Hi Dale,

+1 to all your editorial points; this sure was written up quickly.

To the technical point: Yes, there is no reason tag 6 cannot be used deeper within the structure.  The interesting technical question is what happens if there are two or more packed structures: Do they sail like ships in the night, or is there interaction (e.g., for nested compression, do the tables get merged in some way?)?
We don’t have an answer for that (but probably should).

A related question is how to support CBOR sequences: is there a way for one of the data items to use a prefix/sharing table entry in another data item in the sequence?

Grüße, Carsten


> On 2020-07-29, at 08:53, Dale R. Worley <worley@ariadne.com> wrote:
> 
> After a very quick look, these points about the draft stand out.  The
> abstract contains:
> 
>   While traditional data compression techniques such as
>   DEFLATE (RFC 1951) work well for CBOR, their disadvantage is that the
>   receiver needs to unpack the compressed form to make use of data.
> 
>   This specification describes Packed CBOR, a simple transformation of
>   a CBOR data item into another CBOR data item that is almost as easy
>   to consume as the original CBOR data item.
> 
> The first sentence is probably not exactly what you want to say, because
> of course, in order to make use of the content of any "compressed form"
> you first have to "unpack" it, in that you have to reverse whatever the
> compression was.  I *think* what you mean is that in order to make use
> of even a small portion of the data, the entire prefix from the start to
> that portion of the data must be unpacked.
> 
> The second sentence seems also to not state what you care about.  I can
> define "a simple transformation of a CBOR data item into another CBOR
> data item" that is *just as easy* to consume as the original:  the
> identity transformation.  I *think* what you mean is that the packed
> form can (in many cases) be significantly smaller than the original
> form, and yet "random access" of individual items in the packed form is
> only constant-factor-slower than it is in the original form.
> 
> Note that I'm depending on "random access" to mean extracting an item
> when the indexes into the nested arrays and keys of the nested maps are
> provided.
> 
> Note that this isn't just nit-picking, you're explaining the
> characteristics that differentiate Packed CBOR, that is, the reason for
> this work.
> 
> The other thing that stands out is that the document is very terse.
> It promises to define the unpacking operation, but that definition is
> largely implicit.  E.g., 2.1 says:
> 
>   Shared items are stored in the third to last element of the array
>   used as tag content for tag number 6, numbered starting by 2.
> 
>   The shared data items are referenced by using the data items in
>   Table 1.  When reconstructing the original data item, such a
>   reference is replaced by the referenced data item, which is then
>   recursively unpacked.
> 
> I *think* what this means is, "Unpacking a packed-CBOR object is done by
> starting with the rump structure and replacing references within it
> with copies of the referenced structure.  Within the rump structure, tag 6
> (which I think is CDDL #6.6) is used not to designate a packed-CBOR
> structure but rather a reference to one of the shared structures.  The
> tagged value in such a reference is either a simple value or an integer,
> interpreted according to Table 1.  Implicitly, each reference is
> replaced by a copy of the referenced shared value.  The shared
> structures may themselves contain uses of tag 6, whose values are
> packed-CBOR structures."
> 
> The difference in the above description is that it is explicit about the
> operations that convert the packed-CBOR structure into its unpacked
> form.
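
A minimal sketch of the unpacking procedure as paraphrased above, in Python.
Everything in it is an assumption made for illustration, not the draft's
definition: `Tag` and `unpack` are hypothetical names; a packed item is
modeled as tag 6 around an array whose element 0 is taken to be the rump; a
reference is modeled as tag 6 around an integer that directly indexes that
array (2 or higher).  The actual encoding of references is given by Table 1
of the draft, which is not reproduced in this thread.  Prefix compression and
simple-value references are left out, and a nested packed item uses only its
own table, which is just one possible answer to the question of how nested
packed structures interact.

```python
from dataclasses import dataclass
from typing import Any, Optional

PACKED_TAG = 6  # tag number used by draft-bormann-cbor-packed-00


@dataclass(frozen=True)
class Tag:
    """Hypothetical stand-in for a CBOR tagged data item."""
    number: int
    content: Any


def unpack(item: Any, table: Optional[list] = None) -> Any:
    """Replace each reference by the shared item it names, recursively.

    `table` is the tag content array of the enclosing packed item, or None
    when we are not inside one.
    """
    if isinstance(item, Tag) and item.number == PACKED_TAG:
        content = item.content
        if isinstance(content, list):
            # Packed item: element 0 is ASSUMED to be the rump; the array
            # itself serves as the table for references inside it.
            if not content:
                raise ValueError("packed item has no rump")
            return unpack(content[0], content)
        if isinstance(content, int) and not isinstance(content, bool):
            # Reference: ASSUMED to index the table directly, starting at 2.
            if table is None:
                raise ValueError("reference outside of a packed item")
            if not 2 <= content < len(table):
                raise ValueError(f"no shared item at index {content}")
            # The referenced item is itself unpacked recursively.
            # (A cyclic reference would recurse forever; not checked here.)
            return unpack(table[content], table)
    if isinstance(item, Tag):
        return Tag(item.number, unpack(item.content, table))
    if isinstance(item, list):
        return [unpack(element, table) for element in item]
    if isinstance(item, dict):
        # Only map values are unpacked in this sketch; keys are kept as-is.
        return {key: unpack(value, table) for key, value in item.items()}
    return item
```

Under this model, `unpack(Tag(6, [{"a": Tag(6, 2)}, None, "hello"]))` yields
`{"a": "hello"}`; the `None` at index 1 is only a placeholder for the array
element that this sketch does not interpret.
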
> 
> Also, it's not clear to me whether a packed-CBOR (a value tagged with 6)
> can appear anywhere other than the top level of the CBOR structure.  I
> would expect by the nature of CBOR it can be, but section 2 isn't clear
> about it.
> 
> Dale