Re: [Cbor] Packing CBOR (draft-bormann-cbor-packed-00)

worley@ariadne.com Wed, 29 July 2020 02:53 UTC

From: worley@ariadne.com
To: Carsten Bormann <cabo@tzi.org>
Cc: cbor@ietf.org
In-Reply-To: <341A1839-3F18-4DEC-9D31-37B8B52EE7F5@tzi.org> (cabo@tzi.org)
Sender: worley@ariadne.com
Date: Wed, 29 Jul 2020 02:53:14 -0400
Message-ID: <87bljyzuat.fsf@hobgoblin.ariadne.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/I4BF5-KN4cSRA_WUQ5XDnl9OWIU>
Subject: Re: [Cbor] Packing CBOR (draft-bormann-cbor-packed-00)

After a very quick look, these points about the draft stand out.  The
abstract contains:

   While traditional data compression techniques such as
   DEFLATE (RFC 1951) work well for CBOR, their disadvantage is that the
   receiver needs to unpack the compressed form to make use of data.

   This specification describes Packed CBOR, a simple transformation of
   a CBOR data item into another CBOR data item that is almost as easy
   to consume as the original CBOR data item.

The first sentence is probably not exactly what you want to say, because
of course, in order to make use of the content of any "compressed form"
you first have to "unpack" it, in that you have to reverse whatever the
compression was.  I *think* what you mean is that in order to make use
of even a small portion of the data, the entire prefix from the start to
that portion of the data must be unpacked.
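
To illustrate the prefix point, here is a minimal sketch in Python.  It
assumes the standard zlib module (which wraps DEFLATE in the zlib
container of RFC 1950 rather than emitting raw DEFLATE); the function
name read_at is hypothetical.  To read a few bytes at a given offset,
everything before that offset has to be decompressed first:

```python
import zlib


def read_at(compressed: bytes, offset: int, length: int) -> bytes:
    """Return `length` bytes starting at `offset` of the original data.

    There is no way to seek inside the compressed stream, so the whole
    prefix up to offset + length is decompressed and then sliced.
    """
    decompressor = zlib.decompressobj()
    # max_length caps the output, so decompression stops once the
    # wanted region has been produced, but never skips the prefix.
    prefix = decompressor.decompress(compressed, offset + length)
    return prefix[offset:offset + length]


original = bytes(range(256)) * 100
compressed = zlib.compress(original)
assert read_at(compressed, 5000, 10) == original[5000:5010]
```

The cost of read_at grows with the offset, not with the amount of data
actually wanted.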

The second sentence also seems not to state what you care about.  I can
define "a simple transformation of a CBOR data item into another CBOR
data item" that is *just as easy* to consume as the original:  the
identity transformation.  I *think* what you mean is that the packed
form can (in many cases) be significantly smaller than the original
form, and yet "random access" of individual items in the packed form is
only constant-factor-slower than it is in the original form.

Note that I'm depending on "random access" to mean extracting an item
when the indexes into the nested arrays and the keys of the nested maps
are provided.
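
As a minimal sketch of what I mean by "random access", assuming the
CBOR item has already been decoded into Python lists and dicts (the
name get_path and the sample data are hypothetical):

```python
from typing import Any, Sequence, Union

Step = Union[int, str]


def get_path(item: Any, path: Sequence[Step]) -> Any:
    """Follow array indexes and map keys from the root down to one item."""
    for step in path:
        # An int step indexes into an array, any other step is a map key.
        item = item[step]
    return item


doc = {"sensors": [{"id": 1, "unit": "C"}, {"id": 2, "unit": "F"}]}
assert get_path(doc, ["sensors", 1, "unit"]) == "F"
```

The cost is one lookup per path step, independent of how much other
data the structure contains.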

Note that this isn't just nit-picking: you're explaining the
characteristics that differentiate Packed CBOR, that is, the reason for
this work.

The other thing that stands out is that the document is very terse.
It promises to define the unpacking operation, but that definition is
largely implicit.  E.g., 2.1 says:

   Shared items are stored in the third to last element of the array
   used as tag content for tag number 6, numbered starting by 2.

   The shared data items are referenced by using the data items in
   Table 1.  When reconstructing the original data item, such a
   reference is replaced by the referenced data item, which is then
   recursively unpacked.

I *think* what this means is, "Unpacking a packed-CBOR object is done by
starting with the rump structure and replacing references within it with
copies of the referenced structure.  Within the rump structure, tag 6
(which I think is CDDL #6.6) is used not to designate a packed-CBOR
structure but rather a reference to one of the shared structures.  The
tagged value in such a reference is either a simple value or an integer,
interpreted according to Table 1.  Implicitly, each reference is
replaced by a copy of the referenced shared value.  The shared
structures may themselves contain uses of tag 6, whose values are
packed-CBOR structures."

The difference in the above description is that it is explicit about the
operations that convert the packed-CBOR structure into its unpacked
form.
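
To show the level of explicitness I have in mind, here is a minimal
sketch of that procedure in Python.  It is my reading, not the draft's
definition: the Ref class is a hypothetical stand-in for whatever
Table 1 encodes (simple values or tag 6 with an integer), the shared
items are modeled as a plain list, and decoded CBOR arrays and maps are
modeled as Python lists and dicts.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for a reference to shared item `index`."""
    index: int


def unpack(item: Any, shared: List[Any]) -> Any:
    """Replace every reference by an unpacked copy of its shared item."""
    if isinstance(item, Ref):
        # The referenced item may itself contain references, so recurse.
        return unpack(shared[item.index], shared)
    if isinstance(item, list):
        return [unpack(element, shared) for element in item]
    if isinstance(item, dict):
        return {unpack(k, shared): unpack(v, shared) for k, v in item.items()}
    return item  # scalars are returned unchanged


shared = ["https://example.com/", ["a", Ref(0)]]
rump = {"x": Ref(1), "y": [Ref(0), Ref(1)]}
assert unpack(rump, shared) == {
    "x": ["a", "https://example.com/"],
    "y": ["https://example.com/", ["a", "https://example.com/"]],
}
```

Note that this sketch does nothing about a shared item that refers to
itself, directly or indirectly; it would simply recurse until Python
raises RecursionError.  Whether such references are allowed is another
thing the text could state.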

Also, it's not clear to me whether a packed-CBOR (a value tagged with 6)
can appear anywhere other than the top level of the CBOR structure.  I
would expect by the nature of CBOR that it can, but section 2 isn't
clear about it.

Dale