Re: [Cbor] Record proposal

Carsten Bormann <cabo@tzi.org> Wed, 05 May 2021 12:50 UTC


On 2021-05-05, at 13:16, Christian Amsüss <christian@amsuess.com> wrote:
> 
> (And as much as I dislike being "the person to whom everything looks
> like a nail", I'll probably ask about whether this fits in the general
> model of packed CBOR, with the first entity setting up a single table
> entry, and then the entries expanding a [] to a {}).

There are two potential aspects to the proposed tag:

* a more compact representation (which is all that cbor-packed is about)

* semantic indication that a specific kind of record is being used

Proposed Tag 105 currently does not have a place for further semantic indications, but one could be added.

By the way, cbor-packed turns the example I gave in the referenced email into

51([["value", "name"], [], [], 
   [{simple(1): "one", simple(0): 1}, 
    {simple(1): "two", simple(0): 2}, {simple(1): "three", simple(0): 3}]])

Encoding-wise, the last array looks like this:

      83                  # array(3)
         a2               # map(2)
            e1            # primitive(1)
            63            # text(3)
               6f6e65     # "one"
            e0            # primitive(0)
            01            # unsigned(1)
         a2               # map(2)
            e1            # primitive(1)
            63            # text(3)
               74776f     # "two"
            e0            # primitive(0)
            02            # unsigned(2)
         a2               # map(2)
            e1            # primitive(1)
            65            # text(5)
               7468726565 # "three"
            e0            # primitive(0)
            03            # unsigned(3)

So the overhead here is one map head and two simple values per row, i.e., three bytes.
(Of course, that assumes that one-byte simple values are still available in the greater context this is in.)
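The byte count can be checked with a small sketch. The hand-rolled encoder below is not a CBOR library and only handles heads with arguments below 24, which is all this example needs; the helper names (head, packed_row, and so on) are made up for illustration.

```python
# Minimal sketch: re-encode the last array of the packed example by hand
# and count the per-row overhead. Helper names are hypothetical.

def head(major: int, arg: int) -> bytes:
    """Encode a one-byte CBOR head (argument must be in 0..23)."""
    if not 0 <= arg < 24:
        raise ValueError("this sketch only handles arguments 0..23")
    return bytes([(major << 5) | arg])

def simple(n: int) -> bytes:
    return head(7, n)          # major type 7: simple value

def uint(n: int) -> bytes:
    return head(0, n)          # major type 0: unsigned integer

def text(s: str) -> bytes:
    data = s.encode("utf-8")
    return head(3, len(data)) + data   # major type 3: text string

def packed_row(name: str, value: int) -> bytes:
    # {simple(1): name, simple(0): value}, in the key order shown above
    return head(5, 2) + simple(1) + text(name) + simple(0) + uint(value)

rows = [("one", 1), ("two", 2), ("three", 3)]
encoded = head(4, len(rows)) + b"".join(packed_row(n, v) for n, v in rows)

# Payload is the bytes that carry the actual data items (names and values).
payload = sum(len(text(n)) + len(uint(v)) for n, v in rows)
# Subtract the outer array head, then split the remainder over the rows.
overhead_per_row = (len(encoded) - 1 - payload) // len(rows)

print(encoded.hex())
print("overhead per row:", overhead_per_row, "bytes")
```

The resulting bytes match the dump above, and the remainder per row is the map head plus the two one-byte simple values.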

Even with a form of circumfix compression (e.g., mapping tables with parameters [1]), this is hard to beat encoding-wise.
The record proposal, as is, takes four bytes per row (1+2 for the tag, 1 for the array head).
This can be optimized significantly further only by amortizing the tag over more than one row, as my “CSV style” does, but that requires homogeneity.

Grüße, Carsten

[1]: https://datatracker.ietf.org/doc/draft-bormann-lpwan-cbor-template/