Re: [Cbor] Interactions of packed CBOR and tags

Carsten Bormann <cabo@tzi.org> Thu, 03 September 2020 16:53 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <016f01d6820b$bc7d7cc0$35787640$@augustcellars.com>
Date: Thu, 3 Sep 2020 18:53:23 +0200
Cc: Brendan Moran <Brendan.Moran@arm.com>, cbor@ietf.org
Message-Id: <62FEE35D-75F3-422E-A6C0-FFE86ACBD9A5@tzi.org>
References: <00c101d67cb5$2588b790$709a26b0$@augustcellars.com> <E30F54B6-1A63-48AC-89AE-61983654B5A9@tzi.org> <00cc01d67cc9$766c7b60$63457220$@augustcellars.com> <4AE9B2FA-EEB3-4B45-96E4-9DC85118567D@arm.com> <016f01d6820b$bc7d7cc0$35787640$@augustcellars.com>
To: Jim Schaad <ietf@augustcellars.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/lbNEiphYLweuDnJrElaaYcT7Xkk>
Subject: Re: [Cbor] Interactions of packed CBOR and tags

On 2020-09-03, at 18:03, Jim Schaad <ietf@augustcellars.com> wrote:
> 
> 6([ "www.", "merch", "datatracker"], [".ietf.org"], 224(225(simple(2))), 224(226(simple(2)))]
> 
> Where we have extracted both prefix and postfix strings to maximize the amount of commonality.

I like the idea of adding suffix packing.

In the current proposal, to distinguish shared items from prefix packing, we have

- a separate table (and thus number space), which makes sense if prefixes are rarely used as shared items and vice versa (using a prefix as a shared item just requires prefixing it to the empty item, though);
- separate referents (one with no content, one with the content the prefix is prefixed to).

Adding suffix packing, we could have a third table and a third set of referents (also with content).  We also could have just separate referents, sharing the table/number space, if that makes sense.
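
To make the two options concrete, here is a minimal Python sketch of how an unpacker might resolve the three kinds of referents. It is an illustration only: the class names Shared, Prefix and Suffix, the function unpack, and the example table contents are all hypothetical, and nothing here reflects the actual tag or simple-value encoding in the draft.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Shared:
    """Hypothetical referent with no content: stands for a table entry as is."""
    index: int


@dataclass
class Prefix:
    """Hypothetical referent with content: table entry + content."""
    index: int
    content: Any


@dataclass
class Suffix:
    """Hypothetical referent with content: content + table entry."""
    index: int
    content: Any


def unpack(item, shared, prefixes, suffixes):
    """Recursively replace referents by the data they stand for."""
    if isinstance(item, Shared):
        return unpack(shared[item.index], shared, prefixes, suffixes)
    if isinstance(item, Prefix):
        return prefixes[item.index] + unpack(item.content, shared, prefixes, suffixes)
    if isinstance(item, Suffix):
        return unpack(item.content, shared, prefixes, suffixes) + suffixes[item.index]
    if isinstance(item, list):
        return [unpack(x, shared, prefixes, suffixes) for x in item]
    return item


# Option 1: three separate tables, each with its own number space.
prefixes = ["www", "merch", "datatracker"]
suffixes = [".ietf.org"]
host = unpack(Suffix(0, Prefix(2, "")), [], prefixes, suffixes)
# host == "datatracker.ietf.org"

# A prefix used as a plain shared item: prefix it to the empty item.
plain = unpack(Prefix(0, ""), [], prefixes, suffixes)
# plain == "www"

# Option 2: one table and number space, still with separate referents.
table = ["www", ".ietf.org"]
host2 = unpack(Suffix(1, Prefix(0, "")), table, table, table)
# host2 == "www.ietf.org"
```

With separate tables, each kind of referent gets its own run of short indices; with one shared table, any entry can serve in any role, but all three roles compete for the same short indices.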

We could also rethink the whole separation of shared items from prefix/suffix items in this process.

As Brendan says, being able to optimize the tables so the frequently used items land in the areas that have short referents is important.

[Apologies for using the term referent, which isn’t even in my dictionaries with the meaning I’m using it for (the thing that references).  But that’s the usage I learned when I learned about them in the 1970s… [1]]

Making prefix/suffix packing well-defined for arrays isn’t hard.  For maps, there is no difference between prefix and suffix (assuming we are always handling full key/value pairs; just doing a single key or a value can be done by sharing items).
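
As a rough Python sketch of that point (the helper names concat, apply_prefix and apply_suffix are hypothetical, and rejecting duplicate map keys is an assumption of this sketch, not something settled here): for strings and arrays the order of concatenation matters, while for maps of full key/value pairs both directions give the same map.

```python
def concat(left, right):
    """Join two items of the same kind: strings, byte strings, arrays, or maps."""
    if isinstance(left, dict) and isinstance(right, dict):
        # Assumption: overlapping keys are treated as an error.
        if left.keys() & right.keys():
            raise ValueError("duplicate map keys")
        return {**left, **right}
    if type(left) is not type(right) or not isinstance(left, (str, bytes, list)):
        raise TypeError("cannot concatenate these items")
    return left + right


def apply_prefix(entry, content):
    """Table entry goes in front of the content."""
    return concat(entry, content)


def apply_suffix(entry, content):
    """Table entry goes after the content."""
    return concat(content, entry)


# Arrays: prefix and suffix differ.
front = apply_prefix([1, 2], [3])   # [1, 2, 3]
back = apply_suffix([1, 2], [3])    # [3, 1, 2]

# Maps: both directions yield the same map.
same = apply_prefix({"a": 1}, {"b": 2}) == apply_suffix({"a": 1}, {"b": 2})
```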

I don’t think we want to address what would be “deterministic packing”.

Grüße, Carsten

[1]: https://scholar.google.com/scholar?hl=en&q=Ross%2C+D.+T.%2C+%22Uniform+Referents%3A+An+Essential+Property+for+a+Software+Engineering+Language.%22+Paper+presented+at+the+3rd+International+Symposium+on+Computer+and+Information+Sciences+%28COINS-69%29%2C+Miami+Beach%2C+Florida%2C+18%2D%2D20+December+19691969.