[Cbor] Re: Early allocation for packed CBOR (Re: Reminder: CBOR WG Virtual Meeting on 2024-12-11)

Carsten Bormann <cabo@tzi.org> Wed, 11 December 2024 01:40 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <5FEA5C07-4A39-4B58-B2AE-F261D111FCE6@cursive.net>
Date: Wed, 11 Dec 2024 02:40:07 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <D0618F67-4868-4745-A526-F73DF1A98E1B@tzi.org>
References: <CALaySJKDFscUBGw4CPspXJvUTkXywVHc_FrmhO3ybBWTrwjGXw@mail.gmail.com> <CALaySJJ8-M9x8irtmF2pfDE3GRXU1am9n2a3XeDcmPT+kww+KA@mail.gmail.com> <CALaySJKTQT_9CC-wVVd+fY1NYJ73M8CP22hn=rWrFeTJSJDEsA@mail.gmail.com> <CALaySJKG3oagg6ffLTx8LgvLvnjHHA2DMGgY74E0q=rReAc4PA@mail.gmail.com> <CALaySJLtUR1=G_WH4H+zoJ5LCrHjBgEf1oW104zDtFQighY+gg@mail.gmail.com> <CALaySJLnKxU9m3BNPq4XayrSrorRBG2vuBz1AF-CsEBoSZe7Xg@mail.gmail.com> <CALaySJKaz7C=GN5E=saiDY4KxL+9xCfM0ocZuMStEQ96FnQ4KA@mail.gmail.com> <CALaySJJEXkey9vLAp8VqDXmPsWpxiWN9jjtVnGio1nMQ4K+mDQ@mail.gmail.com> <CALaySJJfc+tET4Vm5UQjHPK5mf61O0iR-1i6=X32CYtWxZLWTQ@mail.gmail.com> <CALaySJKdrk7aPzhT=kbE1B8pq1EBw74nmx_peSJMAoHsG5jyVQ@mail.gmail.com> <CALaySJ+fWX4zEnE5v-Q9R6eCv=kSJjnc-fsXL5PGPgac1GJAcA@mail.gmail.com> <B807C9D3-39A4-4024-BC1D-85DD84EA1735@tzi.org> <DFE56705-CCDD-4172-B577-C873E3DB4898@tzi.org> <5FEA5C07-4A39-4B58-B2AE-F261D111FCE6@cursive.net>
To: Joe Hildebrand <hildjj@cursive.net>
CC: CBOR <cbor@ietf.org>
X-Mailman-Version: 3.3.9rc6
Precedence: list
Subject: [Cbor] Re: Early allocation for packed CBOR (Re: Reminder: CBOR WG Virtual Meeting on 2024-12-11)
List-Id: "Concise Binary Object Representation (CBOR)" <cbor.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/jm4yDojZkpSJ6jAz_NGPXachbhM>

Hi Joe,

please get well soon!

> On 11. Dec 2024, at 02:01, Joe Hildebrand <hildjj@cursive.net> wrote:
> 
> Does anyone else think that this is a LOT of tags to allocate for one thing?  Was there no other design that could achieve the goals?

A ton of tags.  I think some 335544318 or so of them.

Section 2.5 briefly addresses this [1].

[1]: https://www.ietf.org/archive/id/draft-ietf-cbor-packed-13.html#section-2.5


Let me point out that the way this was designed simply exercises a design pattern that has developed for CBOR.

In CBOR, a tag has exactly one data item as its content.
For sharing tags, we can use that as an index into the table, so we have just one of them (number 6, in 1+0 space, where we add 16 direct sharing references from the 1+0 simple value space as well).

For argument-bearing tags, we have both the table index and the argument that need to be included.
This could be done by putting both into an array, at the cost of one byte (0x82, that is).
Including the index in the tag number is a byte-saving workaround that now has been used in many places.
It means that we allocate a range of tag numbers and provide a function that computes the index out of the tag number (usually a subtraction).
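
To make the two options concrete, here is a minimal Python sketch. The base tag number `BASE` and range size `SIZE` are made up for illustration and are not the draft's actual allocations; `head`, `index_from_tag`, `ref_in_tag_number` and `ref_in_array` are hypothetical helper names, and the encoder only covers the CBOR head (major type plus argument).

```python
import struct

def head(major: int, arg: int) -> bytes:
    """Encode a CBOR head: major type (0..7) plus unsigned argument."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for ai, fmt, limit in ((24, ">B", 1 << 8), (25, ">H", 1 << 16),
                           (26, ">I", 1 << 32), (27, ">Q", 1 << 64)):
        if arg < limit:
            return bytes([(major << 5) | ai]) + struct.pack(fmt, arg)
    raise ValueError("argument does not fit in 64 bits")

BASE = 216  # hypothetical first tag number of an allocated range
SIZE = 8    # hypothetical number of tag numbers in that range

def index_from_tag(tag: int) -> int:
    """Recover the table index from the tag number (a plain subtraction)."""
    if not BASE <= tag < BASE + SIZE:
        raise ValueError("tag number outside the allocated range")
    return tag - BASE

def ref_in_tag_number(index: int, argument: bytes) -> bytes:
    """Index folded into the tag number; tag content is the argument alone."""
    return head(6, BASE + index) + argument

def ref_in_array(tag: int, index: int, argument: bytes) -> bytes:
    """Single tag number; tag content is the array [index, argument]."""
    return head(6, tag) + head(4, 2) + head(0, index) + argument

arg = head(0, 42)                          # already-encoded argument: 42
print(head(4, 2).hex())                    # the two-element array head
print(ref_in_tag_number(3, arg).hex())
print(ref_in_array(BASE, 3, arg).hex())
print(index_from_tag(BASE + 3))
```

In this made-up setup the array form carries the 0x82 array head and the index as extra content, while the tag-number form carries the index inside a tag head it needs anyway.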


Now, there is also the question of whether we should expend the number of tags we are registering here.
CBOR-packed is not really “one thing”, but a rather versatile mechanism that can be used in many ways (often by defining application-specific table-builders), so expending some of the tag reserve for this appears to be a good investment in the future.

Here is today’s tag-report:

range  used     %                 free                total
0 1+0    13 54.17                   11                   24
1 1+1    73 31.47                  159                  232
2 1+2  1087  1.67                64193                65280
3 1+4 65539  0.00           4294836221           4294901760
4 1+8     2  0.00 18446744069414584318 18446744069414584320
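
For readers who want to check the arithmetic, the totals follow directly from the tag number ranges each encoding length can express; a small Python sketch (not the script that produced the report) recomputes the columns. The `used` counts are copied from the report above, and `RANGES` and `tag_report` are names chosen here for illustration.

```python
# (label, first tag number, last tag number, used count copied from the report)
RANGES = [
    ("1+0", 0, 23, 13),
    ("1+1", 24, 255, 73),
    ("1+2", 256, 2**16 - 1, 1087),
    ("1+4", 2**16, 2**32 - 1, 65539),
    ("1+8", 2**32, 2**64 - 1, 2),
]

def tag_report(ranges=RANGES):
    """Return (label, used, percent used, free, total) for each range."""
    rows = []
    for label, first, last, used in ranges:
        total = last - first + 1
        rows.append((label, used, round(100 * used / total, 2),
                     total - used, total))
    return rows

for i, (label, used, pct, free, total) in enumerate(tag_report()):
    print(f"{i} {label} {used:5d} {pct:5.2f} {free:20d} {total:20d}")
```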

Expending about one fifth of the free 1+1 tags (and one tenth of the free 1+0 tags), as well as 80 % (!) of the 1+0 simple values, is a sizable investment, but I would consider that worthwhile given the above.  I don’t think that about one thirteenth each of the 1+2 and 1+4 spaces is oversized at all.

Grüße, Carsten


PS.:
If we talk about numbers, we can group the tags to be allocated into table-building tags, of which we spend two (one 1+1 and one 1+2), special tags for function references, of which we spend three (1+1), and the approximately 335544318 we use for straight and inverted references (plus the one 1+0 for shared references).
Seems frugal to me :-)

Historical tidbit: 
an early form of cbor-packed was part of the initial design before we even named the format CBOR; we removed it from the proposal because that already was complex enough.  I’m quite happy that 11 years later this is still the way we want to address redundancy, but by now with a few refinements.