[Cbor] Re: Early allocation for packed CBOR (Re: Reminder: CBOR WG Virtual Meeting on 2024-12-11)

Vadim Goncharov <vadimnuclight@gmail.com> Sun, 15 December 2024 03:24 UTC

Date: Sun, 15 Dec 2024 06:24:00 +0300
From: Vadim Goncharov <vadimnuclight@gmail.com>
To: Marco Tiloca <marco.tiloca=40ri.se@dmarc.ietf.org>
CC: Carsten Bormann <cabo@tzi.org>, Joe Hildebrand <hildjj@cursive.net>, Michael Jones <michael_b_jones@hotmail.com>, CBOR <cbor@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/_MtcOMiGF_D3ZeOn8meCElIXO-k>

On Wed, 11 Dec 2024 16:19:29 +0100
Marco Tiloca <marco.tiloca=40ri.se@dmarc.ietf.org> wrote:

> Hi all,
> 
> Personally, I don't have an issue with the idea of (early) allocating 
> such a large number of tags per se.
> 
> We are not running out of tag numbers, and even consuming one fourth

We are running out of them (and of simple values) in the 0..23 range, for
no real benefit.
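
To illustrate why the 0..23 range is the scarce one, here is a minimal
sketch of the head-size rule from RFC 8949, Section 3. The helper name
`head_len` is hypothetical, chosen only for this illustration. A tag
number below 24 fits in the initial byte, so the tag costs one byte;
anything from 24 to 255 already costs two.

```python
def head_len(n: int) -> int:
    """Bytes used by a CBOR head (initial byte plus argument) for argument n."""
    if n < 0:
        raise ValueError("argument must be non-negative")
    if n < 24:
        return 1          # argument packed into the initial byte
    if n < 0x100:
        return 2          # initial byte + 1-byte argument
    if n < 0x10000:
        return 3          # initial byte + 2-byte argument
    if n < 0x100000000:
        return 5          # initial byte + 4-byte argument
    if n < 0x10000000000000000:
        return 9          # initial byte + 8-byte argument
    raise ValueError("argument does not fit in 64 bits")


# Tag numbers on either side of the 1-byte boundary.
for tag in (6, 23, 24, 25, 256):
    print(f"tag {tag}: {head_len(tag)} byte(s)")
```

Only 24 tag numbers can ever be encoded in a single byte, which is why
allocations in that range are contested.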

> of the remaining precious 1+1 tag numbers is in my opinion worth the 
> achieved compression result (as Carsten's exercise also showed in his 
> previous mail).

The problem is that cbor-packed does not achieve good compression at all.
For the 2017 CBOR source of 1210 bytes, the tag 25/256 method achieves 904
bytes (note that it is forced to spend 3 bytes on even the shortest
reference, in contrast to cbor-packed cheating with a single byte), and
cbor-packed claims 793 for the same method and 564 with prefixes, while
even limited LZF achieves 425..435 bytes (depending on key order) and
less limited LZ4 variants get 404..416 bytes.
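
For a rough side-by-side view, the sketch below only turns the byte counts
quoted above into percentages of the 1210-byte original; it does not run
any compressor. The names `ORIGINAL`, `SIZES` and `ratio` are hypothetical
and exist only for this illustration. It also spells out the 3-byte
shortest reference of the tag 25 method: a two-byte tag head (0xd8 0x19)
followed by a one-byte index.

```python
ORIGINAL = 1210  # bytes, uncompressed CBOR source quoted in the message

# Byte counts as quoted in the message; ranges are kept as (low, high).
SIZES = {
    "tag 25/256": (904, 904),
    "cbor-packed, same method": (793, 793),
    "cbor-packed, with prefixes": (564, 564),
    "LZF (limited)": (425, 435),
    "LZ4 variants": (404, 416),
}


def ratio(size: int) -> float:
    """Compressed size as a percentage of the original, one decimal place."""
    return round(size / ORIGINAL * 100, 1)


for name, (low, high) in SIZES.items():
    if low == high:
        print(f"{name}: {low} bytes = {ratio(low)}% of original")
    else:
        print(f"{name}: {low}..{high} bytes = {ratio(low)}..{ratio(high)}% of original")

# Shortest possible reference with tag 25: tag head 0xd8 0x19, then index 0.
shortest_ref = bytes([0xD8, 0x19, 0x00])
print(f"shortest tag 25 reference: {len(shortest_ref)} bytes")
```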

I'll try to manually compress with my CBAR idea tomorrow.

Have we ever seen *real* benchmarks for cbor-packed? It seems the whole
proposal should be withdrawn.

> I also don't perceive the fundamental construct of packed CBOR as an 
> abuse of the tag concept.

But, given the context, it clearly is.

-- 
WBR, @nuclight