[Cbor] Re: Early allocation for packed CBOR (Re: Reminder: CBOR WG Virtual Meeting on 2024-12-11)

Vadim Goncharov <vadimnuclight@gmail.com> Sun, 15 December 2024 03:32 UTC

Date: Sun, 15 Dec 2024 06:32:39 +0300
From: Vadim Goncharov <vadimnuclight@gmail.com>
To: Michael Richardson <mcr+ietf@sandelman.ca>
Message-ID: <20241215063239.21033529@nuclight.lan>
In-Reply-To: <29696.1733935516@obiwan.sandelman.ca>
CC: Michael Jones <michael_b_jones@hotmail.com>, CBOR <cbor@ietf.org>
Subject: [Cbor] Re: Early allocation for packed CBOR (Re: Reminder: CBOR WG Virtual Meeting on 2024-12-11)
List-Id: "Concise Binary Object Representation (CBOR)" <cbor.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/sqRcp0JNWL6j9dTvQgPvb4Leywk>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor>

On Wed, 11 Dec 2024 11:45:16 -0500
Michael Richardson <mcr+ietf@sandelman.ca> wrote:

> Michael Jones <michael_b_jones@hotmail.com> wrote:
>     > A typical draft allocating tags allocates a number that you can
>     > count on one hand.  Each such tag has a human-readable
>     > description of its purpose in the registry so people know what
>     > to use them for.  
> 
> Sure.
> 
> But, CBOR packed isn't the same as, for instance, RFC9164 (v4/v6

Yes. And what's more, it's not *justified*.

> tag), or even for RFC9277 which allocated 64K tags to match all
> future (CoAP) Content-Format values.   RFC9277 explicitly encouraged
> use of 4-byte (1+4) tags, mind you.

There is no objection to the 4-byte range.

>     > I'd suggest that a reasonable maximum number of tags for a
>     > draft to allocate for its purposes is on the order of ten.  
> 
> a normal draft, sure.
> CBOR Packed is infrastructure for use by other applications.
> It's essentially a significant update to STD94.

Such an update should not grab such a big number of tags from
applications while performing notoriously badly at its goal. Such an
infrastructure update should use the unused codes from the main
specification. For example, an excerpt from my CBAR (CBOR & generic BLOB
by-Atom Reduc) idea:

   Internal pass: for CBOR, use the unused codes 28..30 in each Major Type.
   For integers, it's a shorter one (e.g. 3-byte); for each string type,
   prefix it with the atom length as prescribed by CBOR; for tags, put common
   ones like stringref; and in special (Major Type 7), make a generic escape
   mechanism:

   e.g. 0xfc to escape one byte
   0xfd - extended escape, and e.g. for atoms in middle of strings

  FC - copy 2+ bytes literal
  FD - dispense (decompress) atom
  FE - escape a single byte

and all 0xFx codes switch off CBOR, going to BLOB mode
[...]
  * for BLOB media type, initialize "remaining_bytes" to entire length, and go
    to INBLOB state
  * in CBOR state every opcode is allowed, and special one for strings emits
    CBOR string prefix, then sets remaining_bytes to string's length
  * in BLOB state, only FC/FD/FE allowed, any other MUST throw error, and
    decrement remaining_bytes for every emitted piece
  * when remaining_bytes becomes zero, go back to CBOR state
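
The state rules above can be sketched roughly as follows. This is only an
illustration under loud assumptions, not the CBAR format itself: the operand
layouts (a 1-byte count after FC, a 1-byte atom index after FD, one literal
byte after FE, a 1-byte length after 5E) and the names run_blob_state,
expand_short_string and atoms are hypothetical; the excerpt does not define
them.

```python
OP_COPY, OP_ATOM, OP_ESCAPE = 0xFC, 0xFD, 0xFE  # codes from the list above


def cbor_head(major_type, length):
    """Standard CBOR head for lengths below 256 (enough for the 5E case)."""
    if length < 24:
        return bytes([(major_type << 5) | length])
    return bytes([(major_type << 5) | 24, length])


def run_blob_state(data, pos, remaining_bytes, atoms):
    """BLOB state: accept only FC/FD/FE until remaining_bytes reaches zero.

    Returns (emitted bytes, position where the CBOR state resumes).
    """
    out = bytearray()
    while remaining_bytes > 0:
        if pos + 1 >= len(data):
            raise ValueError("input ended inside BLOB state")
        op, arg = data[pos], data[pos + 1]
        if op == OP_ESCAPE:      # FE: arg is the single literal byte
            piece, pos = bytes([arg]), pos + 2
        elif op == OP_COPY:      # FC: arg is an ASSUMED 1-byte count (2+)
            piece, pos = data[pos + 2:pos + 2 + arg], pos + 2 + arg
            if arg < 2 or len(piece) != arg:
                raise ValueError("bad FC literal")
        elif op == OP_ATOM:      # FD: arg is an ASSUMED 1-byte atom index
            if arg >= len(atoms):
                raise ValueError("unknown atom")
            piece, pos = atoms[arg], pos + 2
        else:                    # any other opcode MUST throw an error
            raise ValueError(f"opcode {op:#04x} not allowed in BLOB state")
        if len(piece) > remaining_bytes:
            raise ValueError("piece overruns remaining_bytes")
        out += piece
        remaining_bytes -= len(piece)  # decrement for every emitted piece
    return bytes(out), pos             # zero reached: back to CBOR state


def expand_short_string(data, pos, atoms):
    """CBOR state, opcode 5E: emit the string prefix, then enter BLOB state."""
    if data[pos] != 0x5E:
        raise ValueError("expected opcode 5E")
    length = data[pos + 1]             # ASSUMED single-byte length operand
    body, pos = run_blob_state(data, pos + 2, length, atoms)
    return cbor_head(2, length) + body, pos
```

For a BLOB media type, the same run_blob_state would be called directly with
pos 0 and remaining_bytes set to the entire length. With atoms = [b"hello"],
the input 5E 08 FD 00 FE 20 FC 02 68 69 expands to the CBOR byte string head
0x48 followed by "hello hi".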

so, for strings:

  5C - 2+ byte length (varint) BLOB state inside
  5D - decompress one atom
  5E - single byte length BLOB state inside

and same for UTF-8 text strings (7C/7D/7E)
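
To show where 5C/5D/5E, 7C/7D/7E and FC/FD/FE come from: a CBOR initial byte
is the major type in the top 3 bits and the additional information in the low
5 bits, and additional information values 28..30 are the unassigned ones. A
small sketch (the function name reserved_initial_bytes is just for
illustration):

```python
UNUSED_ADDITIONAL_INFO = (28, 29, 30)  # unassigned in the CBOR initial byte


def reserved_initial_bytes(major_type):
    """Initial bytes with additional information 28..30 for a major type."""
    if not 0 <= major_type <= 7:
        raise ValueError("major type must be 0..7")
    return [(major_type << 5) | ai for ai in UNUSED_ADDITIONAL_INFO]


for major_type, name in ((2, "byte string"), (3, "text string"), (7, "special")):
    codes = " ".join(f"{b:02X}" for b in reserved_initial_bytes(major_type))
    print(f"major type {major_type} ({name}): {codes}")
```

This prints 5C 5D 5E for byte strings, 7C 7D 7E for text strings and
FC FD FE for major type 7, matching the codes used above.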

-- 
WBR, @nuclight