[Cbor] Re: Determinism of ordered maps / arbitrary tags (Was: Two dup detection questions)

Laurence Lundblade <lgl@island-resort.com> Thu, 04 December 2025 05:13 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/9pcbdTDiVi60PgdqSFQemJKcS7w>

It was good to get this out on the table and clear. I plan to add this to the serialization document:

In the basic generic data model, maps are unordered (see RFC 8949, Section 5.6). Applications MUST NOT rely on any particular key ordering, even if the data was produced using deterministic serialization. A CBOR library is not required to preserve the order of keys when decoding a map, and the underlying programming language may not preserve map order either; for example, Go provides no ordering guarantees for maps. The sole purpose of map sorting in deterministic serialization is to ensure reproducibility of the encoded byte stream, not to provide any semantic ordering of map entries. If an application requires a map to be ordered, it is responsible for applying its own sorting.
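
To illustrate the last sentence, here is a minimal sketch in Go of an application applying its own ordering after decoding. It is not tied to any CBOR library: the `decoded` map and the `sortedKeys` helper are hypothetical names, and the map simply stands in for whatever a decoder might hand back. Plain string comparison is assumed as the application's chosen order.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending string order.
// The ordering is an application choice made after decoding; it is
// unrelated to the key order used by deterministic serialization.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Hypothetical result of decoding a CBOR map. Go does not
	// guarantee any iteration order for this value.
	decoded := map[string]int{"shoe_size": 43, "age": 30, "height_cm": 180}

	// Iterate in the application's own order, not the map's.
	for _, k := range sortedKeys(decoded) {
		fmt.Println(k, decoded[k])
	}
}
```

An application that needs a different order (for example, a locale-specific collation) would swap in its own comparison at the sort step; the decoded map itself stays unordered.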

LL



> On Dec 3, 2025, at 1:21 PM, Vadim Goncharov <vadimnuclight@gmail.com> wrote:
> 
> On Tue, 2 Dec 2025 11:49:56 -0800
> Laurence Lundblade <lgl@island-resort.com> wrote:
> 
>> Q: Why does dup detection operate on data model values (and sorting doesn’t)?
>> 
>> I think the answer is that the application makes use of the map keys in a
>> critical way — to distinguish data items like a person’s last name from
>> their shoe size. The application probably doesn’t have access to the
>> serialized form of the map key. It doesn’t want access to it either — the
>> point of CBOR is to serialize/deserialize data so applications don’t have to
>> worry about it.
>> 
>> This contrasts to map sorting where the important thing is that they are in
>> the same order, not so much what that order actually is.
> 
> Well, strictly speaking, an application may want its own sorting order, e.g.
> due to the collation selected, so this applies to sorting too, not only to dup
> detection. It is just hard to imagine an application depending directly on map
> sort order in *serialized* form, for it to be important, as maps are declared
> "unordered" (perhaps some constrained/streaming impls which do not have
> resources for sorting in memory?).
> 
> However, if/when we discuss extensions like ordered maps (or, more
> generally, tables), then questions of collation and sort order become
> meaningful to (deterministic) serialization: when the contents of such a data
> structure are in a tag instead of a bstr, the determinism (or not) of this
> inner part "infects" the evaluation of whether the "outer" part is
> deterministic or not. I.e. this will be significant to the GOAL of that
> deterministic encoding, the purpose for which it is deterministic (comparison,
> hashing, ...), because the underlying array may well have a deterministic
> encoding under the more general/relaxed rules of the outer part.
> 
> This kinda parallels the NaN tag bstr situation, but there wrapping to bstr
> was simple; and for determinism rules of arbitrary tags I don't know.
> 
> -- 
> WBR, @nuclight
> 