Re: [Cbor] Unusual map labels, dCBOR and interop

Wolf McNally <wolf@wolfmcnally.com> Thu, 28 March 2024 20:33 UTC

From: Wolf McNally <wolf@wolfmcnally.com>
Date: Thu, 28 Mar 2024 13:33:25 -0700
In-Reply-To: <6AA6FFE2-FEC6-4E67-B1DA-B598E68C76A7@island-resort.com>
Cc: Carsten Bormann <cabo@tzi.org>, "cbor@ietf.org" <cbor@ietf.org>, Christopher Allen <christophera@lifewithalacrity.com>, Shannon Appelcline <shannon.appelcline@gmail.com>
To: "lgl island-resort.com" <lgl@island-resort.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/t3VH0MZ9hp8TblX3kA0dLWUYkfc>

LL,

I think you make good points. And there is at least some precedent within the existing dCBOR I-D for specifying restrictions that don’t derive entirely from its primary goal of determinism. One example is limiting the range of valid negative integers, which was specified toward a secondary goal of compatibility with prevalent CPU architectures, and hence simpler implementation and use.

That said, there is currently one case where Blockchain Commons specifications not only allow for, but rely on, generally complex map keys: Gordian Envelope. Envelope is an enumerated type with several cases, each of which requires a unique discriminator. In the Gordian Envelope specification I-D, §3.4, we define the `assertion` case as a single-entry map, where the key-value pair corresponds to a semantic predicate-object pair, and where either field is itself an envelope, recursively. 

https://www.ietf.org/archive/id/draft-mcnally-envelope-06.html#section-3.4

The presence of this single-entry map is the discriminator for the `assertion` case, and this is part of Envelope’s design that facilitates arbitrarily deep metadata. Obviously, a single-entry map requires no sorting of keys (the deterministic ordering of assertions in an envelope is based on sorting digests of entire assertions, not map keys), so technically we did not need to use a map here. But the alternatives were either to register a special-purpose CBOR tag (low-numbered tag space is desirable but scarce), or to use a separate discriminator such as an array with a specified form, for example a leading constant integer (adding complexity in our implementation, as we are already using CBOR arrays to distinguish our `node` case). Since dCBOR/CDE/CBOR admit complex map keys, we saw no reason not to use a single-entry map for this as a parsimonious solution.
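
To make the discriminator idea concrete, here is a minimal Python sketch, not taken from any Blockchain Commons code. It hand-encodes a single-entry CBOR map and tells the `assertion` and `node` cases apart by CBOR major type alone. The function names (`encode_head`, `encode_text`, `encode_assertion`, `classify`) are hypothetical, and plain text strings stand in for the predicate and object, which in a real envelope would themselves be envelopes with their own encoding.

```python
def encode_head(major: int, arg: int) -> bytes:
    """Encode a CBOR head (major type + argument) in shortest form."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([(major << 5) | info]) + arg.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_text(s: str) -> bytes:
    """Encode a CBOR text string (major type 3)."""
    data = s.encode("utf-8")
    return encode_head(3, len(data)) + data


def encode_assertion(predicate: bytes, obj: bytes) -> bytes:
    """Wrap two already-encoded items as a single-entry map (major type 5).

    The key is the predicate and the value is the object. Either may be
    any encoded CBOR item, including another single-entry map.
    """
    return encode_head(5, 1) + predicate + obj


def classify(item: bytes) -> str:
    """Discriminate by major type only (assumption: illustrative subset)."""
    major = item[0] >> 5
    if major == 5:
        if item[0] != 0xA1:
            raise ValueError("assertion must be a map with exactly one entry")
        return "assertion"
    if major == 4:
        return "node"
    return "other"


simple = encode_assertion(encode_text("knows"), encode_text("Bob"))
# A map used as a map key: the predicate is itself a single-entry map.
nested = encode_assertion(simple, encode_text("since 2020"))
```

Because there is only one key, no key ordering or duplicate detection ever comes into play, even when the key is itself a map as in `nested`.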

~ Wolf

> On Mar 28, 2024, at 12:55 PM, lgl island-resort.com <lgl@island-resort.com> wrote:
> 
>> 
>> On Mar 27, 2024, at 6:25 PM, Wolf McNally <wolf@wolfmcnally.com> wrote:
>> 
>> LL,
>> 
>>> On Mar 24, 2024, at 12:21 PM, lgl island-resort.com <lgl@island-resort.com> wrote:
>>> 
>>> For dCBOR, I’d like to see map labels very restricted. I’m not a JSON user/expert, but it is a Jupiter-sized data point for me. It has massive use with only string labels. 
>>> 
>>> I’ll propose integers and strings (major types 0-3) to start discussion. Maps and arrays seem too much. Only strings might be OK.
>> 
>> I’m not sure whether the issue you’re having is coming from a place of design advice, or implementation complexity. From a protocol design perspective, in the vast majority of cases I agree that using integers or strings for map keys makes the most sense. From an implementation perspective, our code simply incrementally manages maps by using the serialized CBOR of a key as the “real” key and keeps the map in sorted order, even though it hides this fact from the user. This lets us skip the sorting step during serialization, and facilitates the validation step during deserialization. It also lets us support arbitrary map keys per the CBOR spec. Obviously this isn’t the only approach that keeps the serialized map sorted and its keys unique while also supporting complex keys.
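
The approach described in the quoted paragraph above can be sketched roughly as follows in Python. This is an illustration under stated assumptions, not the actual dCBOR library code: `SortedMap`, `encode_head`, and `is_strictly_ascending` are hypothetical names, keys and values are passed in already encoded, and the ordering used is the bytewise lexicographic order of the encoded keys, as in the core deterministic encoding rules of RFC 8949.

```python
import bisect


def encode_head(major, arg):
    """Encode a CBOR head (major type + argument) in shortest form."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([(major << 5) | info]) + arg.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


class SortedMap:
    """Map whose "real" key is the encoded CBOR of the user's key."""

    def __init__(self):
        self._keys = []    # encoded keys, kept in bytewise-sorted order
        self._values = {}  # encoded key -> encoded value

    def insert(self, enc_key, enc_value):
        # A repeated key replaces the value, so keys stay unique.
        if enc_key not in self._values:
            bisect.insort(self._keys, enc_key)
        self._values[enc_key] = enc_value

    def serialize(self):
        # Keys are already in order, so no sorting step is needed here.
        out = bytearray(encode_head(5, len(self._keys)))
        for key in self._keys:
            out += key + self._values[key]
        return bytes(out)


def is_strictly_ascending(enc_keys):
    """Decoder-side check: sorted order with no duplicates."""
    return all(a < b for a, b in zip(enc_keys, enc_keys[1:]))


m = SortedMap()
m.insert(b"\x0a", b"\x01")      # key 10
m.insert(b"\x61\x61", b"\x02")  # key "a"
m.insert(b"\x01", b"\x03")      # key 1
m.insert(b"\x81\x01", b"\x04")  # key [1], an array used as a key
encoded = m.serialize()
```

Since the comparison is on encoded bytes, an array or map key is handled exactly like an integer or string key; the container never needs to understand the key's structure.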
> 
> Wolf, thank you kindly for taking the time to write a clear explanation. Makes the discussion converge faster (for me anyway).
> 
> I don’t have an issue. Just thought it was something worth discussing. Here are some reasons.
> 
> Implementation complexity — To implement non-scalar map labels, you’d probably do so in an OO environment where the label is an object. That’s a little out of line with the embedded goals of CBOR. Also, implementing maps as map labels in my embedded-oriented QCBOR project wouldn’t really work (but QCBOR is just one data point).
> 
> Alignment with JSON — Seems like lots of CBOR protocols will get translated to/from JSON because so much of the world runs on JSON. For example, I pushed hard for EAT to be realizable in both CBOR and JSON because I thought the attestation ecosystem would benefit. I was supported by the RATS WG to complete this. CDDL has been evolved to support CBOR+JSON better.
> 
> Simplicity/adoption — I think JSON succeeded because it is so simple. CBOR is handicapped in comparison. Seems that maps as map labels is something you do to show how clever you are, not because it is good design or gives expressive power not otherwise available.
> 
> 
>>> I can imagine many use cases other than the Gordian Envelope finding dCBOR highly desirable because it eliminates all the CBOR stuff that is variable and a lot of stuff that is hard to understand. Most people aren’t as smart or as into this stuff as we are. Simple map labels line up with this.
>> 
>> Again, the goal of dCBOR wasn’t “simple CBOR,” although it has some of that flavor, but “deterministic CBOR” as its name suggests, and I’ve kept that foremost in the design choices I made in its specification. I can see why someone might define an “sCBOR” protocol that further restricts dCBOR, but inasmuch as dCBOR meets its original goals, I don’t see the point of restricting it further.
> 
> I have a personal view that some sort of “simple CBOR” would be good for the CBOR ecosystem because I see smart IETFers often confused about serialization, determinism and related topics. If they’re confused, think about the masses of IT people.
> 
> dCBOR seems like it could serve both that and Gordian.
> 
> This is not something I personally want to put a lot of energy into especially if there isn’t big WG consensus and it is a push up hill. Just thought it was worth some discussion, clarification and consideration.
> 
> LL