Re: [Cbor] dCBOR moving from numerically-typeless systems

Wolf McNally <wolf@wolfmcnally.com> Sun, 12 March 2023 20:34 UTC

From: Wolf McNally <wolf@wolfmcnally.com>
Message-Id: <382FB8D4-03B8-4B92-B7BA-B3760D77D258@wolfmcnally.com>
Date: Sun, 12 Mar 2023 13:34:35 -0700
In-Reply-To: <3CE988BD-C87E-4A0C-ADA8-79124FE1FB51@tzi.org>
Cc: cbor@ietf.org
To: Carsten Bormann <cabo@tzi.org>
References: <2B1FA8CC-AD83-4E58-BE27-B6504F555694@wolfmcnally.com> <8551021E-A1A2-4764-B0DF-D3E7591EC9B6@tzi.org> <FD5D8771-E1CF-4C63-A141-054DE0085399@wolfmcnally.com> <D714A0A4-7452-4C45-8542-7A57A75C9748@tzi.org> <35A71381-A9DD-479C-A7D5-9B06F70B6F50@wolfmcnally.com> <3CE988BD-C87E-4A0C-ADA8-79124FE1FB51@tzi.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/uGtEn1apzQWkbVzqFUwenoRfscI>
Subject: Re: [Cbor] dCBOR moving from numerically-typeless systems


> On Mar 12, 2023, at 6:46 AM, Carsten Bormann <cabo@tzi.org> wrote:
> 
> On 2023-03-12, at 12:49, Wolf McNally <wolf@wolfmcnally.com> wrote:
>> 
>> Carsten,
>> 
>>> On Mar 12, 2023, at 3:26 AM, Carsten Bormann <cabo@tzi.org> wrote:
>>> ...maybe you can explain your “no `null` in map entry values” rule next.  Isn’t that really very much about how the application processes these values?
>> 
>> Just like numeric values have multiple equivalent CBOR representations, map entries with “no value” also have two semantically-equivalent representations:
>> 
>> • omit the entry entirely, or
>> • encode the entry, but give it a `null` value.
>> 
>> If these representations are semantically equivalent,
> 
> Big assumption.

You *say* that, but I’d like to see an actual argument that it’s not in fact true.

> I think you are assuming that all map entries have default values, and that all default values are `null`.

No, not all map entries have default values. Some are required. Some specific combinations may be disallowed.

I don’t understand what you mean by “and that all default values are `null`,” since that is equivalent to saying, “and that all default values are no value.” `null` cannot be a default value because, semantically, `null` expresses the absence of a value.

As I state in the I-D, `null` still has uses in other positions, such as in indexed structures like arrays where some of the elements are optional:

example = {
    a: [number / null, number / null, number / null]
}

In this example, the array MUST have three elements, any of which may be a number or `null`. What `null` means for each element is up to the specifier.

> What if your default value is 10?
> 
> Of course, you can use `null` as a secondary representation of that default value, unless the map entry actually has defined semantics for (non-default) `null`; you may choose to exclude such application data models, and you are.
> 
> However, if 10 is the actual default value, it is the application that needs to ensure it never expresses this value (as default values must never be expressed in deterministic representation). 
> 
> You are essentially doing half of the work in the encoder (the part that suppresses `null` if it is the default value), and leaving the other half to the application (suppressing 10).  This is at the cost of not supporting data models at all where `null` is not the default value.
> 
> I find this weird.

Let’s focus on a more concrete example. Consider the following CDDL:

font = {
    name: tstr,
    ? size: (number .gt 0) .default 12
}

As written, the encoder does not have the option of making `size` `null`-valued. The CDDL clearly specifies that the field is either included or omitted entirely, and that if it is included it must have a numeric value greater than 0. Therefore, the codec would be right to reject a `null` value here.

I think you’re trying to say that there are conceivable and useful situations in which the codec would be *wrong* to reject it, but you haven’t yet given any examples of where that would be the case *in a deterministic setting.*

To restate the dCBOR requirement:

> Protocols that depend on dCBOR MUST specify the circumstances under which particular optional fields MUST or MUST NOT be present. Protocols that specify fields using key-value paired structures like CBOR maps, where some fields have default values MUST choose and document one of the following strategies:
> 
> • they MUST specify that the absence of the field means choosing the default. This allows the default to be changed later, or
> • they MUST encode the field regardless of whether the current default is chosen. This locks in the current value of the default.

In the above example, the CDDL tells us that the field’s absence means choosing the default of 12, and further that the default will not change, as it’s a fixed part of the published specification. And yes, in a deterministic setting the encoder MUST NOT encode the default value. In fact, I’ve written a lot of code for both JSON and CBOR that simply does not encode a key if it would have the default value; that is merely parsimonious. But under dCBOR it is REQUIRED.
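
To illustrate, here is a minimal Python sketch of that "absence means default" policy for the `font` example. It is hypothetical and not taken from any dCBOR implementation: the names `FONT_SIZE_DEFAULT`, `font_to_map`, and `font_from_map` are made up, and a plain dict stands in for the map that a CBOR library would then serialize.

```python
# Hypothetical sketch: a plain dict models the CBOR map; no real codec is used.
FONT_SIZE_DEFAULT = 12  # the fixed default from the CDDL above


def font_to_map(name, size=FONT_SIZE_DEFAULT):
    """Build the map to encode, omitting `size` when it equals the default."""
    if isinstance(size, bool) or not isinstance(size, (int, float)) or size <= 0:
        raise ValueError("size must be a number greater than 0")
    entries = {"name": name}
    if size != FONT_SIZE_DEFAULT:
        entries["size"] = size  # only non-default sizes are ever emitted
    return entries


def font_from_map(entries):
    """Read a decoded map, treating an absent `size` as the default."""
    return entries["name"], entries.get("size", FONT_SIZE_DEFAULT)
```

Under this sketch, `font_to_map("Helvetica")` and `font_to_map("Helvetica", 12)` produce the same single-entry map, so there is exactly one encoding for a 12-point font.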

On the decoder side, a null-valued map entry is semantically equivalent to an absent map entry (I again assert that the burden is on you to demonstrate otherwise) and is therefore not well-formed dCBOR. So “doing half of the work in the encoder,” as you put it, is a natural division of labor between the encoder and the application, and is nothing remarkable to me. The decoder’s work is low-level well-formedness, and the application’s work is higher-level semantic validation.
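
As a rough sketch of the decoder-side check, assuming the CBOR item has already been decoded into Python dicts, lists, and `None` for `null` (the function name `reject_null_map_values` is hypothetical, not part of any library):

```python
def reject_null_map_values(item):
    """Raise ValueError if any map entry, at any depth, has a null value.

    Assumption: maps are decoded as dict, arrays as list, `null` as None.
    """
    if isinstance(item, dict):
        for key, value in item.items():
            if value is None:
                raise ValueError(f"map entry {key!r} has a null value")
            reject_null_map_values(value)
    elif isinstance(item, list):
        for element in item:
            # A bare None as an array element is allowed; only recurse
            # to catch null-valued entries in nested maps.
            reject_null_map_values(element)
    return item
```

Here `{"a": [1, None, 3]}` passes, because the `null` sits in an array position, while `{"name": "Helvetica", "size": None}` is rejected.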

Back to the dCBOR requirements for defaults: if the specifier wanted to choose the second policy, the CDDL would be written thus:

font = {
    name: tstr,
    size: (float .gt 0) ; Recommended default: 12
}

Now the encoder MUST encode a `size` field. The specifier recommends 12, but an implementer who wants that value has to encode it affirmatively.

I mentioned the possibility of the default “changing” in the case of an optional field. For clarity, let’s look at a practical example of this:

set-alarm = {
    name: tstr,
    ? expiry: time ; Defaults to current time + 1 hour
}

In this case, omitting the `expiry` field means the receiver will construe it to be 1 hour from the time it processes the request. The protocol specifier will also need to specify what happens if the `expiry` date is present but out of allowable range, and the application will have to enforce this constraint.
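
A minimal Python sketch of how a receiver might resolve `expiry`, under stated assumptions: the decoded map is a plain dict, `expiry` arrives as epoch seconds, and the one-year `MAX_LIFETIME` bound is invented purely for illustration (the actual allowable range is up to the protocol specifier). The names `resolve_expiry`, `DEFAULT_LIFETIME`, and `MAX_LIFETIME` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_LIFETIME = timedelta(hours=1)  # "current time + 1 hour"
MAX_LIFETIME = timedelta(days=365)     # assumption: not from any spec


def resolve_expiry(entries, now=None):
    """Return the effective expiry for a decoded set-alarm map."""
    now = now or datetime.now(timezone.utc)
    if "expiry" not in entries:
        # Absent field: the default is computed at processing time,
        # so its concrete value differs from request to request.
        return now + DEFAULT_LIFETIME
    expiry = datetime.fromtimestamp(entries["expiry"], tz=timezone.utc)
    if not now < expiry <= now + MAX_LIFETIME:
        raise ValueError("expiry out of allowable range")
    return expiry
```

The range check is application-level validation; the codec only needs to know that the field is either present with a time value or absent.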

Again: in no case is a `null` value for the `expiry` field necessary, useful, or even interesting.

~ Wolf