Re: [Cbor] dCBOR moving from numerically-typeless systems

Wolf McNally <wolf@wolfmcnally.com> Sun, 12 March 2023 10:23 UTC

From: Wolf McNally <wolf@wolfmcnally.com>
Message-Id: <FD5D8771-E1CF-4C63-A141-054DE0085399@wolfmcnally.com>
Date: Sun, 12 Mar 2023 03:23:21 -0700
In-Reply-To: <8551021E-A1A2-4764-B0DF-D3E7591EC9B6@tzi.org>
Cc: cbor@ietf.org
To: Carsten Bormann <cabo@tzi.org>
References: <2B1FA8CC-AD83-4E58-BE27-B6504F555694@wolfmcnally.com> <8551021E-A1A2-4764-B0DF-D3E7591EC9B6@tzi.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/k3sjxUnnhyPy1OpzFAvIqILR8WY>
Subject: Re: [Cbor] dCBOR moving from numerically-typeless systems

Carsten,

In the case of [uint, 1.0..100.0, -14..68, float, float16] as a data model, the codec will ensure that the serialization is canonical, and the application must validate that each value is in the correct range.

Anything beyond what the CBOR codec can validate is up to the application to validate, including the range of values allowable in each position of the array. This is what I mean when I say that the codec can’t get you all the way there: all it can do is ensure that the canonical encoding is used for each numeric value. It can’t ensure that each element falls into a particular range unless you add some kind of schema validation feature, like a built-in CDDL-based validator, and that is out of scope for our current endeavor.
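
To make that division of labor concrete, here is a minimal Python sketch of the application-side checks for the five-element array above. It is not taken from the I-D or from any dCBOR implementation: the function names are hypothetical, and it assumes the codec has already decoded the array into Python values and may hand back an integer wherever a float with no fractional part was encoded.

```python
import struct

def is_int(v):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(v, int) and not isinstance(v, bool)

def is_number(v):
    # Accept int as well as float: an integral float may arrive as an integer.
    return isinstance(v, (int, float)) and not isinstance(v, bool)

def fits_float16(v):
    """True if v survives a round trip through IEEE 754 binary16 unchanged."""
    try:
        return struct.unpack("<e", struct.pack("<e", float(v)))[0] == v
    except (OverflowError, struct.error):
        return False  # too large for binary16 (NaN also fails the == test)

def validate_record(values):
    """Hypothetical check for [uint, 1.0..100.0, -14..68, float, float16]."""
    if not isinstance(values, list) or len(values) != 5:
        raise ValueError("expected an array of five elements")
    count, ratio, offset, reading, small = values
    if not (is_int(count) and 0 <= count < 2**64):
        raise ValueError("position 0: expected uint")
    if not (is_number(ratio) and 1.0 <= ratio <= 100.0):
        raise ValueError("position 1: expected a number in 1.0..100.0")
    if not (is_int(offset) and -14 <= offset <= 68):
        raise ValueError("position 2: expected an integer in -14..68")
    if not is_number(reading):
        raise ValueError("position 3: expected a number")
    if not (is_number(small) and fits_float16(small)):
        raise ValueError("position 4: expected a value representable as float16")
```

None of these checks concern the bytes on the wire; they run after the codec has done its part.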

So as far as I’m concerned, nothing about our approach to dCBOR changes between my hand-crafted example and your hand-crafted example.

Whether BigInt precision is necessary is up to the protocol specifier. This will in turn dictate the features of codecs that can work with the protocol. The same is also true for floating point: many protocols will not require the representation of floating-point values, and will therefore work with codecs that don’t support them. None of this changes whether a given codec can be considered dCBOR-compliant, and I made this clear in the I-D.

As I said before: it is up to the protocol specifier to specify the *maximum* precision of a numeric value. It is the job of the codec to find the canonical representation of it for encoding, and to validate that canonical representation upon decoding. The developer who wants to adopt that protocol must choose a codec that can handle its numerical precision needs, and perform any further validation of things like numeric ranges when decoding.
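
As a rough illustration of the codec's half of that job, the following Python sketch picks one encoding per numeric value: an integer when the value has no fractional part, otherwise the shortest IEEE 754 float that preserves it exactly. The function names, the integer range, the mapping of -0.0 to 0, and the single NaN encoding are all assumptions made for this example, not quotations from the I-D.

```python
import struct

# Assumed integer range for this sketch; a profile may define it differently.
INT_MIN, INT_MAX = -(2**63), 2**64 - 1

def encode_head(major, arg):
    """CBOR head: major type plus the shortest form of a non-negative argument."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([(major << 5) | info]) + arg.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")

def encode_number(x):
    """Hypothetical encoder: one byte string per numeric value."""
    if isinstance(x, bool) or not isinstance(x, (int, float)):
        raise TypeError("expected int or float")
    # Numeric reduction: a float with no fractional part becomes an integer.
    # Note that -0.0 becomes 0 under this rule.
    if isinstance(x, float) and x.is_integer() and INT_MIN <= x <= INT_MAX:
        x = int(x)
    if isinstance(x, int):
        if not INT_MIN <= x <= INT_MAX:
            raise ValueError("integer out of range for this sketch")
        return encode_head(0, x) if x >= 0 else encode_head(1, -1 - x)
    if x != x:
        return b"\xf9\x7e\x00"  # single NaN encoding, assumed here
    # Shortest float (binary16, then binary32) that round-trips exactly.
    for fmt, prefix in ((">e", 0xF9), (">f", 0xFA)):
        try:
            packed = struct.pack(fmt, x)
        except OverflowError:
            continue  # magnitude too large for this width
        if struct.unpack(fmt, packed)[0] == x:
            return bytes([prefix]) + packed
    return b"\xfb" + struct.pack(">d", x)
```

Under these assumptions `encode_number(2.0)` and `encode_number(2)` produce the same single byte, while `encode_number(1.5)` uses a 16-bit float and `encode_number(0.1)` needs the full 64 bits. A decoder enforcing the same rules would reject any other encoding of those values.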

~ Wolf

> On Mar 12, 2023, at 1:00 AM, Carsten Bormann <cabo@tzi.org> wrote:
> 
> Hi Wolf,
> 
> (This is a nice specimen for what I call the fallacy of hand-crafted examples.)
> 
> So it seems your data model is, say:
> 
> a = [uint, 1.0..100.0, -14..68, float, float16]
> 
> (I’m guessing from your example, using uint and int as well as a floating point range just to point out that there may be other application-level type requirements than int vs. float.)
> 
> You already assume that, when extracting these values, you know the data model so you can hand the right type to the application (u64, f64, i8, f64, f16).  So you might as well employ that information for encoding.
> 
> You picked the example so that it actually has a benefit for the mixed integer/float encoding.
> But where your data model has floating point values, you are likely to have actual values all over the place, and the percentage of integral values (no fractional part) that you can represent as integers is low (depending on range: if you have a value from 1024.0 to 2047.0 with 10-bit precision, you are going to be luckier of course :-), but then there is no size difference in that particular range).
> 
> If you are stuck with the JavaScript number system, you’ll want to represent the full-range uint in position 0 as a BigInt (see RFC 7493): you can’t just push all numbers into binary64 (float64), losing exactness.
> 
> Again, I very much sympathize with the choices dCBOR makes (it was pretty much how I looked at things in 2013), I just wanted to point out where your hand-crafted example is missing out on additional considerations that just don’t pop up in that example, but do in my hand-crafted one :-).
> 
> Grüße, Carsten
>