Re: [Cbor] Deterministic CBOR as a possible DISPATCH item

Carsten Bormann <cabo@tzi.org> Sun, 05 March 2023 12:46 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <7ccbd455-95ac-f816-6395-5c21102ad015@gmail.com>
Date: Sun, 05 Mar 2023 13:45:53 +0100
Cc: cbor@ietf.org, Christopher Allen <ChristopherA@lifewithalacrity.com>, wolf@wolfmcnally.com
Message-Id: <F7F58384-060D-43CD-99B0-C681704DE5D1@tzi.org>
References: <ddb5cff8-a07b-b8f3-c9a2-6fa8865d5ed4@gmail.com> <2B296657-7486-4B77-9265-260AA5F47CB2@tzi.org> <7ccbd455-95ac-f816-6395-5c21102ad015@gmail.com>
To: Anders Rundgren <anders.rundgren.net@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/a5BhhSYG45dbTt_H2Z2fW4HdQjk>

> Summary:
> 1. Subset of floating point values.  No "signaling" in NaN.  I fully support that.

This is essentially an application layer restriction, which certainly works fine for many applications.
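As a minimal sketch of what such a restriction could look like at the encoding boundary: the function below collapses every NaN, whatever its payload, into one encoding. The function name is hypothetical, and treating all NaNs as a single value is an assumption about the proposal; the byte string 0xf97e00 is the half-precision quiet NaN that RFC 8949 Section 4.2.2 mentions as the typical single representation.

```python
import math
import struct

# Half-precision quiet NaN with a zero payload: initial byte 0xf9, then 0x7e00.
CANONICAL_NAN = bytes.fromhex("f97e00")


def encode_nan_or_none(value: float):
    """Return the one permitted NaN encoding, or None if value is not a NaN.

    Hypothetical helper: any payload bits carried by the input NaN are
    deliberately discarded, so all NaNs encode identically.
    """
    if math.isnan(value):
        return CANONICAL_NAN
    return None


# A quiet NaN with a nonzero payload, built from raw IEEE 754 binary64 bits.
payload_nan = struct.unpack(">d", bytes.fromhex("7ff8000000000001"))[0]
```

An application that needs NaN payloads to survive a round trip cannot use such an encoder, which is why this is a restriction the application has to agree to.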

> 2. Floating point values that can be represented as integers MUST be represented as integers.  

This is, in essence, also an application-layer restriction: it means that the application doesn’t get to distinguish between integer and floating-point numbers.
This is close to what we had in mind when we wrote RFC 7049, but we have since found that it creates headaches for some applications.
Still, it is certainly something an application can choose to do.

In other words, if the application already converts integral numbers into integers, you can use an RFC 8949-compliant deterministic encoder.
The fact that the application delegates this work to the CBOR encoder is just an implementation approach, which I consider valid.
Of course, it doesn’t work with protocols that use the distinction for some semantic purpose, which is an acceptable design limitation.
(I.e., it probably is a good idea not to design things that way…)
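To illustrate the application-side conversion described above, here is a minimal sketch of a reduction step that could run before a generic deterministic encoder sees the value. The name reduce_number is hypothetical, and mapping -0.0 to the integer 0 is an assumption of this sketch, not something the thread settles.

```python
import math


def reduce_number(value):
    """Map a finite, integral float to an int; leave everything else as is.

    Hypothetical pre-encoding step. Notes on the choices made here:
    - NaN and the infinities stay floats (they have no integer form).
    - -0.0 becomes the integer 0, so the sign of zero is lost (assumption).
    - Very large integral floats such as 1e300 become Python big ints,
      which a CBOR encoder would have to emit as a tag 2/3 bignum.
    """
    if isinstance(value, float) and math.isfinite(value) and value.is_integer():
        return int(value)
    return value
```

After this step the encoder only ever sees a float when the value has a fractional part (or is NaN or infinite), so the receiver cannot tell whether the sender started from 10 or 10.0.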

Note that an application that wants to see float16 or float32 for all or the majority of its numbers can conceptually truncate the precision before handing the numbers over to a generic deterministic encoder, which will then do the right thing. Again, some delegation may be possible here.
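A minimal sketch of that division of labour, with hypothetical function names: the encoder picks the shortest of float16/float32/float64 that reproduces the value exactly, and the application reduces the precision beforehand if it wants the shorter forms. NaN handling is left out on purpose.

```python
import struct


def encode_float_shortest(value: float) -> bytes:
    """Encode a non-NaN float in the shortest width that round-trips exactly."""
    # Try float16 (initial byte 0xf9), then float32 (0xfa).
    for fmt, initial_byte in ((">e", 0xF9), (">f", 0xFA)):
        try:
            packed = struct.pack(fmt, value)
        except OverflowError:
            continue  # finite value too large for this width
        if struct.unpack(fmt, packed)[0] == value:
            return bytes([initial_byte]) + packed
    # Fall back to float64 (initial byte 0xfb), which always round-trips.
    return bytes([0xFB]) + struct.pack(">d", value)


def truncate_to_float32(value: float) -> float:
    """Application-side step: round the value to float32 precision.

    Raises OverflowError for finite values outside the float32 range.
    """
    return struct.unpack(">f", struct.pack(">f", value))[0]
```

With these definitions, 0.1 handed over as is needs the 9-byte float64 form, while truncate_to_float32(0.1) encodes in the 5-byte float32 form; the encoder itself did not have to be told anything about the application's preference.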

> I'm not too thrilled of this, it seems like a quirk to change fundamental number type in order to save 2 bytes for values like 0.0, 10, etc.

If you come from a JSON base, where all numbers are floating point, recognizing integral numbers and representing them as integers can make a lot of sense.
So I do have a lot of sympathy with this approach.
But you cannot do this without the application’s consent: the application then cannot use the distinction between major types 0/1 (or tags 2/3) on the one hand and major type 7 (ai 25/26/27) on the other as a semantic indicator.  Again, this is fine for very many applications.
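For concreteness, a small sketch (helper names are hypothetical) that puts the two representations of the value ten side by side: as an unsigned integer in major type 0 and as a half-precision float in major type 7 with additional information 25.

```python
import struct


def encode_uint(n: int) -> bytes:
    """Shortest-form encoding of an unsigned integer (major type 0)."""
    if n < 0:
        raise ValueError("negative integers use major type 1")
    if n < 24:
        return bytes([n])  # value sits directly in the initial byte
    for ai, fmt, limit in ((24, ">B", 1 << 8), (25, ">H", 1 << 16),
                           (26, ">I", 1 << 32), (27, ">Q", 1 << 64)):
        if n < limit:
            return bytes([ai]) + struct.pack(fmt, n)
    raise ValueError("needs a tag 2 bignum")


def encode_float16(x: float) -> bytes:
    """Half-precision float: major type 7, additional information 25 (0xf9)."""
    return b"\xf9" + struct.pack(">e", x)


def major_type(encoded: bytes) -> int:
    """The major type is carried in the top three bits of the initial byte."""
    return encoded[0] >> 5


as_int = encode_uint(10)         # 0x0a, one byte
as_float = encode_float16(10.0)  # 0xf94900, three bytes
```

The integer form is two bytes shorter, which is the saving mentioned in the quoted text, and the major type (0 versus 7) is exactly the signal a receiver loses once integral floats are always sent as integers.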

> The rest is (presumably) 100% compliant with my dCBOR implementation.
> 
> QCBOR does not comply with RFC8949 for floating point data:
> https://github.com/laurencelundblade/QCBOR/issues/131

This certainly can be fixed (code is in the RFC).

Grüße, Carsten