Re: [Cbor] Updated Drafts for dCBOR I-D and Gordian Envelope Structured Data Format I-D & IANA Tag Registration

Wolf McNally <wolf@wolfmcnally.com> Wed, 31 May 2023 08:22 UTC

From: Wolf McNally <wolf@wolfmcnally.com>
Message-Id: <EA5A4131-F715-437C-A9AD-FF6220D1A3B3@wolfmcnally.com>
Content-Type: multipart/alternative; boundary="Apple-Mail=_A89466CA-69A2-4020-883B-151754D33F1F"
Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3731.600.7\))
Date: Wed, 31 May 2023 01:22:37 -0700
In-Reply-To: <62FCDF82-9766-431F-A996-BF820D5564C5@island-resort.com>
Cc: Jeremy O'Donoghue <jodonogh@qti.qualcomm.com>, Carsten Bormann <cabo@tzi.org>, Christopher Allen <christophera@lifewithalacrity.com>, "cbor@ietf.org" <cbor@ietf.org>, "Shannon.Appelcline@gmail.com" <Shannon.Appelcline@gmail.com>
To: Laurence Lundblade <lgl@island-resort.com>
References: <CAAse2dEFB_FVP6_KkNANSYPW+yX4-M9pN3YkUq5=FTgLZnyWGw@mail.gmail.com> <4EBE3640-5F7F-46B8-961A-D1872A6A0CA4@tzi.org> <463016EF-0DAB-45D4-AB30-53FB2B76F52B@wolfmcnally.com> <DD0E7621-EE3A-496E-9D2C-1CD00E2D92F9@tzi.org> <PH0PR02MB7256A94640C7C02C75C3795FF2779@PH0PR02MB7256.namprd02.prod.outlook.com> <62FCDF82-9766-431F-A996-BF820D5564C5@island-resort.com>
X-Mailer: Apple Mail (2.3731.600.7)
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/o_teCltZnYddJHskEm8ff_0GYUI>
Subject: Re: [Cbor] Updated Drafts for dCBOR I-D and Gordian Envelope Structured Data Format I-D & IANA Tag Registration

Christopher and I will be at today’s meeting.

The main concern I have over using the heuristic of “shortest encoding wins” is exactly the issue that, as has been pointed out by others, some BigNum-encoded representations are in fact shorter than regular CBOR uint/nint representations. This means that to encode these values deterministically, a dCBOR codec *must* support BigNum, and I don’t want this to be a requirement.
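
To make the size comparison concrete, here is a rough sketch that counts encoded bytes for the same non-negative value as a regular CBOR uint and as a tag 2 BigNum. It assumes the tag 2 layout from RFC 8949 with a minimal-length big-endian byte string; the function names are made up for illustration and do not come from any codec.

```python
def uint_encoding_len(n: int) -> int:
    """Bytes used by the shortest CBOR major-type-0 encoding of n."""
    if not 0 <= n < 2**64:
        raise ValueError("value does not fit a CBOR uint")
    if n < 24:
        return 1          # value lives in the initial byte
    if n < 2**8:
        return 2          # initial byte + 1-byte argument
    if n < 2**16:
        return 3          # initial byte + 2-byte argument
    if n < 2**32:
        return 5          # initial byte + 4-byte argument
    return 9              # initial byte + 8-byte argument


def bignum_encoding_len(n: int) -> int:
    """Bytes used by tag 2 followed by a minimal-length byte string."""
    if n < 0:
        raise ValueError("tag 2 carries non-negative values only")
    payload = (n.bit_length() + 7) // 8       # magnitude bytes, no leading zeros
    # A byte string's head encodes its length the same way a uint encodes a value.
    string_head = uint_encoding_len(payload)
    return 1 + string_head + payload          # 1 byte for the tag itself (0xC2)


for value in (2**16, 2**32, 2**48):
    print(value, uint_encoding_len(value), bignum_encoding_len(value))
```

Under these assumptions, 2**32 takes 9 bytes as a uint but only 7 bytes as a BigNum, which is the case that would force BigNum support under a pure "shortest encoding wins" rule.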

The proposal we have put forward is “least precise wins.” Since all fixed-size types can be considered "less precise" than BigNum, BigNum is the least-preferred type and is used only when every less-precise type fails to represent a presented value. For applications that don’t need values that require BigNum representation, the dCBOR codecs used don’t need to support it.

The same principle also works on a smaller scale for floating point values: A constrained dCBOR codec wouldn’t even have to implement floating point support if it doesn’t need it, because the integer types are less precise than the floating point types.

One could quibble about the exact meaning of “precision” in this context when, for instance, comparing the precision of a float type to an int type, but our proposal comes down to establishing a canonical hierarchy of numeric types (not encodings!) from most-to-least preferred, and finding the single most-preferred type that can represent a presented value without loss of accuracy. So if it can be represented in one byte, it will be. If a codec supports BigNum, and a value is presented to the codec API as a BigNum, and can’t be represented by any more preferred (less-precise) type, then BigNum will be used for the encoding.
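
As a minimal sketch of that selection rule, the function below walks a three-tier hierarchy (integer, then float, then BigNum) and returns the first tier that holds the value exactly. The tier order is my reading of the paragraphs above, and the names and the Python int/float inputs are illustrative assumptions, not text from the draft.

```python
UINT_MAX = 2**64 - 1      # largest CBOR major-type-0 value
NINT_MIN = -(2**64)       # smallest CBOR major-type-1 value


def preferred_numeric_type(value):
    """Return the most-preferred tier that represents `value` exactly.

    Assumed tiers, most to least preferred: "integer", "float", "bignum".
    Accepts Python int or float; anything else is out of scope here.
    """
    # A float with no fractional part is first treated as an integer.
    # (Note: -0.0 collapses to integer 0 in this sketch.)
    if isinstance(value, float) and value.is_integer():
        value = int(value)

    if isinstance(value, int):
        if NINT_MIN <= value <= UINT_MAX:
            return "integer"
        try:
            # Python compares int and float exactly, so this is a lossless check.
            if float(value) == value:
                return "float"
        except OverflowError:
            pass          # magnitude is beyond the double range
        return "bignum"

    # Fractional values, NaN, and infinities can only be floats here.
    return "float"


print(preferred_numeric_type(2.0))        # integer
print(preferred_numeric_type(1.5))        # float
print(preferred_numeric_type(2**64 + 1))  # bignum
```

A codec that never needs the last tier can simply reject any value for which this returns "bignum", without affecting how every other value is encoded.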

The primary goal of dCBOR is determinism. The shortest possible encoded form is nice to have, but it is a non-goal.

~ Wolf

> On May 10, 2023, at 10:55 AM, Laurence Lundblade <lgl@island-resort.com> wrote:
> 
> On May 10, 2023, at 2:44 AM, Jeremy O'Donoghue <jodonogh@qti.qualcomm.com <mailto:jodonogh@qti.qualcomm.com>> wrote:
>> 
>> To give a very practical use-case, Rust has a native i128 type, for which the most “natural” encoding of Carsten’s example would be as a CBOR negative integer. Haskell uses “unlimited” length integers as (perhaps more usefully) does Python. 
>>  
>> I would prefer to see DCBOR require encoding of integer types on the smallest/most natural CBOR type. Only when outside the range of the CBOR positive and negative integer ranges should bigint be used. We don’t have to constrain everything else because of the limitations of the C and C++ builtin integer types.
>>  
>> Best regards
>> Jeremy
> 
> +1 for this.
> 
> DCBOR implementations and applications are going to vary depending on whether floating point is needed/available and on the precision of the float and on the range of integers needed/available. It’s not possible to expect every implementation to support every number possible, nor will every use case need every value.
> 
> It might be helpful (or necessary) to pre-define a few number range profiles so a use case can pick from a menu and so there are example profiles.
> 
> For example:
> Basic 64-bit integer profile: Integer values between MIN_INT64 and MAX_INT64 are allowed. All others error out.
> Basic float profile: All values that can be represented by an IEEE 754 double are allowed. All others error out. Bignum support is not required.
> Basic float profile with bignum: All values that can be represented by an IEEE 754 double are allowed. All others error out. Bignum representations are allowed. Implementations must round off big numbers that have greater precision than can be represented by an IEEE 754 double.
> LL
> 
> 
>
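
For reference, the three example profiles quoted above could be sketched as simple acceptance checks. This is only an illustration under assumed semantics: the function names are hypothetical, "error out" is modeled as raising ValueError, and "round off" is modeled as Python's int-to-float conversion.

```python
MIN_INT64 = -(2**63)
MAX_INT64 = 2**63 - 1


def accept_int64_profile(value: int) -> int:
    """Basic 64-bit integer profile: reject anything outside the int64 range."""
    if not MIN_INT64 <= value <= MAX_INT64:
        raise ValueError("outside the 64-bit integer profile")
    return value


def accept_float_profile(value) -> float:
    """Basic float profile: accept only values a double holds exactly."""
    try:
        as_double = float(value)
    except OverflowError:
        raise ValueError("outside the double range") from None
    if as_double != value:      # exact int/float comparison in Python
        raise ValueError("not exactly representable as a double")
    return as_double


def accept_float_profile_with_bignum(value) -> float:
    """Float profile with bignum: big integers are rounded to a double."""
    try:
        return float(value)     # rounds to the nearest representable double
    except OverflowError:
        raise ValueError("outside the double range") from None


print(accept_float_profile_with_bignum(2**53 + 1))
```

Here 2**53 + 1 is rejected by the basic float profile but accepted, after rounding, by the profile that allows bignum input.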