Re: [Cbor] Decoding numbers and compliance verification in dCBOR

Wolf McNally <wolf@wolfmcnally.com> Sat, 11 March 2023 08:28 UTC

From: Wolf McNally <wolf@wolfmcnally.com>
In-Reply-To: <38de8a78-0140-45af-b4fb-f601265809e4@gmail.com>
Date: Sat, 11 Mar 2023 00:27:58 -0800
Cc: Carsten Bormann <cabo@tzi.org>, Laurence Lundblade <lgl@island-resort.com>, cbor@ietf.org
Message-Id: <09207367-8B74-434C-89B1-881780DCECA5@wolfmcnally.com>
References: <83BF059D-BEF2-4C5F-9DE8-7A99A529833F@island-resort.com> <8999DCEA-6572-4A69-85EC-AA7AD0170837@tzi.org> <38de8a78-0140-45af-b4fb-f601265809e4@gmail.com>
To: Anders Rundgren <anders.rundgren.net@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/2JLuUvFeoa_zAXLLl7-1gbbmyNo>

For us, minimizing the absolute size of a numeric value's serialization is less of a goal than having a single, deterministic serialization for a given value. In addition, dCBOR codec implementors should be able to forgo floating-point or bignum support and still expect the same canonical serialization for all the integers representable by major types 0 and 1.
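
To make the integer case concrete, here is a minimal sketch in plain Ruby (standard library only) of shortest-head encoding for major types 0 and 1, as defined by RFC 8949. The helper names `encode_int` and `hexs` are hypothetical and are not taken from any CBOR library or from a dCBOR implementation; the point is only that an integer-only codec produces one fixed byte string for each value in this range.

```ruby
# Hypothetical helper (not from any CBOR gem): encode an integer in the range
# -2**64 .. 2**64 - 1 as CBOR major type 0 (unsigned) or 1 (negative),
# always using the shortest head that fits the argument.
def encode_int(n)
  raise RangeError, "outside major types 0 and 1" unless n.between?(-2**64, 2**64 - 1)

  major = n.negative? ? 1 : 0
  arg   = n.negative? ? -1 - n : n   # major type 1 carries -1 - n
  ib    = major << 5                 # high 3 bits of the initial byte

  if arg < 24       then [ib | arg].pack("C")        # argument in the initial byte
  elsif arg < 2**8  then [ib | 24, arg].pack("CC")   # 1-byte argument
  elsif arg < 2**16 then [ib | 25, arg].pack("Cn")   # 2-byte big-endian argument
  elsif arg < 2**32 then [ib | 26, arg].pack("CN")   # 4-byte big-endian argument
  else                   [ib | 27, arg].pack("CQ>")  # 8-byte big-endian argument
  end
end

# Hypothetical helper: format a byte string as space-separated hex.
def hexs(bytes)
  bytes.unpack("C*").map { |b| format("%02x", b) }.join(" ")
end

puts hexs(encode_int(65536000000))    # 1b 00 00 00 0f 42 40 00 00
puts hexs(encode_int(1099511627775))  # 1b 00 00 00 ff ff ff ff ff
puts hexs(encode_int(-1))             # 20
```

Both large values come out as 9 bytes even though a tag 2 bignum (7 bytes for 0xffffffffff) or a float32 (5 bytes for 65536000000.0) would be shorter. The sketch deliberately accepts that cost in exchange for a single encoding per integer.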

~ Wolf
Lead Researcher, Blockchain Commons

> On Mar 10, 2023, at 11:06 PM, Anders Rundgren <anders.rundgren.net@gmail.com> wrote:
> 
> If the shortest possible number representation is an absolute goal, you already have a problem with "pure" integers.  An integer value of 1099511627775 (0xffffffffff) would actually take two bytes fewer(!) as a bignum.
> 
> It would (IMO) be unwise to try to fix this in dCBOR.
> 
> Anders
> 
> On 2023-03-10 20:14, Carsten Bormann wrote:
>> Hi Laurence,
>> I think your arguments are important.
>> But for a receiver of information, there are also benefits from knowing what to expect.
>> In particular, map processing can be simpler if only a specific sequence of the entries needs to be accepted.
>> Whether floating point values are important for an application may also influence whether it is worth expending some additional processing.  The simplest devices often get by without any floating point operations.  Once floating point becomes important for the functioning of a device, processors such as those of ARM’s Cortex M4 series can help reduce the power-hungry on-time of the device by efficiently performing the floating point computations.  With these processors (and certainly with processors of the smartphone, laptop, and server classes), the requirements of dCBOR become almost trivial.
>> I don’t want to take a particular side here, just point out that not all applications are the same.  Having a go-to profile of CBOR that covers an interesting subset of applications appears to be a net win to me.
>> Additional CDDL support such as that provided in draft-bormann-cbor-cddl-more-control with .cbordet and .cborseqdet may be desirable.  There is currently no way in CDDL to express the rules about number representation that dCBOR has adopted; that may be one interesting gap.
>> Little aside: When I started to think about this, I started to wonder: What is the dCBOR way to represent 65536000000.0?
>> 5 bytes:
>>>> 65536000000.0.to_cbor.hexs
>> => "fa 51 74 24 00"
>> 9 bytes:
>>>> 65536000000.to_cbor.hexs
>> => "1b 00 00 00 0f 42 40 00 00"
>> Here, the floating point representation is shorter…
>> But wait, even if float32 cannot be exact (e.g., for 65536000001), try:
>> 7 bytes:
>>>> CBOR.decode("c2 45 0f 42 40 00 01".xeh)
>> => 65536000001
>> 9 bytes:
>>>> CBOR.decode("c2 45 0f 42 40 00 01".xeh).to_cbor.hexs
>> => "1b 00 00 00 0f 42 40 00 01"
>> (Number systems are always an unwieldy part of representation formats.)
>> Grüße, Carsten
>