[Cbor] Re: Soliciting unresolved points around dCBOR

Christian Amsüss <christian@amsuess.com> Mon, 17 June 2024 13:39 UTC

Date: Mon, 17 Jun 2024 15:39:34 +0200
From: Christian Amsüss <christian@amsuess.com>
To: Anders Rundgren <anders.rundgren.net@gmail.com>
Message-ID: <ZnA8lsWgEjWNRZLx@hephaistos.amsuess.com>
References: <Zm7eekcpgBmJQ5jv@hephaistos.amsuess.com> <ZnAlntTjsdqobA7h@hephaistos.amsuess.com> <be9f8d65-47ba-4a49-b443-7de016b5bd5d@gmail.com>
In-Reply-To: <be9f8d65-47ba-4a49-b443-7de016b5bd5d@gmail.com>
CC: cbor@ietf.org
List-Id: "Concise Binary Object Representation (CBOR)" <cbor.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/xU6_HL8yRVRue0I_E2zWgzyl1KY>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor>

On Mon, Jun 17, 2024 at 02:48:34PM +0200, Anders Rundgren wrote:
> To get any further, the dCBOR advocates should provide a simple
> example showing a Gordian-like operation that fails using CDE "as is".

That may be a useful exercise indeed.

To my understanding, the goal of using deterministic CBOR encodings in
the first place is to ensure that consumers can ingest Gordian data,
store it in their native format by value, and re-encode it to get
bit-identical data for cryptographic purposes.

(Were it not for the aspect of independent storage, we wouldn't even
need deterministic encoding, but could just always reuse the original
encoding for verification.)

Numeric reduction comes into play when different parties use different
data types internally. No matter whether it is a database with a u64
column or a database with an f32 column, if both receive the array [1, 1,
2, 3, 5], store it and re-serialize, they wind up with the same data.
This is what I understand the desired benefit of numeric reduction to
be.
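
As a minimal sketch of that property (not taken from any dCBOR
implementation; the helper names encode_head, encode_number and
encode_array are made up for illustration, and only finite numbers and
flat arrays are handled), a u64-backed store and an f32-backed store
can be compared like this:

```python
import struct

def encode_head(major: int, n: int) -> bytes:
    """CBOR initial byte plus the shortest argument that holds n."""
    if n < 24:
        return bytes([(major << 5) | n])
    for ai, fmt in ((24, ">B"), (25, ">H"), (26, ">I"), (27, ">Q")):
        if n < 1 << (8 * struct.calcsize(fmt)):
            return bytes([(major << 5) | ai]) + struct.pack(fmt, n)
    raise ValueError("integer does not fit in 64 bits")

def encode_number(x) -> bytes:
    """Encode one number, applying numeric reduction first."""
    # Numeric reduction (assumed rule): a float with an integral value
    # in the 64-bit integer range is encoded as an integer.
    if isinstance(x, float) and x.is_integer() and -2**64 <= x < 2**64:
        x = int(x)
    if isinstance(x, int):
        return encode_head(0, x) if x >= 0 else encode_head(1, -1 - x)
    # Otherwise use the shortest float width that round-trips exactly.
    for ai, fmt in ((25, ">e"), (26, ">f"), (27, ">d")):
        try:
            packed = struct.pack(fmt, x)
        except OverflowError:
            continue  # value too large for this width
        if struct.unpack(fmt, packed)[0] == x:
            return bytes([0xE0 | ai]) + packed
    raise ValueError("unsupported float (e.g. NaN) in this sketch")

def encode_array(values) -> bytes:
    """Encode a flat array of numbers."""
    return encode_head(4, len(values)) + b"".join(map(encode_number, values))

received = [1, 1, 2, 3, 5]
# One party keeps the values in a u64 column ...
u64_store = [int(v) for v in received]
# ... the other keeps them as f32 (modelled by a round trip through 4 bytes).
f32_store = [struct.unpack(">f", struct.pack(">f", v))[0] for v in received]

print(encode_array(u64_store).hex())  # 850101020305
print(encode_array(f32_store).hex())  # 850101020305
```

Without the reduction step, the f32 store would re-serialize 1.0 as a
float (f9 3c00) and the two encodings would differ.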

Now I still do not agree with the design of such a system: Since by
design the semantics of the array don't include typing information
(otherwise the application could tell the serializer whether it's floats
or ints), the collaboration of the u64 and the f32 tool will work until
it fails when a value is produced that is not understood by one of them
(at which point that party can't ingest the data any more). I'd prefer
to have arrays (or other fields) with clear semantics that fail on
general mismatch and then are guaranteed to succeed at runtime.
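
To illustrate the failure mode, here is a rough sketch that models only
the ingest step on already-decoded values. The functions ingest_u64 and
ingest_f32 are hypothetical stand-ins for the two tools, not part of any
real library:

```python
import struct

def ingest_u64(values):
    """Hypothetical u64-backed tool: accepts only integers in 0..2**64-1."""
    for v in values:
        if isinstance(v, float) or not 0 <= v < 2**64:
            raise ValueError(f"{v!r} does not fit a u64 column")
    return list(values)

def ingest_f32(values):
    """Hypothetical f32-backed tool: accepts only values exact in binary32."""
    out = []
    for v in values:
        try:
            as_f32 = struct.unpack(">f", struct.pack(">f", float(v)))[0]
        except OverflowError:
            raise ValueError(f"{v!r} is out of f32 range") from None
        if as_f32 != v:
            raise ValueError(f"{v!r} is not exactly representable as f32")
        out.append(as_f32)
    return out

# The shared array from above works for both parties.
ingest_u64([1, 1, 2, 3, 5])
ingest_f32([1, 1, 2, 3, 5])

# Each of these is fine for the party that produced it but not for the other.
for tool, values in ((ingest_u64, [1, 1.5]), (ingest_f32, [2**24 + 1])):
    try:
        tool(values)
    except ValueError as err:
        print(f"{tool.__name__} rejected {values}: {err}")
```

Nothing in the array itself announces which of the two value spaces it
is meant to live in, so the mismatch only shows up once such a value
appears.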

But to the best of my understanding, such a design is what (correctly,
albeit on premises I find undesirable) motivates numeric reduction.

BR
c (once more, chair hat off)

-- 
To use raw power is to make yourself infinitely vulnerable to greater powers.
  -- Bene Gesserit axiom