Re: [Cbor] Record proposal

Emile Cormier <emile.cormier.jr@gmail.com> Thu, 02 December 2021 06:18 UTC

From: Emile Cormier <emile.cormier.jr@gmail.com>
Date: Thu, 2 Dec 2021 02:18:29 -0400
Message-ID: <CAM70yxDJFiv=skuU4=vkos9Zk1JpqQ+6xGGDFj6+Tb6gqCdKFg@mail.gmail.com>
To: Kris Zyp <kriszyp@gmail.com>
Cc: Carsten Bormann <cabo@tzi.org>, =?UTF-8?Q?Christian_Ams=C3=BCss?= <christian@amsuess.com>, "cbor@ietf.org" <cbor@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/NJZ05AWEDeiNfg4MV6NbL942wIM>

On Thu, Dec 2, 2021 at 1:34 AM Kris Zyp <kriszyp@gmail.com> wrote:

> I certainly empathize with the idea of keeping tags as conceptually
> distinct in terms of just adding some extra non-transformational semantic
> description to the underlying CBOR data structure. However, there is a wide
> range of "meanings" in tags, and realistically, with many, or even most
> tags, you can't really "make sense" of a payload without the "semantic
> meaning"; the meaning is what provides the direction for how to make sense
> of the data. For many tags, something needs to understand the data beyond
> just a raw CBOR structure.
>

But it should be possible for applications to agree in advance on the
meaning of the data and still be able to interpret it in the absence of
tags. Embedding an extra number in the tag value via a tag range
undermines exactly that ability.

Your proposed tag range of 57344 - 57599 could simply be replaced by a
single tag followed by an array of size 2. The first element of that array
would be the record ID that would otherwise be encoded via the tag value.
In this way, an application expecting a record set would still be able to
interpret the data in the absence of tags, because the record IDs appear
explicitly in the data.
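
To make the comparison concrete, here is a minimal Python sketch of the two
encodings. It is only an illustration under my own assumptions: the record
content is taken to be a plain array of field values, the "single tag"
number is a placeholder (I reuse 57344, no such tag is registered for this
purpose), and `Tagged`, `encode`, and `strip_tags` are hypothetical helpers,
not part of any existing library.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Tagged:
    """A CBOR tag number together with its enclosed data item."""
    tag: int
    value: Any


def head(major: int, arg: int) -> bytes:
    """Encode a CBOR initial byte plus its argument in the shortest form."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for extra, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < (1 << (8 * size)):
            return bytes([(major << 5) | extra]) + arg.to_bytes(size, "big")
    raise ValueError("argument too large")


def encode(item: Any) -> bytes:
    """Encode only what this sketch needs: unsigned ints, arrays, tags."""
    if isinstance(item, Tagged):
        return head(6, item.tag) + encode(item.value)
    if isinstance(item, int) and item >= 0:
        return head(0, item)
    if isinstance(item, list):
        return head(4, len(item)) + b"".join(encode(x) for x in item)
    raise TypeError(f"unsupported item: {item!r}")


def strip_tags(item: Any) -> Any:
    """What a tag-unaware consumer sees: tags dropped, content kept."""
    if isinstance(item, Tagged):
        return strip_tags(item.value)
    if isinstance(item, list):
        return [strip_tags(x) for x in item]
    return item


RANGE_BASE = 57344   # first tag of the proposed range
SINGLE_TAG = 57344   # placeholder for a hypothetical single "record" tag

record_id = 5
fields = [1, 2]

# Tag-range style: the record ID is folded into the tag number.
ranged = Tagged(RANGE_BASE + record_id, fields)
# Single-tag style: the record ID is the first element of a 2-element array.
single = Tagged(SINGLE_TAG, [record_id, fields])

print(encode(ranged).hex(), strip_tags(ranged))
print(encode(single).hex(), strip_tags(single))
```

With these inputs the single-tag form costs two extra bytes, and once the
tags are stripped it is the only one of the two that still carries the
record ID.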

Section 3.4 of the spec (RFC 8949) says: "If a tag requires further structure
to its content, this structure is *provided by the enclosed data item*."
(emphasis mine). Tags that rely on a tag range to convey additional data
violate this clause.


> Without the semantic meaning, there isn't data loss; raw CBOR structures
> could always be transcoded to JSON or anything else (with conventions for
> tags and such)
>

Well, that's the problem: there is no convention for tags in JSON. A
CBOR-to-JSON converter now needs to understand your record set tags, and
generate a modified structure for JSON that contains the record IDs that
would otherwise be lost.

This practice of using a tag range to avoid the few measly bytes of a
two-element tuple means that once an application adopts these tags, it can
never switch to an encoding other than CBOR.

I'm afraid CBOR can no longer be considered the binary equivalent of
JSON plus byte arrays. Instead of enhancing data with semantic hints, tags
are effectively extending the set of data types. This is what I realized
with signed CBOR bignums, which also rely on a tag range (a range of two):
they are now a new data type that my decoder needs to interpret if I don't
want the application layer to be concerned with tags.
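
As a rough illustration of the bignum case: tags 2 and 3 enclose the same
kind of byte string, and only the tag number tells the decoder whether the
value is n or -1 - n. The function name `decode_bignum` below is my own,
purely for this sketch.

```python
def decode_bignum(tag: int, payload: bytes) -> int:
    """Interpret the content of CBOR tag 2 or 3 as a Python int.

    Tag 2: unsigned bignum, value is n.
    Tag 3: negative bignum, value is -1 - n.
    In both cases n is the payload read as a big-endian unsigned integer.
    """
    n = int.from_bytes(payload, "big")
    if tag == 2:
        return n
    if tag == 3:
        return -1 - n
    raise ValueError(f"tag {tag} is not a bignum tag")


# Identical payload bytes, different tag number, different value.
payload = bytes.fromhex("010000000000000000")
print(decode_bignum(2, payload))
print(decode_bignum(3, payload))
```

A consumer that ignores the tag is left with a bare byte string and no way
to recover the sign, which is why the decoder itself ends up having to
understand both tags.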