Re: [Cbor] Record proposal

Emile Cormier <emile.cormier.jr@gmail.com> Wed, 01 December 2021 23:57 UTC

From: Emile Cormier <emile.cormier.jr@gmail.com>
Date: Wed, 1 Dec 2021 19:57:23 -0400
Message-ID: <CAM70yxB_YaRWccUk_UfLgxwd1gSUNxkDmaWfh-15wEiXsVe9Ng@mail.gmail.com>
To: Kris Zyp <kriszyp@gmail.com>
Cc: Carsten Bormann <cabo@tzi.org>, =?UTF-8?Q?Christian_Ams=C3=BCss?= <christian@amsuess.com>, cbor@ietf.org
Content-Type: multipart/alternative; boundary="00000000000010b07c05d21e70ac"
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/-JmHqlXMiCvHaHt4EGYNbSUSVu0>
Subject: Re: [Cbor] Record proposal

I'd like to point out that this proposal as it's currently written will
result in loss of information with CBOR decoders that don't understand the
tags, or when the data elements are transcoded to another protocol such as
JSON. This is due to the tags carrying information (record IDs) that the
application needs in order to make sense of the received payload. The
proposed tags do more than provide "semantic meaning"; they also carry
information.

This practice of bundling numeric information into the tags makes these
CBOR items non-interoperable when they are transcoded to JSON (think web
browser), or when a CBOR decoder doesn't understand the tags and doesn't
propagate them up to the application.
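
To make the concern concrete, here is a minimal sketch in Python. It is
not taken from the proposal: the Tagged class, the strip_unknown_tags
helper, the two tag numbers (picked from the 57342-57599 range mentioned
below) and the key names in the comments are all hypothetical, and the
stripping step only imitates what a generic decoder or JSON transcoder
might do with tags it does not recognize.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Tagged:
    """Toy stand-in for a CBOR tagged data item (tag number + content)."""
    tag: int
    value: Any


def strip_unknown_tags(item: Any, known_tags: frozenset = frozenset()) -> Any:
    """Imitate a decoder/transcoder that discards tags it does not know."""
    if isinstance(item, Tagged):
        inner = strip_unknown_tags(item.value, known_tags)
        # A known tag is preserved; an unknown one is dropped, content kept.
        return Tagged(item.tag, inner) if item.tag in known_tags else inner
    if isinstance(item, list):
        return [strip_unknown_tags(x, known_tags) for x in item]
    if isinstance(item, dict):
        return {k: strip_unknown_tags(v, known_tags) for k, v in item.items()}
    return item


# Two record references with the same values but different record ids.
# The tag numbers and their meanings here are assumptions for illustration.
point = Tagged(57344, [3, 4])  # hypothetically: keys ("x", "y")
size = Tagged(57345, [3, 4])   # hypothetically: keys ("width", "height")

as_json_point = json.dumps(strip_unknown_tags(point))
as_json_size = json.dumps(strip_unknown_tags(size))

print(as_json_point)
print(as_json_size)
```

Under these assumptions both items serialize to the same JSON array, so
the receiver can no longer tell which record structure the values belong
to: the record id lived only in the tag number.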


On Mon, Nov 29, 2021 at 9:35 AM Kris Zyp <kriszyp@gmail.com> wrote:

> To follow up, I did update the proposal/spec to use the tag id range
> of 57342-57599. Let me know if you would prefer different ids.
> Thanks,
> Kris
>
> On Wed, Nov 17, 2021 at 10:52 PM Kris Zyp <kriszyp@gmail.com> wrote:
> >
> > That sounds reasonable to me.
> >
> > To clarify a little bit of the rationale for this:
> > The purpose of a global registry, as I understand it, is to define the
> > global tag ids that need to be known by independent encoders and
> > decoders prior to their encoding and decoding interaction. However,
> > ids that can be communicated within a message for the scope of that
> > message, don't require global allocation, if they are only used within
> > a specified scope. This is just like in a programming language: a
> > variable from an inner block may shadow another variable from an outer
> > block or a global scope temporarily within its block without affecting
> > the outer or global scope (this doesn't require any coordination with
> > a registry of globals). And a decoder that really implements this
> > proposal would, by definition, be implementing this behavior of
> > temporary tag reassignment/shadowing, if it really is a true record
> > implementation (i.e. a real record implementation wouldn't see
> > gobbledygook if it is conformant).
> >
> > Also allowing dynamic/temporary allocation permits potential use of
> > shorter/more efficient tag ids based on known use of other tags, and
> > in the case of record references, these may occur much more frequently
> > in a data item/document compared to record definitions, and be much more
> > sensitive to size differences. Furthermore, this also sidesteps issues
> > with having to figure out the appropriate size of a range of ids to
> > globally allocate. How many ids should be allocated? In terms of
> > percentage of available space, registering/using 256 ids from the tag
> > 1+2 range seems like it is effectively equivalent to registering 1 id
> > from the tag 1+1 range (and tag 1+4 would definitely be undesirable
> > due to the high frequency and sensitivity to size). And I have had
> > users of my implementations say they need a few hundred structures to
> > be defined and referenceable, so even 256 may be somewhat limiting to
> > some users.
> >
> > But that being said, if you are thinking that it may be too onerous
> > for (record) implementers to support this type of potentially
> > dynamic/temporary tag reassignment with possibly conflicting ids (as
> > opposed to using a more stable table of tags or tag ranges), I can
> > understand that concern. What size of chunk of tag ids (above 32768)
> > do you think would be appropriate to allocate? Would a chunk of 256 be
> > reasonable?
> >
> > Anyway, thank you for your help and feedback!
> > Thanks,
> > Kris
> >
> >
> >
> > On Wed, Nov 17, 2021 at 1:52 PM Carsten Bormann <cabo@tzi.org> wrote:
> > >
> > > Hi Kris,
> > >
> > > thank you for the update.
> > > I still need to take a closer look, but I have one immediate reaction:
> > > You probably shouldn’t induce implementations to “hijack” a tag number.
> > > The handling of a tag number that has actually been allocated for
> > > something else may be deeply embedded into the CBOR decoder; a record
> > > implementation then would only see gobbledygook.
> > > So I think we should allocate some space for tag-allocated tags; I
> > > don’t see a big problem with allocating a chunk above 32768 just for the
> > > record proposal.
> > >
> > > Grüße, Carsten
> > >
> > >
> > > > On 2021-11-17, at 16:16, Kris Zyp <kriszyp@gmail.com> wrote:
> > > >
> > > > I have updated my tag registration proposal/submission at
> > > > https://github.com/kriszyp/cbor-records to use 1+2 tag entries, which
> > > > hopefully makes this a much less intrusive registry entry. I have also
> > > > updated the proposed tag definitions to also support up-front
> > > > declaration of a set of record structure definitions for a data item
> > > > ("around" that data item, like the packed approach, as you suggested),
> > > > in addition to the current inline record definitions (which can be
> > > > scoped with record definitions and should follow a well-specified
> > > > order for when they can be referenced by subsequent references). I
> > > > hope this offers flexibility for encoders that have all structures
> > > > known a-priori and streaming encoders (that do not), while still
> > > > maintaining nearly the same mechanics for decoders. Let me know if you
> > > > think this looks reasonable.
> > > > Thank you!
> > > > Kris
> > > >
> > > > On Thu, Nov 11, 2021 at 9:15 PM Kris Zyp <kriszyp@gmail.com> wrote:
> > > >>
> > > >>> Actually, stats would be very interesting.
> > > >>> I was assuming that the 1+1 setup comes with a number of 1+2
> > > >>> referencing records, so the hit from going to 1+2 there as well
> > > >>> would be relatively insignificant.
> > > >>> Numbers are better than assumptions!
> > > >>
> > > >> You are definitely right, using a 1+2 tag for defining records is
> > > >> pretty insignificant (around a quarter of a percent in my tests).
> > > >> Anyway, I put together some tests comparing CBOR packed, record
> > > >> structures, and combinations with a couple of test data structures,
> > > >> with my library, for the sake of further comparisons:
> > > >> https://gist.github.com/kriszyp/b623b85d2dc25ac9e3b07d8f39df9307
> > > >>
> > > >> Anyway, seeing this, I am happy to go ahead and update my proposal to
> > > >> use a 1+2 tag for defining records. And thinking about this, I don't
> > > >> think my proposal necessarily even needs to mandate the tag ids used
> > > >> for referencing records since those are dynamically assigned and
> > > >> explicitly specified by the encoder itself (the encoder obviously must
> > > >> not use tag ids that conflict with other purposes), although it
> > > >> can encourage a certain range (presumably from the first-come
> > > >> first-served range).
> > > >>
> > > >> Do you have any preference for a tag id to use? It looks like 279 is
> > > >> the next in the contiguous block, but sounds like choosing aesthetic
> > > >> characters is the new preference (29299/"rs" perhaps).
> > > >>
> > > >> Anyway, thanks again for the helpful feedback, really appreciate it!
> > > >>
> > > >> Thanks,
> > > >> Kris
> > > >
> > > > _______________________________________________
> > > > CBOR mailing list
> > > > CBOR@ietf.org
> > > > https://www.ietf.org/mailman/listinfo/cbor
> > >