Re: [Cbor] Record proposal
Kris Zyp <kriszyp@gmail.com> Thu, 18 November 2021 05:53 UTC
References: <CAEs2a6vNhrHhaiPUNtbJ68WYfbrprETPr+kmWNJgNXMSawyBig@mail.gmail.com> <E3F121DA-95EE-43C6-BC72-E3763C034944@tzi.org> <CAEs2a6uZrT9FFP6qa+hPV2sYO0y+xJJmLaF-pPoynE2vqspfBg@mail.gmail.com> <CAEs2a6uHyvvghAMCN=UmhpJMoiES7zoPmGi-bATZWXgjA068Mg@mail.gmail.com> <YY0B4YxuMuw20umu@hephaistos.amsuess.com> <CAEs2a6utA=GQSx2Ln=5wnoNdS6z+0ExdCcfNXG6cAg=1MxnT=w@mail.gmail.com> <901541DC-A520-44CD-AA8D-F2CE77F03FA0@tzi.org> <CAEs2a6sZd4s-DJ3R_M4BLwO12s8i2AGfv0yXCaWdy+baOuAEqw@mail.gmail.com> <8CA1A63D-70B5-4109-ABE7-9CF9197F0375@tzi.org> <CAEs2a6uTKJ1DOTjREjKaRSY6kNAHSof97OoRAZbjDWOazLQC+A@mail.gmail.com> <CAEs2a6tY02haauD4OL18fp15Zet2bqkq+xVzEvAEiK5cvTpy2w@mail.gmail.com> <5C7719D8-8DCB-41BE-9111-882A02D43506@tzi.org>
In-Reply-To: <5C7719D8-8DCB-41BE-9111-882A02D43506@tzi.org>
From: Kris Zyp <kriszyp@gmail.com>
Date: Wed, 17 Nov 2021 22:52:45 -0700
Message-ID: <CAEs2a6vVL9_wvrbwske80m5P5Y1xKw6_ecitDL9uybf2TsvWHw@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: Christian Amsüss <christian@amsuess.com>, cbor@ietf.org
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/cKzPHg6fJa4sbHjXEdgu5cfgvoM>
Subject: Re: [Cbor] Record proposal
That sounds reasonable to me. To clarify a little of the rationale for this: the purpose of a global registry, as I understand it, is to define the global tag ids that need to be known by independent encoders and decoders prior to their encoding and decoding interaction. However, ids that can be communicated within a message for the scope of that message don't require global allocation, if they are only used within a specified scope. This is just like in a programming language: a variable from an inner block may temporarily shadow a variable from an outer block or the global scope within its block, without affecting the outer or global scope (and this doesn't require any coordination with a registry of globals). And a decoder that really implements this proposal would, by definition, implement this behavior of temporary tag reassignment/shadowing (i.e. a conformant record implementation wouldn't see gobbledygook).

Also, allowing dynamic/temporary allocation permits potential use of shorter/more efficient tag ids based on known use of other tags. Record references may occur much more frequently in a data item/document than record definitions, and so are much more sensitive to size differences.

Furthermore, this also sidesteps the issue of having to figure out the appropriate size of a range of ids to globally allocate. How many ids should be allocated? In terms of percentage of available space, registering/using 256 ids from the tag 1+2 range seems effectively equivalent to registering 1 id from the tag 1+1 range (and tag 1+4 would definitely be undesirable due to the high frequency and sensitivity to size). And I have had users of my implementations say they need a few hundred structures to be defined and referenceable, so even 256 may be somewhat limiting to some users.
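To make the range comparison above concrete, here is a minimal Python sketch of the arithmetic. It assumes shortest-form (preferred) encoding of the tag head as in RFC 8949; the names `RANGES`, `tag_head_size`, and `range_share` are made up for illustration and are not part of the proposal or of any library.

```python
# Tag number ranges by head size, assuming shortest-form encoding (RFC 8949):
# "1+N" means one initial byte plus N following bytes for the tag number.
RANGES = {
    "1+0": (0, 23),
    "1+1": (24, 0xFF),
    "1+2": (0x100, 0xFFFF),
    "1+4": (0x10000, 0xFFFFFFFF),
    "1+8": (0x100000000, 0xFFFFFFFFFFFFFFFF),
}

def tag_head_size(tag: int) -> int:
    """Bytes needed to encode the head of a tag with this tag number."""
    for name, (lo, hi) in RANGES.items():
        if lo <= tag <= hi:
            return 1 + int(name.split("+")[1])
    raise ValueError(f"tag number out of range: {tag}")

def range_share(count: int, name: str) -> float:
    """Fraction of the named range consumed by `count` tag numbers."""
    lo, hi = RANGES[name]
    return count / (hi - lo + 1)

if __name__ == "__main__":
    # One id out of the 232 numbers in the 1+1 range.
    print(f"1 id in 1+1:    {range_share(1, '1+1'):.2%}")
    # 256 ids out of the 65280 numbers in the 1+2 range.
    print(f"256 ids in 1+2: {range_share(256, '1+2'):.2%}")
    # Per-reference cost of a tag above 32768 versus one above 65535.
    print(tag_head_size(32768), tag_head_size(65536))
```

Under these assumptions, both allocations take up roughly 0.4% of their respective ranges, while moving a frequently used record reference from the 1+2 range to the 1+4 range would add two bytes to every reference.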
But that being said, if you are thinking that it may be too onerous for (record) implementers to support this type of potentially dynamic/temporary tag reassignment with possibly conflicting ids (as opposed to using a more stable table of tags or tag ranges), I can understand that concern. What size of chunk of tag ids (above 32768) do you think would be appropriate to allocate? Would a chunk of 256 be reasonable?

Anyway, thank you for your help and feedback!

Thanks,
Kris

On Wed, Nov 17, 2021 at 1:52 PM Carsten Bormann <cabo@tzi.org> wrote:
>
> Hi Kris,
>
> thank you for the update.
> I still need to take a closer look, but I have one immediate reaction:
> You probably shouldn’t induce implementations to “hijack” a tag number.
> The handling of a tag number that has actually been allocated for something else may be deeply embedded into the CBOR decoder; a record implementation then would only see gobbledygook.
> So I think we should allocate some space for tag-allocated tags — I don’t see a big problem with allocating a chunk above 32768 just for the record proposal.
>
> Grüße, Carsten
>
>
>
> On 2021-11-17, at 16:16, Kris Zyp <kriszyp@gmail.com> wrote:
> >
> > I have updated my tag registration proposal/submission at
> > https://github.com/kriszyp/cbor-records to use 1+2 tag entries, which
> > hopefully makes this a much less intrusive registry entry. I have also
> > updated the proposed tag definitions to also support up-front
> > declaration of a set of record structure definitions for a data item
> > ("around" that data item, like the packed approach, as you suggested),
> > in addition to the current inline record definitions (which can be
> > scoped with record definitions and should follow a well-specified
> > order for when they can be referenced by subsequent references). I
> > hope this offers flexibility for encoders that have all structures
> > known a-priori and streaming encoders (that do not), while still
> > maintaining nearly the same mechanics for decoders. Let me know if you
> > think this looks reasonable.
> >
> > Thank you!
> > Kris
> >
> > On Thu, Nov 11, 2021 at 9:15 PM Kris Zyp <kriszyp@gmail.com> wrote:
> >>
> >>> Actually, stats would be very interesting.
> >>> I was assuming that the 1+1 setup comes with a number of 1+2 referencing records, the hit from going to 1+2 there as well would be relatively insignificant.
> >>> Number are better than assumptions!
> >>
> >> You are definitely right, using 1+2 tag for defining records is pretty
> >> insignificant (around a quarter of percent in my tests). Anyway, I put
> >> together some tests comparing CBOR packed, record structures, and
> >> combinations with a couple test data structures, with my library, for
> >> the sake of further comparisons:
> >> https://gist.github.com/kriszyp/b623b85d2dc25ac9e3b07d8f39df9307
> >>
> >> Anyway, seeing this, I am happy to go ahead and update my proposal to
> >> use a 1+2 tag for defining records. And thinking about this, I don't
> >> think my proposal necessarily even needs to mandate the tag ids used
> >> for referencing records since those are dynamically assigned and
> >> explicitly specified by the encoder itself (encoder obviously must not
> >> conflict and use tag ids that will be used for other purposes), albeit
> >> can encourage a certain range (presumably from the first-come
> >> first-serve range).
> >>
> >> Do you have any preference for a tag id to use? It looks like 279 is
> >> the next in the contiguous block, but sounds like choosing aesthetic
> >> characters is the new preference (29299/"rs" perhaps).
> >>
> >> Anyway, thanks again for the helpful feedback, really appreciate it!
> >>
> >> Thanks,
> >> Kris
> >
> > _______________________________________________
> > CBOR mailing list
> > CBOR@ietf.org
> > https://www.ietf.org/mailman/listinfo/cbor
>