Re: [Cbor] Consolidated set of tags for map-like entities

Emile Cormier <emile.cormier.jr@gmail.com> Tue, 16 March 2021 23:27 UTC

From: Emile Cormier <emile.cormier.jr@gmail.com>
Date: Tue, 16 Mar 2021 20:27:04 -0300
Message-ID: <CAM70yxA7iCKHRxHqvAt404xYU5f69Grun4rpTBD98+1HzZWc=w@mail.gmail.com>
To: Kio Smallwood <kio@mothers-arms.co.uk>
Cc: cbor@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/AVQDbN5B-1TdQQYziEG9tHcPw6g>
Subject: Re: [Cbor] Consolidated set of tags for map-like entities

I've updated the GitHub repo with a new section describing how the tags
relate to programming language containers. I only have JavaScript, Python,
and C++ there for now. I'm not fluent in Python and had to consult online
documentation. It's entirely possible I got things wrong in the Python
subsection.

I have the following new thoughts:

1. We should consider adding similar tags for lists of discrete values (not
key-value pairs). For example, the homogeneity, ordering, and uniqueness
properties could be associated with programming language lists, sets, and
multisets. I think this would add eight additional tags, for a total of 20.

2. I'd like to replace the terms "homogeneous/heterogeneous" with the terms
"uniform/arbitrary" for better readability.

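As a rough check on the count in point 1: three binary properties give 2^3 = 8 combinations, which matches the eight additional tags. Here is a minimal Python sketch that enumerates them. The property names and labels are illustrative assumptions of mine, not identifiers from the proposal.

```python
from itertools import product

# Hypothetical property names and labels for list-like entities; the
# proposal does not define these identifiers.
PROPERTIES = {
    "homogeneity": ("uniform", "arbitrary"),
    "ordering": ("ordered", "unordered"),
    "uniqueness": ("unique", "duplicates allowed"),
}


def list_like_variants():
    """Return every combination of the three binary properties."""
    names = tuple(PROPERTIES)
    return [dict(zip(names, combo)) for combo in product(*PROPERTIES.values())]


variants = list_like_variants()
print(len(variants))  # 8
```

Loosely speaking, the (ordered, duplicates allowed) combinations would correspond to lists, (unordered, unique) to sets, and (unordered, duplicates allowed) to multisets, but that mapping is only an approximation.
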
On Tue, Mar 16, 2021 at 3:38 PM Emile Cormier <emile.cormier.jr@gmail.com>
wrote:

> RFC 8746 has this to say about homogeneous arrays:
>
>> Which CBOR data items constitute elements of the same application type is
>> specific to the application.
>
> I think this interpretation of "homogeneous" should be the same for this
> proposal.
>
> On Tue, Mar 16, 2021 at 2:36 PM Emile Cormier <emile.cormier.jr@gmail.com>
> wrote:
>
>> Hi Kio,
>>
>> I'll add an informational section on how the various tags relate to
>> programming language data structures. I could start with JS and C++ data
>> structures, and allow other folks to send or PR me the ones for other
>> languages.
>>
>> For homogeneous key/values, it was actually my intention that the first
>> key/value encountered "establishes" the type for the remainder. A decoder
>> could reject a homogeneous map where a key/value does not match the type
>> established by the first.
>>
>> You've made me realize that the interpretation for "homogeneous" could be
>> tricky for numeric types. For example, if the first homogeneous key is a
>> positive integer, can the decoder assume that the following keys may also
>> include negative integers and floats? I'll check what it says for
>> homogeneous arrays in https://tools.ietf.org/html/rfc8746 . It may have
>> already been settled there.
>>
>> This also raises the question: what does "homogeneous" mean for key/value
>> types that are themselves arrays or maps?
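
The "first pair establishes the type" rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the notion of "application type" is passed in by the caller (in line with the RFC 8746 statement that it is application-specific), and for simplicity the sketch checks both keys and values.

```python
def check_homogeneous(pairs, app_type=type):
    """Raise ValueError unless all keys share one type and all values share one.

    `pairs` is an iterable of (key, value) tuples, e.g. from a decoder.
    `app_type` maps an item to its application type. The default, Python's
    built-in `type`, is an assumption; it treats int and float as different.
    """
    key_type = value_type = None
    for index, (key, value) in enumerate(pairs):
        if index == 0:
            # The first pair establishes the expected types.
            key_type, value_type = app_type(key), app_type(value)
            continue
        if app_type(key) != key_type:
            raise ValueError(f"key of pair {index} does not match the first key")
        if app_type(value) != value_type:
            raise ValueError(f"value of pair {index} does not match the first value")


def numeric_or_type(item):
    """Hypothetical application type that lumps ints and floats together."""
    if isinstance(item, (int, float)) and not isinstance(item, bool):
        return "number"
    return type(item)


# Accepted under either notion of type: integer keys, string values.
check_homogeneous([(1, "a"), (2, "b")])

# Accepted only when ints and floats count as the same application type.
check_homogeneous([(1, "a"), (-2.5, "b")], app_type=numeric_or_type)
```

With the default `app_type`, the second call would raise `ValueError`, which is exactly the numeric ambiguity raised above. Nested arrays or maps would need an `app_type` that looks inside the item, and this sketch does not attempt that.
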
>>
>> Thanks for the feedback!
>> Emile
>>
>> On Mon, Mar 15, 2021 at 4:36 PM Kio Smallwood <kio@mothers-arms.co.uk>
>> wrote:
>>
>>> Hi Emile,
>>>
>>> Thanks for enumerating all of the possibilities. I have a few thoughts:
>>>
>>> * Would it be worth noting which programming languages and environments
>>> already make use of each type? For example, users of the cbor2 Python
>>> library would only see tag 128 (equivalent to major type 5), tags 129 and
>>> 130 for ordered maps and multimaps, and tag 132 when interoperating with
>>> JavaScript/JSON. The rest would be very strange to encounter in a Python
>>> data structure.
>>>
>>> * For the homogeneous keys/values, should we (codec implementers) assume
>>> that the type of the first key/value encountered in the data item
>>> determines the type of the remainder?
>>>
>>> Thanks,
>>>
>>> Kio
>>>
>>>
>>> On 2021-03-12 23:50, Emile Cormier wrote:
>>>
>>> Hi Everyone,
>>>
>>> I've been corresponding privately with Prof. Bormann about a new tag I
>>> proposed for encoding multimaps as arrays of pairs. He's made me aware of
>>> your recent discussion about another proposed tag for encoding "ordered"
>>> maps, where the insertion order must be preserved. Carsten has indicated to
>>> me his desire to consolidate all of these map-related tags together.
>>>
>>> I would like to contribute to this endeavor of consolidating map-related
>>> tags. To get the discussion started, I would like to share with you a draft
>>> proposal I've written to serve as a starting point:
>>> https://github.com/ecorm/cbor-map-like
>>>
>>> Here's a brief summary:
>>>
>>>
>>>>
>>>> This document proposes a consolidated set of CBOR tags for map-like
>>>> entities involving key-value pairs. These tags encode the following
>>>> meta-data concerning map-like entities:
>>>>
>>>>    - the homogeneity of the key and value types,
>>>>    - the preservation of the insertion order of the key-value pairs,
>>>>    - the uniqueness of the keys, and,
>>>>    - the major type used to encode the key-value pairs.
>>>>
>>>>
>>> My interest in CBOR stems from my work on a new C++ serialization
>>> library that can support any JSON-like encoding. This library is closed
>>> source for now, but I may be able to convince my employer to make it open
>>> source in the future.
>>>
>>> I'm also involved with the WAMP protocol, which allows CBOR as one of
>>> its encodings.
>>>
>>> I'm looking forward to collaborating with you and I hope that we can
>>> come to a solution that satisfies everyone's needs.
>>>
>>> Cheers,
>>> Emile Cormier
>>>