Re: [Cbor] "ordered hash"

Kio Smallwood <kio@mothers-arms.co.uk> Thu, 30 July 2020 09:05 UTC

Return-Path: <kio@mothers-arms.co.uk>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id B7AF83A0FEF for <cbor@ietfa.amsl.com>; Thu, 30 Jul 2020 02:05:11 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.629
X-Spam-Level:
X-Spam-Status: No, score=-1.629 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, KHOP_HELO_FCRDNS=0.267, SPF_HELO_NONE=0.001, SPF_NONE=0.001, URIBL_BLOCKED=0.001] autolearn=no autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ZLz_5lzinnH0 for <cbor@ietfa.amsl.com>; Thu, 30 Jul 2020 02:05:09 -0700 (PDT)
Received: from authenticated.a-painless.mh.aa.net.uk (a-painless.mh.aa.net.uk [IPv6:2001:8b0:0:30::51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id ADBCD3A0FED for <cbor@ietf.org>; Thu, 30 Jul 2020 02:05:09 -0700 (PDT)
Received: from a-webmail.thn.aa.net.uk ([2001:8b0:62::22] helo=webmail.aa.net.uk) by a-painless.mh.aa.net.uk with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from <kio@mothers-arms.co.uk>) id 1k14Uu-0004sW-4u; Thu, 30 Jul 2020 10:05:08 +0100
Received: from cpc104502-sgyl40-2-0-cust104.18-2.cable.virginm.net ([82.34.80.105]) by webmail.aa.net.uk with HTTP (HTTP/1.1 POST); Thu, 30 Jul 2020 10:05:07 +0100
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="=_4e920ad0342e196f1e9433448e6820f0"
Date: Thu, 30 Jul 2020 10:05:07 +0100
From: Kio Smallwood <kio@mothers-arms.co.uk>
To: Alexander Pelov <a@ackl.io>
Cc: cbor@ietf.org
In-Reply-To: <CACQW0ErMQtzKF_tTo6a6A5eGUL9BGv7EBsKBWvayrDHBOZvzEg@mail.gmail.com>
References: <25163.1596045376@localhost> <DA6A1A7E-59E8-42C9-A294-7D23BFD4BF0A@arm.com> <0A0FEABF-ADF2-4342-9AC3-42F768F14CCF@mothers-arms.co.uk> <CB724CD0-52AC-4036-B5BA-70D7F056E4B6@tzi.org> <CACQW0ErMQtzKF_tTo6a6A5eGUL9BGv7EBsKBWvayrDHBOZvzEg@mail.gmail.com>
Message-ID: <5f66b697f4749fd0575e7548141f90cd@mothers-arms.co.uk>
X-Sender: kio@mothers-arms.co.uk
User-Agent: Roundcube Webmail/1.3.14
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/SSF5trC6JKYIf14ub4YoeoWC40k>
Subject: Re: [Cbor] "ordered hash"
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 30 Jul 2020 09:05:12 -0000

Hi Alexander, 

Thanks for this, it's good to see people thinking in similar directions!


Cheers, 

Kio 

On 2020-07-30 8:46, Alexander Pelov wrote:

> Hi all, 
> 
> Several years ago we had this type of discussion for the use of COMI/CORECONF. We called it in a couple of different ways:  
> - DeterministicMultimaps 
> - COMAP (CBOR Order-preserving Multimap) 
> - Order-Keeping Multimap (OKMM) 
> 
> Unfortunately, I don't think we published any open specs outside the Task Force (although we did have quite some discussion at CORE WG). 
> 
> One of the latest specs we had is attached (sorry for the PDF).  
> 
> Cheers, 
> Alexander 
> 
> On Thu, Jul 30, 2020 at 6:08 AM Carsten Bormann <cabo@tzi.org> wrote: 
> 
>> On 2020-07-30, at 01:44, Kio Smallwood <kio@mothers-arms.co.uk> wrote:
>>> 
>>> I wrote a proposal for tag 272 ordered map.
>>> 
>>> https://github.com/Sekenre/cbor-ordered-map-spec
>> 
>> (272 is already taken: Non-UTF-8 CESU-8 string.)
>> 
>>> Unfortunately no one has approved the tag allocation or given me any guidance on how to proceed.
>> 
>> Let's look into this.
>> 
>> I think this idiom is frequent enough that it should get a 1+1 tag.
>> 
>> The difference from a map is that this is both order-preserving and allowing duplicate keys; whether you want one or two of these properties on the application data model level may differ between applications.
>> 
>> The traditional alist (association list) is
>> 
>> AList<K, V> = [* [K, V]]
>> 
>> By using the exact structure of a map (except that the array head will have double the argument that the map head has), we can save 1 byte per k/v pair ("flattened alist"):
>> 
>> FAList<K, V> = [* (K, V)]
>> 
>> So [["a", 1], ["b", 2]] becomes ["a", 1, "b", 2]
>> 
>> I'm not sure which of these MCR was inquiring on.  The first is definitely an alist, and there is no need for a new term.  The second can be called an ordered multimap, but there are other structures that also deserve this name.  
>> 
>> If adjacent duplicate keys of non-trivial size are likely in an ordered multimap, a better version of the multimap is:
>> 
>> MList<K, V> = [* (K, [* V])]
>> 
>> So [["aaa", 1], ["aaa", 2], ["bbb", 3]] becomes ["aaa", [1, 2], "bbb", [3]]
>> 
>> If V can never be an array (*), this can be further optimized into:
>> 
>> OMList<K, V> = [* (K, [2* V] / V)]
>> 
>> So [["aaa", 1], ["aaa", 2], ["bbb", 3]] becomes ["aaa", [1, 2], "bbb", 3]
>> 
>> Further refine by using delta coding for the keys (works best with sorted keys), etc.  Also, a map can be used for the outer structure if order preservation is not actually needed:
>> 
>> MMap<K, V> = {* K => [* V]}
>> OMMap<K, V> = {* K => [2* V] / V}
>> 
>> There are many idioms to choose from; it would probably be best to define tags for at least each of the ones listed here.  I think FAList (what Kio proposed) is the one that will see the most use.
>> 
>> Grüße, Carsten
>> 
>> (*) Or extra-wrap just arrays:
>> OEMList<K, V> = [* (K, [2* V] / [V .and [* any]] / V)]
>> 
>> _______________________________________________
>> CBOR mailing list
>> CBOR@ietf.org
>> https://www.ietf.org/mailman/listinfo/cbor