Re: [Cbor] "ordered hash"

Kio Smallwood <kio@mothers-arms.co.uk> Thu, 30 July 2020 09:22 UTC

Return-Path: <kio@mothers-arms.co.uk>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A35BB3A100E for <cbor@ietfa.amsl.com>; Thu, 30 Jul 2020 02:22:54 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.895
X-Spam-Level:
X-Spam-Status: No, score=-1.895 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id mrptl4F561Ho for <cbor@ietfa.amsl.com>; Thu, 30 Jul 2020 02:22:52 -0700 (PDT)
Received: from authenticated.a-painless.mh.aa.net.uk (a-painless.mh.aa.net.uk [IPv6:2001:8b0:0:30::51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 08A803A0AE6 for <cbor@ietf.org>; Thu, 30 Jul 2020 02:22:51 -0700 (PDT)
Received: from a-webmail.thn.aa.net.uk ([2001:8b0:62::22] helo=webmail.aa.net.uk) by a-painless.mh.aa.net.uk with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from <kio@mothers-arms.co.uk>) id 1k14m2-0006bJ-9v; Thu, 30 Jul 2020 10:22:50 +0100
Received: from cpc104502-sgyl40-2-0-cust104.18-2.cable.virginm.net ([82.34.80.105]) by webmail.aa.net.uk with HTTP (HTTP/1.1 POST); Thu, 30 Jul 2020 10:22:50 +0100
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="=_0a6161064199be81b572e7d78ec09d8c"
Date: Thu, 30 Jul 2020 10:22:50 +0100
From: Kio Smallwood <kio@mothers-arms.co.uk>
To: Carsten Bormann <cabo@tzi.org>
Cc: cbor@ietf.org
In-Reply-To: <F1783E76-12E9-46CC-BE64-DF90862932DE@tzi.org>
References: <25163.1596045376@localhost> <DA6A1A7E-59E8-42C9-A294-7D23BFD4BF0A@arm.com> <0A0FEABF-ADF2-4342-9AC3-42F768F14CCF@mothers-arms.co.uk> <CB724CD0-52AC-4036-B5BA-70D7F056E4B6@tzi.org> <99a918d0f4cdabc88071a82056efe621@mothers-arms.co.uk> <F1783E76-12E9-46CC-BE64-DF90862932DE@tzi.org>
Message-ID: <3a9dec4d93dcc0957e93a4517297020c@mothers-arms.co.uk>
X-Sender: kio@mothers-arms.co.uk
User-Agent: Roundcube Webmail/1.3.14
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/OpN2qMzpwRtcqDG2p66BnHooYTo>
Subject: Re: [Cbor] "ordered hash"
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 30 Jul 2020 09:22:55 -0000

Yes you can use my text, I'll have to think about what address to use...


Cheers, 

Kio 

On 2020-07-30 10:05, Carsten Bormann wrote:

> Thank you, Kio!
> 
> I think I will write a subsection in 
> https://www.ietf.org/id/draft-bormann-cbor-notable-tags-02.html
> (even if we don't have the actual tags yet), probably 6.4.
> 
> Can I use your text as a contribution (like I did [and still plan to do] with Peter Occil's contribution)?  (I would need some address information that you want to see in a draft/RFC.)
> 
> Grüße, Carsten
> 
> On 2020-07-30, at 10:59, Kio Smallwood <kio@mothers-arms.co.uk> wrote:
> 
> Thanks Carsten!
> 
> I think this idiom is frequent enough that it should get a 1+1 tag.
> Yes I agree.
> 
> The difference from a map is that this is both order-preserving and allowing duplicate keys; whether you want one or two of these properties on the application data model level may differ between applications.
> I've always felt that allowing duplicate keys is unnecessary, does it need to be bundled with order-preserving? Your other example MList<K, V> = [* (K, [* V])] covers this use case nicely.
> 
> FAList<K, V> = [* (K, V)]
> Yes a flattened AList is the main use case. Allocating tags for the other structures at the same time makes good sense.
> 
> Thanks,
> 
> Kio
> 
> On 2020-07-30 5:08, Carsten Bormann wrote:
> 
> On 2020-07-30, at 01:44, Kio Smallwood <kio@mothers-arms.co.uk> wrote: 
> I wrote a proposal for tag 272 ordered map.
> 
> https://github.com/Sekenre/cbor-ordered-map-spec 
> (272 is already taken: Non-UTF-8 CESU-8 string.)
> 
> Unfortunately no one has approved the tag allocation or given me any guidance on how to proceed. 
> Let's look into this.
> 
> I think this idiom is frequent enough that it should get a 1+1 tag.
> 
> The difference from a map is that this is both order-preserving and allowing duplicate keys; whether you want one or two of these properties on the application data model level may differ between applications.
> 
> The traditional alist (association list) is
> 
> AList<K, V> = [* [K, V]]
> 
> By using the exact structure of a map (except that the array head will have double the argument that the map head has), we can save 1 byte per k/v pair ("flattened alist"):
> 
> FAList<K, V> = [* (K, V)]
> 
> So [["a", 1], ["b", 2]] becomes ["a", 1, "b", 2]
> 
> I'm not sure which of these MCR was inquiring on.  The first is definitely an alist, and there is no need for a new term.  The second can be called an ordered multimap, but there are other structures that also deserve this name.  
> 
> If adjacent duplicate keys of non-trivial size are likely in an ordered multimap, a better version of the multimap is:
> 
> MList<K, V> = [* (K, [* V])]
> 
> So [["aaa", 1], ["aaa", 2], ["bbb", 3]] becomes ["aaa", [1, 2], "bbb", [3]]
> 
> If V can never be an array (*), this can be further optimized into:
> 
> OMList<K, V> = [* (K, [2* V] / V)]
> 
> So [["aaa", 1], ["aaa", 2], ["bbb", 3]] becomes ["aaa", [1, 2], "bbb", 3]
> 
> Further refine by using delta coding for the keys (works best with sorted keys), etc.  Also, a map can be used for the outer structure if order preservation is not actually needed:
> 
> MMap<K, V> = {* K => [* V]}
> OMMap<K, V> = {* K => [2* V] / V}
> 
> There are many idioms to choose from; it would probably be best to define tags for at least each of the ones listed here.  I think FAList (what Kio proposed) is the one that will see the most use.
> 
> Grüße, Carsten
> 
> (*) Or extra-wrap just arrays:
> OEMList<K, V> = [* (K, [2* V] / [V .and [* any]] / V)]

_______________________________________________
CBOR mailing list
CBOR@ietf.org
https://www.ietf.org/mailman/listinfo/cbor