Re: [Cbor] "ordered hash"

Alexander Pelov <a@ackl.io> Thu, 30 July 2020 07:46 UTC

Return-Path: <alexander@ackl.io>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 6AE283A0F5D for <cbor@ietfa.amsl.com>; Thu, 30 Jul 2020 00:46:53 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.898
X-Spam-Level:
X-Spam-Status: No, score=-1.898 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=ackl-io.20150623.gappssmtp.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id R23Pbj7MyDiw for <cbor@ietfa.amsl.com>; Thu, 30 Jul 2020 00:46:51 -0700 (PDT)
Received: from mail-io1-xd33.google.com (mail-io1-xd33.google.com [IPv6:2607:f8b0:4864:20::d33]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id D79463A0F59 for <cbor@ietf.org>; Thu, 30 Jul 2020 00:46:50 -0700 (PDT)
Received: by mail-io1-xd33.google.com with SMTP id j8so14943636ioe.9 for <cbor@ietf.org>; Thu, 30 Jul 2020 00:46:50 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ackl-io.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=2sxUXaAC1tvzKwNFw++OdWTE63FePth/UOoVNC6FJi8=; b=Y8JQf2DUW8UM0eSjUATU59KQFtbP1bLcLgaHr8Q6KFSxxpp0O91OQB8kZp4o1Gh5DB oBlCrLa5QA+LR/JoLdTW9GqtbVSWAZqw+caFXK8W6mJz1Inkdg5zCWxFETm6Hit3ZDN8 b9WniJL0fM4FbYesZ+RWv5LGaPujQDeF3chYi/fZayXPIDeeqp0e9WoOwIIVQVMHbfBi uhMtgb14vVx5x0WhfX2NVJfx6WXRMqGQHC8jJv4RbrhI3jf45LswZfFUSYmTLNz009I0 5BN/oJHmmgDR0k+84x70V9AFuxoDqa/rjAEiC9WFE/LNLj+Vhjodavnqb8pKgOV01bSa UQYQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=2sxUXaAC1tvzKwNFw++OdWTE63FePth/UOoVNC6FJi8=; b=SYe1zXvOtOaNElom1ys3GUFTLuZtMLAGNnonD6y96ubP6DBt6Ysd+Pp4C6ma90vMuB X12Y+lHkvhp8mxRg4BjTvuO+OMoUXW6unyoGRiEsXW34wBAN64LeodZuEiut/3hkUtG4 N/J2ZnWYfsD8AsLKk+bIKSYsHSeUwBGhCjOaB/odEO5Ell+EUIoIE2WaqrvZki0lvM/t 5ntFvqVRcdj10GGIxRuvyBMozL1mC+SD+4o9c3TOt2O//myOvA1+tUKsgyxHvIJA9XvA LpGMhsSqW6h5BsCZQQuMVtVsh3yk+TGYG3wkJly2ZjF67HYRZ2U95QPI85B6djouDIKs vvDQ==
X-Gm-Message-State: AOAM533I9tXof4I4zuhtZa2JYJTlkqjv+wnTcBX8fAHKYc5siij6w9Rw niHHwPMg4vgfIu/RVg6kfG0qnEP/6anq2FLYLBijrQ==
X-Google-Smtp-Source: ABdhPJz1PpuKRS8pKIA9cVZDyaPJmzBqk1pgE/nz5+BeY+jldsOhieMlOjnfrzaUltopv/UVfxpBoitmtm5vF46ughU=
X-Received: by 2002:a6b:fe0f:: with SMTP id x15mr31214005ioh.173.1596095209438; Thu, 30 Jul 2020 00:46:49 -0700 (PDT)
MIME-Version: 1.0
References: <25163.1596045376@localhost> <DA6A1A7E-59E8-42C9-A294-7D23BFD4BF0A@arm.com> <0A0FEABF-ADF2-4342-9AC3-42F768F14CCF@mothers-arms.co.uk> <CB724CD0-52AC-4036-B5BA-70D7F056E4B6@tzi.org>
In-Reply-To: <CB724CD0-52AC-4036-B5BA-70D7F056E4B6@tzi.org>
From: Alexander Pelov <a@ackl.io>
Date: Thu, 30 Jul 2020 09:46:16 +0200
Message-ID: <CACQW0ErMQtzKF_tTo6a6A5eGUL9BGv7EBsKBWvayrDHBOZvzEg@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: Kio Smallwood <kio@mothers-arms.co.uk>, cbor@ietf.org, Brendan Moran <Brendan.Moran@arm.com>, Michael Richardson <mcr+ietf@sandelman.ca>
Content-Type: multipart/mixed; boundary="000000000000fd0d9405aba3dfd0"
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/VpQDzpK3ZNNBoc_2E0_keK8SAfA>
Subject: Re: [Cbor] "ordered hash"
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 30 Jul 2020 07:46:54 -0000

Hi all,

Several years ago we had this type of discussion for the use of
COMI/CORECONF. We called it in a couple of different ways:
- DeterministicMultimaps
- COMAP (CBOR Order-preserving Multimap)
- Order-Keeping Multimap (OKMM)

Unfortunately, I don't think we published any open specs outside the Task
Force (although we did have quite some discussion at CORE WG).

One of the latest specs we had is attached (sorry for the PDF).

Cheers,
Alexander



On Thu, Jul 30, 2020 at 6:08 AM Carsten Bormann <cabo@tzi.org> wrote:

> On 2020-07-30, at 01:44, Kio Smallwood <kio@mothers-arms.co.uk> wrote:
> >
> > I wrote a proposal for tag 272 ordered map.
> >
> > https://github.com/Sekenre/cbor-ordered-map-spec
>
> (272 is already taken: Non-UTF-8 CESU-8 string.)
>
> > Unfortunately no one has approved the tag allocation or given me any
> guidance on how to proceed.
>
> Let’s look into this.
>
> I think this idiom is frequent enough that it should get a 1+1 tag.
>
> The difference from a map is that this is both order-preserving and
> allowing duplicate keys; whether you want one or two of these properties on
> the application data model level may differ between applications.
>
> The traditional alist (association list) is
>
> AList<K, V> = [* [K, V]]
>
> By using the exact structure of a map (except that the array head will
> have double the argument that the map head has), we can save 1 byte per k/v
> pair (“flattened alist”):
>
> FAList<K, V> = [* (K, V)]
>
> So [[“a”, 1], [“b”, 2]] becomes [“a”, 1, “b”, 2]
>
> I’m not sure which of these MCR was inquiring on.  The first is definitely
> an alist, and there is no need for a new term.  The second can be called an
> ordered multimap, but there are other structures that also deserve this
> name.
>
> If adjacent duplicate keys of non-trivial size are likely in an ordered
> multimap, a better version of the multimap is:
>
> MList<K, V> = [* (K, [* V])]
>
> So [[“aaa”, 1], [“aaa”, 2], [“bbb”, 3]] becomes [“aaa”, [1, 2], “bbb”, [3]]
>
> If V can never be an array (*), this can be further optimized into:
>
> OMList<K, V> = [* (K, [2* V] / V)]
>
> So [[“aaa”, 1], [“aaa”, 2], [“bbb”, 3]] becomes [“aaa”, [1, 2], “bbb”, 3]
>
> Further refine by using delta coding for the keys (works best with sorted
> keys), etc.  Also, a map can be used for the outer structure if order
> preservation is not actually needed:
>
> MMap<K, V> = {* K => [* V]}
> OMMap<K, V> = {* K => [2* V] / V}
>
> There are many idioms to choose from; it would probably be best to define
> tags for at least each of the ones listed here.  I think FAList (what Kio
> proposed) is the one that will see the most use.
>
> Grüße, Carsten
>
> (*) Or extra-wrap just arrays:
> OEMList<K, V> = [* (K, [2* V] / [V .and [* any]] / V)]
>
>
> _______________________________________________
> CBOR mailing list
> CBOR@ietf.org
> https://www.ietf.org/mailman/listinfo/cbor
>