Re: [Cbor] Packed CBOR and dictionaries

Michael Richardson <mcr+ietf@sandelman.ca> Fri, 28 August 2020 18:20 UTC

From: Michael Richardson <mcr+ietf@sandelman.ca>
To: Jim Schaad <ietf@augustcellars.com>, draft-bormann-cbor-packed@ietf.org, cbor@ietf.org
In-Reply-To: <008c01d67c47$aaf73be0$00e5b3a0$@augustcellars.com>
References: <008c01d67c47$aaf73be0$00e5b3a0$@augustcellars.com>
Date: Fri, 28 Aug 2020 14:20:38 -0400
Message-ID: <28732.1598638838@localhost>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/qoKYdcGshRTfryNSaqmi2-MQOA8>
Subject: Re: [Cbor] Packed CBOR and dictionaries

I feel that the WG should adopt the document already.

Jim Schaad <ietf@augustcellars.com> wrote:
    > * Should the dictionary expansion be separate or part of this draft?  I am
    > not sure how I want to address this.  If you have a dictionary w/ 50,000
    > entries in it, that is going to change how things should be done.  It may
    > also be that one might want to use a dictionary entry for something that
    > might otherwise be encoded as a prefix and the prefix might not be needed
    > anymore.

Once the dictionary has more than 512 entries, I guess 131072 is the next size.
That uses four-byte references, and so the dictionary ought to provide
substitutions of at least four bytes, right?
Otherwise, we'd be expanding rather than compressing.
It seems that a plain array (C, Python, etc.) is probably always appropriate as
the internal data structure for lookups into the dictionary. It shouldn't
require a sparse array, ever, should it?
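
To make those two points concrete, here is a minimal Python sketch. It is
not taken from the draft: the names worth_substituting, expand and table
are hypothetical, and the byte counts are passed in by the caller rather
than derived from any real encoding. It only illustrates that a reference
should not cost more bytes than the item it replaces, and that a dense
list indexed by the reference number is enough for lookups.

```python
# Hypothetical sketch; none of these names or sizes come from the draft.

def worth_substituting(item_len: int, ref_len: int) -> bool:
    """Return True if replacing an item of item_len encoded bytes with a
    reference of ref_len encoded bytes does not make the output larger."""
    return item_len >= ref_len


def expand(refs, table):
    """Resolve integer references against a dense dictionary (a plain list).

    References are assumed to be contiguous indices starting at 0, so no
    sparse structure is needed; an out-of-range index is an error.
    """
    out = []
    for idx in refs:
        if not 0 <= idx < len(table):
            raise IndexError(
                f"reference {idx} outside dictionary of {len(table)} entries"
            )
        out.append(table[idx])
    return out


# Illustrative dictionary contents (made up for this example).
table = ["https://example.com/", "application/cbor", "temperature"]
expanded = expand([2, 0], table)
```

With a four-byte reference, worth_substituting(3, 4) is False, which is
the expansion case I was worried about.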

    > Being able to do packed is going to be of importance for doing CoRAL, but
    > just as important is being able to do dictionaries.  Dictionaries do have
    > the downside that if they are not referenced internal to the structure then
    > from a security point of view they can be problematic as the
    > signed/encrypted CBOR byte stream is no longer self-contained.  This is not
    > a problem for packed CBOR as long as the packing does not cross the security
    > boundary.

Agreed.


--
Michael Richardson <mcr+IETF@sandelman.ca>, Sandelman Software Works
 -= IPv6 IoT consulting =-