[Cbor] Packed CBOR and dictionaries

Jim Schaad <ietf@augustcellars.com> Thu, 27 August 2020 07:57 UTC

From: Jim Schaad <ietf@augustcellars.com>
To: draft-bormann-cbor-packed@ietf.org
CC: cbor@ietf.org
Date: Thu, 27 Aug 2020 00:57:07 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/vnFv82QQMwe7ZP7r1obFjphr8gU>

In the past two weeks packed CBOR has come up in two different contexts, and
I have started trying to think about how I would write a compression routine
that would be reasonably quick but still give good results.  One issue that
I keep running into is that the use of a dictionary is going to change how I
approach this.  I may find that I want to put dictionary entries into the
prefix or shared section of the packed structure depending on how far down
in the dictionary the entries are.
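
As a rough illustration of why position in the dictionary matters, the
sketch below compares referencing a dictionary entry directly against
aliasing it once in the packed structure's own table.  It assumes, purely
for illustration, that a reference costs one CBOR head whose argument is
the index; the function names and the cost model are hypothetical and are
not taken from the packed CBOR draft.

```python
def cbor_head_size(argument: int) -> int:
    """Bytes in a CBOR head (initial byte plus argument) for an unsigned argument."""
    if argument < 0:
        raise ValueError("argument must be non-negative")
    if argument < 24:
        return 1  # argument fits in the initial byte
    if argument < 0x100:
        return 2  # one following byte
    if argument < 0x10000:
        return 3  # two following bytes
    if argument < 0x100000000:
        return 5  # four following bytes
    return 9      # eight following bytes


def worth_aliasing(dict_index: int, local_index: int, uses: int) -> bool:
    """Hypothetical heuristic: is a local table alias cheaper than direct references?

    Direct:  every use pays for a head carrying the (large) dictionary index.
    Aliased: pay for the dictionary index once in the local table, then
             every use pays only for the (small) local index.
    """
    direct = uses * cbor_head_size(dict_index)
    aliased = cbor_head_size(dict_index) + uses * cbor_head_size(local_index)
    return aliased < direct
```

Under these assumptions, an entry near the end of a 50,000-entry
dictionary that is used three times is cheaper to alias, while an entry
among the first 24 never is.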

* Should we be doing a draft equivalent to packed CBOR about how to use
dictionaries?  I think that having a standard dictionary expansion is going
to be important.

* Should the dictionary expansion be separate or part of this draft?  I am
not sure how I want to address this.  If you have a dictionary with 50,000
entries in it, that is going to change how things should be done.  It may
also be that one might want to use a dictionary entry for something that
would otherwise be encoded as a prefix, in which case the prefix might not
be needed anymore.

* When building a dictionary, do I only want to do complete substitutions,
or should it be recommended that prefixes also be encoded into the
dictionary?  If dictionaries are treated completely separately, this would
mean that a prefix encoding is needed there as well, but that encoding
might be a substitution in the prefix section of the packed data followed
by a prefix using the packed encoding.
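
To make the distinction between complete substitutions and prefix entries
concrete, here is a minimal sketch of a matcher that prefers an exact
dictionary hit and otherwise falls back to the longest dictionary entry
that is a prefix of the string.  The token shapes ("shared", "prefix",
"literal"), the function name, and the min_prefix threshold are
assumptions made for illustration; they are not the wire encoding of the
packed CBOR draft.

```python
from typing import List, Tuple, Union

# Hypothetical token shapes, for illustration only.
Token = Union[Tuple[str, int], Tuple[str, int, str], Tuple[str, str]]


def pack_string(s: str, dictionary: List[str], min_prefix: int = 4) -> Token:
    """Choose a complete substitution, a prefix reference, or a literal."""
    # 1. Complete substitution: the whole string is a dictionary entry.
    for index, entry in enumerate(dictionary):
        if entry == s:
            return ("shared", index)

    # 2. Prefix substitution: longest entry that starts the string.
    best_index, best_len = -1, 0
    for index, entry in enumerate(dictionary):
        if len(entry) >= min_prefix and len(entry) > best_len and s.startswith(entry):
            best_index, best_len = index, len(entry)
    if best_index >= 0:
        return ("prefix", best_index, s[best_len:])

    # 3. No useful match: keep the string as is.
    return ("literal", s)


dictionary = ["https://example.com/", "https://example.com/api/", "temperature"]
print(pack_string("temperature", dictionary))
print(pack_string("https://example.com/api/v1", dictionary))
print(pack_string("abc", dictionary))
```

The linear scans are only for clarity; with a dictionary of 50,000
entries a hash table for exact matches and a trie for prefixes would be
the more plausible choice.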

Being able to do packing is going to be important for doing CoRAL, but
being able to do dictionaries is just as important.  Dictionaries do have
the downside that if they are not referenced internal to the structure, then
from a security point of view they can be problematic, as the
signed/encrypted CBOR byte stream is no longer self-contained.  This is not
a problem for packed CBOR as long as the packing does not cross the security
boundary.

Jim