Re: [Cbor] Interactions of packed CBOR and tags

Jim Schaad <ietf@augustcellars.com> Thu, 27 August 2020 23:26 UTC

From: Jim Schaad <ietf@augustcellars.com>
To: 'Carsten Bormann' <cabo@tzi.org>
CC: cbor@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/ClKv7nyH5b57_sUtBgV7ENo16cU>


-----Original Message-----
From: Carsten Bormann <cabo@tzi.org> 
Sent: Thursday, August 27, 2020 2:32 PM
To: Jim Schaad <ietf@augustcellars.com>
Cc: draft-bormann-cbor-packed@ietf.org; cbor@ietf.org
Subject: Re: [Cbor] Interactions of packed CBOR and tags



> On 2020-08-27, at 23:00, Jim Schaad <ietf@augustcellars.com> wrote:
> 
> While building a test library of strings for evaluating my algorithm, 
> I ended up with a question of how tags interact with the idea of CBOR packing.
> Specifically, if I use a standard date/time string with tag 0, should 
> that text string be considered as a candidate for packing?
> 
> 0("1970-01-01T00:00Z") could potentially be compressed to 0(simple(3))
> 
> The problem is that this is no longer a valid CBOR encoding so it 
> would not seem to be a legal thing to do.
> 
> Question:  Must packed CBOR be valid CBOR or does that requirement 
> only apply to unpacked CBOR?

Great question.

The cop-out could be: either.
Since CBOR-valid packed CBOR is a subset of (just well-formed) packed CBOR, whether the compressor is allowed to use compression opportunities like the above could simply be a parameter given to it.
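
To make the "parameter given to the compressor" idea concrete, here is a rough Python sketch. It is not taken from the draft: the Tag and Simple classes, the pack() function, and the valid_only flag are all hypothetical names, and shared references are modelled as simple(n) indexing a string table, as in the 0(simple(3)) example above. The rule used for valid_only (never replace the direct content of any tag) is a deliberately crude assumption; maps are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """Toy model of a CBOR tag: a tag number plus its content."""
    number: int
    content: object


@dataclass(frozen=True)
class Simple:
    """Toy model of simple(n), used here only as a shared-item reference."""
    value: int


def pack(item, table, valid_only=False, inside_tag=False):
    """Replace strings that appear in `table` with Simple(index).

    With valid_only=True (a hypothetical parameter), a string that is the
    direct content of a tag is left alone, so that e.g. tag 0 still wraps
    a text string and the packed form stays valid CBOR.
    """
    if isinstance(item, Tag):
        return Tag(item.number,
                   pack(item.content, table, valid_only, inside_tag=True))
    if isinstance(item, list):
        return [pack(x, table, valid_only, inside_tag=False) for x in item]
    if isinstance(item, str) and item in table:
        if valid_only and inside_tag:
            return item  # keep the tag content valid
        return Simple(table.index(item))
    return item


table = ["a", "b", "c", "1970-01-01T00:00Z"]
date = Tag(0, "1970-01-01T00:00Z")

print(pack(date, table))                   # well-formed only: tag 0 wraps a reference
print(pack(date, table, valid_only=True))  # valid: tag 0 still wraps the string
```

The point of the sketch is only that both behaviours can come from one compressor, with the choice left to the application.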

What are the benefits/drawbacks?

* (just well-formed) packed CBOR may lead to trouble with a generic decoder that cannot handle (i.e., present to the application) invalid constructs like 0(simple(3)).
The application can decide whether it wants to live with this limitation or not.

* the structural coherence of the packed structure (that this draft is about) will be expressed as a validity constraint.  It is a bit weird to then relax validity of what goes in there, but not entirely without precedent (e.g., tag 24, even though there is a more explicit firewall here).

* not using those compression opportunities can be wasteful, not just for the example given above (tag 0), but also for tags like 32.
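
The first point above can be illustrated from the decoder side with a small Python sketch. Everything in it is an assumption made for illustration: Tag, Simple, is_valid() and unpack() are hypothetical names, the validity check only knows the rule that tag 0 must wrap a text string, and every simple(n) is treated as a reference into the table (real CBOR also uses simple values for false, true, and null, which this toy model ignores).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """Toy model of a CBOR tag: a tag number plus its content."""
    number: int
    content: object


@dataclass(frozen=True)
class Simple:
    """Toy model of simple(n), used here only as a shared-item reference."""
    value: int


def is_valid(item):
    """Toy validity check: the only rule is that tag 0 wraps a text string."""
    if isinstance(item, Tag):
        if item.number == 0 and not isinstance(item.content, str):
            return False
        return is_valid(item.content)
    if isinstance(item, list):
        return all(is_valid(x) for x in item)
    return True


def unpack(item, table):
    """Replace every Simple(n) reference with table[n], recursively."""
    if isinstance(item, Simple):
        return table[item.value]
    if isinstance(item, Tag):
        return Tag(item.number, unpack(item.content, table))
    if isinstance(item, list):
        return [unpack(x, table) for x in item]
    return item


table = ["a", "b", "c", "1970-01-01T00:00Z"]
packed = Tag(0, Simple(3))

print(is_valid(packed))                 # a validating generic decoder would object here
print(is_valid(unpack(packed, table)))  # after unpacking, the tag 0 rule holds again
```

A decoder that validates tags before handing items to the unpacker would stumble over the packed form, while one that validates only after unpacking would not.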

I think I would emphasize the (just well-formed) packed CBOR, but still introduce CBOR-valid packed CBOR as a selectable additional constraint for applications that need to work with pre-packed (designed before packed was invented) generic decoders.

[JLS] Yes, I agree that only requiring the result to be well-formed makes the most sense.  It probably makes sense to discuss the implications in the document.  A more interesting case might be tag 26, which could have duplicate or prefix lines of text coming up frequently.

I think it might make sense to reference tag 25 and say that this does the same thing only much better.
[/JLS]

Do we need to encode this selection?  (E.g., via different top-level tags.)  Probably not.

[JLS] I don't think that there need to be different top-level tags. [/JLS]

Jim


Grüße, Carsten