Re: [Cbor] [COSE] CBOR magic number, file format and tags

Michael Richardson <mcr+ietf@sandelman.ca> Thu, 21 January 2021 13:44 UTC

From: Michael Richardson <mcr+ietf@sandelman.ca>
To: Carsten Bormann <cabo@tzi.org>, Josef 'Jeff' Sipek <jeffpc@josefsipek.net>, cbor@ietf.org
In-Reply-To: <4192413D-0D60-4AFB-8897-FE2A09780E83@tzi.org>
References: <3C77CB5D-6AEA-4D70-96A2-3826DB8DAB18@island-resort.com> <10306.1611186961@localhost> <YAjT3j4cwvnLR4AA@meili.valhalla.31bits.net> <14857.1611195109@localhost> <YAjkmwsdqw0P+gA1@meili.valhalla.31bits.net> <4192413D-0D60-4AFB-8897-FE2A09780E83@tzi.org>
Date: Thu, 21 Jan 2021 08:44:51 -0500
Message-ID: <30468.1611236691@localhost>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/16dpXJKOSMy1TosMIulGk_AlBSg>
Subject: Re: [Cbor] [COSE] CBOR magic number, file format and tags

I think this belongs in CBOR, not COSE.
It sounds like this is something that a few people are interested in pursuing.
Is it a BCP or a Standard?

Carsten Bormann <cabo@tzi.org> wrote:
    > Challenge accepted.  CBOR diagnostic notation is quite versatile, you
    > just have to tell cbor2diag.rb what you want.

okay.

    > cbor2diag.rb -t: bytes_as_text, i.e., use text form for bytes if
    > possible.

    > $ printf cbor | cbor2diag.rb -t
    > "bor"
    > $ printf CBOR | cbor2diag.rb -t
    > 'BOR'

    > Oh, and, BTW:

    > cbor2diag.rb -e: try_decode_embedded, i.e., try embedded CBOR.

    > $ printf CBOR | cbor2diag.rb -te
    > << 'OR' >>

alright, I award full cool points.
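
For anyone following along without cbor2diag.rb: the quoted outputs fall out of the first byte alone. ASCII 'c' is 0x63, which is the head of a 3-byte text string, while 'C' is 0x43, the head of a 3-byte byte string, and 'B' is 0x42, a 2-byte byte string. Here is a rough Python sketch of that head decoding; decode_head is a helper name made up for this illustration, not part of any tool mentioned in this thread, and it ignores indefinite lengths.

```python
def decode_head(data: bytes):
    """Split the first CBOR head into (major type, argument, head length)."""
    initial = data[0]
    major, info = initial >> 5, initial & 0x1F
    if info < 24:
        return major, info, 1            # argument lives in the initial byte
    if info > 27:
        raise ValueError("indefinite/reserved additional info not handled here")
    size = 1 << (info - 24)              # 1, 2, 4 or 8 following bytes
    return major, int.from_bytes(data[1:1 + size], "big"), 1 + size


KINDS = {2: "byte string", 3: "text string"}

for raw in (b"cbor", b"CBOR", b"BOR"):
    major, length, head_len = decode_head(raw)
    payload = raw[head_len:head_len + length]
    print(raw, "->", KINDS[major], "of length", length, "payload", payload)
```

The third input, b"BOR", is what -e finds when it looks inside the byte string from the second one.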

    > Back to the magic number issue:

    > For CBOR data items, I prefer registering a tag and prefixing with that
    > head (*) over prefixing with an array head and a string.  See, we have
    > a magic number registry built right into CBOR...

I can live with this, but here are my concerns.
If we go this way, then it just becomes a BCP on good CBOR protocol design.

1) when transmitting over the wire, shorter tags are preferred, and I'm
   concerned that they will be too short to be useful as *on-disk* magic numbers.
   The same short sequence could occur in other kinds of files rather easily.

2) I wanted a structure that made it rather easy for an application that is
   reading from the disk to know what part they can omit when transmitting.
   That's why I initially looked at the two-element array, with the second
   element being the actual "thing".
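
To make both concerns concrete, here is a rough Python sketch, not a proposal. The first half prints how many bytes a tag head occupies for a few tag numbers: a small tag is one or two bytes, 55799 (the self-described CBOR tag from RFC 8949) is three, and only a 32-bit tag number gives a five-byte head. The value 0x43424F52 is a made-up, unregistered number chosen only because its bytes spell "CBOR". The second half shows the two-element array layout; the magic string "example-magic" and the helper names to_disk and to_wire are likewise invented for this example.

```python
def encode_head(major: int, arg: int) -> bytes:
    """Encode a CBOR head: major type (0-7) plus an unsigned argument."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([(major << 5) | info]) + arg.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


# Concern 1: short tags make short (weak) magic numbers.
# 0x43424F52 is a hypothetical, unregistered tag number.
for number in (21, 200, 55799, 0x43424F52):
    head = encode_head(6, number)        # major type 6 = tag
    print(f"tag {number}: {head.hex()} ({len(head)} bytes)")

# Concern 2: a two-element array whose first element is the magic string
# and whose second element is the actual "thing".
MAGIC = b"example-magic"                 # hypothetical magic string
PREFIX = encode_head(4, 2) + encode_head(3, len(MAGIC)) + MAGIC


def to_disk(item: bytes) -> bytes:
    """Wrap an already-encoded CBOR item for storage."""
    return PREFIX + item


def to_wire(file_bytes: bytes) -> bytes:
    """Drop the fixed prefix; what remains is the item to transmit."""
    if not file_bytes.startswith(PREFIX):
        raise ValueError("missing magic prefix")
    return file_bytes[len(PREFIX):]


payload = bytes.fromhex("a1616101")      # {"a": 1}
print(to_disk(payload).hex())
print(to_wire(to_disk(payload)) == payload)
```

With the array form, the part to omit on the wire is a fixed byte prefix, so the application can strip it without parsing the payload at all.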

    > For CBOR sequences, obviously prefixing the sequence with a
    > CBOR-encoded (text or byte) string sounds quite good.  Making that
    > optional (to save bytes when you don’t need it) requires that the
    > sequence otherwise cannot start with that kind of string.

For on-disk files, if one uses a CBOR Sequence, I wasn't thinking about ever
having the prefix be optional.  But, we could go that way easily if we do

   Tag<per-protocol-tag>CBOR
or
   Tag<CBOR-protocol-tag>Per-protocol-Value.

as the prefix, since I think that would quite easily not be part of any protocol.
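
Roughly what I have in mind, as a Python sketch under loud assumptions: both tag numbers below are invented and unregistered, "my-protocol" is a placeholder value, and reading the two layouts as "tag wrapping a text string" is just one possible interpretation. The helper names are made up for this example.

```python
def head(major: int, arg: int) -> bytes:
    """Encode a CBOR head: major type (0-7) plus an unsigned argument."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([(major << 5) | info]) + arg.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def tagged_text(tag_number: int, text: str) -> bytes:
    """Encode tag(tag_number) wrapping a text string."""
    raw = text.encode("utf-8")
    return head(6, tag_number) + head(3, len(raw)) + raw


PER_PROTOCOL_TAG = 0x4D594150    # hypothetical, unregistered
GENERIC_FILE_TAG = 0x43424F52    # hypothetical, unregistered

option_a = tagged_text(PER_PROTOCOL_TAG, "CBOR")          # Tag<per-protocol-tag>CBOR
option_b = tagged_text(GENERIC_FILE_TAG, "my-protocol")   # Tag<CBOR-protocol-tag>Per-protocol-Value


def strip_prefix(sequence: bytes, prefix: bytes) -> bytes:
    """Return the CBOR sequence without the prefix item, if it is present."""
    return sequence[len(prefix):] if sequence.startswith(prefix) else sequence


items = bytes.fromhex("0102")    # a CBOR sequence of two items: 1, 2
print(option_a.hex())
print(option_b.hex())
print(strip_prefix(option_a + items, option_a) == items)
```

Either way the prefix item starts with a tag head that the protocol itself never emits, which is what would let a reader treat it as optional.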

--
Michael Richardson <mcr+IETF@sandelman.ca>   . o O ( IPv6 IøT consulting )
           Sandelman Software Works Inc, Ottawa and Worldwide