Re: [Cbor] tag 24 and 55799 (was Re: my (WGLC re-)views on error processing in RFC7049bis and future-proofing)

Carsten Bormann <cabo@tzi.org> Tue, 26 May 2020 21:43 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <589BF33E-9A41-400B-A91B-F45F85062269@island-resort.com>
Date: Tue, 26 May 2020 23:43:25 +0200
Cc: Michael Richardson <mcr+ietf@sandelman.ca>, cbor@ietf.org
Message-Id: <AD183B67-2B49-4CB3-B81D-BB024B4317E7@tzi.org>
References: <17300.1588779159@localhost> <38BB6FFF-737F-4C11-AD7A-DA3F28A9F570@tzi.org> <CANh-dXkdjMyO=WFUxrF06OfP+RE9v11unKJXL8P3UtEe+prV1w@mail.gmail.com> <13690.1588894939@localhost> <CANh-dXmjD=RCwh7ExjSvFx+5ciew+eqHoVS88OommQ2xVnX5=Q@mail.gmail.com> <2963.1589473899@localhost> <BC0EC9BE-4202-4EED-A619-CDEB9BF312CE@tzi.org> <26665.1589593222@localhost> <589BF33E-9A41-400B-A91B-F45F85062269@island-resort.com>
To: Laurence Lundblade <lgl@island-resort.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/SPssfthsI_BkGWj0OFnnvl6z2rY>


> On 2020-05-26, at 23:04, Laurence Lundblade <lgl@island-resort.com> wrote:
> 
> 
>> On May 15, 2020, at 6:40 PM, Michael Richardson <mcr+ietf@sandelman.ca> wrote:
>> 
>>>> I note that AFAIK, we do not use tag#24 (Encoded CBOR data item) for
>>>> the signed object, in COSE.  Should we?  What's the difference between
>>>> #24 and #55799.
>> 
>>> 55799 is a tag that can have any CBOR data item as tag content; 24 is a
>>> tag that can only be on byte strings.  The byte string then *encodes*
>>> another CBOR data item.  (The main use here is to keep the decoder from
>>> decoding, to provide easy skip-ability or because we need exact bytes
>>> as in COSE.)  As often with tags, there is no need for tag 24 on a byte
>>> string when it is clear from context that the byte string contains
>>> encoded CBOR; this is the case in COSE.
>> 
>> Understood.
> 
> My answer on the difference is that you use 55799 when the surrounding data / file / protocol is not CBOR and 24 when it is CBOR. 55799 is intended to work as a magic number; 24 is not, because it is not unique enough.
> 
> From a decoder point of view, they should be handled exactly the same

Actually, no.

55799 has any CBOR data item as tag content and essentially has the semantics of that CBOR data item.

24 has a byte string as tag content.  That byte string is identified by the tag as encoded CBOR.
A byte string with embedded encoded CBOR is a data item, but it is different from the data item that was encoded and embedded. Decoders will differ in their tag 24 handling: they might go ahead and decode that CBOR or just hand the byte string and tag to the application.  In either case, the decoded CBOR does not simply appear “in place” with the tag and the byte string vanishing.
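
To make the difference concrete at the byte level, here is a minimal Python sketch that hand-encodes both cases using only the standard library. The helper names (head, text, bstr, tag) are made up for this illustration and do not come from any CBOR library.

```python
def head(major: int, arg: int) -> bytes:
    """Encode a CBOR head (major type plus argument) in its shortest form."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < (1 << (8 * size)):
            return bytes([(major << 5) | ai]) + arg.to_bytes(size, "big")
    raise ValueError("argument too large")

def text(s: str) -> bytes:
    data = s.encode("utf-8")
    return head(3, len(data)) + data      # major type 3: text string

def bstr(b: bytes) -> bytes:
    return head(2, len(b)) + b            # major type 2: byte string

def tag(number: int, content: bytes) -> bytes:
    return head(6, number) + content      # major type 6: tag

item = text("hi")                         # the data item itself
self_described = tag(55799, item)         # tag content IS the data item
embedded = tag(24, bstr(item))            # tag content is a byte string holding its encoding

print(self_described.hex())
print(embedded.hex())
```

With tag 55799 the text string follows the tag directly; with tag 24 there is an extra byte string head in between, so a decoder sees a byte string, not a text string.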

> and should be able to occur legally in exactly the same contexts. Right? Probably this should be stated explicitly. Right now there is no linkage between them in the text.
> 
> In addition to skip-ability it seems they could be used when some data item might be JSON or might be CBOR or might be something else and you want to explicitly say what it is. Should there be a JSON tag?

Very likely.  That would probably take a UTF-8 string as tag content.
(Tag 262 works for JSON-encoded content like tag 24 does for CBOR-encoded content, requiring a byte string with that content though.)
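
As a rough sketch of what that looks like on the wire: tag 262 followed by a byte string whose bytes are JSON text. The head helper is hypothetical, written just for this example; the JSON value is arbitrary.

```python
import json

def head(major: int, arg: int) -> bytes:
    """Encode a CBOR head (major type plus argument) in its shortest form."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < (1 << (8 * size)):
            return bytes([(major << 5) | ai]) + arg.to_bytes(size, "big")
    raise ValueError("argument too large")

# JSON text, serialized compactly and carried as raw bytes
payload = json.dumps({"a": 1}, separators=(",", ":")).encode("utf-8")

# tag 262, then a byte string (major type 2) containing the JSON text
tagged = head(6, 262) + head(2, len(payload)) + payload

print(tagged.hex())
```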

> This is kind of twisted, but seems legal.
>   Encoding: 
>     - start with an RFC 3339 date string
>     - base 64 encode it and tag it as so with tag 34

Why would one want to do that?

>     - tag it with tag 24 (or 55799) to say it is CBOR

Ignoring the 55799 case, you are missing one encoding step (tag 24 requires a byte string with encoded CBOR, not a tagged text string).
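
A small sketch of the missing step, assuming hand-rolled helpers (head, text, bstr, tag are illustrative names, not a library API) and an arbitrary example date: the tagged text string first has to be encoded to bytes, and those bytes wrapped in a byte string, before tag 24 can go on top.

```python
import base64

def head(major: int, arg: int) -> bytes:
    """Encode a CBOR head (major type plus argument) in its shortest form."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < (1 << (8 * size)):
            return bytes([(major << 5) | ai]) + arg.to_bytes(size, "big")
    raise ValueError("argument too large")

def text(s: str) -> bytes:
    data = s.encode("utf-8")
    return head(3, len(data)) + data

def bstr(b: bytes) -> bytes:
    return head(2, len(b)) + b

def tag(number: int, content: bytes) -> bytes:
    return head(6, number) + content

date = "2020-05-26T21:43:25Z"                        # example value only
b64 = base64.b64encode(date.encode("ascii")).decode("ascii")
inner = tag(34, text(b64))                           # tag 34 on a base64 text string

wrong = tag(24, inner)          # tag 24 directly on a tagged text string: not valid
right = tag(24, bstr(inner))    # encode first, then wrap the bytes in a byte string

# Major type of the tag 24 content (the tag 24 head is 2 bytes long)
print(wrong[2] >> 5)   # 6: a tag
print(right[2] >> 5)   # 2: a byte string, as tag 24 requires
```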

>     - tag it with tag 0 to say it is a date string

Not allowed.  Tag 0 only takes a text string (major type 3) as tag content, not a tagged item.

>     - tag it with 22 to say it should be b64’d if re-encoded later

Well, not “it”, but any byte string in the CBOR data item.
So here, on JSON conversion, it would base64-encode the byte string in tag 24.  (The tag 24 itself would presumably be stripped in any JSON conversion, but I’m already confused; you can do “JSON in JSON” as well, just not tag it.)
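
A toy model of that conversion, as a sketch only: decoded CBOR is represented with plain Python values plus a made-up Tag class, and to_json is a hypothetical converter, not part of any library. It assumes that tag 22 means classic base64 with padding, that untagged byte strings default to base64url without padding (as I read the CBOR-to-JSON conversion advice in the spec), and that all other tags are simply dropped.

```python
import base64

class Tag:
    """Illustrative stand-in for a decoded CBOR tag (number plus content)."""
    def __init__(self, number, content):
        self.number = number
        self.content = content

def to_json(item, hint=None):
    """Convert a decoded CBOR item to a JSON-compatible Python value."""
    if isinstance(item, Tag):
        if item.number in (21, 22):
            hint = item.number          # the hint applies to byte strings inside
        return to_json(item.content, hint)   # every tag is stripped
    if isinstance(item, bytes):
        if hint == 22:
            return base64.b64encode(item).decode("ascii")
        # assumed default: base64url without padding
        return base64.urlsafe_b64encode(item).decode("ascii").rstrip("=")
    if isinstance(item, list):
        return [to_json(x, hint) for x in item]
    if isinstance(item, dict):
        return {k: to_json(v, hint) for k, v in item.items()}
    return item

# Tag 22 around tag 24: the 24 is dropped, the byte string is base64-encoded
print(to_json(Tag(22, Tag(24, b"\x62hi"))))
```

Note that the embedded CBOR inside the tag 24 byte string is never decoded here; only the byte string as a whole is converted.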

>  Decoding:
>     - remove the base 64 encoding because of the tag 34

That is inside, so you see it last.

>     - feed it back to the CBOR decoder because of tag 24
>     - interpret it as an RFC 3339 date because of tag 0

Lost track.

> Base64 encoding / decoding is not that much code or that difficult, so a generic decoder might actually do this.

I wouldn’t touch that with a ten-foot pole — it might convert that pole into base64 without asking me.

Grüße, Carsten