Re: [Cbor] 7049bis: The concept of "optional tagging" is not really used in practice #126

Carsten Bormann <> Sun, 03 November 2019 21:04 UTC

> However, I read "schema description" from the /semantic/ point of view:
> a description which explains the meaning of data items.  

Right.  “Schema” probably is one of the most misused terms in this space.
(That’s why CDDL is called a “data definition language”.)

> The IP RFC not
> only tells that the first 4bits are an unsigned int, it also tells that
> this number is the protocol version. 
> CBOR (neither JSON) can't tells this by itself, except if one defines a
> TAG for this.

Tags (we prefer to write simple English words in lower case) tell you how to interpret an enclosed item with different (additional) data semantics.  So a tag with number 1 tells you the enclosed number really is to be interpreted as a POSIX epoch-based date.
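To make the tag 1 case concrete, here is a minimal sketch in Python that reads a tag head, checks for tag number 1, and interprets the enclosed unsigned integer as seconds since the POSIX epoch. The helper names (`decode_head`, `decode_epoch_date`) are made up for illustration, and the sketch assumes definite-length heads and an unsigned integer inside the tag; it is not a full CBOR decoder.

```python
from datetime import datetime, timezone

def decode_head(data, pos=0):
    """Return (major_type, argument, next_pos) for the item head at pos."""
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:                      # argument stored in the initial byte
        return major, info, pos
    if info in (24, 25, 26, 27):       # 1-, 2-, 4-, or 8-byte argument follows
        size = 1 << (info - 24)
        arg = int.from_bytes(data[pos:pos + size], "big")
        return major, arg, pos + size
    raise ValueError("additional information not handled in this sketch")

def decode_epoch_date(data):
    """Interpret tag 1 enclosing an unsigned integer as a UTC datetime."""
    major, tag, pos = decode_head(data)
    if major != 6 or tag != 1:
        raise ValueError("expected tag 1")
    major, seconds, pos = decode_head(data, pos)
    if major != 0:
        raise ValueError("this sketch only handles unsigned integer content")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

# 1(1363896240), an example encoding from the CBOR specification
print(decode_epoch_date(bytes.fromhex("c11a514b67b0")))
```

Without the tag, the same bytes `1a514b67b0` are just the integer 1363896240; the tag is what tells a generic decoder to hand the application a date.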

> So, the next question is: “is there some guidelines for using TAGs?"
> Well, it's probably too early. One may have to wait that CBOR usages
> grow in maturity.

CBOR has been around for half a decade now, so I think we have a pretty good comprehension now of when to use tags.

> What should I decide for my system design regarding CBOR TAGs?

➔ Use tags when they are useful.

There are no general guidelines like the ones you propose below, because the usefulness depends on the specific context.

> Shall I:
> - prohibit TAGs since this is redundant with other parts of my design
> specifications (which already explicit the meaning of each field); or

If you have a relatively rigid data shape (“schema” in the usual structural sense), you may indeed not need tags, because you can infer an alternative interpretation from structure (e.g., field names in a map used as a struct, position in a record, etc.).  Tags may still be useful when you want to express a choice, e.g., if you want to support both text-based and epoch-based dates, use tag 0 or tag 1.  Another example is integers: if you expect to interchange integers that might not fit into 64 bits, use a choice between a built-in integer (major types 0 and 1) and a bignum (tag 2 or tag 3):

                  uint = #0
                  nint = #1
                  int = uint / nint
                  biguint = #6.2(bstr)
                  bignint = #6.3(bstr)
                  bigint = biguint / bignint
                  integer = int / bigint
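
As an illustration of how an implementation might honor the `integer` choice above, the following Python sketch encodes a value as a built-in integer when its argument fits into 64 bits and falls back to tag 2 or tag 3 around a byte string otherwise; the decoder accepts either form. The function names are hypothetical, and the sketch only covers the items named in the CDDL (no indefinite lengths, no other types).

```python
def encode_head(major, arg):
    """Encode an item head using the shortest argument form."""
    if arg < 24:
        return bytes([major << 5 | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([major << 5 | info]) + arg.to_bytes(size, "big")
    raise ValueError("argument does not fit into 64 bits")

def decode_head(data, pos=0):
    """Return (major_type, argument, next_pos) for the item head at pos."""
    major, info = data[pos] >> 5, data[pos] & 0x1F
    pos += 1
    if info < 24:
        return major, info, pos
    if info > 27:
        raise ValueError("additional information not handled in this sketch")
    size = 1 << (info - 24)
    return major, int.from_bytes(data[pos:pos + size], "big"), pos + size

def encode_integer(n):
    """integer = int / bigint: pick the built-in form when it fits."""
    if n >= 0:
        major, arg, tag = 0, n, 2       # uint, or biguint (tag 2)
    else:
        major, arg, tag = 1, -1 - n, 3  # nint, or bignint (tag 3)
    if arg < 1 << 64:
        return encode_head(major, arg)
    payload = arg.to_bytes((arg.bit_length() + 7) // 8, "big")
    return encode_head(6, tag) + encode_head(2, len(payload)) + payload

def decode_integer(data):
    """Accept either alternative of the choice and return a Python int."""
    major, arg, pos = decode_head(data)
    if major == 0:
        return arg
    if major == 1:
        return -1 - arg
    if major == 6 and arg in (2, 3):
        inner_major, length, pos = decode_head(data, pos)
        if inner_major != 2:
            raise ValueError("bignum tag must enclose a byte string")
        value = int.from_bytes(data[pos:pos + length], "big")
        return value if arg == 2 else -1 - value
    raise ValueError("not an integer")

for n in (1000, -1, 2**64 - 1, 2**64, -2**64 - 1):
    assert decode_integer(encode_integer(n)) == n
```

The receiver does not need to know in advance which alternative the sender picked; the tag (or its absence) carries that choice in the data itself.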

> - put TAGs everywhere for everything because TAGs bring semantic to data; or

“Everywhere” I don’t know.  But if you expect your implementations to rely on generic decoders/encoders doing the work, using tags may be a labor-saving device.  This is particularly useful when CBOR is used for general serialization in a programming environment (where you may not have a hard and fast data definition with your data).
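
A rough sketch of what “the generic decoder doing the work” can look like, in Python: assume a decoder has already produced plain values plus a hypothetical `Tagged` wrapper for tagged items, and a small registry maps tag numbers to conversion functions. The `Tagged` class, `TAG_HANDLERS`, and `resolve` are names invented for this example, not the API of any particular CBOR library.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Tagged:
    """Hypothetical stand-in for a tagged item as a decoder might return it."""
    tag: int
    value: object

# Registry of tag number -> conversion; applications can add their own.
TAG_HANDLERS = {
    1: lambda v: datetime.fromtimestamp(v, tz=timezone.utc),  # epoch-based date
    2: lambda v: int.from_bytes(v, "big"),                    # unsigned bignum
    3: lambda v: -1 - int.from_bytes(v, "big"),               # negative bignum
}

def resolve(item):
    """Walk a decoded structure and convert every tag with a known handler."""
    if isinstance(item, Tagged):
        inner = resolve(item.value)
        handler = TAG_HANDLERS.get(item.tag)
        # Unknown tags are passed through so the application can still see them.
        return handler(inner) if handler else Tagged(item.tag, inner)
    if isinstance(item, list):
        return [resolve(x) for x in item]
    if isinstance(item, dict):
        return {resolve(k): resolve(v) for k, v in item.items()}
    return item

decoded = {"when": Tagged(1, 1363896240), "big": [Tagged(2, b"\x01" + bytes(8))]}
print(resolve(decoded))
```

The application code after `resolve` never inspects tag numbers for the registered cases; it simply receives dates and integers, which is the labor saving in question.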

> - add TAGs to some fields and not to others (which ones and why?)

Yes.  Only add them when they are useful.  To express a choice, and/or to have the generic codec do the work.

Grüße, Carsten