Re: [Cbor] 0x5fff/0x7fff (Re: 7049bis: Diagnostic notation gaps)

Thiago Macieira <> Wed, 16 September 2020 15:20 UTC


On Wednesday, 16 September 2020 04:32:08 PDT Carsten Bormann wrote:
> Hi Thiago,
> Let me try to recap the two proposals for 0x5fff/0x7fff:
> * Add b/t.  The minimum change would be to only allow this for empty
> indefinite length strings (no chunks).  In which case there still would
> need to be some logic to only emit b/t when the indefinite length string
> closes right away.  If we don’t do the minimum change, we would have
> interop issues with all implementations that don’t know about this.

Indeed. The simplest for diagnostic dumpers would be to always allow "b" and 
"t" at that point, since the type has been decoded.

> * Use ''_/""_.  This is indeed the minimum change, because it is no change
> from RFC 7049, just pointing out that this stands for 0x5fff and 0x7fff,
> respectively (well, there is a small change in that this is clarified to
> stand for no chunks, so you would need to say (_ '') etc if you have empty
> chunks, but that is already natural).  A generator of diagnostic notation
> would indeed need some look-ahead (one byte) to do this on the fly.  We
> don’t know how many consumers already decode ''_/""_, so it is hard to say
> what interop issues we’d have, but the alternative (_ ) cannot be decoded
> unambiguously anyway.

Hmm... that's also a possibility, since this is currently an unused sequence.

$ printf '\x84\x58\0\x59\0\0\x5a\0\0\0\0\x5b\0\0\0\0\0\0\0\0' | ./cbordump -i
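For reference, the bytes piped into cbordump above form a four-element array (0x84) whose items are all zero-length byte strings, each with its length encoded in a different width (1, 2, 4 and 8 bytes). The following is a minimal Python sketch that picks the heads apart. It is not part of cbordump, and the helper name decode_head is invented for this illustration.

```python
def decode_head(data, pos):
    """Decode one CBOR head at data[pos]; return (major, ai, argument, next_pos)."""
    initial = data[pos]
    major, ai = initial >> 5, initial & 0x1F  # top 3 bits / low 5 bits
    pos += 1
    if ai < 24:
        return major, ai, ai, pos             # argument stored in the head byte
    if ai <= 27:
        size = 1 << (ai - 24)                 # 1, 2, 4 or 8 argument bytes
        arg = int.from_bytes(data[pos:pos + size], "big")
        return major, ai, arg, pos + size
    if ai == 31:
        return major, ai, None, pos           # indefinite length (or break)
    raise ValueError("reserved additional information value")


# Same bytes as the printf command: 84 | 58 00 | 59 0000 | 5a 00000000 | 5b 00..00
payload = bytes.fromhex("84" "5800" "590000" "5a00000000" "5b0000000000000000")

major, ai, count, pos = decode_head(payload, 0)   # array head
items = []
for _ in range(count):
    item_major, item_ai, length, pos = decode_head(payload, pos)
    items.append((item_major, item_ai, length))
    pos += length                                  # skip the (empty) string payload
```

Each entry in items is (major type, additional information, length): major type 2 is a byte string, and additional information 24 to 27 selects the 1-, 2-, 4- or 8-byte length field.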

Looking at my own code at
corelib/serialization/cbordump/main.cpp#L294-L327, the complexity is about the 
same either way. Some state is already kept across chunks because of the 
comma separator, so it wouldn't be too difficult to realise that the length 
wasn't known and that nothing got printed.
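As an illustration of the state-tracking idea described above, here is a minimal Python sketch of a dumper for a single byte or text string. It is not the cbordump code: the function names are invented, text chunks are quoted with JSON-style escaping as a simplification, and truncated input is not handled. The opening "(_ " is deferred until the first chunk arrives, so when the break byte comes and nothing has been printed, the dumper can emit the ''_ / ""_ spelling discussed in this thread without any look-ahead.

```python
import json


def read_definite_string(data, pos, expected_major):
    """Read one definite-length string; return (payload bytes, next position)."""
    major, ai = data[pos] >> 5, data[pos] & 0x1F
    if major != expected_major or ai > 27:
        raise ValueError("expected a definite-length string of the same major type")
    pos += 1
    if ai < 24:
        length = ai
    else:
        size = 1 << (ai - 24)              # 1, 2, 4 or 8 length bytes
        length = int.from_bytes(data[pos:pos + size], "big")
        pos += size
    return data[pos:pos + length], pos + length


def format_chunk(major, payload):
    """Render one chunk: h'..' for byte strings, a quoted literal for text."""
    if major == 2:
        return "h'" + payload.hex() + "'"
    return json.dumps(payload.decode("utf-8"), ensure_ascii=False)


def dump_string(data, pos=0):
    """Return (diagnostic text, next position) for the string at data[pos]."""
    major, ai = data[pos] >> 5, data[pos] & 0x1F
    if major not in (2, 3):
        raise ValueError("not a byte or text string")
    if ai != 31:                           # definite length: a single chunk
        payload, pos = read_definite_string(data, pos, major)
        return format_chunk(major, payload), pos
    pos += 1
    out = []
    printed_any = False                    # the state shared with comma handling
    while data[pos] != 0xFF:               # 0xff is the break byte
        chunk, pos = read_definite_string(data, pos, major)
        out.append(", " if printed_any else "(_ ")
        out.append(format_chunk(major, chunk))
        printed_any = True
    pos += 1                               # consume the break
    if printed_any:
        out.append(")")
    else:
        out.append("''_" if major == 2 else '""_')   # no chunks at all
    return "".join(out), pos
```

With this sketch, the input 5f ff yields ''_ while 5f 40 ff (one empty chunk) yields (_ h''), which keeps the two cases distinguishable.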

> I have a preference for not adding mechanism where it is not needed (even
> though this requires some code in the implementation — but that is true for
> both cases).
> Let’s discuss this some more today at the CBOR interim meeting.

-- 
Thiago Macieira - thiago.macieira (AT)
  Software Architect - Intel DPG Cloud Engineering