[Cbor] Robert Wilton's No Objection on draft-ietf-cbor-7049bis-14: (with COMMENT)

Robert Wilton via Datatracker <noreply@ietf.org> Wed, 09 September 2020 11:55 UTC


Robert Wilton has entered the following ballot position for
draft-ietf-cbor-7049bis-14: No Objection

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)

Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:



Thank you for your work on this document, and for bringing it to full standard.
Since I'm a big fan of CBOR and try to evangelize it whenever possible, I'm
pleased to see this happening.

However, I have one minor annoyance with CBOR, which is the range of negative
numbers that are encoded in major type 1.  My gripe is that the encoding allows
for negative integers that are not easily representable in a simple form in
most programming languages without using something equivalent to BigInteger.

E.g., all values below -2^63 won't fit into an int64 type, and the magnitude
2^64 of the lowest value won't even fit into a uint64 used to represent a
negative number (unless, obviously, it followed the CBOR encoding semantics of
being offset by 1).

For a generic decoder I presume that this isn't an issue, since it can fall back
to something like BigInteger.  But I would presume that other decoders, handling
normal sized integer datatypes, would effectively regard anything smaller than
-2^63 as not well-formed for their specific problem.
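
As a rough illustration of the range check such a decoder needs (a sketch
only, not taken from the draft; the function name cbor_negint_to_int64 is
hypothetical), the argument of a major type 1 item can be converted to an
int64_t like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert the argument of a major type 1 item
 * into an int64_t.  The encoded value is -1 - arg, so only arguments
 * up to INT64_MAX (2^63 - 1) map into the int64_t range. */
bool cbor_negint_to_int64(uint64_t arg, int64_t *out)
{
    assert(out != NULL);
    if (arg > (uint64_t)INT64_MAX) {
        return false;            /* value would be below -2^63 */
    }
    *out = -1 - (int64_t)arg;    /* cannot overflow: arg <= INT64_MAX */
    return true;
}
```

A caller that gets false back has to decide for itself whether to reject the
item or to fall back to a wider representation.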

I'm not suggesting that this should be changed (hence comment not a discuss),
but there are a couple of places in the document where it might be helpful to
warn implementors about this; I have mentioned them below.

Other minor comments:

    3.  Specification of the CBOR Encoding

       Major type 0:  an integer in the range 0..2**64-1 inclusive.  The
          value of the encoded item is the argument itself.  For example,
          the integer 10 is denoted as the one byte 0b000_01010 (major type
          0, additional information 10).  The integer 500 would be
          0b000_11001 (major type 0, additional information 25) followed by
          the two bytes 0x01f4, which is 500 in decimal.

       Major type 1:  a negative integer in the range -2**64..-1 inclusive.
          The value of the item is -1 minus the argument.  For example, the
          integer -500 would be 0b001_11001 (major type 1, additional
          information 25) followed by the two bytes 0x01f3, which is 499 in

Would writing "0 to 2**64-1" be clearer than 0..2**64-1?  Or otherwise
perhaps mention in the terminology section that "x..y" is used to
represent the inclusive range of all values from x to y, including x and y.
Also, note that here, where ".." has been used, it explicitly states that it is
inclusive, but that doesn't appear to be the case everywhere.

I suggest changing "Major type 0:  an integer ..." back to "Major type 0:  an
unsigned integer", as in RFC7049, because the type is referred to as "Unsigned
integer".  It also makes it more consistent with the definition of Major type 1.
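
For reference, the byte sequences in the quoted examples (10, 500 and -500)
can be reproduced with a short sketch.  This is only an illustration under my
own assumptions: the function name cbor_encode_head and the 9-byte output
buffer are hypothetical, not taken from the draft.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write the initial byte and argument of a data
 * item into buf (at least 9 bytes) and return the number of bytes
 * written.  The shortest argument form is always chosen. */
size_t cbor_encode_head(unsigned major, uint64_t arg, uint8_t *buf)
{
    assert(buf != NULL && major <= 7);
    uint8_t mt = (uint8_t)(major << 5);   /* major type in the top 3 bits */
    size_t nbytes;                        /* argument bytes that follow */

    if (arg < 24) {                       /* argument fits in the initial byte */
        buf[0] = (uint8_t)(mt | arg);
        return 1;
    }
    if (arg <= UINT8_MAX) {
        buf[0] = (uint8_t)(mt | 24); nbytes = 1;
    } else if (arg <= UINT16_MAX) {
        buf[0] = (uint8_t)(mt | 25); nbytes = 2;
    } else if (arg <= UINT32_MAX) {
        buf[0] = (uint8_t)(mt | 26); nbytes = 4;
    } else {
        buf[0] = (uint8_t)(mt | 27); nbytes = 8;
    }
    for (size_t i = 0; i < nbytes; i++) { /* network byte order */
        buf[1 + i] = (uint8_t)(arg >> (8 * (nbytes - 1 - i)));
    }
    return 1 + nbytes;
}
```

With this sketch, major type 0 with argument 500 gives the bytes 0x19 0x01
0xf4, and major type 1 with argument 499 (that is, -500) gives 0x39 0x01 0xf3.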

    3.2.1.  The "break" Stop Code

       The "break" stop code is encoded with major type 7 and additional
       information value 31 (0b111_11111).  It is not itself a data item: it
       is just a syntactic feature to close an indefinite-length item.

       If the "break" stop code appears anywhere where a data item is
       expected, other than directly inside an indefinite-length string,
       array, or map -- for example directly inside a definite-length array
       or map -- the enclosing item is not well-formed.

I was wondering whether it would be helpful to clarify that by
indefinite-length string it means text or byte string?  Although this becomes
clear in section 3.2.3 anyway ...  My thinking is that section 3.2 lists 4
types that can have indefinite length, and then in this section both types of
string are treated together.

    3.2.3.  Indefinite-Length Byte Strings and Text Strings

Would it be helpful to clarify that the chunks must be the same type?  E.g. you
cannot have a byte string that contains text string chunks and vice versa.

    Expected Later Encoding for CBOR-to-JSON Converters

"Tags number 21 to 23 ..." => "Tag numbers 21 to 23 ..."

    4.2.1.  Core Deterministic Encoding Requirements

          Floating-point values also MUST use the shortest form that
          preserves the value, e.g. 1.5 is encoded as 0xf93e00 and 1000000.5
          as 0xfa49742408.  (One implementation of this is to have all
          floats start as a 64-bit float, then do a test conversion to a
          32-bit float; if the result is the same numeric value, use the

I find this paragraph slightly opaque, and I would suggest spelling out that
1.5 has been encoded as a 16-bit IEEE float, whereas 1000000.5 has been
encoded as a 32-bit IEEE float.  The same comment applies to 4.2.2 as well.
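
To make the "shortest form that preserves the value" test concrete, here is a
minimal sketch that reports which width a value needs.  It is an illustration
under stated assumptions, not the draft's method: the names fits_in_half and
cbor_shortest_float_bits are hypothetical, float and double are assumed to be
IEEE 754 binary32 and binary64, and NaN handling is left out.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* True if the binary32 value f is exactly representable as an
 * IEEE 754 binary16 (half-precision) number.  NaNs are rejected. */
static bool fits_in_half(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);
    memcpy(&bits, &f, sizeof bits);
    uint32_t exp  = (bits >> 23) & 0xffu;
    uint32_t mant = bits & 0x7fffffu;

    if (exp == 0xffu) return mant == 0;          /* infinity yes, NaN no */
    if (exp == 0)     return mant == 0;          /* zero yes, float subnormals no */
    int e = (int)exp - 127;                      /* unbiased exponent */
    if (e > 15)   return false;                  /* larger than 65504 */
    if (e >= -14) return (mant & 0x1fffu) == 0;  /* normal half: 10 mantissa bits */
    if (e < -24)  return false;                  /* below smallest half subnormal */
    /* Half subnormal range: value must be a multiple of 2^-24. */
    uint32_t sig = 0x800000u | mant;             /* 24-bit significand */
    uint32_t low = (1u << (unsigned)(-(e + 1))) - 1u;
    return (sig & low) == 0;
}

/* Returns 16, 32 or 64: the narrowest width that preserves d exactly. */
int cbor_shortest_float_bits(double d)
{
    if (d != d) return 64;                       /* NaN: not handled in this sketch */
    if (d > FLT_MAX || d < -FLT_MAX) {
        /* Infinity fits in 16 bits; finite values this large need 64. */
        return (d > DBL_MAX || d < -DBL_MAX) ? 16 : 64;
    }
    float f = (float)d;                          /* test conversion to 32 bits */
    if ((double)f != d) return 64;
    return fits_in_half(f) ? 16 : 32;
}
```

Under these assumptions, 1.5 reports 16 (matching 0xf93e00) and 1000000.5
reports 32 (matching 0xfa49742408).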

I also noticed that in most places the document refers to "floating-point" but
in a few places "floating point" is used instead.

    5.5.  Numbers

As per my top comment, I think that it would be useful to raise in this section
that CBOR can encode negative values that cannot normally be represented in a
compact form.

    6.1.  Converting from CBOR to JSON

       Most of the types in CBOR have direct analogs in JSON.  However, some
       do not, and someone implementing a CBOR-to-JSON converter has to
       consider what to do in those cases.  The following non-normative
       advice deals with these by converting them to a single substitute
       value, such as a JSON null.

       *  An integer (major type 0 or 1) becomes a JSON number.

Is it worth referencing back to section 5.5 on JavaScript numbers and
explicitly warning that not all CBOR integers can be precisely represented as
JSON numbers, and that there may be a loss of precision?
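
As an illustration of where precision is lost, the sketch below checks whether
a CBOR integer survives conversion to an IEEE 754 binary64 value.  It assumes
that the JSON consumer stores numbers as binary64, as JavaScript does; the
function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the magnitude m is exactly representable in binary64: after
 * dropping trailing zero bits, it must fit in the 53-bit significand. */
static bool magnitude_exact_in_binary64(uint64_t m)
{
    while (m != 0 && (m & 1u) == 0) {
        m >>= 1;
    }
    return m < (UINT64_C(1) << 53);
}

/* major is 0 (value = arg) or 1 (value = -1 - arg). */
bool cbor_int_exact_in_binary64(unsigned major, uint64_t arg)
{
    assert(major == 0 || major == 1);
    if (major == 0) {
        return magnitude_exact_in_binary64(arg);
    }
    if (arg == UINT64_MAX) {
        return true;             /* -2^64 is a power of two, hence exact */
    }
    return magnitude_exact_in_binary64(arg + 1);   /* magnitude is arg + 1 */
}
```

For example, 2^53 is exact under this check, but 2^53 + 1 is not.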

    Appendix C.  Pseudocode

       Major types 0 and 1 are designed in such a way that they can be
       encoded in C from a signed integer without actually doing an if-then-
       else for positive/negative (Figure 2).  This uses the fact that
       (-1-n), the transformation for major type 1, is the same as ~n
       (bitwise complement) in C unsigned arithmetic; ~n can then be
       expressed as (-1)^n for the negative case, while 0^n leaves n
       unchanged for non-negative.  The sign of a number can be converted to
       -1 for negative and 0 for non-negative (0 or positive) by arithmetic-
       shifting the number by one bit less than the bit length of the number
       (for example, by 63 for 64-bit numbers).

This was another place where I thought that it might be useful to warn the
reader about decoding negative integers, and the risk of overflow when taking a
major type 1 value into an int64 native type.
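
The encoding direction described in the quoted text can be sketched as
follows.  This is not the draft's Figure 2: the function name cbor_split_sint
is hypothetical, and the sign mask is built with unsigned arithmetic rather
than n >> 63, because right-shifting a negative signed value is
implementation-defined in ISO C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split a signed integer into the CBOR major
 * type (0 or 1) and the unsigned argument, without branching on the
 * sign when computing the argument. */
void cbor_split_sint(int64_t n, unsigned *major, uint64_t *arg)
{
    assert(major != NULL && arg != NULL);
    /* All ones for negative n, zero otherwise; same result as an
     * arithmetic shift of n by 63 bits. */
    uint64_t sign = (uint64_t)0 - (uint64_t)(n < 0);
    *major = (unsigned)(sign & 1u);   /* 1 for negative, 0 otherwise */
    *arg = (uint64_t)n ^ sign;        /* ~n for negative, n otherwise */
}
```

Starting from an int64_t, the argument produced this way never exceeds
2^63 - 1.  The reverse is not true: a received major type 1 argument can be as
large as 2^64 - 1, which is why the decoding direction needs a range check
before the value is stored in an int64.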