Re: [Cbor] WGLC review of draft-ietf-cbor-7049bis-09

Jim Schaad <ietf@augustcellars.com> Sat, 07 December 2019 19:24 UTC

From: Jim Schaad <ietf@augustcellars.com>
To: 'Carsten Bormann' <cabo@tzi.org>
CC: draft-ietf-cbor-7049bis@ietf.org, cbor@ietf.org
References: <036501d5ac98$201dd490$60597db0$@augustcellars.com> <CF8D72A5-5378-4550-9311-62241E28746F@tzi.org>
In-Reply-To: <CF8D72A5-5378-4550-9311-62241E28746F@tzi.org>
Date: Sat, 07 Dec 2019 11:24:36 -0800
Message-ID: <037901d5ad33$f86f9fe0$e94edfa0$@augustcellars.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Thread-Index: AQHQ2zLAodfcg8UghEvdulp8QNZcLgG3QVSZp6ryloA=
Content-Language: en-us
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/dvqYV2-VIL1zP1bD46GCsrlr8_Y>
Subject: Re: [Cbor] WGLC review of draft-ietf-cbor-7049bis-09
Precedence: list

-----Original Message-----
From: Carsten Bormann <cabo@tzi.org> 
Sent: Saturday, December 7, 2019 9:37 AM
To: Jim Schaad <ietf@augustcellars.com>
Cc: draft-ietf-cbor-7049bis@ietf.org; cbor@ietf.org
Subject: Re: [Cbor] WGLC review of draft-ietf-cbor-7049bis-09

Hi Jim,

Thank you for this review.

> On Dec 7, 2019, at 01:49, Jim Schaad <ietf@augustcellars.com> wrote:
> 
> Here is my WGLC review of the document.
> 
> *  In paragraph 2 of section 1, I don't understand how the first 
> sentence leads into the second sentence.  The first talks about 
> specific design goals, so I expected a discussion of some of those 
> goals.  Instead I get a discussion of the data model.  Paragraph 3 
> seems to follow better from the first sentence of paragraph 2.

I don’t think the first sentence is meant to lead into the second sentence.  This brief introduction provides a set of facts that distinguish CBOR from previous efforts but are not necessarily interrelated.
Paragraph 3 could be moved as a second sentence of paragraph 2, but that would distract a bit from the brief explanation of what CBOR is.

[JLS] Let's have Paul look at it.  It just reads really badly to me.  Forward references would be one way to say that they are not related items.

> * In section 1 - paragraph 3.  I think this would fit better in 
> section 1.1 rather than doing the forward reference.

Section 1.1 is about the objectives behind CBOR.  Section 1 paragraph 3 is about previous formats.

If we do want to restructure this, maybe it would be worth to do a subsection “history” (or “related work”?) with the JSON material and the pointer to appendix E.

[JLS] That would be part of the "are not well met" in the first sentence.   I don’t know that a history section would help here.

> * Section 1.2 - The terminology of Encoder and Decoder do not seem to 
> match each other.  The decoder decodes a well-formed CBOR data item.  
> The Encoder generates the representation of a CBOR data item.  I 
> assume that the decoder should decode the representation of a 
> well-formed CBOR data item.  The other way around would be that the Encoder creates a well-formed CBOR data item.

Right.  This is one instance where we don’t properly handle the distinction between the abstract data item and its representation format/encoded data item.

This is an issue that would need to go through the whole definition section and probably much more; e.g., well-formed is a property of an instance of the representation format.  Should we go for that?

[JLS] I hope that is not needed.

Symmetry between encoder and decoder should be easy to achieve even without such massive surgery, e.g., s/well-formed CBOR data item/well-formed encoded CBOR data item/ (maybe even without the “CBOR”).

[JLS] I think you need some changes on the Encoder side as well.

> * Section 2.2 last paragraph.  /be considered duplicates/be considered
> duplicates, even if encoded as different major types,/   This is a belts and
> suspenders suggestion but not something I would die for.

I’m not sure I like the “if” (integer 0 and float 0.0 are always encoded as different major types).  How about “while”?

[JLS] That is just fine.  Other approach would be /even if/when/

> * Section 3 - para 1 - I do not believe that the encoding is 
> summarized in table 6.  This table just summarized the first byte of 
> the encoding.  Is this pointing to the right table or is the text just slightly messed up?

It is pointing to the right table.  That table is indeed indexed by the first byte, but does have more information in its right column.  (It *is* a summary.)

How about:
"The encoding is
summarized in {{jumptable}}, indexed by the initial byte.”
(in PR below).

[JLS] fine.

> * Section 3.2.3 - The text is silent on a chuck being of length zero.
> Should this be stated as discouraged but legal?

I don’t know why this needs to be discouraged but it certainly is not ruled out.
To this reader, chunks of length 0 need to be discussed as much as chunks of length 4711, which are also not ruled out.

[JLS] Yes, but checks of length 4711 are, if not expected, useful.  A chuck of length 0 is not useful.

> * Section 3.4.2 - Is this section and 3.4.3 supposed to be at the same 
> level s 3.4.1 or are they meant to be children?

Fixed in PR below.  I’m not a big fan of subsubsubsections, so I just got rid of the section head for 3.4.1 entirely, adding the last paragraph to the introduction of Section 3.4:

Protocols using tag numbers 0 and 1 extend the generic data model
({{cbor-data-models}}) with data items representing points in time; tag numbers 2 and 3, with arbitrarily sized integers; and tag numbers
4 and 5, with floating point values of arbitrary size and precision.

[JLS] works for me

> * Section 4.1 - I don't know what 1_000_000_000 is supposed to represent.
> My first assumption is that this is decimal, but it is not completely 
> clear that is correct.

Good languages have had these for a while, but even Java 7, C#, and Python allow these underscores now… Changed text explaining this in 1.2 slightly in the PR below:
Underscores can be added to a number solely for readability.

[JLS] I have never seen this.   I don't know that I would agree that this represents a good practice, but ok

> * Section 4.2.2 - Do we want to include a point in this section about 
> saying that applications may also need to set rules about how the 
> content of a tag is going to be expressed.  One example would be tag 
> 6.1 where an application could state that all times are in UTC or that 
> times are only expressed to the whole second.

I would add (in PR below):  Deterministic encoding considerations also apply to the content of tags.
Would that address your point?

[JLS] Works for me

> * Section 5.2 - I think that /containing a byte string or a text 
> string/containing a byte string or containing a text string/  The byte 
> string is an error in itself, the text string is only an issue if the 
> embedded text is incorrect.

Added a comma to that (in PR below).

[JLS] WFM

> * Section 9.2 - Are you intending to update the template to point to 
> this document rather than saying with RFC 7049?  This is not stated here.

Not sure about updating the template, but the registries should indeed be updated to this specification.

Added at the end of Section 9 in PR below:

\[To be removed by RFC editor:]
IANA is requested to update these registries to point to the present document instead of RFC 7049.

[JLS] As noted in the PR, I think IANA is going to want this instruction to remain in the document.

> * Appendix B - Is there a reason why the first entry is Integer rather 
> than Unsigned Integer?

Not really (well, 0 to 23 are unsigned even if we don’t say so :-).
Fixed in PR below.

[JLS] WFM

> * Pseudo code - Should there be a break after case 7  (switch (mt))?  
> Having it would be cleaner

It is cleaner if we plan to update this code later.
I don’t think we do…
(And I don’t think we need another line in that code.)

[JLS] I prefer it, but that is just fine.

** Jim

> * Appendix F -  This is missing some things which seem to be rather 
> important.
> - New method to sort maps
> - Lots of work on describing the CBOR Data Models
> - Serialization considerations  + Preferred - Canonical
> - Change in how validation/validity is described
> - Appendix G

Indeed.
I think we need to do a separate PR for the list of changes.

> * Appendix G - /exexactly/exactly/

Fixed in PR #144 (already merged).

I have prepare a pull request with the changes that I already know what I would propose to make from these comments:

https://github.com/cbor-wg/CBORbis/pull/149

Of course, the questions raised above could lead to more changes.

Grüße, Carsten

[Cbor] WGLC review of draft-ietf-cbor-7049bis-09 Jim Schaad
Re: [Cbor] WGLC review of draft-ietf-cbor-7049bis… Laurence Lundblade
Re: [Cbor] WGLC review of draft-ietf-cbor-7049bis… Carsten Bormann
Re: [Cbor] WGLC review of draft-ietf-cbor-7049bis… Jim Schaad
Re: [Cbor] WGLC review of draft-ietf-cbor-7049bis… Carsten Bormann