Re: [Cbor] 🔔 WGLC on draft-ietf-cbor-7049bis-09

Francesca Palombini <francesca.palombini@ericsson.com> Wed, 12 February 2020 09:48 UTC

From: Francesca Palombini <francesca.palombini@ericsson.com>
To: Carsten Bormann <cabo@tzi.org>, "cbor@ietf.org" <cbor@ietf.org>
CC: "draft-ietf-cbor-7049bis@ietf.org" <draft-ietf-cbor-7049bis@ietf.org>
Date: Wed, 12 Feb 2020 09:47:56 +0000
Message-ID: <A6A773E2-A3A3-4C84-B807-42352D6D895F@ericsson.com>
References: <293AFF31-D0EF-45D6-9B9D-E8136481C404@ericsson.com> <A808010A-AD61-4FEA-A79F-9AB669E38B6A@ericsson.com> <445FA6E3-5C29-476F-9AEB-716EAE1D8847@tzi.org>
In-Reply-To: <445FA6E3-5C29-476F-9AEB-716EAE1D8847@tzi.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/Ts0vvYlQSjzgnC_fyrVsdaQWanU>

Hi Carsten,

Thanks for taking care of these. You can go ahead and merge the PR; it looks good to me.
Detailed answers inline.

Francesca

On 29/01/2020, 15:49, "Carsten Bormann" <cabo@tzi.org> wrote:

    Hi Francesca,
    
    here are my comments on your review.
    
    > * Section 3.4.7
    >  
    > While reading this again, I realized that CBOR sequences cannot be tagged, as by definition they are not one data item. I think being able to tag a CBOR sequence with the self-describing tag in the scenario described in this section would be good.
    
    You can tag any single data item in the CBOR sequence.
    Since CBOR sequences are just concatenated encoded data items, I see no easy way to add some overall information to the sequence itself.
    
FP: Right. Maybe this was more of a high-level consideration than a request to change anything.
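
To make the point about sequences concrete, here is a minimal Python sketch. The helper names (encode_head, encode_uint, encode_text, tag) are hypothetical and not taken from the draft or from any library. It assumes only the basic head encoding and tag number 55799 (self-described CBOR), and shows that the tag wraps one data item of the sequence, while the sequence as a whole is just the concatenation of its encoded items.

```python
import struct

def encode_head(major_type, argument):
    """Encode a CBOR head (major type plus argument) in its shortest form."""
    initial = major_type << 5
    if argument < 24:
        return bytes([initial | argument])
    if argument < 2**8:
        return bytes([initial | 24, argument])
    if argument < 2**16:
        return bytes([initial | 25]) + struct.pack(">H", argument)
    if argument < 2**32:
        return bytes([initial | 26]) + struct.pack(">I", argument)
    return bytes([initial | 27]) + struct.pack(">Q", argument)

def encode_uint(value):
    """Major type 0: unsigned integer."""
    return encode_head(0, value)

def encode_text(text):
    """Major type 3: definite-length text string."""
    data = text.encode("utf-8")
    return encode_head(3, len(data)) + data

def tag(tag_number, encoded_item):
    """Major type 6: a tag number followed by exactly one data item."""
    return encode_head(6, tag_number) + encoded_item

SELF_DESCRIBED_CBOR = 55799  # tag number for self-described CBOR

# A CBOR sequence is the plain concatenation of encoded data items.
# Only the first item carries the tag; nothing wraps the sequence itself.
first_item = tag(SELF_DESCRIBED_CBOR, encode_uint(1))
second_item = encode_text("a")
sequence = first_item + second_item
```

In this sketch the first item starts with the bytes d9 d9 f7, so a reader that inspects the start of the sequence sees the marker, but the marker still belongs to that one item only.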

    > * Section 4.2.2
    >  
    > Second to last sub-bullet: "If a protocol includes a field that can express integers..."
    >  
    > I noted an inconsistency here with the text on preferred encoding, which prefers using major types 0 and 1 (see text in section 3.4.4. "The preferred encoding of an integer...")
    
    That section has recently been edited, and there are some new edits waiting in #165.
    I hope to be able to cover this in #165.  The intention is to recommend using the preferred encoding for integers that fit into mt0/1, i.e., basic integers.  But a protocol could deviate, at the cost of requiring more work in the application and possibly the generic codec (which would need to separately handle the non-preferred case).
     
FP: The revised text works for me (either before or after #165, the text I am looking at is just moved).
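
As an illustration of what preferred encoding means for basic integers (those that fit into major types 0 and 1), here is a minimal Python sketch. The function name encode_int_preferred is hypothetical. It assumes the usual rule of choosing the shortest head that can hold the argument, and it does not cover bignums.

```python
import struct

def encode_int_preferred(value):
    """Encode an integer with major type 0 or 1, using the shortest head."""
    if value >= 0:
        major_type, argument = 0, value
    else:
        # Major type 1 carries the argument n for the value -1 - n.
        major_type, argument = 1, -1 - value
    if argument >= 2**64:
        raise ValueError("does not fit major types 0/1 (would need a bignum)")

    initial = major_type << 5
    if argument < 24:
        return bytes([initial | argument])
    # Pick the smallest of the 1-, 2-, 4- and 8-byte argument widths.
    widths = ((24, ">B", 2**8), (25, ">H", 2**16),
              (26, ">I", 2**32), (27, ">Q", 2**64))
    for additional_info, fmt, limit in widths:
        if argument < limit:
            return bytes([initial | additional_info]) + struct.pack(fmt, argument)
```

A protocol that deviates from this (for example, by always sending 4-byte arguments) would still produce well-formed CBOR, but a generic decoder checking for preferred encoding would have to treat that case separately.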

    > * Section 9.5
    >  
    > Considering the Apps Area Working Group does not exist anymore, should the contact here be updated?
    
    Yes, I have added this to #161.
    My assumption is that the change controller simply is IESG.
    I note a lot of variation in how this is handled in recent RFCs, e.g., in RFC 8628 there are registrations that have change controller IETF (OAuth extensions); all others there have IESG.

FP: I think IESG + cbor-wg as contact would be OK. Or possibly, art@ietf.org.
    
    > Minor/Editorials
    >  
    > * Contributing
    >  
    > It might be good to put in a note for the RFC Editor to remove this section.
    
    (BTW, the thing is a “note”, not a section; it sits at the same level as abstracts.)
    Added in new branch “francesca-editorial”.

FP: Thanks.
     
    > 
    > * Section 1.1, Point 2, sub-bullet 1
    >  
    > For readability, I would put an example figure for "very small amount of code" directly in the text, in the parentheses that mention class 1 constrained nodes.
    
    What would be in that example?
    I’d be very hesitant in adding much to this section, which should stay concise.

FP: For example:
OLD: "(for example, in class 1 constrained
          nodes as defined in [RFC7228])"
NEW: "(for example, in class 1 constrained
          nodes with ~ 100 KiB code size as defined in [RFC7228])"
This would give a direct example in the text above, without the reader having to look it up in the reference. But again, this is very minor, take it or leave it.
    
    >  
    > * Section 1.1, Point 4, sub-bullet 1
    >  
    > "and by implementation complexity maintaining a lower bound" does not read correctly to me. Am I missing something?
    
    Citing the full text:
    
    4. The serialization must be reasonably compact, but data compactness
       is secondary to code compactness for the encoder and decoder.
    
        * "Reasonable" here is bounded by JSON as an upper bound in size,
          and by implementation complexity maintaining a lower bound.
    
    This is a bound to “reasonable”: we don’t want to increase implementation complexity considerably to improve compactness.  So implementation complexity maintains a lower bound to what is reasonable to achieve for the size.  I have tried rephrasing that in the new branch.

FP: Thanks, that reads better to me.
     
    > * Section 1.1, Point 5, sub-bullet 1
    >  
    > I would suggest adding "for example for devices of class 1" as an example of what "reasonably frugal in CPU" means.
    
    But it means much more than that, as the next sentence explains.
    Again, I’d prefer to avoid adding much complexity to this section.

FP: Ok.
     
    > * Section 1.2
    >  
    > The term "representation format" is only used in this section twice, everywhere else "encoded data item" is used. I would go ahead and remove this formulation, and only use encoded data item. This would also make the part on decoder and encoder more symmetric (right now Decoder talks about "encoded data item" and Encoder talks about "representation format").
    
    While this would simplify things, it would also increase the reliance on a term that the reader is still trying to understand at this point (essentially increasing circularity).  “Representation format” should be sufficiently generic for introductory text, with no need for definitions.

FP: Ok, that makes sense.
     
    > * Section 1.2, "Valid: "
    >  
    > This paragraph talks about "semantic restrictions that apply to CBOR data items"; it would be good to add a hint on where these are defined in the specification.
    
    Right.  That would be a reference to Section 5.3, I’d assume, not to all ~ 81 places that talk about validity?

FP: One reference is probably good enough.
     
    > * Section 2
    >  
    > "A simple value, identified by a number between 0 and 255, but distinct from that number"
    >  
    > I had to read this sentence several times to understand that the part "but distinct from that number" is meant to note to the reader that the value of the item is not the number it's identified by. I would formulate as written here, rather than as it is now. ("Note that the value of the item is not the number it's identified by")
    
    Well, we need to define it, not just adding notes.
    Hmm.

FP: Sure, I still think the current formulation might be confusing, in particular that "distinct" refers to the item's value. But I am not a native speaker, and if it's only me I'm ok with leaving as is (or with the small change in PR).
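
One way to picture "identified by a number, but distinct from that number" is the following Python sketch. The SimpleValue class and encode_simple function are hypothetical names chosen for illustration, not taken from the draft. The sketch assumes the one-byte form for simple values 0..23 and the two-byte form for 32..255.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SimpleValue:
    """A simple value: labelled by a number, but not equal to that number."""
    number: int

def encode_simple(simple):
    """Encode a simple value using major type 7."""
    n = simple.number
    if 0 <= n <= 23:
        return bytes([0xE0 | n])   # one-byte form
    if 32 <= n <= 255:
        return bytes([0xF8, n])    # two-byte form
    raise ValueError("simple value number not encodable in this sketch")

def encode_small_uint(n):
    """Encode an unsigned integer below 24 using major type 0."""
    if not 0 <= n < 24:
        raise ValueError("only 0..23 supported in this sketch")
    return bytes([n])

# The integer 16 and the simple value numbered 16 are different data items,
# both in the data model and on the wire.
as_integer = encode_small_uint(16)
as_simple = encode_simple(SimpleValue(16))
```

Here the integer 16 encodes as the byte 0x10, while the simple value numbered 16 encodes as 0xf0; the number only identifies the simple value.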
     
    > * Section 3.1, Major type 4
    >  
    > The text states that arrays can also be called sequences. With the publication of CBOR Sequences, can we remove this statement, as sequences are (although related) different things?
    
    Good point.
    CBOR sequences are, but arrays are still called sequences in many other places (which might include the applications sitting on top of CBOR).
    Need to think about a better way to say “are also called”.
    Attempt in new branch.

FP: Good attempt, thanks. I think "In other formats" is good enough. Also it's the first time I see "?" used in references in markdown, what is that for?
     
    > * Section 3.1, Major type 5
    >  
    > Using underscoring to highlight a term (in this case "pairs") should be explained in terminology.
    
    Solved by RFCXMLv3 :-)
    (Still needed for the plain text version, as is the equivalent for typeset versions.)

FP: Nice.
     
    > * Section 3.1, Major type 6
    >  
    > To be consistent with other major types, it might be good to briefly mention ranges for tags here.
    
    Good point!

FP: Thanks for adding it.
     
    > * Section 3.2
    >  
    > While the motivation for arrays and maps is obvious, I would have appreciated some more text on motivation (or an example of use case) where indefinite-length strings are useful.
    
    That would be an expansion of calling out “streaming”, right?
    Attempt is in the branch.

FP: To be honest I am not sure what I had in mind, but the text is good.
    
    > * Section 3.2.3
    >  
    > I don't understand the link between the previous sentence and this one:
    >  
    > "   Note that this implies that the bytes of a single UTF-8 character
    >    cannot be spread between chunks: a new chunk can only be started at a
    >    character boundary."
    >  
    > Nor am I sure of the meaning of the term "spread" here.
    
    All component strings need to be valid, i.e., sequences of UTF-8 characters; this means a single character cannot be started in one component string and then go on in the next component string.
    "split up" may be better.

FP: Ah I see. Thanks for clarifying! 
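
The chunk-boundary rule can be sketched in Python as follows. The function names are hypothetical, and the sketch limits chunk sizes to 4..255 bytes to keep the head encoding short. It assumes the framing of an indefinite-length text string as 0x7f, then definite-length text string chunks, then the 0xff break byte.

```python
def split_at_char_boundaries(text, max_chunk_bytes):
    """Split text into UTF-8 chunks without cutting through a character."""
    if not 4 <= max_chunk_bytes <= 255:
        raise ValueError("this sketch supports chunk sizes of 4..255 bytes")
    chunks, current = [], b""
    for char in text:
        encoded = char.encode("utf-8")  # 1 to 4 bytes per character
        if current and len(current) + len(encoded) > max_chunk_bytes:
            # Start a new chunk rather than splitting this character.
            chunks.append(current)
            current = b""
        current += encoded
    if current:
        chunks.append(current)
    return chunks

def encode_text_chunk(chunk):
    """Encode one chunk as a definite-length text string (major type 3)."""
    length = len(chunk)
    head = bytes([0x60 | length]) if length < 24 else bytes([0x78, length])
    return head + chunk

def encode_indefinite_text(text, max_chunk_bytes):
    """Encode text as an indefinite-length text string."""
    chunks = split_at_char_boundaries(text, max_chunk_bytes)
    body = b"".join(encode_text_chunk(chunk) for chunk in chunks)
    return b"\x7f" + body + b"\xff"

# "a" is 1 byte, "é" is 2 bytes, "€" is 3 bytes in UTF-8.
# A naive split after 4 bytes would cut "€" in the middle.
example_chunks = split_at_char_boundaries("a\u00e9\u20ac", 4)
example_encoded = encode_indefinite_text("a\u00e9\u20ac", 4)
```

Every chunk produced this way decodes as valid UTF-8 on its own, which is what the requirement on component strings amounts to.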
    
    >  
    > * Section 3.3
    >  
    > Re-appearance of the term "sequence", which I would still avoid.
    
    Yes.  We also wanted to avoid “byte string”, because that is always confusing (in particular if the data item encodes a byte string).  Maybe “sequence of bytes”?  Or maybe:
    
    For example, assume an encoded data item consisting of the bytes:
    
FP: Yes, perfect.
    
    >  
    > * Table 4
    >  
    > I would have explicitly stated what data items were allowed for each tag number, rather than writing "multiple".
    
    We could do that.  There are two cases: Essentially anything (21–23, 55799), and the numbers allowed by tag 1; for the latter we could write “integer or float”, and “(any)” for the former.

FP: That works, thanks. Even if the information was already in the text, I think this improves readability.
    
    > * Section 3.4.4
    >  
    > The term "preferred encoding" appears here for the first time without any reference or introduction.
    
    (I think this is now 3.4.3.)
    The text now has a pointer to Section 4.1, and we are using “preferred serialization”.

FP: Yes, thanks. I see that's the case in my branch, not in the submitted version yet.
    
    >  
    > * Section 3.4.5
    >  
    > "while the mantissa also can be" -> "while the mantissa can also be"
    
    Yes.
     
    > * Section 3.4.5
    >  
    > Expand NaN on first appearance here (instead of 5.6.1)
    
    #165 now has a mention (and expansion) in the terminology section.
    We could expand here as well, but I’d probably leave the expansion in 5.6.1, because there is not much context about non-finites here, while there is in 5.6.1.
     
    > * Section 4.2.2
    >  
    > "may want to exclude them from interchange, interchanging"
    >  
    > I would reformulate this.
    
    Because it is wrong, misleading, or because the same word is used twice?
    
    If the latter,
    "may want to exclude them from the protocol format, interchanging"
    maybe?

FP: Same word used twice. The reformulation works.
     
    > * Section 4.2.3
    >  
    > Capitalize section title
    
    I’m sure the RFC editor will have a lot of these.
    Fixed in the branch now.

FP: I'm sure, but I try to make you spend as little time as possible in the RFC editor's hands.
     
    > * Section 4.2.3
    >  
    > First paragraph: please add a reference to 4.2.1 when talking about core deterministic encoding requirements.
    
    Yes.
     
    > * Section 5.4
    >  
    > "A generic encoder also may want" -> "A generic encoder may also want"
    
    Yes.
     
    > * Section 5.6
    >  
    > "Duplicate keys are also prohibited by CBOR decoders that
    >    enforce validity (Section 5.4)."
    >  
    > I have a slight problem with the term "prohibited by" decoders... Decoders do not prohibit, at most they do not accept.
    
    Good point.  So let’s say “not accepted”.

FP: nice.
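
What "not accepted" could look like in a decoder that enforces validity is sketched below in Python. The function name check_no_duplicate_keys is hypothetical, and the sketch assumes the map content has already been decoded into a list of (key, value) pairs with hashable keys. Keys are compared together with their Python type, on the assumption that an application wants the integer 1 and the float 1.0 treated as different keys.

```python
def check_no_duplicate_keys(pairs):
    """Return the pairs unchanged, or raise if a map key appears twice."""
    seen = set()
    for key, _value in pairs:
        # Include the type so that 1, 1.0 and True are kept apart, which
        # plain Python equality would otherwise treat as the same key.
        marker = (type(key), key)
        if marker in seen:
            raise ValueError(f"duplicate map key not accepted: {key!r}")
        seen.add(marker)
    return pairs

valid_pairs = check_no_duplicate_keys([(1, "a"), (1.0, "b"), ("1", "c")])

try:
    check_no_duplicate_keys([("x", 1), ("y", 2), ("x", 3)])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

A decoder that does not enforce validity might instead pass all pairs through and leave the decision to the application.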
    
    >  
    > * Section 5.6
    >  
    > "except to specify that some, orders are disallowed" -> remove comma
    
    (Fixed previously.)
    
    >  
    > * Section 7.1, last sub-bullet
    >  
    > Please reference section 7.2.
    
    Yes.  (Let’s see whether the RFC editor throws that out again…)

FP: Oh, you mean because it is a forward reference?
    
    >  
    > * Section 8.1
    >  
    > I like examples. I would have liked an example for the second paragraph of this section.
    
    Good point.

FP: Thanks.
     
    > * Appendix G
    >  
    > "this may not be actually be an error" -> "this may not actually be an error"
    
    Yes.
    
    Now PR #166: https://github.com/cbor-wg/CBORbis/pull/166
    
    Grüße, Carsten
