Re: [Cbor] AD review of draft-ietf-cbor-file-magic-10

Michael Richardson <mcr+ietf@sandelman.ca> Sat, 02 April 2022 17:01 UTC

Return-Path: <mcr+ietf@sandelman.ca>
X-Original-To: cbor@ietfa.amsl.com
Delivered-To: cbor@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 579AD3A078A; Sat, 2 Apr 2022 10:01:17 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.109
X-Spam-Level:
X-Spam-Status: No, score=-2.109 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=sandelman.ca
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 8KqwBV3lOPQt; Sat, 2 Apr 2022 10:01:12 -0700 (PDT)
Received: from tuna.sandelman.ca (tuna.sandelman.ca [209.87.249.19]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 6EAD23A043C; Sat, 2 Apr 2022 10:01:10 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1]) by tuna.sandelman.ca (Postfix) with ESMTP id E0A5A38D92; Sat, 2 Apr 2022 13:12:05 -0400 (EDT)
Received: from tuna.sandelman.ca ([127.0.0.1]) by localhost (localhost [127.0.0.1]) (amavisd-new, port 10024) with LMTP id SccY-50S1yDY; Sat, 2 Apr 2022 13:12:01 -0400 (EDT)
Received: from sandelman.ca (obiwan.sandelman.ca [IPv6:2607:f0b0:f:2::247]) by tuna.sandelman.ca (Postfix) with ESMTP id 0937438D88; Sat, 2 Apr 2022 13:12:01 -0400 (EDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=sandelman.ca; s=mail; t=1648919521; bh=lVKhg5C/yKHfEo3/1iQ6EdOBZDXPfxWY3K1PB3xXeQw=; h=From:To:cc:Subject:In-Reply-To:References:Date:From; b=JLnXF5JZXoxv68+5lwMui+SraPpT0jhUQShUb83iTODD3/Bch49XNSgOWFO3Z3ZLP 9UqEsASFkTo4s1Kgpk4V0qi42couTk2s6pDHYzIS+vzhNrD42xkEgieAW+attpaSP8 wdvwlZTwvmXYToUuTuUJex+8zKmLg+l1YIais8iTuA6b531f4JNn8mAluA/keQJLvL 7w3pDqBGsMvdizEsSN2Yau+HuJEs3paA8AGmgBZVJPzr4qA0//blo4aOc4e36CoakO KuIx+SV8Wi9Qf/eU+ynY955vhRTrVTqiy0qRrKaDbH7Ue0UsztAmzDfZoRVgHc3w0E 4ZpWOzdD4V6vA==
Received: from localhost (localhost [IPv6:::1]) by sandelman.ca (Postfix) with ESMTP id D7F683E9; Sat, 2 Apr 2022 13:01:03 -0400 (EDT)
From: Michael Richardson <mcr+ietf@sandelman.ca>
To: Francesca Palombini <francesca.palombini=40ericsson.com@dmarc.ietf.org>
cc: "draft-ietf-cbor-file-magic@ietf.org" <draft-ietf-cbor-file-magic@ietf.org>, "cbor@ietf.org" <cbor@ietf.org>
In-Reply-To: <HE1PR07MB4217656476550D97A1ACEA6A98E09@HE1PR07MB4217.eurprd07.prod.outlook.com>
References: <HE1PR07MB4217656476550D97A1ACEA6A98E09@HE1PR07MB4217.eurprd07.prod.outlook.com>
X-Mailer: MH-E 8.6+git; nmh 1.7+dev; GNU Emacs 26.1
X-Face: $\n1pF)h^`}$H>Hk{L"x@)JS7<%Az}5RyS@k9X%29-lHB$Ti.V>2bi.~ehC0; <'$9xN5Ub# z!G,p`nR&p7Fz@^UXIn156S8.~^@MJ*mMsD7=QFeq%AL4m<nPbLgmtKK-5dC@#:k
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg="pgp-sha512"; protocol="application/pgp-signature"
Date: Sat, 02 Apr 2022 13:01:03 -0400
Message-ID: <24827.1648918863@localhost>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/Ur8yHXf5KOxVG57YlWAVPtpxPqk>
Subject: Re: [Cbor] AD review of draft-ietf-cbor-file-magic-10
X-BeenThere: cbor@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Concise Binary Object Representation \(CBOR\)" <cbor.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cbor>, <mailto:cbor-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor/>
List-Post: <mailto:cbor@ietf.org>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cbor>, <mailto:cbor-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sat, 02 Apr 2022 17:01:18 -0000

Carsten, I've assembled my changes into many small commits at:
         https://github.com/cbor-wg/cbor-magic-number/pull/19

So that everyone can see the changes.
I could add commit ids the email if that's helpful.

Francesca Palombini <francesca.palombini=40ericsson.com@dmarc.ietf.org> wrote:
    > Thank you for this document. Here is my AD review.

    > The draft has a number of minor issues that can be fixed during IETF
    > Last Call (however I suggest you do fix them asap so that reviewers
    > won't point them out again). I also have a couple bigger questions re
    > reasoning around motivation, context and doc structure (see
    > 9. 15. 18. 19.), but they can be addressed with any additional Last
    > Call comment, which I will start soon.

Working on this now.

    > Thanks,
    > Francesca

    > 1. -----

    > servers derive the MIME-type information from file extensions.

    > FP: I'd suggest using the terminology "media type" rather than "MIME
    > type" as this is not supposed to use the MacOS terminology. (see
    > https://trac.ietf.org/trac/art/wiki/TypicalARTAreaIssues#MediaTypes)

okay.

    > 2. -----

    > It is noted that in the media type registration, that a magic number
    > is asked for, if available, as is a file extension.

    > FP: the sentence does not parse (too many "that"?).

fixed.

    > 3. -----

    > into the beginning of a CBOR format file.  Two possible methods of
    > enveloping data are presented: a CBOR Protocol author will specify
    > one.  (A CBOR Protocol is a specification which uses CBOR as its

    > FP: Suggestion to add that two methods are defined based on the data
    > being an unique CBOR data item or a sequence of CBOR items (this
    > becomes clear later in the text later on, but might be good to clarify
    > for the reader at this point).

I've changed "unique" to "single"

    > 4. ----

    > Examples of CBOR Protocols currently under development include CoSWID
    > [I-D.ietf-sacm-coswid] and EAT [I-D.ietf-rats-eat].  COSE itself

    > FP: Please expand acronyms on first use (here and later - see
    > https://www.rfc-editor.org/materials/abbrev.expansion.txt for the list
    > of acronyms that don't need to be expanded, such as ASN.1). Also for
    > later in the text, it would be good to add references when possible
    > (for example, where DER or PKCS1 are mentioned).

I've expanded DER, but it's really awkward.
Expanding PKCS1 would actually hamper understanding, I think.

    > 5. -----

    > A major inspiration for this document is observing the mess in
    > certain ASN.1 based systems where most files are PEM encoded; these

    > FP: Can we replace "mess" with a less colloquial term such as
    > "confusion"?

okay.
(cf: https://en.wikipedia.org/wiki/Poutine
"ça va faire une maudite poutine!" (English: "It will make a damn mess!") )
[take what you will on the current Ukrainian situation...]

    > 6. -----

    > data item defined in [STD94].

    > The term "diagnostic notation" refers to the human-readable notation
    > for CBOR data items defined in Section 8 of [RFC8949] and Appendix G

    > FP: (Here and later on in the doc) Please reference consistently either
    > STD94 or RFC8949 - just use one (and fix the reference accordingly).

fixed.

    > 7. -----

    > 16777216) to 0xFFFFFFFF (decimal 4294967295).  It is further
    > suggested to avoid values that have an embedded zero byte in the four
    > bytes of their binary representation (such as 0x12003456).

    > FP: Why is it suggested to avoid embedded zero byte?

added:
   This suggestion avoids potential issues with null terminated strings in C.

There ought not be any issues, but the tags could be processed by non-CBOR
aware systems.

    > 8. -----

    > Section 2.1

    > FP: In this section I would also add a sentence saying that protocols
    > registering this type of tag should also reference this specification
    > (additionally to the registration template or any other spec they might
    > have as reference).

added: "When making the IANA request, a reference to this specification will
       help the reviewer to understand why the specific tag value and size is being
       requested. "

Is this what you had in mind?


    > 9. -----

    > Whether these two tags should be removed for specific further
    > processing is a decision made by the CBOR Protocol specification.

    > FP: It is not very clear to me what the sentence adds: this does not
    > help the implementer make that decision, since it does not give some
    > guidance about what could be a reason for removing the tags for
    > processing. It is a "you may do that if you so wish" without any
    > additional information.

"Whenever bytes transmitted over the wire are at a premium, the tags should
be removed."

(Not SHOULD, this document is not normative)

    > 10. -----

    > Section 2.2.1.

    > FP: (Weak) suggestion to add an informative reference to senml+cbor
    > Content Format IANA registration (so just to the IANA registry).

I don't think the reference will be useful.  It's an example.

    > 11. -----

    > 0x42_4F_52 ('BOR' in diagnostic notation).

    > FP: Could you avoid the underscores? I don't think they are necessary
    > and they are not hex valid syntax (and same comments for other
    > occurrences throughout the doc)

There are a number of places where underscore is allowed in hex literals, it
acts like a comma to aid in readability.    It appears to originate in
Verilog.
Carsten?

    > 3.  Advice to Protocol Developers

    > FP: Really nitpicking here :) But could we use "Protocol Authors", to
    > be consistent with the terminology used elsewhere in the doc?

Okay, but, "Author" does not appear any other place.

    > 13. -----


    > As an example of where there was a problem with previous security
    > systems, "PEM" format certificate files grew to be able to contain
    > multiple certificates by simple concatenation.  The PKCS1 format

    > FP: Can we add an informative reference if readers want to dive deeper into this?

Well, not really.  It would be contained in dozens of emails and
stackoverflow discussions where people try to do some with DER encoded
things, fail and then try with PEM encoded, and succeed.
But, none of the specifications say either is allowed, it's just that
software was made to be more tolerant in one case and not another.
It would probably take me a take to find a useful, consise thing to reference.

    > 14. -----

    > then it may be easier to explain if the Labeled CBOR Sequence format
    > is used.

    > FP: I am not sure I understand the meaning of the sentence.

How about:

## Are there tags at the start?

-If the Protocol expects to use other tags values at the top-level, then it may be easier to explain if the Labeled CBOR Sequence format is used.
+If the Protocol expects to use other tags values at the top-level, then the use
+of the tag wrapped format may be easier to explain in the protocol description.


    > 15. -----

    > Section 3

    > FP: This is a very useful section, however I think I haven't understood
    > from it the main advantage from using the tag wrapped vs the labeled
    > CBOR sequence, except the fact that "programs need to be able to skip
    > the first item of the sequence" in order to use the labeled CBOR
    > sequence, which to me does not seem like such a huge requirement. Is
    > that another way to say "programs that do not support CBOR sequences"?
    > Then it would be good to explicitely state that, IMO.

I'm rewording that section.

    > 16. -----

    > Section 5.1 and 5.2

    > FP: Please add that this is the CBOR Tags registry. Also please add a
    > note to IANA that the existing registration should be updated to
    > reference this document and update the fields as defined.

done.

    > Additional comment: I think it is better to described the Data Item as
    > "byte string" as for the current registration, rather than update to
    > "tagged byte string" - that is not a CBOR Data Item per se.

Carsten was very specific about this.

    > 17. -----

    > Semantics:
    > for each tag number NNNNNNNN, the representation of content-format
    > (RFC7252) NNNNNNNN-1668546560

    > FP: I think this should get a reference to
    > https://www.iana.org/assignments/core-parameters/core-parameters.xhtml#content-formats
    > rather than RFC7252. I also wonder how this is going to look like in
    > the registry, but I guess IANA will tell us.

okay.

    > Also it is a bit confusing to use "NNNNNNNN" and not just "N" (as I
    > read it as a decimal, and 8xN doesn't really give me any more
    > information about the number).

I agree, the many Ns are intended to imply it's a big number, I think.

    > 18. -----

    > Appendix A

    > FP: I have some vague memory of this being discussed during interims -
    > so please remind me if I have forgotten, but why is this an appendix
    > rather than in the document itself?

Because the document is about tagging CBOR protocols, and this secttion is
about tagging Non-CBOR Protocols.

    > Appendix C

    > FP: I think a bit more text around the motivation and need for such a tag would be useful.

    > 20. -----

    > FP: Please update the reference: draft-ietf-core-new-block has been published as RFC 9177

It happened so fast, it seems.


--
Michael Richardson <mcr+IETF@sandelman.ca>   . o O ( IPv6 IøT consulting )
           Sandelman Software Works Inc, Ottawa and Worldwide