[Cbor] Adam Roach's Discuss on draft-ietf-cbor-cddl-06: (with DISCUSS and COMMENT)

Adam Roach <adam@nostrum.com> Tue, 20 November 2018 05:22 UTC

From: Adam Roach <adam@nostrum.com>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-cbor-cddl@ietf.org, Barry Leiba <barryleiba@computer.org>, cbor-chairs@ietf.org, barryleiba@computer.org, cbor@ietf.org
Date: Mon, 19 Nov 2018 21:22:26 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/jzJ297jNm5EoJkJj-n8R43QSOP4>

Adam Roach has entered the following ballot position for
draft-ietf-cbor-cddl-06: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-cbor-cddl/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Thanks for all the work that went into creating this document. I have one point
that I think needs discussion, although it's entirely possible that I'm thinking
about this the wrong way.

§3.8:

Given that the list of control operators can be expanded in the future, it's
not clear what automated tools are supposed to do if they encounter controls
that they do not understand. I initially thought that it might be possible to
just ignore control operators (and their parameters) if they aren't understood,
as this would simply result in a more permissive validation of data against a
schema; but the ".and" control gives an example of a control operator where this
kind of elision would fail.

With the lack of any version indicators in CDDL, this seems like a straight-up
interoperability issue.
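
As a rough illustration of the concern, the sketch below shows one conservative behavior a tool could adopt: report any control operator it does not recognize instead of silently dropping it. This is not something the draft specifies. The KNOWN_CONTROLS list is assumed to match the operators in §3.8, the function name is hypothetical, and the scanner is deliberately naive (it strips comments but does not parse text literals).

```python
import re

# Assumed to match the control operators listed in section 3.8 of the draft.
KNOWN_CONTROLS = {
    "size", "bits", "regexp", "cbor", "cborseq", "within", "and",
    "lt", "le", "gt", "ge", "eq", "ne", "default",
}

# Naive scanner: a control operator is "." plus an identifier, preceded by
# whitespace. Comments are removed first; text literals are NOT handled.
_COMMENT = re.compile(r";[^\n]*")
_CTLOP = re.compile(r"(?<=\s)\.([A-Za-z@_$][A-Za-z0-9@_$.-]*)")


def unknown_controls(cddl_text):
    """Return the sorted names of control operators not in KNOWN_CONTROLS."""
    stripped = _COMMENT.sub("", cddl_text)
    found = _CTLOP.findall(stripped)
    return sorted({name for name in found if name not in KNOWN_CONTROLS})
```

A tool using this approach would refuse to validate against a schema when the returned list is non-empty, rather than guess at the meaning of an unknown control.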


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I also have a handful of non-critical comments of varying importance.

Please expand "CBOR":
 (1) In the title
 (2) Upon first use in the document body

See https://www.rfc-editor.org/materials/abbrev.expansion.txt for details.

---------------------------------------------------------------------------

§1.2:

>  New terms are introduced in _cursive_.  CDDL text in the running text
>  is in "typewriter".

This is perplexing, as I know of no tool that will render the canonical form
of current RFCs in the way being described. Is the intention to hold this
document until the new RFC format is available?

---------------------------------------------------------------------------

§2:

>  The rest of this section introduces a number of basic concepts of
>  CDDL, and section Section 3 defines additional syntax.  Appendix C

Nit: "...and Section 3..."

---------------------------------------------------------------------------

§2.2.2:

>  delimited by a "//" (double slash).  Note that the "//" operators
>  binds much more weakly than the other CDDL operators, so each line

Nit: "...operator binds..." or "...operators bind..."

---------------------------------------------------------------------------

§3.1:

>  o  A name can consist of any of the characters from the set {'A',
>     ..., 'Z', 'a', ..., 'z', '0', ..., '9', '_', '-', '@', '.', '$'},

This looks like a formal syntax of some kind, but I don't know where it's
defined. Notably, since this document has just defined ".." to be an inclusive
range operator and "..." to be an exclusive range operator, defining the set of
allowed characters in this way seems to run the risk of interpreting, e.g., "Z"
to be disallowed.

I suggest either defining the set of allowed characters using a formally defined
and cited grammar (e.g., ABNF), or using prose.
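
To make the intended reading concrete, here is a minimal sketch that spells out the character set with no range notation at all, so 'Z', 'z', and '9' are unambiguously included. The names ALLOWED_NAME_CHARS and uses_only_name_chars are illustrative, and the check covers only set membership, not any positional rules the draft places on names.

```python
import string

# Every character is listed explicitly; no ".." or "..." notation is involved.
ALLOWED_NAME_CHARS = set(
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "_-@.$"
)


def uses_only_name_chars(candidate):
    """True if candidate is non-empty and drawn entirely from the set."""
    return bool(candidate) and all(ch in ALLOWED_NAME_CHARS for ch in candidate)
```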

---------------------------------------------------------------------------

§3.1:

>  o  outside strings, whitespace (spaces, newlines, and comments) is
>     used to separate syntactic elements for readability (and to
>     separate identifiers or numbers that follow each other); it is
>     otherwise completely optional.

This seems nominally at odds with the following text in §2.2.2.1, which points
to at least one other case where whitespace is mandatory:

>  When using a name as
>  the left hand side of a range operator, use spacing as in
>
>     min .. max
>
>  to separate off the range operator.
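
The reason the space matters can be sketched as follows. Because "." is a legal name character, a greedy scanner reading a name will swallow the range operator. The regular expression below is an assumed approximation of the draft's identifier rule, not a copy of it, and leading_name is a hypothetical helper.

```python
import re

# Assumed approximation of the identifier rule: a letter-like start character,
# then letter-like characters or digits, optionally preceded by runs of "-"
# or ".".
ID = re.compile(r"[A-Za-z@_$](?:[-.]*[A-Za-z0-9@_$])*")


def leading_name(text):
    """Return the longest name found at the start of text, or None."""
    match = ID.match(text)
    return match.group(0) if match else None
```

Under these assumptions, leading_name("min..max") yields the whole string "min..max" as a single name, while leading_name("min .. max") yields just "min".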

---------------------------------------------------------------------------

§3.1:

>     If prefixed as "h" or "b64", the string is
>     interpreted as a sequence of pairs of hex digits (base16) or a
>     base64(url) string, respectively

Please normatively cite RFC 4648, sections 8 and 5 respectively.
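
For reference, the two encodings can be decoded as in the sketch below, which uses only the Python standard library. The function name decode_prefixed is hypothetical, and restoring omitted "=" padding before decoding is an assumption about how lenient a tool would want to be.

```python
import base64


def decode_prefixed(prefix, body):
    """Decode the body of a byte string literal with an "h" or "b64" prefix."""
    if prefix == "h":
        # base16: pairs of hex digits (RFC 4648, section 8)
        return bytes.fromhex(body)
    if prefix == "b64":
        # base64url (RFC 4648, section 5); re-add any omitted "=" padding
        padded = body + "=" * (-len(body) % 4)
        return base64.urlsafe_b64decode(padded)
    raise ValueError("unsupported prefix: " + repr(prefix))
```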

---------------------------------------------------------------------------

§3.8.1:

>  When applied to an unsigned integer, the ".size" control restricts
>  the range of that integer by giving a maximum number of bytes that
>  should be needed in a computer representation of that unsigned
>  integer.  In other words, "uint .size N" is equivalent to
>  "0...BYTES_N", where BYTES_N == 256**N.
>
>     audio_sample = uint .size 3 ; 24-bit, equivalent to 0..16777215
>
>               Figure 9: Control for integer size in bytes

While they're semantically the same, the example is oddly mismatched with the
preceding text. Consider instead:

      audio_sample = uint .size 3 ; 24-bit, equivalent to 0...16777216
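
The arithmetic behind the two notations can be checked with a short sketch; size_bounds is a hypothetical helper, not anything defined by the draft.

```python
def size_bounds(n_bytes):
    """Return (lower, inclusive_upper, exclusive_upper) for "uint .size N"."""
    exclusive_upper = 256 ** n_bytes  # BYTES_N in the draft's wording
    return 0, exclusive_upper - 1, exclusive_upper
```

For N = 3 this gives an inclusive upper bound of 16777215 (the "0..16777215" form) and an exclusive upper bound of 16777216 (the "0...16777216" form).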

---------------------------------------------------------------------------

Appendix B:

>           / "#" "6" ["." uint] "(" S type S ")" ; note no space!

No space where? I see two space productions in that rule (so it clearly
applies to some specific location), and there are several places where spaces
cannot appear.

>     type1 = type2 [S (rangeop / ctlop) S type2]

This rule doesn't seem to properly capture the ambiguity of "a...b". There is a
terribly complex way to address this by defining parallel "type2" and "type3"
rules that differ only in whether a dot is allowed to appear in their value, and
defining type1 as requiring a space after the type that can contain dots -- but
that is probably overkill. It's probably sufficient to reiterate the warning
about requiring a space under such circumstances as a comment on this rule.

>     HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"

It is a common implementor mistake to forget that ABNF is, by default,
case-insensitive. It is probably worth adding a comment here as a reminder.
(The same applies to "0x", "0b", "e", and "p" above, but these seem less likely
to appear in arbitrary case.)
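
A minimal sketch of what a faithful matcher for this rule looks like: because quoted ABNF strings match case-insensitively, lowercase "a" through "f" must be accepted as well. The helper name is_hexdig is illustrative.

```python
import re

# ABNF quoted strings are case-insensitive, so "A" also matches "a".
HEXDIG = re.compile(r"[0-9A-F]", re.IGNORECASE | re.ASCII)


def is_hexdig(ch):
    """True if ch is a single character matched by the HEXDIG rule."""
    return len(ch) == 1 and HEXDIG.fullmatch(ch) is not None
```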

---------------------------------------------------------------------------

Appendix B:

>     SCHAR = %x20-21 / %x23-5B / %x5D-10FFFD / SESC
>     SESC = "\" %x20-10FFFD
...
>     PCHAR = %x20-10FFFD

These almost certainly should be:

      SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC
      SESC = "\" (%x20-7E / %x80-10FFFD)
...
      PCHAR = %x20-7E / %x80-10FFFD

(i.e., exclude the control character %x7F)
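
The effect of the change on PCHAR can be verified with a small range check. The range tables below transcribe the draft's rule and the proposed rule; the names in_ranges, PCHAR_DRAFT, and PCHAR_PROPOSED are illustrative.

```python
# Code point ranges, inclusive at both ends.
PCHAR_DRAFT = [(0x20, 0x10FFFD)]
PCHAR_PROPOSED = [(0x20, 0x7E), (0x80, 0x10FFFD)]


def in_ranges(code_point, ranges):
    """True if code_point falls inside any (low, high) inclusive range."""
    return any(low <= code_point <= high for low, high in ranges)
```

With these tables, 0x7F (DEL) is accepted by the draft's rule and rejected by the proposed one, while 0x7E and 0x80 are accepted by both.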

---------------------------------------------------------------------------

Appendix C:

>  (It is not an error to extend a rule name
>  that has not yet been defined; this makes the right hand side the
>  first entry in the choice being created.)

Is it an error to redefine a rule name that has already been defined?