Re: [Cbor] 🔔 WGLC on draft-ietf-cbor-7049bis-09

Jeffrey Yasskin <jyasskin@chromium.org> Sun, 24 November 2019 11:24 UTC

Sorry I've been so lax with incremental reviews. Here's a complete set of
comments covering all of
https://tools.ietf.org/html/draft-ietf-cbor-7049bis-09 (although I only
skimmed the appendices). Overall, this is way better than RFC 7049. Thanks
to everyone who worked on improving it!

1.  Introduction

   -

   "The format defined here follows some specific design goals that are not
   well met by current formats." <- "Follows" doesn't really work. Maybe
   "pursues"? "Satisfies"?
   -

   "It is important to note that this is not a proposal that the grammar in
   RFC 8259 be extended in general, since doing so would cause a significant
   backwards incompatibility with already deployed JSON documents." <- This is
   probably not necessary anymore. It could be removed entirely or replaced
   with "This format is not an extension of the grammar in RFC 8259."


3.1.  Major Types

   -

   In Table 1, can we spell out "Major Type" instead of "mt"?


3.4.  Tagging of Items

   -

   I'm not sure what "while retaining its structure" accomplishes here. Can
   we remove it?
   -

   "That is, a tag is a data item consisting of a tag number and an
   enclosed value. The content of the tag (the enclosed data item) is the data
   item (the value) that is being tagged." might be unnecessary. I think it's
   covered by the earlier text in this section. If it's still needed, it
   should probably move earlier to where we define tagged data. It doesn't fit
   next to the discussion of what it means to put a tag inside a tag.
   -

   "it can just jump over the initial bytes of the tag (that encode the tag
   number)" isn't quite right: it's not just skipping it, it's reporting both
   the tag number and value to the application. Maybe "Understanding the
   semantics of tags is optional for a decoder: it can present the tag number
   and content to the application without interpreting the tag as a whole."


3.4.1.  Date and Time

   -

   The next two sections seem like they should be subsections.


3.4.3.  Epoch-based Date/Time

   -

   "An application that requires tag number 1 support may restrict" has a
   lowercase "may", which has an ambiguous effect after RFC 8174. Do we want
   "MAY" or "can"?


3.4.4.  Bignums

   -

   This section has a forward reference to "preferred encoding", which
   should cite section 4.1. I note that 4.1 uses "preferred serialization"
   instead, so maybe we should switch this section to that term.
   -

   "and preferred encoding never makes use of bignums that also can be
   expressed as basic integers (see below)." <- This seems inconsistent with
   "In the generic data model, bignum values are not equal to integers from
   the basic data model". If they're not the same value at the data model
   level, they can't be alternate encodings of each other.
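
As an illustration of the preferred-encoding rule quoted above, here is a minimal Python sketch of an integer encoder that only falls back to a bignum (tag number 2 or 3) when the value does not fit the basic integer range. The helper names encode_head and encode_int are hypothetical and not taken from the draft.

```python
def encode_head(major: int, arg: int) -> bytes:
    """Encode a major type and its argument using the shortest head form."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([(major << 5) | info]) + arg.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_int(n: int) -> bytes:
    """Use major type 0 or 1 when possible, otherwise a tag 2 or 3 bignum."""
    major, arg = (0, n) if n >= 0 else (1, -1 - n)
    if arg < 1 << 64:
        return encode_head(major, arg)
    tag = 2 if n >= 0 else 3
    payload = arg.to_bytes((arg.bit_length() + 7) // 8, "big")
    # Tag head, then a definite-length byte string holding the magnitude.
    return encode_head(6, tag) + encode_head(2, len(payload)) + payload


print(encode_int(2**64 - 1).hex())  # basic integer
print(encode_int(2**64).hex())      # bignum
```

In this sketch 2**64 - 1 still encodes as a basic integer, while 2**64 is the first value that needs tag number 2.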


3.4.5.  Decimal Fractions and Bigfloats

   -

   "Decimal fractions (tag number 4) use base-10 exponents; the value of a
   decimal fraction data item is m*(10**e). Bigfloats (tag number 5) use
   base-2 exponents; the value of a bigfloat data item is m*(2**e)." is
   redundant with the first paragraph of the section.
   -

   This section also suggests that integers be used instead of integral
   decimal fractions and bigfloats. That only works if the specific data
   model says they're equivalent. Maybe we should say specific data models
   SHOULD make them equivalent and then SHOULD set the preferred encoding to
   the integer version?
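
For reference, the two value formulas quoted above can be evaluated exactly with a few lines of Python. This is only a sketch; the function names are hypothetical, and e and m stand for the exponent and mantissa in the order the formulas use them.

```python
from fractions import Fraction


def decimal_fraction_value(e: int, m: int) -> Fraction:
    """Value of a tag number 4 item: m * (10 ** e), computed exactly."""
    return m * Fraction(10) ** e


def bigfloat_value(e: int, m: int) -> Fraction:
    """Value of a tag number 5 item: m * (2 ** e), computed exactly."""
    return m * Fraction(2) ** e


print(decimal_fraction_value(-2, 27315))  # 273.15 as an exact fraction
print(bigfloat_value(-1, 3))              # 1.5 as an exact fraction
print(decimal_fraction_value(0, 5) == 5)  # integral value compares equal to 5
```

The last line shows the numeric coincidence the comment above is about: the values compare equal as numbers, but whether they count as the same value is up to the specific data model.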


3.4.6.2.  Expected Later Encoding for CBOR-to-JSON Converters

   -

   This section only defines the tags obliquely and never says what tag 23
   means. I suggest starting sentences with the tag number, e.g. "Tag number
   21 means the contained byte string is expected to be encoded in base64url
   without padding ... Tag number 22 means ..."
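
A small Python sketch of what a CBOR-to-JSON converter might do with these three tags follows. It assumes tag number 22 means base64 with padding and tag number 23 means base16 with uppercase letters; that is an assumption on top of the text quoted above, and the function name is hypothetical.

```python
import base64


def expected_json_text(tag_number: int, data: bytes) -> str:
    """Render a tagged byte string as the text a JSON converter would emit."""
    if tag_number == 21:
        # base64url, with the trailing "=" padding removed
        return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")
    if tag_number == 22:
        # classic base64, padding kept (assumed)
        return base64.b64encode(data).decode("ascii")
    if tag_number == 23:
        # base16; b16encode produces uppercase hex digits (assumed form)
        return base64.b16encode(data).decode("ascii")
    raise ValueError(f"tag number {tag_number} is not 21, 22, or 23")


for tag in (21, 22, 23):
    print(tag, expected_json_text(tag, b"\x01\x02\x03\x04"))
```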


3.4.6.3.  Encoded Text

   -

   "Tag numbers 33 and 34 are for base64url- and base64-encoded text
   strings" should maybe have "respectively"?


4.1.  Preferred Serialization

   -

   "1_000_000_000" has enough digits that maybe we should use 10**9 (or
   10<sup>9</sup> in v3) instead.


4.2.  Deterministically Encoded CBOR

   -

   "Some protocols may want" has a lowercase "may". Consider "might".


4.2.1.  Core Deterministic Encoding Requirements

   -

   This section says "Floating point values also MUST use the shortest form
   that preserves the value, e.g. 1.5 is encoded as 0xf93e00 and 1000000.5 as
   0xfa49742408.", but 4.2.2 says "If a protocol allows for IEEE floats, then
   additional deterministic encoding rules might need to be added." We should
   only put the float rule in one of these sections.
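
The "shortest form that preserves the value" rule can be sketched in Python as below. This is a minimal illustration, not the draft's algorithm: the function name is hypothetical, and NaN simply falls through to the 64-bit form here because NaN never compares equal to itself.

```python
import struct


def encode_float_shortest(value: float) -> bytes:
    """Try half, then single precision; keep the first exact round trip."""
    for fmt, initial_byte in ((">e", 0xF9), (">f", 0xFA)):
        try:
            packed = struct.pack(fmt, value)
        except OverflowError:
            # The value is too large for this width; try the next one.
            continue
        if struct.unpack(fmt, packed)[0] == value:
            return bytes([initial_byte]) + packed
    # Double precision always preserves a Python float.
    return bytes([0xFB]) + struct.pack(">d", value)


print(encode_float_shortest(1.5).hex())        # f93e00
print(encode_float_shortest(1000000.5).hex())  # fa49742408
```

Both outputs match the encodings quoted from the draft above.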


4.2.2.  Additional Deterministic Encoding Considerations

   -

   In "the deterministic format would not allow them", it isn't clear what
   "them" refers to. Do we mean "would not allow the data to be tagged"? Or
   should we just say that the deterministic format for the protocol needs to
   specify whether the tag is or is not present?
   -

   The "If a protocol includes a field that can express floating-point
   values" paragraph also assumes "… and the protocol's specific data model
   declares integers and floating point values to be interchangeable".
   -

   The "A protocol might give encoders the choice of representing a URL
   ..." item feels like it's repeating the "CBOR tags present …" paragraph.
   Maybe we should move it to an example in that paragraph?
   -

   Maybe the "Protocols that include floating, big integer, or other
   complex values need to define extra requirements on their deterministic
   encodings. For example:" introductory sentence belongs at the top of the
   whole section.


5.  Creating CBOR-Based Protocols

   -

   "This section discusses" might read better as "The rest of this section"
   since it's after a bit of the section.


5.2.  Generic Encoders and Decoders

   -

   "Even though CBOR attempts to minimize these cases, not all well-formed
   CBOR data is valid" is redundant with a lot of text from earlier in the
   document.
   -

   I wonder if this whole subsection is out of place. It reads like a
   definition of generic en/decoders rather than a consideration for designing
   protocols. Maybe it should be a subsection of "Terminology"?


5.3.  Validity of Items

   -

   "The first layer that does process the semantics of an invalid CBOR item
   MUST take one of two choices:" covers our discussion of duplicate map keys.
   Right now, our requirements aren't consistent with the requirements in this
   section, so we should make sure to incorporate this section when we resolve
   those.


5.3.2.  Tag validity

   -

   "might present this tag to the application in a similar way to how it
   would present a tag with an unknown tag number" seems to suggest that it's
   wrong to replace the invalid tag with an error marker or to stop processing
   entirely, even though that's what 5.3 suggests.


5.5.  Numbers

   -

   "the JavaScript number system treats all numbers as floating point" is
   no longer true:
   https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt
   -

   "a compact application should accept" uses a lowercase "should" and
   would seem to discourage compact applications that check for a
   deterministic encoding. Do we mean that accepting wider encodings is likely
   to make the application more compact?
   -

   "The preferred encoding for a floating-point value is the shortest
   floating-point encoding" is redundant with section 4.1, although it does
   include more detail. I *think* I'd rather put the whole definition of the
   preferred encoding in 4.1 instead of having some of it in protocol
   considerations.


5.6.  Specifying Keys for Maps

   -

   "probably should be limited" -> "may need to be limited" or "the
   specification is probably simpler if"? To avoid BCP 14 terms.
   -

   We're already discussing the question of duplicate keys in another
   thread.
   -

   "except to specify that some, orders" has an extra comma.
   -

   "be enough reason on its own" -> "on their own"
   -

   The "should consider using small integers as keys" advice has the
   downside that it makes it harder for humans to understand the meaning of
   the data without the schema. "for constrained devices" might protect the
   rest of us from that downside, but would it make sense to say it
   explicitly?


5.6.1.  Equivalence of Keys

   -

   This section might be shorter if it just says that map keys are
   duplicates if they have the same value in the generic data model or if the
   specific data model for the protocol (Section 2.2) says they're equivalent.
   The rest of the section just duplicates information that's already in
   Section 2. The note in the last paragraph does still seem useful.


8.1.  Encoding Indicators

   -

   "Note that the encoding indicator "_" is thus an abbreviation of the
   full form "_7", which is not used." is confusing where it is. It might make
   more sense if we swap its paragraph with the previous one and move it to
   after the definition of "[_ 1, 2]".
   -

   "As a special case, byte and text strings of indefinite length" doesn't
   seem like a special case to me. It's just the way you represent the
   encoding of an indefinite-length byte or text string.
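
To make the indefinite-length string case concrete, here is a minimal Python sketch that encodes a list of chunks the way the diagnostic notation (_ h'0102', h'030405') describes. The function names are hypothetical and only byte strings are covered.

```python
from typing import Iterable


def encode_head(major: int, arg: int) -> bytes:
    """Encode a major type and its argument using the shortest head form."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([(major << 5) | info]) + arg.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_indefinite_bytes(chunks: Iterable[bytes]) -> bytes:
    """Emit 0x5f, each chunk as a definite-length byte string, then 0xff."""
    out = bytearray([0x5F])  # major type 2, additional information 31
    for chunk in chunks:
        out += encode_head(2, len(chunk)) + chunk
    out.append(0xFF)  # "break" stop code
    return bytes(out)


print(encode_indefinite_bytes([b"\x01\x02", b"\x03\x04\x05"]).hex())
# 5f42010243030405ff
```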


9.2.  Tags Registry

   -

   Did we decide not to tighten registration for the 256–65535 space?


9.3.  Media Type ("MIME Type")

   -

   Is there a reason this section switches to artwork in the middle?


9.4.  CoAP Content-Format

   -

   Could this section link to
   https://www.iana.org/assignments/core-parameters/core-parameters.xhtml#content-formats
   ?


Appendix A.  Examples

   -

   This starts with some references to Unicode code points, which could use
   the new <u> tag.


Appendix F.  Changes from RFC 7049

   -

   This looks quite incomplete.



Jeffrey

On Thu, Nov 14, 2019 at 6:41 PM Francesca Palombini <francesca.palombini=40ericsson.com@dmarc.ietf.org> wrote:

> CBOR wg,
>
>
>
> This starts a four weeks WG last call on
> https://tools.ietf.org/html/draft-ietf-cbor-7049bis-09 , ending on **Thursday,
> 12 December**.
>
> Please send inputs to the mailing list that you have read the document and
> do or do not feel it is ready to progress, along with any issues that you
> believe need to be dealt with.
>
>
>
> We will discuss any open issues we’ve gotten at the f2f, Thursday, 21
> November.
>
>
>
> CBOR Chairs
>
> Jim & Francesca