Re: [Cbor] my (WGLC re-)views on error processing in RFC7049bis and future-proofing

Michael Richardson <mcr+ietf@sandelman.ca> Sat, 16 May 2020 01:41 UTC

From: Michael Richardson <mcr+ietf@sandelman.ca>
To: Carsten Bormann <cabo@tzi.org>
cc: cbor@ietf.org
In-Reply-To: <BC0EC9BE-4202-4EED-A619-CDEB9BF312CE@tzi.org>
References: <17300.1588779159@localhost> <38BB6FFF-737F-4C11-AD7A-DA3F28A9F570@tzi.org> <CANh-dXkdjMyO=WFUxrF06OfP+RE9v11unKJXL8P3UtEe+prV1w@mail.gmail.com> <13690.1588894939@localhost> <CANh-dXmjD=RCwh7ExjSvFx+5ciew+eqHoVS88OommQ2xVnX5=Q@mail.gmail.com> <2963.1589473899@localhost> <BC0EC9BE-4202-4EED-A619-CDEB9BF312CE@tzi.org>
Date: Fri, 15 May 2020 21:40:22 -0400
Message-ID: <26665.1589593222@localhost>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/qqIYwIv91CY_BILjE9z86fTeCUQ>

Carsten Bormann <cabo@tzi.org> wrote:
    >>> 28, 29, 30: These values are reserved for future additions to the
    >>> CBOR format.  In the present version of CBOR, the encoded item is not
    >>> well-formed.
    >>
    >> I think that there is a bug here.  What should a parser written today
    >> do when it encounters these values?  (forward reference to section
    >> 7.2?)

    > Give up (not well-formed), as there is no way to know how big the head
    > with these ai values is.

Give up, returning the content so far, or something else?
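
To make the "give up" case concrete, here is a minimal sketch of a head parser. The names `read_head` and `NotWellFormed` are hypothetical and not taken from any CBOR library. The point it illustrates is that additional information 28 to 30 gives the decoder no way to know how many bytes the head occupies, so it cannot skip the item and can only fail. What it does with items decoded before the failure is left open here, as in the question above.

```python
class NotWellFormed(ValueError):
    """Raised when the input cannot be a well-formed CBOR data item."""


def read_head(data: bytes, pos: int = 0):
    """Parse one CBOR head at pos.

    Returns (major_type, additional_info, argument, next_pos).
    argument is None for additional information 31.
    """
    if pos >= len(data):
        raise NotWellFormed("unexpected end of input")
    initial = data[pos]
    major, ai = initial >> 5, initial & 0x1F
    pos += 1
    if ai < 24:
        # The argument is carried in the initial byte itself.
        return major, ai, ai, pos
    if ai <= 27:
        # 24..27: the argument follows in 1, 2, 4, or 8 bytes.
        size = 1 << (ai - 24)
        if pos + size > len(data):
            raise NotWellFormed("truncated head")
        argument = int.from_bytes(data[pos:pos + size], "big")
        return major, ai, argument, pos + size
    if ai == 31:
        # Indefinite length or break; not defined for major types 0, 1, 6.
        if major in (0, 1, 6):
            raise NotWellFormed("additional information 31 not allowed here")
        return major, ai, None, pos
    # 28, 29, 30: the size of the head is unknown, so we cannot continue.
    raise NotWellFormed(f"reserved additional information {ai}")
```

A caller that wants to return partial content would have to catch `NotWellFormed` itself; the sketch deliberately takes no position on that.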

    >> Getting this right is how we deal with future-proofing.

    > We do have extension points that have full compatibility; this
    > potential one just doesn’t.

    >> It seems seeing such a thing means a current decoder has to
    >> abort/fail.  What we write here has a profound implication, I think,
    >> on how easily we could act on the advice of section 7.2.

    > It would be a CBOR 1.1 (or 2.0), it will not be easy on existing
    > decoders!

Should we plan a tag for version?

    >> Section 10, first paragraph implies we should say something.
    >>
    >> In general, I think that this introductory encoding section is too
    >> detailed, particularly for 31.  I think that detail belongs later
    >> on.  I got no value (I retained nothing) from having that level of
    >> detail there.
    >>
    >> I wonder if section 3.1, under major type 0, should clarify that
    >> "0" is encoded as 0b000_00000.  (That is, there is no negative 0.)
    >>
    >> "A string containing an invalid UTF-8 sequence is well-formed but
    >> invalid."
    >>
    >> I think that this might need clarification.
    >>
    >> I guess that RFC8742 includes sequences of 7049bis CBOR sequences.  I
    >> wonder if Updates 8742 is appropriate.

    > You lost me here.

Do we need to Update 8742 too?
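
On the quoted "well-formed but invalid" sentence, a small sketch may help show the distinction. The function name `classify_short_text_string` is hypothetical, and the sketch only handles definite-length text strings whose length fits in the initial byte (0..23). Well-formedness depends only on whether the head and the promised number of payload bytes are present; validity additionally requires the payload to be UTF-8.

```python
def classify_short_text_string(item: bytes) -> str:
    """Classify a major type 3 item whose length is in the initial byte."""
    if not item or item[0] >> 5 != 3 or (item[0] & 0x1F) > 23:
        raise ValueError("sketch only handles short definite-length text strings")
    length = item[0] & 0x1F
    payload = item[1:1 + length]
    if len(payload) < length:
        # Fewer bytes than the head promised: the structure itself is broken.
        return "not well-formed"
    try:
        # Python's decoder is strict: it rejects stray bytes such as 0xFF.
        payload.decode("utf-8")
    except UnicodeDecodeError:
        return "well-formed but invalid"
    return "valid"
```

For example, the two bytes 0x61 0xFF have an intact structure (a one-byte text string), so a decoder can skip the item even though it should not accept the content as text.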

    >> 3.2.3: (Note that zero-length chunks, while not particularly useful,
    >> are permitted.)
    >>
    >> they might be useful in non-TCP/IP situations where it is useful to
    >> send a "keep-alive" on some channel.

    > We haven’t addressed general padding in CBOR (which would again require
    > a 1.1 or 2.0 maybe), and I would hate to suggest this here as the only
    > padding that CBOR already offers.

Okay, so maybe forward to Protocol.

    >> Could future Simple Values (such as 0..19) have complex structure
    >> the way that values 24->27 do?

    > No, the general syntax of heads does apply to the unallocated code
    > points as well.

I think we should state this.  It's an extension point that we can reliably skip.
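
A rough sketch of why this is skippable, with `major7_item_length` as a hypothetical helper name: because the general head syntax applies to major type 7, the total length of the item follows from the additional information alone, whether or not the simple value has been assigned.

```python
def major7_item_length(initial_byte: int) -> int:
    """Return the total encoded length of a major type 7 item."""
    major, ai = initial_byte >> 5, initial_byte & 0x1F
    if major != 7:
        raise ValueError("not a major type 7 item")
    if ai < 24:
        # Simple value in the initial byte, assigned or not: one byte total.
        return 1
    if ai <= 27:
        # 24: one-byte simple value; 25, 26, 27: 2-, 4-, 8-byte floats.
        return 1 + (1 << (ai - 24))
    if ai == 31:
        # The "break" stop code is a single byte.
        return 1
    # 28..30 stay reserved: the head size is unknown, so no skipping.
    raise ValueError("reserved additional information; not well-formed")
```

So a decoder that sees 0xF0 (simple value 16, currently unassigned) can step over exactly one byte and carry on, even if it has no idea what the value means.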

    >> I note that AFAIK, we do not use tag#24 (Encoded CBOR data item) for
    >> the signed object, in COSE.  Should we?  What's the difference between
    >> #24 and #55799?

    > 55799 is a tag that can have any CBOR data item as tag content; 24 is a
    > tag that can only be on byte strings.  The byte string then *encodes*
    > another CBOR data item.  (The main use here is to keep the decoder from
    > decoding, to provide easy skip-ability or because we need exact bytes
    > as in COSE.)  As often with tags, there is no need for tag 24 on a byte
    > string when it is clear from context that the byte string contains
    > encoded CBOR; this is the case in COSE.

Understood.
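
For my own notes, the difference on the wire can be sketched like this. The helper names `encode_head`, `self_described`, and `embedded_cbor` are hypothetical, and the input is assumed to be an already encoded data item. Tag 55799 is followed directly by the item, whereas tag 24 is followed by a byte string whose content is the encoded item.

```python
def encode_head(major: int, argument: int) -> bytes:
    """Encode a CBOR head using the shortest form for the argument."""
    if argument < 24:
        return bytes([(major << 5) | argument])
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < (1 << (8 * size)):
            return bytes([(major << 5) | ai]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def self_described(encoded_item: bytes) -> bytes:
    """Tag 55799: the tag content is the data item itself."""
    return encode_head(6, 55799) + encoded_item


def embedded_cbor(encoded_item: bytes) -> bytes:
    """Tag 24: the tag content is a byte string holding the encoded item."""
    return encode_head(6, 24) + encode_head(2, len(encoded_item)) + encoded_item
```

With the one-byte item 0x01 (the integer 1), `self_described` gives d9 d9 f7 01 and `embedded_cbor` gives d8 18 41 01. In the second form a decoder can skip or hand on the inner bytes without decoding them, which matches the "exact bytes" use described above.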

    >> I guess I will read onwards to find out... Got it.
    >>
    >> BTW: Tags 25 and 29 are called out after Table 5, but are not listed
    >> *in* Table 5.  That whole paragraph could use some more periods, and
    >> maybe a blank line.  I'm still at a loss as to why <untagged><null>
    >> is better than <epoch><null>.
    >>
    >> Why can't we use decimal fractions, or bigfloats for time?

    > That may have been a mistake (which is one reason we have tag 1001
    > now).  The WG has generally taken a dim view on extending the domain
    > (allowable syntax for tag content) for a tag, so we can’t “fix” that —
    > note that for the date tag, we have taken the decision not to reuse one
    > tag for two different tag content syntaxes either.

okay.

--
Michael Richardson <mcr+IETF@sandelman.ca>, Sandelman Software Works
 -= IPv6 IoT consulting =-