Re: [Cbor] [Technical Errata Reported] RFC7049 (6221)

Jim Schaad <ietf@augustcellars.com> Mon, 06 July 2020 00:19 UTC

From: Jim Schaad <ietf@augustcellars.com>
To: 'Stuart Cheshire' <cheshire@apple.com>, 'Carsten Bormann' <cabo@tzi.org>
CC: 'Paul Hoffman' <paul.hoffman@vpnc.org>, 'The IESG' <iesg@ietf.org>, cbor@ietf.org
References: <20200704225242.3264EF406D5@rfc-editor.org> <25ADFCDD-1B4D-4A9C-87DE-780F89DC0F87@tzi.org> <CE6DBF36-E47A-4794-B37E-367BA15C61C7@apple.com>
In-Reply-To: <CE6DBF36-E47A-4794-B37E-367BA15C61C7@apple.com>
Date: Sun, 05 Jul 2020 17:18:37 -0700
Message-ID: <005101d6532b$00919b90$01b4d2b0$@augustcellars.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/lZa_LHStzSGwwW0fDDxn7ezmbrY>
Subject: Re: [Cbor] [Technical Errata Reported] RFC7049 (6221)
Precedence: list

We have not yet gone through the IETF-wide Last Call, so any comments are welcome.

Jim


> -----Original Message-----
> From: Stuart Cheshire <cheshire@apple.com>
> Sent: Sunday, July 5, 2020 3:52 PM
> To: Carsten Bormann <cabo@tzi.org>; Jim Schaad <ietf@augustcellars.com>
> Cc: Paul Hoffman <paul.hoffman@vpnc.org>; The IESG <iesg@ietf.org>;
> cbor@ietf.org
> Subject: Re: [Technical Errata Reported] RFC7049 (6221)
> 
> On 4 Jul 2020, at 16:43, Carsten Bormann <cabo@tzi.org> wrote:
> 
> > Are you aware of the 7049bis effort?
> 
> I am now. If you would like my feedback I can review it and send you
> comments. I realize it would not be appropriate to make any technical changes
> at this stage, but I may think of suggestions about minor wording changes to
> improve clarity. Is draft-ietf-cbor-7049bis-14 the latest version?
> 
> 
> > So I believe there is no problem in the pseudocode (which only has a
> > comment changed in the above fix).
> >
> > A couple of questions come up here:
> >
> > * Is the pseudocode possibly too clever?
> >
> > Maybe.  We even had one other misread of the pseudocode in
> > https://github.com/cbor-wg/CBORbis/issues/148.
> >
> > After seven years, I’m not sure that replacing the time-tested code from RFC
> > 7049 by new code doesn’t incur too much risk.
> > (In those seven years, we found we had to add one line, which checks against
> > excluded two-byte forms of simple values.)
> 
> Now that I understand it properly, I see how the pseudocode works, and I see
> why I was confused.
> 
> Without comments or a definition in RFC7049 of the well_formed function, I
> was left guessing. I guessed (wrongly) that it is defined to return the major type
> of the item consumed. The new explanatory text in draft-ietf-cbor-7049bis-14
> makes it clear that it is not this simple.
> 
> Yes, the pseudocode is clever. Not too clever -- there is considerable merit in
> being able to make pseudocode that fits on one page of an RFC. I think a few
> minor tweaks can reduce misunderstandings, and still keep it fitting on a single
> page.
> 
> A minor textual suggestion. One of the function definitions below has a space
> before the opening parenthesis and one does not. I would suggest making them
> consistent.
> well_formed (breakable = false)
> well_formed_indefinite(mt, breakable)
> 
> The comments in the pseudocode use the terms “finite data item” and
> “finite-length chunk”, which are not used elsewhere in the RFC. To be
> consistent with the rest of the document, I suggest “definite-length item” and
> “definite-length chunk”.
> 
> One thing I would suggest to improve the clarity of the code would be not to
> overload the value zero to mean *both* that an item with major type zero was
> consumed (unsigned integer) *and* that an indefinite-length item was
> consumed (of any major type).
> 
> Perhaps well_formed_indefinite could return -1 when a "break" stop code is
> consumed, and -2 when an entire indefinite-length item is consumed.
> 
> Or maybe make well_formed_indefinite end with “return mt | 0x80” so that it
> *does* return the correct major type, but with the top bit set to show that it
> was an indefinite-length variant of that major type.
> 
> Also, the final comment “no break out” is a little confusing. What it really
> means is that this *is* the end of an indefinite-length item, so it really *does*
> break out of processing a complete indefinite-length item.
> 
> Maybe the comments at the end of well_formed and well_formed_indefinite
> (respectively) could be:
> 
> well_formed:
>      return mt;                    // definite-length data item
> 
> well_formed_indefinite:
>      return mt | 0x80;             // indefinite-length data item
> or
>      return -2;                    // indefinite-length data item
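
For illustration only, here is a minimal Python sketch of the checker with the
"return mt | 0x80" suggestion and the proposed comments applied. It follows the
structure of the RFC 7049 pseudocode. The Reader class, the NotWellFormed
exception, and the BREAK and INDEFINITE constants are names assumed for this
sketch; they do not come from the RFC or the draft.

```python
BREAK = -1         # a "break" stop code (0xff) was consumed
INDEFINITE = 0x80  # flag OR-ed into the major type (assumed name)


class NotWellFormed(Exception):
    """Raised when the input is not well-formed CBOR (assumed name)."""


class Reader:
    """Tiny byte cursor standing in for the pseudocode's take()."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def take(self, n):
        if self.pos + n > len(self.data):
            raise NotWellFormed("unexpected end of input")
        chunk = self.data[self.pos:self.pos + n]
        self.pos += n
        return chunk


def well_formed(r, breakable=False):
    # process initial bytes
    ib = r.take(1)[0]
    mt, ai = ib >> 5, ib & 0x1F
    val = ai
    if ai in (24, 25, 26, 27):
        val = int.from_bytes(r.take(1 << (ai - 24)), "big")
    elif ai in (28, 29, 30):
        raise NotWellFormed("reserved additional information")
    elif ai == 31:
        return well_formed_indefinite(r, mt, breakable)
    # process content (major types 0, 1 and 7 have none)
    if mt in (2, 3):
        r.take(val)                      # bytes / UTF-8
    elif mt == 4:
        for _ in range(val):
            well_formed(r)
    elif mt == 5:
        for _ in range(2 * val):
            well_formed(r)
    elif mt == 6:
        well_formed(r)                   # one embedded data item
    elif mt == 7 and ai == 24 and val < 32:
        raise NotWellFormed("excluded two-byte simple value")
    return mt                            # definite-length data item


def well_formed_indefinite(r, mt, breakable):
    if mt in (2, 3):
        while (it := well_formed(r, True)) != BREAK:
            if it != mt:                 # need definite-length chunk
                raise NotWellFormed("bad chunk")  # of the same type
    elif mt == 4:
        while well_formed(r, True) != BREAK:
            pass
    elif mt == 5:
        while well_formed(r, True) != BREAK:
            well_formed(r)               # value for the key just read
    elif mt == 7:
        if breakable:
            return BREAK                 # "break" stop code consumed
        raise NotWellFormed("no enclosing indefinite-length item")
    else:
        raise NotWellFormed("wrong major type for indefinite length")
    return mt | INDEFINITE               # indefinite-length data item
```

With this convention a nested indefinite-length chunk comes back as 0x82 or
0x83, which never equals the enclosing mt, so it is still rejected, and zero
only ever means "major type 0".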
> 
> 
> > * Shouldn’t we allow indefinite strings as chunks in indefinite strings?
> >
> > No.  An implementation that wants to put together indefinite strings from
> > strings that are (possibly) indefinite can simply take the brackets off the
> > inner ones.  The actual use case for indefinite strings is “streaming”, where you
> > just don’t know how long your string will be before you have to start sending it,
> > called “chunking” in other contexts.  When you do that and have an indefinite
> > string to send, it is very easy to take off the brackets (0x5f/0x7f and 0xff).
> >
> > Having multiple ways to say the same thing can always lead to
> > interoperability issues (and increases the cost of interoperability tests).
> >
> > Worse, some applications or implementations will start to ascribe semantics
> > to the presence or absence of redundant pairs of brackets.
> >
> > Indefinite-length strings already add considerable complexity to some
> > CBOR-consuming code; removing the ability to rely on only definite-length
> > chunks being in there would add further complexity.
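
As a rough sketch of "taking the brackets off": assuming every input is already
a well-formed encoded string of the same major type, the sender can build one
flat indefinite-length string by copying definite-length strings as chunks and
stripping the leading 0x5f/0x7f and trailing 0xff from indefinite-length ones.
The function name splice_chunks and its parameters are hypothetical, chosen
only for this example.

```python
def splice_chunks(items, text=False):
    """Join encoded CBOR strings into one indefinite-length string.

    Assumes each item is a well-formed byte string (or text string if
    text=True); this sketch does not validate the item contents.
    """
    mt = 3 if text else 2
    start = (mt << 5) | 31           # 0x5f for bytes, 0x7f for text
    out = bytearray([start])         # opening bracket
    for item in items:
        if not item or item[0] >> 5 != mt:
            raise ValueError("item is not a string of the expected type")
        if item[0] == start:
            out += item[1:-1]        # indefinite: drop both brackets
        else:
            out += item              # definite: already a valid chunk
    out.append(0xFF)                 # closing bracket ("break")
    return bytes(out)
```

For example, splicing the definite-length string 41 01 with the
indefinite-length string 5f 42 02 03 ff gives 5f 41 01 42 02 03 ff, which
contains only definite-length chunks.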
> 
> I agree with you 100% on this. The arguments you make, plus the stack usage
> point that Jim Schaad made, are all good.
> 
> However I didn’t find any explanation like this in draft-ietf-cbor-7049bis-14.
> 
> Given that the CBOR format could trivially support nested indefinite-length
> strings, there may be temptation for creative implementers to “improve”
> CBOR by allowing this. I can imagine a discussion between engineers debating
> whether to do this. Having clear text in the RFC stating that this was considered
> and rejected, and is considered unnecessary and a bad idea, and why, would
> help avoid those engineering debates ending with the wrong conclusion.
> 
> Stuart Cheshire
