Re: [Cbor] [Technical Errata Reported] RFC7049 (6221)

Jim Schaad <ietf@augustcellars.com> Sun, 05 July 2020 02:41 UTC

From: Jim Schaad <ietf@augustcellars.com>
To: 'Carsten Bormann' <cabo@tzi.org>, 'RFC Errata System' <rfc-editor@rfc-editor.org>
CC: cheshire@apple.com, cbor@ietf.org, 'Paul Hoffman' <paul.hoffman@vpnc.org>, iesg@ietf.org
References: <20200704225242.3264EF406D5@rfc-editor.org> <25ADFCDD-1B4D-4A9C-87DE-780F89DC0F87@tzi.org>
In-Reply-To: <25ADFCDD-1B4D-4A9C-87DE-780F89DC0F87@tzi.org>
Date: Sat, 04 Jul 2020 19:40:33 -0700
Message-ID: <003901d65275$aa6e87b0$ff4b9710$@augustcellars.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/o0lUnuB-pWjAPbCeSN3mV4o0dto>
Subject: Re: [Cbor] [Technical Errata Reported] RFC7049 (6221)


> -----Original Message-----
> From: CBOR <cbor-bounces@ietf.org> On Behalf Of Carsten Bormann
> Sent: Saturday, July 4, 2020 4:44 PM
> To: RFC Errata System <rfc-editor@rfc-editor.org>
> Cc: cheshire@apple.com; cbor@ietf.org; Paul Hoffman
> <paul.hoffman@vpnc.org>; iesg@ietf.org
> Subject: Re: [Cbor] [Technical Errata Reported] RFC7049 (6221)
> 
> Hi Stuart,
> 
> thank you for this errata report.
> 
> Are you aware of the 7049bis effort?
> That document was submitted to the IESG a couple of weeks ago.
> 
> We had exactly that same discussion in
> https://github.com/cbor-wg/CBORbis/issues/96, and added
> https://github.com/cbor-wg/CBORbis/pull/101/files as a fix.
> 
> > Alternatively, the “well_formed” function could return an indication
> > of whether the item it determined to be well-formed was in fact a
> > definite-length item or an indefinite-length item, and that would then
> > be checked on the “if (it != mt)” line.
> 
> Which is exactly what is there, and what is explained better now with the above
> fix.
> 
> So I believe there is no problem in the pseudocode (which only has a comment
> changed in the above fix).
> 
> A couple of questions come up here:
> 
> * Is the pseudocode possibly too clever?
> 
> Maybe.  We even had one other misread of the pseudocode in
> https://github.com/cbor-wg/CBORbis/issues/148.
> 
> After seven years, I’m not sure that replacing the time-tested code from RFC
> 7049 by new code doesn’t incur too much risk.
> (In those seven years, we found we had to add one line, which checks against
> excluded two-byte forms of simple values.)
> 
> * Shouldn’t we allow indefinite strings as chunks in indefinite strings?
> 
> No.  An implementation that wants to put together indefinite strings from
> strings that are (possibly) indefinite can simply take the brackets off the
> inner ones.  The actual use case for indefinite strings is “streaming”, where you
> just don’t know how long your string will be before you have to start sending it,
> called “chunking” in other contexts.  When you do that and have an indefinite
> string to send, it is very easy to take off the brackets (0x5f/0x7f and 0xff).
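
A minimal Python sketch of "taking off the brackets", under stated assumptions: each input is already a well-formed encoded string item of the requested major type, and any indefinite-length input is itself flat (contains only definite-length chunks). The function names strip_brackets and concat_indefinite are hypothetical, not from any CBOR library.

```python
BREAK = 0xFF  # "break" stop code that closes an indefinite-length item


def strip_brackets(item: bytes, major_type: int) -> bytes:
    """Return the chunk sequence carried by one encoded string item.

    A definite-length item is already a single chunk and is returned as is.
    An indefinite-length item loses its opening byte (0x5f or 0x7f) and its
    closing 0xff. Assumes the item is well-formed; only the outer bytes are
    inspected.
    """
    if not item or (item[0] >> 5) != major_type:
        raise ValueError("expected a string item of the given major type")
    if (item[0] & 0x1F) == 31:  # additional information 31: indefinite length
        if len(item) < 2 or item[-1] != BREAK:
            raise ValueError("indefinite-length item lacks its break byte")
        return item[1:-1]
    return item


def concat_indefinite(major_type: int, items: list[bytes]) -> bytes:
    """Build one flat indefinite-length string from encoded string items."""
    if major_type not in (2, 3):  # 2 = byte string, 3 = text string
        raise ValueError("only byte strings and text strings can be chunked")
    head = bytes([(major_type << 5) | 31])  # 0x5f or 0x7f
    body = b"".join(strip_brackets(i, major_type) for i in items)
    return head + body + bytes([BREAK])
```

For example, passing the definite-length byte string 42 61 62 and the indefinite-length byte string 5f 42 63 64 ff yields 5f 42 61 62 42 63 64 ff, with no nested brackets in the output.
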
> 
> Having multiple ways to say the same thing can always lead to interoperability
> issues (and increases the cost of interoperability tests).
> 
> Worse, some applications or implementations will start to ascribe semantics to
> the presence or absence of redundant pairs of brackets.
> 
> Indefinite-length strings already add considerable complexity to some
> CBOR-consuming code; removing the ability to rely on only definite-length
> chunks being in there would add further complexity.

Even worse, allowing nested indefinite-length objects invites stack overflow problems, because the nesting can go on indefinitely. Many ASN.1 implementations that support this feature have had bugs reported against them.
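
To illustrate the point, here is a rough Python sketch of a checker for a single indefinite-length string that enforces the definite-length-chunk rule. Because nesting is rejected, a flat loop is enough and no recursion (and so no stack growth) is needed. This is an illustration only, not the RFC's pseudocode: the names read_head and check_indefinite_string are hypothetical, and UTF-8 validity of text chunks is deliberately not checked.

```python
def read_head(data: bytes, pos: int):
    """Decode one item head; return (major_type, additional_info, argument, next_pos)."""
    if pos >= len(data):
        raise ValueError("truncated input")
    initial = data[pos]
    major_type, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:                     # argument is stored in the initial byte
        return major_type, info, info, pos
    if info in (28, 29, 30):          # reserved values are never well-formed
        raise ValueError("reserved additional information")
    if info == 31:                    # indefinite length, or break if type 7
        return major_type, info, None, pos
    size = 1 << (info - 24)           # 1, 2, 4 or 8 bytes of argument follow
    if pos + size > len(data):
        raise ValueError("truncated argument")
    return major_type, info, int.from_bytes(data[pos:pos + size], "big"), pos + size


def check_indefinite_string(data: bytes, pos: int = 0) -> int:
    """Check one indefinite-length string at pos; return the offset after its break."""
    major_type, info, _, pos = read_head(data, pos)
    if major_type not in (2, 3) or info != 31:
        raise ValueError("not an indefinite-length byte or text string")
    while True:
        chunk_type, chunk_info, length, pos = read_head(data, pos)
        if chunk_type == 7 and chunk_info == 31:   # 0xff closes the string
            return pos
        if chunk_type != major_type or chunk_info == 31:
            raise ValueError("chunk must be a definite-length string of the same type")
        if pos + length > len(data):
            raise ValueError("truncated chunk")
        pos += length                              # skip the chunk payload
```

With this check, 5f 42 61 62 ff is accepted, while the nested form 5f 5f 42 61 62 ff ff is rejected at the inner 0x5f.
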

Jim

> 
> * Why doesn’t your own code check for this?
> 
> Touché.  In my backlog...
> 
> Grüße, Carsten
> 
> 
> > On 2020-07-05, at 00:52, RFC Errata System <rfc-editor@rfc-editor.org> wrote:
> >
> > The following errata report has been submitted for RFC7049, "Concise
> > Binary Object Representation (CBOR)".
> >
> > --------------------------------------
> > You may review the report below and at:
> > https://www.rfc-editor.org/errata/eid6221
> >
> > --------------------------------------
> > Type: Technical
> > Reported by: Stuart Cheshire <cheshire@apple.com>
> >
> > Section: Appendix C
> >
> > Original Text
> > -------------
> >   well_formed_indefinite(mt, breakable) {
> >     switch (mt) {
> >       case 2: case 3:
> >         while ((it = well_formed(true)) != -1)
> >           if (it != mt)           // need finite embedded
> >             fail();               //    of same type
> >         break;
> >
> > Corrected Text
> > --------------
> > Various possible fixes exist; however, none can be expressed briefly
> > in just a few lines of replacement text.
> >
> > Possibly the “well_formed” function could be changed to take an
> > additional parameter indicating whether indefinite length items are
> > allowed.
> >
> > Alternatively, the “well_formed” function could return an indication
> > of whether the item it determined to be well-formed was in fact a
> > definite-length item or an indefinite-length item, and that would then
> > be checked on the “if (it != mt)” line.
> >
> > Notes
> > -----
> > Appendix C gives pseudocode for verifying whether a CBOR item is
> > well-formed. However, it does not match the rules specified elsewhere
> > in the RFC.
> >
> > The normative text in the body of the RFC states:
> >
> >>   Indefinite-length byte strings and text strings are actually a
> >>   concatenation of zero or more definite-length byte or text strings
> >>   ("chunks") that are together treated as one contiguous string.
> >
> > The restrictive term “definite-length” is crucial there, and the
> > pseudocode in Appendix C disregards this.
> >
> > The pseudocode deems an indefinite-length byte/text string to be valid
> > if it consists of a sequence of constituent items of the same type that
> > are definite-length *or* indefinite-length.
> >
> > It could be argued that the pseudocode is actually more sensible. Since
> > an indefinite-length string item is the concatenation of zero or more
> > well-formed string items, why shouldn’t it allow some of those
> > well-formed string items to themselves be indefinite-length strings?
> > There is no clear argument why not. If indefinite-length strings within
> > other items are a useful concept in the first place, why wouldn’t they
> > also be a useful concept within indefinite-length string items? Perhaps
> > there is a library that outputs indefinite-length strings, and you want
> > to use that library to output a string that is itself part of a larger
> > indefinite-length string?
> >
> > Indeed, a decoder applying Postel’s Robustness Principle (be liberal
> > with what you accept) may choose to allow indefinite-length strings
> > containing indefinite-length strings, nested arbitrarily deeply, since
> > the intent, meaning, encoding and decoding of such items is clear and
> > unambiguous.
> >
> > However, such an implementation, applying the Robustness Principle,
> > would accept input deemed illegal by the specification. Different
> > implementations, some more liberal and some less, would decode the same
> > input differently.
> >
> > In the case where security policy is decided using one decoder
> > implementation, and acting on the message is handled by another decoder
> > implementation, this could result in security failures.
> >
> > Possibly the normative text of the format specification could be
> > changed to match the pseudocode, allowing any well-formed string of the
> > right type (definite-length or indefinite-length) to be a constituent
> > within an indefinite-length string.
> >
> > Alternatively, if this is not desirable, it would be helpful for the
> > specification text to explain, in unambiguous terms, why the designers
> > chose to disallow indefinite-length strings to be used as constituents
> > within an indefinite-length string. This would discourage creative
> > implementers from feeling that it would be beneficial to be liberal and
> > accept this input anyway, and would reduce the risk of inconsistent
> > parsing by different implementations. In this case, the pseudocode
> > should be corrected, to avoid implementers basing their code on that
> > and getting incorrect implementations as a result.
> >
> > Instructions:
> > -------------
> > This erratum is currently posted as "Reported". If necessary, please
> > use "Reply All" to discuss whether it should be verified or rejected.
> > When a decision is reached, the verifying party can log in to change
> > the status and edit the report, if necessary.
> >
> > --------------------------------------
> > RFC7049 (draft-bormann-cbor-09)
> > --------------------------------------
> > Title               : Concise Binary Object Representation (CBOR)
> > Publication Date    : October 2013
> > Author(s)           : C. Bormann, P. Hoffman
> > Category            : PROPOSED STANDARD
> > Source              : IETF - NON WORKING GROUP
> > Area                : N/A
> > Stream              : IETF
> > Verifying Party     : IESG
> 