Re: [Cbor] List of not-well-formed CBOR and test vectors

Thiago Macieira <thiago.macieira@intel.com> Mon, 29 July 2019 19:24 UTC

From: Thiago Macieira <thiago.macieira@intel.com>
To: cbor@ietf.org
CC: Laurence Lundblade <lgl@island-resort.com>
Date: Mon, 29 Jul 2019 12:24:47 -0700
Message-ID: <9430055.7DLcDZMovz@tjmaciei-mobl1>
Organization: Intel Corporation
In-Reply-To: <CF3F871E-7489-4770-B2FE-1746C392ACF0@island-resort.com>
References: <CF3F871E-7489-4770-B2FE-1746C392ACF0@island-resort.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/cOf0v0fxFlp9c0WfgEf6jvLc_dM>
Subject: Re: [Cbor] List of not-well-formed CBOR and test vectors

On Monday, 29 July 2019 11:49:18 PDT Laurence Lundblade wrote:
> I think I’ve made a comprehensive list of all things that are not well
> formed by going through the latest draft and my decoder. There are about a
> dozen of them. I’ve also created 110 test vectors that cover them pretty
> thoroughly.
> 
> Everything is here
> <https://github.com/laurencelundblade/QCBOR/blob/not_well_formed/test/not_well_formed_cbor.h>
> in a C header. The dozen types of non-well-formedness are
> listed as comments in the header file. The test vectors are in an array
> that can be used for testing. It is BSD-3 license.
> 
> I’ve turned up one bug in the RFC’s pseudo code. It doesn’t catch an
> indefinite length string as a segment in another indefinite length string.
> 
> I’d like to get some review, some folks to try it out and such to see if
> I’ve missed anything and all is right. When that is done I’ll make a pull
> request for the draft out of it. Probably in about two weeks.

Hello Laurence

I'll add your test vectors to TinyCBOR soon and see if there's anything I 
didn't catch. You're welcome to do the same with my test data, see:
https://github.com/intel/tinycbor/blob/dev/tests/parser/tst_parser.cpp#L1538-L1767

There are 194 entries in that list, though some of them are near-duplicates 
of one another. A couple also test implementation limits, such as strings 
bigger than half your machine's address space. Some others don't test 
invalid CBOR at all, but common parsing mistakes such as reading a size as -1 
or overflowing counters or pointers.
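
To illustrate the kind of mistake those entries target, here is a minimal 
sketch of an overflow-safe length check. The helper name skip_bytes() is 
hypothetical and is not part of TinyCBOR's API; it only assumes the decoder 
tracks a current position and an end pointer. The point is to compare the 
decoded 64-bit length against the bytes remaining before doing any pointer 
arithmetic, so a length of 0xffffffffffffffff (which reads as -1 when treated 
as signed) is rejected rather than wrapping the pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not TinyCBOR's API: advance *pos by len bytes only if
 * that many bytes remain before end. Returns false and leaves *pos untouched
 * when the declared length does not fit in the input. */
static bool skip_bytes(const uint8_t **pos, const uint8_t *end, uint64_t len)
{
    assert(*pos <= end);                       /* caller invariant */
    size_t remaining = (size_t)(end - *pos);

    /* Compare in the unsigned 64-bit domain before narrowing. A naive
     * `*pos + len <= end` test can wrap around for huge lengths. */
    if (len > (uint64_t)remaining)
        return false;

    *pos += (size_t)len;                       /* safe: len <= remaining */
    return true;
}
```

With a 16-byte buffer, a declared length of UINT64_MAX or of 17 is refused, 
while 16 is accepted and moves the position exactly to the end.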

Another problem I came across, which the test list doesn't show, was that my 
buffer was always followed by a NUL byte, which masked one read past the end. 
The trick for catching that is in
https://github.com/intel/tinycbor/blob/dev/tests/parser/tst_parser.cpp#L159-L192
which always places the data to be parsed at the end of a page, followed by a 
page with no read access.
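
For anyone who wants to reproduce that setup outside TinyCBOR's test suite, 
the following is a rough POSIX-only sketch of the guard-page idea. It is not 
the code at the link above; the function name copy_before_guard_page() is 
made up for illustration, and it assumes mmap() with MAP_ANONYMOUS and 
mprotect() are available. Any read even one byte past the returned data 
should then fault instead of silently seeing a NUL.

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (POSIX only): copy len bytes so that the last byte sits
 * immediately before a page with no access rights. Returns NULL on failure.
 * The mapping is deliberately never unmapped in this sketch. */
static unsigned char *copy_before_guard_page(const void *data, size_t len)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    size_t page = (size_t)ps;

    size_t data_pages = (len + page - 1) / page;  /* pages holding the data */
    size_t total = (data_pages + 1) * page;       /* plus one guard page */

    unsigned char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Revoke all access to the final page; touching it raises a fault. */
    unsigned char *guard = base + data_pages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }

    /* Right-align the data so its end coincides with the guard page start. */
    unsigned char *dst = guard - len;
    if (len != 0)
        memcpy(dst, data, len);
    return dst;
}
```

Feeding the returned pointer and len to the parser under test turns an 
off-by-one read into an immediate crash that a test harness can detect.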

And if you look at
https://github.com/intel/tinycbor/blob/dev/tests/parser/tst_parser.cpp#L1796-L2155
you'll see an extensive list of strict-mode, canonical-mode and 
JSON-compatibility tests.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel System Software Products