[Cbor] Cbor-packed gem 0.1.5

Carsten Bormann <cabo@tzi.org> Thu, 31 March 2022 18:55 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/Dj29VM9iu1gIiY_MpwSIpzNlD-Q>

TL;DR: gem update

I pushed an updated version of the cbor-packed gem yesterday.  The cbor-packed gem can be used with cbor-diag to pack and unpack a CBOR data item.

The bug I fixed was a bit surprising to me, and it only occurs in larger examples, so I’ll give a quick summary.

References to the 17th and further shared-item table entries are expressed using Tag 6 with a zigzag-encoded integer value, e.g.,

6(0)
references the 17th item,
6(-1)
references the 18th item,
6(1)
references the 19th item, etc.
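
As an illustration (not code from the gem), the mapping implied by these three examples can be sketched in Python. The function names are made up for this sketch, and the offset of 16 is inferred from "6(0) references the 17th item"; Table 1 of cbor-packed remains the authoritative definition.

```python
def shared_index(n):
    """Zero-based table index referenced by 6(n), for integer n.

    Non-negative n maps to even offsets, negative n to odd offsets,
    both counted from entry 16 (the 17th item).
    """
    return 16 + 2 * n if n >= 0 else 16 - 2 * n - 1


def tag6_argument(index):
    """Inverse mapping: the integer n such that 6(n) references table[index]."""
    if index < 16:
        raise ValueError("the first 16 entries are not referenced via 6(n)")
    k = index - 16
    return k // 2 if k % 2 == 0 else -((k + 1) // 2)


for n in (0, -1, 1):
    print(f"6({n}) -> item number {shared_index(n) + 1}")
```

With this mapping, 6(-7) lands on zero-based index 29, and 6(44) on index 104.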

I had implemented that correctly.

Now, in a larger example, the shared item table grew so large that the integer inside the tag itself became big enough to benefit from shared item compression:

6(6(-7))
references an item that is indexed by the result of referencing another shared item via 6(-7) and reversing the zigzag encoding (see Table 1 of cbor-packed).

The result of referencing 6(-7) was 44, i.e., a two-encoding-bytes reference yielded a two-encoding-bytes value. That was bug number one: a shared item reference should be used only when it is strictly shorter than the value it replaces, not merely when it is no longer than that value.  But that was just a performance bug.
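
The byte counts behind bug number one can be checked with a small Python sketch. This is an assumption-laden illustration, not the gem's logic: the helper names are hypothetical, only integer values are considered, and only table entries reached via 6(n) (zero-based index 16 and up) are covered.

```python
def int_len(n):
    """Encoded size in bytes of the CBOR integer n (preferred serialization)."""
    arg = n if n >= 0 else -1 - n
    if arg < 24:
        return 1
    if arg < 0x100:
        return 2
    if arg < 0x10000:
        return 3
    if arg < 0x100000000:
        return 5
    return 9


def reference_len(index):
    """Size of 6(n) for zero-based table index >= 16: one byte for tag 6 plus n."""
    k = index - 16
    n = k // 2 if k % 2 == 0 else -((k + 1) // 2)
    return 1 + int_len(n)


def worth_referencing(value, index):
    """Corrected threshold: the reference must be strictly shorter than the value."""
    return reference_len(index) < int_len(value)


# 44 encodes as 18 2C (2 bytes); 6(-7) encodes as C6 26 (also 2 bytes).
print(int_len(44), reference_len(29), worth_referencing(44, 29))
```

Under the old "not worse" rule (`<=` instead of `<`), the value 44 would have been replaced by a reference of the same size, which is exactly what set up the nested 6(6(-7)).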

Bug number two was in my code that distinguishes shared item references from prefix references (some of which also might use tag 6).  The code did not dereference the tag value of the outer 6(…) before testing for an integer tag value, so it didn’t find an integer (meaning shared item), but instead incorrectly assumed a prefix reference.
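
A sketch of the difference, in Python rather than in the gem's own code: `Tag`, `resolve`, `buggy_kind` and `fixed_kind` are hypothetical names, the table contents are invented, and prefix references are deliberately left unimplemented. The only point is the order of operations: unpack the content of the outer tag 6 first, then test whether it is an integer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    number: int
    content: object


def shared_index(n):
    return 16 + 2 * n if n >= 0 else 16 - 2 * n - 1


def is_int(x):
    return isinstance(x, int) and not isinstance(x, bool)


def resolve(item, table):
    """Unpack shared item references; anything else is returned unchanged."""
    if isinstance(item, Tag) and item.number == 6:
        content = resolve(item.content, table)  # dereference before testing
        if is_int(content):
            return table[shared_index(content)]
        raise NotImplementedError("prefix reference: outside this sketch")
    return item


def buggy_kind(tag):
    # Tests the raw tag content, so a nested 6(...) is not seen as an integer.
    return "shared" if is_int(tag.content) else "prefix"


def fixed_kind(tag, table):
    # Tests the unpacked tag content instead.
    return "shared" if is_int(resolve(tag.content, table)) else "prefix"


# Invented table: entry 29 holds 44, so 6(6(-7)) means 6(44), i.e. entry 104.
table = [f"item{i}" for i in range(105)]
table[29] = 44
nested = Tag(6, Tag(6, -7))

print(buggy_kind(nested), fixed_kind(nested, table), resolve(nested, table))
```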

This shows the value of running contorted examples.  It also demonstrates that in a compressor some bugs only get triggered in rare cases, which might be exposed first in the vicinity of another bug.

The next step clearly is writing a fuzzer(**) for CBOR data items and running that against the cbor-packed gem.

But before that happens, you can get the bug fix now:

    gem update cbor-packed

And if you haven’t used that yet, more about how to use cbor-packed in the context of cbor-diag is in:

<https://mailarchive.ietf.org/arch/msg/cbor/x0lFCFZ5j6N21JWDiABHQNn0ccA>

Grüße, Carsten

(**): Fun task: making the fuzzer “sticky” (non-random) enough so the packer actually finds redundancy that it can do something with...
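
One possible reading of "sticky", sketched in Python purely as an assumption about what such a fuzzer might look like: the generator keeps a pool of everything it has produced so far and, with some probability, re-emits a pooled item instead of inventing a new one, so that repeated values show up for the packer to share. The function name, the `reuse` parameter and the value shapes are all hypothetical.

```python
import random


def sticky_item(rng, pool, depth=0, reuse=0.6):
    """Generate a data item built from ints, strings and arrays.

    With probability `reuse`, an earlier item from `pool` is returned again;
    otherwise a fresh item is generated and remembered in `pool`.
    """
    if pool and rng.random() < reuse:
        return rng.choice(pool)
    if depth < 3 and rng.random() < 0.5:
        count = rng.randint(1, 4)
        item = [sticky_item(rng, pool, depth + 1, reuse) for _ in range(count)]
    elif rng.random() < 0.5:
        item = rng.randint(0, 10**6)
    else:
        item = "s%06d" % rng.randrange(10**6)
    pool.append(item)
    return item


rng = random.Random(2022)  # fixed seed, so a failing case can be replayed
pool = []
items = [sticky_item(rng, pool) for _ in range(20)]
print(len(items), "items generated,", len(pool), "distinct pool entries")
```

Raising `reuse` makes the output more redundant; setting it to 0 degenerates into an ordinary random generator.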

PS.: An interesting distraction when chasing the bug was that my long and contrived test data had occurrences of 19 00FF (1+2-byte representation of 255), which of course became 18 FF (1+1-byte representation) after any kind of processing, so things never seemed to match up before I looked more closely.  Does anyone remember which CBOR implementation had that off-by-one error?
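
For reference, a minimal Python sketch of the preferred (shortest) encoding of an unsigned integer, showing where the 18/19 boundary sits. The function name is made up, and nothing here is claimed about which implementation produced 19 00FF or how.

```python
def encode_uint(n):
    """Preferred CBOR encoding of an unsigned integer (major type 0)."""
    if not 0 <= n < 2**64:
        raise ValueError("out of range for a CBOR unsigned integer")
    if n < 24:
        return bytes([n])
    if n < 0x100:  # 255 still fits the one-byte argument
        return bytes([0x18, n])
    if n < 0x10000:
        return b"\x19" + n.to_bytes(2, "big")
    if n < 0x100000000:
        return b"\x1a" + n.to_bytes(4, "big")
    return b"\x1b" + n.to_bytes(8, "big")


print(encode_uint(255).hex(), encode_uint(256).hex())
```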