Re: [apps-discuss] Concise Binary Object Representation (CBOR)

Carsten Bormann <cabo@tzi.org> Thu, 23 May 2013 17:00 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <CAMm+LwjM7uikYLDAZ31L2XyCgxOwzP+aa29VQe72zACQ6ttYqA@mail.gmail.com>
Date: Thu, 23 May 2013 18:53:56 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <CBF04019-BA8F-410E-B35F-740CF3569A11@tzi.org>
References: <61CB1D18-BABC-4C77-93E6-A9E8CDA8326B@vpnc.org> <CAK3OfOhVRqUp+xn8mBj8_x8pgubc7bhWebzsFLvoj+ieWmr5gg@mail.gmail.com> <CAMm+LwjM7uikYLDAZ31L2XyCgxOwzP+aa29VQe72zACQ6ttYqA@mail.gmail.com>
To: Phillip Hallam-Baker <hallam@gmail.com>
X-Mailer: Apple Mail (2.1503)
Cc: Paul Hoffman <paul.hoffman@vpnc.org>, General discussion of application-layer protocols <apps-discuss@ietf.org>
Subject: Re: [apps-discuss] Concise Binary Object Representation (CBOR)

> The only types I have found a need for in my JSON schema are:
> 
> Integer
> Float 
> String (UTF8)
> DateTime   (As RFC 3339 format string)
> Binary (as base64 encoded string)
> List (X)  - JSON List

CBOR is a pretty good match then (we also have a map/dictionary/table/JSON object).
(There is never a need to base-64 encode anything in CBOR.)
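
To illustrate the point about base64, here is a minimal sketch of a raw byte string encoder. It assumes the byte-string layout of the later published CBOR specification (RFC 8949), which may differ in detail from the draft discussed in this thread. `encode_bytes` is a hypothetical helper name, and the sketch only handles payloads below 64 KiB.

```python
import base64


def encode_bytes(data: bytes) -> bytes:
    """Encode a byte string as CBOR major type 2 (layout assumed from RFC 8949)."""
    n = len(data)
    if n < 24:
        head = bytes([0x40 | n])              # length fits in the initial byte
    elif n < 1 << 8:
        head = bytes([0x58, n])               # 1-byte length follows
    elif n < 1 << 16:
        head = b"\x59" + n.to_bytes(2, "big")  # 2-byte length follows
    else:
        raise ValueError("this sketch only handles payloads below 64 KiB")
    return head + data


raw = bytes(range(32))
cbor_size = len(encode_bytes(raw))             # 2-byte head + 32 payload bytes
json_size = len(base64.b64encode(raw)) + 2     # base64 text plus the JSON quotes
print(cbor_size, json_size)
```

The payload bytes are carried unchanged, so the only overhead is the short length prefix.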

> Since most of the proposals are of the form <type> <data>, and we only have 7 fundamental types, it is quite easy to make the type octet combine the type and length information. So if the integer fits in 2 bytes, then just use 2 bytes and mark the length in the type octet; the number 256 could be represented as:
> 
> X2 01 00

Yep.
(CBOR has a slightly different encoding of the first byte to be able to encode the number right there if it fits).
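
A rough sketch of that first-byte scheme, for illustration only. It assumes the layout of the later published CBOR specification (RFC 8949): 3 bits of major type, 5 bits of additional information, values below 24 stored directly, and 24..27 selecting a 1, 2, 4 or 8 byte argument. The draft under discussion here may use different thresholds. `encode_head` and `encode_uint` are hypothetical helper names.

```python
import struct


def encode_head(major: int, value: int) -> bytes:
    """Encode a CBOR initial byte plus argument (layout assumed from RFC 8949)."""
    if not 0 <= major <= 7:
        raise ValueError("major type must be in 0..7")
    if value < 0:
        raise ValueError("argument must be non-negative")
    top = major << 5
    if value < 24:
        return bytes([top | value])            # value sits in the first byte
    if value < 1 << 8:
        return bytes([top | 24, value])
    if value < 1 << 16:
        return bytes([top | 25]) + struct.pack(">H", value)
    if value < 1 << 32:
        return bytes([top | 26]) + struct.pack(">I", value)
    if value < 1 << 64:
        return bytes([top | 27]) + struct.pack(">Q", value)
    raise ValueError("argument does not fit in 64 bits")


def encode_uint(n: int) -> bytes:
    """Unsigned integers are major type 0."""
    return encode_head(0, n)


print(encode_uint(10).hex())    # 0a
print(encode_uint(256).hex())   # 190100
```

Under these assumptions the 256 example from above comes out as three bytes, with the type and length both in the first one.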

> I would use two's complement for all values, so a 64-bit unsigned integer could potentially take 9 octets, but that would be a rare event.

CBOR uses two different encoding types for unsigned and negative numbers.  See Figure 2 for an easy way to convert signed integers to that.
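
As a sketch of what that conversion can look like (assuming the rule from the later published specification, RFC 8949, where a negative integer n is sent as major type 1 with argument -1 - n; the draft's Figure 2 is not reproduced here and may be phrased differently). `encode_int` is a hypothetical helper name.

```python
def encode_int(n: int) -> bytes:
    """Encode a signed integer as CBOR major type 0 or 1 (rule assumed from RFC 8949)."""
    # Non-negative values go out as-is; negative values are mapped to -1 - n,
    # so -1 becomes argument 0, -2 becomes 1, and so on.
    major, arg = (0, n) if n >= 0 else (1, -1 - n)
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([(major << 5) | info]) + arg.to_bytes(size, "big")
    raise OverflowError("outside the 64-bit range; a bignum would be needed")


print(encode_int(-1).hex())      # 20
print(encode_int(-1000).hex())   # 3903e7
```

Because the sign lives in the type, both 2**64 - 1 and -2**64 fit in an 8-byte argument, so the ninth octet from the two's complement approach is not needed.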

> A side benefit to this approach is that big numbers are easily supported.

CBOR also has a representation for big (> 64-bit) numbers (based on the byte string).
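
A minimal sketch of the byte-string-based big number, assuming the tag numbers from the later published specification (RFC 8949): tag 2 for a non-negative bignum, tag 3 for a negative one, each wrapping a big-endian byte string. The draft discussed here may number these differently. `encode_bignum` is a hypothetical helper name, and the sketch is limited to magnitudes of at most 255 bytes.

```python
def encode_bignum(n: int) -> bytes:
    """Encode an integer as a tagged CBOR bignum (tags 2/3 assumed from RFC 8949)."""
    # Negative values use the same -1 - n mapping as ordinary negative integers.
    tag, magnitude = (2, n) if n >= 0 else (3, -1 - n)
    payload = magnitude.to_bytes((magnitude.bit_length() + 7) // 8, "big")
    if len(payload) < 24:
        head = bytes([0x40 | len(payload)])    # byte string, short length
    elif len(payload) < 256:
        head = bytes([0x58, len(payload)])     # byte string, 1-byte length
    else:
        raise ValueError("this sketch only handles magnitudes up to 255 bytes")
    return bytes([0xC0 | tag]) + head + payload


print(encode_bignum(2**64).hex())   # c249010000000000000000
```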

> Representing floats is a lot harder than the specs I have written seem to understand. In particular the main reason I would want a binary format for JSON is that it is NOT possible to round trip floats from decimal format to binary without special care.

Representing floats is rather easy -- IEEE 754 is well-documented.  So CBOR uses the three main binary floating point formats from that (Half, Single, Double).
If you do need (negative) base-10 exponents, we have a special decimal fraction type (inspired by YANG and EXI), which tries to avoid decimal mantissae.
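
For illustration, a sketch of such a decimal fraction, assuming the shape used in the later published specification (RFC 8949): tag 4 around a two-element array [exponent, mantissa], both ordinary integers, meaning mantissa * 10**exponent. The draft under discussion may differ. `_encode_int` and `encode_decimal_fraction` are hypothetical helper names.

```python
from decimal import Decimal


def _encode_int(n: int) -> bytes:
    """Encode a signed integer as CBOR major type 0 or 1 (64-bit range only)."""
    major, arg = (0, n) if n >= 0 else (1, -1 - n)
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([(major << 5) | info]) + arg.to_bytes(size, "big")
    raise OverflowError("outside the 64-bit range")


def encode_decimal_fraction(d: Decimal) -> bytes:
    """Encode d as tag 4 over [exponent, mantissa] (shape assumed from RFC 8949)."""
    sign, digits, exponent = d.as_tuple()
    if not isinstance(exponent, int):
        raise ValueError("NaN and Infinity have no decimal fraction form")
    mantissa = int("".join(map(str, digits)))
    if sign:
        mantissa = -mantissa
    # 0xC4 = tag 4, 0x82 = array of two items
    return b"\xc4\x82" + _encode_int(exponent) + _encode_int(mantissa)


print(encode_decimal_fraction(Decimal("273.15")).hex())   # c48221196ab3
```

The value 273.15 travels as the integer pair (-2, 27315), so no decimal-to-binary rounding happens on the wire.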

> So the main reason for using a binary JSON is going to be to avoid introducing the errors, possibly cumulative, that round-tripping from decimal to binary introduces.

That is one good reason (not the main one I'm interested in).

> So we do need to have 32-bit and 64-bit floats.

Strictly, that is not needed, as every 32-bit float can be easily converted into a 64-bit float.
But the inverse (going down from 64 to 32 to 16, and checking whether that is lossless) is also easy.
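
A sketch of that downward check, using Python's `struct` module (format codes `e`, `f`, `d` for half, single, double). The initial bytes 0xF9, 0xFA and 0xFB are assumed from the later published specification (RFC 8949) and may not match the draft discussed here. `shortest_float` is a hypothetical helper name.

```python
import math
import struct


def shortest_float(x: float) -> bytes:
    """Pick the narrowest IEEE 754 width that round-trips x exactly."""
    if math.isnan(x):
        # NaN never compares equal to itself, so handle it up front.
        return bytes.fromhex("f97e00")
    for initial, fmt in ((0xF9, ">e"), (0xFA, ">f")):
        try:
            packed = struct.pack(fmt, x)
        except OverflowError:
            continue                            # too large for this width
        if struct.unpack(fmt, packed)[0] == x:
            return bytes([initial]) + packed    # narrowing was lossless
    return b"\xfb" + struct.pack(">d", x)


print(shortest_float(1.5).hex())        # f93e00
print(shortest_float(100000.0).hex())   # fa47c35000
print(shortest_float(1.1).hex())        # fb3ff199999999999a
```

The test is simply: narrow, widen again, compare. If the value comes back unchanged, the shorter form loses nothing.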

Regards, Carsten