[apps-discuss] binarypack: streaming, very large values, indefinite lengths

"Manger, James H" <James.H.Manger@team.telstra.com> Fri, 12 October 2012 01:49 UTC

From: "Manger, James H" <James.H.Manger@team.telstra.com>
To: Carsten Bormann <cabo@tzi.org>, "apps-discuss@ietf.org" <apps-discuss@ietf.org>
Date: Fri, 12 Oct 2012 12:49:24 +1100
Message-ID: <255B9BB34FB7D647A506DC292726F6E114FDB91D11@WSMSG3153V.srv.dir.telstra.com>
References: <868851912C182241B686E0BD4D73BC1713B73C@xmb-aln-x08.cisco.com> <0484F1B0-2C8B-48B3-8523-CC01C0A23D48@tzi.org>
In-Reply-To: <0484F1B0-2C8B-48B3-8523-CC01C0A23D48@tzi.org>

Carsten,

Binarypack [draft-bormann-apparea-bpack] looks fairly simple and efficient.

One limitation is that you have to know the length of a string, the number of items in an array, and the number of key/value pairs in a table (aka object or map) before you start encoding them.

JSON does not have this limitation.

This is likely to make binarypack unsuitable for really large values, for values created in a streaming mode, or for tools that want to filter a stream as it passes.

One solution would be to allow values of these types to be sent in pieces. A new "partial" representation would be defined for each of these types (using the same syntax as the representation that uses a 4-byte length field). Any number of partial pieces can be concatenated to convey a full value, terminated either with a non-partial representation of the same type or with a specially defined terminator.

So "Hello, World!" could be encoded as:
  D9 0000000D 48 65 6C 6C 6F 2C 20 57 6F 72 6C 64 21
Or (picking D7 to indicate a "partial" string):
  D7 00000005 48 65 6C 6C 6F
  D7 00000008 2C 20 57 6F 72 6C 64 21
  B0
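
To make the idea concrete, here is a rough Python sketch of the two encodings shown above. It is only an illustration under assumptions: D9 as the 4-byte-length string tag is taken from the example, D7 is the hypothetical "partial" tag picked above, and the closing B0 byte is treated as an opaque terminator. The function names are made up for this sketch and are not part of the draft.

```python
import struct
from typing import Iterable, Iterator

FULL_STR = 0xD9     # string with a 4-byte length field, as in the example above
PARTIAL_STR = 0xD7  # hypothetical "partial string" tag picked in the example
TERMINATOR = 0xB0   # closing byte from the example, treated here as opaque


def encode_full(data: bytes) -> bytes:
    """Encode a string whose total length is known up front."""
    return struct.pack(">BI", FULL_STR, len(data)) + data


def encode_partial(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Encode a string piece by piece, without knowing the total length."""
    for chunk in chunks:
        if chunk:  # skip empty pieces; they carry no data
            yield struct.pack(">BI", PARTIAL_STR, len(chunk)) + chunk
    yield bytes([TERMINATOR])


def decode_string(buf: bytes) -> bytes:
    """Decode either form back into the original bytes."""
    if buf and buf[0] == FULL_STR:
        (n,) = struct.unpack_from(">I", buf, 1)
        return buf[5:5 + n]
    out = bytearray()
    pos = 0
    # Collect partial pieces until something other than D7 appears.
    while pos < len(buf) and buf[pos] == PARTIAL_STR:
        (n,) = struct.unpack_from(">I", buf, pos + 1)
        out += buf[pos + 5:pos + 5 + n]
        pos += 5 + n
    if pos >= len(buf) or buf[pos] != TERMINATOR:
        raise ValueError("partial string is not terminated")
    return bytes(out)


if __name__ == "__main__":
    streamed = b"".join(encode_partial([b"Hello", b", World!"]))
    print(streamed.hex(" "))
    print(decode_string(streamed))
```

With this shape, an encoder can emit each piece as soon as it is available, and a piece can be added or removed before the terminator without touching any other length field.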

P.S. I believe protocol buffers have the same scalability limitation.

P.S. ASN.1 BER supports indefinite-length encodings to overcome this limitation.

P.S. Being able to encode parts of a larger field is also useful for occasional manual debugging as you can modify a value (add/remove bytes, characters, array items, or key/value pairs) without having to recalculate nested length fields all the way up to the root of the message.

--
James Manger