Re: [Cbor] correctness of implied top level array?

Carsten Bormann <cabo@tzi.org> Thu, 28 February 2019 20:47 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <01396AC3-0EDE-4AEC-B60E-1274B9E66C52@mozilla.com>
Date: Thu, 28 Feb 2019 21:46:50 +0100
Cc: Michael Richardson <mcr+ietf@sandelman.ca>, cbor@ietf.org, Laurence Lundblade <lgl@island-resort.com>
Message-Id: <EBB5C10D-5232-4BED-9061-FE28FD5B5534@tzi.org>
References: <81789050-5133-48B0-BEE7-4F1E0BBB4C06@island-resort.com> <40A3B694-80A4-4AD7-A2A6-C071C6E88D2D@tzi.org> <F0A06813-3F1F-4D53-80A1-4CBBBB91DC64@island-resort.com> <0A96C82A-85DB-411D-812D-5A3479A8EA87@mozilla.com> <052FFFD1-6145-4451-91A0-B07ED0AEC726@tzi.org> <9644.1551315204@localhost> <01396AC3-0EDE-4AEC-B60E-1274B9E66C52@mozilla.com>
To: Joe Hildebrand <jhildebrand@mozilla.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/G-aFbTp7BUhFgPqhbTepqyQkZck>

> On Feb 28, 2019, at 20:23, Joe Hildebrand <jhildebrand@mozilla.com> wrote:
> 
>> On Feb 27, 2019, at 5:53 PM, Michael Richardson <mcr+ietf@sandelman.ca> wrote:
>> 
>> An alternative way to do the desired action (and bring this inside of cbor)
>> would be an (indefinite?) array type that was specified to concatenate.
>> 
>> I'm not really arguing for this, but I think it's worth knowing why this
>> would be less good a thing.
> 
> It's an interesting idea.  One of the reasons this didn't feel right to me was that my initial take on indefinite-length arrays was to read the whole array, growing memory as needed.  I've moved on from that, but would expect others to find themselves in a similar spot.
> 
> More interestingly though, there are times when I might want to use an optional external length-framing approach, like embedding a single CBOR data item in a WebSocket message.

It is hard for a format like CBOR to take in external length information: data items need to be self-delimiting *within* the tree, so using that information at the top level would require a different encoding there.

CBOR sequences are actually exactly that, for the specific case of a top-level array.
So this is your “array type that was specified to concatenate”, except that it cannot be used within a CBOR data item ([[1, 2], [3, 4]] ≠ [[1, 2, 3, 4]]), only right on top.
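
To make the concatenation property concrete, here is a minimal sketch in Python. It assumes a toy subset of CBOR (unsigned integers below 24 and definite-length arrays with fewer than 24 elements); the function names are made up for illustration and do not come from any CBOR library.

```python
def encode(item):
    """Encode an int in 0..23 or a (possibly nested) list of fewer than 24 items."""
    if isinstance(item, int):
        if not 0 <= item < 24:
            raise ValueError("sketch only handles integers 0..23")
        return bytes([item])  # major type 0, value held in the initial byte
    if isinstance(item, list):
        if len(item) >= 24:
            raise ValueError("sketch only handles arrays of fewer than 24 items")
        # major type 4 (0x80), element count held in the initial byte
        return bytes([0x80 | len(item)]) + b"".join(encode(x) for x in item)
    raise TypeError("unsupported type in this sketch")


def decode_one(data, pos=0):
    """Decode one self-delimiting data item at pos; return (item, next_pos)."""
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    if info >= 24:
        raise ValueError("sketch only handles arguments below 24")
    pos += 1
    if major == 0:
        return info, pos
    if major == 4:
        items = []
        for _ in range(info):
            item, pos = decode_one(data, pos)
            items.append(item)
        return items, pos
    raise ValueError("unsupported major type in this sketch")


def encode_sequence(items):
    """A CBOR sequence is the items' encodings back to back, with no framing."""
    return b"".join(encode(item) for item in items)


def decode_sequence(data):
    """Decode items until the input is exhausted."""
    items, pos = [], 0
    while pos < len(data):
        item, pos = decode_one(data, pos)
        items.append(item)
    return items


# Concatenating two sequences gives the sequence of all their items.
assert encode_sequence([1, 2]) + encode_sequence([3, 4]) == encode_sequence([1, 2, 3, 4])

# Concatenating two encoded arrays does not merge them into one array;
# read as a sequence, the bytes are still two separate arrays.
assert decode_sequence(encode([1, 2]) + encode([3, 4])) == [[1, 2], [3, 4]]
```

The array head carries an element count, which is why two arrays cannot be merged by appending bytes; a sequence has no head at all, so appending is all it takes.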

Whether a single data item or a CBOR sequence is expected is indicated by meta information, such as the Content-Type.  We could define more things like this for CBOR maps and CBOR strings (hey, we already have ct 0 for text and 42 for bytes), but this becomes complicated quickly.

CBOR sequences stand out because they are often exactly what is needed for streaming, and for flexible storage of partial streams in containers that can then simply be concatenated.
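
As a rough illustration of the streaming case, the following Python sketch pulls items off a binary stream one at a time and treats a clean end of input between items as the end of the sequence. It is an assumption-laden toy: it handles only unsigned integers, definite-length byte and text strings, and definite-length arrays, it expects a buffered stream whose read(n) returns short only at end of input, and all function names are hypothetical.

```python
import io


def read_exact(stream, n):
    """Read exactly n bytes or fail; a short read means a truncated item."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("stream ended inside a data item")
    return data


def read_head(stream):
    """Return (major_type, argument), or None on a clean end of stream."""
    first = stream.read(1)
    if not first:
        return None
    major, info = first[0] >> 5, first[0] & 0x1F
    if info < 24:
        return major, info
    if info <= 27:
        size = 1 << (info - 24)  # 1, 2, 4 or 8 argument bytes follow
        return major, int.from_bytes(read_exact(stream, size), "big")
    raise ValueError("indefinite lengths are not handled in this sketch")


def read_item(stream, head):
    """Read the rest of the item whose head has already been consumed."""
    major, arg = head
    if major == 0:
        return arg
    if major == 2:
        return read_exact(stream, arg)
    if major == 3:
        return read_exact(stream, arg).decode("utf-8")
    if major == 4:
        items = []
        for _ in range(arg):
            inner = read_head(stream)
            if inner is None:
                raise EOFError("stream ended inside an array")
            items.append(read_item(stream, inner))
        return items
    raise ValueError("unsupported major type in this sketch")


def iter_sequence(stream):
    """Yield items as they become available; stop at end of stream."""
    while True:
        head = read_head(stream)
        if head is None:
            return
        yield read_item(stream, head)


# Two stored partial streams, each a valid CBOR sequence on its own.
part_a = bytes.fromhex("0163616263")    # 1, "abc"
part_b = bytes.fromhex("8202031903e8")  # [2, 3], 1000

# Plain byte concatenation of the containers is again a valid sequence.
items = list(iter_sequence(io.BytesIO(part_a + part_b)))
assert items == [1, "abc", [2, 3], 1000]
```

The consumer never needs to know how many items are coming or where the partial streams were joined, which is the property that makes sequences convenient for storage and replay.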

Next stop: Getting a CDDL spec to define top-level sequences…  (Right now, we simply describe a top-level array and add English-language information that this is to be encoded as a sequence.  Quite similar to how .cborseq works inside the spec.)

Regards, Carsten