Re: [Cbor] draft-bormann-cbor-sequence-00 and Big Data

Sean Leonard <dev+ietf@seantek.com> Mon, 22 July 2019 21:06 UTC

From: Sean Leonard <dev+ietf@seantek.com>
Date: Mon, 22 Jul 2019 14:06:16 -0700
To: Carsten Bormann <cabo@tzi.org>
Cc: Burt Harris <burt_harris@hotmail.com>, "cbor@ietf.org" <cbor@ietf.org>
Subject: Re: [Cbor] draft-bormann-cbor-sequence-00 and Big Data
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/B0AZcPwjfxdOZkiTY5WMDKoVojM>

I agree with Carsten that “CBOR sequence” is fine as-is.

I think it is inspired by JSON text sequences (RFC 7464), with the same semantics: the order of the items is simply the order in which the sender emitted them. I.e., “preserve the sequence” at zero cost. Whether that order carries any greater meaning depends on the application.

The order should be preserved when one sequence is copied into another (streaming in, streaming out), but I do not believe that this needs to be called out in the draft. If you reorder the elements of a sequence, you should expect the hash of the cbor-sequence to change.
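
To make the point about ordering and hashes concrete, here is a minimal sketch in Python. It assumes a CBOR sequence is just the concatenation of the encoded data items, and it hand-encodes only unsigned integers and text strings so that it needs no CBOR library. The helper names (encode_head, encode_item, encode_sequence, decode_sequence) are mine, purely for illustration, and are not taken from the draft or from any library.

```python
import hashlib
import struct


def encode_head(major: int, value: int) -> bytes:
    """Encode a CBOR initial byte plus its argument (shortest form)."""
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 2**8:
        return bytes([(major << 5) | 24, value])
    if value < 2**16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 2**32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", value)


def encode_item(obj) -> bytes:
    """Encode one data item; only unsigned ints and text strings are handled."""
    if isinstance(obj, int) and not isinstance(obj, bool) and obj >= 0:
        return encode_head(0, obj)            # major type 0: unsigned integer
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        return encode_head(3, len(data)) + data  # major type 3: text string
    raise TypeError("this sketch only encodes unsigned ints and text strings")


def encode_sequence(items) -> bytes:
    """A sequence is the encoded items back to back, in the sender's order."""
    return b"".join(encode_item(item) for item in items)


def decode_sequence(buf: bytes) -> list:
    """Split a sequence back into items; each item says where it ends."""
    items, pos = [], 0
    while pos < len(buf):
        major, info = buf[pos] >> 5, buf[pos] & 0x1F
        pos += 1
        if info < 24:
            value = info
        elif info <= 27:
            size = 1 << (info - 24)           # 1, 2, 4 or 8 argument bytes
            value = int.from_bytes(buf[pos:pos + size], "big")
            pos += size
        else:
            raise ValueError("additional information not handled in this sketch")
        if major == 0:
            items.append(value)
        elif major == 3:
            items.append(buf[pos:pos + value].decode("utf-8"))
            pos += value
        else:
            raise ValueError("major type not handled in this sketch")
    return items


original = encode_sequence([1, "a", 500])
reordered = encode_sequence([500, "a", 1])

# Same items, different order: the bytes differ, so the digests differ too.
digest_original = hashlib.sha256(original).hexdigest()
digest_reordered = hashlib.sha256(reordered).hexdigest()
```

Decoding the bytes returns the items in the order they were written, without any separator between them, and the two digests differ even though both sequences hold the same three items.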

Sean

> On Jun 22, 2019, at 10:43 PM, Carsten Bormann <cabo@tzi.org> wrote:
> 
> Hi Burt,
> 
> thanks for your kind words on RFC 8610 and CBOR sequences!
> 
> Whether the sequence-preserving property of CBOR sequences is useful or not depends on your application.  CBOR sequences do not take a position on this; they just preserve the sequence (because they can, at approximately zero cost).  This is exactly as with CBOR arrays, where it also depends on the application whether the actual sequence means something or not.  For serialization, the items need to be in a specific sequence, so we might as well make that available to the application.  There are applications that use CBOR sequences in a way that cannot really be called a stream, so I think we are better off with the “sequence” name.
> 
> Tagging a CBOR sequence as such is a bit difficult because logically a CBOR tag would be tagging the first item in the sequence.  CBOR sequences are best used in a context where it is clear what they are (e.g., via a media type); I tend to think of them like CBOR arrays that have been taken out of their package…
> 
> Any special split markers can be provided by simply putting CBOR data items for those markers into the CBOR sequence.  And, yes, such an application might be a good reason to allocate a CBOR Simple value or two.  We don’t need an equivalent for YAML “…”, though, because the CBOR data items themselves are self-delimiting.
> 
> Grüße, Carsten