Re: [Json] On flat vs nested JSON encoding style

Carsten Bormann <cabo@tzi.org> Thu, 04 February 2016 15:39 UTC

Message-ID: <56B370A1.1050508@tzi.org>
Date: Thu, 04 Feb 2016 16:39:13 +0100
From: Carsten Bormann <cabo@tzi.org>
User-Agent: Postbox 4.0.8 (Macintosh/20151105)
MIME-Version: 1.0
To: Anders Rundgren <anders.rundgren.net@gmail.com>
References: <CAMm+LwirhVcmUkdfyA3WKe_W747JTWNF1Ht2Nr8NJdDxOFCJOw@mail.gmail.com> <56B36D15.1030306@gmail.com>
In-Reply-To: <56B36D15.1030306@gmail.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/json/ErkTgVM0q9J4XKda97poaIIJKJw>
Cc: Phillip Hallam-Baker <phill@hallambaker.com>, JSON WG <json@ietf.org>
Subject: Re: [Json] On flat vs nested JSON encoding style

Anders Rundgren wrote:
> Anyway, isn't CBOR what people are targeting for constrained devices?

Yes, but the desire to be able to do sequential (as opposed to
DOM-based) processing goes beyond constrained devices.

But talking about those:

The original design of SenML <draft-jennings-core-senml-04.txt> had a
top-level map (what JavaScript calls an "object", although arrays are
objects in JavaScript too) of the form:

{ "metadata1": "value",
  "metadata2": 4711,
  "contents": [
     ... potentially many measurements
     ... where the interpretation of those
     ... depends on the metadata.
  ]
}

This looks deceptively sequential on a napkin, and it creates no
problems for a DOM-based receiver implementation (until the measurements
don't fit into memory any more).  For an implementation that tries to do
sequential processing, this structure is a big problem:  The metadata
may occur before or after the measurement array, depending on the whims
of the encoder, so you may not have what you need to process the
measurements sequentially at the place they occur.
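
A minimal Python sketch of that problem. It is only a simulation:
json.loads still builds the whole object, and the code merely walks the
top-level members in document order to mimic what a sequential reader
would see. The member name "scale" and the multiply-by-scale rule are
hypothetical and not taken from SenML.

```python
import json

def consume_in_document_order(text):
    """Walk top-level members in the order the encoder wrote them.

    Returns (scaled_values, buffered_count), where buffered_count is the
    number of measurements that had to be held back because the
    (hypothetical) "scale" metadata had not been seen yet.
    """
    # json.loads inserts members in document order, and dicts keep
    # insertion order (Python 3.7+), so items() replays that order.
    members = json.loads(text).items()
    scale = None
    pending = []      # measurements seen before the metadata
    out = []
    buffered = 0
    for name, value in members:
        if name == "scale":
            scale = value
            # Metadata finally arrived: flush whatever was held back.
            out.extend(m * scale for m in pending)
            pending.clear()
        elif name == "contents":
            for m in value:
                if scale is None:
                    pending.append(m)
                    buffered += 1
                else:
                    out.append(m * scale)
    return out, buffered

metadata_first = '{"scale": 10, "contents": [1, 2, 3]}'
metadata_last = '{"contents": [1, 2, 3], "scale": 10}'

print(consume_in_document_order(metadata_first))
print(consume_in_document_order(metadata_last))
```

Both inputs are the same JSON object as far as the data model is
concerned, but only the first can be handled without holding every
measurement in memory.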

Requiring the encoder to emit members in a particular order is
theoretically possible, but in practice it just means you lose the part
of the ecosystem that happens not to do so.  Often an encoder cannot even
give such a guarantee: member order may be dictated by hash values, which
a good implementation may randomize.

So we are moving SenML to something like:

[ { ... metadata blob ... },
  ... measurements ...
]

(the details are still being discussed).
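
A rough Python sketch of how a receiver might consume such an array one
element at a time.  The member names "scale" and "value" are made up for
illustration (the actual SenML details were still open at this point),
and the text is held in a string here only to keep the example short.

```python
import json

def _skip_ws(text, i):
    """Advance i past JSON whitespace."""
    while i < len(text) and text[i] in " \t\r\n":
        i += 1
    return i

def iter_array_elements(text):
    """Yield the elements of a top-level JSON array one at a time."""
    decoder = json.JSONDecoder()
    i = _skip_ws(text, 0)
    if i >= len(text) or text[i] != "[":
        raise ValueError("expected a JSON array")
    i = _skip_ws(text, i + 1)
    if i < len(text) and text[i] == "]":
        return  # empty array
    while True:
        # raw_decode parses exactly one value starting at index i and
        # returns it together with the index where that value ended.
        value, i = decoder.raw_decode(text, i)
        yield value
        i = _skip_ws(text, i)
        if i < len(text) and text[i] == ",":
            i = _skip_ws(text, i + 1)
        elif i < len(text) and text[i] == "]":
            return
        else:
            raise ValueError("expected ',' or ']'")

def scaled_measurements(text):
    """Apply the leading metadata blob to each following measurement."""
    elements = iter_array_elements(text)
    metadata = next(elements, None)   # array order puts metadata first
    if metadata is None:
        return
    for measurement in elements:
        yield measurement["value"] * metadata["scale"]

doc = '[ {"scale": 10}, {"value": 1}, {"value": 2}, {"value": 3} ]'
print(list(scaled_measurements(doc)))
```

Because array order is part of the data model, the receiver can rely on
the metadata arriving before the first measurement and never has to
buffer.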

Of course, in CBOR we could do:

{ { ... metadata blob ... }:
  [ measurements ]
}

i.e., use the metadata as the key for the map.  That is essentially
what PHB is proposing, but it works in JSON only if the metadata
happens to be a single string (which is the case in the example given by
PHB).
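
As an illustration of the JSON side only, here is a small Python sketch
showing that the standard json module accepts a string as the member
name but refuses a structured key.  A tuple stands in for the metadata
map because Python dicts cannot themselves be dict keys; the names
"celsius", "unit" and "scale" are hypothetical.

```python
import json

measurements = [1, 2, 3]

# Metadata that is a single string can serve as the member name.
as_string_key = json.dumps({"celsius": measurements})

def try_structured_key():
    """Return the encoder's error message for a structured key, or None."""
    # A tuple stands in for a metadata map (an assumption for this sketch).
    structured_key = (("unit", "celsius"), ("scale", 10))
    try:
        json.dumps({structured_key: measurements})
    except TypeError as exc:
        return str(exc)
    return None

print(as_string_key)
print(try_structured_key())
```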

Back to the discussion about representation style:

Preferring

{ "foo":
  { ... }
}

over

{ "type": "foo",
  ...
}

may or may not be compatible with the "struct"-style (see section 2 of
<draft-greevenbosch-appsawg-cbor-cddl-07.txt> for a definition of
various styles of using JSON-like data models), depending on whether
"foo" happens to be a constant in your implementation or you expect to
encounter new values as you go along.
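
To make the two styles concrete, a small Python sketch of a receiver
dispatching on each.  The HANDLERS table and its single "foo" entry are
hypothetical; a fixed table like this corresponds to the case where
"foo" is a constant known to the implementation.

```python
import json

# Hypothetical table of known kinds.  The handler just reports the kind
# and the sorted member names of the body.
HANDLERS = {
    "foo": lambda body: ("foo", sorted(body)),
}

def dispatch_wrapped(text):
    """Handle { "foo": { ... } }: the kind is the single member name."""
    obj = json.loads(text)
    if len(obj) != 1:
        raise ValueError("expected exactly one member")
    (kind, body), = obj.items()
    return HANDLERS[kind](body)

def dispatch_tagged(text):
    """Handle { "type": "foo", ... }: the kind is one member among the rest."""
    body = dict(json.loads(text))
    # "type" may appear anywhere in the object, so the whole object has
    # to be available before the kind is known.
    kind = body.pop("type")
    return HANDLERS[kind](body)

wrapped = '{"foo": {"a": 1, "b": 2}}'
tagged = '{"a": 1, "type": "foo", "b": 2}'

print(dispatch_wrapped(wrapped))
print(dispatch_tagged(tagged))
```

In the wrapped form the kind necessarily precedes the body, which suits
sequential processing.  A kind that is not in the table raises KeyError
here; an implementation that expects to encounter new values as it goes
along would need a different policy at that point.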

Grüße, Carsten