[Json] On flat vs nested JSON encoding style

Phillip Hallam-Baker <phill@hallambaker.com> Thu, 04 February 2016 14:59 UTC

Date: Thu, 04 Feb 2016 09:59:40 -0500
Message-ID: <CAMm+LwirhVcmUkdfyA3WKe_W747JTWNF1Ht2Nr8NJdDxOFCJOw@mail.gmail.com>
From: Phillip Hallam-Baker <phill@hallambaker.com>
To: JSON WG <json@ietf.org>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <http://mailarchive.ietf.org/arch/msg/json/iGzraXX4dwFPQ1QoGJjhuwpzTLw>
Subject: [Json] On flat vs nested JSON encoding style

I think it would be helpful to have a style guide to encourage styles
of JSON use that impose the fewest constraints on
serialization/deserialization design.

For example, let's say that we have a protocol that has two types of
request, A and B. There are two common ways that this can be
represented in JSON:

Nested style:

{ "A" :
   { "field1" : "value1", "field2" : "value2", "field3" : "value3" } }

Flat style:

{ "Type" :"A" ,  "field1" : "value1", "field2" : "value2", "field3" : "value3" }

They both look similar. The only difference is that there is an extra
layer of nesting in the first case, which means that the label "A" will
always precede the data it describes. This isn't the case for the
second, which can also be written:

{ "field1" : "value1", "field2" : "value2",  "Type" : "A" , "field3" :
"value3" }
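
To make the ordering point concrete, here is a minimal Python sketch
using only the standard json module. The variable names are made up for
illustration. It decodes the two flat-style texts above and records
where "Type" appeared in each.

```python
import json

# The two flat-style messages from the text; only the member order differs.
flat_type_first = (
    '{ "Type" : "A", "field1" : "value1", '
    '"field2" : "value2", "field3" : "value3" }'
)
flat_type_middle = (
    '{ "field1" : "value1", "field2" : "value2", '
    '"Type" : "A", "field3" : "value3" }'
)

first = json.loads(flat_type_first)
middle = json.loads(flat_type_middle)

# Both texts decode to the same object.
same_object = first == middle

# json.loads keeps members in the order they appeared in the text,
# so the position of "Type" on the wire can be recovered.
type_position_first = list(first).index("Type")
type_position_middle = list(middle).index("Type")
```

Both messages carry the same data, so a reader of flat style cannot
assume the tag arrives before the fields it describes.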

While the issue comes up with requests, it isn't limited to them. It
can happen anywhere that you need to tag an object to describe its
type.

If you are using a scripting language, you parse the data to a DOM
tree, which is the in-memory representation of the data.

But those of us who use strongly typed languages like C# and Java
don't work off DOM trees. We map the data to objects in our
implementation language, and we type check and validate all our inputs
before we begin processing, because that is best practice for
minimizing the attack surface from invalid input.
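
As a rough illustration of "map to objects and validate before
processing", here is a small Python sketch for the nested-style message
above. RequestA, request_a_from_fields and parse_nested are hypothetical
names, and the rule that all three fields are required strings is an
assumption. For brevity it uses json.loads, which does build a tree
first, so it shows only the validation step and not single-pass parsing.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestA:
    """Typed form of request A (hypothetical; field types are assumed)."""
    field1: str
    field2: str
    field3: str


def request_a_from_fields(fields):
    """Validate the raw members and build a RequestA, or raise ValueError."""
    expected = {"field1", "field2", "field3"}
    if not isinstance(fields, dict) or set(fields) != expected:
        raise ValueError("unexpected or missing fields")
    for name in expected:
        if not isinstance(fields[name], str):
            raise ValueError(f"{name} must be a string")
    return RequestA(**fields)


def parse_nested(text):
    """Decode a nested-style message and return a validated typed object."""
    message = json.loads(text)
    if not isinstance(message, dict) or len(message) != 1:
        raise ValueError("expected exactly one type tag")
    (tag, body), = message.items()
    if tag != "A":
        raise ValueError(f"unknown request type: {tag!r}")
    return request_a_from_fields(body)


request = parse_nested(
    '{ "A" : { "field1" : "value1", "field2" : "value2", '
    '"field3" : "value3" } }'
)
```

Nothing downstream sees the request until every member has been
checked, which is the property the paragraph above is after.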

Nested style allows us to build fast, single-pass deserialization
tools that require no additional buffering. A constrained device can
build the parse tree on the fly without buffering a whole message
before it starts processing.
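
The single-pass idea can be sketched in Python as follows. This is an
illustration under assumptions, not a real parser: the (kind, value)
event list stands in for the output of some streaming tokenizer, and
RequestA, REQUEST_TYPES, expect and read_nested are hypothetical names.
The point is that the tag is the first key seen, so the target object
exists before any field arrives and each value is stored directly.

```python
from dataclasses import dataclass, fields


@dataclass
class RequestA:
    field1: str = ""
    field2: str = ""
    field3: str = ""


# Hypothetical registry mapping a type tag to the class it selects.
REQUEST_TYPES = {"A": RequestA}


def expect(events, kind):
    """Pull the next event, check its kind, and return its value."""
    got_kind, value = next(events, ("end_of_input", None))
    if got_kind != kind:
        raise ValueError(f"expected {kind}, got {got_kind}")
    return value


def read_nested(events):
    """Build a typed request in one pass over (kind, value) events."""
    events = iter(events)
    expect(events, "start_object")
    tag = expect(events, "key")          # the tag arrives before any field
    if tag not in REQUEST_TYPES:
        raise ValueError(f"unknown request type: {tag!r}")
    target = REQUEST_TYPES[tag]()
    allowed = {f.name for f in fields(target)}
    expect(events, "start_object")
    for kind, name in events:
        if kind == "end_object":         # end of the inner object
            break
        if kind != "key" or name not in allowed:
            raise ValueError(f"unexpected member: {name!r}")
        # The value is stored as soon as it is read; nothing is held back.
        setattr(target, name, expect(events, "string"))
    expect(events, "end_object")         # closes the outer object
    return target


# Events a tokenizer might emit for the nested-style message in the text.
NESTED_EVENTS = [
    ("start_object", None), ("key", "A"), ("start_object", None),
    ("key", "field1"), ("string", "value1"),
    ("key", "field2"), ("string", "value2"),
    ("key", "field3"), ("string", "value3"),
    ("end_object", None), ("end_object", None),
]

request = read_nested(NESTED_EVENTS)
```

The reader never keeps more than the current event, which is what makes
this shape attractive on a constrained device.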

Flat style requires us to buffer the data and take a two-pass approach,
because the "Type" member may arrive after the fields it describes.
It is much less efficient as a result.
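
A minimal Python sketch of the buffering this forces on a typed reader
that cannot rely on "Type" coming first. The (name, value) pairs stand
in for members in wire order, and RequestA, REQUEST_TYPES, store and
read_flat are hypothetical names. Members seen before "Type" have
nowhere to go, so they are held and replayed once the type is known.

```python
from dataclasses import dataclass, fields


@dataclass
class RequestA:
    field1: str = ""
    field2: str = ""
    field3: str = ""


# Hypothetical registry mapping a "Type" value to the class it selects.
REQUEST_TYPES = {"A": RequestA}


def store(target, name, value):
    """Assign one member, rejecting names the target type does not declare."""
    if name not in {f.name for f in fields(target)}:
        raise ValueError(f"unexpected member: {name!r}")
    setattr(target, name, value)


def read_flat(pairs):
    """Build a typed request from (name, value) pairs in wire order.

    Returns (request, max_buffered), where max_buffered is the largest
    number of members held while the type was still unknown.
    """
    pending = []          # members seen before "Type"
    target = None
    max_buffered = 0
    for name, value in pairs:
        if name == "Type":
            cls = REQUEST_TYPES.get(value) if isinstance(value, str) else None
            if cls is None:
                raise ValueError(f"unknown request type: {value!r}")
            target = cls()
            # Second pass: replay everything that had to be held.
            for held_name, held_value in pending:
                store(target, held_name, held_value)
            pending.clear()
        elif target is None:
            pending.append((name, value))
            max_buffered = max(max_buffered, len(pending))
        else:
            store(target, name, value)
    if target is None:
        raise ValueError('no "Type" member')
    return target, max_buffered


type_first = [("Type", "A"), ("field1", "value1"),
              ("field2", "value2"), ("field3", "value3")]
type_middle = [("field1", "value1"), ("field2", "value2"),
               ("Type", "A"), ("field3", "value3")]

request_first, held_first = read_flat(type_first)
request_middle, held_middle = read_flat(type_middle)
```

With "Type" first nothing is held, but with "Type" third two members
are buffered, and in the worst case the whole message is.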