Re: [Json] serializing sequences of JSON values

Phillip Hallam-Baker <hallam@gmail.com> Mon, 10 March 2014 18:47 UTC

In-Reply-To: <CAO1wJ5SLFoUSGoyM4WZa+r2Sf1A_-9e1DmUtRQqfx0UT77VXTA@mail.gmail.com>
References: <c19534113ff9489abcc4402fae3c1f62@BL2PR02MB307.namprd02.prod.outlook.com> <CAK3OfOigDS2CizGAtdRaQWbSgiJqw0Ogi-TPkWv35GjPB=CQGw@mail.gmail.com> <CAO1wJ5SLFoUSGoyM4WZa+r2Sf1A_-9e1DmUtRQqfx0UT77VXTA@mail.gmail.com>
Date: Mon, 10 Mar 2014 14:47:43 -0400
Message-ID: <CAMm+Lwg9ZCBQ8QsAE4+8rbywFUPpB9tEDzdb3C34J7cBfqu_Qw@mail.gmail.com>
From: Phillip Hallam-Baker <hallam@gmail.com>
To: Jacob Davies <jacob@well.com>
Content-Type: multipart/alternative; boundary=001a11380246f7fe7d04f44509bb
Archived-At: http://mailarchive.ietf.org/arch/msg/json/IhLjlXGaW4QpQbwYtPWVeEA7khg
X-Mailman-Approved-At: Mon, 10 Mar 2014 12:17:21 -0700
Cc: Pete Resnick <presnick@qti.qualcomm.com>, Bjoern Hoehrmann <derhoermi@gmx.net>, Paul Hoffman <paul.hoffman@vpnc.org>, Larry Masinter <masinter@adobe.com>, "json@ietf.org" <json@ietf.org>, "Paul E. Jones" <paulej@packetizer.com>, Tim Bray <tbray@textuality.com>, "Joe Hildebrand \(jhildebr\)" <jhildebr@cisco.com>, "Matt Miller \(mamille2\)" <mamille2@cisco.com>, Nico Williams <nico@cryptonector.com>, Barry Leiba <barryleiba@computer.org>, "rfc7158@schmorp.de" <rfc7158@schmorp.de>
Subject: Re: [Json] serializing sequences of JSON values

On Mon, Mar 10, 2014 at 2:14 PM, Jacob Davies <jacob@well.com> wrote:

> Does either the old or new specification say anything about multiple
> values in sequence? I'm pretty darn sure the old version at least said
> that application/json documents contain one and only one value, and
> says nothing about serializing JSON in other formats, so encapsulation
> is not addressed. I also don't see how any process that expects to
> read sequences of self-delimited JSON objects and arrays is going to
> blithely accept numbers and strings and whatnot.


True, but there are multiple ways to form a sequence, and since this is a
standards organization we should pick exactly one separator.

The log file use case is critical for me. The stupidity of the restriction
in XML is tedious enough without repeating it in the JSON world. People
want to write append-only logs of information, so the JSON array syntax does
not work. Having to back up and erase the last closing square bracket makes
the logging process slower by an order of magnitude.
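
To make the difference concrete, here is a minimal sketch of the two write
paths. The function names and the newline separator are illustrative
assumptions, not anything taken from a specification or from my code. The
first function only ever adds bytes at the end; the second keeps the file a
single valid JSON array and so has to back up over the closing bracket on
every append.

```python
import io
import json


def append_record(log, record):
    """Append one JSON value; nothing already written is touched.

    The trailing newline is an assumed separator for readability.
    """
    log.seek(0, io.SEEK_END)
    log.write(json.dumps(record).encode("utf-8") + b"\n")


def append_to_array(log, record):
    """Keep the file a single valid JSON array.

    Every append after the first must back up and overwrite the
    closing ']' before writing the new element.
    """
    data = json.dumps(record).encode("utf-8")
    end = log.seek(0, io.SEEK_END)
    if end == 0:
        log.write(b"[" + data + b"]")
    else:
        log.seek(end - 1)  # back up over the closing bracket
        log.write(b"," + data + b"]")
```

Both functions expect a seekable binary stream, for example a file opened
with mode "r+b" or an io.BytesIO in a test.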


At the moment I only log full objects and so there is no ambiguity. But I
would much prefer to have a more general approach. I see the following
options:

A) Observe an implicit delimiter on sequences [...][...] or objects
{...}{...}

B) Observe whitespace as a delimiter

C) Observe a comma as a delimiter


These are not disjoint sets. I think I am going to change my code to accept
A, B or C. However, at the moment I only generate sequences consistent with
A.
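
A reader that accepts A, B or C could look roughly like the sketch below. It
is built on Python's json.JSONDecoder.raw_decode, which parses one value
starting at a given index and returns the value together with the index
where it stopped. The function name iter_values is hypothetical, and the
sketch is deliberately lenient: it also tolerates a leading or trailing
comma, which a stricter reader might reject.

```python
import json

_WS = " \t\r\n"  # the four JSON whitespace characters


def iter_values(text):
    """Yield each JSON value from text that uses no delimiter (A),
    whitespace (B), or a single comma (C) between values."""
    decoder = json.JSONDecoder()
    pos, end = 0, len(text)
    while True:
        # Option B: skip whitespace between values.
        while pos < end and text[pos] in _WS:
            pos += 1
        # Option C: skip at most one comma, then any whitespace after it.
        if pos < end and text[pos] == ",":
            pos += 1
            while pos < end and text[pos] in _WS:
                pos += 1
        if pos >= end:
            return
        # Option A needs nothing extra: raw_decode stops at the end
        # of the value, so the next one can start immediately.
        value, pos = decoder.raw_decode(text, pos)
        yield value
```

For example, list(iter_values('{"a": 1}{"b": 2}')) and
list(iter_values('{"a": 1},{"b": 2}')) both give the same two objects.
Option A is only unambiguous for self-delimiting values such as objects,
arrays and strings; two adjacent bare numbers would run together.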

I think that C is the approach that fits best with JSON. For better or
worse, JSON uses the comma as an item separator and does not ignore
superfluous entries. So this requires a little more state handling than
absolutely necessary.
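
One reading of the extra state is that a writer using comma separators has
to know whether anything has been written yet, so that it never emits a
dangling comma. The sketch below illustrates that reading only; the class
name CommaLogWriter and the ",\n" layout are assumptions.

```python
import io
import json


class CommaLogWriter:
    """Write JSON values separated by commas, never leading or trailing."""

    def __init__(self, stream):
        self._stream = stream
        # The extra state: is the log still empty?
        self._empty = stream.seek(0, io.SEEK_END) == 0

    def write(self, value):
        if not self._empty:
            # Put the separator before the new value, so the log
            # never ends with a comma.
            self._stream.write(",\n")
        self._stream.write(json.dumps(value))
        self._empty = False
```

Wrapping the resulting text in "[" and "]" gives a valid JSON array, which
is the consistency argument for C.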

My vote would be to require C for the log file use case for consistency, but
as I say, I will change my code to accept A, B or C.


> I find the idea
> unlikely that there is some running code that 1. expects sequences of
> JSON values with no delimiters, 2. uses a parser that precisely
> followed the JSON specification and rejected anything but JSON objects
> and arrays prior to this change, 3. has that parser change under it to
> accept all kinds of values and 4. is so indifferent to the contents of
> what it is processing that it doesn't notice that it's getting numbers
> and strings instead of objects or arrays.


I am not sure what 3 or 4 mean, but I have 1 and 2 already. The parser
returns each time a full object is read. If there is nothing to be read,
it returns a null object. The encoder updates the log using append-only
writes that can be flagged as atomic on most operating systems.
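
As a rough illustration of that arrangement (not my actual code), the sketch
below pairs an encoder that opens the log with O_APPEND with a reader that
hands back one value per call and None when nothing is left. The names
append_atomic and LogReader are hypothetical. How atomic an O_APPEND write
really is depends on the operating system and file system, so treat that
part as an assumption.

```python
import json
import os


def append_atomic(path, value):
    """Append one JSON value with a single write on an O_APPEND descriptor."""
    data = json.dumps(value).encode("utf-8")
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        os.write(fd, data)  # values are written back to back (option A)
    finally:
        os.close(fd)


class LogReader:
    """Return one complete JSON value per call to read()."""

    def __init__(self, text):
        self._text = text
        self._pos = 0
        self._decoder = json.JSONDecoder()

    def read(self):
        """Return the next value, or None when there is nothing to read."""
        text = self._text
        while self._pos < len(text) and text[self._pos] in " \t\r\n":
            self._pos += 1
        if self._pos >= len(text):
            return None
        value, self._pos = self._decoder.raw_decode(text, self._pos)
        return value
```

Using None as the end marker is only unambiguous while the log holds full
objects; a log that could contain a bare JSON null would need a different
signal.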

-- 
Website: http://hallambaker.com/