Re: [Json] Regarding JSON text sequence ambiguities (Re: serializing sequences of JSON values)

Tim Bray <tbray@textuality.com> Tue, 11 March 2014 18:35 UTC

Date: Tue, 11 Mar 2014 11:35:37 -0700
Message-ID: <CAHBU6itdCdJE3t8gE=AOcOofaFORfJxZxo3ZqbpF9nTMv_CYaQ@mail.gmail.com>
From: Tim Bray <tbray@textuality.com>
To: Matthew Morley <matt@mpcm.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/json/oX6ak33OYZjH2fu8JlZ-4AYMCQg
Cc: Nico Williams <nico@cryptonector.com>, Phillip Hallam-Baker <hallam@gmail.com>, Paul Hoffman <paul.hoffman@vpnc.org>, "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] Regarding JSON text sequence ambiguities (Re: serializing sequences of JSON values)

My assumption is that many (most? all?) JSON parsers can be altered to,
when they encounter a JSON text containing an object, just stop reading
when they hit the trailing “}”.  So my notion is that using such a parser
you have a loop like

while !eof
   // parse error handling left as an exercise for the reader
   obj = parser.ReadObjectAndStop()
   doWhateverWith(obj)
   eatZeroOrMoreWhiteSpaceCharacters()  // assumes some sort of peekChar()
end

So yeah, requiring a SINGLE NEWLINE AND NOTHING ELSE would be simpler.  But
if you’re going to allow any other non-significant white space characters,
then why bother requiring that one of them be a newline?
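
For concreteness, here is a minimal Python sketch of that loop. It assumes the whole stream is already in a string, and it uses the standard library's json.JSONDecoder.raw_decode, which parses one value and reports where it stopped. The name iter_json_texts is made up for illustration, and raw_decode accepts any JSON value, not only objects.

```python
import json

JSON_WHITESPACE = " \t\r\n"  # the four insignificant whitespace characters in JSON


def iter_json_texts(buf: str):
    """Yield each top-level JSON value in buf, separated by zero or more whitespace characters."""
    decoder = json.JSONDecoder()
    pos, end = 0, len(buf)
    while True:
        # eatZeroOrMoreWhiteSpaceCharacters(): raw_decode does not skip leading whitespace
        while pos < end and buf[pos] in JSON_WHITESPACE:
            pos += 1
        if pos >= end:
            return
        # ReadObjectAndStop(): parse one value, get back the index just past it.
        # A json.JSONDecodeError propagates to the caller on malformed input.
        value, pos = decoder.raw_decode(buf, pos)
        yield value


if __name__ == "__main__":
    for obj in iter_json_texts('{"a": 1} {"b": 2}\n\n{"c": [3]}'):
        print(obj)
```

Under these assumptions a newline, a space, or nothing at all between objects is handled the same way, which is the point of the question above.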


On Tue, Mar 11, 2014 at 11:27 AM, Matthew Morley <matt@mpcm.com> wrote:

> I'm not advocating for comma separators...
>
> But having multiple top level JSON elements separated by a comma is
> equivalent to processing an array structure. The initial [ and the closing
> ] are implicitly mapped to the connection/stream/etc. start and end events.
>
> It is just a minor token replacement at the top level between elements,
> which could be layered into some existing tooling. From this point of view,
> I would imagine the retooling is minor for either use case. It does mean
> tools need to be depth aware.
>
>
> On Tue, Mar 11, 2014 at 2:08 PM, Nico Williams <nico@cryptonector.com> wrote:
>
>> On Tue, Mar 11, 2014 at 12:09 PM, Phillip Hallam-Baker <hallam@gmail.com>
>> wrote:
>> > On Tue, Mar 11, 2014 at 12:48 PM, Tim Bray <tbray@textuality.com> wrote:
>> >>
>> >> Heh, I wonder if there’d be any chance of getting consensus.  I can’t
>> >> imagine ever using anything but Object Object Object with optional
>> >> whitespace separator; unless we all agree on that going in I’d be
>> >> pessimistic about anyone convincing anyone else...
>> >
>> > But JSON has comma separators, so {..}, {..}, {..} makes far more sense.
>>
>> JSON text sequences would be a new Proposed Standard (if we go there)
>> but like JSON, there exist uses of this "new" thing already -- that
>> is, before we get to writing the RFC.
>>
>> The uses of JSON text sequences that I know of use newlines, not
>> commas nor comma-and-newline.  The reason for this is that these use
>> cases are text logfile-like: the entries are lines, lines containing
>> JSON texts -- usually compact texts, i.e., with no newlines in the
>> text, and never more than one text per-line.
>>
>> For me other uses of JSON text sequences generally result from my use
>> of jq, which also effectively separates texts with a newline.  Note
>> that jq doesn't need texts to be written compactly when parsing JSON
>> text sequences.  It happens though that if you write texts compactly
>> followed by a newline then you can implement JSON text sequences with
>> all existing JSON parsers.
>>
>> Switching to using a comma-and-newline would require significant
>> retooling.  Therefore I don't see it happening.  Whereas just
>> separating JSON texts with newlines is in use because it's always been
>> the obvious thing to do.
>>
>> Nico
>
> --
> Matthew P. C. Morley
>
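
The point quoted above, that compact texts each followed by a newline can be read with any existing JSON parser, can be sketched as follows. This is an illustration only: parse_lines is a hypothetical name, and it assumes every text is written compactly, with no raw newline inside it.

```python
import json


def parse_lines(text: str):
    """Parse newline-delimited compact JSON texts, one per line, skipping blank lines."""
    # Split on "\n" only: str.splitlines() would also split on characters such as
    # U+2028, which may legally appear unescaped inside a JSON string.
    return [json.loads(line) for line in text.split("\n") if line.strip()]


if __name__ == "__main__":
    log = '{"event": "start"}\n{"event": "stop", "codes": [1, 2]}\n'
    for entry in parse_lines(log):
        print(entry)
```

No incremental or depth-aware parsing is needed here, since each line is handed whole to an ordinary parser.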