Re: [Json] Regarding JSON text sequence ambiguities (Re: serializing sequences of JSON values)

Matthew Morley <matt@mpcm.com> Tue, 11 March 2014 15:31 UTC

This issue also came up with JSON-RPC structures and streaming JSON. It is
not directly related, in that we expect to encounter only objects at the
top level, but having guidance on handling top-level streaming is
important for JSON.

https://groups.google.com/forum/#!msg/json-rpc/fd1_anYEnDs/nnvnU51bfCoJ
https://groups.google.com/d/msg/json-rpc/cXlc8bD-a_M/cwfLYWSly54J
http://www.simple-is-better.org/json-rpc/transport_sockets.html


On Mon, Mar 10, 2014 at 6:11 PM, Nico Williams <nico@cryptonector.com> wrote:

> Consider jq (https://stedolan.github.io/jq) as an example.
>
> jq can and will read a sequence of JSON texts from stdin.  That
> sequence isn't an array.  jq applies its program to each input.  jq
> produces as its output all the values -encoded as JSON texts- output
> by the jq program.  This is exceedingly convenient, including for log
> files (for the reason that you give: a log file is an unending,
> append-only sequence of indeterminate length).
>
> jq handles the ambiguity by requiring top-level values to be
> "terminated" by any of: unambiguous top-level values, whitespace, or
> EOF.  Thus the sequence null, true, false, 1, 2 requires whitespace
> between every pair of values, while the sequence true, "foo", false,
> "bar", 1, "foobar", 2 doesn't.  An input that looks like
> "truefalsenull12" elicits a parse error.
>
> Note that the jq parser does NOT in fact parse multiple top-level
> values; it parses at most one top-level value.  Instead the program
> using the parser feeds input bytes to the parser until the parser
> finds the end of a top-level value and outputs it.  Remaining unparsed
> bytes are buffered, of course, so that when the program adds more input
> bytes the parser can be restarted to parse the next top-level value.
>
> This does mean that the jq parser will not output null when fed 'null'
> until one more byte _or_ EOF is fed to it.  But if fed '[1,2]' then
> the parser emits the parsed array value immediately when the closing
> bracket is parsed, without waiting for further inputs.
>
> jq always outputs a newline (though it could be space or tab) after
> outputting any top-level value's JSON text encoding.  It does so
> precisely to avoid these ambiguities.
>
> (This is also why jq must continue to output a newline or other
> whitespace after every output text, except, perhaps, when in raw
> output mode, in which case the outputs aren't JSON texts.)
>
> Nico

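To make the termination rule and the feed-and-buffer behaviour quoted above
concrete, here is a minimal Python sketch. It is not jq's implementation;
the class name JsonSequenceReader and its feed() method are hypothetical,
and it leans on the standard json module. Two simplifications are assumed:
an array, object, or string that fails to parse is treated as "incomplete"
until EOF, and the buffer is re-scanned on every call.

```python
import json

WS = " \t\r\n"
OPENERS = '[{"'  # first characters of values that delimit themselves


class JsonSequenceReader:
    """Hypothetical incremental reader for a sequence of JSON texts."""

    def __init__(self):
        self._decoder = json.JSONDecoder()
        self._buf = ""  # unparsed input carried over between feed() calls

    def feed(self, chunk="", eof=False):
        """Add input and return every top-level value that is now complete."""
        self._buf += chunk
        values = []
        while True:
            self._buf = self._buf.lstrip(WS)
            if not self._buf:
                break
            if self._buf[0] in OPENERS:
                # Arrays, objects and strings end at their closing delimiter,
                # so they can be emitted as soon as they parse.
                try:
                    value, end = self._decoder.raw_decode(self._buf)
                except json.JSONDecodeError:
                    if eof:
                        raise
                    break  # assumption: incomplete, wait for more input
            else:
                # null, true, false and numbers end only at whitespace, at
                # the start of a self-delimiting value, or at EOF.
                end = next(
                    (i for i, c in enumerate(self._buf)
                     if c in WS or c in OPENERS),
                    None,
                )
                if end is None:
                    if not eof:
                        break  # "1" might still become "12"
                    end = len(self._buf)
                # Raises json.JSONDecodeError for e.g. "truefalsenull12".
                value = json.loads(self._buf[:end])
            values.append(value)
            self._buf = self._buf[end:]
        return values
```

With this sketch, feed('null') returns nothing until a further byte or
eof=True arrives, while feed('[1,2]') returns the array immediately, which
matches the behaviour described for jq.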

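On the writing side, the newline-after-every-text convention is easy to
follow. A minimal sketch, again in Python with a hypothetical helper name
(write_sequence), assuming the default compact output of json.dumps, which
never contains a raw newline:

```python
import io
import json


def write_sequence(values, out):
    """Write each value as a JSON text followed by a newline."""
    for value in values:
        out.write(json.dumps(value))
        out.write("\n")  # the terminator that keeps adjacent scalars apart


buf = io.StringIO()
write_sequence([None, True, False, 1, 2], buf)
# Without the newlines these five texts would run together as
# "nulltruefalse12", which a reader cannot split unambiguously.
```

Because every text ends in whitespace, a reader never has to wait for EOF
to know that a trailing null, true, false, or number is complete.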

-- 
Matthew P. C. Morley