Re: [Json] Using a non-whitespace separator (Re: Working Group Last Call on draft-ietf-json-text-sequence)
Nico Williams <nico@cryptonector.com> Thu, 05 June 2014 08:59 UTC
In-Reply-To: <255B9BB34FB7D647A506DC292726F6E11546B21D22@WSMSG3153V.srv.dir.telstra.com>
References: <CAK3OfOidgk13ShPzpF-cxBHeg34s99CHs=bpY1rW-yBwnpPC-g@mail.gmail.com> <CAHBU6itr=ogxP4uoj57goEUSOCpsRx1AXVnW1NQwSTPxbbttkw@mail.gmail.com> <CAK3OfOhft+XJeMrg5rdY9E6fxAkJ2qsT3UHwu7zt=NEz2Q3XOQ@mail.gmail.com> <CAK3OfOhy-N0zjCVxtOMB8SqZEKceVvBz9Y6i0fo2W8i+gHKm4Q@mail.gmail.com> <CAK3OfOiQnLq29cv+kas3B8it-+82VmXvL3Rq1C5_767FDhBjRg@mail.gmail.com> <03CFAB3E-F4C6-4AE8-A501-8525376C4AA7@vpnc.org> <CAK3OfOja-17V391tTK91R98X8XQzd0iPnur2=oo4ii+MCOt+Rg@mail.gmail.com> <CFB42410.4EDDC%jhildebr@cisco.com> <CAMm+Lwime-=UQPu3t2ty05CZLb7xUMi9KGi31Xi2B7RNF5S3Og@mail.gmail.com> <CAK3OfOg_k4Ngq+z1pn4b+XRf0M1Hqx8qZ9BtW0sa8QQ+bjKJyA@mail.gmail.com> <084664DB-A55D-465E-8888-97BA0BB59637@vpnc.org> <CAHBU6itEph5GzB-P8bUUvUMopRNxcCE-16qys7ofhdmsDvpN4w@mail.gmail.com> <CAMm+LwjoeC1R4O2iCPo+RfUFn4Qca4zyytqa817ayH60mNaWLg@mail.gmail.com> <CAK3OfOhjPZUXK6C0qSsQQZvOgR3Sv3SWpyH=qTuihuDC9uvXrA@mail.gmail.com> <255B9BB34FB7D647A506DC292726F6E11546B21D22@WSMSG3153V.srv.dir.telstra.com>
Date: Thu, 05 Jun 2014 03:59:17 -0500
Message-ID: <CAK3OfOiCETzxvJSKSqPEXCdMyg92FB+0NpfR0fwgKmXOYSHvgw@mail.gmail.com>
From: Nico Williams <nico@cryptonector.com>
To: "Manger, James" <James.H.Manger@team.telstra.com>
Content-Type: multipart/alternative; boundary="f46d0444ebfdbec69204fb12f5aa"
Archived-At: http://mailarchive.ietf.org/arch/msg/json/S_QYM4sDDcoj6XQYLvR0EtQm2tU
Cc: Phillip Hallam-Baker <ietf@hallambaker.com>, Tim Bray <tbray@textuality.com>, Paul Hoffman <paul.hoffman@vpnc.org>, Joe Hildebrand Hildebrand <jhildebr@cisco.com>, IETF JSON WG <json@ietf.org>
Subject: Re: [Json] Using a non-whitespace separator (Re: Working Group Last Call on draft-ietf-json-text-sequence)
On Thursday, June 5, 2014, Manger, James <James.H.Manger@team.telstra.com> wrote:

> >> JSON-sequence = *( ws %1e JSON-text )
>
> RS as a JSON sequence prefix or separator was a bad idea when discussed a
> month ago and still is.

I'm happy to go with the current I-D's LF-based recovery mechanism or a variant of it. If that's not acceptable then something like RS has to be it.

A third alternative is to abandon logfiles as a use case, or to accept truncated writes leading to fatal parse errors from the corrupted point forwards.

A fourth alternative is to not publish on the Standards track, maybe publish as Informational or Experimental whatever fewer of us think is best.

Before we get to any of those alternatives I'd like to try to get consensus. Remember, it's rough consensus and running code. I have running code and I'm open to changing it, but if some views are mutually exclusive and therefore some have to be on the rough side of consensus, then so be it.

> * You cannot (easily) enter an RS in notepad.

A reason to make it optional, as I proposed -- only logfile writers should have to emit it.

> * You cannot (easily) enter an RS in vi.

Meh. !printf '\x1E'. Also, same answer as above.

> * You cannot see an RS.

But that's harmless. Especially if you know to expect to find it there.

> * An RS causes Chrome to treat a file as binary data, instead of text.

That could get fixed (e.g., if Chrome learns about this new MIME type).

> * Cut-n-paste a JSON value with an invisible RS prefix and the result is
> NOT JSON, ie it will fail with a JSON parser as RS is not allowed in JSON.

But you can be careful to not cut the RS. Or we could make it RS SP, to make it easier to find where to cut.

> * No one uses RS.

That's not much of an argument :)

> * RS is now labelled INFORMATION SEPARATOR TWO, not RECORD SEPARATOR.

Ditto. And noted.

> * We aren't using INFORMATION SEPARATOR ONE, THREE or FOUR.

Again.

> * A newline as a JSON value terminator is sufficient to parse a JSON
> sequence unambiguously.

Except in the presence of incompletely-written entries.

> * RS doesn't work well with APIs that read text by the line.

Do you have any examples of such APIs? Define "well".

> * Detecting a newline that separates JSON values is more complex than
> detecting an RS character, but it is not that complex (eg handful of lines
> of code).

Maybe, and that is my preferred solution. I'm with you there.

> * An RS prefix detects only slightly more cases of accidentally truncated
> writes (in the middle of a top-level number, in a top-level string in the
> middle of an escape sequence) -- not enough to be compelling.

There are other cases if we have as a goal to recover starting at the very next full text following the truncated one.

Preceding and following texts with LF also helps recovery in the cases you mention. But it's not enough. The current I-D covers some alternatives that do remove all ambiguities at little cost to either the parser or the encoder, including: removing internal newlines from texts (cheap operation; no need to parse and re-encode) and/or preceding texts with "null" LF.

> * The awkwardness of RS will mean many implementations will be lenient,
> but leniency becomes "expected" which leads to interop problems.

Parsers shouldn't require it. Encoders should emit it if there's a chance of truncated writes. How can interop problems arise from this formula? What am I missing?

> "A JSON sequence is the concatenation of zero or more JSON values, where
> each JSON value is terminated with a newline."
>
> Simple to understand. Simple to write. Simple enough to parse. Simple
> enough to resync from the middle of a sequence. Almost identical recovery
> from accidental corruption is possible in almost all the same instances
> regardless of whether an RS prefix or newline suffix is used.
>

Yes, and I very much like that, right up until one wants to cater to logfiles and the truncated write problem.

See alternatives above. We must choose one. Two camps are squared off and I'm in the middle. One will win or all may lose.

My preference is to say that logfile writers must remove internal newlines from the texts they write. That is by far the simplest fix for write truncation. Not all sequences will be logfile-like. No RS if we go with that.

I haven't seen any strong arguments against that; in fact, you have stronger arguments against RS than I have seen against internal newline removal by logfile writers.

Let's sleep on it and revisit tomorrow,

Nico
--
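
A minimal Python sketch of the newline-based option discussed above, offered as an illustration only and not taken from the I-D or from any existing implementation. It assumes a logfile writer that emits each text on a single line with an LF before and after it, and a reader that treats every non-empty line as one candidate text and skips lines that fail to parse. The names encode_entry and parse_sequence are hypothetical.

```python
import json


def encode_entry(value):
    """Serialize one JSON text on a single line, wrapped in LF on both sides.

    Assumption for this sketch: json.dumps without `indent` never emits a raw
    newline (newlines inside strings are escaped as \\n), so the only LFs in
    the output are the leading and trailing ones added here.
    """
    text = json.dumps(value, separators=(",", ":"))
    return "\n" + text + "\n"


def parse_sequence(data):
    """Split on LF and parse each non-empty line as one JSON text.

    Returns (values, skipped): the successfully parsed values, and the raw
    lines that failed to parse (for example, the remains of a truncated write).
    """
    values, skipped = [], []
    for line in data.split("\n"):
        if not line.strip():
            continue  # empty lines come from the LF-before/LF-after framing
        try:
            values.append(json.loads(line))
        except json.JSONDecodeError:
            skipped.append(line)
    return values, skipped


if __name__ == "__main__":
    # Simulate a log in which the second write was cut off before its
    # closing bracket and trailing LF, and a third write followed it.
    log = encode_entry({"a": 1}) + '\n{"b":[1,2' + encode_entry({"c": 3})
    values, skipped = parse_sequence(log)
    print(values)
    print(skipped)
```

Because the third entry starts with its own LF, the truncated second entry ends up alone on its line and only that entry is lost. This sketch does not remove every ambiguity: a top-level number or literal cut off at a point where the prefix is itself valid JSON (for example 123 truncated to 12) would still parse, which is the kind of residual case raised in the thread.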