Re: [Json] Using a non-whitespace separator (Re: Working Group Last Call on draft-ietf-json-text-sequence)

Tim Bray <tbray@textuality.com> Thu, 05 June 2014 18:10 UTC

From: Tim Bray <tbray@textuality.com>
Date: Thu, 5 Jun 2014 11:10:20 -0700
Message-ID: <CAHBU6isCCXCTSTHXhdon-CUArJxJ8iCEoj==2HqpAKfagiT7zg@mail.gmail.com>
To: Paul Hoffman <paul.hoffman@vpnc.org>
Archived-At: http://mailarchive.ietf.org/arch/msg/json/HX_Dy0bidKplO794KIgPEdFrzA4
Cc: "Manger, James" <James.H.Manger@team.telstra.com>, IETF JSON WG <json@ietf.org>
Subject: Re: [Json] Using a non-whitespace separator (Re: Working Group Last Call on draft-ietf-json-text-sequence)

I agree with every point Paul made. I’m completely convinced that we need
an unambiguous can’t-appear-in-JSON separator/prefix. Any code point below
U+001F, other than the characters included in JSON, would be fine by me.
How about U+0000?
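
To make the point concrete, here is a rough Python sketch of why such a separator needs no escaping. The function names and the `sep`/`indent` parameters are made up for illustration, and the RS-prefix framing is only one reading of the ABNF quoted below, not text from the draft. JSON strings cannot contain raw control characters, so a serializer always escapes them and a reader can split on the separator blindly.

```python
import json

RS = "\x1e"  # U+001E; assumed default separator for this sketch


def write_sequence(values, sep=RS, indent=None):
    """Emit each value as separator + JSON text + newline (hypothetical framing)."""
    return "".join(sep + json.dumps(v, indent=indent) + "\n" for v in values)


def read_sequence(text, sep=RS):
    """Split on the separator and parse every non-blank chunk."""
    return [json.loads(chunk) for chunk in text.split(sep) if chunk.strip()]


values = [{"a": 1}, "x\x1ey", [1, 2]]

# A raw RS inside a string is escaped by the serializer, so it never
# collides with the separator.
assert RS not in json.dumps("x\x1ey")

# Pretty-printed texts (with embedded newlines) and U+0000 as the
# separator both round-trip under the same logic.
assert read_sequence(write_sequence(values)) == values
assert read_sequence(write_sequence(values, sep="\x00", indent=2), sep="\x00") == values
```

The same code works for any control character that JSON does not allow unescaped, which is all the property I am asking for.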


On Thu, Jun 5, 2014 at 10:01 AM, Paul Hoffman <paul.hoffman@vpnc.org> wrote:

> <no hat>
>
> As a summary from below: you prefer a more normal character like NL plus
> the need to escape it in strings, versus an obscure character like RS that
> requires no escaping. Is that correct?
>
> On Jun 4, 2014, at 11:25 PM, Manger, James <James.H.Manger@team.telstra.com> wrote:
>
> >>> JSON-sequence = *( ws %x1E JSON-text )
> >
> > RS as a JSON sequence prefix or separator was a bad idea when discussed
> > a month ago and still is.
> >
> > * You cannot (easily) enter an RS in notepad.
> > * You cannot (easily) enter an RS in vi.
> > * You cannot see an RS.
>
> It seems like the purpose of draft-ietf-json-text-sequence is to create a
> format that can be used for log files and other such files that are
> constantly added to. If so, then the above complaints are not really
> relevant, right?
>
> > * An RS causes Chrome to treat a file as binary data, instead of text.
>
> Ditto.
>
> > * Cut-n-paste a JSON value with an invisible RS prefix and the result is
> > NOT JSON, i.e. it will fail with a JSON parser as RS is not allowed in JSON.
>
> That will be true of anything other than a character that doesn't need to
> be escaped, right? People asked for RS (or something like it) so that they
> didn't have to deal with escaping when the value chosen was also in a
> string.
>
> > * No one uses RS.
> > * RS is now labelled INFORMATION SEPARATOR TWO, not RECORD SEPARATOR.
> > * We aren't using INFORMATION SEPARATOR ONE, THREE or FOUR.
>
> All irrelevant. We are creating a new specification.
>
> > * A newline as a JSON value terminator is sufficient to parse a JSON
> > sequence unambiguously.
>
> Sure. And it also causes the need to have escaping.
>
> > * RS doesn't work well with APIs that read text by the line.
>
> Are there JSON APIs that do that?
>
> > * Detecting a newline that separates JSON values is more complex than
> > detecting an RS character, but it is not that complex (e.g. a handful of
> > lines of code).
>
> The people who asked for RS seemed more concerned about escaping newlines
> in the JSON being written, not detecting it on the incoming. Do you agree
> that that is also a concern?
>
> > * An RS prefix detects only slightly more cases of accidentally
> > truncated writes (in the middle of a top-level number, in a top-level
> > string in the middle of an escape sequence) -- not enough to be compelling.
>
> That was not the major motivation, however.
>
> > * The awkwardness of RS will mean many implementations will be lenient,
> > but leniency becomes "expected" which leads to interop problems.
>
> That is a prediction of the future.
>
> > "A JSON sequence is the concatenation of zero or more JSON values, where
> > each JSON value is terminated with a newline."
> >
> > Simple to understand. Simple to write. Simple enough to parse. Simple
> > enough to resync from the middle of a sequence. Almost identical recovery
> > from accidental corruption is possible in almost all the same instances
> > regardless of whether an RS prefix or newline suffix is used.
>
> Sure, but it ignores the issue many people had about escaping.
>
> --Paul Hoffman
>
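
For comparison, a minimal sketch of the newline-terminated alternative quoted above. The helper names are hypothetical, and this is a naive line-by-line reader, not the resync algorithm James has in mind. It assumes every writer emits each JSON text on a single line; a pretty-printing writer breaks that assumption, which is one way to read the constraint the RS camp wanted to avoid.

```python
import json


def write_lines(values):
    """Compact serialization: json.dumps without indent emits no raw newline."""
    return "".join(json.dumps(v) + "\n" for v in values)


def read_lines(text):
    """Return (parsed values, lines that failed to parse)."""
    good, bad = [], []
    for line in text.split("\n"):
        if not line.strip():
            continue  # skip blank lines, including the one after the final newline
        try:
            good.append(json.loads(line))
        except json.JSONDecodeError:
            bad.append(line)
    return good, bad


# Compact texts survive, including a string with an (escaped) newline in it.
good, bad = read_lines(write_lines([{"a": 1}, [1, 2], "x\ny"]))
assert good == [{"a": 1}, [1, 2], "x\ny"] and bad == []

# A single pretty-printed text spans several lines, and none of them
# parses on its own.
good, bad = read_lines(json.dumps({"a": 1}, indent=2) + "\n")
assert good == [] and len(bad) == 3
```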



-- 
- Tim Bray (If you’d like to send me a private message, see
https://keybase.io/timbray)