Re: [Json] Using a non-whitespace separator (Re: Working Group Last Call on draft-ietf-json-text-sequence)
Nico Williams <nico@cryptonector.com> Mon, 02 June 2014 00:03 UTC
In-Reply-To: <03CFAB3E-F4C6-4AE8-A501-8525376C4AA7@vpnc.org>
References: <CAK3OfOidgk13ShPzpF-cxBHeg34s99CHs=bpY1rW-yBwnpPC-g@mail.gmail.com> <CAHBU6itr=ogxP4uoj57goEUSOCpsRx1AXVnW1NQwSTPxbbttkw@mail.gmail.com> <CAK3OfOhft+XJeMrg5rdY9E6fxAkJ2qsT3UHwu7zt=NEz2Q3XOQ@mail.gmail.com> <CAK3OfOhy-N0zjCVxtOMB8SqZEKceVvBz9Y6i0fo2W8i+gHKm4Q@mail.gmail.com> <CAK3OfOiQnLq29cv+kas3B8it-+82VmXvL3Rq1C5_767FDhBjRg@mail.gmail.com> <03CFAB3E-F4C6-4AE8-A501-8525376C4AA7@vpnc.org>
Date: Sun, 01 Jun 2014 19:03:18 -0500
Message-ID: <CAK3OfOja-17V391tTK91R98X8XQzd0iPnur2=oo4ii+MCOt+Rg@mail.gmail.com>
From: Nico Williams <nico@cryptonector.com>
To: Paul Hoffman <paul.hoffman@vpnc.org>
Content-Type: text/plain; charset="UTF-8"
Archived-At: http://mailarchive.ietf.org/arch/msg/json/Bi5TsaghK3WpS-NMdKECd07vzT4
Cc: IETF JSON WG <json@ietf.org>
Subject: Re: [Json] Using a non-whitespace separator (Re: Working Group Last Call on draft-ietf-json-text-sequence)

On Sun, Jun 1, 2014 at 6:17 PM, Paul Hoffman <paul.hoffman@vpnc.org> wrote:
>>> Oh, right, the separator must be a character that must be escaped in
>>> strings.  That greatly limits the range of characters we can choose
>>> from.
>>
>> And it has to be a one-byte character (therefore an ASCII character,
>> and the texts must be encoded in UTF-8).
>
> Why is that?

The problem we're talking about is logfiles where applications
"append" to the logfile.  Incomplete writes can result in some
circumstances.  The question then is: how to recover?  In particular:
how to read past an incomplete entry to the next complete entry?

The I-D currently describes a recovery heuristic, but some reviewers
have stated a preference for a stronger, more easily understood
recovery method.

One obvious approach is to separate texts with a byte or byte sequence
that could not normally happen in a JSON text.  A byte is simpler and
easier to handle and understand than a byte sequence, because the
latter can be written incompletely for the same reasons that a JSON
text can.

We've considered all of these approaches (not in this order):

 0) newline separator
 1) #0 + JSON text boundary detection ABNF
 2) #1 + removal of newlines from JSON texts (which does not require
    re-encoding, FYI)
 3) #2 + write something like a text consisting of a single null
    value before every text
 4) use some other separator (that isn't a JSON whitespace character)
 5) #4 with ASCII RS as the separator

Any separator for #4 has to be something that cannot happen in a JSON
text normally, and it has to be something amenable to recovery from
partial writes.

Even partial writes always at least write complete _bytes_ (if they
write any).  Therefore a one-byte separator is always unambiguous.

In order for a separator to be utterly unambiguous in the face of
partial writes it has to involve a byte that can never occur in a JSON
text.  I suspect there's no such byte that works for UTF-8/16/32.

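To make the recovery argument concrete, here is a minimal sketch, not
taken from the I-D, of a reader for a UTF-8 log that uses ASCII RS
(0x1E) as a one-byte separator.  The function name read_records, the
sample log, and the choice to write RS before each text are all
assumptions made for illustration.  The last two lines show why the
same byte would not be safe for UTF-16: it turns up in the encoding of
an ordinary character.

```python
import json

RS = b"\x1e"  # ASCII Record Separator, option #5 in the list above


def read_records(log: bytes):
    """Yield every complete JSON text from an RS-separated UTF-8 log.

    Chunks that fail to decode or parse (for example a truncated
    write) are skipped, so reading resumes at the next separator.
    """
    for chunk in log.split(RS):
        if not chunk.strip():
            continue  # nothing between two separators
        try:
            yield json.loads(chunk.decode("utf-8"))
        except ValueError:
            continue  # incomplete or damaged entry


# Hypothetical log: the second entry was cut off mid-write.
log = (
    RS + b'{"n": 1}\n'
    + RS + b'{"n": 2, "msg": "trunc'
    + RS + b'{"n": 3}\n'
)
recovered = list(read_records(log))

# The same byte is not safe for UTF-16: U+011E encodes to 1E 01 there.
utf16_text = json.dumps("\u011e", ensure_ascii=False).encode("utf-16-le")
rs_in_utf16 = RS in utf16_text
```

Here recovered holds the first and third entries only, and rs_in_utf16
is True.  The sketch does not catch a truncated top-level number,
which can still parse as a shorter number.
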
If we limit JSON text sequences to UTF-8 then any ASCII character that
must be escaped in strings and is not valid in the encoding of any
values will do.  That's a very small set of characters.

RS (#5) was objected to.  I don't see what other character can be used
that won't result in similar objections.

The more I think about it the more I prefer the options in the I-D,
#0, #1, #2, and #3.  I don't see better alternatives, and I don't
think this is fatal.

Nico
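
The set of characters the message describes for the UTF-8 case (ASCII,
must be escaped in strings, not valid anywhere else in a JSON text)
can be enumerated.  This is a rough sketch that assumes the RFC 7159
grammar; the names MUST_ESCAPE, USED_IN_SYNTAX, candidates and
rejected_everywhere are illustrative, and the helper uses Python's
json parser only as a sanity check.

```python
import json

# Per the JSON grammar (RFC 7159): characters that may not appear
# unescaped inside a string.
MUST_ESCAPE = {chr(c) for c in range(0x20)} | {'"', "\\"}

# Must-escape characters that still occur literally in JSON texts:
# whitespace between tokens, the quote that delimits strings, and the
# backslash that introduces escapes.
WHITESPACE = {" ", "\t", "\n", "\r"}
USED_IN_SYNTAX = WHITESPACE | {'"', "\\"}

candidates = sorted(MUST_ESCAPE - USED_IN_SYNTAX)


def rejected_everywhere(ch: str) -> bool:
    """True if json.loads rejects ch both inside a string and before a value."""
    for text in ('"' + ch + '"', ch + "null"):
        try:
            json.loads(text)
        except ValueError:
            continue  # rejected in this position, try the next one
        return False  # accepted somewhere, so not usable as a separator
    return True
```

Under these assumptions candidates is the C0 control characters other
than tab, LF and CR, which includes RS (0x1E); every one of them
passes rejected_everywhere, while "\n" does not.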