Re: [Json] Using a non-whitespace separator (Re: Working Group Last Call on draft-ietf-json-text-sequence)

"Martin J. Dürst" <duerst@it.aoyama.ac.jp> Mon, 02 June 2014 01:13 UTC

Message-ID: <538BCFAB.4010307@it.aoyama.ac.jp>
Date: Mon, 02 Jun 2014 10:13:15 +0900
From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Organization: Aoyama Gakuin University
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0
MIME-Version: 1.0
To: Nico Williams <nico@cryptonector.com>, Paul Hoffman <paul.hoffman@vpnc.org>
References: <CAK3OfOidgk13ShPzpF-cxBHeg34s99CHs=bpY1rW-yBwnpPC-g@mail.gmail.com> <CAHBU6itr=ogxP4uoj57goEUSOCpsRx1AXVnW1NQwSTPxbbttkw@mail.gmail.com> <CAK3OfOhft+XJeMrg5rdY9E6fxAkJ2qsT3UHwu7zt=NEz2Q3XOQ@mail.gmail.com> <15F00865-592D-41B2-8E23-6C794C4B77EF@vpnc.org> <CAK3OfOgWtJQtXGQR3GqRpFJBzAPpy0NvTeDeTSYhqa7-FDhqbw@mail.gmail.com>
In-Reply-To: <CAK3OfOgWtJQtXGQR3GqRpFJBzAPpy0NvTeDeTSYhqa7-FDhqbw@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 7bit
Archived-At: http://mailarchive.ietf.org/arch/msg/json/MAAeCEHshnGT2ZdqyXWPjgrWtwQ
Cc: IETF JSON WG <json@ietf.org>
Subject: Re: [Json] Using a non-whitespace separator (Re: Working Group Last Call on draft-ietf-json-text-sequence)

On 2014/06/02 08:16, Nico Williams wrote:
> On Sun, Jun 1, 2014 at 6:10 PM, Paul Hoffman <paul.hoffman@vpnc.org> wrote:
>> On Jun 1, 2014, at 3:35 PM, Nico Williams <nico@cryptonector.com> wrote:
>>> On Sun, Jun 1, 2014 at 12:08 AM, Tim Bray <tbray@textuality.com> wrote:
>>>> No. There should be only one way to do things.

I agree with this sentence, but I very strongly disagree with the 
proposal of U+FFFE.

>>> I'm not terribly fond of this.
>>>
>>> It'd be easier if we picked a Unicode whitespace character that's not
>>> used in the JSON whitespace rule and must be escaped in strings,
>>> preferably one that terminals and such generally handle as a
>>> whitespace.
>>
>> Why would a whitespace character be "easier" than a character not allowed in Unicode? It seems to me that a character that could not exist in a string and therefore never needs to be escaped is "easier".
>
> Because I want text-based tools to be able to at least display (and
> grep, and...) JSON text sequence contents much as they can JSON text
> contents.

Yes, exactly. The Unicode spec says that U+FFFE must never be interpreted 
as an abstract character nor interchanged. While "interpreted as an 
abstract character" leaves quite a bit of room for interpretation 
(sic!), "not interchanged" is very clear. I hope we won't define a 
format that cannot be interchanged.

The reason Unicode defines things that way is that there may be 
implementations (think editors, ...) that use U+FFFE internally, and 
those would be quite confused when receiving it. There will also be 
applications (editors and other tools, including security checkers) that 
will complain about U+FFFE as part of checking for correct UTF-8.
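
To illustrate the kind of check I mean, here is a minimal sketch in Python 
of how a validating tool might flag noncharacters in incoming data. The 
function names are made up for this example and are not taken from any 
existing tool. It assumes the usual definition of the 66 noncharacters 
(U+FDD0..U+FDEF, plus the last two code points of each plane). Note that 
Python's own UTF-8 decoder accepts U+FFFE, so the check is an extra step 
on top of decoding.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the 66 Unicode noncharacters."""
    # U+FDD0..U+FDEF, plus U+xxFFFE and U+xxFFFF in each of the 17 planes.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def find_noncharacters(data: bytes) -> list[tuple[int, str]]:
    """Decode data as UTF-8 and list (character index, 'U+XXXX') per noncharacter."""
    text = data.decode("utf-8")  # raises UnicodeDecodeError on malformed input
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if is_noncharacter(ord(ch))
    ]


# Two JSON texts separated by U+FFFE, as in the proposal under discussion.
sample = '{"a": 1}\ufffe{"b": 2}'.encode("utf-8")
print(find_noncharacters(sample))  # [(8, 'U+FFFE')]
```

A tool built along these lines would report the separator itself as a 
problem in every sequence it is given, which is the interoperability 
concern above.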

Regards,   Martin.