Re: [Json] BOMs

"Martin J. Dürst" <duerst@it.aoyama.ac.jp> Wed, 20 November 2013 06:19 UTC

Message-ID: <528C5445.3050600@it.aoyama.ac.jp>
Date: Wed, 20 Nov 2013 15:18:45 +0900
From: "\"Martin J. Dürst\"" <duerst@it.aoyama.ac.jp>
Organization: Aoyama Gakuin University
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.9) Gecko/20100722 Eudora/3.0.4
MIME-Version: 1.0
To: "Henry S. Thompson" <ht@inf.ed.ac.uk>
References: <AA45B3C6-1DC5-4B1E-8045-C9FE76022584@vpnc.org> <CEA92854.2CC53%jhildebr@cisco.com> <20131113224737.GI31823@mercury.ccil.org> <f5bob5n71y7.fsf@troutbeck.inf.ed.ac.uk> <5284B095.4070004@it.aoyama.ac.jp> <C37B2FE59C164DBCA982AC81A56A09AA@codalogic> <f5bk3g6ufqy.fsf@troutbeck.inf.ed.ac.uk> <5289F974.9020709@it.aoyama.ac.jp> <020401cee50f$a2cdf5c0$4001a8c0@gateway.2wire.net> <528B46EA.4040503@it.aoyama.ac.jp> <43255615-2FC9-4726-99FD-1B13D6B1F033@wirfs-brock.com> <f5br4ackyqm.fsf@troutbeck.inf.ed.ac.uk>
In-Reply-To: <f5br4ackyqm.fsf@troutbeck.inf.ed.ac.uk>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 7bit
X-Mailman-Approved-At: Wed, 20 Nov 2013 04:39:38 -0800
Cc: Allen Wirfs-Brock <allen@wirfs-brock.com>, John Cowan <cowan@mercury.ccil.org>, IETF Discussion <ietf@ietf.org>, Pete Cordell <petejson@codalogic.com>, JSON WG <json@ietf.org>, www-tag@w3.org, Anne van Kesteren <annevk@annevk.nl>, "t.p." <daedulus@btconnect.com>, es-discuss <es-discuss@mozilla.org>
Subject: Re: [Json] BOMs

Hello Henry, others,

On 2013/11/20 3:55, Henry S. Thompson wrote:
> Allen Wirfs-Brock writes:
>
>> There can be no doubt that the most widely deployed JSON parsers are
>> those that are built into the browser JavaScript implementations.
>> The ECMAScript 5 specification for JSON.parse that they implement
>> says BOM is an illegal character.  But what do the browsers actually
>> implement?  This:
>
> No, try e.g. jsonviewer.stack.hu [1] (works in Chrome, Safari, Opera,
> not in IE or Firefox)

In Firefox, I got some garbled characters, in particular some question 
marks for each of the two bytes of the BOM and one question mark for the 
e-acute. Because of the type of the errors, I strongly suspect it is 
related to what we are trying to investigate, and so I don't think this 
can be taken as evidence one way or another.
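
As a rough illustration of how such garbling can arise (this is only a sketch; it does not claim to reproduce what Firefox actually does internally), the following Python snippet decodes a hypothetical UTF-16LE payload as if it were UTF-8. The payload and variable names are assumptions made up for the example.

```python
import codecs

# Hypothetical payload: a UTF-16LE BOM followed by a single e-acute.
payload = codecs.BOM_UTF16_LE + "é".encode("utf-16-le")  # bytes FF FE E9 00

# Misreading the bytes as UTF-8: every undecodable byte is replaced by
# U+FFFD, which is often displayed as a question mark.
misread = payload.decode("utf-8", errors="replace")

# Reading the bytes as intended: the generic "utf-16" codec consumes the BOM.
intended = payload.decode("utf-16")

print(ascii(misread))
print(ascii(intended))
```

In this sketch the two BOM bytes and the lead byte of the e-acute each turn into one replacement character, which is consistent with the pattern of question marks described above.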

> or feed [2] to www.jsoneditoronline.org (Use
> Open/Url) (works in Chrome, IE, Firefox, ran out of time to test more).

The fact that some libraries or Web sites accept a BOM for JSON isn't 
proof that all (or, let's say, the majority) of them accept a BOM.

> As previously discussed, _no-one_ is arguing that BOMs are in the JSON
> language as such.  JSON parsers shouldn't accept BOMs.
>
> BOMs are, to quote the UNICODE spec, "not part of the text".  It is
> appropriate that specs concerned with JSON-on-the-wire, for example
> the media type registration for 'application/json', _should_ discuss
> the BOM, and it's open to them, _without changing the language at
> all_, to say that BOMs are acceptable but, again, are not part of the
> text which the parser has to accept.
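
The layering described in the quoted paragraph can be sketched in Python as follows. This is only an illustration under stated assumptions: `decode_wire` and `parse_strict` are hypothetical names, the default of UTF-8 for BOM-less input is an assumption, and no existing parser is claimed to be structured this way.

```python
import codecs
import json

# BOM byte sequences and the encodings they signal. UTF-32 is checked before
# UTF-16 because the UTF-32LE BOM begins with the UTF-16LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def decode_wire(data: bytes, default: str = "utf-8") -> str:
    """Wire layer (hypothetical): strip a leading BOM, then decode."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return data[len(bom):].decode(encoding)
    # Assumption: input without a BOM is treated as UTF-8.
    return data.decode(default)


def parse_strict(text: str):
    """Language layer (hypothetical): the JSON text itself has no BOM."""
    if text.startswith("\ufeff"):
        raise ValueError("BOM is not part of the JSON text")
    return json.loads(text)


wire_bytes = codecs.BOM_UTF16_LE + '{"a": "é"}'.encode("utf-16-le")
print(parse_strict(decode_wire(wire_bytes)))
```

The point of the split is that the wire layer may tolerate a BOM while the grammar the parser implements stays unchanged.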

I agree that *from a theoretical viewpoint*, this is correct. But theory 
isn't everything. As I have written before (and you have cited in 
another thread, for another spec):

   What's most important now is to know what receivers actually
   accept. We are not in a design phase, we are just updating the
   definition ... and making sure we fix problems if there are
   problems, but we have to use the installed base for the main
   guidance

For our update from RFC 4627, the null hypothesis is that there are no 
BOMs (neither for UTF-8 nor for UTF-16). The patterns given in 
http://tools.ietf.org/html/rfc4627#section-3 cannot apply to characters; 
they can only apply to bytes. If we want to allow a BOM in 
application/json, then we have to have strong evidence that almost all 
parsers can deal with BOMs, not just fragmentary evidence that some 
parsers don't choke on a BOM.
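
To make the point about bytes concrete, here is a minimal Python sketch of the detection by null-octet patterns in the first four octets, as tabulated in RFC 4627 section 3. The function name `rfc4627_guess` is hypothetical, and returning `None` for unmatched patterns and for inputs shorter than four octets is a simplifying assumption of this sketch, not something the RFC specifies.

```python
from typing import Optional

# Null-octet patterns of the first four octets (True means the octet is 00),
# following the table in RFC 4627 section 3.
_PATTERNS = {
    (True, True, True, False): "utf-32-be",    # 00 00 00 xx
    (True, False, True, False): "utf-16-be",   # 00 xx 00 xx
    (False, True, True, True): "utf-32-le",    # xx 00 00 00
    (False, True, False, True): "utf-16-le",   # xx 00 xx 00
    (False, False, False, False): "utf-8",     # xx xx xx xx
}


def rfc4627_guess(data: bytes) -> Optional[str]:
    """Guess the encoding from the first four octets; None if no pattern fits."""
    if len(data) < 4:
        return None  # simplification made for this sketch
    return _PATTERNS.get(tuple(octet == 0 for octet in data[:4]))


plain = '{"a":1}'.encode("utf-16-le")   # starts 7B 00 22 00
with_bom = b"\xff\xfe" + plain          # starts FF FE 7B 00

print(rfc4627_guess(plain))
print(rfc4627_guess(with_bom))
```

In this sketch the BOM-less input matches the UTF-16LE row, while the same text preceded by a UTF-16LE BOM matches none of the five patterns, because the first two octets are no longer an ASCII character followed by a null octet.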

Please note that there is a parallel to XML here, in that neither Unicode 
(for the encoding form) nor the IETF (for the 'charset') requires a BOM 
for "UTF-16", but XML nevertheless strictly requires it.

Regards,   Martin.

> ht
>
> [1] http://jsonviewer.stack.hu/#http://www.ltg.ed.ac.uk/~ht/ov-test/b16le.json