Re: [Json] Wording on encoding; removing the table

"Martin J. Dürst" <duerst@it.aoyama.ac.jp> Mon, 25 November 2013 09:44 UTC

Message-ID: <52931BE2.90607@it.aoyama.ac.jp>
Date: Mon, 25 Nov 2013 18:44:02 +0900
From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Organization: Aoyama Gakuin University
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.9) Gecko/20100722 Eudora/3.0.4
MIME-Version: 1.0
To: Bjoern Hoehrmann <derhoermi@gmx.net>
References: <v8av89128j49csd5bb5ba2rqrgschs4c79@hive.bjoern.hoehrmann.de> <BE35B0E6-6C71-47EB-BA29-08A32935D20E@vpnc.org> <uhnv89pnulebdn9qsjuutr472aku18r0db@hive.bjoern.hoehrmann.de>
In-Reply-To: <uhnv89pnulebdn9qsjuutr472aku18r0db@hive.bjoern.hoehrmann.de>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: quoted-printable
Cc: Paul Hoffman <paul.hoffman@vpnc.org>, JSON WG <json@ietf.org>
Subject: Re: [Json] Wording on encoding; removing the table

I agree with Björn here. Removing UTF-16 and UTF-32 from the spec
entirely would open the door to any other odd encoding, which would be a
net decrease in interoperability.

Regards,   Martin.

On 2013/11/23 8:02, Bjoern Hoehrmann wrote:
> * Paul Hoffman wrote:
>> Proposed replacement:
>>
>>    The default encoding for JSON transmitted over the Internet is UTF-8.
>>    Transmitting JSON using other encodings may not be interoperable
>>    unless the receiving system definitively knows the encoding.
>>
>> Does anyone have a technical objection to the proposed replacement? If
>> so, please state the error and (hopefully) a correction.
>
> It is necessary for reasons of security and interoperability that all
> application/json processors agree on how to get from the sequence of
> bytes that make up the application/json entity to a sequence of integers
> that are used in the ABNF definition of the JSON syntax. For example,
>
>    data:application/json,%5B%22Bj+APY-rn%22%5D
>
> Under the rules you propose, this can be interpreted as if it were
>
>    ["Björn"]
>
> An implementation that interprets it thus is fully conforming, because
> the bytes look like UTF-7 encoded text and the specification does not
> unambiguously say that, no, the implementation must not take it as a
> UTF-7 encoded document; instead, it must use UTF-8 to decode.
>
> Under the rules of RFC 4627 there can be no UTF-7 encoded
> application/json entities at all, and processors are required to decode
> the example using UTF-8.
>
> Some web browsers had exactly this problem for a long time with other
> formats; it is entirely unacceptable to go back to "just do whatever"
> specifications for character encoding determination.
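
The ambiguity in the quoted example can be reproduced with a minimal Python sketch. It decodes the bytes of the data: URL once as UTF-8 and once as UTF-7, then applies a simplified version of the null-byte pattern check from RFC 4627, section 3. The function name detect_rfc4627_encoding is hypothetical and chosen for illustration; the sketch assumes inputs of at least four octets for the UTF-16/UTF-32 cases and ignores byte order marks.

```python
import json
from urllib.parse import unquote_to_bytes

# Bytes carried by the data: URL in the quoted example.
# unquote_to_bytes leaves "+" untouched (only unquote_plus maps it to a space).
payload = unquote_to_bytes("%5B%22Bj+APY-rn%22%5D")

# The same bytes yield two different JSON values depending on the decoder.
as_utf8 = json.loads(payload.decode("utf-8"))  # "+APY-" stays literal text
as_utf7 = json.loads(payload.decode("utf-7"))  # "+APY-" is read as U+00F6


def detect_rfc4627_encoding(data: bytes) -> str:
    """Simplified sketch of the RFC 4627 section 3 detection rule.

    The first two characters of a JSON text are always ASCII, so the
    pattern of zero octets in the first four octets identifies the
    Unicode encoding form. Anything else is treated as UTF-8, which
    means UTF-7 is never an option.
    """
    if len(data) < 4:
        return "utf-8"
    zero_pattern = tuple(octet == 0 for octet in data[:4])
    table = {
        (True, True, True, False): "utf-32-be",   # 00 00 00 xx
        (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",   # xx 00 00 00
        (False, True, False, True): "utf-16-le",  # xx 00 xx 00
    }
    return table.get(zero_pattern, "utf-8")       # xx xx xx xx


# With a fixed rule, every processor arrives at the same decoder.
encoding = detect_rfc4627_encoding(payload)
value = json.loads(payload.decode(encoding))
```

Under the fixed rule, the example payload has no zero octets in its first four bytes, so it is decoded as UTF-8 and the "+APY-" sequence is kept as literal characters rather than being turned into "ö".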