Re: [Json] Call for Consensus: Proposed Text for "8.1 Character Encoding"

Pete Cordell <petejson@codalogic.com> Wed, 10 May 2017 20:30 UTC

To: "Matthew A. Miller" <linuxwolf+ietf@outer-planes.net>, Julian Reschke <julian.reschke@gmx.de>, "json@ietf.org" <json@ietf.org>
From: Pete Cordell <petejson@codalogic.com>
Date: Wed, 10 May 2017 21:30:34 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/KyWNVmNw2RCHssOlLD0Xo9nLXZc>
Subject: Re: [Json] Call for Consensus: Proposed Text for "8.1 Character Encoding"

On 10/05/2017 18:13, Matthew A. Miller wrote:
> Assuming the Working Group finds that scope acceptable and finds UTF-8
> only acceptable, here is a starting proposal for text:
>
> """
> 8.1.  Character Encoding
>
> When transmitting over a network protocol, JSON text MUST be
> encoded in UTF-8 (Section 3 of [UNICODE]).
>
> Previous specifications of JSON have not required the use of UTF-8
> when transmitting JSON text. However, the vast majority of
> JSON-based software implementations have chosen to use the UTF-8
> encoding, to the extent that it is the only encoding that achieves
> interoperability.
>
> Implementations MUST NOT add a byte order mark (U+FEFF) to the
> beginning of a JSON text.  In the interests of interoperability,
> implementations that parse JSON texts MAY ignore the presence of a
> byte order mark rather than treating it as an error.
> """
>
> If you find this acceptable, please indicate that.  Otherwise, please
> provide suggested changes.


A strong +1 to the spirit of the proposal.

I realise that the term "network protocol" is necessarily vague, but I
wonder whether we could avoid confusion with the likes of NetBEUI (if
that's still around) by using phrasing something like:

     When transmitted as the syntax of a network protocol, or as a
     payload of a network protocol intended to be interpreted as part of
     a protocol, JSON text MUST be encoded in UTF-8 (Section 3 of
     [UNICODE]).
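
To make the proposed requirements concrete, here is a minimal sketch of
how an implementation might follow them, assuming Python and its standard
json and codecs modules. The function names encode_json_text and
decode_json_text are hypothetical, chosen only for illustration; nothing
here is part of the proposed text. The sender emits UTF-8 with no byte
order mark, and the receiver takes the optional "MAY ignore" path for a
leading BOM:

```python
import codecs
import json


def encode_json_text(value) -> bytes:
    """Serialize a value as JSON text for transmission (hypothetical helper).

    The result is UTF-8 and never starts with a byte order mark.
    """
    # ensure_ascii=False keeps non-ASCII characters as literal UTF-8
    # rather than \uXXXX escapes; either form is valid JSON.
    return json.dumps(value, ensure_ascii=False).encode("utf-8")


def decode_json_text(data: bytes):
    """Parse received JSON text (hypothetical helper).

    A leading UTF-8 byte order mark is skipped instead of being treated
    as an error, which is the lenient option the proposal permits.
    """
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    # Strict decoding: input that is not valid UTF-8 raises UnicodeDecodeError.
    return json.loads(data.decode("utf-8"))
```

A strict receiver could equally reject the BOM; the sketch only shows the
lenient choice.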

Thanks,

Pete