Re: [Json] Wording on encoding; removing the table
"Pete Cordell" <petejson@codalogic.com> Sat, 23 November 2013 09:51 UTC
From: Pete Cordell <petejson@codalogic.com>
To: Paul Hoffman <paul.hoffman@vpnc.org>, JSON WG <json@ietf.org>
References: <v8av89128j49csd5bb5ba2rqrgschs4c79@hive.bjoern.hoehrmann.de> <BE35B0E6-6C71-47EB-BA29-08A32935D20E@vpnc.org>
Date: Sat, 23 Nov 2013 09:45:02 -0000
I believe we must have consensus that this is a contentious issue, and
there is a lot of confusion around it. Therefore, in the interests of
interoperability I believe it is inappropriate to decide to be silent on
all of these issues.

Therefore, I propose text along the following lines:

   JSON text is a sequence of Unicode codepoints. The transfer encoding
   used to represent the characters on-the-wire is beyond the scope of
   this document. It is therefore up to the specifications that
   reference this document to specify whether JSON messages will be
   transferred using UTF-8 (recommended), UTF-16 and/or UTF-32
   (discouraged), and whether preceding BOMs must be present, must not
   be present or are optional.

   If multiple encodings are permitted, implementers may choose to
   auto-detect a message's encoding by exploiting the fact that the
   first character of a JSON text must be in the ASCII character range
   and use the following table to deduce the active encoding:

      xx xx -- --  UTF-8
      xx 00 xx --  UTF-16LE
      xx 00 00 xx  UTF-16LE
      xx 00 00 00  UTF-32LE
      00 xx -- --  UTF-16BE
      00 00 -- --  UTF-32BE

I don't think that's a lot of text with which to describe the issues
here, and I'm sure Tim (or someone else) can make it even snappier and
more accurate.

Pete Cordell
Codalogic Ltd
C++ tools for C++ programmers, http://codalogic.com
Read & write XML in C++, http://www.xml2cpp.com

----- Original Message -----
From: "Paul Hoffman" <paul.hoffman@vpnc.org>
To: "JSON WG" <json@ietf.org>
Sent: Friday, November 22, 2013 10:36 PM
Subject: [Json] Wording on encoding; removing the table

> <hat on>
>
> Please note that the chairs tried to find some consensus in the BOM
> discussion and found none. Given that, and given that the current table is
> now wrong, our proposal is to remove it, not try to doctor it.
>
> Current Section 8.1:
>
>    JSON text SHALL be encoded in Unicode.  The default encoding is
>    UTF-8.
>
>    Since the first two characters of a JSON text will always be ASCII
>    characters [RFC0020], it is possible to determine whether an octet
>    stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
>    at the pattern of nulls in the first four octets.
>
>    00 00 00 xx  UTF-32BE
>    00 xx 00 xx  UTF-16BE
>    xx 00 00 00  UTF-32LE
>    xx 00 xx 00  UTF-16LE
>    xx xx xx xx  UTF-8
>
> Proposed replacement:
>
>    The default encoding for JSON transmitted over the Internet is UTF-8.
>    Transmitting JSON using other encodings may not be interoperable
>    unless the receiving system definitively knows the encoding.
>
> Does anyone have a technical objection to the proposed replacement? If so,
> please state the error and (hopefully) a correction.
>
> --Matt Miller and Paul Hoffman
>
> _______________________________________________
> json mailing list
> json@ietf.org
> https://www.ietf.org/mailman/listinfo/json
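
For readers following the thread, the null-pattern check in the quoted
"Current Section 8.1" table could be sketched roughly as below. This is
only an illustration, not text proposed by anyone in the discussion. The
function name detect_json_encoding is hypothetical, and the sketch
assumes there is no BOM, that at least four octets are available, and
that the first two characters are ASCII, which is exactly the premise
the chairs describe as no longer correct.

```python
def detect_json_encoding(data: bytes) -> str:
    """Guess an encoding from the null pattern in the first four octets.

    Hypothetical helper. Assumes no BOM and that the first two
    characters of the JSON text are ASCII, as in the quoted table.
    """
    if len(data) < 4:
        # Assumption: too short to inspect, so fall back to the default.
        return "utf-8"

    b0, b1, b2, b3 = data[:4]

    if b0 == 0 and b1 == 0 and b2 == 0 and b3 != 0:
        return "utf-32-be"   # 00 00 00 xx
    if b0 == 0 and b1 != 0 and b2 == 0 and b3 != 0:
        return "utf-16-be"   # 00 xx 00 xx
    if b0 != 0 and b1 == 0 and b2 == 0 and b3 == 0:
        return "utf-32-le"   # xx 00 00 00
    if b0 != 0 and b1 == 0 and b2 != 0 and b3 == 0:
        return "utf-16-le"   # xx 00 xx 00
    return "utf-8"           # xx xx xx xx (and anything else)


if __name__ == "__main__":
    text = '{"a": 1}'
    for name in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
        octets = text.encode(name)  # these codec names emit no BOM
        guessed = detect_json_encoding(octets)
        print(name, "->", guessed, octets.decode(guessed) == text)
```

A JSON text whose second character is not ASCII, or which starts with a
BOM, falls outside these assumptions and may be misclassified by this
sketch.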