[Json] Complete section 3 proposal

"Joe Hildebrand (jhildebr)" <jhildebr@cisco.com> Tue, 18 June 2013 20:40 UTC

Combining the threads:

When JSON text is serialized to an octet stream and no external mechanism
specifies the encoding, it SHALL be encoded in one of the following
Unicode encoding schemes: UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, or
UTF-32LE.  The default and RECOMMENDED encoding is UTF-8.
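
To make the five schemes concrete, here is a minimal Python sketch that
serializes a value to JSON text in any of them. The helper name
`serialize` and the `CODECS` mapping are illustrative assumptions, not
part of the proposal. The codec names are Python's own, and the BE/LE
variants are used because they do not prepend a byte order mark.

```python
import json

# Map the encoding schemes named in the proposal to Python codec names.
# The explicit -be/-le codecs do not emit a byte order mark.
CODECS = {
    "UTF-8": "utf-8",
    "UTF-16BE": "utf-16-be",
    "UTF-16LE": "utf-16-le",
    "UTF-32BE": "utf-32-be",
    "UTF-32LE": "utf-32-le",
}


def serialize(value, scheme="UTF-8"):
    """Serialize a Python value to JSON text in the given encoding scheme.

    UTF-8 is the default, matching the RECOMMENDED encoding.
    """
    text = json.dumps(value)
    return text.encode(CODECS[scheme])
```

For example, `serialize([1], "UTF-16BE")` yields six octets, two per
character of the text `[1]`.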

Note: the MIME type registered in [Section 6] does not specify such an
external mechanism, so when used in a MIME context, one of the above
encoding schemes MUST be used.

Since the first code point of JSON text will always be an ASCII
character [RFC0020], it is possible to determine whether an octet stream
is UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, or UTF-32LE by looking at the
pattern of nulls in the first four octets of the stream.  In the following
table, "00" corresponds to an octet with value zero, "xx" corresponds to an
octet known to be non-zero, and "??" corresponds to an octet that is not
constrained (it may be zero or non-zero):

    00 00 00 xx  UTF-32BE
    00 xx ?? xx  UTF-16BE
    xx 00 00 00  UTF-32LE
    xx 00 xx ??  UTF-16LE
    xx xx ?? ??  UTF-8

Note: streams shorter than four octets cannot be UTF-32BE or UTF-32LE, and
streams shorter than two octets are UTF-8.
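
The detection rule above can be sketched in Python as follows. This is
one possible reading of the table, not a reference implementation. The
function name `detect_encoding` is an assumption. So is checking only
the first two octets of two- or three-octet streams, and so is raising
`ValueError` for a pattern that matches no row; the proposal does not
say what to do in either case.

```python
def detect_encoding(octets: bytes) -> str:
    """Guess the encoding scheme of JSON text from its leading octets."""
    head = octets[:4]

    # Streams shorter than two octets are UTF-8 (per the note).
    if len(head) < 2:
        return "UTF-8"

    # True where the octet is zero ("00"), False where non-zero ("xx").
    zero = [b == 0 for b in head]

    if len(head) < 4:
        # Assumption of this sketch: too short for UTF-32, so decide
        # using only the first two octets.
        if zero[0] and not zero[1]:
            return "UTF-16BE"
        if not zero[0] and zero[1]:
            return "UTF-16LE"
        if not zero[0] and not zero[1]:
            return "UTF-8"
        raise ValueError("octet pattern matches no row of the table")

    if zero == [True, True, True, False]:
        return "UTF-32BE"          # 00 00 00 xx
    if zero[0] and not zero[1] and not zero[3]:
        return "UTF-16BE"          # 00 xx ?? xx
    if zero == [False, True, True, True]:
        return "UTF-32LE"          # xx 00 00 00
    if not zero[0] and zero[1] and not zero[2]:
        return "UTF-16LE"          # xx 00 xx ??
    if not zero[0] and not zero[1]:
        return "UTF-8"             # xx xx ?? ??

    # Assumption of this sketch: reject patterns the table does not list.
    raise ValueError("octet pattern matches no row of the table")
```

Calling `detect_encoding('{"a":1}'.encode("utf-16-le"))` returns
"UTF-16LE", since the stream begins 7B 00 22 00 and matches the
"xx 00 xx ??" row.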

Other Unicode encoding schemes MAY be used, but such octet streams
cannot have their encoding scheme automatically detected as required by
the MIME type in [Section 6], and SHOULD NOT be assumed to interoperate
with existing software.

Joe Hildebrand