Re: [Json] Call for Consensus: Proposed Text for "8.1 Character Encoding"

"Matthew A. Miller" <> Mon, 27 March 2017 20:04 UTC

Hello JSONBis,

I believe I see consensus for:

* MUST encode as UTF-8 where the media type is 'application/json'.
* SHOULD encode as UTF-8 for all (other) usages.

In an attempt to expedite things, here is a proposal for text that I
think matches the consensus:

JSON text SHOULD be encoded in UTF-8 (Section 3 of [UNICODE]); JSON
text MAY be encoded in UTF-16 or UTF-32 if the generator is certain
the intended recipients can process it. JSON text MUST NOT be encoded
in any encoding other than UTF-8, UTF-16, or UTF-32. When used with
media type "application/json" the JSON text MUST be encoded as UTF-8.

Implementations MUST NOT add a byte order mark (U+FEFF) to the
beginning of a JSON text.  In the interests of interoperability,
implementations that parse JSON texts MAY ignore the presence of a
byte order mark rather than treating it as an error.

Recipients that wish to support Unicode encodings other than UTF-8
can do so with a detection mechanism based on the fact that the
first character of a JSON text always has a Unicode code point
greater than 0 and less than 128; the UTF-16 and UTF-32 variants can
therefore be detected by inspecting the first octets for nulls.
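
For illustration only (this is not part of the proposed text), here is a
minimal Python sketch of how a recipient might apply the null-octet
detection described above, and how it might ignore a leading byte order
mark instead of rejecting it. The names detect_encoding and
decode_json_text are hypothetical, and the sketch assumes the whole JSON
text is already in memory as bytes.

```python
import codecs
import json

# Byte order marks a tolerant parser might skip. The UTF-32 marks are
# checked before the UTF-16 ones because BOM_UTF32_LE begins with the
# same two octets as BOM_UTF16_LE.
_BOMS = (
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
)


def detect_encoding(octets: bytes) -> str:
    """Guess the encoding of a BOM-less JSON text from its leading nulls.

    Relies on the first character having a code point between 1 and 127,
    so any null among the first octets indicates UTF-16 or UTF-32.
    """
    head = octets[:4]
    if len(head) == 4:
        if head[0] == 0 and head[1] == 0 and head[2] == 0 and head[3] != 0:
            return "utf-32-be"  # 00 00 00 xx
        if head[0] != 0 and head[1] == 0 and head[2] == 0 and head[3] == 0:
            return "utf-32-le"  # xx 00 00 00
    if len(head) >= 2:
        if head[0] == 0 and head[1] != 0:
            return "utf-16-be"  # 00 xx
        if head[0] != 0 and head[1] == 0:
            return "utf-16-le"  # xx 00
    return "utf-8"  # no nulls in the leading octets


def decode_json_text(octets: bytes) -> str:
    """Decode a JSON text, skipping a leading byte order mark if present."""
    for bom, encoding in _BOMS:
        if octets.startswith(bom):
            return octets[len(bom):].decode(encoding)
    return octets.decode(detect_encoding(octets))


if __name__ == "__main__":
    sample = '{"a": 1}'.encode("utf-16-le")
    print(detect_encoding(sample))
    print(json.loads(decode_json_text(sample)))
```

A recipient that only accepts "application/json" would not need any of
this, since the proposed text requires UTF-8 there; the sketch applies
only to recipients that choose to accept the other encodings.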

- m&m

Matthew A. Miller
JSONBis Chair