Re: [Json] Call for Consensus: Proposed Text for "8.1 Character Encoding"

Carsten Bormann <cabo@tzi.org> Mon, 13 March 2017 21:12 UTC

Content-Type: text/plain; charset="utf-8"
Mime-Version: 1.0 (Mac OS X Mail 10.2 \(3259\))
From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <1fb5849e-8dbf-835d-65b7-2403686248f9@outer-planes.net>
Date: Mon, 13 Mar 2017 22:12:09 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <3B3F2181-6C5D-43C0-BCD9-8D4BA05E6C03@tzi.org>
References: <1fb5849e-8dbf-835d-65b7-2403686248f9@outer-planes.net>
To: Matthew Miller <linuxwolf+ietf@outer-planes.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/8eT6j9ffGvWO7YitYZtdmrwK2ho>
Cc: draft-ietf-jsonbis-rfc7159bis.all@ietf.org, "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] Call for Consensus: Proposed Text for "8.1 Character Encoding"
Precedence: list

For the record…

> On 13 Mar 2017, at 22:06, Matthew Miller <linuxwolf+ietf@outer-planes.net> wrote:
> 
> Hello JSONbis,
> 
> The security directorate review discussion has raised the issue of
> encoding detection.  The original table from RFC 4627 was removed from
> RFC 7159 due to a lack of consensus.  In this latest round, a number of
> comments have been made supporting (and against) adding
> more guidance than is currently present.
> 
> The chair asks for a call on the following from the working group:
> 
> 1) Does the working group think adding any text on how to detect the
> encoding is worthwhile?

No, that would be a regression into maintaining the fiction that UTF-16 and UTF-32 versions of JSON are being used in interchange.

> 2a) If such text is worthwhile, is the following proposed text from Nico
> Williams acceptable (to be appended to Section 8.1)?
> 
> """
>   Implementors MAY count the number of ASCII NULs in the first four
>   bytes of any JSON text to detect which of UTF-8, UTF-16, or UTF-32
>   the text is encoded in:
> 
>    - if the count is zero, then the text is encoded in UTF-8
>    - if the count is one or two, then the text is encoded in UTF-16
>    - if the count is three, then the text is encoded in UTF-32
> 
>   This results from a) JSON texts having to start with an ASCII
>   character, b) no unescaped NULs being allowed in JSON strings, and c)
>   any type being allowed at the top-level, thus the first character may
>   be a double-quote and the second may be any permissible, unescaped
>   Unicode codepoint.  An ASCII character requires a NUL-valued byte in
>   UTF-16 encoding, three in UTF-32, and none in UTF-8.
> 
> """

Not sure if I’m allowed to note this after saying no above, but not all JSON documents are at least four bytes long (for example, the text 1 is a complete one-byte JSON document).
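
To make the gap concrete, here is a minimal sketch of the quoted NUL-count rule in Python. The function name guess_encoding_by_nul_count is hypothetical, and the sketch assumes input without a byte order mark. It returns None where the rule as quoted gives no answer, which includes every text shorter than four bytes.

```python
from typing import Optional


def guess_encoding_by_nul_count(data: bytes) -> Optional[str]:
    """Apply the quoted NUL-count rule to the first four bytes of a JSON text.

    Hypothetical helper for illustration; assumes no byte order mark.
    """
    if len(data) < 4:
        # The rule is stated for "the first four bytes"; texts such as
        # b"1" or b'""' are too short for it to apply.
        return None
    nuls = data[:4].count(0)
    if nuls == 0:
        return "utf-8"
    if nuls in (1, 2):
        return "utf-16"
    if nuls == 3:
        return "utf-32"
    # Four NULs cannot start a JSON text in any of the three encodings.
    return None
```

With this sketch, '{"a":1}' encoded as UTF-16LE yields "utf-16", while the one-character text 1 yields None in every encoding except UTF-32.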

> 2b) If such text is worthwhile but Nico's proposal is not worthwhile,
> what would be acceptable?

Again, not worthwhile, but maybe it wouldn’t hurt to mention that implementations that want to guard against erroneously encoded input can detect ASCII NULs in the input. They can even use those to predict whether the encoder was using one of the UTF-16s or one of the UTF-32s.
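
A rough sketch of such a guard, in Python, might look like the following. It relies on two facts: a JSON text encoded in UTF-8 never contains a 0x00 byte (NUL has to be escaped inside strings and is not allowed elsewhere), and the first character of a JSON text is ASCII. The function name diagnose_nul_bytes and its return labels are hypothetical, and the sketch assumes there is no byte order mark.

```python
from typing import Optional


def diagnose_nul_bytes(data: bytes) -> Optional[str]:
    """Return None if the input has no NUL bytes, else a guess at the encoder.

    Hypothetical helper for illustration; assumes no byte order mark and
    that the first character of the JSON text is ASCII.
    """
    if 0 not in data:
        # No NUL bytes: nothing here contradicts UTF-8.
        return None
    head = data[:4]
    if len(head) == 4:
        # An ASCII character in UTF-32 carries three NUL bytes.
        if head[:3] == b"\x00\x00\x00" and head[3] != 0:
            return "utf-32-be"
        if head[0] != 0 and head[1:] == b"\x00\x00\x00":
            return "utf-32-le"
    if len(head) >= 2:
        # An ASCII character in UTF-16 carries one NUL byte.
        if head[0] == 0 and head[1] != 0:
            return "utf-16-be"
        if head[0] != 0 and head[1] == 0:
            return "utf-16-le"
    # NULs are present but match none of the expected leading patterns.
    return "unknown"
```

A UTF-8-only parser could call this before parsing and reject any input for which the result is not None, using the returned label only in the error message.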

Grüße, Carsten
