Re: [Json] Call for Consensus: Proposed Text for "8.1 Character Encoding"

"HANSEN, TONY L" <tony@att.com> Tue, 21 March 2017 04:20 UTC

From: "HANSEN, TONY L" <tony@att.com>
To: "json@ietf.org" <json@ietf.org>
CC: The IESG <iesg@ietf.org>
Date: Tue, 21 Mar 2017 04:20:02 +0000
Message-ID: <D89BCFAA-B81F-4EEB-8B3A-180BAAB9D16C@att.com>
References: <1fb5849e-8dbf-835d-65b7-2403686248f9@outer-planes.net> <0E32A94D-CE12-4F52-9ED6-8743C49751B4@vpnc.org> <4d2f0fb3-a729-0c17-2394-bc1e005dd612@gmx.de> <d09f9a59-2411-45a0-470c-ea95072fe4fd@outer-planes.net> <dad91b19-e774-e239-36d2-9d086cca8e0d@gmx.de> <ac432615-ee84-3cdf-6b37-480626bd18c1@gmx.de> <804f9930-26a5-a565-0607-452b386cfeb5@outer-planes.net>
In-Reply-To: <804f9930-26a5-a565-0607-452b386cfeb5@outer-planes.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/Dmoqnjg7cRJJGm2ZoUKUgkFdmXo>
Subject: Re: [Json] Call for Consensus: Proposed Text for "8.1 Character Encoding"

I like most of what Matt has below. However, I would prefer that the statement about inspecting the first octets for nulls be more explicit. I think it was Julian who posted a nice little chart the other day showing how to determine which UTF-16/32 variant is present based on the null pattern. If you don’t want it in the main text, then at least put it in an appendix. Otherwise, the code will be re-created many times, and probably often incorrectly.
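
To illustrate how small such a chart is, here is a minimal Python sketch of null-pattern detection. It is not the chart posted on the list; it follows the pattern table in Section 3 of RFC 4627, reduced to the first two octets for UTF-16 so that a top-level string whose second character is non-ASCII is still classified correctly. The function name detect_json_encoding is hypothetical, and the sketch assumes the input carries no byte order mark and begins with an ASCII character, as the proposed text states.

```python
def detect_json_encoding(data: bytes) -> str:
    """Guess the encoding of a BOM-less JSON text from its leading nulls.

    Pattern of the first octets (xx = non-zero octet):
        00 00 00 xx   UTF-32BE
        xx 00 00 00   UTF-32LE
        00 xx         UTF-16BE
        xx 00         UTF-16LE
        otherwise     UTF-8
    """
    head = data[:4]
    # UTF-32 must be checked first: its patterns also start like UTF-16.
    if len(head) == 4:
        if head[:3] == b"\x00\x00\x00" and head[3] != 0:
            return "utf-32-be"
        if head[0] != 0 and head[1:] == b"\x00\x00\x00":
            return "utf-32-le"
    if len(head) >= 2:
        if head[0] == 0 and head[1] != 0:
            return "utf-16-be"
        if head[0] != 0 and head[1] == 0:
            return "utf-16-le"
    # No leading nulls (or too little data to tell): assume the default.
    return "utf-8"
```

A recipient could then decode with something like data.decode(detect_json_encoding(data)) before handing the text to its parser. The returned names are Python codec names chosen for that purpose; another implementation would return whatever labels suit it.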

	Tony Hansen

On 3/20/17, 12:26 PM, "json on behalf of Matthew A. Miller" <json-bounces@ietf.org on behalf of linuxwolf+ietf@outer-planes.net> wrote:

    Thank you for the suggested changes, Julian.  To consolidate the
    changes, I believe the following is your suggested text for all of
    Section 8.1:
    
    """
    JSON text MUST be encoded in UTF-8, UTF-16, or UTF-32 (Section 3 of
    [UNICODE]).  The default encoding is UTF-8, and JSON texts that are
    encoded in UTF-8 are interoperable in the sense that they will be
    read successfully by the maximum number of implementations; there are
    many implementations that cannot successfully read texts in other
    encodings (such as UTF-16 and UTF-32).  Text encoded in character
    encodings other than UTF-8, UTF-16, or UTF-32 cannot be used with
    the media type "application/json".
    
    Implementations MUST NOT add a byte order mark (U+FEFF) to the
    beginning of a JSON text.  In the interests of interoperability,
    implementations that parse JSON texts MAY ignore the presence of a
    byte order mark rather than treating it as an error.
    
    Recipients that wish to support Unicode encodings other than UTF-8
    can do this using a detection mechanism that is based on the fact
    that the first character will always have a Unicode code point less
    than or equal to 127, thus the UTF-16/32 variants can be detected by
    inspecting the first octets for nulls.
    """
    
    Does the working group object to this change?
    
    
    - m&m
    
    Matthew A. Miller
    JSONbis Chair
    
    On 17/03/19 04:29, Julian Reschke wrote:
    > ...and here is a concrete proposal:
    > 
    > Original text:
    > 
    >> 8.1.  Character Encoding
    >>
    >>    JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32 [UNICODE]
    >>    (Section 3).  The default encoding is UTF-8, and JSON texts that are
    > 
    > Change 1:
    > 
    > Say "MUST" instead of "SHALL", as it's the more common form of
    > expressing this requirement.
    > 
    > Change 2:
    > 
    > Replace "[UNICODE] (Section 3)" by "Section 3 of [UNICODE]".
    > 
    > That said, this citation isn't as stable as it should be, as [UNICODE]
    > refers to <http://www.unicode.org/versions/latest/> and unless I'm
    > missing something, there's no guarantee that future versions will have
    > the relevant bits in Section 3.
    > 
    >>    encoded in UTF-8 are interoperable in the sense that they will be
    >>    read successfully by the maximum number of implementations; there are
    >>    many implementations that cannot successfully read texts in other
    >>    encodings (such as UTF-16 and UTF-32).
    > 
    > Change 3:
    > 
    > Add "Text encoded in character encodings other than UTF-8, UTF-16, or
    > UTF-32 cannot be used with the media type "application/json"."
    > 
    > (this explains the implications of the SHALL/MUST)
    > 
    > 
    >>    Implementations MUST NOT add a byte order mark (U+FEFF) to the
    >>    beginning of a JSON text.  In the interests of interoperability,
    >>    implementations that parse JSON texts MAY ignore the presence of a
    >>    byte order mark rather than treating it as an error.
    > 
    > 
    > Finally, change 4:
    > 
    > Add a new paragraph:
    > 
    > "Recipients that wish to support Unicode encodings other than UTF-8 can
    > do this using a detection mechanism that is based on the fact that the
    > first character will always have a Unicode code point less than or equal to
    > 127, thus the UTF-16/32 variants can be detected by inspecting the first
    > octets for nulls."
    > 
    > 
    > I believe none of these changes affects anything normative, but that
    > they absolutely clarify the spec. In particular, having them in the spec
    > would have avoided this whole discussion we just had.
    > 
    > Best regards, Julian
