Re: [Json] Proposed change: update the Unicode version

Carsten Bormann <cabo@tzi.org> Wed, 05 June 2013 18:43 UTC

To: Tim Bray <tbray@textuality.com>
Cc: "Matt Miller (mamille2)" <mamille2@cisco.com>, "json@ietf.org" <json@ietf.org>

On Jun 5, 2013, at 18:56, Tim Bray <tbray@textuality.com> wrote:

>> It would help me if you could briefly explain what the reference to 2.7 specifically adds here.
>> (To me that is a bit confusing, as 2.7 is about internal programming language representation, not about representation for interchange.  But maybe I just don't understand.)
> 
> Hm? The first paragraph says “A Unicode string data type is simply an ordered sequence of code units. Thus a Unicode 8-bit string is an ordered sequence of 8-bit code units, a Unicode 16-bit string is an ordered sequence of 16-bit code units, and a Unicode 32-bit string is an ordered sequence of 32-bit code units.”

If you were trying to answer my question, I must admit that I am no further along in understanding why the reference is to 2.7.

RFC 4627 Section 3 is about encoding JSON on the wire (correct me if that impression is wrong).  (Actually, it doesn't say that much; the meat is in Section 6, which, with the title change, raises the question of whether application/json and the JSON format are the same thing.  But I digress.)  Unicode Section 2.7 is most emphatically NOT about Unicode on the wire; it is about data types in programming languages.  Much of it is about the problems of processing incomplete UTF-16 strings (unpaired surrogates) in programming languages and their libraries.
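
To illustrate the gap between an in-memory string type and interchange, here is a minimal sketch, assuming Python 3.  Python's str is a sequence of code points rather than 16-bit code units, so it is only an analogy to the 16-bit strings of Section 2.7, but like them it can hold an unpaired surrogate that has no well-formed wire encoding.  The helper name is_interchangeable is hypothetical.

```python
def is_interchangeable(s: str) -> bool:
    """Return True if s can be serialized as well-formed UTF-8."""
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        # Python refuses to encode surrogate code points as UTF-8.
        return False


well_formed = "\U0001F600"  # one supplementary-plane character
unpaired = "\ud83d"         # a high surrogate with no low surrogate after it

print(is_interchangeable(well_formed))
print(is_interchangeable(unpaired))
```

The second string is a perfectly legal value of the programming-language data type, yet it cannot be put on the wire in a Unicode encoding scheme without some extra convention.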

The section of the Unicode standard about encoding is 2.6.  It defines seven encoding schemes, some of which differ mainly in their treatment of BOMs.  BOMs are not allowed by the JSON grammar, but it is not entirely obvious that this restriction applies to the sequence of characters obtained after decoding the Unicode encoding scheme.
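
The BOM question can be sketched as follows, assuming Python 3 and its standard codecs and json modules; the helper name parses is hypothetical.  The "utf-16" codec follows the UTF-16 scheme and consumes a leading BOM, while "utf-16-le" follows the UTF-16LE scheme, where the same two bytes decode to the character U+FEFF and reach the JSON parser.

```python
import codecs
import json

doc = '{"a": 1}'

# UTF-16 scheme: the encoder writes a BOM and the decoder consumes it.
via_utf16 = doc.encode("utf-16").decode("utf-16")

# UTF-16LE scheme: no BOM is defined, so a leading FF FE is decoded
# as the character U+FEFF and stays in the character sequence.
via_utf16le = (codecs.BOM_UTF16_LE + doc.encode("utf-16-le")).decode("utf-16-le")


def parses(text: str) -> bool:
    """Return True if Python's json module accepts text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses(via_utf16))    # BOM already removed by the decoder
print(parses(via_utf16le))  # U+FEFF is still the first character
```

This shows only how one implementation behaves; whether the JSON grammar's exclusion of a BOM is meant to apply before or after decoding the encoding scheme is exactly the open point.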

Now Douglas says:

> I think the section on encoding is not saying anything useful and should be completely removed.

Works for me (and makes this discussion moot).

Regards, Carsten
