Re: [Json] Call for Consensus: Proposed Text for "8.1 Character Encoding"

Julian Reschke <julian.reschke@gmx.de> Tue, 28 March 2017 05:34 UTC

To: Tim Bray <tbray@textuality.com>, "Matthew A. Miller" <linuxwolf+ietf@outer-planes.net>
References: <1fb5849e-8dbf-835d-65b7-2403686248f9@outer-planes.net> <0E32A94D-CE12-4F52-9ED6-8743C49751B4@vpnc.org> <4d2f0fb3-a729-0c17-2394-bc1e005dd612@gmx.de> <d09f9a59-2411-45a0-470c-ea95072fe4fd@outer-planes.net> <dad91b19-e774-e239-36d2-9d086cca8e0d@gmx.de> <ac432615-ee84-3cdf-6b37-480626bd18c1@gmx.de> <804f9930-26a5-a565-0607-452b386cfeb5@outer-planes.net> <D89BCFAA-B81F-4EEB-8B3A-180BAAB9D16C@att.com> <e69d7c21-85cb-45f4-c0c2-34c624e63049@outer-planes.net> <14252631-AD76-4537-89BF-6368F4A8CDF4@att.com> <7e6af21f-16ea-a3bc-9c01-595ae8acebba@gmx.de> <05100401-88D4-4158-A3FF-3EF144D85449@att.com> <CAD2gp_T0bfpnsCA_t4BAMtEhr7p8JkZggjnY4F+m9-M2hWLfmw@mail.gmail.com> <1e94516c-9c82-8b0e-0d2d-7dbaa83b21bd@outer-planes.net> <40e3207f-e047-c898-1f0c-4422de1d597a@it.aoyama.ac.jp> <1b3ec14a-927a-8d46-e3d3-9807a9588437@outer-planes.net> <CAHBU6ivsq8+Z=MMkUH+=Q0uwc5NCtaJLYw5cp0Qg8eX2hQQ6sA@mail.gmail.com>
Cc: "json@ietf.org" <json@ietf.org>
From: Julian Reschke <julian.reschke@gmx.de>
Message-ID: <77cbcbb4-3976-2959-fb2c-3312c633208a@gmx.de>
Date: Tue, 28 Mar 2017 07:33:52 +0200
In-Reply-To: <CAHBU6ivsq8+Z=MMkUH+=Q0uwc5NCtaJLYw5cp0Qg8eX2hQQ6sA@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/Qo3ObxYp3ZE9t4jShhPE1c2Hs3c>
Subject: Re: [Json] Call for Consensus: Proposed Text for "8.1 Character Encoding"

On 2017-03-28 06:48, Tim Bray wrote:
> First of all, let me say that I’m delighted with, and fully support, the
> promotion of the status of UTF-8 in the JSON RFC to MUST.  I suspect
> this steps way outside the JSONbis charter, but that’s a problem for
> chairs and ADs, not yr humble editor.
>
> Comments on Matt's proposed text:
>
> 1. How about a very short historical note, along the lines of: “Previous
> specifications of JSON, including the predecessor RFCs, have not
> required the use of UTF-8 for use with the application/json media type.
> However, implementors of JSON-based software have overwhelmingly chosen
> to use the UTF-8 encoding, to the extent that it is the only realistic
> way to achieve interoperability in software which generates or consumes
> JSON.”
>
> ... moving on...

If we do this, we'll have to add it to the "changes from 7159" section.

> ...
>     Recipients that wish to support Unicode encodings other than UTF-8
>     can do this using a detection mechanism that is based on the fact
>     that the first character will always have a Unicode code point
>     greater than 0 and less than 128, thus the UTF-16/32 variants can
>     be detected by inspecting the first octets for nulls.
>
>
> 3. Is it just me, or does it feel really dorky to talk mysteriously
> about this detection mechanism without providing details?  On top of
> which, anyone who's writing the kind of software that might lead one to
> consult an RFC first shouldn't bloody well use anything but UTF-8.  If
> people really want to have this, I think we owe the world an outline of
> the algorithm, maybe in an appendix. I'll volunteer to make my best
> effort to draft it and try to get consensus that it's correct.  If we
> can't, that's a powerful symbol that we shouldn't have this language.
> But that's my fallback position; my real request to the group is that we
> just take this out.

That was proposed before; it seems some participants are opposed to
saying "too much" about the detection, for fear that spelling it out
would lead to it being implemented more widely than before.
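
For illustration only, the detection mechanism under discussion could
be sketched roughly as follows. This is a minimal sketch, not text
proposed for the draft. The function name guess_json_encoding is
hypothetical, and the code assumes that the input is a well-formed JSON
text without a byte order mark. The octet patterns follow the table in
RFC 4627, Section 3, adapted so that only the first character is
assumed to be ASCII.

```python
def guess_json_encoding(data: bytes) -> str:
    """Guess the Unicode encoding of a JSON text from its leading octets.

    Assumptions (not taken from any draft text): no byte order mark is
    present, and the first character has a code point greater than 0
    and less than 128, so null octets can only come from UTF-16/32.
    """
    if not data:
        raise ValueError("empty input is not a JSON text")

    head = data[:4]

    # UTF-32 needs a full four-octet code unit to be recognised.
    if len(head) == 4:
        if head[0] == 0 and head[1] == 0 and head[2] == 0 and head[3] != 0:
            return "utf-32-be"  # 00 00 00 xx
        if head[0] != 0 and head[1] == 0 and head[2] == 0 and head[3] == 0:
            return "utf-32-le"  # xx 00 00 00

    # UTF-16: exactly one null octet in the first two-octet code unit.
    if len(head) >= 2:
        if head[0] == 0 and head[1] != 0:
            return "utf-16-be"  # 00 xx
        if head[0] != 0 and head[1] == 0:
            return "utf-16-le"  # xx 00

    # No nulls in the first code unit: treat the text as UTF-8.
    if head[0] != 0:
        return "utf-8"

    raise ValueError("leading octets do not match any expected pattern")
```

A recipient would then decode with something like
data.decode(guess_json_encoding(data)) before parsing. Note that
"xx 00 00 00" is read as UTF-32LE rather than UTF-16LE because a
literal U+0000 cannot appear as the second character of a JSON text.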

Best regards, Julian