Re: [Json] Call for Consensus: Proposed Text for "8.1 Character Encoding"

Carsten Bormann <cabo@tzi.org> Mon, 17 April 2017 20:47 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <20170417175627.GK23461@localhost>
Date: Mon, 17 Apr 2017 22:47:25 +0200
Cc: "json@ietf.org" <json@ietf.org>
Message-Id: <10B651F1-7FE0-484D-BD2E-FD146BC5FB04@tzi.org>
References: <e69d7c21-85cb-45f4-c0c2-34c624e63049@outer-planes.net> <14252631-AD76-4537-89BF-6368F4A8CDF4@att.com> <7e6af21f-16ea-a3bc-9c01-595ae8acebba@gmx.de> <05100401-88D4-4158-A3FF-3EF144D85449@att.com> <CAD2gp_T0bfpnsCA_t4BAMtEhr7p8JkZggjnY4F+m9-M2hWLfmw@mail.gmail.com> <1e94516c-9c82-8b0e-0d2d-7dbaa83b21bd@outer-planes.net> <40e3207f-e047-c898-1f0c-4422de1d597a@it.aoyama.ac.jp> <1b3ec14a-927a-8d46-e3d3-9807a9588437@outer-planes.net> <CAHBU6ivsq8+Z=MMkUH+=Q0uwc5NCtaJLYw5cp0Qg8eX2hQQ6sA@mail.gmail.com> <b74cb31b-8e04-17d0-548a-fc164ce07c05@outer-planes.net> <20170417175627.GK23461@localhost>
To: "Matthew A. Miller" <linuxwolf+ietf@outer-planes.net>
X-Mailer: Apple Mail (2.3273)
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/0vzh6mShjnS5_DwpQ_qKGxE5wm0>
Subject: Re: [Json] Call for Consensus: Proposed Text for "8.1 Character Encoding"

On Apr 17, 2017, at 19:56, Nico Williams <nico@cryptonector.com> wrote:
> 
>> Thinking about this more, putting an encoding detection algorithm as an
>> appendix seems like a reasonable compromise to me.  To start, how about
>> removing the detection text from Section 8.1 and have an appendix that
>> starts with that text plus the table?
> 
> Or we could even just assert that such an algorithm is possible, and
> that implementors MAY implement one.

Indeed.

Broken record mode:

- writing up the algorithm sounds like encouraging implementation.
  We *don’t* want people to implement this!
  (The whole interminable non-UTF-8 saga was probably just a nod from the RFC 4627 authors to the remnants of UTF-16 land, which have mostly died off since.  Why resurrect?)

- there have been about 15 attempts to define this algorithm on the mailing list.
  All were wrong.
  An Internet Standard should contain tried and true material, not errata fodder.

- an implementer is in a much better position to get this right than the standard, because they can write unit tests.

> 
>> Assuming the above, what does everyone think of the following for
>> Section 8.1?
> 
> +1.

+1

Grüße, Carsten