Re: [Json] Call for Consensus: Proposed Text for "8.1 Character Encoding"

Peter Cordell <petejson@codalogic.com> Tue, 14 March 2017 11:43 UTC

To: Carsten Bormann <cabo@tzi.org>, "json@ietf.org" <json@ietf.org>, draft-ietf-jsonbis-rfc7159bis.all@ietf.org
References: <1fb5849e-8dbf-835d-65b7-2403686248f9@outer-planes.net> <b3cb2651-2d9f-d68d-2191-814e8dd5f5e2@gmx.de> <1cae01bf-721c-1fe0-46c2-8e82b5a043a7@codalogic.com> <76b9f10c-9599-93af-546b-5769b83bdc9b@codalogic.com> <E8C84824-C801-40C3-A2F4-A31AC082EA8B@tzi.org>
From: Peter Cordell <petejson@codalogic.com>
Message-ID: <bd65a9cf-3f30-a2e7-d3ff-ce021aa6ffd5@codalogic.com>
Date: Tue, 14 Mar 2017 11:43:05 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/BfkXqlCVuiJDLrFXZbbjLKgUHbA>

On 14/03/2017 10:46, Carsten Bormann wrote:
>>    00 00        UTF-32BE
>>    00 xx        UTF-16BE
>>    xx 00 00 00  UTF-32LE
>>    xx 00 00 xx  UTF-16LE
>>    xx 00 xx     UTF-16LE
>>    xx xx        UTF-8
>>    xx EOF       UTF-8 (For Carsten's recent comment)
>
> Wow, people are still posting incorrect match tables :-)

Well, it was posted to show what a rat hole specifying detection is,
and to illustrate why we might not want to go there.

> I think that Julian (with my amendment maybe) nailed it, no need to make ever more wronger proposals here.

I think Julian's table missed the case of a string starting with a 
character like U+0100.  If that did turn out to be the route chosen, it 
seemed prudent to point out the omission earlier rather than later. 
(I'll resist an update with xx 00 EOF UTF-16LE.)

> (Again, in the age of unit testing, all this is much easier to get
> right for a coder than for a spec writer.)

Only if you don't miss cases to test!
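
To make the point concrete, here is a rough sketch that transcribes the match table quoted above into Python, trying the rows top to bottom. It is not a recommended detector (the table was called incorrect in this thread); it only shows how test cases expose gaps. The function name guess_encoding and the reading of "xx" as "any non-zero byte" are assumptions made for this illustration.

```python
def guess_encoding(data: bytes) -> str:
    """Apply the quoted match table, row by row, to the leading bytes.

    Assumption: "xx" in the table means any non-zero byte.
    """
    def zero(i: int) -> bool:
        # True only if byte i exists and is 0x00.
        return i < len(data) and data[i] == 0

    def nonzero(i: int) -> bool:
        # True only if byte i exists and is not 0x00.
        return i < len(data) and data[i] != 0

    if zero(0) and zero(1):
        return "UTF-32BE"            # 00 00
    if zero(0) and nonzero(1):
        return "UTF-16BE"            # 00 xx
    if nonzero(0) and zero(1) and zero(2) and zero(3):
        return "UTF-32LE"            # xx 00 00 00
    if nonzero(0) and zero(1) and zero(2) and nonzero(3):
        return "UTF-16LE"            # xx 00 00 xx
    if nonzero(0) and zero(1) and nonzero(2):
        return "UTF-16LE"            # xx 00 xx
    if nonzero(0) and nonzero(1):
        return "UTF-8"               # xx xx
    if nonzero(0) and len(data) == 1:
        return "UTF-8"               # xx EOF
    return "unknown"                 # no row matched


# A JSON string whose first character is U+0100, encoded as UTF-16LE,
# begins 22 00 00 01 and is caught by the "xx 00 00 xx" row.
print(guess_encoding('"\u0100"'.encode("utf-16-le")))

# A one-character JSON text in UTF-16LE is 31 00, the "xx 00 EOF" case
# that the table does not list, so no row matches.
print(guess_encoding("1".encode("utf-16-le")))
```

The second call falls through every row, which is exactly the kind of case that is easy to leave out of a test suite as well as a table.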

Pete Cordell
Codalogic Ltd
Rules for Describing JSON Content, http://json-content-rules.org