[Json] Upgraded Test. Re: [jose] EcmaScript V6 - numbers

Anders Rundgren <anders.rundgren.net@gmail.com> Mon, 02 November 2015 10:04 UTC

To: "Manger, James" <James.H.Manger@team.telstra.com>, "jose@ietf.org" <jose@ietf.org>, "json@ietf.org" <json@ietf.org>
References: <255B9BB34FB7D647A506DC292726F6E13BB16C9857@WSMSG3153V.srv.dir.telstra.com>
From: Anders Rundgren <anders.rundgren.net@gmail.com>
Message-ID: <56373538.1000105@gmail.com>
Date: Mon, 02 Nov 2015 11:04:40 +0100
Archived-At: <http://mailarchive.ietf.org/arch/msg/json/pjwA6khtJaA5eTMb6nrlySlN0QM>

Hi James,

It seems that Java doesn't care about the exact representation in edge cases.

http://webpki.org/ietf/es6numbertest.html

From an interoperability point of view, a system that provides between 15 and 17 correct digits wouldn't be technically (too) broken by using only 15 digits externally.
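
A minimal sketch of that trade-off, assuming Java's BigDecimal for the exact decimal expansion. The names FifteenDigits and round15 are made up for illustration and are not taken from the test page. It rounds 0.3 and its two neighbouring doubles to 15 significant digits:

```java
import java.math.BigDecimal;
import java.math.MathContext;

final class FifteenDigits {
    // Round the exact binary value of a double to 15 significant decimal digits.
    // (Hypothetical helper, for illustration only.)
    static BigDecimal round15(double d) {
        return new BigDecimal(d).round(new MathContext(15));
    }

    public static void main(String[] args) {
        double mid = 0.3;
        double lo = Math.nextDown(mid); // 0x1.3333333333332p-2
        double hi = Math.nextUp(mid);   // 0x1.3333333333334p-2
        for (double d : new double[] {lo, mid, hi}) {
            // Print the hex form next to the 15-digit decimal form.
            System.out.println(Double.toHexString(d) + " -> " + round15(d).toPlainString());
        }
    }
}
```

All three doubles round to the same 15-digit value, so a 15-digit external form is stable when parsed and serialized again, at the price of not distinguishing adjacent doubles.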

I'm worried that this concern will kill an otherwise pretty good idea.
However, it might suffice to use RECOMMENDED, although it feels like a cheap trick to fool reviewers.
I wouldn't hesitate to use REQUIRED.

When/if TC 39 comes out with a fully deterministic representation of numbers, the spec can be updated.

I incorporated your formatter in the test.  It needs some fixes, right?
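
One likely reason for fixes, sketched under assumptions: in Java, "%f" without an explicit precision prints six decimals, so the fixed-notation branch of the quoted toJsonString can drop digits. The names FormatterCheck, fixedBranch and roundTrips are hypothetical, Locale.ROOT is an addition to keep "." as the decimal separator, and this is not the code used on the test page.

```java
import java.util.Locale;

final class FormatterCheck {
    // Fixed-notation branch of the quoted toJsonString, with an explicit locale.
    // "%f" defaults to six decimals; trailing zeros are then stripped.
    static String fixedBranch(double d) {
        return String.format(Locale.ROOT, "%f", d).replaceFirst("\\.?0++$", "");
    }

    // True when the text parses back to exactly the same 64-bit double.
    static boolean roundTrips(double d, String text) {
        return Double.doubleToLongBits(Double.parseDouble(text)) == Double.doubleToLongBits(d);
    }

    public static void main(String[] args) {
        double above = Math.nextUp(0.3); // the double just above 0.3
        String text = fixedBranch(above);
        // The six-decimal output loses the digits that separate this value from 0.3.
        System.out.println(text + " round-trips: " + roundTrips(above, text));
    }
}
```

A value such as 0.5 survives this branch unchanged, which is why the problem only shows up for doubles that need more than six decimals.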

Anders


On 2015-11-02 03:32, Manger, James wrote:
> Hi Anders,
>
> For floating point numbers in JSON, I am not certain that removing digits beyond 15 guarantees a canonical form. Most (64-bit) doubles might have 15.95 digits of precision, but what about the range of "denormal" doubles that have less precision?
>
> My concern wasn't so much that the last digit might vary, but whether only some implementations would round to shorter forms. For example, consider three successive 64-bit doubles near 0.3. In hex with a base 2 exponent the values are (java.lang.Double#toHexString(double)):
> 0x1.3333333333332p-2
> 0x1.3333333333333p-2
> 0x1.3333333333334p-2
> The exact decimal values for these are:
> 0.29999999999999993338661852249060757458209991455078125
> 0.299999999999999988897769753748434595763683319091796875
> 0.3000000000000000444089209850062616169452667236328125
> Canonical JSON forms need to be:
> 0.29999999999999993
> 0.3
> 0.30000000000000004
> which have 17, 1, and 17 significant digits; but might some implementations use 17 digits for the middle one as well?
> 0.29999999999999999
>
> I now think that specifying a canonical form as the "shortest correct representation" can work; it does give a unique string for each 64-bit double (it gives 0.3 above). ECMAScript "ToString Applied to the Number Type" is not quite phrased in this way, but it might be equivalent (with the recommended "accurate conversion" version).
>
> I am confident that V8 (ECMAScript in Chrome) produces this "shortest correct representation". See DTOA_SHORTEST in https://chromium.googlesource.com/v8/v8.git/+/master/src/dtoa.h. If a few others implement this form as well, then it looks like the best way to define a canonical form for 64-bit doubles.
>
> Hopefully, interoperability with non-ECMAScript languages can be simpler than your es6JsonNumberSerialization(double) function. Ideally the following should work: use ECMAScript's choice of when to include or omit an exponent, and omit trailing zeros (and use lower-case). [may need to add a precision and locale]
>
> 	static String toJsonString(double d)
> 	{
> 		double ad = Math.abs(d);
> 		if (1e-6d <= ad && ad < 1e21d) return String.format("%f", d).replaceFirst("\\.?0++$", "");
> 		else if (ad == 0) return "0";
> 		else return String.format("%g", d).replaceFirst("\\.?0++e", "e");
> 	}
>
> Anders,
> Do you still think "the textual representation of numbers MUST be preserved during parsing"?
> I would prefer to drop that and require serialization to use the "shortest correct representation" + specify when to omit an exponent.
>
> --
> James Manger
>
>
>
> -----Original Message-----
> From: jose [mailto:jose-bounces@ietf.org] On Behalf Of Anders Rundgren
> Sent: Monday, 26 October 2015 4:15 PM
> To: Manger, James <James.H.Manger@team.telstra.com>; json@ietf.org; jose@ietf.org
> Subject: Re: [jose] EcmaScript V6 - Defined Property Order
>
> On 2015-10-26 00:10, Manger, James wrote:
>> Hi Anders,
>>
>> I agree that the EcmaScript string format for numbers is a better basis for a canonical JSON format than, say, normalized scientific notation - particularly for the dominant case of integers less than 2^64. However, EcmaScript's ToString(number) doesn't quite give a canonical form. 7.1.12.1 step 5 says "the least significant digit of s is not necessarily uniquely determined by these criteria". EcmaScript guarantees that ToNumber(ToString(x)) gives the same number x, but that is not quite what we need for signing. We need ToString(ToNumber(s)) to give the same string. I guess you could sign the 8 bytes of a 64-bit float, instead of the JSON decimal digits.
>
> Hi James,
> Thanks for pointing this out; it is apparently always a very good idea to test concepts with other knowledgeable people before you actually start building something :-)
>
> I guess the ES committee wasn't entirely happy about having to adjust their spec. due to improper reliance on JavaScript property order by parts of the development community.  But they probably did the right thing.
>
> I'm thinking in a similar way.  Why let an edge-case spoil all the fun?  Maybe the ES6 vendors implement the same broken ToString algorithm or the improved version mentioned as a note after the section you referred to?  I won't research this issue now because I consider Ecma the sole "owner" of this problem :-)
>
> So this is my (latest) suggestion for an upgraded in-object JSON clear-text signature specification:
>
>       "Due to limitations in the EcmaScript V6 [ECMA-262] specification regarding
>        the ToString(number) method, it is for interoperability reasons RECOMMENDED
>        to utilize a maximum of 18 digits of precision for non-integer Numbers."
>
> It sure isn't pretty but since "business messaging" can't even use JSON/ES numbers for expressing monetary amounts, it is hardly a show-stopper.
>
> Anders Rundgren
>
>
>>
>> James Manger
>>
>> -----Original Message-----
>> From: jose [mailto:jose-bounces@ietf.org] On Behalf Of Anders Rundgren
>> Sent: Monday, 26 October 2015 2:33 AM
>> To: jose@ietf.org; json@ietf.org
>> Subject: Re: [jose] EcmaScript V6 - Defined Property Order
>>
>> Since the ES6 Number type is 64-bit IEEE, there's no need to worry about number canonicalization either if you base the signature system on ES6 which seems like a pretty safe bet.
>>
>> http://www.ecma-international.org/ecma-262/6.0/index.html#sec-tostring-applied-to-the-number-type
>>
>> That is, AFAICT, clear-text in-object JSON signatures are already compatible with ES6 (and I must drop my "number preservation" stuff...).
>>
>> Folks working with constrained devices will probably settle for CBOR.
>>
>> On 2015-10-25 10:08, Anders Rundgren wrote:
>>> http://www.ecma-international.org/ecma-262/6.0/index.html#sec-ordinary-object-internal-methods-and-internal-slots-ownpropertykeys
>>>
>>> I can't say I'm able to "decipher" the ES6 specification, but it seems that the largest base of JSON parsers (the browsers) is now compliant with in-object JSON clear-text signature schemes of the kind I have proposed (pushing maybe...), albeit with some (IMO for practical purposes insignificant) limitations:
>>>
>>> - Integer property names don't work.
>>> - Numeric values would have to be normalized.
>>>
>>> Java, Python, and C# already manage this as well.
>>>
>>> Yay!
>>>
>>> Anders
>>>
>> _______________________________________________
>> jose mailing list
>> jose@ietf.org
>> https://www.ietf.org/mailman/listinfo/jose