Re: [Json] [jose] EcmaScript V6 - numbers

"Manger, James" <James.H.Manger@team.telstra.com> Mon, 02 November 2015 02:32 UTC

From: "Manger, James" <James.H.Manger@team.telstra.com>
To: Anders Rundgren <anders.rundgren.net@gmail.com>, "json@ietf.org" <json@ietf.org>, "jose@ietf.org" <jose@ietf.org>
Date: Mon, 02 Nov 2015 13:32:08 +1100
Message-ID: <255B9BB34FB7D647A506DC292726F6E13BB16C9857@WSMSG3153V.srv.dir.telstra.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/json/AaxFr0Fg2G4gcMyuEk8bErWVR7w>
Subject: Re: [Json] [jose] EcmaScript V6 - numbers

Hi Anders,

For floating point numbers in JSON, I am not certain that removing digits beyond 15 guarantees a canonical form. Most (64-bit) doubles might have 15.95 digits of precision, but what about the range of "denormal" doubles that have less precision?
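A small Java sketch of why the subnormal range is a concern, using only the standard library (the class and method names are illustrative, not from any proposal in this thread). For the smallest subnormal double, decimal strings of quite different lengths all parse back to the same value, so a fixed digit count does not by itself select one spelling:

```java
import java.math.BigDecimal;
import java.math.MathContext;

class SubnormalDigits {
    // Exact decimal value of d, rounded to the given number of significant digits.
    static String roundTo(double d, int digits) {
        return new BigDecimal(d).round(new MathContext(digits)).toString();
    }

    // True if the decimal text parses back to exactly the double d.
    static boolean parsesTo(String text, double d) {
        return Double.parseDouble(text) == d;
    }

    public static void main(String[] args) {
        double min = Double.MIN_VALUE; // smallest positive subnormal, 2^-1074
        // A 1-digit, a 2-digit and a 15-digit spelling of the same double.
        for (String s : new String[] {"5e-324", "4.9e-324", roundTo(min, 15)}) {
            System.out.println(s + " -> " + parsesTo(s, min));
        }
    }
}
```

Each of the three strings should report true, which is the ambiguity a canonical form has to remove.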

My concern wasn't so much that the last digit might vary, but whether only some implementations would round to shorter forms. For example, consider three successive 64-bit doubles near 0.3. In hex with a base 2 exponent the values are (java.lang.Double#toHexString(double)):
0x1.3333333333332p-2
0x1.3333333333333p-2
0x1.3333333333334p-2
The exact decimal values for these are:
0.29999999999999993338661852249060757458209991455078125
0.299999999999999988897769753748434595763683319091796875
0.3000000000000000444089209850062616169452667236328125
Canonical JSON forms need to be:
0.29999999999999993
0.3
0.30000000000000004
which have 17, 1, and 17 significant digits; but might some implementations use 17 digits for the middle one as well?
0.29999999999999999
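
The values above can be reproduced with a short Java sketch (the class name is illustrative). It uses Math.nextDown and Math.nextUp to step to the neighbouring doubles and the BigDecimal(double) constructor to obtain the exact decimal expansion:

```java
import java.math.BigDecimal;

class NeighbourDoubles {
    // Exact decimal expansion of a double; every finite double has one.
    static String exact(double d) {
        return new BigDecimal(d).toString();
    }

    public static void main(String[] args) {
        double mid = 0.3;
        for (double d : new double[] {Math.nextDown(mid), mid, Math.nextUp(mid)}) {
            System.out.println(Double.toHexString(d) + "  " + exact(d));
        }
        // The 17-digit spelling of the middle value parses to the same double as "0.3".
        System.out.println(Double.parseDouble("0.29999999999999999") == 0.3);
    }
}
```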

I now think that specifying a canonical form as the "shortest correct representation" can work; it does give a unique string for each 64-bit double (it gives 0.3 above). ECMAScript "ToString Applied to the Number Type" is not quite phrased in this way, but it might be equivalent (with the recommended "accurate conversion" version).
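
To illustrate what "shortest correct representation" means, here is a brute-force Java sketch. It is not the algorithm V8 or any other engine uses, and it ignores the boundary and tie-break cases a production algorithm must handle. It tries 1 to 17 significant digits and keeps the first correctly rounded decimal that parses back to the same double. The class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

class ShortestDigits {
    // Fewest significant digits whose correctly rounded decimal parses back to d.
    // Assumes d is finite and non-zero, and that Double.parseDouble rounds correctly.
    static BigDecimal shortest(double d) {
        BigDecimal exact = new BigDecimal(d);
        for (int digits = 1; digits < 17; digits++) {
            BigDecimal candidate = exact.round(new MathContext(digits, RoundingMode.HALF_EVEN));
            if (Double.parseDouble(candidate.toString()) == d) {
                return candidate.stripTrailingZeros();
            }
        }
        // 17 significant digits are always enough to identify a 64-bit double.
        return exact.round(new MathContext(17, RoundingMode.HALF_EVEN)).stripTrailingZeros();
    }

    // Plain decimal text, for values in the range where no exponent is wanted.
    static String shortestPlain(double d) {
        return shortest(d).toPlainString();
    }

    public static void main(String[] args) {
        for (double d : new double[] {Math.nextDown(0.3), 0.3, Math.nextUp(0.3)}) {
            System.out.println(shortestPlain(d));
        }
    }
}
```

With this definition, parsing any spelling of a double and re-serialising it yields the same string, which is the ToString(ToNumber(s)) stability a signature scheme needs.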

I am confident that V8 (ECMAScript in Chrome) produces this "shortest correct representation". See DTOA_SHORTEST in https://chromium.googlesource.com/v8/v8.git/+/master/src/dtoa.h. If a few other implementations produce this form as well, then it looks like the best way to define a canonical form for 64-bit doubles.

Hopefully, interoperability with non-ECMAScript languages can be simpler than your es6JsonNumberSerialization(double) function. Ideally the following should work: use ECMAScript's choice of when to include or omit an exponent, and omit trailing zeros (and use lower-case). [may need to add a precision and locale]

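	static String toJsonString(double d)
	{
		// Assumes d is finite (JSON has no NaN or Infinity) and that Double.toString
		// yields the shortest round-trip digits (true of recent JDKs, not all older ones).
		double ad = Math.abs(d);
		if (ad == 0) return "0";
		String s = Double.toString(d);
		// Plain decimal range used by ECMAScript: expand any exponent, drop trailing zeros.
		if (1e-6d <= ad && ad < 1e21d) return new java.math.BigDecimal(s).stripTrailingZeros().toPlainString();
		// Exponent form: "1.0E21" -> "1e+21", "1.5E-7" -> "1.5e-7".
		return s.replace(".0E", "E").replaceFirst("E(?=\\d)", "e+").replace('E', 'e');
	}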

Anders,
Do you still think "the textual representation of numbers MUST be preserved during parsing"?
I would prefer to drop that and require serialization to use the "shortest correct representation" + specify when to omit an exponent.

--
James Manger



-----Original Message-----
From: jose [mailto:jose-bounces@ietf.org] On Behalf Of Anders Rundgren
Sent: Monday, 26 October 2015 4:15 PM
To: Manger, James <James.H.Manger@team.telstra.com>; json@ietf.org; jose@ietf.org
Subject: Re: [jose] EcmaScript V6 - Defined Property Order

On 2015-10-26 00:10, Manger, James wrote:
> Hi Anders,
>
> I agree that the EcmaScript string format for numbers is a better basis for a canonical JSON format than, say, normalized scientific notation - particularly for the dominant case of integers less than 2^64. However, EcmaScript's ToString(number) doesn't quite give a canonical form. 7.1.12.1 step 5 says "the least significant digit of s is not necessarily uniquely determined by these criteria". EcmaScript guarantees that ToNumber(ToString(x)) gives the same number x, but that is not quite what we need for signing. We need ToString(ToNumber(s)) to give the same string. I guess you could sign the 8 bytes of a 64-bit float, instead of the JSON decimal digits.

Hi James,
Thanks for pointing this out; it is apparently always a very good idea to test concepts with other knowledgeable people before you actually start building something :-)

I guess the ES committee wasn't entirely happy about having to adjust their spec. due to improper reliance on JavaScript property order by parts of the development community.  But they probably did the right thing.

I'm thinking in a similar way.  Why let an edge-case spoil all the fun?  Maybe the ES6 vendors implement the same broken ToString algorithm or the improved version mentioned as a note after the section you referred to?  I won't research this issue now because I consider Ecma the sole "owner" of this problem :-)

So this is my (latest) suggestion for an upgraded in-object JSON clear-text signature specification:

     "Due to limitations in the EcmaScript V6 [ECMA-262] specification regarding
      the ToString(number) method, it is for interoperability reasons RECOMMENDED
      to utilize a maximum of 18 digits of precision for non-integer Numbers."

It sure isn't pretty but since "business messaging" can't even use JSON/ES numbers for expressing monetary amounts, it is hardly a show-stopper.

Anders Rundgren


>
> James Manger
>
> -----Original Message-----
> From: jose [mailto:jose-bounces@ietf.org] On Behalf Of Anders Rundgren
> Sent: Monday, 26 October 2015 2:33 AM
> To: jose@ietf.org; json@ietf.org
> Subject: Re: [jose] EcmaScript V6 - Defined Property Order
>
> Since the ES6 Number type is 64-bit IEEE, there's no need to worry about number canonicalization either if you base the signature system on ES6 which seems like a pretty safe bet.
>
> http://www.ecma-international.org/ecma-262/6.0/index.html#sec-tostring-applied-to-the-number-type
>
> That is, AFAICT, clear-text in-object JSON signatures are already compatible with ES6 (and I must drop my "number preservation" stuff...).
>
> Folks working with constrained devices will probably settle for CBOR.
>
> On 2015-10-25 10:08, Anders Rundgren wrote:
>> http://www.ecma-international.org/ecma-262/6.0/index.html#sec-ordinary-object-internal-methods-and-internal-slots-ownpropertykeys
>>
>> I can't say I'm able to "decipher" the ES6 specification, but it seems that the largest base of JSON parsers (the browsers) is now compliant with in-object JSON clear-text signature schemes of the kind I have proposed (pushing maybe...), albeit with some (IMO for practical purposes insignificant) limitations:
>>
>> - Integer property names don't work.
>> - Numeric values would have to be normalized.
>>
>> Java, Python, and C# already manage this as well.
>>
>> Yay!
>>
>> Anders
>>
> _______________________________________________
> jose mailing list
> jose@ietf.org
> https://www.ietf.org/mailman/listinfo/jose
