Re: [Json] [jose] EcmaScript V6 - numbers

Anders Rundgren <anders.rundgren.net@gmail.com> Mon, 02 November 2015 07:11 UTC

To: "Manger, James" <James.H.Manger@team.telstra.com>, "json@ietf.org" <json@ietf.org>, "jose@ietf.org" <jose@ietf.org>
References: <255B9BB34FB7D647A506DC292726F6E13BB16C9857@WSMSG3153V.srv.dir.telstra.com>
From: Anders Rundgren <anders.rundgren.net@gmail.com>
Message-ID: <56370CA8.7080304@gmail.com>
Date: Mon, 02 Nov 2015 08:11:36 +0100
Archived-At: <http://mailarchive.ietf.org/arch/msg/json/PcgCoALMEHmdowMXHI3p0HzqVbU>
Subject: Re: [Json] [jose] EcmaScript V6 - numbers

On 2015-11-02 03:32, Manger, James wrote:
> Hi Anders,

Hi James,

Disclaimer: I'm not a mathematician, I design and build systems.

Comments in line:


>
> For floating point numbers in JSON, I am not certain that removing digits beyond 15 guarantees a canonical form. Most (64-bit) doubles might have 15.95 digits of precision, but what about the range of "denormal" doubles that have less precision?

The original idea was to define a robust number canonicalization scheme that would be a true subset of ES6, at some expense of precision.
My (maybe somewhat naive) belief is that 15 digits of precision should suffice for some 99.9999999999999% of all use-cases :-)
This is addressed by the parseFloat(value.toPrecision(15)) "workaround".
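
For platforms without ES6, a rough Java counterpart of that workaround might look like the sketch below. The class and method names are made up for illustration, and it assumes that BigDecimal's default HALF_UP rounding is close enough to toPrecision(15) for this purpose; the two have not been compared for every input.

```java
import java.math.BigDecimal;
import java.math.MathContext;

class Precision15 {
    // Hypothetical helper: round the exact binary value to 15 significant
    // decimal digits, then convert back to the nearest double.
    static double roundTo15(double value) {
        if (value == 0 || Double.isNaN(value) || Double.isInfinite(value)) {
            return value; // nothing to round; BigDecimal rejects NaN and Infinity
        }
        BigDecimal rounded = new BigDecimal(value).round(new MathContext(15));
        return Double.parseDouble(rounded.toString());
    }

    public static void main(String[] args) {
        System.out.println(roundTo15(0.1 + 0.2));
    }
}
```

With this, neighbouring doubles that differ only beyond the 15th digit collapse to one value, which is the "expense of precision" referred to above.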

Anyway, whatever solution we come up with, it must IMO be possible to use today with ES6 as implemented in browsers (modulo apparent bugs...); otherwise absolutely nothing will happen.

I don't see that a few restrictions or constraints would be show-stoppers.


>
> My concern wasn't so much that the last digit might vary, but whether only some implementations would round to shorter forms.

Of course, what we need/want/require is a deterministic scheme.


>   For example, consider three successive 64-bit doubles near 0.3. In hex with a base 2 exponent the values are (java.lang.Double#toHexString(double)):
> 0x1.3333333333332p-2
> 0x1.3333333333333p-2
> 0x1.3333333333334p-2
> The exact decimal values for these are:
> 0.29999999999999993338661852249060757458209991455078125
> 0.299999999999999988897769753748434595763683319091796875
> 0.3000000000000000444089209850062616169452667236328125
> Canonical JSON forms need to be:
> 0.29999999999999993
> 0.3
> 0.30000000000000004
> which have 17, 1, and 17 significant digits; but might some implementations use 17 digits for the middle one as well?
> 0.29999999999999999

This is the same using Java BigDecimal:
  original: 0.29999999999999993338661852249060757458209991455078125
  rounded: 0.300000000000000 to 15 digits of precision
  rounded: 0.2999999999999999 to 16 digits of precision
  rounded: 0.29999999999999993 to 17 digits of precision
  rounded: 0.299999999999999933 to 18 digits of precision

  original: 0.299999999999999988897769753748434595763683319091796875
  rounded: 0.300000000000000 to 15 digits of precision
  rounded: 0.3000000000000000 to 16 digits of precision
  rounded: 0.29999999999999999 to 17 digits of precision
  rounded: 0.299999999999999989 to 18 digits of precision

  original: 0.3000000000000000444089209850062616169452667236328125
  rounded: 0.300000000000000 to 15 digits of precision
  rounded: 0.3000000000000000 to 16 digits of precision
  rounded: 0.30000000000000004 to 17 digits of precision
  rounded: 0.300000000000000044 to 18 digits of precision
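
A listing like the one above can be produced with a few lines of Java. This is only a sketch: the class and method names are invented here, and it relies on MathContext defaulting to HALF_UP rounding.

```java
import java.math.BigDecimal;
import java.math.MathContext;

class RoundingTable {
    // Round the exact decimal expansion of a double to the given number
    // of significant digits (MathContext uses HALF_UP by default).
    static String rounded(double d, int precision) {
        return new BigDecimal(d).round(new MathContext(precision)).toString();
    }

    public static void main(String[] args) {
        // The three successive doubles near 0.3, written as Java hex literals.
        double[] samples = {
            0x1.3333333333332p-2, 0x1.3333333333333p-2, 0x1.3333333333334p-2
        };
        for (double d : samples) {
            // new BigDecimal(double) yields the exact binary value in decimal.
            System.out.println("original: " + new BigDecimal(d));
            for (int p = 15; p <= 18; p++) {
                System.out.println("rounded: " + rounded(d, p)
                        + " to " + p + " digits of precision");
            }
            System.out.println();
        }
    }
}
```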

So what you are saying is that, since some IEEE numbers can be expressed with higher precision than 15.95 digits, this is what should also be done in signed data? I feel a bit leery about that since I don't have full insight into this topic or the status of implementations.


>
> I now think that specifying a canonical form as the "shortest correct representation" can work; it does give a unique string for each 64-bit double (it gives 0.3 above). ECMAScript "ToString Applied to the Number Type" is not quite phrased in this way, but it might be equivalent (with the recommended "accurate conversion" version).
>
> I am confident that V8 (ECMAScript in Chrome) produces this "shortest correct representation". See DTOA_SHORTEST in https://chromium.googlesource.com/v8/v8.git/+/master/src/dtoa.h. If a few other implement this form as well, then it looks like the best way to define a canonical form for 64-bit doubles.

I'm sure about that, but this appears to be a task for TC39 rather than the IETF.
This won't happen though unless there is genuine support for a use-case that motivates such a work-item.  Is there?
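
To make the "shortest correct representation" concrete, here is a brute-force Java sketch. It is not how V8 does it; it simply tries 1 to 17 significant digits and keeps the first candidate that parses back to the same double. The names are hypothetical, and the candidates are obtained by rounding the exact value, so rare edge cases (for example just at a power of two) might come out one digit longer than strictly necessary.

```java
import java.math.BigDecimal;
import java.math.MathContext;

class ShortestDigits {
    // Hypothetical helper: fewest significant digits that still round-trip.
    static String shortest(double d) {
        if (Double.isNaN(d) || Double.isInfinite(d)) {
            throw new IllegalArgumentException("not representable as a JSON number");
        }
        if (d == 0) {
            return "0";
        }
        BigDecimal exact = new BigDecimal(d); // exact binary value in decimal
        for (int p = 1; p < 17; p++) {
            BigDecimal candidate = exact.round(new MathContext(p));
            if (Double.parseDouble(candidate.toString()) == d) {
                return candidate.stripTrailingZeros().toPlainString();
            }
        }
        // 17 significant digits are always enough for a 64-bit double.
        return exact.round(new MathContext(17)).stripTrailingZeros().toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(shortest(0x1.3333333333332p-2));
        System.out.println(shortest(0x1.3333333333333p-2));
        System.out.println(shortest(0x1.3333333333334p-2));
    }
}
```

Note that this only selects the digits; it always prints plain notation and does not follow the ES6 rules for when to use an exponent.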


>
> Hopefully, interoperability with non-ECMAScript languages can be simpler than your es6JsonNumberSerialization(double) function. Ideally the following should work: use ECMAScript's choice of when to include or omit an exponent, and omit trailing zeros (and use lower-case). [may need to add a precision and locale]
>
> 	static String toJsonString(double d)
> 	{
> 		double ad = Math.abs(d);
> 		if (1e-6d <= ad && ad < 1e21d) return String.format("%f", d).replaceFirst("\\.?0++$", "");
> 		else if (ad == 0) return "0";
> 		else return String.format("%g", d).replaceFirst("\\.?0++e", "e");
> 	}

Thanx! Will test ASAP.
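
For comparison, the ES6 rule for when to include or omit an exponent (section 7.1.12.1, steps 6 to 10) can be sketched in Java once the decimal digits are known. The class, method, and parameter names below are invented: "digits" is the digit string without sign or leading/trailing zeros, and "n" is the exponent such that the value equals 0.digits x 10^n. How the digits are chosen is outside this sketch.

```java
class Es6Layout {
    // Lay out an already chosen digit string the way ES6 ToString(Number) does.
    // digits: significant digits d1..dk; n: value = 0.d1d2...dk * 10^n.
    static String layout(String digits, int n) {
        int k = digits.length();
        StringBuilder sb = new StringBuilder();
        if (k <= n && n <= 21) {
            // Integer: digits followed by n - k zeros.
            sb.append(digits);
            for (int i = 0; i < n - k; i++) sb.append('0');
        } else if (0 < n && n <= 21) {
            // Decimal point inside the digit string.
            sb.append(digits, 0, n).append('.').append(digits, n, k);
        } else if (-6 < n && n <= 0) {
            // Small fraction: "0." then -n zeros then the digits.
            sb.append("0.");
            for (int i = 0; i < -n; i++) sb.append('0');
            sb.append(digits);
        } else {
            // Exponent form with a lower-case 'e' and an explicit sign.
            int e = n - 1;
            sb.append(digits.charAt(0));
            if (k > 1) sb.append('.').append(digits, 1, k);
            sb.append('e').append(e < 0 ? '-' : '+').append(Math.abs(e));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(layout("3", 0));
        System.out.println(layout("1", 22));
    }
}
```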


>
> Anders,
> Do you still think "the textual representation of numbers MUST be preserved during parsing"?
> I would prefer to drop that and require serialization to use the "shortest correct representation" + specify when to omit an exponent.

Well, this is my "pre-ES6" scheme, which indeed isn't strictly necessary for what we are discussing now.
I guess it at least nicely fits the IETF interoperability mantra "be conservative in what you do, be liberal in what you accept from others", right?

Cheers,
Anders R


>
> --
> James Manger
>
>
>
> -----Original Message-----
> From: jose [mailto:jose-bounces@ietf.org] On Behalf Of Anders Rundgren
> Sent: Monday, 26 October 2015 4:15 PM
> To: Manger, James <James.H.Manger@team.telstra.com>; json@ietf.org; jose@ietf.org
> Subject: Re: [jose] EcmaScript V6 - Defined Property Order
>
> On 2015-10-26 00:10, Manger, James wrote:
>> Hi Anders,
>>
>> I agree that the EcmaScript string format for numbers is a better basis for a canonical JSON format than, say, normalized scientific notation - particularly for the dominant case of integers less than 2^64. However, EcmaScript's ToString(number) doesn't quite give a canonical form. 7.1.12.1 step 5 says "the least significant digit of s is not necessarily uniquely determined by these criteria". EcmaScript guarantees that ToNumber(ToString(x)) gives the same number x, but that is not quite what we need for signing. We need ToString(ToNumber(s)) to give the same string. I guess you could sign the 8 bytes of a 64-bit float, instead of the JSON decimal digits.
> Hi James,
> Thanx for pointing out this, it is apparently always a very good idea testing concepts with other knowledgeable people before you actually start building something :-)
>
> I guess the ES committee wasn't entirely happy about having to adjust their spec. due to improper reliance on JavaScript property order by parts of the development community.  But they probably did the right thing.
>
> I'm thinking in a similar way.  Why let an edge-case spoil all the fun?  Maybe the ES6 vendors implement the same broken ToString algorithm or the improved version mentioned as a note after the section you referred to?  I won't research this issue now because I consider Ecma the sole "owner" of this problem :-)
>
> So this is my (latest) suggestion for an upgraded in-object JSON clear-text signature specification:
>
>       "Due to limitations in the EcmaScript V6 [ECMA-262] specification regarding
>        the ToString(number) method, it is for interoperability reasons RECOMMENDED
>        to utilize a maximum of 18 digits of precision for non-integer Numbers."
>
> It sure isn't pretty but since "business messaging" can't even use JSON/ES numbers for expressing monetary amounts, it is hardly a show-stopper.
>
> Anders Rundgren
>
>
>> James Manger
>>
>> -----Original Message-----
>> From: jose [mailto:jose-bounces@ietf.org] On Behalf Of Anders Rundgren
>> Sent: Monday, 26 October 2015 2:33 AM
>> To: jose@ietf.org; json@ietf.org
>> Subject: Re: [jose] EcmaScript V6 - Defined Property Order
>>
>> Since the ES6 Number type is 64-bit IEEE, there's no need to worry about number canonicalization either if you base the signature system on ES6 which seems like a pretty safe bet.
>>
>> http://www.ecma-international.org/ecma-262/6.0/index.html#sec-tostring-applied-to-the-number-type
>>
>> That is, AFAICT, clear-text in-object JSON signatures are already compatible with ES6 (and I must drop my "number preservation" stuff...).
>>
>> Folks working with constrained devices will probably settle for CBOR.
>>
>> On 2015-10-25 10:08, Anders Rundgren wrote:
>>> http://www.ecma-international.org/ecma-262/6.0/index.html#sec-ordinary-object-internal-methods-and-internal-slots-ownpropertykeys
>>>
>>> I can't say I'm able to "decipher" the ES6 specification, but it seems that the largest base of JSON parsers (the browsers) is now compliant with in-object JSON clear-text signature schemes of the kind I have proposed (pushing maybe...), albeit with some (IMO for practical purposes insignificant) limitations:
>>>
>>> - Integer property names don't work.
>>> - Numeric values would have to be normalized.
>>>
>>> Java, Python, and C# already manage this as well.
>>>
>>> Yay!
>>>
>>> Anders
>>>
>> _______________________________________________
>> jose mailing list
>> jose@ietf.org
>> https://www.ietf.org/mailman/listinfo/jose