Re: [Json] Canonicalization

"Manger, James H" <James.H.Manger@team.telstra.com> Wed, 20 February 2013 22:58 UTC

From: "Manger, James H" <James.H.Manger@team.telstra.com>
To: "json@ietf.org" <json@ietf.org>
Date: Thu, 21 Feb 2013 09:58:09 +1100
Cc: "draft-staykov-hu-json-canonical-form@tools.ietf.org" <draft-staykov-hu-json-canonical-form@tools.ietf.org>
Subject: Re: [Json] Canonicalization

Canonicalization (c14n) may be a dirty word, and barred from the charter, but I hope it isn't too inappropriate to still discuss it on this email list.

There is a better c14n format for numbers than I or draft-staykov-hu-json-canonical-form have suggested: it is the format produced by JSON.stringify, as defined in ECMAScript v5.1 [http://www.ecma-international.org/ecma-262/5.1/#sec-15.12.3]. Examples: 0, 1, -1000, 0.00002334, 1500000, 6.022e+23, 9.1e-28.
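
As a quick check, the examples above can be reproduced in any ES5.1-or-later engine. This is only a sketch; the variable names are illustrative.

```javascript
// The example values from the paragraph above, written as JavaScript numbers.
const examples = [0, 1, -1000, 0.00002334, 1500000, 6.022e23, 9.1e-28];

// JSON.stringify on a finite number yields the ECMAScript Number-to-String form.
const formatted = examples.map((n) => JSON.stringify(n));

console.log(formatted.join(", "));
// 0, 1, -1000, 0.00002334, 1500000, 6.022e+23, 9.1e-28
```

Exponent notation only appears for magnitudes of 1e21 and above or below 1e-6, so everyday values keep the plain decimal form a person would write.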

In fact, c14n for JSON can (and should) be defined simply as the output of JSON.stringify, with two extra constraints: object name/value pairs are sorted; and \uxxxx escapes use lowercase hex digits (a-f). Done. Developers even have a readily available reference implementation: just type JSON.stringify(...) into the JavaScript console of a browser.
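
A minimal sketch of that definition, under some assumptions: the function name canonicalize and the compareKeys parameter are hypothetical, the input is assumed to be a value obtained from JSON.parse (no undefined, functions, or toJSON methods), and the key order is left as a parameter because the text above only says "sorted". The lowercase-hex constraint is not enforced separately; the sketch assumes an engine whose JSON.stringify already emits lowercase hex, as current engines do.

```javascript
// Serialize a JSON value as JSON.stringify would, but with object members
// emitted in sorted key order and no insignificant whitespace.
// compareKeys is optional; when omitted, Array.prototype.sort falls back to
// its default ordering (by UTF-16 code unit).
function canonicalize(value, compareKeys) {
  if (Array.isArray(value)) {
    return "[" + value.map((v) => canonicalize(v, compareKeys)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const members = Object.keys(value)
      .sort(compareKeys)
      .map((k) => JSON.stringify(k) + ":" + canonicalize(value[k], compareKeys));
    return "{" + members.join(",") + "}";
  }
  // Strings, numbers, booleans and null: defer to JSON.stringify.
  return JSON.stringify(value);
}

const input = JSON.parse('{"b":1,"a":[1.0,2e3,{"d":null,"c":"x\\u001F"}]}');
console.log(canonicalize(input));
// {"a":[1,2000,{"c":"x\u001f","d":null}],"b":1}
```

Two inputs that differ only in member order, whitespace, number spelling, or escape spelling then serialize to the same text.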

--
James Manger

> -----Original Message-----
> From: json-bounces@ietf.org [mailto:json-bounces@ietf.org] On Behalf Of
> Manger, James H
> Sent: Wednesday, 20 February 2013 11:58 AM
> To: json@ietf.org
> Subject: Re: [Json] Canonicalization
> 
> We should avoid the need to canonicalize JSON whenever possible, but
> there are enough efforts to define a canonical form that it would be
> worth standardizing one.
> 
> Strings:
> Escaping is mandatory for control chars, double quote, and backslash
> (%x00-1F / %x22 / %x5C).
> The simplest string canonicalization rule would be to escape those 34
> chars and no others. A rule that might be slightly nicer for people
> (and hence worth the few extra lines of code) would be to always use
> the 7 two-character escapes (\" \\ \b \f \n \r \t) for those 7 chars,
> always use \uxxxx for the rest of %x00-1F, and never use it for any
> other chars. I would be tempted to drop \/, the eighth two-character
> escape (only in the canonicalization rules), because / is so common
> in URIs etc.
> 
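
The "slightly nicer" string rule can be sketched as follows. The names SHORT_ESCAPES and canonicalString are hypothetical, lowercase hex in \uxxxx is an assumption carried over from the newer message above, and lone surrogates are passed through untouched rather than handled specially.

```javascript
// The 7 characters that get a two-character escape. "/" is deliberately
// absent, so it is always written literally.
const SHORT_ESCAPES = new Map([
  ['"', '\\"'],
  ["\\", "\\\\"],
  ["\b", "\\b"],
  ["\f", "\\f"],
  ["\n", "\\n"],
  ["\r", "\\r"],
  ["\t", "\\t"],
]);

function canonicalString(s) {
  let out = '"';
  for (const ch of s) {
    const code = ch.codePointAt(0);
    if (SHORT_ESCAPES.has(ch)) {
      out += SHORT_ESCAPES.get(ch);
    } else if (code < 0x20) {
      // Remaining control characters: \u plus four lowercase hex digits.
      out += "\\u" + code.toString(16).padStart(4, "0");
    } else {
      // Everything else, including "/" and non-ASCII, is emitted as-is.
      out += ch;
    }
  }
  return out + '"';
}

console.log(canonicalString('a/b\u0001"\n'));
// "a/b\u0001\"\n"
```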
> Objects:
> Sorting string names to canonicalize an object needs a few more words.
> Presumably sorting occurs on the logical strings, not the (canonically)
> encoded versions. So {"a\"b":1,"a#b":2} is in canonical form ("(U+22) <
> #(U+23) < \(U+5C)).
> Presumably sorting uses Unicode code points, not UTF-16 words, or UTF-8
> bytes. So {"\uFFE0\uFFE1":3,"\uD834\uDD1E":4} is in the right order
> (0xFFE0 < 0x01D11E), though the canonical form would use UTF-8 not
> \uxxxx for the 3 characters.
> [The 21-byte canonical form would be (in hex):
> 7B 22 EFBFA0 EFBFA1 22 3A 33 2C 22 F09D849E 22 3A 34 7D]
> 
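
The ordering point matters in practice because JavaScript's default sort compares UTF-16 code units, which puts the two example names the other way round. A rough sketch follows; compareCodePoints is a hypothetical helper, and TextEncoder is assumed to be available as a global, as it is in current browsers and Node.js.

```javascript
// Compare two strings by Unicode code point rather than by UTF-16 code unit.
function compareCodePoints(a, b) {
  const x = Array.from(a); // Array.from splits a string by code point
  const y = Array.from(b);
  const n = Math.min(x.length, y.length);
  for (let i = 0; i < n; i++) {
    const d = x[i].codePointAt(0) - y[i].codePointAt(0);
    if (d !== 0) return d;
  }
  return x.length - y.length;
}

const names = ["\uD834\uDD1E", "\uFFE0\uFFE1"];
const utf16Order = [...names].sort(); // U+1D11E first (0xD834 < 0xFFE0)
const codePointOrder = [...names].sort(compareCodePoints); // U+FFE0 U+FFE1 first

// The canonical form from the example, encoded as UTF-8.
const bytes = new TextEncoder().encode('{"\uFFE0\uFFE1":3,"\uD834\uDD1E":4}');
const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, "0").toUpperCase()).join(" ");

console.log(bytes.length); // 21
console.log(hex);
// 7B 22 EF BF A0 EF BF A1 22 3A 33 2C 22 F0 9D 84 9E 22 3A 34 7D
```

Sorting the UTF-8 encodings bytewise gives the same order as sorting by code point, so only the UTF-16 ordering is the odd one out.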
> Numbers:
> 0, 1, 1e3, 2.334e-5, 1.5e6 are the sort of canonical form numbers
> should have. I don’t like 0.0E0 for zero as per draft-staykov-hu-json-
> canonical-form-00 -- a person would never write that. That draft also
> allows any number of trailing 0’s (eg 1.200000e-2), which is a bug. A
> canonical form should drop the exponent when it is zero, and drop the
> decimal point when there is nothing after it.
> A regex for numbers in canonical form:
> 
> 0|-?[1-9](\.[0-9]*[1-9])?(e-?[1-9][0-9]*)?
> 
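
For what it is worth, the regex can be exercised directly. In the sketch below, CANONICAL_NUMBER is the regex above with anchors added, and canonicalNumber is a hypothetical formatter that relies on Number.prototype.toExponential() (called without an argument, it emits as many digits as needed) and then drops the "+" sign and a zero exponent.

```javascript
// The regex from the message, anchored so the whole string must match.
const CANONICAL_NUMBER = /^(?:0|-?[1-9](\.[0-9]*[1-9])?(e-?[1-9][0-9]*)?)$/;

// Format a finite number in the mantissa/exponent form described above.
function canonicalNumber(x) {
  if (!Number.isFinite(x)) {
    throw new RangeError("JSON cannot represent " + x);
  }
  return x
    .toExponential()      // e.g. "1.5e+6", "2.334e-5", "0e+0"
    .replace("e+", "e")   // no "+" in the exponent
    .replace(/e0$/, "");  // drop a zero exponent entirely
}

const outputs = [0, 1, 1000, 0.00002334, 1500000].map(canonicalNumber);
console.log(outputs.join(", "));
// 0, 1, 1e3, 2.334e-5, 1.5e6

console.log(outputs.every((s) => CANONICAL_NUMBER.test(s))); // true
console.log(CANONICAL_NUMBER.test("0.0E0"));                 // false
console.log(CANONICAL_NUMBER.test("1.200000e-2"));           // false
```

Note that this form writes 10 as 1e1, which is one reason the JSON.stringify format reads more naturally.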
> 
> So draft-staykov-hu-json-canonical-form needs a few changes in my mind,
> but is as good a starting point as any.
> 
> --
> James Manger