Re: [Json] The names within an object SHOULD be unique.

Allen Wirfs-Brock <allen@wirfs-brock.com> Fri, 07 June 2013 22:33 UTC

From: Allen Wirfs-Brock <allen@wirfs-brock.com>
In-Reply-To: <8E679529-8663-4552-A905-529E732AEB7B@yahoo.com>
Date: Fri, 07 Jun 2013 15:33:00 -0700
Message-Id: <89FDB01C-B5B1-43DF-AFE1-845B8E8E96B5@wirfs-brock.com>
References: <51AF8479.5080002@crockford.com> <51AF9ACF.5020507@cisco.com> <D0A99569-0915-4862-A7AE-9DE51C2E90C0@yahoo.com> <51AFB3F8.8060708@crockford.com> <8F32953C-C788-4DC9-888E-920E2BEB7FDD@yahoo.com> <831B8E46-F239-4353-8F95-8DF3F9BD2E78@yahoo.com> <51AFC924.2030805@crockford.com> <DA7A83A2-1C1F-4E74-BF6A-DA943B07AB59@vpnc.org> <C25B2BF6-512F-41CB-A9E8-E329E9C4BDCE@wirfs-brock.com> <8E679529-8663-4552-A905-529E732AEB7B@yahoo.com>
To: Vinny A <jsontest@yahoo.com>
X-Mailer: Apple Mail (2.1085)
Cc: Paul Hoffman <paul.hoffman@vpnc.org>, "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] The names within an object SHOULD be unique.

On Jun 7, 2013, at 10:43 AM, Vinny A wrote:

> 
> 
> On Jun 6, 2013, at 12:45 PM, Allen Wirfs-Brock <allen@wirfs-brock.com> wrote:
>> Allen Wirfs-Brock
>> ECMAScript Language Spec. Project Editor
> 
>> What I would say, as a replacement for the current text  <section 2.2> is:
>> 
>>         The names within an object SHOULD be unique.  If a key is duplicated, a parser MUST take  <<use?? interpret??>> only the last of the duplicated key pairs.
> 
> Just to be clear here, ECMA is OK with using MUST in regards to the last key pair?  

The ECMAScript standard specifies that the JSON.parse built-in of a conforming ECMAScript implementation must accept duplicate key pairs and that it provides as its output the value of the last such pair. Changing that behavior would be a breaking change that we are very unlikely to make.
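
For illustration only, the last-pair behavior described above can be observed with Python's json module, whose default decoder happens to make the same choice. This is a stand-in and not the ECMAScript JSON.parse built-in; the sample text is made up.

```python
import json

# Made-up JSON text in which the name "a" appears twice.
text = '{"a": 1, "b": 2, "a": 3}'

# The default decoder accepts the duplicate and keeps the value of the
# last pair, the same outcome the ECMAScript spec requires of JSON.parse.
result = json.loads(text)
print(result["a"])
```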

In general, the ECMAScript specification is hyper-prescriptive about many behaviors, such as this, that traditionally would have been considered reasonable areas to allow implementation variability. The reason is that we really are a community of multiple independently developed implementations whose users (web developers) expect and demand identical behavior from our implementations. Experience has shown that when implementation variation is permitted, a de facto standard rapidly emerges around whatever implementation choices were made by the currently dominant implementation (historically MS Internet Explorer on the desktop, more recently WebKit on mobile devices) and that all other implementations are forced to conform in order to remain viable. The preference within TC39 is to proactively avoid such situations and reach early consensus on a single required behavior.

> 
> Because we already have proposed language that says that, but one of the points under contention on this list is what to do with duplicated key pairs.
> 
> Is the following language acceptable to ECMA (disclaimer: original text by Paul,  modifications by me):
> 
> 
> On Jun 6, 2013, at 10:24 AM, Paul Hoffman <paul.hoffman@vpnc.org> wrote:
>> The new text should use language that is already in RFC 4627; "key" is not such a word. To be fair to implementers, the new document also needs to deal with both emitters and parsers.
>> 
>> Proposal:
>> 
>> In Section 2.2:
>> Current:
>>   The names within an object SHOULD be unique.
>> Proposed:
>>   If the names within an object are not unique, the result of parsing the 
>>   object is unpredictable, and the parse may even fail completely. Thus,
>>   the names within an object SHOULD be unique.

First, a terminology issue. Section 2 is defining a grammar. There are no syntactic ambiguities associated with multiple key-value pairs. There is no reason they should cause parsing (taken literally) to fail. The issue is what semantic interpretation is applied to such occurrences. JSON actually specifies very little with regard to the semantics of a JSON text, but a plausible interpretation of duplicate key-value pairs might be that the entire JSON text is invalid because it is semantically ambiguous. Personally, I wouldn't call that a parsing issue.
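
A rough sketch of that distinction, using Python's json module purely as a convenient stand-in: the hook below keeps every name/value pair exactly as the grammar yields them, and the two dictionaries built afterwards are different semantic interpretations of the same successfully parsed text. The sample text and the variable names are made up for illustration.

```python
import json

text = '{"a": 1, "a": 2}'

# Syntactic level: keep every name/value pair in source order.
# The parse itself succeeds; nothing about the text is ungrammatical.
pairs = json.loads(text, object_pairs_hook=lambda p: p)

# Semantic level: two of the possible interpretations of the same pairs.
last_wins = dict(pairs)          # later pairs overwrite earlier ones
first_wins = {}
for name, value in pairs:
    first_wins.setdefault(name, value)  # earlier pairs are kept
```

A third interpretation, treating the whole text as invalid, is equally a decision made after parsing rather than a failure of the parse.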

Back to your question: Ecma TC39 is a consensus-driven organization, and I don't think we would find consensus on the above proposed language. More below.

>> 
>> In Section 4, add a new paragraph:
>>   If a parser encounters an object with duplicate names, the parser MAY
>>   fail to parse the JSON text; if the parser accepts objects with duplicate
>>   names, it SHOULD accept only the last name/value pair that has the
>>   duplicate name. 
> 
> If we have to recommend to both parsers and emitters, I'd like to make a slight change to your wording:
> 
> In Section 2.2:
> Current:
>     The names within an object SHOULD be unique.
> Proposed:
>    The names within an object SHOULD be unique. Non-unique names have unpredictable effects (refer to sections 4 & 5).

This is better and I think we could get consensus (it's a variation of what I suggested above):

In Section 2.2:
Current:
    The names within an object SHOULD be unique.
Proposed:
   The names within an object SHOULD be unique because it is ambiguous as to which value should be associated with a duplicated key.

> 
> In Section 4 (Parsers) add a new paragraph:
>   If a parser encounters an object with duplicate names, the parser MAY fail to parse the JSON text; if the parser accepts objects with duplicate names, it MUST accept only the last name/value pair that has the duplicate name. 

I think this is too restrictive and gives license to reject existing valid datasets that use commenting conventions that introduce duplicate names.

Proposal:
   A parser MUST accept a JSON text that includes objects with duplicate names. A parser SHOULD associate with such a name the value of the last name/value pair that has the duplicate name.

TC39 would leave the specification of the ECMAScript JSON.parse function unchanged as it conforms to this proposal.
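
To make the difference concrete, here is a minimal sketch in Python (again only a stand-in for whatever parser is in use). The repeated "//" member is a hypothetical commenting convention invented for this example, and reject_duplicates is a hypothetical strict hook, not part of any standard library.

```python
import json

def reject_duplicates(pairs):
    """Hypothetical strict hook: raise if a name repeats within one object."""
    seen = set()
    for name, _ in pairs:
        if name in seen:
            raise ValueError(f"duplicate name: {name!r}")
        seen.add(name)
    return dict(pairs)

# Made-up data set that repeats "//" as a comment-like member.
text = '{"//": "listening port", "port": 8080, "//": "request cap", "limit": 512}'

# Behavior matching the proposal: accept the text, last pair wins.
lenient = json.loads(text)

# Behavior the proposal would rule out: rejecting the whole text.
try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    strict_rejected = False
except ValueError:
    strict_rejected = True
```

Under the lenient reading the real data ("port", "limit") survives intact and only the comment members collapse into one; under the strict reading the entire data set is refused.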

Note that a streaming parser might reasonably and validly present all of the duplicate name/value pairs to its clients.

> 
> In Section 5 (Generators) add a new paragraph:
>    A JSON generator SHOULD not duplicate names. If duplicate names are generated, the authoritative name/value pair MUST be listed last.

I think this is unnecessary if we specify that JSON texts with duplicated names must be accepted by parsers.

Allen