Re: [Json] Proposed minimal change for duplicate names in objects

Matthew Morley <matt@mpcm.com> Thu, 04 July 2013 19:47 UTC

On Tue, Jul 2, 2013 at 7:25 PM, Paul Hoffman <paul.hoffman@vpnc.org> wrote:

> <chair hats on>
>
> Do either or both of these proposals work for people in the WG?
>
> --Matt Miller and Paul Hoffman
>
> The following are proposed minimal changes to make the JSON spec
> interoperable with respect to objects that have duplicate names.
>
> Proposal 1:
>
> In section 2.2 (Grammar -> Objects):
>   No change
> In section 4 (Parsers):
>   Add: "If a parser encounters an object with duplicate names, the parser
>   MUST use only the last name/value pair that has the duplicate name."
> In section 5 (Generators):
>   Add: "The names within an object SHOULD be unique."
>
> Proposal 2:
>
> Same as Proposal 1 *except* that a second sentence is added for section 4:
>   "A parser that streams its outputs MUST fail to finish parsing if it
>   encounters more than one name/value pair with the same name."
>

#1 I guess. Duplicate keys *SHOULD* fail, IMHO.

I feel like I am being disruptive by posting this, but I really think we
are going down a path of splitting hairs and missing the broader point of
this issue. All the same, I'm posting as a reply to the initial thread
post, instead of all along the chain of replies.

Perhaps I'm in the minority, but I don't want to see the validation around
this split depending on the style of the decoding code. "Use the last
value, except if you stream then fail" seems a strange position to hold.
Failure should be a correct behavior in general.

The meaning of *SHOULD* for unique keys within the specification provides
enough of an expectation about the behavior, as well as a warning for those
who "weighed" the choice not to follow it. *MUST* would have been better,
but the complications we have been discussing are reasonable to expect and
predict.

Relying on ordered behavior in addition to duplicate keys, within a
knowingly unordered structure that *SHOULD* have unique keys, is clearly
grounds for interop errors and issues that anyone could reasonably expect.
"Clever" uses of this small gap in the specified behavior do not warrant
protection.

Are there really legitimate uses of duplicate keys that should be honored,
and that warrant this protection and the level of effort going into this
discussion?


As an aside, it seems to me that none of this removes the need for
streaming parsers to be aware of duplicate keys and to probably fail upon
encountering them within the scope of a not-yet-closed object. If valid
json should have unique keys, parsers should fail on non-unique keys,
streaming or otherwise.
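
As a rough illustration of "fail on non-unique keys", here is a minimal
Python sketch built on the standard library's `json.loads` and its
`object_pairs_hook` parameter. The helper names `reject_duplicates` and
`strict_loads` are made up for this example, and the decoder is not a
streaming one, so this only shows the check itself, not how a streaming
parser would track the names seen so far in an open object.

```python
import json


def reject_duplicates(pairs):
    """Build a dict from (name, value) pairs, failing on a repeated name."""
    obj = {}
    for name, value in pairs:
        if name in obj:
            raise ValueError(f"duplicate name in object: {name!r}")
        obj[name] = value
    return obj


def strict_loads(text):
    """Decode JSON text, rejecting any object that repeats a name."""
    # The hook is called once per decoded object, nested objects included.
    return json.loads(text, object_pairs_hook=reject_duplicates)


# Without a hook, Python's json module silently keeps the last value.
lenient = json.loads('{"a": 1, "a": 2}')

try:
    strict_loads('{"a": 1, "a": 2}')
except ValueError as exc:
    print("rejected:", exc)
```

The default decoder here behaves like Proposal 1 (last value wins), while
the hook gives the "duplicate keys fail" behavior argued for above.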

A similar topic comes up with writers of streaming json-rpc servers who are
trying this SAX-style approach (usually with batch requests), and the
reality has always been that the payload must be valid json first. This
means you cannot fully process the data with confidence until you confirm
the payload is actually valid json. Doing otherwise is simply gambling on
the payload fulfilling the JSON format. If you can isolate the impact of
the changes, process away until you can confirm, then merge those changes
into the world. Otherwise, you need to be prepared to undo the actions so
that they have no impact.
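
The "isolate, confirm, then merge" idea can be sketched in Python as
follows. This is only an illustration under stated assumptions: the
function `apply_stream`, the `(key, value)` update shape, and the
`broken_stream` generator are all hypothetical, standing in for whatever a
real streaming decoder would emit before it discovers that the payload is
not valid json.

```python
def apply_stream(updates, state):
    """Apply (key, value) updates from a streaming decoder, all or nothing."""
    staged = dict(state)  # isolated copy: changes land here first
    try:
        for key, value in updates:
            staged[key] = value
    except ValueError:
        # The payload turned out to be invalid: discard the staged changes.
        return False
    # The whole payload was confirmed valid: merge the changes into the world.
    state.clear()
    state.update(staged)
    return True


def broken_stream():
    """Hypothetical decoder output: two updates, then a late parse failure."""
    yield ("a", 1)
    yield ("b", 2)
    raise ValueError("payload is not valid JSON")


world = {"x": 0}
print(apply_stream(broken_stream(), world), world)
print(apply_stream(iter([("a", 1)]), world), world)
```

The first call leaves `world` untouched because the failure arrives before
the merge; the second call commits its single update.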


Duplicate keys also prevent the use of any meaningful path-style strings
that are to be used either as a reference point or as an identity/location
token within and across objects (JSON Pointer, for example). If your json
object produces more than one value for an identity, that is concerning.
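
To make the ambiguity concrete, here is a small Python sketch that keeps
duplicate names while decoding and then lists every value a path could
refer to. The `Pairs` class and `resolve_all` function are invented for
this example, and the path handling is a deliberate simplification of JSON
Pointer: it splits on "/" only and ignores array indices and the "~0"/"~1"
escapes.

```python
import json


class Pairs(list):
    """A JSON object kept as a list of (name, value) pairs, duplicates included."""


def resolve_all(text, pointer):
    """Return every value that a simplified pointer such as "/a/b" could mean."""
    candidates = [json.loads(text, object_pairs_hook=Pairs)]
    for token in pointer.split("/")[1:]:
        matches = []
        for node in candidates:
            if isinstance(node, Pairs):
                # Every pair with this name is a possible target.
                matches.extend(value for name, value in node if name == token)
        candidates = matches
    return candidates


print(resolve_all('{"a": 1, "a": 2}', "/a"))
```

With unique names the result always has at most one element; with
duplicates, a single path yields several candidate values, which is exactly
the identity problem described above.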

Perhaps I'm just in shock that `json` is pondering new wording to handle
something that clearly comes with such complications and disagreement in
practice, rather than promoting the behavior that is conceptually less
complicated and clearly intended.

-- 
Matthew P. C. Morley