Re: [Json] Proposed minimal change for duplicate names in objects

Tatu Saloranta <tsaloranta@gmail.com> Wed, 03 July 2013 19:45 UTC

From: Tatu Saloranta <tsaloranta@gmail.com>
To: Eliot Lear <lear@cisco.com>
Cc: Nico Williams <nico@cryptonector.com>, Paul Hoffman <paul.hoffman@vpnc.org>, "json@ietf.org WG" <json@ietf.org>
Subject: Re: [Json] Proposed minimal change for duplicate names in objects

On Tue, Jul 2, 2013 at 11:57 PM, Eliot Lear <lear@cisco.com> wrote:

> Hi Nico,
>
> On 7/3/13 8:45 AM, Nico Williams wrote:
> > On Wed, Jul 3, 2013 at 1:35 AM, Eliot Lear <lear@cisco.com> wrote:
> >>> In short, for streaming parsers (and generators) there's nothing we
> >>> can do.
> >>>
> >>> What we can do is RECOMMEND that a) generators not produce duplicates
> >>> (and explain how streaming ones cannot prevent dups), and b) that
> >>> parsers use the last name (and explain how streaming ones will produce
> >>> all dups).
> >> To me this isn't strong enough.  I have some sympathy for both the
> >> existing base and for parsers, but generators as an architectural
> >> component should generate unambiguous output, streaming or otherwise.
> >> And that follows Postel's Law:
> > How can a streaming generator do that?  The API for one might look like:
> >
> > ObjectStart(context)
> > ObjectAdd(context, name, value)
> > ...
> >
> > There's no way a minimal-state streaming generator (but I repeat
> > myself; the whole point of a streaming interface is to keep minimal
> > state :) can detect duplicates with O(1) state (probabilistic
> > structures with false positives won't be acceptable either).
>
> You've got to retain state.  You don't have to retain the value, but you
> do have to retain the name.  It's O(k), and knowing you I know you're
> smart enough to approach that (because I am and so I'm using inductive
> theory ;-).  And I'll bet you a nickel that this is not going to be
> anyone's high order bit.


There is a difference between retaining the parent path (the names of the
"open" Object properties) and retaining the names of all preceding siblings.
The latter is expensive and/or complicated. The former is indeed done by
parsers that try to give useful error information about nesting problems
and the like.
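
To make the difference concrete, here is a rough sketch in Python. It is
illustrative only and not taken from any real generator: the method names
object_start/object_add mirror the ObjectStart/ObjectAdd calls quoted above,
while object_end, retained() and feed() are assumptions made up for the
example. It counts how many names each approach has to keep around.

```python
class PathOnlyState:
    """Keeps only the names of the currently open Object properties."""

    def __init__(self):
        self.path = []  # one entry per open Object (None for the root)

    def object_start(self, name=None):
        self.path.append(name)

    def object_add(self, name, value):
        pass  # nothing is retained for a scalar member

    def object_end(self):
        self.path.pop()

    def retained(self):
        return len(self.path)


class SiblingState(PathOnlyState):
    """Additionally keeps every member name seen in each open Object."""

    def __init__(self):
        super().__init__()
        self.seen = []  # one set of sibling names per open Object

    def object_start(self, name=None):
        if name is not None:
            self._check(name)  # the nested Object is itself a named member
        super().object_start(name)
        self.seen.append(set())

    def object_add(self, name, value):
        self._check(name)

    def object_end(self):
        super().object_end()
        self.seen.pop()

    def _check(self, name):
        if name in self.seen[-1]:
            where = "/".join(p for p in self.path if p is not None)
            raise ValueError(f"duplicate name {name!r} under /{where}")
        self.seen[-1].add(name)

    def retained(self):
        return len(self.path) + sum(len(s) for s in self.seen)


def feed(state, n):
    """Write {"outer": {"k0": 0, ..., "k<n-1>": n-1}} and report peak state."""
    state.object_start()          # root Object
    state.object_start("outer")   # nested Object
    for i in range(n):
        state.object_add(f"k{i}", i)
    size = state.retained()
    state.object_end()
    state.object_end()
    return size
```

With n members in the inner Object, the path-only variant retains two
entries regardless of n, whereas the sibling-tracking variant retains a
number of names that grows with n.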

I also think that probabilistic approaches would be problematic from the
user's point of view: one has to allocate lots of memory, not just for the
hashes of the names seen (or a derivative thereof, like Bloom filters), but
also for the locations needed in error messages.

If the spec did mandate checking for dups, then both generators and parsers
would have to do it, and at least parsers would need additional metadata to
give proper error messages indicating the nature of the collision (what,
where).
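
As a sketch of what that extra metadata might look like on the parser side
(again Python, and again hypothetical: Location, DuplicateNameError and
check_members are names invented for this example, not part of any existing
parser), each name has to be stored together with the position where it was
first seen:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    line: int
    column: int


class DuplicateNameError(ValueError):
    pass


def check_members(members):
    """Check one Object's members, given as (name, Location) in input order.

    Returns the name -> first Location map, which is exactly the per-Object
    state a parser would need to hold until the Object is closed.
    """
    first_seen = {}
    for name, loc in members:
        prev = first_seen.get(name)
        if prev is not None:
            raise DuplicateNameError(
                f"duplicate name {name!r} at line {loc.line}, "
                f"column {loc.column}; first seen at line {prev.line}, "
                f"column {prev.column}"
            )
        first_seen[name] = loc
    return first_seen
```

The map holds one entry per distinct name in the Object, so reporting both
the "what" and the "where" costs a location per name on top of the name
itself.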

-+ Tatu +-