Re: [Json] Proposed minimal change for duplicate names in objects

Tatu Saloranta <tsaloranta@gmail.com> Sun, 07 July 2013 18:07 UTC

In-Reply-To: <CAK3OfOic41TWGhVJFwv1o64GarZhM0mqoF1TBruJ9OkCQbqijA@mail.gmail.com>
References: <B86E1D4B-1DC8-4AD6-B8B3-E989599E0537@vpnc.org> <CAK3OfOj3MNNhjwo2bMa5CgoqynzMRVvviBXC8szxt5D17Z7FDg@mail.gmail.com> <51D3C63C.5030703@cisco.com> <51D48023.1020008@qti.qualcomm.com> <20130703201143.GL32044@mercury.ccil.org> <00cd01ce7a9f$19adeaa0$4d09bfe0$@augustcellars.com> <00d701ce7aa6$cc5fe700$651fb500$@augustcellars.com> <CAK3OfOiWrWCvNQneokyycV1Jb98M=UR-U7z0dhxUjzVdf+PwDw@mail.gmail.com> <CAHBU6itdi3B1rWv2TiOYhL1QuOVxrFKt7OTWRoG+6TgV8Bc_uw@mail.gmail.com> <CAK3OfOgOYA5fas0oomF5amjP1bR5F=0+uve7mFD4=FMoEV7sWg@mail.gmail.com> <CAGrxA24y4D62XY-YnbDvKVwNKUickcEFxv1FUhc_yqG4KP-m-w@mail.gmail.com> <CAHBU6iuWLcXv0QKR=Ow8gkzoZLmoZjqYCqXDXR8FLVb7w7M2Tw@mail.gmail.com> <CAK3OfOic41TWGhVJFwv1o64GarZhM0mqoF1TBruJ9OkCQbqijA@mail.gmail.com>
Date: Sun, 7 Jul 2013 11:07:53 -0700
Message-ID: <CAGrxA257rS4Q=HH2GEvU6Skqk_pqD-hxVAekzfGUQ8XKfE2QcQ@mail.gmail.com>
From: Tatu Saloranta <tsaloranta@gmail.com>
To: Nico Williams <nico@cryptonector.com>
Content-Type: multipart/alternative; boundary=001a11c3808e84689204e0efcecc
Cc: Jim Schaad <ietf@augustcellars.com>, Tim Bray <tbray@textuality.com>, "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] Proposed minimal change for duplicate names in objects

On Sat, Jul 6, 2013 at 11:25 PM, Nico Williams <nico@cryptonector.com> wrote:

> On Sat, Jul 6, 2013 at 11:58 PM, Tim Bray <tbray@textuality.com> wrote:
> > I’ll assume you’re right when you say dupe detection has been measured as
> > expensive at run time, but I’m thinking that if I were writing a reader I'd
> > implement a hash table with a test-and-set method, so I admit I'm surprised
> > by the finding.  I think I’d need to see a little more research before I’d
> > accept that as a given.  -T
>
> Dunno about Tatu, but I think the vast majority of JSON usage must be
> coupled either with generic hash tables (objects/dicts/maps/whatever)
> or structs (with associated schema).  For the latter dup detection may
> well be more expensive than not, but probably not that much more
> expensive.  For the former dup detection is essentially free.
>

Yes (for the second case, the schema typically comes from Java class
definitions, implicitly).
I did not mean to imply that the overhead here would be significant.
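
To make the hash-table case concrete, here is a minimal sketch in Python of
the test-and-set idea Tim describes, applied per object. It relies on the
standard library's json.loads with its object_pairs_hook parameter; the
names reject_duplicates and loads_strict are made up for illustration and
are not taken from any parser discussed in this thread.

```python
import json


def reject_duplicates(pairs):
    """Build a dict from (name, value) pairs, failing on a repeated name.

    The parser calls this once per JSON object, so the set only ever
    holds the names of the object currently being closed.
    """
    seen = set()
    for name, _value in pairs:
        # Test-and-set: check membership, then record the name.
        if name in seen:
            raise ValueError(f"duplicate name in object: {name!r}")
        seen.add(name)
    return dict(pairs)


def loads_strict(text):
    """Parse JSON text, rejecting any object with duplicate names."""
    return json.loads(text, object_pairs_hook=reject_duplicates)
```

When the result is a dict anyway, the extra work is one set lookup per
member, which is the sense in which the check is nearly free in that case.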


> For other use cases, like Stephen Dolan's "launch missiles" example,
> dup detection might well be expensive, but maybe JSON is the wrong
> tool for those use cases.
>
> The CPU cost of dup detection doesn't concern me.  It's the
> architectural implication that does: minimal state streaming parsers
> are out.
>
> I have to ask myself: are there use cases where I would want a minimal
> state streaming parser?  and do I mind losing the ability to have such
> a thing?  My personal, tentative answers: "yes" and "probably not,
> maybe in such a case I should not use JSON".  Others will no doubt
> have different answers, which brings us to...
>
>
I would just note that at this point JSON is very competitive
performance-wise: it exceeds the performance of other textual formats and
competes well with binary formats too. In fact, I rarely see a compelling
reason to even consider binary formats, unless payload size is a
significant factor.
If this advantage were lost to a clarification that mostly addresses
concerns about _possible_ security issues, that would be sad.

End users rarely have a real need for minimal-state parsing. It is
framework builders -- such as, say, Solr, Elasticsearch, Hadoop, and JAX-RS
implementations (these are for Java; other platforms have similar ones) --
who care, because performance implications have more effect there. And they
are the ones that have legitimate use for minimal-state components.
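
As a rough illustration of the state difference, here is a sketch in Python
of two consumers of the same token stream. The event format, a list of
(kind, payload) tuples with kinds "start_object", "end_object", "name" and
"value", is an assumption made up for this example and not the API of any
real parser; max_depth and check_duplicates are likewise hypothetical names.

```python
def max_depth(events):
    """Minimal-state consumer: tracks only the current nesting depth."""
    depth = peak = 0
    for kind, _payload in events:
        if kind == "start_object":
            depth += 1
            peak = max(peak, depth)
        elif kind == "end_object":
            depth -= 1
    return peak


def check_duplicates(events):
    """Duplicate-detecting consumer: one set of names per open object.

    Returns the largest number of names held in memory at any one time,
    to show how the required state grows with the input.
    """
    stack = []  # one set of seen names for each currently open object
    peak = 0
    for kind, payload in events:
        if kind == "start_object":
            stack.append(set())
        elif kind == "end_object":
            stack.pop()
        elif kind == "name":
            if payload in stack[-1]:
                raise ValueError(f"duplicate name in object: {payload!r}")
            stack[-1].add(payload)
            peak = max(peak, sum(len(names) for names in stack))
    return peak
```

The first consumer needs a single counter however large the document is;
the second has to remember every name of every object that is still open.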

I assume all of the above is understood by now, so apologies if I sound like
a broken record here.


> ...more retreading.
>
> I still think the best thing to do is note the divergent
> interpretations and publish Internet JSON.
>
>
I agree.

-+ Tatu +-