Re: [Json] Proposed minimal change for duplicate names in objects

Tatu Saloranta <tsaloranta@gmail.com> Sun, 07 July 2013 04:37 UTC

In-Reply-To: <CAK3OfOgOYA5fas0oomF5amjP1bR5F=0+uve7mFD4=FMoEV7sWg@mail.gmail.com>
References: <B86E1D4B-1DC8-4AD6-B8B3-E989599E0537@vpnc.org> <CAK3OfOj3MNNhjwo2bMa5CgoqynzMRVvviBXC8szxt5D17Z7FDg@mail.gmail.com> <51D3C63C.5030703@cisco.com> <51D48023.1020008@qti.qualcomm.com> <20130703201143.GL32044@mercury.ccil.org> <00cd01ce7a9f$19adeaa0$4d09bfe0$@augustcellars.com> <00d701ce7aa6$cc5fe700$651fb500$@augustcellars.com> <CAK3OfOiWrWCvNQneokyycV1Jb98M=UR-U7z0dhxUjzVdf+PwDw@mail.gmail.com> <CAHBU6itdi3B1rWv2TiOYhL1QuOVxrFKt7OTWRoG+6TgV8Bc_uw@mail.gmail.com> <CAK3OfOgOYA5fas0oomF5amjP1bR5F=0+uve7mFD4=FMoEV7sWg@mail.gmail.com>
Date: Sat, 6 Jul 2013 21:37:19 -0700
Message-ID: <CAGrxA24y4D62XY-YnbDvKVwNKUickcEFxv1FUhc_yqG4KP-m-w@mail.gmail.com>
From: Tatu Saloranta <tsaloranta@gmail.com>
To: Nico Williams <nico@cryptonector.com>
Cc: Jim Schaad <ietf@augustcellars.com>, Tim Bray <tbray@textuality.com>, "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] Proposed minimal change for duplicate names in objects

On Sat, Jul 6, 2013 at 7:57 PM, Nico Williams <nico@cryptonector.com> wrote:

> On Sat, Jul 6, 2013 at 8:44 PM, Tim Bray <tbray@textuality.com> wrote:
> > This feels like a no-brainer to me, but that’s probably because (as I’ve
> > said before) I’m an API guy, and the only use for JSON objects in my world
> > is to transfer a hash table or database record or whatever from here to
> > there, and in such a situation dupes can never be useful or
> > intended and can only be a symptom of breakage (or, in the JOSE case, a
> > symptom of a malicious attack on my crypto).
>
> I agree.  As a security guy I would prefer if one way or another we
> end up with no dup names, but as an "API guy" myself I think of the
> streaming parsers (they offer an API after all).  Just say the magic
> words: "to hell with minimal state streaming parsers" or perhaps
> something to the effect that *some* component of a layered application
> MUST reject objects with dup names.  It's either or.  Let's choose.
>
> I'm happy with "some component of a layered application MUST reject
> objects with duplicate names" -- I prefer this to the "no minimal
> state streaming parsers" alternative.
>
> I will assume that in general objects rarely have lots of names, so
> that parsers need not keep that much state in order to check for dups.
> Requiring parsers to reject objects with dup names is my second
> choice.
>

Just to make sure: I also have no use for duplicates, and consider them a
symptom of broken processing. I have never used duplicates for anything, nor
do I find them an interesting approach.
My only real concern is whether detection and/or prevention gets mandated for
the lowest-level components of commonly used processing stacks (the low-level
push/pull parser, as opposed to the higher-level library or application code
that builds full representations), since doing it there carries significant
cost, based on extensive profiling I have done at this level.
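
To make that cost concrete, here is a rough sketch (plain Java, purely
illustrative -- not any real parser's API) of the extra book-keeping a
minimal-state streaming parser would need just to detect duplicates: one set
of already-seen names for every open object.

  import java.util.ArrayDeque;
  import java.util.Deque;
  import java.util.HashSet;
  import java.util.Set;

  // Illustrative only: the extra state a low-level streaming parser would
  // have to carry to reject duplicate names as it reads.
  final class DuplicateNameChecker {
      private final Deque<Set<String>> seenNames = new ArrayDeque<Set<String>>();

      void startObject() {
          seenNames.push(new HashSet<String>()); // one extra allocation per object
      }

      void endObject() {
          seenNames.pop();
      }

      void fieldName(String name) {
          if (!seenNames.peek().add(name)) {     // one extra hash lookup per name
              throw new IllegalStateException("Duplicate name: " + name);
          }
      }
  }

Per-object allocation and per-name hashing of this sort is exactly the kind of
overhead I mean.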

The case of application code using streaming parsers/generators directly is
not nearly as common as that of frameworks using them to produce higher-level
abstractions.
These higher-level abstractions (a JSON tree representation, binding to native
objects) either already report errors such as duplicates, or at the very least
can detect them and handle them consistently. They can do so much more
efficiently than low-level components, because they have to build
representations anyway, and those representations then serve as the data
structures for detecting or preventing duplicates.
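
By contrast with the streaming case, a tree builder can piggyback the check on
the Map it is constructing anyway; something along these lines (again just a
sketch in plain Java, not any particular library's API):

  import java.util.LinkedHashMap;
  import java.util.Map;

  // Illustrative tree-building step: the Map that becomes the in-memory
  // representation doubles as the duplicate detector.
  final class ObjectNode {
      private final Map<String, Object> fields = new LinkedHashMap<String, Object>();

      void set(String name, Object value) {
          if (fields.containsKey(name)) {
              // policy decision: fail, warn, keep first, or keep last
              throw new IllegalArgumentException("Duplicate name: " + name);
          }
          fields.put(name, value);
      }
  }

The containsKey() lookup hits a structure that must be built regardless, so
the incremental cost is close to zero.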

My specific concern is this: if the specification mandates detection and/or
prevention for parsers and generators, without any mention that "parser" and
"generator" are logical concepts (the thing that reads JSON, the thing that
writes JSON), I will have lots of users promptly demanding that the low-level
components enforce the checks.
And then I get to spend lots of time explaining why such checks can (and IMO
should) still be pushed to a higher level of processing. It is amazing how
much FUD can be generated from a cursory reading of a specification.

This concern extends to the suggested "Internet JSON messages" specification.

I would like either a simple statement that some component of the processing
should/must detect and report duplicates, and prevent producing them; or a
lengthier explanation of what "parser" and "generator" mean (parser is such a
horrible misnomer anyway -- there is very little actual parsing involved; it's
really just a lexer plus an optional object builder -- but I digress).

-+ Tatu +-