Re: [Json] fun with streaming, was The names within an object SHOULD be unique.

"Joe Hildebrand (jhildebr)" <jhildebr@cisco.com> Sat, 03 August 2013 23:09 UTC

From: "Joe Hildebrand (jhildebr)" <jhildebr@cisco.com>
To: Tatu Saloranta <tsaloranta@gmail.com>, Tim Bray <tbray@textuality.com>
Date: Sat, 03 Aug 2013 23:09:45 +0000
Cc: John Levine <johnl@taugh.com>, "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] fun with streaming, was The names within an object SHOULD be unique.

On 7/30/13 7:56 PM, "Tatu Saloranta" <tsaloranta@gmail.com> wrote:

>This is a common use case for processing large JSON files; either output
>as JSON arrays, or just sequences of space-separate objects. Typical data
>sets are log output, processing from map/reduce style jobs and batch jobs.

You're saying that you either produce

{"a": 1, "b": 2} {"a": 3, "b": 4}

or:

[{"a": 1, "b": 2},{"a": 3, "b": 4}]

right?  In neither of those cases does the producer have a problem
ensuring name uniqueness.  If you've got a programming language
description of the objects in either of those cases, they are known to
have unique names.  If you're parsing either of those as a stream and
generating 2 events each, you'll have to deal with duplicate names in one of
the four ways we've seen(*), but not when acting as an eventing parser,
only when creating the programming language construct to hold the objects
from the events fired.
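
To make the two output shapes concrete, here is a rough Python sketch of a
consumer reading both.  The helper name iter_concatenated is made up for
illustration; json.JSONDecoder.raw_decode is the standard-library call it
leans on.  Either way you end up with the same two objects, which is the
point where duplicate handling would kick in.

```python
import json

def iter_concatenated(text):
    """Yield each top-level JSON value from whitespace-separated input.

    Hypothetical helper, not part of any library.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        # raw_decode returns the value and the index just past it.
        value, pos = decoder.raw_decode(text, pos)
        yield value

sequence = '{"a": 1, "b": 2} {"a": 3, "b": 4}'
array = '[{"a": 1, "b": 2},{"a": 3, "b": 4}]'

from_sequence = list(iter_concatenated(sequence))
from_array = json.loads(array)
```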
 
>Separation between streaming part and higher layers is for good
>separation of concern, as well as practical matter for reusing components.

That is a perfectly valid design, yes.  I even like it.  It doesn't get
away from the fact that when you reduce the events to objects in your
programming language, you'll need to deal with duplicate names in one of
the four ways.  Therefore, you SHOULD NOT send duplicates, since you don't
know how they are going to be treated by the receiver.  If you control
both ends of the conversation, then it doesn't matter what the standard
says, and you can send any data you want, including duplicates.


(*) Reminder: error, first, last, all
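
For anyone who wants to see the four behaviours side by side, here is a
minimal sketch using the object_pairs_hook parameter of Python's json.loads,
which hands over the name/value pairs before they are collapsed into a dict.
The make_hook function and the policy strings are my own invention for
illustration, not anything from a spec or library, and "all" is only one
possible reading (keep every value in a list).

```python
import json

POLICIES = ("error", "first", "last", "all")

def make_hook(policy):
    """Build an object_pairs_hook applying one duplicate-name policy.

    Hypothetical helper; the policy names mirror the footnote above.
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")

    def hook(pairs):
        result = {}
        for name, value in pairs:
            if name not in result:
                # Under "all", every name maps to a list of its values.
                result[name] = [value] if policy == "all" else value
            elif policy == "error":
                raise ValueError(f"duplicate name: {name!r}")
            elif policy == "last":
                result[name] = value
            elif policy == "all":
                result[name].append(value)
            # policy == "first": keep the existing value, ignore this one.
        return result

    return hook

def parse(text, policy):
    """Parse JSON text, resolving duplicate names with the given policy."""
    return json.loads(text, object_pairs_hook=make_hook(policy))
```

The same input gives a different object under each policy, which is exactly
why a sender can't predict what the receiver will see.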

-- 
Joe Hildebrand