[Json] JSON merge alternatives

Nico Williams <nico@cryptonector.com> Wed, 19 March 2014 23:46 UTC

Date: Wed, 19 Mar 2014 18:45:51 -0500
From: Nico Williams <nico@cryptonector.com>
To: json@ietf.org
Message-ID: <20140319234549.GA3471@localhost>
Archived-At: http://mailarchive.ietf.org/arch/msg/json/oq6viR3uoMVszDUOhR_yoQ2ZAFk
Subject: [Json] JSON merge alternatives

First, IMO the WG should in fact adopt a WG item (or more) for MIME
types for use with PATCH to patch JSON texts (or entities with JSON
representations).  This is commonly known as "JSON merge", IIUC.

I work with one Rails application that has an ad-hoc, JSON-merge-like
schema for representing updates (PUTs) to resources -- this schema is
very specific to the application in question, and therefore not
reusable.

I can imagine a Ruby gem that handles a JSON merge PATCH in a completely
general and reusable way.  And not just Ruby, of course.

Second, I believe the JSON merge proposals I've seen to date are no
good.

The goal of any Internet JSON merge schema should be to:

   Concisely and generally express "edits" to a JSON text, efficiently
   allowing:

    - replacement of any values
    - prepend/insert/replace/append to arrays
    - addition/removal of names from objects
    - setting null values to object names

   no matter how deeply (subject to a max) or shallowly nested these
   values be in the original, while being easy to produce and apply.
   
Note: it helps to use If-Match to make sure that the resource being
PATCHed has not been modified since it was fetched.  This allows one to
express edits with confidence.
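
A minimal sketch of such a conditional PATCH, built with Python's
standard library and not sent anywhere.  The media type, URL, ETag and
patch body below are all placeholders made up for illustration; no MIME
type has been assigned for any JSON merge format discussed here.

```python
import json
import urllib.request

# Placeholder media type (an assumption, not a registered type).
PATCH_MEDIA_TYPE = "application/example-json-merge+json"


def build_patch_request(url, etag, patch):
    """Build (but do not send) a PATCH that only applies if the ETag matches."""
    return urllib.request.Request(
        url,
        data=json.dumps(patch).encode("utf-8"),
        method="PATCH",
        headers={
            "Content-Type": PATCH_MEDIA_TYPE,
            # If the resource changed since it was fetched, the server is
            # expected to answer 412 Precondition Failed instead of patching.
            "If-Match": etag,
        },
    )


request = build_patch_request(
    "https://example.com/things/42",   # hypothetical resource URL
    '"abc123"',                        # ETag returned by the earlier GET
    {"name": "new value"},             # stand-in for a real merge patch
)
```

The ETag is passed through verbatim, quotes included, since that is how
it appears in the ETag response header.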

I have two proposals: one based on augmenting a subset of the original
JSON text, the other based on a sequence of paths and edit instructions
for the values found at those paths.


Proposal #1: Patch as augmented subset of the original

Follow these steps to produce a patch:

 - Start with the resource's JSON representation.

 - Remove all interior object name/value pairs that are not on the way
   to any value to be added/replaced/removed/otherwise edited.

 - Remove all interior array elements that are not on the way to any
   value to be added/replaced/removed UNLESS the element is in an array
   to be edited (more on this below).

 - Insert before each remaining array element its index.

 - To replace any value with a scalar, just write the new value where
   the old one appeared.

 - To append to any array, just append -1 and the new value to it (after
   the above removals and additions).

 - For any array you want to do more to than append -> wrap the array in
   an object with any of these named values:

    o "delete":  [<list of array indices>]
    o "prepend": [<list of values to prepend>]
    o "insert":  [<list of [<index>, <value>]>]
    o "set":     [<list of [<index>, <value>]>]
    o "new":     [<list of new elements to replace the old ones with>]
    o "edits":   [<list of edits (see below)>]

   To disambiguate these objects from objects that could legitimately
   appear in the resource the following named value MUST also be
   present:

    o "<URN TBD>": true

   Replacements are to be performed first, then deletions, then
   insertions, then prepends, then "edits".  Edits are what would have
   appeared had there been no need to wrap the array in order to
   delete, prepend, replace, or insert elements.

 - To add name/value pairs to objects, just add them.

 - If you want to delete values from objects -> wrap the object in an
   object with any of these names:

    o "delete": [<list of name strings>]
    o "edits":  <object containing edits>

   Edits are what would have appeared had there been no need to wrap the
   object to delete name/value pairs.

   To disambiguate these names the following named value MUST also be
   present:

    o "<URN TBD>": true

Values to be added/set appear as they should appear in the resource as
patched.
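
For the simple case of a single edit, the pruning and index-insertion
steps above amount to wrapping the edit in one container per path
element.  A minimal Python sketch, assuming the path is given as a list
of names (strings) and array indices (integers); wrap_path is a name
made up here, not part of the proposal:

```python
def wrap_path(path, edit):
    """Build a Proposal #1 patch that places `edit` at `path`.

    Everything not on the way to the edited value is simply never
    emitted, which is the "remove all interior ..." step.
    """
    patch = edit
    for step in reversed(path):
        if isinstance(step, int):
            # Array element: its index is written just before it.
            patch = [step, patch]
        else:
            # Object member: keep only the name on the way to the edit.
            patch = {step: patch}
    return patch


# Replace the value at a[1].d.e with the scalar 0.
replace_e = wrap_path(["a", 1, "d", "e"], 0)

# Add the name "f" to the object at a[0].
add_f = wrap_path(["a", 0], {"f": "hello"})
```

Patches that touch several values at once would need the individual
results merged, which this sketch does not attempt.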

Patch application is straightforward, obvious.

Example resource:

  {
    "a": [ { "b": 1, "c": 2 }, { "d": { "e": [ 2, {} ] } } ],
    "z": [ true, false, null, null, "hello" ]
  }

Example edits:

 - Delete that empty object (path a[1].d.e[1] in jq-speak)

   { "a": [ 1, { "d": { "e": { "delete": [ 1 ],
                               "urn:ietf.org:TBD": true } } } ] }

 - Replace the array [ 2, {} ] (path a[1].d.e in jq-speak) with the
   value 0:

   { "a": [ 1, { "d": { "e": 0 } } ] }

 - Append true to the array [ 2, {} ] (path a[1].d.e in jq-speak):

   { "a": [ 1, { "d": { "e": [ -1, true ] } } ] }

 - Delete the 2, prepend "foo", append "bar" to that same array:

   { "a": [ 1, { "d": { "e": { "delete": [ 0 ],
                               "prepend": [ "foo" ],
                               "edits": [ -1, "bar" ],
                               "urn:ietf.org:TBD": true } } } ] }

 - Add a sibling name to "b" and "c":

   { "a": [ 0, { "f": "hello" } ] }

 - Delete "b":

   { "a": [ 0, { "delete": [ "b" ], "urn:ietf.org:TBD": true } ] }

 - Replace the "z" array with [1, 2, 3]:

   { "z": { "new": [ 1, 2, 3 ], "urn:ietf.org:TBD": true } }
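
To check that application really is straightforward, here is a rough
Python sketch of a Proposal #1 applier.  It is one reading of the rules
above, not a reference implementation.  Assumptions: the marker name is
the placeholder URN from the examples; appends are written as -1
followed by the value; array indices in "set" and "delete" refer to the
array as it was before any deletion; an object patch applied to a
non-object replaces it.

```python
MARKER = "urn:ietf.org:TBD"  # placeholder name, as in the examples


def apply_wrapper(target, wrapper):
    """Apply a marked wrapper object to an array or an object."""
    if isinstance(target, list):
        result = list(wrapper["new"]) if "new" in wrapper else list(target)
        for index, value in wrapper.get("set", []):
            result[index] = value
        # Delete from the highest index down so earlier indices stay valid.
        for index in sorted(wrapper.get("delete", []), reverse=True):
            del result[index]
        for index, value in wrapper.get("insert", []):
            result.insert(index, value)
        result = list(wrapper.get("prepend", [])) + result
        return apply_patch(result, wrapper.get("edits", []))
    doomed = wrapper.get("delete", [])
    result = {k: v for k, v in target.items() if k not in doomed}
    return apply_patch(result, wrapper.get("edits", {}))


def apply_patch(target, patch):
    """Return a patched copy of target; target itself is not modified."""
    if isinstance(patch, dict):
        if patch.get(MARKER) is True:
            return apply_wrapper(target, patch)
        if not isinstance(target, dict):
            return patch  # assumption: an object replaces a non-object
        result = dict(target)
        for name, sub in patch.items():
            result[name] = apply_patch(target[name], sub) if name in target else sub
        return result
    if isinstance(patch, list):
        if not isinstance(target, list):
            raise ValueError("array edits need an array to edit")
        result = list(target)
        # The patch alternates index, edit, index, edit, ...
        for index, sub in zip(patch[0::2], patch[1::2]):
            if index == -1:
                result.append(sub)
            else:
                result[index] = apply_patch(result[index], sub)
        return result
    return patch  # scalar: plain replacement


resource = {
    "a": [{"b": 1, "c": 2}, {"d": {"e": [2, {}]}}],
    "z": [True, False, None, None, "hello"],
}
patched = apply_patch(resource, {
    "a": [1, {"d": {"e": {"delete": [0], "prepend": ["foo"],
                          "edits": [-1, "bar"], MARKER: True}}}],
})
```

Unedited subtrees are shared between the input and the result rather
than copied, which is fine as long as neither is mutated afterwards.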

Proposal #2: Use sequence of [<path>, <edit instruction>]

 - Write the path to each value to be edited as an array of path
   elements (strings for object names, numbers for array elements).

 - To replace any value with a scalar value (non-array, non-object),
   just write the path to the value and the new value as the
   instruction.

 - To add a value to an object just write the path to the object, append
   the new value's name to the path, and the instruction will be the new
   value.

 - To add (append) a value to an array, write the path to the array,
   append a -1 to the path, and the instruction will be the new value.

 - Edit instructions for arrays (other than appending to, or replacing
   the whole array with a scalar value) are any objects like:

    o "delete":  [<list of indices>] or true (if path names an array
                                        element)

    o "insert":  [<list of [<index>, <value>] pairs>]
    o "set":     [<list of [<index>, <value>] pairs>]
    o "prepend": [<list of <value>s to prepend>]
    o "add":     [<list of <value>s to append>]
    o "new":     [<new list of values to replace old ones>]

   No magic URN name is needed in this case.

 - To add an object

 - Edit instructions for objects (other than replacement with a scalar
   value) are objects with name/value pairs like:

    o "delete": [<list of name strings>]
    o "merge":  [<object whose named value pairs will replace the
                 corresponding ones of the object being edited>]

   No magic URN name is needed in this case.

The totality of the patch, then, is an array of
[ <path>, <instruction> ] elements.

Any given path could be referenced multiple times in one patch.

Proposal #2 is pretty self-explanatory.  Examples left as an exercise
for the reader.
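
For what it's worth, one possible take on that exercise, as a Python
sketch.  Assumptions that go beyond the text above: any instruction that
is an object is treated as an edit instruction, never as a new value;
array edits are applied in the same order as in Proposal #1 (new, set,
delete, insert, prepend, add); "merge" takes a plain object.  The
function names are made up here.

```python
import copy


def edit_array(items, instruction):
    result = list(instruction["new"]) if "new" in instruction else list(items)
    for index, value in instruction.get("set", []):
        result[index] = value
    # Delete from the highest index down so earlier indices stay valid.
    for index in sorted(instruction.get("delete", []), reverse=True):
        del result[index]
    for index, value in instruction.get("insert", []):
        result.insert(index, value)
    return (list(instruction.get("prepend", [])) + result
            + list(instruction.get("add", [])))


def edit_object(obj, instruction):
    doomed = instruction.get("delete", [])
    result = {k: v for k, v in obj.items() if k not in doomed}
    result.update(instruction.get("merge", {}))
    return result


def apply_edits(doc, patch):
    """Apply [path, instruction] pairs in order to a deep copy of doc."""
    holder = [copy.deepcopy(doc)]  # lets the empty path address the root
    for path, instruction in patch:
        *parents, last = [0, *path]
        parent = holder
        for step in parents:
            parent = parent[step]
        if not isinstance(instruction, dict):
            if last == -1 and isinstance(parent, list):
                parent.append(instruction)       # path ends in -1: append
            else:
                parent[last] = instruction       # replace, or add a name
        elif instruction.get("delete") is True:
            del parent[last]                     # path names the element
        elif isinstance(parent[last], list):
            parent[last] = edit_array(parent[last], instruction)
        else:
            parent[last] = edit_object(parent[last], instruction)
    return holder[0]


resource = {
    "a": [{"b": 1, "c": 2}, {"d": {"e": [2, {}]}}],
    "z": [True, False, None, None, "hello"],
}
patch = [
    [["a", 1, "d", "e"], {"delete": [0], "prepend": ["foo"], "add": ["bar"]}],
    [["a", 0, "f"], "hello"],
    [["a", 0], {"delete": ["b"]}],
    [["z"], {"new": [1, 2, 3]}],
]
patched = apply_edits(resource, patch)
```

Because the pairs are applied in order, a later path is resolved against
the document as already edited by the earlier pairs.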

Nico
--