Re: [Jsonpath] The draft: ambiguous language

Carsten Bormann <cabo@tzi.org> Mon, 07 December 2020 22:37 UTC

On 2020-12-07, at 21:38, Daniel P <danielaparker@gmail.com> wrote:
> 
> My general comment is that there is a lot of ambiguous language in the
> draft,

As I said, work is needed.

I don’t plan to wait for adoption to start this work, so I created a PR:

https://github.com/jsonpath-standard/internet-draft/pull/46

> and ambiguous language is the bane of implementers. Some
> examples below:
> 
> One. "Data Item: A structure complying to the generic data model of
> JSON, i.e., composed of containers such as arrays and maps (JSON
> objects), and of atomic data such as null, true, false, numbers, and
> text strings."
> 
> Why introduce the term "map" as the preferred (unparenthesized) term?
> RFC 8259 doesn't mention "map", it uses JSON object.  

Sure.  That is probably one of the mistakes that were made in defining the JSON terminology.

(Historically, what happened is that JavaScript has objects and atoms, but arrays are almost, but not entirely, unlike objects.  When JSON was defined, it called only the maps “objects”, which made the term “object” ambiguous between its standard meaning and the “JSON object” meaning.  So it is best to have a term that refers unambiguously to “JSON objects”.  “Map” is a natural choice for that term, as would be longer terms such as “dictionary”.  If the WG does not like that term, we can of course revert to the more ambiguous JSON terminology.)

> By following
> "arrays and maps" with "(JSON objects)", it could be read that "arrays
> and maps" will collectively be referred to as "JSON objects”.

Yep, that is better the other way around.

> Later in
> the draft we read "the root object" ($) and "the current object" (@),
> which almost suggest that that is the intended meaning.

That should be item.

> My own view is that the terminology should stay consistent with RFC
> 8259, and that the word "object" should not be used for items that are
> not JSON objects in the sense of RFC 8259.

I agree with the latter, but the reason is that the RFC 8259 terminology is confusing, so we could try to be less ambiguous.

> What is the purpose of "such as" in the sentence? Aren't the
> itemizations exclusive?

Here, they are, so I replaced “such as” with “namely”.
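
(For illustration only, a minimal Python sketch of that exhaustive itemization: two container kinds, arrays and maps, plus the atoms null, true, false, numbers, and text strings.  The function name `classify` is made up for this sketch and is not part of the draft; the mapping from JSON to Python types is the one used by Python's standard `json` module.)

```python
import json


def classify(item):
    """Name the JSON data-model kind of a value decoded by json.loads."""
    if isinstance(item, dict):
        return "map"  # what RFC 8259 calls a JSON object
    if isinstance(item, list):
        return "array"
    if item is None:
        return "null"
    if isinstance(item, bool):  # test before numbers: bool is a subclass of int
        return "true" if item else "false"
    if isinstance(item, (int, float)):
        return "number"
    if isinstance(item, str):
        return "text string"
    raise TypeError(f"not a JSON data item: {item!r}")


doc = json.loads('{"a": [1, "x", null, true], "b": {}}')
print(classify(doc), classify(doc["a"]), [classify(v) for v in doc["a"]])
```

Every decoded value falls into exactly one branch, which is the sense in which the itemization is exclusive.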

> Two. "Since a JSON data item is usually anonymous and doesn't
> necessarily have a "root member object", JSONPath used the abstract
> name $ to refer to the top level object of the data item."

Yep, item (and no need to open the “member” discussion, see below).

> I realize this sentence is mostly copied from Goessner, but I didn't
> understand it there either. Regarding "doesn't necessarily" have a
> "root member object", what is that supposed to mean? It seems to me
> that the root is _always_ going to be an anonymous JSON value, which,
> when Goessner was writing, could be a JSON array or object, and since
> RFC 8259, any JSON value.

Many people who actually work with JSON don’t think in terms of data items, but in terms of members (entries of a map, i.e., key/value pairs).  Because JSON values are data items and members cannot exist outside a map, this creates a need to wrap those members into maps, and to unwrap them again when talking about them as members.
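
(A small Python sketch of that wrapping and unwrapping, for illustration only.  The helper names `wrap_member` and `unwrap_member` are hypothetical and appear neither in the draft nor in SDF; the assumption is simply that a lone member is carried around as a map with exactly one entry.)

```python
def wrap_member(name, value):
    """A member cannot stand alone as a JSON value, so carry it in a one-entry map."""
    return {name: value}


def unwrap_member(single_entry_map):
    """Recover the (name, value) pair from a map holding exactly one member."""
    if len(single_entry_map) != 1:
        raise ValueError("expected a map with exactly one member")
    ((name, value),) = single_entry_map.items()
    return name, value


wrapped = wrap_member("title", "Moby Dick")
print(wrapped)
print(unwrap_member(wrapped))
```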

See https://ietf-wg-asdf.github.io/SDF/sdf.html#name-sdfchoice for an example of what that thinking looks like in practice (yes, I wrote that, fully cognizant of the dissonance).

But that prevalent thinking does not have to be reflected here, and I’m proposing a change in the PR.

> Three.  "Where a JSONPath processor uses JSONPath expressions as
> output paths, these will always be converted to the more general
> bracket-notation."

Fixed in PR.
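
(To make the quoted sentence concrete, here is a minimal Python sketch of writing an output path in bracket notation only.  It assumes the path is already available as a list of map keys (strings) and array indices (integers); the function name `to_bracket_notation` and the backslash escaping of quotes are assumptions of this sketch, not rules taken from the draft.)

```python
def to_bracket_notation(segments):
    """Render keys and indices as a bracket-notation path starting at $."""
    parts = ["$"]
    for seg in segments:
        if isinstance(seg, bool):
            raise TypeError("booleans are not valid path segments")
        if isinstance(seg, int):
            parts.append(f"[{seg}]")
        elif isinstance(seg, str):
            # Escaping rule assumed for this sketch only.
            escaped = seg.replace("\\", "\\\\").replace("'", "\\'")
            parts.append(f"['{escaped}']")
        else:
            raise TypeError(f"unsupported path segment: {seg!r}")
    return "".join(parts)


print(to_bracket_notation(["store", "book", 0, "title"]))
# $['store']['book'][0]['title']
```

Because every step is a single key or a single index, such a path names exactly one item, which is the uniqueness that dot-notation shortcuts, the descendant operator, and filters would not give.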

> For output paths, Goessner uses the term "normalized path
> expressions", which should be unambiguously defined. For uniqueness,
> in addition to avoiding dot-notation, there would need to be other
> restrictions, including avoiding the descendant operator .., and
> filters.
> 
> Four. "The symbol @ is used for the current object."

Fixed in PR.

> Only if you interpret "object" to mean _any_ JSON value. In the
> example above this sentence,
> 
> $.store.book[(@.length-1)].title
> 
> the symbol @ refers to a JSON array, with a JavaScript like property "length".
> 
> That arrays must support a JavaScript like property length is not
> explicitly stated in the draft, but is implied in two examples. This
> should be clarified. Not all implementations do, for example,
> implementations that support functions that can be invoked at the tail
> end of a path typically support a function call length() instead.

Defining .length is part of the outstanding work on the expression language.
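
(Until that is defined, the following Python sketch hand-evaluates the one expression quoted above, `$.store.book[(@.length-1)].title`, to show which item `@` stands for.  It is not a JSONPath implementation: the sample document is invented for illustration, the function name `last_book_title` is hypothetical, and reading `.length` as the number of array elements is the JavaScript-like interpretation discussed above, not something the draft states.)

```python
# Invented sample data, only loosely shaped like a bookstore document.
doc = {"store": {"book": [{"title": "A"}, {"title": "B"}, {"title": "C"}]}}


def last_book_title(root):
    """Hand-evaluate $.store.book[(@.length-1)].title against root."""
    current = root["store"]["book"]  # the item that @ refers to inside [( ... )]
    if not isinstance(current, list):
        raise TypeError("@ must be an array for .length in this sketch")
    index = len(current) - 1  # stands in for @.length-1
    return current[index]["title"]


print(last_book_title(doc))
```

Here `@` is the array under `book`, not a JSON object, which is the point made in the quoted comment.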

> I also think the @ notation could be better explained, this particular
> reader struggled to understand what was meant by "current object" in
> original Goessner, and had to figure it out from examples.

Please send text.

Grüße, Carsten