Re: [Jsonpath] The draft: ambiguous language

Tim Bray <tbray@textuality.com> Mon, 07 December 2020 22:50 UTC

From: Tim Bray <tbray@textuality.com>
Date: Mon, 07 Dec 2020 14:50:18 -0800
Message-ID: <CAHBU6itkNNYTpQVaOoX8r1Lunz_=5_fkKWOnAgw3SqhDKpL+dg@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: Daniel P <danielaparker@gmail.com>, jsonpath@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/jsonpath/JqLEYzed90N2TdwHHtrl5JtDRvw>
Subject: Re: [Jsonpath] The draft: ambiguous language

To Carsten's point about what we call things, the number of distinguished
terms per RFC 8259 is pretty small: JSON text, value, object, array, number,
string.  Having spent quite a bit of time specifying JSON DSLs, I find that
using just those terms doesn't seem to get in the way or cause problems, so
I'd argue that we should stick to them (and build up to higher-level
constructs as required for JSONPath).
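
To illustrate how far those few terms go, here is a minimal sketch that labels any value parsed by Python's json module with an RFC 8259 term. The function name rfc8259_kind and the catch-all label "literal" for true, false, and null are assumptions made for this example; they are not taken from the draft.

```python
import json


def rfc8259_kind(value):
    """Return an RFC 8259 term for a value produced by json.loads.

    The label "literal" (for true, false, null) is this sketch's own
    shorthand for the three literal names in RFC 8259.
    """
    # bool must be tested before int: in Python, bool is a subclass of int.
    if value is None or isinstance(value, bool):
        return "literal"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {type(value).__name__}")


# A JSON text is a serialized value; its top level may be any value.
parsed = json.loads('{"a": [1, "x", null, true]}')
kinds = [rfc8259_kind(v) for v in parsed["a"]]
```

Every value in the parsed tree falls under one of these labels, so higher-level JSONPath constructs can be defined on top of them without introducing new names for the same things.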

On Mon, Dec 7, 2020 at 2:37 PM Carsten Bormann <cabo@tzi.org> wrote:

> On 2020-12-07, at 21:38, Daniel P <danielaparker@gmail.com> wrote:
> >
> > My general comment is that there is a lot of ambiguous language in the
> > draft,
>
> As I said, work is needed.
>
> I don’t plan to wait for adoption to start this work, so I created a PR:
>
> https://github.com/jsonpath-standard/internet-draft/pull/46
>
> > and ambiguous language is the bane of implementers. Some
> > examples below:
> >
> > One. "Data Item: A structure complying to the generic data model of
> > JSON, i.e., composed of containers such as arrays and maps (JSON
> > objects), and of atomic data such as null, true, false, numbers, and
> > text strings."
> >
> > Why introduce the term "map" as the preferred (unparenthesized) term?
> > RFC 8259 doesn't mention "map", it uses JSON object.
>
> Sure.  That is probably one of the mistakes that were made in defining the
> JSON terminology.
>
> (Historically, what happened is that JavaScript has objects and atoms, but
> arrays are almost, but not entirely unlike objects.  When JSON was defined,
> it only called the maps “objects”, making the term “object” ambiguous
> between its standard meaning and the “JSON object” meaning.  So to be
> unambiguous, it is best to have a term that refers only to “JSON
> objects”.  “Map” is naturally that term, as would be longer terms such as
> dictionary.  If the WG does not like that term, we can of course revert to
> the more ambiguous JSON terminology.)
>
> > By following
> > "arrays and maps" with "(JSON objects)", it could be read that "arrays
> > and maps" will collectively be referred to as "JSON objects”.
>
> Yep, that is better the other way around.
>
> > Later in
> > the draft we read "the root object" ($) and "the current object" (@),
> > which almost suggest that that is the intended meaning.
>
> That should be item.
>
> > My own view is that the terminology should stay consistent with RFC
> > 8259, and that the word "object" should not be used for items that are
> > not JSON objects in the sense of RFC 8259.
>
> I agree with the latter, but the reason for that is that the RFC 8259
> terminology is confusing, so we could try to be less ambiguous.
>
> > What is the purpose of "such as" in the sentence? Aren't the
> > itemizations exclusive?
>
> Here, they are, so I replaced “such as” with “namely”.
>
> > Two. "Since a JSON data item is usually anonymous and doesn't
> > necessarily have a "root member object", JSONPath used the abstract
> > name $ to refer to the top level object of the data item."
>
> Yep, item (and no need to open the “member” discussion, see below).
>
> > I realize this sentence is mostly copied from Goessner, but I didn't
> > understand it there either. Regarding "doesn't necessarily" have a
> > "root member object", what is that supposed to mean? It seems to me
> > that the root is _always_ going to be an anonymous JSON value, which,
> > when Goessner was writing, could be a JSON array or object, and since
> > RFC 8259, any JSON value.
>
> Many people who actually work with JSON don’t think in terms of data
> items, but in terms of members (entries of a map, i.e. key/value pairs).
> Because JSON values are data items and members cannot exist outside a map,
> this creates a need to wrap those members into maps, and to unwrap them
> again when talking about them as members.
>
> See https://ietf-wg-asdf.github.io/SDF/sdf.html#name-sdfchoice for an
> example of what that thinking looks like in practice (yes, I wrote that,
> fully cognizant of the dissonance).
>
> But that prevalent thinking does not have to be reflected here, and I’m
> proposing a change in the PR.
>
> > Three.  "Where a JSONPath processor uses JSONPath expressions as
> > output paths, these will always be converted to the more general
> > bracket-notation."
>
> Fixed in PR.
>
> > For output paths, Goessner uses the term "normalized path
> > expressions", which should be unambiguously defined. For uniqueness,
> > in addition to avoiding dot-notation, there would need to be other
> > restrictions, including avoiding the descendant operator .., and
> > filters.
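
As a concrete picture of the bracket-notation output paths discussed here, the following sketch builds one path string from a list of member names and array indices. The function name normalized_path and the backslash escaping of quotes inside names are assumptions made for illustration; the draft does not define them at this point.

```python
def normalized_path(segments):
    """Build a bracket-notation path such as $['store']['book'][0].

    segments is a list of member names (str) and array indices (int).
    Only names and indices are accepted, so the result identifies
    exactly one location: no dot-notation, no '..', no filters.
    """
    parts = ["$"]
    for seg in segments:
        if isinstance(seg, bool) or not isinstance(seg, (int, str)):
            raise TypeError("segments must be member names or array indices")
        if isinstance(seg, int):
            parts.append(f"[{seg}]")
        else:
            # Assumed escaping rule: backslash first, then single quote.
            escaped = seg.replace("\\", "\\\\").replace("'", "\\'")
            parts.append(f"['{escaped}']")
    return "".join(parts)


path = normalized_path(["store", "book", 0, "title"])
```

Restricting the input to names and indices is what makes the output unique per location, which is the property a definition of "normalized path" would need to pin down.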
> >
> > Four. "The symbol @ is used for the current object."
>
> Fixed in PR.
>
> > Only if you interpret "object" to mean _any_ JSON value. In the
> > example above this sentence,
> >
> > $.store.book[(@.length-1)].title
> >
> > the symbol @ refers to a JSON array, with a JavaScript-like property
> > "length".
> >
> > That arrays must support a JavaScript-like property length is not
> > explicitly stated in the draft, but is implied in two examples. This
> > should be clarified. Not all implementations do; for example,
> > implementations that support functions that can be invoked at the tail
> > end of a path typically support a function call length() instead.
>
> Defining .length is part of the outstanding work on the expression
> language.
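
To make the quoted example concrete, here is a minimal sketch of what $.store.book[(@.length-1)].title selects, assuming that @ is bound to the array at $.store.book and that "length" means the number of elements in that array. The document contents and the function name last_title are made up for this example.

```python
# Illustrative document; the titles are placeholders.
document = {
    "store": {
        "book": [
            {"title": "First"},
            {"title": "Middle"},
            {"title": "Last"},
        ]
    }
}


def last_title(doc):
    """Mimic $.store.book[(@.length-1)].title under the stated assumptions."""
    current = doc["store"]["book"]  # the value @ refers to: an array
    index = len(current) - 1        # stand-in for the expression @.length-1
    return current[index]["title"]


title = last_title(document)
```

Here @ is an array, not an RFC 8259 object. An implementation that offers a length() function instead of a length property would compute the same index.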
>
> > I also think the @ notation could be better explained, this particular
> > reader struggled to understand what was meant by "current object" in
> > original Goessner, and had to figure it out from examples.
>
> Please send text.
>
> Grüße, Carsten
>