Re: [Jsonpath] Some Comments ...
Stefan Gössner <stefan@goessner.net> Sat, 27 February 2021 08:44 UTC
From: Stefan Gössner <stefan@goessner.net>
To: jsonpath@ietf.org
References: <98ed1a4f-82fd-3f94-a707-8569f89a5041@goessner.net>
Message-ID: <b028f688-c71a-058b-4948-1a87b4889ffd@goessner.net>
Date: Sat, 27 Feb 2021 09:44:29 +0100
In-Reply-To: <98ed1a4f-82fd-3f94-a707-8569f89a5041@goessner.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/jsonpath/Pq-BULb9aPiCX2MG_azhryqlxw4>
Subject: Re: [Jsonpath] Some Comments ...

I copied these comments over to GitHub for easier reading and further
commenting:

https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-jsonpath/issues/54

I would also like to try out that cool GitHub discussion tab that Daniel
wrote about.

thanks
--
sg

On 26.02.2021 at 19:30, Stefan Gössner wrote:
> Hello List,
>
> It was important to go through this list's threads carefully. In
> fact, I should have done that first. Now I can understand the
> current draft and appreciate the work already done much better.
>
> I collected some citations (important from my point of view) with
> comments, already in Markdown.
>
>
> ## Title of the specification
>
> > JSONPath: A query language for JSON data.
> *(Carsten Bormann)*
>
> > I think I’d slightly prefer the term “syntax” to “language” because
> > “query language” has a smell of various things that end with the
> > letters “Q” and “L”. But not passionate about that.
> *(Tim Bray)*
>
> > JSONPath: A query syntax for JSON.
> > Another wild-card idea: JSONPath: Query expressions for JSON
> *(Tim Bray)*
>
> > The beauty of this is that the plural form “query expressions”
> > implies a set of expressions, so it implies “language”. It’s indeed
> > more than the grammar/syntax of those, so why not talk about the
> > expressions as a whole. This also makes it possible to just use “for
> > JSON”, without going into detail what these query expressions operate on.
> *(Carsten Bormann)*
>
> There seems to be agreement on "*JSONPath: Query expressions for
> JSON*". I like that, too.
>
> ## Terminology
>
> > My own view is that the terminology should stay consistent with RFC
> > 8259, and that the word "object" should not be used for items that are
> > not JSON objects in the sense of RFC 8259.
> *(Daniel P)*
>
> > To Carsten's point about what we call things, the number of distinguished
> > terms per RFC8259 is pretty small: JSON text, value, object, array, number,
> > string.
> > Having spent quite a bit of time specifying JSON DSLs, I find that
> > using just those terms doesn't seem to get in the way or cause problems, so
> > I'd argue that we should stick to them (and build up to higher-level
> > constructs as required for JSONPath).
> >
> > … oh, and I forgot the very useful "member".
> *(Tim Bray)*
>
> > … and “element” (the things in arrays).
> *(Carsten Bormann)*
>
> > The problem with JSON value is that it also can be quite confusing
> > due to the usual use of that term. Pointing to a tree and saying “the
> > values inside that tree” is not going to be felt as equivalent to “the
> > set of all subtrees of that tree, including the tree itself”. But if
> > JSON value is the only term we have, it has to be. Hence my
> > preference to talk about data items when I mean the items themselves
> > and not their “value”.
> *(Carsten Bormann)*
>
> > I think the key difficulty is whether each (key, value) pair in an
> > object is "a thing" that can be identified and manipulated and
> > potentially returned. (If we're talking analogies, then it's analogous
> > to an attribute node in the XDM model).
> *(Michael Kay)*
>
> > ECMA-404 uses "name/value pair", which is what I understand the term
> > "member" to mean (Douglas Crockford uses "member").
> *(Daniel P)*
>
> > I think the term “union” is poor. If we think of it as concatenation
> > of results, then the result is as expected.
> *(Glyn Normington)*
>
> I understand that in RFC8259 we have JSON values of different
> types. They are structured somehow, which is not of much interest
> here.
>
> But when querying that structure with JSONPath, it is vitally
> important to identify that hierarchical structure as a tree. So in
> fact we build up a higher-level construct here. We also need to call
> "the things" in the tree something.
> I was able to identify
>
> * "node" or "item" of a tree
> * "member" of an object
> * "name/value" or "key/value" pair, alias "member"
> * "element" of an array
>
> but could not see an agreement here.
>
> I agree with Glyn in calling the term "union" poor (see below).
>
>
> ## Differentiation from JSON Pointer (JSONPath draft charter)
>
> > I anticipate being asked "Why is JSON Pointer not sufficient?"
> > Indeed its abstract says:
> >
> > JSON Pointer defines a string syntax for identifying a specific
> > value within a JavaScript Object Notation (JSON) document.
> >
> > ... which sounds awfully similar. If we could include a sentence about
> > that, or a link to an answer, that might be helpful.
> *(Murray S. Kucherawy)*
>
> > No - it's not similar in concept, they're separate things. If you
> > really wanted to mention JSON Pointer, you could say something like
> > "Note that while JSON Pointer (RFC xxxx) is already standardised, it
> > is designed to provide a reference to a single, specific part of a
> > JSON document, whereas JSONPath provides the ability to query a
> > document and potentially return multiple values."
> *(Mark Nottingham)*
>
> > The short answer is that JSON pointer is good if you already know the
> > structure of the JSON data item you want to point into, and you want
> > to point to exactly one position in there. If you need to do
> > something that is closer to a “search” (which might also result in
> > multiple positions), JSONPath gives you more rope.
> *(Carsten Bormann)*
>
> +1
>
> ## References to XPath
>
> > I wonder if the analogies between XPath and JSONPath are going to be
> > helpful, or whether they're actually dangerous by implying
> > equivalences between constructs that are in fact somewhat different?
> *(Michael Kay)*
>
> > I tend to agree. Although JSONPath was inspired by XPath, I wouldn't
> > want to confuse the JSONPath spec by going into detailed comparisons at
> > the risk of contradicting the normative text.
> *(Glyn Normington)*
>
> > Someone on StackOverflow today asked a question about JSONPath; they
> > called it (and tagged it) XPath, we really don't want that kind of
> > confusion.
> >
> > In addition, the reference to the XPath specification in 6.2 is out
> > of date, and the comparison with XPath in Table 2 is very approximate
> > and the terminology inaccurate: for example there is a mention of
> > "node sets", which exist in XPath 1.0 but not in XPath 2.0, yet the
> > citation is to XPath 2.0. For someone who knows the semantics of XPath
> > the comparison raises all sorts of questions about sorting of results
> > into document order, elimination of duplicates etc, which are
> > complications this spec can well do without. (Though some answers are
> > needed, for example if ..store..price matches the same price in more
> > than one way, do you get more than one result? And if not, what does
> > "the same price" actually mean?)
> *(Michael Kay)*
>
> It seemed important in 2007, when arguing for having something
> like XPath for JSON. If the terminology has since changed
> significantly with XPath 2.0 and 3.0, we had better leave comparison
> Table 2 out. I have no strong feelings here.
>
> ## Array Slice Operator
>
> > Thanks! The ABNF for an array slice in that reference
> > ```
> > integer = [%x2D] (%x30 / (%x31-39 *%x30-39))
> >
> > array-slice = [ integer ] ws %x3A ws [ integer ]
> >               [ ws %x3A ws [ integer ] ]
> >               ; start:end or start:end:step
> > ```
> > is consistent with JMESPath, Python, and my understanding of
> > ECMASCRIPT 4.
> *(Daniel P)*
>
> > Did anyone else have an opinion on the behaviour of slices such as [::0]?
> > The current draft allows this and says it returns an empty array, but there
> > is good reason to say it should error so that the slice operation is then
> > consistent with Python slicing. See below for more context.
> *(Glyn Normington)*
>
> It was good to have read this thread; I now understand the current
> draft much better.
> I like the decision to be consistent with Python
> and also to get an empty selection set with `step=0`.
>
> FYI: there is a recent proposal for adding slice notation syntax to
> JavaScript, currently at stage 1 of the TC39 process.
>
> https://github.com/tc39/proposal-slice-notation
>
> Interestingly, it won't have a step argument ...
>
> https://github.com/tc39/proposal-slice-notation#why-doesnt-this-include-a-step-argument-like-python-does
>
> ... because of a syntax collision with the new `this-binding` syntax
> proposal `::`
>
> https://github.com/tc39/proposal-bind-operator
>
> However, we should not let ourselves be influenced by this.
>
> ## Unions
>
> > I don't think any implementation would remove duplicates from a path
> > such as `"$.store.book"`. I believe this is only somewhat controversial
> > in the context of unions [,]. The name "union" suggests that distinct
> > values be returned, compare with SQL unions. But Stefan Goessner's
> > implementation doesn't do that, it concatenates all results that meet
> > each criteria. There are a few JSONPath implementations that produce
> > real unions with no duplicates instead of concatenated results, but I
> > don't think that's the consensus.
> *(Daniel P)*
>
> > I think the term “union” is poor. If we think of it as concatenation
> > of results, then the result is as expected.
> *(Glyn Normington)*
>
> > I agree with that comment, but it's partly because I'm used to SQL UNION,
> > which is different. I prefer the JMESPath term for an analogous construct,
> > MultiSelect List,
> > https://jmespath.org/specification.html#multiselect-list.
> *(Daniel P)*
>
> The union operator `[,]` was simply meant as an analogue to
> XPath's `'|'` operator. I cannot tell whether that was a simple combination
> of node sets in XPath 1.0 or a true union without duplicates. I
> obviously was not aware of that subtle (essential?) union
> characteristic.
>
> So I fully agree with Glyn Normington's '... the term “union” is poor'
> statement.
> Are there better alternative terms, perhaps
> 'multi-index operator', 'index list', 'subscript list', etc.?
>
> ## Duplicates and Ordering
>
> > It was my impression that we were talking about duplicated nodes not
> > duplicated values:
> >
> > Given the array [10,20,30]
> >
> > $..[0,1,0]
> >
> > Would yield only two results [10, 20]
> >
> > (Not that I'm advocating for removing duplicates, personally I think we
> > shouldn't)
> *(Marko Mikulicic)*
>
> > You’re framing this as “removing duplicates”.
> > Another view is that [10, 20, 10] would be “adding duplicates” (copies
> > of the same node). Related are ordering issues:
> >
> > `$..[1,0] ➔ [20, 10] Or [10, 20]`
> >
> > I would expect the spec will leave implementations some leeway
> > here, but that should be based on an examination of existing
> > implementations.
> *(Carsten Bormann)*
>
> > The mental model that leads to omitting duplicate nodes in the output is
> > "selection": if you take an input array and select nodes with index 0,1 or
> > 0, you get only 2 results (since selecting an index twice has no effect).
> >
> > OTOH, if you opt for a "collect" model, whenever you encounter a node that
> > matches that query you add it to the result stream, thus the same nodes can
> > be present multiple times in the result.
> >
> > I have a slight preference for the "collect" model, because the general
> > case in jsonpath is to collect things that appear at various points in the
> > json tree. For example:
> >
> > `{"a": {"b": 1, "c": 2}, "d": 3}, $.a.b yields [1] and not {"a":{"b":1}}`
> >
> > (i.e. jsonpath is not a filter and view operation but a pick and gather
> > operation)
> *(Marko Mikulicic)*
>
> > In implementations that support paths (the majority don't), the query
> > function takes a parameter that indicates values or paths. In both
> > cases the query returns a JSON array of JSON values, in the latter
> > case, a JSON array of normalized paths.
> *(Daniel P)*
>
> I must confess to never having thought about duplicates, let alone
> wanting to eliminate them. So I like Marko's comparison of
> 'selection model' vs. 'collection model' a lot. I would opt for the
> latter. In this sense the result of a 'JSONPath query expression'
> should be termed a 'collection'.
>
> Regarding ordering, I see something like a 'natural ordering',
> according to which
>
> `$..[0,1] ➔ [10, 20]`
> `$..[1,0] ➔ [20, 10]`
>
> would result for the example above.
>
> I do understand the use cases for reordering, duplicate removal,
> filtering, etc. But this can always be seen as a postprocessing step
> on the resulting collection, by handing it over to accompanying tools
> (think of a pipe operator).
>
> Of course this cannot work on the result collection of values alone
> (see duplicate nodes vs. duplicate values above); it rather requires a
> collection of (normalized) paths. In this sense, I like this view:
>
> > In my opinion the right balance between powerfulness and enabling
> > simple implementations has been so far one of the key factors that
> > made JSONPath popular over other alternatives, even if it lacks
> > support for aggregation functions.
> *(Davide Bettio)*
>
> ## Filter Expressions
>
> > Related to that, it would be helpful to determine if JSONPath filters
> > apply to both JSON objects and arrays, or only to JSON arrays.
> *(Daniel P)*
>
> > I would support restricting filters to arrays, if others agree.
> *(Glyn Normington)*
>
> I tend to let implementations and their "normative force of the
> factual" decide here or, in case of doubt, to agree with Glyn's
> restriction to arrays.
>
> I am very unhappy with the confusing `$..book[(@.length-1)]`, where `'@'`
> addresses the array itself and implies that the array has a `length`
> property. In the filter expression examples, `'@'` more consistently
> addresses the current array element.
>
> The invocation of 'the underlying scripting engine' was not meant as a
> serious normative aspect, but rather as a quick and dirty solution for
> JavaScript and PHP implementations at that time.
>
>
> ### Corner Case
>
> > Consider this perfectly legal JSON object
> >
> > ```{ "ab": 0, "a.b": 1, "a-b": 2, "a": { "b": 3 } }```
> >
> > So `$.ab` is 0, `$.a.b` is 3, `$['a.b']` is 1, `$['a-b']` is 2. You'd
> > like to say `$.a-b` but lots of libraries will refuse it because
> > `"a-b"` is not a legal JavaScript "name" construct, that's why you
> > have to say `$['a-b']`.
> >
> > But suppose your library would accept `$.a-b`. Then `$.a-b` and
> > `$['a-b']` would be synonyms, but `$.a.b` and `$['a.b']` wouldn't.
> *(Tim Bray)*
>
> Hmm ... this seems to be a hint that `'-'` is better excluded from the
> dot-child-selector syntax. I think I have read more discussion about
> that, but currently don't know where.
>
> ## Respect Implementations
>
> > As I mentioned in the session, I think there's a non-trivial amount
> > of risk here that some implementations won't be willing or able to
> > move away from their current behaviours, even if interoperability
> > would improve if they did so. However, there are ways to mitigate that
> > (e.g., a separate 'rfcxxxx compliant' mode). Even so, it will be
> > important to get good participation from as many current implementers
> > as possible.
> *(Mark Nottingham)*
>
> > The WG will develop a standards-track JSONPath specification that
> > is technically sound and complete, based on the common semantics
> > and other aspects of existing implementations. Where there are
> > differences, the working group will analyze those differences and
> > make choices that rough consensus considers technically best, with
> > an aim toward minimizing disruption among the different JSONPath
> > implementations.
> *(Barry Leiba)*
>
> > I'm OK with this, but for context: I've been a pretty intense JSONPath user
> > in recent years, and AFAIK the spec, and the implementations, are mostly
> > OK, so the choice between "make JSONPath good" and "don't invalidate
> > implementations" is unlikely to come up. If it did, my predisposition would
> > be to err on the side of not breaking implementations, but I don't think
> > that's inconsistent with Barry's text.
> *(Tim Bray)*
>
> +1 to all.
>
> ## Error Handling
>
> > My mental model at the moment is that a JSONPath expression can be
> > valid or erroneous; application of a valid expression yields a result
> > (which may be empty), but does not raise errors. That may not be the
> > right model for all applications.
> *(Carsten Bormann)*
>
> > The general approach that I've seen several times (including my
> > Elixir implementation) is that an error is raised when there is a
> > syntax error, therefore an invalid expression (e.g. $.foo[[5]) raises
> > an error. Conversely a valid expression applied to a bogus input never
> > raises an error (e.g. `$.foo.bar on "test" evals as []`).
> *(Davide Bettio)*
>
> > On the whole I think JSONPath is designed to be "forgiving", i.e.
> > such things aren't errors, e.g. I think I read in the spec that
> > filtering a non-array isn't an error, it's some kind of no-op. That
> > approach isn't always best for everyone, but it's important to be
> > consistent.
> *(Michael Kay)*
>
> > I would expect one component of this policy to be:
> >
> > Whether a JSONPath query is valid or not does not depend on the
> > arguments it is applied to.
> >
> > I.e., you can look at the query and find out independently, without
> > knowing any data, whether it is valid or not.
> *(Carsten Bormann)*
>
> I like and totally agree with the *forgiving mental model*, i.e. having
> only syntax errors, which do not depend on data.
>
> Thanks
> --
> sg
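
For anyone who wants to play with two of the behaviours discussed above (the "collect" model for index lists such as `[0,1,0]`, and the empty selection for a slice with `step=0`), here is a minimal Python sketch. It is only an illustration under stated assumptions, not taken from the draft or from any implementation: the function names `select_indices` and `select_slice` are hypothetical, negative and out-of-range indices are assumed to select nothing, and only the array-level semantics are modelled (no path parsing, no descendant search).

```python
from typing import Any, List, Optional


def select_indices(array: List[Any], indices: List[int]) -> List[Any]:
    """Index list in the 'collect' model: one result per listed index, in written order."""
    result = []
    for i in indices:
        # Assumption: indices outside 0..len-1 (including negative ones) select nothing,
        # in line with the "forgiving" model where valid queries never raise errors.
        if 0 <= i < len(array):
            result.append(array[i])
    return result


def select_slice(array: List[Any], start: Optional[int] = None,
                 end: Optional[int] = None, step: Optional[int] = None) -> List[Any]:
    """Python-style slice, except that step == 0 yields an empty selection."""
    if step == 0:
        # Python itself raises ValueError for a zero step; here nothing is selected instead.
        return []
    return array[slice(start, end, step)]


data = [10, 20, 30]
print(select_indices(data, [0, 1, 0]))  # duplicates are kept
print(select_indices(data, [1, 0]))     # written order is kept
print(select_slice(data, step=0))       # empty selection instead of an error
```

With the array `[10, 20, 30]` from Marko's example, this prints `[10, 20, 10]`, `[20, 10]` and `[]`. Removing duplicates or reordering would then be a separate postprocessing step on the resulting collection.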