[Jsonpath] Re: [Technical Errata Reported] RFC9485 (7990)

Carsten Bormann <cabo@tzi.org> Thu, 13 June 2024 20:01 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/jsonpath/cFbR4G6UK-mfkeeATebDExSlAig>

Hi Björn,

thank you for your attention and this report.

However, I need to argue for rejecting this specific report.

On 2024-06-13, at 20:57, RFC Errata System <rfc-editor@rfc-editor.org> wrote:
> 
> `\S` excludes the form feed and vertical tabulation characters in Perl, ECMAScript, and other implementations, while the suggested replacement includes them. Given the entire document is about interoperable regular expressions, misrepresentation of the common definition of `\S` runs counter to that. Including form feed and vertical tabulation literally in the replacement expression is not likely to be helpful, so removing the misleading rows seems to be the best option.

Mozilla [1] actually claims that, for JavaScript, \S is equivalent to [^\f\n\r\t\v\u0020\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]

The entries in the table are examples, and they clearly need to be adapted to the specific regex flavor, library version, Unicode version, platform specification version, etc.
The intent is to demonstrate how these replacements can be built, not to supply ready-made replacements with the highest fidelity to a specific combination of implementation and Unicode versions.
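
To illustrate how much the result depends on the flavor, here is a minimal sketch in Python. It assumes the replacement for \S is the XSD-style class [^ \t\n\r] (the class is not quoted in this message), and compares it with what Python's `re` module does for \S on `str` patterns. The names `XSD_NON_SPACE`, `PY_NON_SPACE` and `disagreements` are made up for this example.

```python
import re

# Assumption: the XSD-style replacement for \S, i.e. "anything except
# space, tab, line feed, carriage return".
XSD_NON_SPACE = re.compile(r"[^ \t\n\r]")

# Python's own \S; for str patterns it is Unicode-aware by default.
PY_NON_SPACE = re.compile(r"\S")


def disagreements(chars):
    """Return the characters on which the two patterns disagree."""
    return [
        c for c in chars
        if bool(XSD_NON_SPACE.fullmatch(c)) != bool(PY_NON_SPACE.fullmatch(c))
    ]


sample = [" ", "\t", "\n", "\r", "\f", "\v", "\u00a0", "a"]
# Print the code points where the two definitions differ.
print([hex(ord(c)) for c in disagreements(sample)])
```

In this flavor the two definitions differ on form feed, vertical tab and no-break space, so a replacement tuned for one engine would need a different character list for another.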

The entries for the first two examples in the table given are based on W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes, Appendix G.4.2.5 "Multi-character escapes". 
(The value for the third example is *not* taken from that source, which would have suggested \p{Nd}, but mirrors what most regexp writers tend to believe \d means.)
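
The same point for \d can be sketched in Python, whose `re` module treats \d on `str` patterns as "any Unicode decimal digit" unless the `re.ASCII` flag is given. The helper name `classify` and the sample characters are only illustrative.

```python
import re
import unicodedata

ASCII_DIGIT = re.compile(r"[0-9]")  # what most regexp writers expect \d to mean
PY_DIGIT = re.compile(r"\d")        # Unicode-aware for str patterns in Python


def classify(ch):
    """Report how a single character fares under both definitions."""
    return {
        "ascii_class": bool(ASCII_DIGIT.fullmatch(ch)),
        "backslash_d": bool(PY_DIGIT.fullmatch(ch)),
        "category": unicodedata.category(ch),
    }


# "7" is an ASCII digit; U+0663 is ARABIC-INDIC DIGIT THREE (category Nd).
for ch in ["7", "\u0663", "x"]:
    print(repr(ch), classify(ch))
```

Here U+0663 is accepted by \d but not by [0-9], which is the gap between the \p{Nd} reading and the ASCII-only reading.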

Regards, Carsten


[1]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_expressions/Character_classes