Re: [Jsonpath] #70 Regexps (was: Re: Draft minutes, consensus points, and actions from IETF 112)
Tim Bray <tbray@textuality.com> Sun, 14 November 2021 21:20 UTC
References: <03E1325D-268F-4380-A5D0-F45E2BE61360@gmail.com> <ECCA2C65-8534-4F0E-B3AE-A51A737B325C@tzi.org>
In-Reply-To: <ECCA2C65-8534-4F0E-B3AE-A51A737B325C@tzi.org>
From: Tim Bray <tbray@textuality.com>
Date: Sun, 14 Nov 2021 13:20:16 -0800
Message-ID: <CAHBU6itdNT=cR=1b+bBfCZCM_UEGLxSwzSSkLoBN8ebgk2e3YQ@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>, Francesca Palombini <francesca.palombini@ericsson.com>
Cc: jsonpath@ietf.org
Content-Type: multipart/alternative; boundary="000000000000e7e1b305d0c642a6"
Archived-At: <https://mailarchive.ietf.org/arch/msg/jsonpath/qUx971wWk4AWf8sj0Vq5Cf1rcNU>
Subject: Re: [Jsonpath] #70 Regexps (was: Re: Draft minutes, consensus points, and actions from IETF 112)
I suspect that we are not the only WG, present or future, that needs or
will need to address this issue. I suggest that we take this to ietf@ or
ask the IESG if they have an opinion. Francesca, do you have any thoughts?

On Sun, Nov 14, 2021 at 1:15 PM Carsten Bormann <cabo@tzi.org> wrote:

> On 2021-11-14, at 12:35, James <james.ietf@gmail.com> wrote:
> >
> > * #70 - Have discussion on list with known options
>
> As a reminder, my slides had these options (renumbered for easier
> reference, and expanded a bit):
>
> 1. Select (define) one regular expression flavor
> 2. Provide a way to plug in regular expressions (of different flavors)
> 3. No regexps in base RFC (but keep an extension point)
>
> 1. further splits into:
>
> 1a. Select *a version of* ECMAScript (parsing/searching RE)
> 1b. Select W3C XSD RE (matching RE)
> 1c. Build "modest subset" (e.g., iregexp)
>
> Since we don’t have a consensus or even a majority among the
> implementations, we are free to do the right thing, if we can pull that off.
>
> As you know, I have been exploring 1c.
>
> I submitted an updated version -01 of draft-bormann-jsonpath-iregexp.
> As in -00, I’m using W3C XSD RE as a base, as these are actual regular
> expressions, amenable to implementation techniques that are less
> susceptible to DoS problems than the Perl/PCRE/ECMAscript dialect.
>
> Apart from character class subtraction and the exact semantics of
> Multi-Character escapes (\s \d \w etc., outside and inside of character
> class expressions), W3C XSD RE are pretty much a consensus subset of the
> various regular expression dialects, except that they are in the form of
> matching expressions (no anchors needed) instead of parsing expressions.
>
> I believe the spec should have conversion instructions for implementers
> that just want to use the regexp engine they happen to have handy. Because
> of the consensus subset nature of W3C XSD RE, these instructions are
> relatively straightforward (copy, and surround by the anchors the target
> flavor happens to use).
>
> I tried to add conversion instructions for Multi-Character escapes (\s \d
> \w and \S \D \W, leaving out the \c and \i that nobody except W3C has).
> As you can see when looking at the diff, the result is not pretty when it
> comes to double negation in character classes. Maybe a pathological case,
> but a bit of a trap, and maybe not that useful anyway as the W3C semantics
> is quite different from the PCRE ones.
> So I’m leaning towards a -02 that does not have Multi-Character escapes.
> (None of the regexps I found in RFCs uses them, and not having them also
> happens to be pretty much what the json-schema.org drafts seem to
> converge at.)
>
> Grüße, Carsten
>
> Status: https://datatracker.ietf.org/doc/draft-bormann-jsonpath-iregexp/
> Html:
> https://www.ietf.org/archive/id/draft-bormann-jsonpath-iregexp-01.html
> Diff:
> https://www.ietf.org/rfcdiff?url2=draft-bormann-jsonpath-iregexp-01
>
> --
> JSONpath mailing list
> JSONpath@ietf.org
> https://www.ietf.org/mailman/listinfo/jsonpath
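
For readers unfamiliar with the "copy, and surround by the anchors" conversion Carsten describes, here is a minimal sketch of what it might look like when the target engine is Python's `re` module. This is an illustration only, not text from the draft: the helper name `to_search_form` is hypothetical, and it assumes the input expression uses only syntax that Python's `re` interprets the same way (no character class subtraction, no Multi-Character escapes).

```python
import re


def to_search_form(matching_re: str) -> str:
    """Wrap a matching-style expression (implicitly anchored at both ends)
    so that a searching engine must match the whole string.

    Hypothetical helper for illustration. The non-capturing group keeps a
    top-level alternation such as "ab|cd" from binding only partly to the
    anchors; \\A and \\Z are Python's start-of-string and end-of-string
    anchors.
    """
    return r"\A(?:" + matching_re + r")\Z"


anchored = re.compile(to_search_form("ab|cd"))

# The anchored form accepts only whole-string matches.
print(bool(anchored.search("ab")))
print(bool(anchored.search("xabx")))

# The unanchored original, used with a searching engine, also finds
# matches in the middle of the string.
print(bool(re.search("ab|cd", "xabx")))
```

The grouping is the one step that is easy to overlook: writing `\Aab|cd\Z` without it anchors `ab` only at the start and `cd` only at the end, so a string like "abx" would be accepted.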
- [Jsonpath] Draft minutes, consensus points, and a… James
- [Jsonpath] #70 Regexps (was: Re: Draft minutes, c… Carsten Bormann
- Re: [Jsonpath] #70 Regexps (was: Re: Draft minute… Tim Bray
- Re: [Jsonpath] #70 Regexps (was: Re: Draft minute… Carsten Bormann
- Re: [Jsonpath] Draft minutes, consensus points, a… Carsten Bormann
- Re: [Jsonpath] Draft minutes, consensus points, a… Glyn Normington
- Re: [Jsonpath] Draft minutes, consensus points, a… Glyn Normington