Re: [secdir] [Jsonpath] Secdir last call review of draft-ietf-jsonpath-iregexp-06
"Martin J. Dürst" <duerst@it.aoyama.ac.jp> Tue, 16 May 2023 08:20 UTC
Message-ID: <f69ab710-34eb-1697-be29-cc0298c88da8@it.aoyama.ac.jp>
Date: Tue, 16 May 2023 17:20:12 +0900
To: Tim Bray <tbray@textuality.com>, Mike Ounsworth <mike.ounsworth@entrust.com>
Cc: secdir@ietf.org, draft-ietf-jsonpath-iregexp.all@ietf.org, jsonpath@ietf.org, last-call@ietf.org
References: <168416383998.50512.953102690552943438@ietfa.amsl.com> <CAHBU6iuKKp3g_HbhgaZT8CcStQBKoaHOcdf9ogku=bftYt5wgA@mail.gmail.com>
From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Organization: Aoyama Gakuin University
In-Reply-To: <CAHBU6iuKKp3g_HbhgaZT8CcStQBKoaHOcdf9ogku=bftYt5wgA@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/secdir/Jkx0TTptY70keHAzYQl90HCnHCY>

Hello Tim, Mike, others,

Many thanks to Mike for bringing up this issue. I have the following comments:

1) The syntax (and what's more important, but completely missing in the text, the semantics) is equivalent to theoretical regular expressions, i.e. regular expressions that define regular languages (Chomsky Type 3). They can therefore be implemented to run in time linear to the length of the string input, and if so implemented, they don't offer any possibilities for ReDoS (but see (4) below).

2) I-Regexps will be executed on all kinds of different regexp implementations, as explicitly envisioned by the draft. Some of these implementations use backtracking because they need it for more complex features, and may (or let's say will) exhibit quadratic, cubic, or exponential behavior on some I-Regexp inputs. The examples of nested quantifiers given by Mike are a typical case.

3) Even for the same programming language/regular expression engine, different versions may perform differently. As an example, Ruby just recently implemented some changes to its regexp engine, implementing a cache that makes a lot of cases linear in time and uses space proportional to the product of the length of the regexp and the input (see https://bugs.ruby-lang.org/issues/19104). In addition, we introduced a predicate to check whether a regexp can be run in linear time (https://bugs.ruby-lang.org/issues/19194) and a feature to set a timeout (https://bugs.ruby-lang.org/issues/17837). So a regexp that might blow up on Ruby versions of 3.1 and lower may behave nicely on Ruby 3.2 and higher.

4) Repetitions with lower and upper bounds are a bit of a special case. In theory, α{m,n} (where α is any suitable subregexp) can always be expanded to m times α followed by (n-m) times α?. As an example, a{3,7} would expand to aaaa?a?a?a?, which can be implemented efficiently (e.g. using conversion from a nondeterministic finite automaton (NFA) to a deterministic finite automaton (DFA)).
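
To make the expansion in (4) concrete, here is a minimal Ruby sketch. The helper name expand_range is hypothetical (it is not part of Ruby or of the draft), and wrapping multi-character atoms in plain parentheses is an assumption made for illustration. It reproduces the a{3,7} example and shows how the expanded pattern grows with the upper bound.

```ruby
# Expand atom{min,max} into `min` copies of the atom followed by
# (max - min) optional copies. `expand_range` is a hypothetical helper
# used only for illustration.
def expand_range(atom, min, max)
  raise ArgumentError, "need 0 <= min <= max" unless min >= 0 && min <= max

  # Group multi-character atoms so that "?" applies to the whole atom.
  unit = atom.length == 1 ? atom : "(#{atom})"
  (unit * min) + ("#{unit}?" * (max - min))
end

puts expand_range("a", 3, 7)               # aaaa?a?a?a?
puts expand_range("a", 10, 1_000).length   # 1990
puts expand_range("a", 10, 10_000).length  # 19990
```

The last two lines show the expanded pattern growing roughly tenfold when one zero is added to the upper bound.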

However, it's easy to spoof this with e.g. a regular expression of the form a{10,1000000}. And if 1,000,000 isn't enough, it's easy for the attacker to add a few zeros; they may get a 10-fold memory increase for each zero they add. It is possible to avoid this memory increase by using some kind of lazy construction, which would only be triggered on lengthy inputs, and be linear in the length of the input. Nested quantifier ranges will need special care.

[It could be that "Programming Techniques: Regular expression search algorithm" (Ken Thompson, https://dl.acm.org/doi/10.1145/363347.363387) includes such a technique; unfortunately, I have not yet been able to understand that paper because I don't understand the assembly code it uses :-(.]

Ruby produces a "too big number for repeat range" error for ranges greater than 100000. I have no idea how the number 100000 was found. Ruby 3.2 also returns false for Regexp.linear_time?(/(a?){20,200}{20,200}/) (and runs a veeery long time on some specific input), which I think is a limitation in the implementation, not in principle. Many engines simply treat stacked range quantifiers as errors (see https://regex101.com).

On 2023-05-16 00:52, Tim Bray wrote:

> I was reading your note and agreeing that yes, it remains possible to
> devise regexps that are going to cause combinatorial nasties in almost any
> conceivable implementation, but unconvinced about the conclusion that it is
> "still not advisable to run arbitrary user-provided regular expressions on
> your hardware", because it seems to me that the only way to find out if the
> regexp is evil is to run it.

Not exactly. For I-Regexps, it's only the implementation that is relevant; all I-Regexps *can* be implemented safely. Even for much wider classes of regexps, most of them can be very quickly classified into safe or potentially dangerous, with rather simple heuristics.
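
As an illustration of what such simple heuristics might look like, the following Ruby sketch flags two of the cases discussed above: range quantifiers with very large bounds, and stacked range quantifiers. The method name, the max_repeat parameter and the rough tokenization are assumptions for this example; the default of 100000 merely mirrors the Ruby limit mentioned earlier. It is not a complete classifier (for instance, it does not look at nested * or + quantifiers).

```ruby
require "strscan"

# Hypothetical pre-check: returns a list of complaints about range
# quantifiers in an I-Regexp source string, without running the regexp.
def range_quantifier_problems(source, max_repeat: 100_000)
  problems = []
  scanner = StringScanner.new(source)
  after_range = false # true when the previous token was a range quantifier

  until scanner.eos?
    if scanner.scan(/\\./m)                        # escaped character
      after_range = false
    elsif scanner.scan(/\[(?:\\.|[^\]\\])*\]/m)    # character class, skipped whole
      after_range = false
    elsif scanner.scan(/\{(\d+)(?:,(\d*))?\}/)     # {n}, {n,} or {n,m}
      bounds = [scanner[1], scanner[2]].compact.reject(&:empty?).map(&:to_i)
      problems << "range too large: #{scanner.matched}" if bounds.max > max_repeat
      problems << "stacked range quantifier: #{scanner.matched}" if after_range
      after_range = true
    else
      scanner.getch                                # any other single character
      after_range = false
    end
  end
  problems
end

p range_quantifier_problems("a{2,4}")         # []
p range_quantifier_problems("a{2,4}{2,4}")    # ["stacked range quantifier: {2,4}"]
p range_quantifier_problems("a{10,1000000}")  # ["range too large: {10,1000000}"]
```

A check like this only inspects the regexp text, so it can be run before handing the regexp to whatever engine is in use.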

So in conclusion, please make sure that the security considerations mention all of the following:

- I-Regexps may be implemented to run in linear time in the input, but this requires really great care.
- Existing regexp engines may be used to run I-Regexps, but may run in quadratic, cubic, or exponential time of the input length for some I-Regexps.
- Existing regexp engines should be able to easily handle most I-Regexps (after the adjustments discussed in Section 5), but may reject some types of syntactically legal I-Regexps because they cannot guarantee execution in reasonable time (a typical example might be a{2,4}{2,4}).
- Range quantifiers (example: {2,4}) provide specific challenges for implementation and may therefore be limited in composability (see previous example) or range (e.g. disallowing very large ranges such as {20,200000}). Such large ranges may also allow specific attacks on some implementations.
- Different versions of the same regexp library may be more or less vulnerable to some attacks. In particular, newer libraries may behave better or may offer additional features to check regular expressions or guard against high resource consumption.

Regards, Martin.

P.S.: The syntax currently has

>>>>
quantifier = ( %x2A-2B ; '*'-'+'
 / "?" ) / ( "{" quantity "}" )
>>>>

The doubly nested alternatives and the extremely short range provide more confusion than clarity to the reader. Also, there's absolutely no need to use hex for * and + (see the char-val production in RFC 5234). So please rewrite this as:

>>>>
quantifier = "*" / "+" / "?" / ( "{" quantity "}" )
>>>>

Because we want to refer to it in the security discussion, it may also be a good idea to introduce a rule name for "{" quantity "}":

>>>>
quantifier = "*" / "+" / "?" / range-quantifier
range-quantifier = "{" quantity "}"
>>>>

P.P.S.: For additional reading, I suggest starting at https://levelup.gitconnected.com/the-regular-expression-denial-of-service-redos-cheat-sheet-a78d0ed7d865.
All of James Davis' work on ReDoS is well worth reading.

> But I think your closing paragraph provides a solution.
>
> On Mon, May 15, 2023 at 8:17 AM Mike Ounsworth via Datatracker <
> noreply@ietf.org> wrote:
>
> …
>
>> I wonder if this
>> document could recommend that implementations include some sort of
>> configurable
>> limit on nesting level or on recursion / backtracking depth.
>
>
> That sounds like a good direction, but pretty complex. A simpler option
> would be that implementations impose a limit on time and/or memory costs
> and error out when those are breached. Do you think that a recommendation
> along those lines would address your concerns?