Re: Complexity of :matches ?

Michael Haardt <michael@freenet-ag.de> Tue, 12 July 2005 11:26 UTC

Received: from above.proper.com (localhost.vpnc.org [127.0.0.1]) by above.proper.com (8.12.11/8.12.9) with ESMTP id j6CBQZ7G080054; Tue, 12 Jul 2005 04:26:35 -0700 (PDT) (envelope-from owner-ietf-mta-filters@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.11/8.12.9/Submit) id j6CBQZnb080053; Tue, 12 Jul 2005 04:26:35 -0700 (PDT)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-mta-filters@mail.imc.org using -f
Received: from mout1.freenet.de (mout1.freenet.de [194.97.50.132]) by above.proper.com (8.12.11/8.12.9) with ESMTP id j6CBQY5a080039 for <ietf-mta-filters@imc.org>; Tue, 12 Jul 2005 04:26:34 -0700 (PDT) (envelope-from michael@freenet-ag.de)
Received: from [194.97.50.135] (helo=mx2.freenet.de) by mout1.freenet.de with esmtpa (Exim 4.52) id 1DsIuH-0003Oh-Co for ietf-mta-filters@imc.org; Tue, 12 Jul 2005 13:26:33 +0200
Received: from nostromo.freenet-ag.de ([194.97.7.6]) by mx2.freenet.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.52 #3) id 1DsIuH-0005LB-A6 for ietf-mta-filters@imc.org; Tue, 12 Jul 2005 13:26:33 +0200
Received: from michael by nostromo.freenet-ag.de with local (ID michael) (Exim 4.52 #5) id 1DsIuG-0003lF-08 for ietf-mta-filters@imc.org; Tue, 12 Jul 2005 13:26:32 +0200
Date: Tue, 12 Jul 2005 13:26:31 +0200
From: Michael Haardt <michael@freenet-ag.de>
To: ietf-mta-filters@imc.org
Subject: Re: Complexity of :matches ?
Message-ID: <20050712112631.GC14392@nostromo.freenet-ag.de>
References: <E1DsGVb-0003gO-Fg@nostromo.freenet-ag.de> <200507121052.j6CAq1rH038682@lab.smi.sendmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <200507121052.j6CAq1rH038682@lab.smi.sendmail.com>
User-Agent: Mutt/1.5.6i
Sender: owner-ietf-mta-filters@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-mta-filters/mail-archive/>
List-ID: <ietf-mta-filters.imc.org>
List-Unsubscribe: <mailto:ietf-mta-filters-request@imc.org?body=unsubscribe>

On Tue, Jul 12, 2005 at 03:52:01AM -0700, Philip Guenther wrote:
> Given that you can obviously locate a match for a string of 'n' non-star
> characters in O(n*M), isn't the algorithm that Tim described a
> worst-case of O(N*M) where N is the number of characters _other_ than
> star in the pattern?  (Stars are the cheap part, as they can never force
> a backtrack!)

I thought about things by looking at a*b and the string abb.  A non-greedy
match could match a with a, * with nothing, b with b, and then find out it
has to backtrack.  A greedy match with a*b and abc would match a with a,
* with bc, backtrack, match * with b, fail on b and c.  Things got worse
with more stars.  As I said, a naive approach, but I was too focused on
regular expressions and failed to see how I can make use of wildcards
being a subset.

Now that Tim answered it all, it is obvious where exactly the difference
to regular expressions is, why backtracking can be replaced by string
search, that it is not related to greedyness and that in wildcard
expressions, stars are cheap if done right.

Michael