Re: Complexity of :matches ?

Michael Haardt <michael@freenet-ag.de> Tue, 12 July 2005 12:13 UTC

Received: from above.proper.com (localhost.vpnc.org [127.0.0.1]) by above.proper.com (8.12.11/8.12.9) with ESMTP id j6CCDo3s097533; Tue, 12 Jul 2005 05:13:50 -0700 (PDT) (envelope-from owner-ietf-mta-filters@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.11/8.12.9/Submit) id j6CCDoWi097532; Tue, 12 Jul 2005 05:13:50 -0700 (PDT)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-mta-filters@mail.imc.org using -f
Received: from mout1.freenet.de (mout1.freenet.de [194.97.50.132]) by above.proper.com (8.12.11/8.12.9) with ESMTP id j6CCDnrQ097517 for <ietf-mta-filters@imc.org>; Tue, 12 Jul 2005 05:13:50 -0700 (PDT) (envelope-from michael@freenet-ag.de)
Received: from [194.97.50.136] (helo=mx3.freenet.de) by mout1.freenet.de with esmtpa (Exim 4.52) id 1DsJe0-0005b0-9s for ietf-mta-filters@imc.org; Tue, 12 Jul 2005 14:13:48 +0200
Received: from nostromo.freenet-ag.de ([194.97.7.6]) by mx3.freenet.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.52 #3) id 1DsJdz-00082R-Uh for ietf-mta-filters@imc.org; Tue, 12 Jul 2005 14:13:47 +0200
Received: from michael by nostromo.freenet-ag.de with local (ID michael) (Exim 4.52 #5) id 1DsJdx-0003nA-Fo for ietf-mta-filters@imc.org; Tue, 12 Jul 2005 14:13:45 +0200
To: ietf-mta-filters@imc.org
Subject: Re: Complexity of :matches ?
Message-Id: <E1DsJdx-0003nA-Fo@nostromo.freenet-ag.de>
From: Michael Haardt <michael@freenet-ag.de>
Date: Tue, 12 Jul 2005 14:13:45 +0200
Sender: owner-ietf-mta-filters@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-mta-filters/mail-archive/>
List-ID: <ietf-mta-filters.imc.org>
List-Unsubscribe: <mailto:ietf-mta-filters-request@imc.org?body=unsubscribe>

> In particular, counted repetitions ({n,m}) have to be expanded when
> figuring the 'N' of the size of the regexp, while back-references
> (\digit) are, IIRC, worst-case exponential to match.

That's because back-references leave the class of regular languages,
but I am surprised right now that the manual does not say so.

This egrep expression shows that is an entertaining way:

^(aa+)\1+$

It does not match if the length of the string of a characters is prime,
because the group is a prime factor.  A DFA can not do that, because it
contains only a finite number of states, thus being unable to compute
with natural numbers.  I guess I should think less about regular
expressions. :)

Michael