lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


So is there a simple way of knowing when a pattern will (or might) backtrack?

Vaughan

2009/12/26 David Manura <dm.lua@math2.org>:
> On Fri, Dec 25, 2009 at 6:17 AM, Ico wrote:
>> Recently I've been bitten a few times by very bad performing patterns
>> using string.match()...
>> Still, I was wondering if anybody has ever written a sort of 'best
>> practices' document for writing lua patterns, describing the impact of
>> various constructs on performance...
>
> Resources on regular expression performance also apply here, but Lua
> patterns are a significant simplification of that.
>
> What hurts performance is backtracking, particularly if there are a
> lot of backtracking possibilities and no match exists, in which case
> all backtracking possibilities must be checked before the match can
> return with failure.  Consider this, which can take a couple seconds
> to run:
>
>  ('a'):rep(1000):match'.*.*b'
>
> The first ".*" matches all 1000 letter a's.  The second ".*" matches
> the remaining empty string.  The final "b" fails in matching, which
> thereby triggers backtracking.  The first ".*" can backtrack to all
> N=1000 character locations in the string.  When the first backtracks
> to character position i, the second can backtrack to more or less
> 1000-i different locations.  The overall complexity of this is on the
> order of O(N^2).  If you have three ".*", the complexity would be
> O(N^3).  I would be careful using more than one ".*" or similar
> construct in a pattern.  You might, for example, replace "." with a
> smaller character class.
>