[Date Prev][Date Next][Thread Prev][Thread Next]
- Subject: Re: Pattern matching: good practices ?
- From: Vaughan McAlley <ockegheim@...>
- Date: Sat, 26 Dec 2009 20:39:06 +1100
So is there a simple way of knowing when a pattern will (or might) backtrack?
2009/12/26 David Manura <firstname.lastname@example.org>:
> On Fri, Dec 25, 2009 at 6:17 AM, Ico wrote:
>> Recently I've been bitten a few times by very bad performing patterns
>> using string.match()...
>> Still, I was wondering if anybody has ever written a sort of 'best
>> practices' document for writing lua patterns, describing the impact of
>> various constructs on performance...
> Resources on regular expression performance also apply here, but Lua
> patterns are a significant simplification of that.
> What hurts performance is backtracking, particularly if there are a
> lot of backtracking possibilities and no match exists, in which case
> all backtracking possibilities must be checked before the match can
> return with failure. Consider this, which can take a couple seconds
> to run:
> The first ".*" matches all 1000 letter a's. The second ".*" matches
> the remaining empty string. The final "b" fails in matching, which
> thereby triggers backtracking. The first ".*" can backtrack to all
> N=1000 character locations in the string. When the first backtracks
> to character position i, the second can backtrack to more or less
> 1000-i different locations. The overall complexity of this is on the
> order of O(N^2). If you have three ".*", the complexity would be
> O(N^3). I would be careful using more than one ".*" or similar
> construct in a pattern. You might, for example, replace "." with a
> smaller character class.