[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: Pattern matching: good practices ?
- From: David Manura <dm.lua@...>
- Date: Sat, 26 Dec 2009 02:09:51 -0500
On Fri, Dec 25, 2009 at 6:17 AM, Ico wrote:
> Recently I've been bitten a few times by very bad performing patterns
> using string.match()...
> Still, I was wondering if anybody has ever written a sort of 'best
> practices' document for writing lua patterns, describing the impact of
> various constructs on performance...
Resources on regular expression performance also apply here, but Lua
patterns are a significant simplification of that.
What hurts performance is backtracking, particularly if there are a
lot of backtracking possibilities and no match exists, in which case
all backtracking possibilities must be checked before the match can
return with failure. Consider this, which can take a couple seconds
to run:
('a'):rep(1000):match'.*.*b'
The first ".*" matches all 1000 letter a's. The second ".*" matches
the remaining empty string. The final "b" fails in matching, which
thereby triggers backtracking. The first ".*" can backtrack to all
N=1000 character locations in the string. When the first backtracks
to character position i, the second can backtrack to more or less
1000-i different locations. The overall complexity of this is on the
order of O(N^2). If you have three ".*", the complexity would be
O(N^3). I would be careful using more than one ".*" or similar
construct in a pattern. You might, for example, replace "." with a
smaller character class.