lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


On Fri, Dec 25, 2009 at 6:17 AM, Ico wrote:
> Recently I've been bitten a few times by very bad performing patterns
> using string.match()...
> Still, I was wondering if anybody has ever written a sort of 'best
> practices' document for writing lua patterns, describing the impact of
> various constructs on performance...

Resources on regular expression performance also apply here, but Lua
patterns are a significant simplification of that.

What hurts performance is backtracking, particularly if there are a
lot of backtracking possibilities and no match exists, in which case
all backtracking possibilities must be checked before the match can
return with failure.  Consider this, which can take a couple seconds
to run:

  ('a'):rep(1000):match'.*.*b'

The first ".*" matches all 1000 letter a's.  The second ".*" matches
the remaining empty string.  The final "b" fails in matching, which
thereby triggers backtracking.  The first ".*" can backtrack to all
N=1000 character locations in the string.  When the first backtracks
to character position i, the second can backtrack to more or less
1000-i different locations.  The overall complexity of this is on the
order of O(N^2).  If you have three ".*", the complexity would be
O(N^3).  I would be careful using more than one ".*" or similar
construct in a pattern.  You might, for example, replace "." with a
smaller character class.