David Given wrote:

I'm generating some rather baroque patterns.  For example,
doing a search for "Fnord" from the menu actually starts
applying the pattern "[Ff]%c*[Nn]%c*[Oo]%c*[Rr]%c*[Dd]" to
the string data.  (Style information is stored in the
string as control codes, hence the %cs.)

This is not that bad.  Lua does basically the following:

- Check for a match for the first element of the pattern at
 the beginning of the subject string.
- If one is found, check if that match is followed by a
 match for the next element of the pattern.
- If at any point a match can't be found, give up and try
 again starting with the first element of the pattern and
 the next character of the subject.
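
Roughly, the steps above behave like the loop below.  This is only a
model, not Lua's actual C code: naive_find is a name made up for this
sketch, it assumes the pattern does not already start with "^", and it
leans on the fact that string.find with a "^" pattern only tries the
position given as its third argument.

```lua
-- Simplified model of the search: try the whole pattern at one start
-- position; on failure, move one character forward and try again.
-- (naive_find is a hypothetical helper, not part of the string library.)
local function naive_find(subject, pattern)
  for init = 1, #subject + 1 do
    -- "^" makes string.find test only the position `init`.
    local s, e = string.find(subject, "^" .. pattern, init)
    if s then return s, e end
  end
  return nil
end

local pat = "[Ff]%c*[Nn]%c*[Oo]%c*[Rr]%c*[Dd]"
-- "\1" and "\2" stand in for style control codes.
print(naive_find("aF\1no\2rd", pat))   --> 2  8
```

It should give the same positions as calling string.find(subject, pat)
directly.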

What's important is not how complex the pattern is (there is
no separate step where the whole pattern is parsed), but how
many partial matches are in the subject and how soon Lua can
figure out that they're only partial and give up on them.
(Patterns anchored at the beginning are nice if the subject
string is likely to contain no match, because they can allow
Lua to call off the search very early.)
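
A small illustration of the anchoring point, using a made-up subject
of 1000 filler characters followed by the word.  The unanchored search
has to walk past every "x" before it succeeds; the anchored one only
ever tries position 1 and fails right there.

```lua
local subject = string.rep("x", 1000) .. "Fnord"

-- Unanchored: every start position is tried until one matches.
print(string.find(subject, "[Ff]nord"))    --> 1001  1005

-- Anchored: only position 1 is tried, so the search ends at once.
print(string.find(subject, "^[Ff]nord"))   --> nil
```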

In your example above, the first "%c*" element is only
looked at if a match for "[Ff]" is found; "[Nn]" is only
looked at if a match for "[Ff]%c*" is found; the second
"%c*" is only looked at if a match for "[Ff]%c*[Nn]" is
found; and so on.

Another good thing about your example is that there's no
overlap between the "%c*" elements and the elements that
follow them.  This means you never have a situation where a
star element grabs too much and has to have its potential
match shortened one character at a time, like this:

 string.match("abcdZefghijklmnopqrstwxy", "%a*Z")

The "%a*" grabs the whole subject, then Lua asks itself:

- Is there anything left in the subject?  (No, so back up
 one character.)
- Does 'y' match 'Z'?  (No, so back up one character.)
- Does 'x' match 'Z'?  (No, so back up one character.)
- Does 'w' match 'Z'?  (No, so back up one character.)
- Does 't' match 'Z'?  (No, so back up one character.)

and so on until it finally reaches the "Z".  (How to
optimize this depends on what the pattern is for.)
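
To make the backing-up visible, here is a rough model of a greedy
"%a*" followed by a single literal character.  It is a sketch of the
behaviour described above, not Lua's real implementation, and
greedy_then_literal is a name invented for it.  It counts how many
times the literal is tested before the match succeeds.

```lua
-- Model of "%a*" followed by one literal character, matched at `init`.
-- (greedy_then_literal is a hypothetical helper for illustration.)
local function greedy_then_literal(subject, init, literal)
  local n = 0
  -- Grab as many letters as possible.
  while string.find(subject, "^%a", init + n) do
    n = n + 1
  end
  local tries = 0
  -- Give back one character at a time until the literal matches.
  while n >= 0 do
    tries = tries + 1
    if string.sub(subject, init + n, init + n) == literal then
      return init, init + n, tries
    end
    n = n - 1
  end
  return nil, nil, tries
end

local subject = "abcdZefghijklmnopqrstwxy"
print(greedy_then_literal(subject, 1, "Z"))   --> 1  5  21
print(string.find(subject, "%a*Z"))           --> 1  5
```

The first test is against the empty end of the subject, then 'y', 'x',
'w', 't' and so on, which is why it takes 21 tries to get back to
the "Z".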

--
Aaron
http://arundelo.com/