lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]



This may be tricky on regular(-ish) expressions like "^%(*%b()$", given input like "(((((((((((((((((((((()))())".
This can definitely be done using a sub-exponential algorithm (e.g. generate an ambiguous context free grammar from the _expression_, and use Early's parser to get O(n^3) performance).
I don't yet see how to do this in linear time by bolting a recognizer for the "%b()" pattern onto a finite state automaton.

"^%(*%b()%b)($" should be a fun regex too :-) 



On Fri, Mar 31, 2017 at 7:58 AM, Roberto Ierusalimschy <roberto@inf.puc-rio.br> wrote:
> Many regular _expression_ implementations have added features that can't be
> implemented by a finite state automaton, and they end up requiring
> backtracking. Lua's "%bxy" feature is one of them, as are 'back references'
> in the search string: ^(.*)-\1$  (two equal strings separated by a hyphen)

Small correction: "%bxy" certainly cannot be implemented by a finite
state automaton, but it does not require backtracking. It is
easy to do an efficient (linear) implementation of "%bxy" and integrate
it with a finite state automaton. ('back references' as in Perl, on the
other hand, are NP complete.)

-- Roberto




--
--