- Subject: Re: Emulating advanced regex features using Lua patterns and pure Lua code
- From: Duncan Cross <duncan.cross@...>
- Date: Mon, 20 Jun 2011 14:07:27 +0100
On Mon, Jun 20, 2011 at 11:12 AM, Lorenzo Donati
<lorenzodonatibz@interfree.it> wrote:
> Hi all!
>
> A recent thread about Lua patterns not being regexes induced me to start
> this thread.
>
> I wonder if there are ways to EASILY emulate some advanced regex features,
> usually found in PCRE-like packages, without resorting to LPEG or C bindings
> to external libs.
>
> For example, some uses of regex alternation can be emulated using multiple
> calls to string.match and using logical operators, that is:
>
> -- sort of pseudocode
> local pcre = require 'pcre'
>
> if pcre.match( haystack, "^(?:foo|bar)$" ) then ...
>
> may be rewritten in pure Lua as:
>
> if haystack:match "^foo$" or haystack:match "^bar$" then ...
>
>
> Are there common Lua idioms to emulate advanced regex features such as:
>
> - Alternation (I don't know if the example above is universally applicable)
> - Positive/Negative look-ahead/-behind
> - Quantification/capture of complex patterns (as in pcre's "((?:foo)+)").
>
>
> If there were a sort of one-to-one (almost boilerplate) translation between
> those PCRE features and Lua idioms, it would be very useful.
>
> In particular, for my typical use case, I could reuse my knowledge of PCRE
> without resorting to heavy external bindings or LPEG (I'm lazy, I've got
> little time lately and it is still on my TODO list: "learn LPEG" :-)
>
>
> Thanks in advance for any suggestion/pointer or contribution to the
> discussion (if the list finds it worthwhile :-)
>
> Cheers.
> -- Lorenzo
>
>
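I think you'd basically have to get used to using position captures
and custom loops. Unlike in most regex flavours, when you pass a start
position (the init argument) to string.match or string.find, ^ anchors
the match at that position rather than at the start of the string, and
this can be useful here. For example, instead of: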
for m in text:gmatch('foo|bar') do
  -- [LOOP BODY]
end
...you could do (untested):
local pos = 1
while pos <= #text do
  -- find the next f or b
  pos = string.match(text, '()[fb]', pos)
  if not pos then
    break
  end
  -- try to match foo or bar at this position
  -- (^ anchors at pos; the trailing () captures the position after the match)
  local m, next_pos = string.match(text, '^(foo)()', pos)
  if not m then
    m, next_pos = string.match(text, '^(bar)()', pos)
  end
  if m then
    pos = next_pos
    -- [LOOP BODY]
  else
    pos = pos + 1
  end
end
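
The same loop could be wrapped in a reusable iterator. What follows is only a minimal sketch: gmatch_alt is a hypothetical helper name, not something from the standard library, and it assumes that none of the patterns passed in starts with its own ^. It tries the alternatives in order at each position, the way a regex alternation would, and uses string.find rather than a position capture so that captures inside an alternative do not disturb the result.

```lua
-- Hypothetical helper: iterate over every match of any of several Lua
-- patterns, scanning left to right and trying the alternatives in order.
-- Assumption: no pattern in `patterns` begins with its own '^'.
local function gmatch_alt(text, patterns)
  local pos = 1
  return function()
    while pos <= #text do
      for _, pat in ipairs(patterns) do
        -- '^' anchors at pos because of the init argument
        local s, e = string.find(text, '^' .. pat, pos)
        if s then
          -- advance by at least one character so an empty match cannot loop forever
          pos = math.max(e + 1, pos + 1)
          return string.sub(text, s, e)
        end
      end
      -- nothing matched here; try the next position
      pos = pos + 1
    end
    return nil
  end
end

-- Usage: prints foo, bar, foo (one per line)
for m in gmatch_alt('xfoo bar foobaz', { 'foo', 'bar' }) do
  print(m)
end
```

Unlike the hand-written loop, this version does not skip ahead to the next f or b first, so it is simpler but does more work per character.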
(In other words... if you value brevity, bite the LPEG bullet. :)
-Duncan