On Mon, Jun 20, 2011 at 11:12 AM, Lorenzo Donati
<lorenzodonatibz@interfree.it> wrote:
> Hi all!
>
> A recent thread about Lua patterns not being regexes induced me to start
> this thread.
>
> I wonder if there are ways to EASILY emulate some advanced regex features,
> usually found in PCRE-like packages, without resorting to LPEG or C bindings
> to external libs.
>
> For example, some uses of regex alternation can be emulated using multiple
> calls to string.match and using logical operators, that is:
>
> -- sort of pseudocode
> local pcre = require 'pcre'
>
> if pcre.match( haystack, "^(?:foo|bar)$" ) then ...
>
> may be rewritten in pure Lua as:
>
> if haystack:match "^foo$" or haystack:match "^bar$" then ...
>
>
> Are there common Lua idioms to emulate advanced regex features as:
>
> - Alternation (I don't know if the example above is universally applicable)
> - Positive/Negative look-ahead/-behind
> - Quantification/capture of complex patterns (as in pcre's "((?:foo)+)").
>
>
> If there were a sort of one-to-one (almost boilerplate) translation between
> those PCRE features and Lua idioms, it would be very useful.
>
> In particular, for my typical use case, I could reuse my knowledge of PCRE
> without resorting to heavy external bindings or LPEG (I'm lazy, I've got
> little time lately and it is still on my TODO list: "learn LPEG" :-)
>
>
> Thanks in advance for any suggestion/pointer or contribution to the discussion
> (if the list finds it worthwhile :-)
>
> Cheers.
> -- Lorenzo
>
>


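I think you'd basically have to get used to using position captures
and custom loops. Unlike in common regex flavours, a leading ^ in a Lua
pattern anchors the match at the init position you pass to string.match
or string.find, not only at the start of the string, and this can be
useful here. For example, instead of the following (which does not work,
because Lua patterns have no alternation):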

  for m in text:gmatch('foo|bar') do
    -- [LOOP BODY]
  end

...you could do (untested):

  local pos = 1
  while pos <= #text do
    -- find the next f or b
    pos = string.match(text, '()[fb]', pos)
    if not pos then
      break
    end
    -- try to match foo or bar at this position
    local m, next_pos = string.match(text, '^(foo)()', pos)
    if not m then
      m, next_pos = string.match(text, '^(bar)()', pos)
    end
    if m then
      pos = next_pos
      -- [LOOP BODY]
    else
      pos = pos + 1
    end
  end
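
The same loop can be wrapped in a reusable iterator. What follows is only
a sketch: gmatch_alt is a made-up name, not a standard function. It assumes
that the alternatives are plain Lua patterns with no captures of their own,
and it skips empty matches so the loop always advances. It also tries every
position and drops the '()[fb]' skip-ahead used above.

```lua
-- Hypothetical helper (not part of Lua's standard library): iterate over
-- successive matches of any of the given Lua patterns, tried in order.
-- Assumption: the patterns contain no captures of their own.
local function gmatch_alt(text, alternatives)
  local pos = 1
  return function()
    while pos <= #text do
      for _, pat in ipairs(alternatives) do
        -- '^' anchors the attempt at pos; '()' captures the end position
        local m, next_pos = string.match(text, '^(' .. pat .. ')()', pos)
        if m and next_pos > pos then
          pos = next_pos
          return m
        end
      end
      -- nothing matched here, so try the next position
      pos = pos + 1
    end
    return nil
  end
end

local found = {}
for m in gmatch_alt('foobarbazfoo', { 'foo', 'bar' }) do
  found[#found + 1] = m
end
print(table.concat(found, ','))  --> foo,bar,foo
```

Alternatives are tried in list order, so an earlier entry wins when two of
them match at the same position, much like leftmost alternation in PCRE.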

(In other words... if you value brevity, bite the LPEG bullet. :)
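
The position-capture trick might also cover the "((?:foo)+)" case from the
original question. Again this is just a sketch: match_repeated is a
hypothetical name, and it assumes the repeated unit is a Lua pattern without
captures that matches at least one character.

```lua
-- Hypothetical helper: emulate PCRE's ((?:unit)+) anchored at pos.
-- Assumption: unit is a Lua pattern with no captures of its own.
-- Returns the matched run and the position just after it, or nil.
local function match_repeated(text, unit, pos)
  local start = pos or 1
  local cur = start
  while true do
    -- '^' anchors at cur; '()' reports where this repetition ended
    local next_pos = string.match(text, '^' .. unit .. '()', cur)
    if not next_pos or next_pos == cur then
      break
    end
    cur = next_pos
  end
  if cur == start then
    return nil  -- not even one repetition matched
  end
  return text:sub(start, cur - 1), cur
end

local run, next_pos = match_repeated('foofoofoobar', 'foo')
print(run)       --> foofoofoo
print(next_pos)  --> 10
```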

-Duncan