Hi List,

I recently had a case where I wanted to find the maximum and minimum
number of characters that can be matched by a given Lua string
pattern, as used by string.match() etc.
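To make the two numbers concrete, here is a rough sketch that measures
how long the matches actually are for a handful of sample subjects,
using string.find. The name observedsizes is hypothetical and is not
part of the function in this post. It only reports what it sees in the
samples it is given, so it illustrates the idea rather than computing
the true bounds.

```lua
-- observedsizes(p, subjects)   (hypothetical helper, for illustration)
-- returns the shortest and longest match of pattern p found among
-- the given subject strings, or nil if none of them matched
function observedsizes(p, subjects)
  local shortest, longest
  for _, s in ipairs(subjects) do
    -- string.find returns the start and end index of the first match
    local first, last = s:find(p)
    if first then
      local len = last - first + 1
      if not shortest or len < shortest then shortest = len end
      if not longest or len > longest then longest = len end
    end
  end
  return shortest, longest
end

local subjects = { '', 'a', 'ab', 'abb', 'abbb' }
print(observedsizes('^ab?$', subjects))  --> 1  2
print(observedsizes('^ab+$', subjects))  --> 2  4
```

For '^ab+$' the longest match keeps growing with the longest sample,
which is the unbounded case.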

I couldn't find any existing implementation, so here's mine. It's
pretty simple, but I'd be interested to know if anyone can see a case
for which it gives the wrong answer, or has any other critique.

  -- patternsize(p)
  -- returns two values:
  -- * the minimum length of the substring matched by pattern p
  -- * either the maximum length, or nil if unbounded
  function patternsize(p)
    -- remove the leading anchor, if any, so that it is not taken
    -- for a character followed by a quantifier (as in '^-')
    p = p:gsub('^%^', '')
    -- collapse escapes, character sets, capture groups
    p = p:gsub('%%.', '!'):gsub('%[.-%]', '!'):gsub('[%(%)]','')
    -- collapse & count optional characters
    local optional
    p, optional = p:gsub('.%?', '')
    -- collapse & count repeating characters
    local repeaters
    p, repeaters = p:gsub('.([%-%*%+])', function(repchar)
      if (repchar == '+') then
        return '!'
      else
        return ''
      end
    end)
    -- remove the trailing anchor, if any
    -- (important to do this after escapes have been collapsed,
    --  in case of a string ending '%$')
    p = p:gsub('%$$', '')
    -- return (minimum characters), (maximum characters)
    if (repeaters == 0) then
      return #p, #p + optional
    else
      return #p, nil
    end
  end
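
To show what the collapsing steps do to a pattern, here is a small
trace that applies the same three gsub passes and returns every
intermediate result. It relies on string.gsub returning the number of
substitutions as its second value. The name tracesteps is hypothetical,
and the sample pattern is just one I picked for illustration.

```lua
-- tracesteps(p)   (hypothetical helper, for illustration)
-- returns the pattern after each collapsing step, with the counts
-- of optional and repeating items found along the way
function tracesteps(p)
  -- escapes and character sets become '!', capture parentheses vanish
  local collapsed = p:gsub('%%.', '!'):gsub('%[.-%]', '!'):gsub('[%(%)]', '')
  -- items followed by '?' are dropped and counted
  local nooptional, optional = collapsed:gsub('.%?', '')
  -- items followed by '+' keep one character, '*' and '-' keep none
  local norepeat, repeaters = nooptional:gsub('.([%-%*%+])', function(repchar)
    return repchar == '+' and '!' or ''
  end)
  return collapsed, nooptional, optional, norepeat, repeaters
end

print(tracesteps('(%a+)%s?[0-9]'))
--> !+!?!   !+!   1   !!   1
```

The final string '!!' has length 2, which is the minimum, and since one
repeater was counted the maximum is unbounded.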

-Duncan