- Subject: patternsize(p), finding the min & max length of substring matched by pattern p
- From: Duncan Cross <duncan.cross@...>
- Date: Sat, 14 Jan 2012 15:08:11 +0000
Hi List,

I recently had a case where I wanted to find the minimum and maximum
number of characters that can be matched by a given Lua string
pattern, as used by string.match() etc.

I couldn't find any existing implementation, so here's mine. It's
pretty simple, but I'd be interested to know if anyone can see a case
where it gives the wrong answer, or has any other critique.

-- patternsize(p)
-- returns two values:
--  * the minimum length of the substring matched by pattern p
--  * either the maximum length, or nil if unbounded
function patternsize(p)
  -- collapse escapes, character sets, capture groups
  p = p:gsub('%%.', '!'):gsub('%[.-%]', '!'):gsub('[%(%)]','')
  -- collapse & count optional characters
  local optional
  p, optional = p:gsub('.%?', '')
  -- collapse & count repeating characters
  local repeaters
  p, repeaters = p:gsub('.([%-%*%+])', function(repchar)
    if (repchar == '+') then
      return '!'
    else
      return ''
    end
  end)
  -- remove anchors, if any
  -- (important to do this after escapes have been collapsed,
  -- in case of a string ending '%$')
  p = p:match('^%^?(.-)%$?$')
  -- return (minimum characters), (maximum characters)
  if (repeaters == 0) then
    return #p, #p + optional
  else
    return #p, nil
  end
end
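
To make the collapsing steps easier to follow, here is a rough trace of the same substitutions on a single pattern. The sample pattern and the names s, min_len and max_len are only for illustration and are not part of patternsize itself; the intermediate strings in the comments are what I would expect at each step.

```lua
-- Trace of the collapsing steps on one sample pattern.
-- The pattern and variable names below are illustrative assumptions.
local p = '^(%d%d)%-?[ab]x$'

-- escapes and character sets become '!', capture parentheses are dropped
local s = p:gsub('%%.', '!'):gsub('%[.-%]', '!'):gsub('[%(%)]', '')
-- s is now '^!!!?!x$'

-- optional items are removed and counted
local optional
s, optional = s:gsub('.%?', '')
-- s is now '^!!!x$', optional is 1

-- repeated items: '+' leaves one placeholder, '-' and '*' leave nothing
local repeaters
s, repeaters = s:gsub('.([%-%*%+])', function(repchar)
  if repchar == '+' then return '!' else return '' end
end)
-- nothing is repeated in this pattern, so repeaters is 0

-- anchors are stripped last
s = s:match('^%^?(.-)%$?$')
-- s is now '!!!x'

local min_len = #s
local max_len = (repeaters == 0) and (#s + optional) or nil
print(min_len, max_len)  -- prints 4 and 5
```

That agrees with counting by hand: two digits, an optional hyphen, one of a/b, then x gives between 4 and 5 characters.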
-Duncan