• Subject: Re: Is this correct?
• From: Jay Carlson <nop@...>
• Date: Thu, 3 May 2012 15:52:27 -0400

```
On May 3, 2012, at 3:02 PM, Emeka wrote:

> I stumbled on the above while going through Lua manual... I need your help to here.

The Lua manual explains "what", but it does not explain "why". _Programming in Lua_ often does explain "why" and I recommend it.

>      x = string.gsub("4+5 = $return 4+5$", "%$(.-)%$", function (s)
>          end)

Similar to "*", the "-" quantifier means "zero or more, but as few as possible." It is common when iterating over delimited patterns. Let's look at the difference between ".*" and ".-" here:

> sub = "4 + 5, 3 + 4 = $return 4+5$, $return 3+4$"

> s_star ="%$(.*)%$"
> s_minus="%$(.-)%$"

> =string.match(sub, s_star)
return 4+5$, $return 3+4

> =string.match(sub, s_minus)
return 4+5

In the case of single-character delimiters, this can be rewritten as a greedy match of everything except the terminating character:

> s_invclass ="%$([^$]*)%$"

> =string.match(sub, s_invclass)
return 4+5

But this would not work if you wished to match something like

> sub2 = "4 + 5 = ${return 4+5}$"

since you can't invert a sequence--only a character set. Ending a capture with a sequence works fine with .- though:

> s_minus2="%${(.-)}%$"

> =string.match(sub2, s_minus2)
return 4+5

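Because non-ASCII UTF-8 code points are encoded as more than one byte, and Lua patterns treat each byte as one character, a character set cannot exclude such a code point as a unit. Non-greedy matching is therefore necessary for field matches terminating at a non-ASCII code point.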

Jay

```
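
The behavior described in the message can be checked with a short script. This is a minimal sketch, assuming Lua 5.1 or later. The subjects `sub3` and `field` and the names `arrow` and `euro` are illustrative and do not appear in the original message. The multi-byte delimiters are written with decimal escapes so the result does not depend on the source file's encoding.

```lua
-- Sketch of the quantifier behavior discussed above.
local sub = "4 + 5, 3 + 4 = $return 4+5$, $return 3+4$"

-- Greedy ".*" runs to the last "$"; non-greedy ".-" stops at the next one.
local greedy = string.match(sub, "%$(.*)%$")
local lazy   = string.match(sub, "%$(.-)%$")
-- An inverted class is greedy but cannot cross a "$".
local inv    = string.match(sub, "%$([^$]*)%$")

print(greedy)  --> return 4+5$, $return 3+4
print(lazy)    --> return 4+5
print(inv)     --> return 4+5

-- Two-character terminator "}$". sub3 is a hypothetical subject whose
-- body itself contains "}".
local sub3  = "${return {4+5}}$"
local lazy3 = string.match(sub3, "%${(.-)}%$")
-- "[^}]*" stops at the first "}", which is not followed by "$",
-- so the terminator is never reached and the match fails.
local inv3  = string.match(sub3, "%${([^}]*)}%$")

print(lazy3)   --> return {4+5}
print(inv3)    --> nil

-- UTF-8 terminator: a set excludes single bytes, not whole code points.
local arrow = "\226\134\146"   -- U+2192, three bytes in UTF-8
local euro  = "\226\130\172"   -- U+20AC, shares its first byte with the arrow
local field = "5" .. euro .. arrow .. "ok"

local lazy_u = string.match(field, "^(.-)" .. arrow)
-- The set rejects byte 226, so it stops in front of the euro sign.
local inv_u  = string.match(field, "^([^" .. arrow .. "]*)" .. arrow)

print(lazy_u == "5" .. euro)  --> true
print(inv_u)                  --> nil
```

The last two cases are anchored with `^` so that a failed match returns nil instead of a match starting later in the subject.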