On 16/11/13 21:22, Liam Devine wrote:


On 16/11/13 20:59, Michael Savage wrote:

Hi lua-l,

( "[a] [b] c" ):match( "%[(.-)%] c" )
-> "a] [b"

From http://www.lua.org/manual/5.1/manual.html#5.4.1 or
http://www.lua.org/manual/5.2/manual.html#6.4.1:

a single character class followed by '-', which also matches 0 or more
repetitions of characters in the class. Unlike '*', these repetition
items will always match the shortest possible sequence;

The shortest possible sequence for the above is "b". Is this a bug?

Mike


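It is not a bug: string.match returns the first match [1], i.e. the one
that starts earliest in the string, and '-' only picks the shortest
sequence from that starting point. If you want to capture the last
sequence in square brackets, you could prefix the pattern with a greedy
match: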
( "[a] [b] c" ):match( ".*%[(.-)%] c" )

[1] http://www.lua.org/manual/5.2/manual.html#pdf-string.match
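
To make the difference concrete, here is a small sketch that runs the original pattern next to the greedy-prefix one on the same string. The third pattern is an extra that is not from this thread: it assumes the bracketed text never contains a "]" and excludes that character from the capture.

```lua
local s = "[a] [b] c"

-- Matching starts at the leftmost position that can succeed, so the
-- capture begins after the first "[" and ".-" grows until "%] c" fits.
print( s:match( "%[(.-)%] c" ) )      --> a] [b

-- The greedy ".*" consumes as much as it can first, so "%[" ends up on
-- the last "[" that still lets the rest of the pattern match.
print( s:match( ".*%[(.-)%] c" ) )    --> b

-- Extra, not from the thread: a capture that cannot cross a "]".
-- Assumes the bracketed text itself never contains "]".
print( s:match( "%[([^%]]*)%] c" ) )  --> b
```

Which of the last two is preferable depends on the data: the ".*" prefix always picks the last bracketed group, while the negated class picks the first group that is directly followed by " c".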

I say "could" because it all depends on whether you know your data format; the more you know, the more specific you can be.
- Single lower case character
- Multiple lower case characters
- Numbers and lower case characters
- Upper and lower case characters
...

> print( ( "[a] [b] c" ):match( "(%l)%] c$" ) )
b
> print( ( "[a] [bddd] c" ):match( "(%l-)%] c$" ) )
bddd
> print( ( "[a] [b1d2c] c" ):match( "([%l%d]-)%] c$" ) )
b1d2c
> print( ( "[a] [KJhhgGGj] c" ):match( "(%a-)%] c$" ) )
KJhhgGGj

As you can see, I also like to anchor the match when possible.
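
A quick sketch of why the anchor matters, using a made-up input string (not from the original question) in which "] c" occurs twice:

```lua
-- Hypothetical input: the suffix "] c" appears in the middle and at the end.
local s = "[a] c [b] c"

-- Without the anchor, the first (leftmost) successful match wins.
print( s:match( "(%l-)%] c" ) )   --> a

-- With "$", the match has to end at the end of the string, so only the
-- last bracketed group qualifies.
print( s:match( "(%l-)%] c$" ) )  --> b
```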

--
Liam