lua-l archive

Leo Razoumov wrote:
> I would presume you really meant something like this
> 
> function strsplit(s, delim)
>     local t= {}
>     for x in string.gmatch(s, string.format("[^%s]+", delim)) do
>          t[#t+1]= x
>     end
>     return t
> end
> 
> In any case it is NOT equivalent to a split operation (Ruby, 
> Perl) that splits on a matching pattern. To see the 
> difference, choose a delimiter to be a string of several 
> characters like delim="SEP". The
> strsplit() implementation above will skip all occurrences of 
> letters S,E,P anywhere in the string which is very different 
> from splitting on a single string "SEP".
> 
> It would be great if someone on this list could show a short 
> and convenient idiom for  "split" operation on strings.
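
To make the difference described above concrete, here is a small check of
the quoted strsplit() with delim="SEP". The sample string is made up for
illustration; it contains a lone "E" and a lone "P" inside the fields. This
is only a sketch of the behaviour, not a proposed fix.

```lua
-- The quoted strsplit, repeated so this check is self-contained.
local function strsplit(s, delim)
    local t = {}
    for x in string.gmatch(s, string.format("[^%s]+", delim)) do
        t[#t+1] = x
    end
    return t
end

-- "[^SEP]+" matches runs of characters other than S, E or P, so the
-- lone E in "bEta" and the lone P in "Pie" are treated as delimiters too.
-- (The sample string is an assumption made up for this example.)
local parts = strsplit("appleSEPbEtaSEPPie", "SEP")
print(table.concat(parts, "|"))  --> apple|b|ta|ie
```

A split on the literal string "SEP" would give "apple", "bEta" and "Pie"
instead.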

Here is a version using a light wrapper around string.gmatch. Because of
the way gmatch works (I think that was the reason given on this list
when the issue was raised), the last token is only returned if the
string ends with a delimiter, so the wrapper appends one to the string
to split. But if the delimiter is a pattern containing magic characters,
you also have to provide a literal string that the pattern matches (term
in the example below, for terminator), which is appended instead.
Tested with stock Lua 5.1:

local str1 = "foo bar baf"
local str2 = "foo<space type='a'>bar<space type='b'>baf"

function split(str, delim, term)
    if not term then term = delim end
    -- :TODO: Detect if delim needs a different term, and build it
    -- depending on delim content.
    return string.gmatch(str..term, "(.-)"..delim)
end

for token in split(str1, " ") do
    print(token)
end

for token in split(str2, "<space[^>]*>", "<space>") do
    print(token)
end
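
For the common case where the delimiter is meant as a literal string rather
than a pattern, one possible way to avoid passing a separate term is to
escape the delimiter before building the pattern. The sketch below assumes
that case only; escape_pattern and split_plain are hypothetical names, not
part of the standard library, and an empty delimiter is not handled.

```lua
-- Hypothetical helper: put '%' in front of every non-alphanumeric
-- character so that Lua patterns match the string literally.
local function escape_pattern(s)
    return (string.gsub(s, "%W", "%%%0"))
end

-- Hypothetical variant of split() for literal delimiters: the raw
-- delimiter is appended as terminator, its escaped form is the pattern.
local function split_plain(str, delim)
    return string.gmatch(str .. delim, "(.-)" .. escape_pattern(delim))
end

-- Collect the tokens of an iterator into a table, for printing.
local function collect(iter)
    local t = {}
    for token in iter do
        t[#t + 1] = token
    end
    return t
end

-- ".b+" contains the magic characters '.' and '+'.
local magic = collect(split_plain("a.b+c.b+d", ".b+"))
print(table.concat(magic, "|"))  --> a|c|d

-- A multi-character delimiter is matched as a whole string.
local sep = collect(split_plain("appleSEPbEtaSEPPie", "SEP"))
print(table.concat(sep, "|"))    --> apple|bEta|Pie
```

Delimiters that really are patterns, like "<space[^>]*>" above, still need
an explicit term, since no literal terminator can be derived from them by
escaping.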