

LuaTeX adds this function to the string library:
...
This function as it stands should not be added to the standard library.
However, an iterator function string.split(s[,m]) such that
string.explode is equivalent to

function(str,delim)
     local t={}
     for k in str:split(delim) do t[#t+1]=k end
     return t
end

maybe with delim not as restricted as in LuaTeX (i.e. normal string
patterns allowed) would be handy in many other applications too.
You can sort of mock it up using existing string functions but it
is surprisingly tricky to get it right.
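
To illustrate why the mock-up is tricky, here is a minimal sketch of the obvious gmatch approach. The helper name naive_split is hypothetical, and the sketch assumes the delimiter contains no characters that are special inside a Lua character class (such as ], ^, % or -).

```lua
-- Naive mock-up: treat the delimiter as a set of characters and
-- collect every run of non-delimiter characters with gmatch.
-- naive_split is a hypothetical name used only for this illustration.
local function naive_split(s, delim_chars)
  local t = {}
  for field in s:gmatch("[^" .. delim_chars .. "]+") do
    t[#t + 1] = field
  end
  return t
end

-- Pitfall 1: the empty field between the two commas is silently dropped.
local a = naive_split("a,b,,c", ",")
print(table.concat(a, "|"))  --> a|b|c

-- Pitfall 2: a multi-character delimiter is treated as a character set,
-- so "::" also splits on the single ":" and "b:c" is broken apart.
local b = naive_split("a::b:c", "::")
print(table.concat(b, "|"))  --> a|b|c
```

Handling multi-character delimiters and full patterns needs an explicit find loop that tracks the search position.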

I've attached two scripts that implement the split() function you describe; both support Lua patterns. string.split returns a table and string.gsplit returns an iterator. I can't guarantee they're bug-free, but they are definitely a good start.

 - Peter
-- This function splits a string on a Lua pattern delimiter
-- and returns the results using an iterator function
-- instead of a table. This can be useful for two reasons:
-- 1) You want to iterate over the results but don't need
-- them in a table and 2) you want to split very large input
-- strings.

local yield = coroutine.yield
function string.gsplit(str, pattern, capture)
 -- Coerce arguments; the default delimiter is a run of whitespace.
 -- (The parameter is named str so it does not shadow the string library.)
 str = str and tostring(str) or ''
 pattern = pattern and tostring(pattern) or '%s+'
 -- Reject delimiters that match the empty string.
 if (''):find(pattern) then
  error('pattern matches empty string!', 2)
 end
 return coroutine.wrap(function()
  local index = 1
  repeat
   local first, last = str:find(pattern, index)
   if first and last then
    -- Yield the field before the delimiter; empty fields are skipped.
    if index < first then yield(str:sub(index, first - 1)) end
    if capture then yield(str:sub(first, last)) end
    index = last + 1
   else
    -- No more delimiters: yield the remainder, if any.
    if index <= #str then yield(str:sub(index)) end
    break
   end
  until index > #str
 end)
end
-- This function splits a string on a Lua pattern delimiter
-- and returns the results in a table. When the optional
-- third argument is true the matched delimiters are also
-- included in the returned table.

function string.split(str, pattern, capture)
 -- Coerce arguments; the default delimiter is a run of whitespace.
 -- (The parameter is named str so it does not shadow the string library.)
 str = str and tostring(str) or ''
 pattern = pattern and tostring(pattern) or '%s+'
 -- Reject delimiters that match the empty string.
 if (''):find(pattern) then
  error('pattern matches empty string!', 2)
 end
 local index = 1
 local matches = {}
 repeat
  local first, last = str:find(pattern, index)
  if first and last then
   -- Store the field before the delimiter; empty fields are skipped.
   if index < first then
    matches[#matches+1] = str:sub(index, first - 1)
   end
   if capture then
    matches[#matches+1] = str:sub(first, last)
   end
   index = last + 1
  else
   -- No more delimiters: store the remainder, if any.
   if index <= #str then
    matches[#matches+1] = str:sub(index)
   end
   break
  end
 until index > #str
 return matches
end
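
For comparison, the iterator can also be written without coroutines. The following is only a sketch under the same splitting rules as the scripts above (empty fields skipped, optional delimiter capture); the name gsplit_closure is hypothetical, and the extra guard against an empty match inside the loop is an addition that the attached scripts do not have.

```lua
-- Closure-based variant of the iterator. State is kept in upvalues
-- instead of a coroutine. gsplit_closure is a hypothetical name.
local function gsplit_closure(str, pattern, capture)
  str = str and tostring(str) or ''
  pattern = pattern and tostring(pattern) or '%s+'
  if (''):find(pattern) then
    error('pattern matches empty string!', 2)
  end
  local index = 1      -- position where the next search starts
  local pending = nil  -- captured delimiter waiting to be returned
  return function()
    if pending then
      local delim = pending
      pending = nil
      return delim
    end
    while index <= #str do
      local first, last = str:find(pattern, index)
      if not first then
        -- No delimiter left: return the remainder and finish.
        local rest = str:sub(index)
        index = #str + 1
        return rest
      end
      -- Added guard (not in the original scripts): an empty match
      -- would never advance the search position.
      if last < first then
        error('pattern matched an empty string')
      end
      local field = index < first and str:sub(index, first - 1) or nil
      local delim = capture and str:sub(first, last) or nil
      index = last + 1
      if field then
        pending = delim  -- returned on the next call, if captured
        return field
      elseif delim then
        return delim
      end
      -- Empty field and no capture: keep scanning.
    end
    return nil
  end
end

-- Collecting the pieces into a table, as in the explode wrapper quoted above.
local t = {}
for piece in gsplit_closure("one, two,three", ",%s*") do
  t[#t + 1] = piece
end
print(table.concat(t, "|"))  --> one|two|three
```

The closure version avoids creating a coroutine per call, at the cost of tracking the pending delimiter by hand.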