I had previously asked whether there was a streamlined way of
performing input/split fields/process loops as in awk or Perl.
Since no one answered, I gather that there are no specific
facilities for this.

I would like to suggest that any future standard library include a
function that provides this, since it's a pretty common requirement.

Here is my current stab at something to do this.  

It defines a function called input that reads one line and stores its
fields in the global f, with f[0] being the entire line and f[1],
f[2], ... being the fields.  It returns 1 on success and nil at end
of input.

It makes use of a function, strsplit, which is a modified version of
the one in Peter's Standard Library found at
http://lua-users.org/wiki/PetersStdLib .  (I had previously been
using a split routine in another person's library but found a problem
with that one when encountering empty fields.)  The only key
difference between this strsplit and that one is that if the delimiter
is "", it does awk-style field splitting.  This is the same as using
a delimiter of "%s+" except that if there is leading whitespace on a
line then the first field is the non-whitespace that follows it
(rather than the empty field before it).  There is a small test
program at the end.  See the last two examples in the comments of
strsplit.
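
To make the difference between the two splitting modes concrete, here
is a minimal sketch. It is not the code from this post: it assumes
Lua 5.1 or later, where the library functions are spelled string.find,
string.sub and string.gmatch, and the names split_on and awk_fields
are hypothetical. Unlike strsplit, split_on has no guard against a
delimiter that matches the empty string, and awk_fields also drops a
trailing empty field, which the strsplit in this post keeps.

```lua
-- Hypothetical helpers for illustration; assumes Lua 5.1 or later.

-- Split on a pattern and keep empty fields (the "%s+" behaviour).
-- delim must not match the empty string, or this loops forever.
local function split_on(delim, text)
  local list, n, pos = {}, 0, 1
  while true do
    local first, last = string.find(text, delim, pos)
    n = n + 1
    if not first then
      list[n] = string.sub(text, pos)      -- remainder after last delimiter
      return list
    end
    list[n] = string.sub(text, pos, first - 1)
    pos = last + 1
  end
end

-- awk-style: the fields are the runs of non-whitespace characters,
-- so leading whitespace never produces an empty first field.
local function awk_fields(text)
  local list, n = {}, 0
  for field in string.gmatch(text, "%S+") do
    n = n + 1
    list[n] = field
  end
  return list
end

local a = split_on("%s+", " x y")   -- {"", "x", "y"}
local b = awk_fields(" x y")        -- {"x", "y"}
print(#a, #b)                       -- 3 fields versus 2
```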

This is close to my first Lua program so I suspect that it could be
improved by someone with more expertise.

Not sure how the decision process works on things like this but can 
we have this or some analogous facility in a future standard library?

-- Split text into a list consisting of the strings in text,
-- separated by strings matching delim (which may be a pattern).
-- If delim is "" then the action is the same as %s+ except that
-- leading whitespace before field 1 is skipped.
-- example: strsplit(",%s*", "Anna, Bob, Charlie,Dolores")
--          gives {"Anna","Bob","Charlie","Dolores"}
-- example: strsplit(""," x y") gives {"x","y"}
-- example: strsplit("%s+"," x y") gives {"", "x","y"}
function strsplit(delim, text)
  local list = {}
  delim = delim or ""
  local pos = 1
  -- if delim matches empty string then it would give an endless loop
  if strfind("", delim, 1) and delim ~= "" then 
    error("delim matches empty string!")
  end
  local first, last
  while 1 do
    if delim ~= "" then
      first, last = strfind(text, delim, pos)
    else
      -- awk style: split on whitespace, skipping any leading whitespace
      first, last = strfind(text, "%s+", pos)
      if first == 1 then
        pos = last+1
        first, last = strfind(text, "%s+", pos)
      end
    end
    if first then -- found?
      tinsert(list, strsub(text, pos, first-1))
      pos = last+1
    else
      tinsert(list, strsub(text, pos))
      break
    end
  end
  return list
end

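-- Read one line and split it into the global table f:
-- f[0] is the whole line, f[1], f[2], ... are the fields.
-- Returns 1 on success, nil at end of input.
function input(fs)
  fs = fs or ""
  local l = read("*l")
  if not l then return nil end
  f = strsplit(fs,l)
  f[0] = l
  return 1
end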

-- test this out
while input() do
  print("<"..f[0]..">")
  print("<"..f[1]..">")
end
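
For readers on later Lua versions, the same read/split/process loop
can be sketched as an iterator that avoids the global f. This is only
an illustration under stated assumptions: it targets Lua 5.1 or later,
the name records is hypothetical, and it supports only awk-style
whitespace splitting. One behavioural difference from input above: on
an empty line f[1] is nil rather than "".

```lua
-- Sketch only; assumes Lua 5.1 or later. `records` is a hypothetical name.
-- Wraps any line iterator (for example io.lines()) and yields one table
-- per line: f[0] is the whole line, f[1], f[2], ... are its fields.
local function records(lines)
  return function()
    local line = lines()
    if line == nil then return nil end     -- end of input stops the loop
    local f, n = { [0] = line }, 0
    for field in string.gmatch(line, "%S+") do
      n = n + 1
      f[n] = field
    end
    return f
  end
end

-- Demo with an in-memory line source instead of standard input.
local sample = { " x y", "a b c" }
local i = 0
local function next_line()
  i = i + 1
  return sample[i]
end

for f in records(next_line) do
  print("<" .. f[0] .. ">")
  print("<" .. (f[1] or "") .. ">")        -- f[1] is nil on an empty line
end
```

To process standard input instead, the loop header would be
`for f in records(io.lines()) do`.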