On Thu, Dec 24, 2009 at 8:48 PM, David Manura <dm.lua@math2.org> wrote:
> to misimplement.  I don't generally trust the code in [1] as is.
> Behavior of corner cases (e.g. empty patterns and treatment of empty
> leading and trailing matches) needs to be fully given in the
> specifications and confirmed in test cases.

Here's a first draft of what that split function looks like with a few
sanity checks:

function split(str, pat, limit)
   assert(type(str)=="string","split() first argument must be string")
   if pat == nil then -- default: split on runs of whitespace
      pat = '%s+'
   elseif type(pat) ~= "string" then
      error("split() delimiter must be string")
   elseif pat == "" then -- empty delim not an error
      return {str}
   end
   if limit then -- number of delimiters to split
      assert(type(limit)=="number","split() limit argument must be number")
   end
   local t = {}
   -- capture the shortest run of text that precedes the next delimiter
   local fpat = "(.-)" .. pat
   local last_end = 1 -- where the next search starts
   local find,append = string.find,table.insert
   local s, e, cap = find(str, fpat, 1)
   while s do
      -- skip only the empty field left by a delimiter at the very start
      if s ~= 1 or cap ~= "" then
         append(t,cap)
      end
      last_end = e+1
      s, e, cap = find(str, fpat, last_end)
      if limit and #t == limit then break end
   end
   -- whatever follows the last delimiter used is the final field;
   -- nothing is added when the string ends with a delimiter
   if last_end <= #str then
      cap = str:sub(last_end)
      append(t, cap)
   end
   return t
end

The default separator is whitespace (the pattern '%s+'), and an empty
separator means no splitting is done: the string comes back as a single
field (Python regards an empty separator as an error condition). The
original behaves very badly with an empty separator.
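
A rough illustration of why the empty separator needs that early return.
In the draft above the search pattern is "(.-)" .. pat, so with pat == ""
it matches the empty string at the start position and the next search
would begin at the same place again. The helper name first_match is made
up for this example, and whether the original code in [1] fails in exactly
this way is my assumption.

```lua
-- Hypothetical helper (not part of split): report where the search
-- pattern "(.-)" .. pat first matches in str.
function first_match(str, pat)
   return string.find(str, "(.-)" .. pat, 1)
end

-- Normal delimiter: the match ends at position 4, capturing "one",
-- so the next search can resume at position 5.
print(first_match("one two", " "))

-- Empty delimiter: the match starts at 1 and ends at 0 with an empty
-- capture, so resuming at e + 1 means position 1 again.
print(first_match("one two", ""))
```

Without the pat == "" check, the while loop in the draft would keep
finding that same empty match and never advance.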

split("one=two=three","=",1) => {"one","two=three"}. Again, this follows
the Python convention for the limit arg: it is the maximum number of
splits, not the maximum number of fields.

It's hard to write a split function which meets all our expectations.
This function ignores delimiters at the ends of the string, which is
often what we want:

split(" one two"," ") => {"one","two"}   -- cool
split(",one,two",",") => {"one","two"}   -- not what you'd expect?

It feels like it needs yet another optional parameter, dont_ignore_ends.
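
One way that extra parameter might look, as a sketch only. The name
split_ends and the position of dont_ignore_ends as a fourth argument are
assumptions made for this example, not settled API. It also searches for
the delimiter directly instead of using the "(.-)" prefix, and it stops
when the pattern matches the empty string, which is a change from the
draft above.

```lua
-- Sketch: split() variant with a hypothetical dont_ignore_ends flag.
-- When the flag is true, empty fields at the start and end are kept.
function split_ends(str, pat, limit, dont_ignore_ends)
   assert(type(str) == "string", "first argument must be string")
   if pat == nil then pat = "%s+" end          -- default: whitespace runs
   assert(type(pat) == "string", "delimiter must be string")
   if limit ~= nil then
      assert(type(limit) == "number", "limit argument must be number")
   end
   if pat == "" then return {str} end          -- empty delim: no splitting

   local t = {}
   local pos = 1                               -- start of the current field
   while not (limit and #t >= limit) do
      local s, e = str:find(pat, pos)
      -- Stop when no delimiter is left, or when the pattern matched the
      -- empty string (e < s), which would otherwise loop forever.
      if not s or e < s then break end
      local field = str:sub(pos, s - 1)
      -- Without the flag, skip only the empty field produced by a
      -- delimiter at the very start of the string.
      if dont_ignore_ends or pos > 1 or field ~= "" then
         t[#t + 1] = field
      end
      pos = e + 1
   end
   -- The remainder is the last field; after a trailing delimiter it is
   -- empty and is kept only when the flag is set.
   if dont_ignore_ends or pos <= #str then
      t[#t + 1] = str:sub(pos)
   end
   return t
end
```

split_ends(",one,two",",") => {"one","two"}
split_ends(",one,two",",",nil,true) => {"","one","two"}
split_ends("one,two,",",",nil,true) => {"one","two",""}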

steve d.