Lua Sorting

lua-users home
wiki

The table library provides an in-place sort function, based on the quicksort algorithm [wikipedia]. However, it is possible to write sort in pure Lua without much penalty, as given here.

The algorithm used is Shell sort (named after its inventor, Donald Shell) [wikipedia], and the gap sequence comes from Robert Sedgewick (see [A036569] in [the on-line encylopedia of integer sequences] for a reference to Sedgewick's paper). I used Shell sort rather than quicksort because it's usually about as fast, and the code is much simpler. The gap sequence should be adequate up to at least 10 million elements.

Also see LazySort for another perspective on sorting in Lua.

For efficiency, there are three versions of the sorter; the top-level function selects one of them as appropriate. There are specialized versions for the < operator and the > operator, and a general implementation which takes any comparison function as with table.sort. Note that, as with table.sort the comparison function should return false if its parameters are equal, although in the case of Shell sort it is not as critical.

Sample timings are found below.

-- shellsort.lua
-- Written by Rici Lake. The author disclaims all copyright and offers no warranty.
--
-- This module returns a single function (not a table) whose interface is upwards-
-- compatible with the interface to table.sort:
--
-- array = shellsort(array, before, n)
-- array is an array of comparable elements to be sorted in place
-- before is a function of two arguments which returns true if its first argument
--    should be before the second argument in the second result. It must define
--    a total order on the elements of array.
--      Alternatively, before can be one of the strings "<" or ">", in which case
--    the comparison will be done with the indicated operator.
--    If before is omitted, the default value is "<"
-- n is the number of elements in the array. If it is omitted, #array will be used.
-- For convenience, shellsort returns its first argument.

do
  local incs = { 1391376,
                 463792, 198768, 86961, 33936,
                 13776, 4592, 1968, 861, 336, 
                 112, 48, 21, 7, 3, 1 }

  local function ssup(t, n)
    for _, h in ipairs(incs) do
      for i = h + 1, n do
        local v = t[i]
        for j = i - h, 1, -h do
          local testval = t[j]
          if not (v < testval) then break end
          t[i] = testval; i = j
        end
        t[i] = v
      end 
    end
    return t
  end

  local function ssdown(t, n)
    for _, h in ipairs(incs) do
      for i = h + 1, n do
        local v = t[i]
        for j = i - h, 1, -h do
          local testval = t[j]
          if not (v > testval) then break end
          t[i] = testval; i = j
        end
        t[i] = v
      end 
    end
    return t
  end

  local function ssgeneral(t, n, before)
    for _, h in ipairs(incs) do
      for i = h + 1, n do
        local v = t[i]
        for j = i - h, 1, -h do
          local testval = t[j]
          if not before(v, testval) then break end
          t[i] = testval; i = j
        end
        t[i] = v
      end 
    end
    return t
  end

  function shellsort(t, before, n)
    n = n or #t
    if not before or before == "<" then return ssup(t, n)
    elseif before == ">" then return ssdown(t, n)
    else return ssgeneral(t, n, before)
    end
  end
  return shellsort
end

A simple test (and not very good benchmark) harness:

local function gt(a, b) return a > b end
local function lt(a, b) return a < b end

local function builtinsort(t, before)
  if not before or before == "<" then table.sort(t)
  elseif before == ">" then table.sort(t, gt)
  else table.sort(t, before)
  end
  return t
end

local algo, sort = "Shell sort", shellsort
if not tonumber(arg[1]) then
  if arg[1] == "builtin" then
    algo, sort = "table.sort", builtinsort
  elseif arg[1] == "shell" then
    algo, sort = "Shell sort", require"shellsort"
  else error"Only shell and builtin are implemented"
  end
  table.remove(arg, 1)
end

local a = {}
local range = 100
for i = 1, tonumber(arg[1]) or 10000 do a[i] = math.random(1, range) end
local before = arg[2] and
        ( arg[2]:match"^[<>]$"
          or arg[2] and assert(loadstring("return function(a, b) return "..arg[2].." end"))()
        )
      or "<"

local now = os.clock()
sort(a, before)
local took = os.clock() - now
io.write(("%-12s with %i values: %7.3f seconds, comparison: %s"):format(algo, #a, took, arg[2] or "<"))

-- Validate
before = ({["<"] = lt, [">"] = gt})[before] or before
for i = 1, #a-1 do
  if before(a[i+1], a[i]) then print(("\t****Failed at %i\n"):format(i)); return end
end
print"\tOrder checked"

The results show that shellsort is competitive with table.sort despite being pure Lua; in case where table.sort has an optimized implementation (less than comparison), shellsort is about 75% slower, but it is faster in the case where it has an optimized implementation (greater than comparison) and roughly the same on a sort where the comparison function dominates the timing:

# (luafast is compiled with aligned doubles):

rlake@freeb:~/lualib$ for count in 1e5 2e5 5e5 1e6; do
>  for comp in "<" ">" "(a-50)^2<(b-50)^2"; do echo
>    for algo in shell builtin; do
>      luafast testsort.lua $algo $count $comp
>    done           
>  done     
>done

Shell sort   with 100000 values:   0.352 seconds, comparison: < Order checked
table.sort   with 100000 values:   0.195 seconds, comparison: < Order checked

Shell sort   with 100000 values:   0.352 seconds, comparison: > Order checked
table.sort   with 100000 values:   0.430 seconds, comparison: > Order checked

Shell sort   with 100000 values:   0.914 seconds, comparison: (a-50)^2<(b-50)^2 Order checked
table.sort   with 100000 values:   0.805 seconds, comparison: (a-50)^2<(b-50)^2 Order checked

Shell sort   with 200000 values:   0.734 seconds, comparison: < Order checked
table.sort   with 200000 values:   0.414 seconds, comparison: < Order checked

Shell sort   with 200000 values:   0.758 seconds, comparison: > Order checked
table.sort   with 200000 values:   0.906 seconds, comparison: > Order checked

Shell sort   with 200000 values:   1.891 seconds, comparison: (a-50)^2<(b-50)^2 Order checked
table.sort   with 200000 values:   1.758 seconds, comparison: (a-50)^2<(b-50)^2 Order checked

Shell sort   with 500000 values:   1.961 seconds, comparison: < Order checked
table.sort   with 500000 values:   1.062 seconds, comparison: < Order checked

Shell sort   with 500000 values:   1.938 seconds, comparison: > Order checked
table.sort   with 500000 values:   2.352 seconds, comparison: > Order checked

Shell sort   with 500000 values:   5.031 seconds, comparison: (a-50)^2<(b-50)^2 Order checked
table.sort   with 500000 values:   5.000 seconds, comparison: (a-50)^2<(b-50)^2 Order checked

Shell sort   with 1000000 values:   4.320 seconds, comparison: <        Order checked
table.sort   with 1000000 values:   2.391 seconds, comparison: <        Order checked

Shell sort   with 1000000 values:   4.500 seconds, comparison: >        Order checked
table.sort   with 1000000 values:   6.062 seconds, comparison: >        Order checked

Shell sort   with 1000000 values:  12.305 seconds, comparison: (a-50)^2<(b-50)^2        Order checked
table.sort   with 1000000 values:  12.359 seconds, comparison: (a-50)^2<(b-50)^2        Order checked

Comments

Here's a metaprogramming implementation: --DavidManura
do
  local incs = { 1391376,
                 463792, 198768, 86961, 33936,
                 13776, 4592, 1968, 861, 336, 
                 112, 48, 21, 7, 3, 1 }

  local function make_sorter(compare)
    local src = [[
      local incs = ...
        return function(t, n, before)
        for _, h in ipairs(incs) do
          for i = h + 1, n do
            local a = t[i]
            for j = i - h, 1, -h do
              local b = t[j]
              if not (]] .. compare .. [[) then break end
              t[i] = b; i = j
            end
            t[i] = a
          end 
        end
        return t
      end
    ]]
    return assert(loadstring(src))(incs)
  end

  local sorters = {}
  local aliases = {["<"] = "a<b", [">"] = "a>b"}

  function shellsort(t, before, n)
    before = aliases[before] or before or "a<b"
    n = n or #t
    local beforesrc = type(before) == "function" and "before(a,b)" or before
    local sorter = sorters[beforesrc]
    if not sorter then
      sorter = make_sorter(beforesrc)
      sorters[beforesrc] = sorter
    end
    return sorter(t, n, before)
  end

  return shellsort
end

RecentChanges · preferences
edit · history
Last edited April 19, 2012 2:26 am GMT (diff)