List Comprehensions

lua-users home
wiki

List comprehensions [1] provide concise syntax for building lists in mathematical set-builder notation. A number of programming languages (e.g. Haskell and Python) provide built-in support for list comprehensions. Lua does not; however, there are are ways to implement it in Lua.

List Comprehensions in Pure Lua with Code Generation

Unlike some other approaches, the following approach implements list comprehensions in pure Lua as a library (without patching or token filters). It uses a trick with dynamic code generation (loadstring) and caching somewhat analogous to ShortAnonymousFunctions.

This library has been incorporated into [Penlight].

Examples

local comp = require 'comprehension' . new()
comp 'x^2 for x' {2,3} --> {2^2,3^2}
comp 'x^2 for _,x in ipairs(_1)' {2,3} --> {2^2,3^2}
comp 'x^2 for x=_1,_2' (2,3) --> {2^2,3^2}

comp 'sum(x^2 for x)' {2,3} --> 2^2+3^2
comp 'max(x*y for x for y if x<4 if y<6)' ({2,3,4}, {4,5,6}) --> 3*5
comp 'table(v,k for k,v in pairs(_1))' {[3]=5, [5]=7} --> {[5]=3, [7]=5}

Examples/Tests

-- comprehension_test.lua
-- test of comprehension.lua

-- Utility function for the test suite.
-- Returns true/false whether arrays a and b
-- are equal.  This does a deep by-value comparison (supports nested arrays).
local function eqarray(a, b)
  if #a ~= #b then return false end
  for i=1,#a do
    if type(a[i]) == 'table' and type(b[i]) == 'table' then
      if not eqarray(a[i], b[i]) then return false end
    else
      if a[i] ~= b[i] then return false end
    end
  end
  return true
end

local comp = require 'comprehension' . new()

-- test of list building
assert(eqarray(comp 'x for x' {}, {}))
assert(eqarray(comp 'x for x' {2,3}, {2,3}))
assert(eqarray(comp 'x^2 for x' {2,3}, {2^2,3^2}))
assert(eqarray(comp 'x for x if x % 2 == 0' {4,5,6,7}, {4,6}))
assert(eqarray(comp '{x,y} for x for y if x>2 if y>4' ({2,3},{4,5}), {{3,5}}))

-- test of table building
local t = comp 'table(x,x+1 for x)' {3,4}
assert(t[3] == 3+1 and t[4] == 4+1)
local t = comp 'table(x,x+y for x for y)' ({3,4}, {2})
assert(t[3] == 3+2 and t[4] == 4+2)
local t = comp 'table(v,k for k,v in pairs(_1))' {[3]=5, [5]=7}
assert(t[5] == 3 and t[7] == 5)

-- test of sum
assert(comp 'sum(x for x)' {} == 0)
assert(comp 'sum(x for x)' {2,3} == 2+3)
assert(comp 'sum(x^2 for x)' {2,3} == 2^2+3^2)
assert(comp 'sum(x*y for x for y)' ({2,3}, {4,5}) == 2*4+2*5+3*4+3*5)
assert(comp 'sum(x^2 for x if x % 2 == 0)' {4,5,6,7} == 4^2+6^2)
assert(comp 'sum(x*y for x for y if x>2 if y>4)' ({2,3}, {4,5}) == 3*5)

-- test of min/max
assert(comp 'min(x for x)' {3,5,2,4} == 2)
assert(comp 'max(x for x)' {3,5,2,4} == 5)

-- test of placeholder parameters --
assert(comp 'sum(x^_1 + _3 for x if x >= _4)' (2, nil, 3, 4, {3,4,5})
       == 4^2+3 + 5^2+3)

-- test of for =
assert(comp 'sum(x^2 for x=2,3)' () == 2^2+3^2)
assert(comp 'sum(x^2 for x=2,6,1+1)' () == 2^2+4^2+6^2)
assert(comp 'sum(x*y*z for x=1,2 for y=3,3 for z)' {5,6} ==
  1*3*5 + 2*3*5 + 1*3*6 + 2*3*6)
assert(comp 'sum(x*y*z for z for x=1,2 for y=3,3)' {5,6} ==
  1*3*5 + 2*3*5 + 1*3*6 + 2*3*6)

-- test of for in
assert(comp 'sum(i*v for i,v in ipairs(_1))' {2,3} == 1*2+2*3)
assert(comp 'sum(i*v for i,v in _1,_2,_3)' (ipairs{2,3}) == 1*2+2*3)

-- test of difficult syntax
assert(eqarray(comp '" x for x " for x' {2}, {' x for x '}))
assert(eqarray(comp 'x --[=[for x\n\n]=] for x' {2}, {2}))
assert(eqarray(comp '(function() for i = 1,1 do return x*2 end end)() for x'
               {2}, {4}))
assert(comp 'sum(("_5" and x)^_1 --[[_6]] for x)' (2, {4,5}) == 4^2 + 5^2)

-- error checking
assert(({pcall(function() comp 'x for __result' end)})[2]
       :find'not contain __ prefix')

-- environment.
-- Note: generated functions are set to the environment of the 'new' call.
assert(5 == (function()
  local env = {d = 5}
  setfenv(1, env)
  local comp = comp.new()
  return comp 'sum(d for x)' {1}
end)());

print 'DONE'

Run-time behavior

To illustrate the run-time characteristics, consider the following code:

comp 'sum(x^2 for x if x % 2 == 0)'

That gets code generated to this Lua function:

local __in1 = ...
local __result = (  0  )
for __idx1 = 1, #__in1 do
  local x = __in1[__idx1]
  if x % 2 == 0 then
    __result = __result + ( __x^2 )
  end
end
return __result

Note that no intermediate lists are built. The code efficiently avoids memory allocations (apart from the allocation of the function itself, but that is done only on the first invocation due to caching/memoization). Also, no global variables are referenced.

Source

Download from [github].

Note: the implementation uses the module LuaBalanced.

--DavidManura

Status

This module is new and likely still has some bugs.

Possible extensions

A simple extension would be to provide a more mathematical (or more Haskell-like) syntax:

assert(comp 'sum { x^2 | x <- ?, x % 2 == 0 }' {2,3,4} == 2^2+4^2)

A compelling extension, as recommended by Greg Fitzgerald, is to implement the generalized list comprehensions proposed by SPJ and Wadler [2]. This provides some clear directions to take this to the next level, and the related work in Microsoft LINQ [3] shows what this could look like in practice.

The "zip" extension to list comprehensions, using the Haskell-like notation in the paper

[ (x,y,z,w) | (x <- xs | y <- ys), (z <- zs | w <- ws) ] ,

would require only small changes. The corresponding Lua function to generate that would be like this:

local __xs, __ys, __zs, __ws = ...
local __ret = {}   -- i.e. $list_init
for __i=1,__math_min(#__xs, #__ys) do
  local x, y = __xs[__i], __ys[__i]
  for __j=1,__math_min(#__zs, #__ws) do
    local z, w = __zs[__j], __ws[__j]
    __ret[#__ret+1] = {x,y,z,w}   -- i.e. $list_accum(__ret, x, y, z, w)
  end
end
return ret

(The "$" notation here is a short-hand for compile-time macros that were used to expand the source.)

Supporting sort or grouped by, e.g. again using notation in the paper

[ the x.name, sum x.deposit | x <- transactions, group by x.name ] ,

could be achieved by generating functions like this:

local __transactions = ...
local __groups1 = {}
local __groups2 = {}
for __i = 1, #__transactions do
  local x = __transactions[__i]
  local __key = ( x.name )  -- i.e. $group_by_key
  __groups1[__key] = ( x.name )
         -- i.e. $accum_the(__groups1[__key], $val1)
  __groups2[__key] = (__groups2[__key] or 0) + ( x.deposit )
         -- i.e. $accum_sum(__groups2[__key], $val2)
end
local __result = {}   -- i.e. $list_init
for __key in __pairs(__groups1) do
  __result[__result+1] = {__groups1[__key], __groups2[__key]}
         -- i.e. $accum_list(__result, __v)
end
return __result

Taking this to its full fruition seems quite achievable (after some research), though it would be a significant extension to this module (any takers?).

Changes

2009-12-01 - fix __call convenience operator not caching (noted by Joshua Jensen on mail list)

MetaLua

List comprehensions have also been implemented in MetaLua (clist):

LuaMacros

List comprehensions have also been implemented in LuaMacros:

MoonScript

Moonscript has list comprehensions -- http://moonscript.org/reference/#comprehensions . Moonscript gets compiled into Lua code.

See Also


RecentChanges · preferences
edit · history
Last edited April 7, 2012 9:05 pm GMT (diff)