Lua Macros

LuaMacro is a macro facility for Lua, implemented using token filters.

Source Code

http://luaforge.net/frs/download.php/4329/luamacro-1.5.zip

Dependencies

Lua 5.1.4 with [tokenf patch].

There is a patched Lua 5.1.4 Windows build for the curious, built with MinGW (i.e. not compatible with Lua for Windows) [here].

Token Filters

lhf's tokenf patch (see also [this writeup]) provides a simple but powerful hook into the stream of tokens that the Lua compiler sees. (In Lua, for a given module, compilation into bytecode and execution are distinct phases.) Basically you have to provide a global function called FILTER, which will be called in two very different ways. First, it will be called with two arguments: a function which you can use to get the next token (a 'getter') and the source file. Thereafter, it will be called with no arguments, but will be expected to return three values. (This is confusing at first, and these two roles should probably be given different names.)

The get function returns three values: line, token and value. Token has a few special values like '<name>', '<string>', '<number>', and '<eof>' but otherwise is the actual keyword or operator like 'function', '+', '~=', '...', etc. If the token is one of the special cases, then the value of the token is returned as the third value. (There is an instructive example with the tokenf distribution, called fdebug, which simply prints out these values.)
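The calling convention can be sketched in stock Lua. This is only a minimal pass-through filter, assuming that the three values FILTER returns are the same line, token, value triple that the getter produces. make_getter is a hypothetical stand-in for the patched compiler, which would normally supply the getter itself.

```lua
-- Hypothetical stand-in for the patched compiler: hands out canned tokens.
local function make_getter(tokens)
  local i = 0
  return function()
    i = i + 1
    local t = tokens[i]
    return t[1], t[2], t[3]      -- line, token, value
  end
end

local get   -- the getter, remembered from the first call

function FILTER(getter, source)
  if getter then                 -- first call: (getter, source)
    get = getter
    return
  end
  return get()                   -- later calls: pass each token through unchanged
end

-- Simulate compiling the chunk "x = 1"
FILTER(make_getter{
  {1, '<name>', 'x'}, {1, '='}, {1, '<number>', 1}, {1, '<eof>'},
}, '=demo')

local seen = {}
repeat
  local line, token, value = FILTER()
  seen[#seen + 1] = token
until token == '<eof>'
print(table.concat(seen, ' '))   --> <name> = <number> <eof>
```

A real filter would inspect or rewrite the triple before returning it instead of passing it straight through.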

Token filters read and write tokens one at a time. Coroutines make it possible to maintain complex state, without having to manage a state machine.
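As a rough illustration of why coroutines help, here is a sketch in stock Lua where one input token expands to three output tokens. The ANSWER name, make_getter and make_filter are all hypothetical and exist only for this example; the filter simply yields each output token in turn, so no explicit state machine is needed.

```lua
-- Hypothetical stand-in for the compiler's getter.
local function make_getter(tokens)
  local i = 0
  return function()
    i = i + 1
    local t = tokens[i]
    return t[1], t[2], t[3]
  end
end

-- Wrap the filter logic in a coroutine: each yield hands one token to the caller.
local function make_filter(get)
  return coroutine.wrap(function()
    while true do
      local line, tok, val = get()
      if tok == '<name>' and val == 'ANSWER' then
        -- one input token becomes three output tokens
        coroutine.yield(line, '(')
        coroutine.yield(line, '<number>', 42)
        coroutine.yield(line, ')')
      else
        coroutine.yield(line, tok, val)
        if tok == '<eof>' then break end
      end
    end
  end)
end

local next_token = make_filter(make_getter{
  {1, '<name>', 'x'}, {1, '='}, {1, '<name>', 'ANSWER'}, {1, '<eof>'},
})

local out = {}
repeat
  local line, tok, val = next_token()
  out[#out + 1] = val ~= nil and tostring(val) or tok
until tok == '<eof>'
print(table.concat(out, ' '))   --> x = ( 42 ) <eof>
```

The position inside the three-token expansion is kept implicitly by the suspended coroutine, which is the state a hand-written filter would otherwise have to track.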

LuaMacros

The macro facility described here is similar to the C preprocessor, although it works on an already predigested token stream and is not a separate program through which Lua code is passed. This has several advantages: it is faster (no separate translation phase) and macros can be tested interactively. The disadvantage is that LuaMacro is dependent on a patched version of Lua, and debugging macros can sometimes be a little awkward, since you do not see the result as transformed text.

As always, macros need to be used carefully. They do not share Lua's concept of scoping (so should be named distinctly), and overusing them could result in code which could only be read by the original writer, which is the 'private language' problem. (See http://research.swtch.com/2008/02/bourne-shell-macros.html for a classic example.)

Even if not part of your production/released code, macros can be useful in debugging and constructing tests. If you are using Lua as a DSL (Domain Specific Language) then macros allow for easy customization of syntax.

This version (1.5) allows a simplified notation suggested by Thomas Lauer, in which simple macros look very much like their C equivalents:

A macro that takes two parameters:

macro.define('PLUS(L,C) ((L)+(C))')
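As with C macros, the parentheses around each parameter in the body matter, because arguments are substituted as they stand. The sketch below mimics the substitution on plain strings in stock Lua (LuaMacro itself works on tokens, not text); expand, eval and the multiplying macro bodies are hypothetical and only serve to show the precedence problem.

```lua
local load_string = loadstring or load   -- Lua 5.1 vs 5.2+

-- Hypothetical helper: textual parameter substitution, for illustration only.
local function expand(body, params, args)
  local map = {}
  for i, p in ipairs(params) do map[p] = args[i] end
  -- replace each identifier that names a parameter; others are left alone
  return (body:gsub('[%a_][%w_]*', function(word) return map[word] end))
end

local function eval(src) return load_string('return ' .. src)() end

local safe   = expand('((L)*(C))', {'L', 'C'}, {'1+2', '3'})
local unsafe = expand('L*C',       {'L', 'C'}, {'1+2', '3'})

print(safe,   eval(safe))     --> ((1+2)*(3))   9
print(unsafe, eval(unsafe))   --> 1+2*3         7
```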

The following is an equivalent to a C-style assert, where the actual expression is converted into a string to form the optional second argument of assert() using the 'stringizing' function _STR():

macro.define('ASSERT(x) assert(x,_STR(x))')

An advantage of this is that assertions can be removed globally by a simple change in a header.

Using Macros

Macro definitions need to be in a separate file from the code to be preprocessed, but do not need to be loaded before your program. Instead, there is a standard macro __include. Assuming that the PLUS and ASSERT macros have been defined in plus.lua, then:

--test-macro.lua
__include 'plus'
print(PLUS(10,20))
ASSERT(2 > 4)

$ lua -lmacro  test-macro.lua
30
lua: test-macro.lua:3: 2 > 4
stack traceback:
        [C]: in function 'assert'
        test-macro.lua:3: in main chunk

It is important that the module macro is loaded before the program is parsed, since macros operate on the compile phase.

They can be tested interactively like this:

D:\stuff\lua\tokenf>lua -lmacro -i
Lua 5.1.2  Copyright (C) 1994-2007 Lua.org, PUC-Rio
> __include 'plus'
> = PLUS(10,20)
30
> = PLUS(10)
=stdin:1: PLUS expects 2 parameters, received 1
> ASSERT(2 > 4)
stdin:1: 2 > 4
stack traceback:
        [C]: in function 'assert'
        stdin:1: in main chunk
        [C]: ?

The substitution may be a function - this is where things get interesting:

macro.define('__FILE__',nil,function(ls) return macro.string(ls.source) end)

The nil second argument indicates that we have no parameters, and the third substitution argument is a function which always receives a table containing the lexical state: source, line and get (the getter function currently being used). This function is expected to return a token list: in this case, {'<string>',ls.source}. Three convenience functions, macro.string(), macro.number() and macro.name(), are available. In LuaMacro 1.5, the function may also return a string.

In general, the substitution function receives all parameters passed to the macro:

local value_of = macro.value_of

macro.define('_CAT',{'x','y'},function(ls,x,y)
   return macro.name(value_of(x)..value_of(y))
end)

This is also the only way to handle variable length parameter lists, since otherwise the number of formal and actual parameters must match. Bear in mind that the parameters always come in the form of token lists, which have a particular abbreviated format. For example, {'<name>','A','+',false,'<name>','B','*',false,'<number>',2.3} . (Here false is a placeholder for nil.)
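To make the token list format concrete, here is a small sketch in stock Lua that turns such a list back into readable text. It assumes the list is laid out as consecutive (token, value) pairs, as in the example above; tokens_to_string is a hypothetical helper and not part of LuaMacro.

```lua
-- Hypothetical helper: render an abbreviated token list as source-like text.
local function tokens_to_string(tl)
  local out = {}
  for i = 1, #tl, 2 do                 -- step over (token, value) pairs
    local tok, val = tl[i], tl[i + 1]
    if tok == '<name>' or tok == '<number>' then
      out[#out + 1] = tostring(val)    -- special tokens carry their value
    elseif tok == '<string>' then
      out[#out + 1] = string.format('%q', val)
    else
      out[#out + 1] = tok              -- keywords and operators stand for themselves
    end
  end
  return table.concat(out, ' ')
end

local tl = {'<name>','A','+',false,'<name>','B','*',false,'<number>',2.3}
print(tokens_to_string(tl))            --> A + B * 2.3
```

Because the placeholder is false rather than nil, the length operator sees the whole list and the pairs stay aligned.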

Please note that macro definitions are Lua modules and so you are free to define local variables and functions.

Macro definitions can be Inline

The new simplified macro definition-as-string allows simple macros to be defined in the source file where they are actually used. This even works with the interactive interpreter:

$ lua -lmacro
Lua 5.1.4  Copyright (C) 1994-2008 Lua.org, PUC-Rio
> __def 'dump(x) print(_STR(x).." = "..tostring(x))'
> x = 10
> dump(x)
x = 10
> dump(10*4.2)
10 * 4.2 = 42

(This could be a useful tool in a debugging toolchest.)

Consider this shorthand for evaluating a statement for all values in an array:

__def 'for_(t,expr) for _idx,_ in ipairs(t) do expr end'

for_({10,20,30},print(_))
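Substituting the parameters by hand gives what the compiler should see for a for_ call. This is plain Lua and runs without the macro module; the second loop is an extra illustration showing that any statement using _ can serve as expr.

```lua
-- for_({10,20,30},print(_)) becomes:
for _idx,_ in ipairs({10,20,30}) do print(_) end

-- for_({10,20,30},sum = sum + _) would become:
local sum = 0
for _idx,_ in ipairs({10,20,30}) do sum = sum + _ end
print(sum)   --> 60
```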

See functional.lua for more examples of this style.

Using End-scanners

A common pattern when using anonymous functions is:

set_handler(function()
  ...
end)

It would be cool if this could simply be expressed like this (LeafStorm's BeginEndProposal):

set_handler begin
  ...
end

With LuaMacro 1.5, macros may set lexical scanners, which watch the token stream for specified tokens. A particularly useful one is an 'end-scanner'. In this case, the scanner detects the last end of a block, and emits end).

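-- def is assumed here to be a local alias for macro.define
def ('begin',nil,function()
    -- rewrite the 'end' that closes this block as 'end)'
    macro.set_end_scanner 'end)'
    return '(function(...)'
end)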
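Following the substitution by hand, begin becomes (function(...) and the matching end becomes end), so the result is an ordinary anonymous function call. A sketch in stock Lua, where set_handler is a hypothetical function that just stores its argument:

```lua
local handler
local function set_handler(f) handler = f end   -- hypothetical, for illustration

-- What the compiler would see for:
--   set_handler begin
--     return 'called', select('#', ...)
--   end
set_handler (function(...)
  return 'called', select('#', ...)
end)

print(handler(1, 2))   --> called  2
```

Since the generated function is declared with ..., the block can use varargs passed by whoever invokes the handler.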

Macros can be Arbitrary Tokens

Another LuaMacro 1.5 feature is that any token may be used as a macro name. Consider the problem of introducing a short anonymous function form (see http://lua-users.org/lists/lua-l/2009-12/msg00140.html). Instead of function(x) return x+1 end we can say \x(x+1). Many readers (although not all ;)) find this notation less noisy when specifying short functions.

'\' is a good choice for a token macro, since it appears nowhere else in the language. You can define a handler which provides parameters if a macro is intended to be called without a parameter list. This is the fourth argument to define().

-- lhf-style lambda notation
def ('\\', {'args','body';handle_parms = true},
    'function(args) return body end',
    function(ls) -- grab the lambda
        -- these guys return _arrays_ of token-lists. We use '' as the delim
        -- so commas don't split the results
        local args = macro.grab_parameters('(','')[1]
        local body = macro.grab_parameters(')','')[1]
        return args,body
    end
)

Functions of more than one argument (like \x,y(x+y)) and functions defining functions (like \x(\y(x+y))) work as expected.
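For reference, these are the plain Lua functions that the three lambda forms should expand to, written out by hand from the substitution 'function(args) return body end'. The variable names are only for the example.

```lua
local inc   = function(x) return x+1 end                         -- \x(x+1)
local add   = function(x,y) return x+y end                       -- \x,y(x+y)
local adder = function(x) return function(y) return x+y end end  -- \x(\y(x+y))

print(inc(1), add(2,3), adder(10)(5))   --> 2  5  15
```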

Implementing a Try/Except Statement

As a genuinely useful example, here is how try and except can be defined as syntactic sugar around pcall():

-- try.lua
function pack (...)
	return {n=select('#',...),...}
end

macro.define ('try',nil,
 'do local res = pack(pcall(function()'
  -- try block goes here
)

macro.define ('except',{'e',handle_parms=macro.grab_token},
	function()
		-- make sure that the 'end' after 'except' becomes 'end end' to close
		-- the extra 'do' in 'try'.
		-- we start at level 1 (before 'end))') and must ignore the first level zero.
		macro.set_end_scanner ('end end',1,true)
		return [[
	end))
	if res[1] then
		if res.n > 1 then return unpack(res,2,res.n) end
	else local e = res[2]
	]]
	-- except block goes here
    end
)


return 'local pack,pcall,unpack = pack,pcall,unpack'

So, given code like this:

a = nil
try
  print(a.x)
except e
  print('exception:',e)
end

The compiler would see the following code:

a = nil
do local res = pack(pcall(function()
  print(a.x)
end)) 
 if res[1] then
	if res.n > 1 then return unpack(res,2,res.n) end 
 else local e = res[2]
   print('exception:',e)
end end

The smartness of these macros (note we can handle closing the extra do statement) means that, with a little work, it is easy to try out new syntax proposals without having to patch Lua itself. And writing macros in Lua is an order of magnitude easier than writing syntax extensions in C!

(Please note that this is not a full solution to the problem. In particular, we cannot cope with blocks which return explicitly, but return no value.)
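The expanded form can be tried in stock Lua by writing it out by hand. The sketch below wraps it in two hypothetical functions, demo and bare, to show both the normal behaviour and the limitation just mentioned: a try block that executes a bare return leaves res.n at 1, so the enclosing function carries on instead of returning.

```lua
local unpack = unpack or table.unpack            -- Lua 5.1 vs 5.2+
local function pack(...) return {n = select('#', ...), ...} end

-- Hand expansion of:  try return t.x except e return 'caught' end
local function demo(t)
  do local res = pack(pcall(function()
    return t.x
  end))
  if res[1] then
    if res.n > 1 then return unpack(res, 2, res.n) end
  else local e = res[2]
    return 'caught'
  end end
  return 'fell through'
end

-- Hand expansion of:  try return except e end
local function bare()
  do local res = pack(pcall(function()
    return
  end))
  if res[1] then
    if res.n > 1 then return unpack(res, 2, res.n) end
  else local e = res[2]
  end end
  return 'fell through'   -- reached even though the try block returned
end

print(demo({x = 5}))   --> 5
print(demo(nil))       --> caught
print(bare())          --> fell through
```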

The last return statement requires some explanation. This macro assumes that the environment in which it is expanded can access the functions pack, pcall and unpack. In general, this is not true, since a module created with module(...) does not by default have access to the global environment.

This macro should be brought into a module using __include try before the module call. __include uses require internally, and if that returns a string, then this is the actual substituted value of the __include macro expansion. In that way, the necessary hidden dependencies of the macro are properly made available in the module.

As an example of more elaborate code generation, here is a using macro which works rather like the C++ statement of the same name. There is no true module scope in Lua, so a common trick is to 'unroll' a table:

local sin = math.sin
local cos = math.cos
...

Not only do we get nice unqualified names, but accessing local function references is faster than looking up functions in a table. Here is a macro that can generate the above code automatically:

macro.define('using',{'tbl'},
    function(ls,n)
        local tbl = _G[n[2]]
        local subst,put = macro.subst_putter()
        for k,v in pairs(tbl) do
            put(macro.replace({'f','T'},{macro.name(k),n},
                ' local f = T.f; '))
        end
        return subst
    end)

Here the substitution is a function, which is passed a name token (like {'<name>','math'}), assumes it refers to a globally available table, and then iterates over that table dynamically generating the required local assignments. subst_putter() gives you a token list and a put function; you can use the put function to fill the token list, which is then returned and actually substituted into the token stream. replace generates a new token list by replacing all occurrences of the formal parameters (first argument) with actual parameter values (second argument) in a token list. To use this, put the macro call at the start of your module:

using (math)
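To get a feel for what using generates, the same iteration can be done in stock Lua on strings instead of tokens. using_source is a hypothetical helper written for this illustration; unlike the macro, it sorts the keys so that the output is predictable.

```lua
-- Hypothetical helper: build the text of the local declarations for a table.
local function using_source(name, tbl)
  local keys = {}
  for k in pairs(tbl) do
    -- only keys that are valid identifiers can become local names
    if type(k) == 'string' and k:match('^[%a_][%w_]*$') then
      keys[#keys + 1] = k
    end
  end
  table.sort(keys)
  local out = {}
  for _, k in ipairs(keys) do
    out[#out + 1] = ('local %s = %s.%s;'):format(k, name, k)
  end
  return table.concat(out, ' ')
end

local trig = {sin = math.sin, cos = math.cos}
print(using_source('trig', trig))
--> local cos = trig.cos; local sin = trig.sin;
```

Running it on the real math table shows how many locals a single using (math) would introduce.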

This brings in the whole contents of the table into scope, and assumes that the table does actually exist at compile-time. A better idiom is import(math,sin cos) which expands to local sin = math.sin; local cos = math.cos:

macro.define ('import',{'tbl','names'},
	function (ls,tbl,names)
		local subst,put = macro.subst_putter()
		for i = 1,macro.length_of(names) do
			local name = macro.get_token(names,i)
			put 'local'; put (name); put '='; put (tbl); put '.'; put (name); put ';'
		end
		return subst
	end
)

Implementing List Comprehensions

In PythonLists, FabienFleutot discusses a list comprehension syntax modelled on the Python one.

x = {i for i = 1,5}

{1,2,3,4,5}

Such a statement does not actually require much transformation to be valid Lua. We use anonymous functions:

x = (function() local ls={}; for i = 1,5 do ls[#ls+1] = i end; return ls end)()

However, to make it work as a macro, we need to choose a name (here 'L') since we cannot look ahead to see the `for` token.

macro.define('L',{'expr','loop_part',handle_parms=true},
    ' ((function() local t = {}; for loop_part do t[#t+1] = expr end; return t end)()) ',
    function(ls)
        local get = ls.getter
        local line,t = get()
        if t ~= '{' then macro.error("syntax: L{<expr> for <loop-part>}") end
        local expr = macro.grab_parameters('for')
        local loop_part = macro.grab_parameters('}','')
        return expr,loop_part
    end)

The substitution is pretty straightforward, but we have to grab the parameters with a custom function. The first call to macro.grab_parameters grabs up to 'for', and the second grabs up to '}'. Here we have to be careful that commas are not treated as delimiters for this grab by setting the second argument to be the empty string.

Any valid for-loop part can be used:

 L{{k,v} for k,v in pairs{one=1,two=2}}

 { { "one", 1 }, { "two", 2 } }

Nested comprehensions work as expected:

x = L{L{i+j for j=1,3} for i=1,3}
  
{ { 2, 3, 4 }, { 3, 4, 5 }, { 4, 5, 6 } }
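Substituting expr and loop_part into the macro body by hand gives the plain Lua below, which runs without the macro module. It shows why nesting works: each comprehension becomes its own immediately called function, so the inner local t does not clash with the outer one.

```lua
-- L{i for i = 1,5}
local x = ((function() local t = {}; for i = 1,5 do t[#t+1] = i end; return t end)())

-- L{L{i+j for j=1,3} for i=1,3}
local nested = ((function() local t = {}; for i=1,3 do
  t[#t+1] = ((function() local t = {}; for j=1,3 do t[#t+1] = i+j end; return t end)())
end; return t end)())

print(#x, nested[1][1], nested[2][3], nested[3][3])   --> 5  2  5  6
```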

A particularly cool idiom is to grab the whole of standard input as a list, in one line:

lines = L{line for line in io.lines()}

Debugging LuaMacro code

There is a variable macro.verbose which you can set to see the tokens read and written by LuaMacro. If it is 0, a debug hook is set, but no debug output appears; if it is 1, then it shows the transformed token stream which the compiler sees, and if it is 2 it will also show the input token stream.

Setting the verbosity level to zero (say by lua -lmacro -e "macro.verbose=0" myfile.lua) is useful because the __dbg builtin macro can then change the verbosity dynamically:

__dbg 1
mynewmacro(hello)
__dbg 0

This helps you to zero in on particular problem areas without having to wade through pages of output.

Compiling LuaMacro code

Although LuaMacro depends on a token-filter patched Lua compiler, the resulting byte code can run on stock Lua 5.1.4. A very simple compiler is provided, based on luac.lua from the Lua distribution.

$ lua macro/luac.lua myfile.lua myfile.luac
$ lua51 myfile.luac 
<runs fine> 
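The idea behind such a compiler is string.dump, which the luac.lua script in the Lua distribution also relies on. A minimal sketch in stock Lua, compiling from a string rather than a file: under the patched interpreter with the macro module loaded, macros would presumably be expanded during the load step, so the dumped bytecode should contain only ordinary Lua.

```lua
local load_chunk = loadstring or load     -- Lua 5.1 vs 5.2+

-- Compile source text to a function, then serialise it as bytecode.
local chunk    = assert(load_chunk("return 6 * 7"))
local bytecode = string.dump(chunk)

-- The bytecode can be loaded back (or written to a file opened in "wb" mode).
local reloaded = assert(load_chunk(bytecode))
print(reloaded())   --> 42
```

Bytecode is specific to the Lua version that produced it, which is why the output of the patched 5.1.4 compiler is run on a stock 5.1 interpreter above.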

-- SteveDonovan


Last edited March 8, 2011 12:20 pm GMT