# Optimisation Coding Tips

## Lua 5.1 Notes

• Memory allocation from the heap--e.g. repeatedly creating tables or closures--can slow things down.

• Short inline expressions can be faster than function calls. `t[#t+1] = 0` is faster than `table.insert(t, 0)`.

• Constant folding: `x * (1/3)` is just as fast as `x * 0.33333333333333333` and is generally faster than `x/3` on most CPUs (see the multiplication note below). `1 + 2 + x`, which parses as `(1+2) + x`, should be just as fast as `3 + x` or `x + (1 + 2)`, but faster than `x + 1 + 2`, which parses as `(x + 1) + 2` and is not necessarily equivalent to the former. Addition of numbers on computers is generally not associative when overflow occurs, and the compiler does not even know whether `x` is a number or some other type with a non-associative `__add` metamethod (LuaList:2006-03/msg00363.html). It has been reported that Roberto is seriously thinking about removing constant folding from Lua 5.2, since constant folding has been a source of bugs in Lua (though some of us really like constant folding -- DavidManura).

• Multiplication `x*0.5` is faster than division `x/2`.

• `x*x` is faster than `x^2`.

• Factoring expressions: `x*y+x*z+y*z` --> `x*(y+z) + y*z`. Lua will not do this for you, particularly since it can't assume distributive and other common algebraic properties hold during numerical overflow.
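The notes above can be checked on your own build with a minimal sketch like the following, assuming Lua 5.1 or later. The iteration count `N` and all variable names are illustrative and not from the original notes. Timings depend on the machine and Lua build, so the sketch only prints them; what it actually verifies is that the rewritten expressions compute the same values, and that regrouping an addition can change a floating-point result.

```lua
local N = 100000          -- arbitrary iteration count, for illustration only
local clock = os.clock    -- CPU time in seconds

-- Appending with an inline expression versus a library call.
local a = {}
local t0 = clock()
for i = 1, N do a[#a + 1] = 0 end
local inline_time = clock() - t0

local b = {}
t0 = clock()
for i = 1, N do table.insert(b, 0) end
local insert_time = clock() - t0

print(("t[#t+1]: %.4f s, table.insert: %.4f s"):format(inline_time, insert_time))

-- The cheaper forms give the same values for these inputs.
local x, y, z = 3, 4, 5
local expanded = x*y + x*z + y*z
local factored = x*(y + z) + y*z
local half_mul, half_div = x * 0.5, x / 2
local sq_mul, sq_pow = x * x, x ^ 2

-- Regrouping an addition is not always safe. With IEEE 754 doubles
-- (assumed here; the usual case), 2^53 + 1 is not representable, so the
-- two groupings round differently.
local big = 2^53
local left_to_right = (big + 1) + 2   -- what `big + 1 + 2` means
local folded = big + (1 + 2)          -- what a naive fold would compute
print(left_to_right == folded)
```

The last comparison is why the compiler cannot fold `x + 1 + 2` into `x + 3` on its own: only the author knows whether the regrouping is acceptable.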

Note that Roberto Ierusalimschy's article Lua Performance Tips from the excellent [Lua Programming Gems] book is [available online].

## Lua 4 Notes

The following information concerns optimization of Lua 4 and is kept here for historical reference.

### General tips on coding

(Joshua Jensen) These are some optimization strategies I use (off the top of my head):

• Local variables are very quick, since they are accessed by index. If possible, make global variables local (weird, eh?). Seriously, it works great and indexed access is always going to be faster than a hash lookup. If a variable, say `GameState`, needs global scope for access from C, make a secondary variable that looks like '`local GSLocal = GameState`' and use `GSLocal` within the module. This technique can also be used for functions that are called repeatedly. (see OptimisingUsingLocalVariables)
• `for` loops are quite a bit faster than `while` loops, since they have specialized virtual machine instructions.
• In your C callback functions, use `lua_rawcall()` to call other functions. The overhead of a `setjmp()` call for exceptions (and a few other things) is avoided. I would not recommend using `lua_rawcall()` outside of a callback in case something goes wrong during execution. Without the `setjmp()` call, the error handler that exits the application is called.
• If possible, in your C functions, try and use `lua_rawget()` and `lua_rawgeti()` for table access, since it avoids the tag method checks. Be sure to use `lua_rawgeti()` for indexed access. It's still a hash lookup, but it's probably the fastest way to get there by index.
• In C, use `lua_ref()` wherever possible. `lua_ref()` behaves similarly to a local variable in terms of speed.
• Know that C strings passed into a Lua function (such as `lua_getglobal()`) from C are translated to a Lua string on entry. If a string is to be reused across multiple frames of the game, do a `lua_ref()` operation on it, too.
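
The local-variable tip can be sketched as follows. This is written for Lua 5.1 or later rather than the Lua 4 these notes describe, and the fields of `GameState` and the function `advance_frames` are hypothetical names used only for illustration.

```lua
-- A global that must stay global, e.g. because C code looks it up by name.
GameState = { frame = 0, score = 0 }

-- Cache the table once in a local. Later uses read a local slot instead
-- of doing a lookup in the globals table. Note that the local is only an
-- alias: if GameState is later reassigned, GSLocal still refers to the
-- old table.
local GSLocal = GameState

-- The same trick works for functions that are called repeatedly.
local floor = math.floor

local function advance_frames(n)
  for i = 1, n do
    GSLocal.frame = GSLocal.frame + 1
    GSLocal.score = floor(GSLocal.frame / 10)
  end
end

advance_frames(100)
print(GameState.frame, GameState.score)
```

Because `GSLocal` and `GameState` refer to the same table, updates made through the local are visible to any C code that reads the global.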

This information was written for Lua, pre v4.0 -- Nick Trout

### Assertions

Using the standard `assert` function with a non-trivial message expression will negatively impact script performance. The reason is that the message expression is evaluated even when the assertion is true. For example, in
```
assert(x <= x_max, "exceeded maximum ("..x_max..")")
```
regardless of the condition (which usually will be true), a float to string conversion and two concatenations will be performed. The following replacement uses printf-style message formatting and does not generate the message unless it is used:
```
function fast_assert(condition, ...)
  if not condition then
    -- Only build the message once the assertion has actually failed.
    if getn(arg) > 0 then
      assert(condition, call(format, arg))
    else
      assert(condition)
    end
  end
end
```
Now the example becomes:
```
fast_assert(x <= x_max, "exceeded maximum (%d)", x_max)
```

This is the VM code generated:

```
assert(x <= x_max, "exceeded maximum ("..x_max..")")
    GETGLOBAL   0   ; assert
    GETGLOBAL   1   ; x
    GETGLOBAL   2   ; x_max
    JMPLE       1   ; to 6
    PUSHNILJMP
    PUSHINT     1
    PUSHSTRING  3   ; "exceeded maximum ("
    GETGLOBAL   2   ; x_max
    PUSHSTRING  4   ; ")"
    CONCAT      3
    CALL        0 0

fast_assert(x <= x_max, "exceeded maximum (%d)", x_max)
    GETGLOBAL   5   ; fast_assert
    GETGLOBAL   1   ; x
    GETGLOBAL   2   ; x_max
    JMPLE       1   ; to 17
    PUSHNILJMP
    PUSHINT     1
    PUSHSTRING  6   ; "exceeded maximum (%d)"
    GETGLOBAL   2   ; x_max
    CALL        0 0
```

Edit (April 23, 2012, by Sirmabus): The code above will not actually work with Lua 5.1. The version below does, and adds some enhancements: the error points back to the line number of the actual assert, and a `pcall()` provides a fall-through in case the assertion message arguments are wrong.

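```
function fast_assert(condition, ...)
  if not condition then
    -- Format the message only if message arguments were supplied.
    if next({...}) then
      -- pcall guards against bad format arguments.
      local s, r = pcall(string.format, ...)
      if s then
        -- Level 2 reports the caller's line, not this function's.
        error("assertion failed!: " .. r, 2)
      end
    end
    error("assertion failed!", 2)
  end
end
```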
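
A short usage sketch of the Lua 5.1 version, assuming Lua 5.1 or later. The function is repeated as a local so the example is self-contained. The values of `x` and `x_max` and the result names (`ok`, `err`, `ok2`, `err2`) are made up for illustration.

```lua
-- Same logic as the Lua 5.1 fast_assert above.
local function fast_assert(condition, ...)
  if not condition then
    if next({...}) then
      local s, r = pcall(string.format, ...)
      if s then
        error("assertion failed!: " .. r, 2)
      end
    end
    error("assertion failed!", 2)
  end
end

local x, x_max = 5, 10

-- Passing case: the condition is true, so string.format is never called.
fast_assert(x <= x_max, "exceeded maximum (%d)", x_max)

-- Failing case: the message is formatted only now.
x = 11
local ok, err = pcall(function()
  fast_assert(x <= x_max, "exceeded maximum (%d)", x_max)
end)
print(ok, err)

-- Bad message arguments: string.format fails inside the pcall, so the
-- plain "assertion failed!" error is raised instead.
local ok2, err2 = pcall(function()
  fast_assert(false, "value was %d", "not a number")
end)
print(ok2, err2)
```

Both failures carry a position prefix for the calling function, because `error` is invoked with level 2.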

### Fast Unordered List Iteration

Frequently in Lua we build a table of elements such as:
`table = { "harold", "victoria", "margaret", "guthrie" }`

The "proper" way to iterate over this table is as follows:

```
for i=1, getn(table) do
  -- do something with table[i]
end
```

However, if we aren't concerned about element order, the above iteration is slow. The first problem is that it calls `getn()`, which is O(n) assuming, as above, that the "n" field has not been set. The second problem is that bytecode must be executed and a table lookup performed to access each element (that is, `table[i]`).

A solution is to use a table iterator instead:

```
for x, element in pairs(table) do
  -- do something with element
end
```

The `getn()` call is eliminated, as is the table lookup. The "x" is a dummy variable, as the element index is normally not used in this case.

There is a caveat with this solution. If the library functions `tinsert()` or `tremove()` are used on the table, they will set the "n" field, which would then show up in our iteration.

An alternative is to employ the list iteration patch listed in LuaPowerPatches.
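
The two iteration styles and the "n" caveat can be sketched as follows. This assumes Lua 5.1 or later, where the `#` operator takes the place of `getn()`. The table is called `names` here (a hypothetical name) so that it does not shadow the standard `table` library, and the "n" field is set by hand to mimic what the text says `tinsert()` and `tremove()` do in Lua 4.

```lua
local names = { "harold", "victoria", "margaret", "guthrie" }

-- Ordered iteration: an explicit index and a table lookup per element.
local ordered = {}
for i = 1, #names do
  ordered[#ordered + 1] = names[i]
end

-- Unordered iteration: the iterator hands back each value directly.
-- The traversal order of pairs() is unspecified, so only rely on the
-- set of elements visited, never on their sequence.
local seen, count = {}, 0
for _, element in pairs(names) do
  seen[element] = true
  count = count + 1
end

-- The caveat: any non-array field, such as "n", is visited as well.
names.n = #names
local with_n = 0
for _ in pairs(names) do
  with_n = with_n + 1
end

print(#ordered, count, with_n)
```

If the table may carry an "n" field, the loop body has to skip that key, which gives back part of the saving.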

### Table Access

Question: It's not the performance of creating the tables that I'm worried about, but rather all the accesses to the table contents.

(lhf) Tables are the central data structure in Lua. You shouldn't have to worry about table performance. A lot of effort is spent trying to make tables fast. For instance, there is a special opcode for `a.x`. See the difference between `a.x` and `a[x]` ... but, like you said, the difference here is essentially an extra `GETGLOBAL`.

```
a,c = {},"x"
    CREATETABLE 0
    PUSHSTRING  2   ; "x"
    SETGLOBAL   1   ; c
    SETGLOBAL   0   ; a
b=a.x
    GETGLOBAL   0   ; a
    GETDOTTED   2   ; x
    SETGLOBAL   3   ; b
b=a["x"]
    GETGLOBAL   0   ; a
    GETDOTTED   2   ; x
    SETGLOBAL   3   ; b
b=a[c]
    GETGLOBAL   0   ; a
    GETGLOBAL   1   ; c
    GETTABLE
    SETGLOBAL   3   ; b
END
```
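
At the source level, the three forms in the listing read the same field. A minimal sketch, assuming Lua 5.1 or later (the value `42` and the result names are illustrative):

```lua
local a, c = { x = 42 }, "x"

local by_dot      = a.x     -- syntactic sugar for a["x"]
local by_literal  = a["x"]  -- constant string key
local by_variable = a[c]    -- the key has to be read from c first

print(by_dot, by_literal, by_variable)
```

With `a` and `c` declared local as here, the extra global fetch shown in the Lua 4 listing for `a[c]` does not apply. The generated code for other Lua versions is not shown and may differ.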

See also: VmMerge (used to format the merged Lua source and VM code), OptimisationTips, OptimisingUsingLocalVariables

Last edited April 24, 2012 6:42 am GMT