- Subject: Re: boolean operators
- From: Rici Lake <lua@...>
- Date: Thu, 28 Sep 2006 17:17:47 -0500
On 28-Sep-06, at 4:55 PM, D Burgess wrote:
Mike Pall wrote:
Note that the main speed disadvantage of functions vs. operators
(around 5x) in interpreted Lua is due to the call frame setup and
teardown overhead. One way to speed this up would be to add
"frameless" C functions which operate in the frame of the caller.
Especially trivial functions (math.*) would benefit a lot.
It would also settle the operator vs. function discussion (as far
as performance is concerned).
A very valuable suggestion. Please don't let this idea get lost
in the more detailed discussion.
Frameless C operators avoid the frame setup and teardown, but that is
not all of the overhead of a function call. In order to call a
function, the compiled code needs to:
1) put the function object onto the stack
2) put each argument onto the stack
3) call the function
4) (possibly) move the result to the right place.
For a function which takes two operands, that could be five VM
instructions, although the last one is frequently avoidable (see the
sample luac output below).
By contrast, a built-in operator only requires one three-address VM
instruction, even if it subsequently calls a metamethod.
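To make the contrast concrete, here is a minimal sketch that performs the same addition once through an operator (dispatching to an __add metamethod) and once through an ordinary function call. The names Vec and add are made up for illustration and are not from the thread. Feeding the file to luac -l should show the different instruction sequences for the two assignments.

```lua
-- Hypothetical example type; Vec and add are illustrative names only.
local Vec = {}
Vec.__index = Vec

-- Reached through the single arithmetic instruction when the operands
-- are not plain numbers.
Vec.__add = function(p, q)
  return setmetatable({ x = p.x + q.x }, Vec)
end

-- Same work, but reached through an explicit function call.
local function add(p, q)
  return setmetatable({ x = p.x + q.x }, Vec)
end

local b = setmetatable({ x = 1 }, Vec)
local c = setmetatable({ x = 2 }, Vec)

-- One three-address instruction; the VM looks up __add on its own.
local a1 = b + c

-- The function and both arguments are first copied into consecutive
-- registers, and only then is the call instruction executed.
local a2 = add(b, c)

assert(a1.x == a2.x)
```

Both forms compute the same value; the difference is only in how many VM instructions are dispatched to get there.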
I'm not saying that VM instruction execution dominates the overhead of
function calls, but it is significant, depending on how the function is
One way to minimize this overhead for common cases would be to
implement variants of OP_CALL for specific cases, say 1 return value
and 1 or 2 arguments. (The latter would require four operands, so it
would need to be a double-word instruction.)
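The double-word remark follows from the instruction layout. As a rough check, assuming the Lua 5.1 field widths from lopcodes.h (6-bit opcode, 8-bit A, 9-bit B and C) and assuming the hypothetical call variant would name four 8-bit registers (function, two arguments, result), the arithmetic looks like this:

```lua
-- Field widths of a 32-bit Lua 5.1 instruction (as defined in lopcodes.h).
local SIZE_OP, SIZE_A, SIZE_B, SIZE_C = 6, 8, 9, 9
local WORD_BITS = 32

-- An ordinary three-address instruction exactly fills one word.
local three_operand = SIZE_OP + SIZE_A + SIZE_B + SIZE_C

-- Assumption: a specialised call naming function, arg1, arg2 and result,
-- each as an 8-bit register index. This layout is hypothetical.
local four_operand = SIZE_OP + 4 * SIZE_A

print(three_operand, three_operand <= WORD_BITS)  -- 32, true
print(four_operand, four_operand <= WORD_BITS)    -- 38, false
```

Even with the narrowest operand size the four-operand form does not fit in 32 bits, so it would have to spill into a second word.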
That's not an argument for more operators: in my opinion the reason to
introduce a new operator would be clarity of expression and readability
of code, not efficiency. (And that might well be the case for
---- Sample luac output:
rlake@freeb:~$ ./lua-5.1.1/src/luac -l -
local func = sometable.somefunc
local a, b, c
a = func(b, c)
main <stdin:0,0> (9 instructions, 36 bytes at 0x8072000)
0+ params, 7 slots, 0 upvalues, 4 locals, 2 constants, 0 functions
1  GETGLOBAL 0 -1 ; sometable
2  GETTABLE 0 0 -2 ; "somefunc"
3  LOADNIL 1 3
-- the function call starts here
4  MOVE 4 0
5  MOVE 5 2
6  MOVE 6 3
7  CALL 4 3 2
8  MOVE 1 4
-- and ends here
9  RETURN 0 1