On 28-Sep-06, at 4:55 PM, D Burgess wrote:

> Mike Pall wrote:
>
> > Note that the main speed disadvantage of functions vs. operators
> > (around 5x) in interpreted Lua is due to the call frame setup and
> > teardown overhead. One way to speed this up would be to add
> > "frameless" C functions which operate in the frame of the caller.
> > Especially trivial functions (math.*) would benefit a lot.
> >
> > It would also settle the operator vs. function discussion (as far
> > as performance is concerned).
>
> A very valuable suggestion. Please don't let this idea get lost
> in the more detailed discussion.

Frameless C operators avoid the frame setup and teardown, but that is not all of the overhead of a function call. In order to call a function, the compiled code needs to:
1) put the function object onto the stack
2) put each argument onto the stack
3) call the function
4) (possibly) move the result to the right place.

For a function which takes two operands, that could be five VM instructions, although the last one is frequently avoidable (see example below).

By contrast, a built-in operator only requires one three-address VM instruction, even if it subsequently calls a metamethod.
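
For anyone who wants to measure the difference locally, here is a minimal sketch. The names `add`, `time_operator` and `time_call` and the iteration count are made up for illustration, and `add` is a plain Lua function standing in for a trivial library call, so this only approximates the C-function case discussed above. Timings depend on the machine and the Lua version, so no numbers are claimed.

```lua
-- Hypothetical stand-in for a trivial function such as those in math.*.
local function add(x, y)
  return x + y
end

-- Operator form: with all operands in locals, the loop body is one
-- three-address arithmetic instruction.
local function time_operator(n)
  local b, c, a = 1, 2, 0
  local t0 = os.clock()
  for i = 1, n do
    a = b + c
  end
  return os.clock() - t0, a
end

-- Call form: the function and both arguments are copied into fresh
-- registers before the call instruction runs.
local function time_call(n)
  local b, c, a = 1, 2, 0
  local func = add
  local t0 = os.clock()
  for i = 1, n do
    a = func(b, c)
  end
  return os.clock() - t0, a
end

local n = 1000000  -- arbitrary iteration count
local t_op, r_op = time_operator(n)
local t_call, r_call = time_call(n)
print(string.format("operator: %.3fs  call: %.3fs", t_op, t_call))
```

Feeding either loop body to `luac -l` shows the instruction sequence that the interpreter actually executes per iteration.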

I'm not saying that VM instruction execution dominates the overhead of function calls, but it is significant, depending on how the function is called.

One way to minimize this overhead for common cases would be to implement variants of OP_CALL for specific cases, say 1 return value and 1 or 2 arguments. (The latter would require four operands, so it would need to be a double-word instruction.)
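
A rough bit-budget sketch of why a fourth operand would spill into a second word. It assumes the Lua 5.1 instruction layout (32-bit word, 6-bit opcode, 8-bit A, 9-bit B and 9-bit C); the fourth operand `D` and its 8-bit width are purely hypothetical, since no such instruction exists.

```lua
-- Field widths of a Lua 5.1 instruction, in bits.
local WORD_BITS = 32
local SIZE_OP, SIZE_A, SIZE_B, SIZE_C = 6, 8, 9, 9

-- Hypothetical: width of a fourth register operand for a specialised call.
local SIZE_D = 8

local three_operand = SIZE_OP + SIZE_A + SIZE_B + SIZE_C
local four_operand = three_operand + SIZE_D

-- Number of 32-bit words needed to hold a given number of bits.
local function words_needed(bits)
  return math.ceil(bits / WORD_BITS)
end

print(three_operand, words_needed(three_operand))
print(four_operand, words_needed(four_operand))
```

The existing three-operand format already fills the word, so any extra operand, whatever its width, forces a second word.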

That's not an argument for more operators: in my opinion the reason to introduce a new operator would be clarity of expression and readability of code, not efficiency. (And that might well be the case for particular operators.)

---- Sample luac output:

rlake@freeb:~$ ./lua-5.1.1/src/luac -l -
local func = sometable.somefunc
local a,b,c
a = func(b, c)
main <stdin:0,0> (9 instructions, 36 bytes at 0x8072000)
0+ params, 7 slots, 0 upvalues, 4 locals, 2 constants, 0 functions
        1       [1]     GETGLOBAL       0 -1    ; sometable
        2       [1]     GETTABLE        0 0 -2  ; "somefunc"
        3       [2]     LOADNIL         1 3
-- the function call starts here
        4       [3]     MOVE            4 0
        5       [3]     MOVE            5 2
        6       [3]     MOVE            6 3
        7       [3]     CALL            4 3 2
        8       [3]     MOVE            1 4
-- and ends here
        9       [3]     RETURN          0 1