- Subject: Re: Function calls [was: Lua and Neko comparison]
- From: Mike Pall <mikelu-0609@...>
- Date: Thu, 28 Sep 2006 19:37:00 +0200
Hi,
Wim Couwenberg wrote:
> > Nope. The fibonacci micro-benchmark measures call frame setup and
> > teardown overhead and nothing else.
>
> So is there room for considerable improvement? (This is relevant in
> operator vs. library call, iterators and other places.)
[Well, Lua already compares favourably with other interpreters in
this regard ...]
I don't think there is a way to speed up Lua->Lua function calls
in the interpreter any further without compromising Lua language
semantics (like adjusting the number of arguments and results).
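To make that adjustment concrete, here is a minimal sketch in C of the work every call has to do: surplus values are dropped and missing ones are filled with nil. The sk_Value type and the sk_adjust name are hypothetical stand-ins for illustration, not the Lua VM's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tagged value; Lua's real TValue is more involved. */
typedef enum { SK_NIL, SK_NUMBER } sk_Type;
typedef struct { sk_Type tt; double n; } sk_Value;

/* Copy 'have' values from src into 'want' slots at dst:
   surplus values are dropped, missing ones become nil. */
void sk_adjust(sk_Value *dst, int want, const sk_Value *src, int have)
{
  int i;
  assert(dst != NULL && (src != NULL || have == 0));
  for (i = 0; i < want && i < have; i++)
    dst[i] = src[i];
  for (; i < want; i++) {
    dst[i].tt = SK_NIL;
    dst[i].n = 0;
  }
}
```

The same routine applies in both directions: to the arguments on the way in and to the results on the way out.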
But there is a way to speed up Lua->C function calls:
1. Add a kind of "light" C function. This avoids building up a
call frame for every function call. These functions are a bit
more restricted (e.g. no callbacks into Lua), but have a faster
calling convention. The cl->isC byte may be reused to
differentiate them from regular functions. Three options:
1a. Just bump up L->base temporarily and call the C function
within the existing Lua call frame. The C function must store the
results in the proper place and generally has to be _very_
careful with the API functions for stack manipulation. I'm not
sure how feasible this is, because one needs to export some
knowledge about the internals of the Lua VM.
1b. Like 1a, but add a special set of API calls which exclusively
work with light functions. E.g. non-stack based fast type checks.
1c. Specialize the types early on. E.g. here are the two most
common candidates (more can be added):
  lua_Number lightfunc_n_n(lua_State *L, lua_Number a);
  lua_Number lightfunc_n_nn(lua_State *L, lua_Number a, lua_Number b);
Type checking can be inlined in luaD_precall and is much faster,
too. Very few (or no) Lua internals need to be opened up.
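A rough sketch of how option 1c might dispatch, written as standalone C so it compiles without the Lua headers. Everything prefixed sk_ (the value type, the descriptor and sk_call_light) is hypothetical, and lua_State and lua_Number are local stand-ins. The real check would sit inside luaD_precall and would also have to deal with string-to-number coercion, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles without the Lua headers (assumptions). */
typedef struct lua_State lua_State;
typedef double lua_Number;

/* Pointer types matching the two signatures proposed in 1c. */
typedef lua_Number (*sk_fn_n_n)(lua_State *L, lua_Number a);
typedef lua_Number (*sk_fn_n_nn)(lua_State *L, lua_Number a, lua_Number b);

/* Hypothetical tagged value, far simpler than Lua's real TValue. */
typedef enum { SK_NIL, SK_NUMBER } sk_Type;
typedef struct { sk_Type tt; lua_Number n; } sk_Value;

/* Hypothetical descriptor; 'kind' plays the role a reused
   cl->isC byte might play. */
typedef enum { SK_LIGHT_N_N, SK_LIGHT_N_NN } sk_Kind;
typedef struct {
  sk_Kind kind;
  union { sk_fn_n_n n_n; sk_fn_n_nn n_nn; } f;
} sk_LightFunc;

/* Type-check the arguments inline and call the light function
   directly, without building a call frame. Returns 1 on success,
   0 if the arguments do not fit (the caller would then raise an
   error or fall back to the regular call path). */
int sk_call_light(lua_State *L, const sk_LightFunc *lf,
                  const sk_Value *args, int nargs, sk_Value *res)
{
  assert(lf != NULL && res != NULL);
  switch (lf->kind) {
  case SK_LIGHT_N_N:
    if (nargs < 1 || args[0].tt != SK_NUMBER) return 0;
    res->n = lf->f.n_n(L, args[0].n);
    break;
  case SK_LIGHT_N_NN:
    if (nargs < 2 || args[0].tt != SK_NUMBER || args[1].tt != SK_NUMBER)
      return 0;
    res->n = lf->f.n_nn(L, args[0].n, args[1].n);
    break;
  default:
    return 0;
  }
  res->tt = SK_NUMBER;
  return 1;
}

/* Two example light functions (illustrative only). */
lua_Number light_abs(lua_State *L, lua_Number a)
{
  (void)L;
  return a < 0 ? -a : a;
}

lua_Number light_max(lua_State *L, lua_Number a, lua_Number b)
{
  (void)L;
  return a > b ? a : b;
}
```

The point of the design is that the C function never touches the Lua stack: the VM unpacks the numbers, makes a plain C call and stores one result.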
A Lua compiler has two more options:
2. Fully inline the called code. Would be useful for Lua->Lua
function calls, but is tricky to get right. Also helpful for a
selected subset of C functions (drawback: not easily
user-extensible for new C functions in external libraries). In fact
LuaJIT is doing the latter already (with great gains).
3. Optimize, i.e. reduce the call frame overhead. LuaJIT is
already doing this to some extent. But compatibility with the
(mostly unchanged) Lua VM requires some compromises.
A better way would be to use the C stack for everything (like
most compiled languages do). This avoids setting up and tearing
down three structures in parallel (C stack, Lua stack and Lua
call frames). Alas, this would imply a significant departure from
the standard Lua VM structures.
Bye,
Mike