Hi,

Paul Chiusano wrote:
> I did notice a few cases where code runs slower with jit
> compilation, and in each case it was code which creates a lot of
> functions dynamically (to use as iterators).

Creation of closures is as cheap as it is in plain Lua. Only the
underlying function _prototype_ is compiled and not each closure.
A closure is just a short block of memory holding pointers to the
upvalues (which may differ) and a pointer to the prototype (which
stays the same).
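
To illustrate the point, here is a minimal sketch (make_counter is a hypothetical name, not something from the original test). It creates two closures from one function expression and checks that they are separate objects with separate upvalues, while both report the same defining source line. This is only a proxy for "same prototype", since plain Lua has no direct way to inspect the prototype pointer.

```lua
-- Hypothetical example: make_counter is not from the original post.
local function make_counter(start)
  local n = start            -- upvalue, private to each closure
  return function()          -- one prototype behind every closure made here
    n = n + 1
    return n
  end
end

local a = make_counter(0)
local b = make_counter(100)

-- Distinct closure objects with independent upvalues ...
print(a ~= b)

-- ... but both were defined on the same source line.
local ia, ib = debug.getinfo(a, "S"), debug.getinfo(b, "S")
print(ia.linedefined == ib.linedefined)
```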

Rule of thumb: any Lua source code line you are seeing is
compiled at most once (unless wrapped with loadstring()).
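
A small sketch of the loadstring() exception, assuming a trivial source string chosen only for illustration. Each call to loadstring() produces a new prototype that has to be compiled again, so hoisting it out of a loop keeps the "at most once" rule intact.

```lua
-- loadstring() exists in Lua 5.1 and LuaJIT; later Lua versions use load().
local compile = loadstring or load
local src = "return function(x) return x * 2 end"

-- Recompiled from scratch on every iteration: three separate prototypes.
local fresh = {}
for i = 1, 3 do
  fresh[i] = compile(src)()
end

-- Compiled once, then reused: one prototype, however often it is called.
local double = compile(src)()
local sum = 0
for i = 1, 3 do
  sum = sum + double(i)
end
print(sum)
```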

The use of coroutines doesn't make a difference (but see below).

> Here's the smallest
> example I could come up with, it runs about four times slower with jit
> vs stock Lua:
> 
> local function count(n)
>    return coroutine.wrap(function()
>     for i=1,n do coroutine.yield(i) end
>   end)
> end
> local v = {}
> for i=1,1e5 do
>   for n in count(math.random(1,5)) do v[#v+1]=n end
> end

But this example has only three function prototypes (main, count()
and the wrapped anonymous function). And only these three are ever
compiled. You can easily find out with -j trace:

$ luajit -j trace -O test.lua 
[LuaJIT: OK   21   1206  test.lua:0]
[LuaJIT: OK    7    296  test.lua:1]
[LuaJIT: OK   10    390  test.lua:2]

What you are really seeing in this example is the effect of
repeated creation and garbage collection of coroutines. Since
every coroutine in LuaJIT needs an associated C stack and you are
not doing much work in these coroutines, you are effectively
benchmarking the higher overhead of the memory allocator.

You can verify that this is the cause by adding
  coroutine.cstacksize(1) -- Select minimum default C stack size.
as the first line. It is much faster ... (but it has side effects).

It's not recommended to create and destroy coroutines at such a
high rate, neither in plain Lua nor, especially, in LuaJIT.
Recycling coroutines is easy:

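local yield, random = coroutine.yield, math.random
-- One long-lived coroutine serves every iteration sequence.
local co = coroutine.wrap(function(n)
    -- Yield 1..n, then yield nothing to end the for loop and
    -- wait for the next n.
    while n do for i=1,n do yield(i) end n = yield() end
  end)
local function count(n)
  return co, n  -- The generic for passes n to co on each call.
end
local v = {}
for i=1,1e5 do
  for n in count(random(1,5)) do v[#v+1]=n end
end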

With plain Lua, this program runs 2.2 times faster than the
original. LuaJIT adds another 30% boost (well, there is not much
to optimize here).
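
If you want to measure this yourself, a rough harness along these lines should do. It is only a sketch: bench, count_fresh and count_recycled are names made up here, the sizes are precomputed so both variants do identical work, and os.clock() timings will vary by machine, so no numbers are claimed.

```lua
local yield = coroutine.yield

-- Variant 1: a fresh coroutine per iterator (as in the original test).
local function count_fresh(n)
  return coroutine.wrap(function()
    for i = 1, n do yield(i) end
  end)
end

-- Variant 2: one recycled coroutine (as in the rewrite above).
local co = coroutine.wrap(function(n)
  while n do for i = 1, n do yield(i) end n = yield() end
end)
local function count_recycled(n)
  return co, n
end

-- Hypothetical helper: run one variant over the given sizes and time it.
-- Returns the elapsed CPU time and the number of values collected.
local function bench(count, sizes)
  local v = {}
  local t0 = os.clock()
  for _, size in ipairs(sizes) do
    for n in count(size) do v[#v+1] = n end
  end
  return os.clock() - t0, #v
end

-- Precompute the random sizes so both variants see the same input.
local sizes = {}
for i = 1, 1e5 do sizes[i] = math.random(1, 5) end

local t_fresh, len_fresh = bench(count_fresh, sizes)
local t_recycled, len_recycled = bench(count_recycled, sizes)
print(("fresh: %.3fs  recycled: %.3fs"):format(t_fresh, t_recycled))
```

Note that the recycled variant relies on every loop running to completion. Breaking out of the for loop early would leave the shared coroutine in the middle of a sequence.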

> Are ALL functions jit-compiled the very first time they are called?

Yes. Except for those you marked as not-to-be-compiled with
jit.off(f). In particular the main function of all Lua modules
loaded via require() is not compiled because it's guaranteed to
be executed only once.

[But this is only useful for functions that are _truly_ compiled
from scratch (e.g. with loadstring()) and not to be confused with
closure creation or coroutine creation. See above.]
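
For reference, marking a function as not-to-be-compiled might look like the sketch below. build_table is a hypothetical one-shot function, and the check for the jit table is only there so the same file also runs under plain Lua.

```lua
-- Hypothetical one-shot function; build_table is not from the original post.
local function build_table()
  local t = {}
  for i = 1, 10 do t[i] = i * i end
  return t
end

-- The jit table only exists under LuaJIT, so guard the call for plain Lua.
if jit then
  jit.off(build_table)   -- keep this function interpreted
end

local squares = build_table()
print(squares[10])
```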

> Would it make any sense to have some higher threshold, as in, after a
> function has been called K times, it is jit compiled? Or some other
> more clever workaround so that the compiler doesn't spend too much
> time compiling and optimizing what are basically throw-away functions?

The total compilation time for the above three functions is
around 200 microseconds on my old PIII. This is very little
compared to traditional compilers. And the overhead is needed
only once (I hope I cleared up the misconception above).

Compilation thresholds need runtime instrumentation in the
interpreter, an extensive set of heuristics, migration of live
function state and other ugly things. I guess this only makes
sense for slow compilers.

I'd rather work on making the compiler even faster. :-)

Bye,
     Mike