Miles Bader wrote:
> Mike Pall writes:
> > Since LuaJIT makes use of all CPU features on both x86 and x64,
> > there's little performance difference.
> Does luajit not benefit from the amd64 architectural/ABI differences in
> the same way other code does?

It's the other way round: LuaJIT doesn't suffer as much from the
deficiencies of x86 mode and its suboptimal ABI.

You can compare the performance of Lua and LuaJIT _across_ both
architectures here:

First click on plain Lua for both x86 and x64 and you'll see a
moderate speedup for x64. This is mainly because the C compiler
has trouble managing the fewer registers on x86 in the main
interpreter loop. The better calling conventions on x64 and the
use of SSE for FP arithmetic account for the rest.
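
To make the register-pressure point concrete, here is a toy
register-based interpreter loop in C. It is only a sketch: the
opcodes, the Instr layout and the run() function are made up for
illustration and are not taken from Lua's actual interpreter. The
point is the number of values that stay live across every iteration
(pc, regs, consts, the decoded instruction and the FP temporaries).
32-bit x86 has 8 general-purpose registers, one of which is the stack
pointer, so a compiler is more likely to spill some of them than on
x64 with its 16.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical three-operand bytecode; not Lua's real instruction set. */
enum { OP_LOADK, OP_ADD, OP_MUL, OP_RET };

typedef struct {
    uint8_t op, a, b, c;
} Instr;

/*
 * Toy interpreter loop. pc, regs and consts are live for the whole
 * loop; each handler adds its own temporaries on top of that.
 */
double run(const Instr *code, const double *consts, double *regs)
{
    const Instr *pc = code;
    for (;;) {
        Instr ins = *pc++;              /* fetch and advance */
        switch (ins.op) {
        case OP_LOADK:                  /* regs[a] = consts[b] */
            regs[ins.a] = consts[ins.b];
            break;
        case OP_ADD:                    /* regs[a] = regs[b] + regs[c] */
            regs[ins.a] = regs[ins.b] + regs[ins.c];
            break;
        case OP_MUL:                    /* regs[a] = regs[b] * regs[c] */
            regs[ins.a] = regs[ins.b] * regs[ins.c];
            break;
        case OP_RET:                    /* return regs[a] */
            return regs[ins.a];
        default:
            assert(0 && "unknown opcode");
            return 0.0;
        }
    }
}
```

Compiling something like this for both targets and comparing the
generated assembly for the loop body is one way to see the extra
stack traffic on x86.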

Then click on the LuaJIT interpreter for both x86 and x64 and
you'll barely see a difference! This is because the LuaJIT
interpreter is written in hand-tuned assembler using manual
register allocation. It makes near optimal use of x86 registers,
so the extra registers on x64 don't help. The tiny improvements
are due to some faster libc functions (sum-file) or the use of SSE
in the interpreter (euler14-bit, scimark).

Now click on LuaJIT git HEAD (JIT compiler enabled) for both x86
and x64. Again, there's little difference. Apart from the faster
libc (sum-file), some specialized hash lookups make use of 64-bit
compares on x64 (nbody, revcomp). The register allocator is pretty
good, so only a few benchmarks benefit from the extra registers on
x64 (ray, scimark, fasta). Also, a trace compiler is able to lower
the register pressure on hot code paths compared to a traditional
method-at-a-time compiler.
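
The 64-bit compare point can be sketched in C as follows. This
assumes a hash key stored in one 8-byte slot as a 32-bit payload plus
a 32-bit type tag; the Key struct and both function names are
hypothetical and not LuaJIT's actual definitions. On a 32-bit target
an equality check on such a slot naturally becomes two compares,
while a 64-bit target can do it in one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte key slot: 32-bit payload plus 32-bit type tag. */
typedef struct {
    uint32_t payload;
    uint32_t tag;
} Key;

static_assert(sizeof(Key) == 8, "Key must fill exactly one 64-bit slot");

/* Two 32-bit compares: what a 32-bit target naturally does. */
int key_eq_32(const Key *a, const Key *b)
{
    return a->tag == b->tag && a->payload == b->payload;
}

/* One 64-bit compare: what a 64-bit target can do instead. */
int key_eq_64(const Key *a, const Key *b)
{
    uint64_t x, y;
    memcpy(&x, a, sizeof x);    /* memcpy sidesteps aliasing issues */
    memcpy(&y, b, sizeof y);
    return x == y;
}
```

Both functions give the same answer; only the number of compare
instructions on the hot path differs.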

Summary: LuaJIT doesn't need to care about most of the things that
slow down C code on x86. Conversely, the speedup on x64 is much
less pronounced, because LuaJIT already makes good use of the CPU
features in x86 mode.