
Hi Duncan,

On 23 February 2017 at 13:27, Duncan Cross wrote:
> On Thu, Feb 23, 2017 at 12:59 PM, Sergey Rozhenko wrote:
>> On Wed 22.02.17 22:13, Dibyendu Majumdar wrote:
>>> A natural follow on question is can the interpreter be made faster
>>> without resorting to hand-written assembly code or other esoteric
>>> optimisations?
>> Big Yes. When I switched from plain Lua to LuaJIT I was very happy with
>> the speedup. I didn't run the numbers, but it felt like 10x increase in
>> speed. Then I found out that mobdebug.lua was disabling JIT, so all this
>> speedup was from faster interpreter. Speedup from enabling JIT wasn't nearly
>> as big.
>> This speedup probably comes from faster table access, but I don't know how
>> LuaJIT achieves it.
> It comes from exactly what Dibyendu wanted to avoid: hand-written assembly
> code. The LuaJIT interpreter is completely rewritten this way.

I think that, in addition to being written in assembly and inlining
various table operations, LuaJIT employs other optimisations. In
particular it uses a more compact representation for Lua values (8 bytes
rather than 16 bytes), and it merges the stack frame information into
the Lua stack of values. I am not sure whether there is a single most
important optimisation, one that increases performance the most; if
anyone has a view on this then please let me know. We know that the
combination of optimisations leads to an almost 2x improvement in
interpreted mode, which is very impressive.
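
To make the 8-byte versus 16-byte point concrete, here is a rough sketch in C of the two layouts. Everything in it is an assumption for illustration only: the names WideValue and BoxedValue, the tag constant and the "canonical NaN" threshold are made up and are not LuaJIT's or Lua's actual encoding. The idea is simply that a double has many unused NaN bit patterns, so a type tag and a small payload can hide inside one of them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "wide" value: a payload union plus a separate type tag.
 * On common 64-bit ABIs, padding brings this to 16 bytes. */
typedef struct {
    union { double n; int64_t i; void *p; } u;
    int tag;
} WideValue;

/* Hypothetical 8-byte value: either a plain double, or a tag stored in
 * the upper 16 bits of a NaN bit pattern with a 32-bit payload below. */
typedef uint64_t BoxedValue;

#define BOX_CANON_NAN 0xFFF8000000000000ULL /* assumed canonical NaN     */
#define BOX_TAG_MASK  0xFFFF000000000000ULL
#define BOX_TAG_INT32 0xFFF9000000000000ULL /* illustrative tag value    */

/* Doubles are stored as their raw bits (NaNs are assumed to have been
 * canonicalised to BOX_CANON_NAN before boxing). */
static BoxedValue box_number(double d) {
    BoxedValue v;
    memcpy(&v, &d, sizeof v);
    return v;
}

static double unbox_number(BoxedValue v) {
    double d;
    memcpy(&d, &v, sizeof d);
    return d;
}

/* Anything at or below the canonical NaN pattern is an ordinary double. */
static int is_number(BoxedValue v) { return v <= BOX_CANON_NAN; }

static BoxedValue box_int32(int32_t i) {
    return BOX_TAG_INT32 | (uint64_t)(uint32_t)i;
}

static int is_int32(BoxedValue v) {
    return (v & BOX_TAG_MASK) == BOX_TAG_INT32;
}

static int32_t unbox_int32(BoxedValue v) {
    assert(is_int32(v));
    return (int32_t)(uint32_t)v; /* low 32 bits carry the payload */
}
```

Note that only a 32-bit integer fits under the tag in this sketch; the type check is a single compare or mask on one machine word, and a stack of such values is half the size of a stack of WideValue.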

I think I can inline table operations even in C code without having to
resort to assembly. Additionally, by making other changes I may be able
to convince the C optimizer to do a better job. But supporting
64-bit integers prevents the 8-byte value representation (a full 64-bit
integer leaves no room for a type tag in an 8-byte value), and
changing the way the stack frames are coded would mean a large amount of
rework (though it is doable, as I don't think it requires rewriting Lua in
assembly code).
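
As a sketch of what inlining a table operation in C could look like, the fragment below keeps the common case (an integer key that lands in the array part) in a small inline function and pushes everything else to an out-of-line slow path. The Value and Table types, the function names and the slow_path_calls counter are all hypothetical, invented for this example; they are not the real Lua or LuaJIT structures, and the hash part is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value and table types, for illustration only. */
typedef struct { int is_nil; double n; } Value;

typedef struct {
    Value *array;       /* array part, holding keys 1..array_size */
    size_t array_size;
} Table;

static const Value NIL_VALUE = { 1, 0.0 };
static int slow_path_calls = 0; /* instrumentation for the example */

/* Out-of-line slow path. A real implementation would search the hash
 * part and consult metatables; this sketch just reports nil. */
static const Value *table_get_slow(const Table *t, int64_t key) {
    (void)t;
    (void)key;
    slow_path_calls++;
    return &NIL_VALUE;
}

/* Inlined fast path: one unsigned compare covers both bounds, since
 * key is in [1, array_size] exactly when (uint64_t)key - 1 < array_size. */
static inline const Value *table_get_int(const Table *t, int64_t key) {
    assert(t != NULL);
    if ((uint64_t)key - 1u < (uint64_t)t->array_size)
        return &t->array[key - 1];
    return table_get_slow(t, key);
}
```

With the fast path visible at the call site, the compiler can fold the bounds check into the bytecode handler, so the function call overhead is only paid when the key misses the array part.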

The other area where performance is lost is the function call sequence,
where I think LuaJIT is able to improve performance significantly.
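
One idea mentioned earlier, keeping the frame information in the stack of values itself, may be part of why calls can be made cheaper, though that connection is my guess. The sketch below shows the general shape under simplifying assumptions: each call writes a single "frame link" slot (the distance back to the caller's base) into the value stack, so entering and leaving a frame touches only the stack and two pointers, with no separate call-info record. The VM type and the vm_* functions are hypothetical and much simpler than anything in LuaJIT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VM_STACK_SLOTS 256

typedef uint64_t Slot; /* one 8-byte slot: a value or a frame link */

typedef struct {
    Slot stack[VM_STACK_SLOTS];
    Slot *base; /* first slot of the current frame */
    Slot *top;  /* first free slot */
} VM;

static void vm_init(VM *vm) {
    vm->stack[0] = 0;           /* dummy link for the root frame */
    vm->base = vm->stack + 1;
    vm->top = vm->base;
}

static void vm_push(VM *vm, Slot v) {
    assert(vm->top < vm->stack + VM_STACK_SLOTS);
    *vm->top++ = v;
}

/* Enter a new frame: store the distance from the new base back to the
 * caller's base in the slot just below the new base. */
static void vm_call(VM *vm) {
    Slot *link = vm->top;
    assert(link + 1 < vm->stack + VM_STACK_SLOTS);
    *link = (Slot)(link + 1 - vm->base);
    vm->base = link + 1;
    vm->top = vm->base;
}

/* Leave the current frame: drop its slots and the link, then restore
 * the caller's base from the link. Must not be used on the root frame. */
static void vm_return(VM *vm) {
    Slot *link = vm->base - 1;
    assert(link > vm->stack);
    vm->top = link;
    vm->base = vm->base - (size_t)*link;
}
```

For example, after pushing two values, calling, pushing one more value and returning, base and top are back where they were before the call and the caller's two values are untouched.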

My knowledge of LuaJIT's implementation is patchy, however, as the code
is hard to read and understand. If anyone can provide more insight
then please do so.