> I'm also just generically curious about this change. What led to it?
> What are the tradeoffs?

In Lua 4.0 we had already decided to use a whole 32-bit integer for each
instruction, because of alignment problems with arguments inside
bytecodes. But with 32-bit instructions and a stack machine, most
instructions leave most of their bits unused. Also, we know that copying
values is a "slow" operation in Lua (because each value is a structure).
So we decided to try this register-based architecture.
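
As a rough sketch of how an opcode plus three operands can share one
32-bit word: the field widths below (6-bit opcode, 8-bit A, 9-bit B and C),
their positions, and the opcode number are assumptions chosen for
illustration, not something stated in this message.

```python
# Hypothetical layout of one 32-bit instruction word.
# The 6/8/9/9 split and the field order are illustrative assumptions.
SIZE_OP, SIZE_A, SIZE_B, SIZE_C = 6, 8, 9, 9

POS_OP = 0
POS_C = POS_OP + SIZE_OP
POS_B = POS_C + SIZE_C
POS_A = POS_B + SIZE_B


def mask(size):
    """Return a bit mask with the low `size` bits set."""
    return (1 << size) - 1


def encode(op, a, b, c):
    """Pack an opcode and three operands into one 32-bit integer."""
    for value, size in ((op, SIZE_OP), (a, SIZE_A), (b, SIZE_B), (c, SIZE_C)):
        if not 0 <= value <= mask(size):
            raise ValueError("operand does not fit in its field")
    return (op << POS_OP) | (a << POS_A) | (b << POS_B) | (c << POS_C)


def decode(word):
    """Unpack a 32-bit instruction word into (op, a, b, c)."""
    return (
        (word >> POS_OP) & mask(SIZE_OP),
        (word >> POS_A) & mask(SIZE_A),
        (word >> POS_B) & mask(SIZE_B),
        (word >> POS_C) & mask(SIZE_C),
    )


OP_ADD = 12  # made-up opcode number, for illustration only
word = encode(OP_ADD, 0, 0, 250)  # destination, first source, second source
fields = decode(word)
```

With three operand fields available, a single instruction can name its
destination and both sources, which a stack instruction has no use for.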

One big advantage is that all locals live in registers, so most
opcodes can manipulate them directly (saving lots of copies from
GETLOCAL/SETLOCAL). The total number of instructions is considerably smaller
than in a stack-oriented machine. For each saved instruction you save
a lot of overhead (hook checking, decoding, etc.). Of course, each
instruction is a little more expensive, as you have to decode more
arguments, but overall there is a good improvement. An extreme example
is the code for "a=a+1" where `a' is local: In Lua 4.0 we have

     2  [2]     GETLOCAL    0       ; a
     3  [2]     ADDI        1
     4  [2]     SETLOCAL    0       ; a

In Lua 5.0 it becomes

     2  [2]     ADD         0 0 250 ; 1

(meaning "add the first constant to the contents of register 0, and
store the result in register 0".)  Something similar happens for "a.x = b",
with `a' and `b' locals. Lua 4.0 generates the first listing below, and
Lua 5.0 the single instruction after it:

     2  [2]     GETLOCAL        0       ; a
     3  [2]     PUSHSTRING      0       ; "x"
     4  [2]     GETLOCAL        1       ; b
     5  [2]     SETTABLE        3 3


     2  [2]     SETTABLE        1 0 250 ; "x"
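
The instruction counts for "a=a+1" can be checked with a toy model of the
two machines. This is only a sketch: the opcode names follow the listings
above, but the semantics given to them here, and the rule that an operand
of 250 or more indexes the constant table, are assumptions inferred from
those listings rather than the real implementation.

```python
# Toy interpreters for the "a=a+1" example. Assumed, not taken from Lua:
# the exact opcode semantics and the CONST_BASE rule below.
CONST_BASE = 250  # assumption: operands >= 250 refer to constants


def run_stack(code, local_vars):
    """Run stack-machine code; return how many instructions were dispatched."""
    stack = []
    dispatched = 0
    for op, arg in code:
        dispatched += 1
        if op == "GETLOCAL":
            stack.append(local_vars[arg])   # copy the local onto the stack
        elif op == "ADDI":
            stack[-1] = stack[-1] + arg     # add an immediate to the stack top
        elif op == "SETLOCAL":
            local_vars[arg] = stack.pop()   # copy the stack top back
        else:
            raise ValueError("unknown opcode: " + op)
    return dispatched


def run_register(code, registers, constants):
    """Run register-machine code; return how many instructions were dispatched."""

    def rk(index):
        # An operand is either a register or, from CONST_BASE up, a constant.
        if index >= CONST_BASE:
            return constants[index - CONST_BASE]
        return registers[index]

    dispatched = 0
    for op, a, b, c in code:
        dispatched += 1
        if op == "ADD":
            registers[a] = rk(b) + rk(c)    # operate on the local in place
        else:
            raise ValueError("unknown opcode: " + op)
    return dispatched


stack_locals = [41]
stack_count = run_stack(
    [("GETLOCAL", 0), ("ADDI", 1), ("SETLOCAL", 0)], stack_locals)

registers = [41]
register_count = run_register([("ADD", 0, 0, 250)], registers, [1])
```

Both runs leave 42 in `a'; the stack version goes through the dispatch
loop three times and copies the value twice, the register version once
with no copies.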

The main disadvantage is that the code generator is much more complex.
For a stack machine, code generation is straightforward: there is not
much optimization you can do, even if you want to. For a register machine
to pay off, you must do some optimizations (such as direct access to
local variables).
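
To make the difference concrete, here is a minimal sketch of two code
generators for a statement of the form "dst = lhs + rhs". Everything in it
is hypothetical: the function names, the PUSHNUM opcode, and the
convention that operands from 250 up denote constants are assumptions for
illustration, and the stack generator is deliberately naive (it does not
try to use an immediate-add instruction).

```python
# Hypothetical code generators for `dst = lhs + rhs`, where each operand
# is either a local variable name (str) or a numeric constant.
CONST_BASE = 250  # assumption: operands >= 250 refer to constants


def gen_stack(dst, lhs, rhs, local_slots):
    """Stack code: push both operands, add, pop the result into dst."""
    code = []
    for operand in (lhs, rhs):
        if isinstance(operand, str):
            code.append(("GETLOCAL", local_slots[operand]))
        else:
            code.append(("PUSHNUM", operand))  # hypothetical opcode name
    code.append(("ADD",))
    code.append(("SETLOCAL", local_slots[dst]))
    return code


def gen_register(dst, lhs, rhs, local_slots, constants):
    """Register code: address locals and constants directly in one ADD."""

    def operand_index(operand):
        if isinstance(operand, str):
            return local_slots[operand]        # the local already is a register
        if operand not in constants:
            constants.append(operand)          # intern a new constant
        return CONST_BASE + constants.index(operand)

    return [("ADD", local_slots[dst],
             operand_index(lhs), operand_index(rhs))]


slots = {"a": 0}
constant_table = []
stack_code = gen_stack("a", "a", 1, slots)
register_code = gen_register("a", "a", 1, slots, constant_table)
```

The register version only stays this short because it knows that `a'
already lives in register 0. A real generator must also allocate
temporary registers for nested expressions, which is where most of the
extra complexity comes from.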

It is difficult to estimate the real gain from this change alone, because
there are many other changes too. But I would say that for "opcode
intensive" code (that is, code with few function calls and lots of
locals) the gain in performance is around 20%.

-- Roberto