- Subject: Re: Lua 5 VM questions
- From: Mike Pall <mikelu-0501@...>
- Date: Tue, 25 Jan 2005 13:50:02 +0100
Hi,
Christian Müller wrote:
> 1. I read that it is a register vm. How do you solve argument passing?
The Lua 5 VM employs a sliding register window on top of a stack. Frames
(named CallInfo aka 'ci' in the source) occupy different (overlapping)
ranges on the stack. Successive frames are positioned exactly over the
passed arguments (luaD_precall). The compiler ensures that there are no
live variables after the arguments for a call. Return values need to be
copied down (with truncate/extend) to the slot holding the function object
(luaD_poscall). This is because the compiler has no idea how many values
another function may return -- only how many need to be stored.
Example:
  function f2(arg1, arg2, ..., argN)
    local local1, local2, ...
    ...
    return ret1, ret2, ..., retO
  end

  function f1(arg1, arg2, ..., argM)
    local local1, local2, ...
    ...
    local ret1, ret2, ..., retP = f2(arg1, arg2, ..., argN)
    ...
  end
Simplified stack diagram:
stack
|
| time:      >>>> call >>>>>>>>>>>>>>>>>> call ~~~~~~~~~~~~~~~~~~ return >>>>>
|
| ciX.func-> f1      f1      f1                                 f1
| ciX.base-> arg1    arg1    arg1                               arg1
|            arg2    arg2    arg2                               arg2
|            ...     ...     ...                                ...
|            argM    argM    argM                               argM
| ciX.topC->         local1  local1                             local1
|                    local2  local2                             local2
|                    local3  local3                             local3
|                    ...     ...                                ...
|                            f2      ciY.func-> f2      f2      ret1
|                            arg1    ciY.base-> arg1    arg1    ret2
|                            arg2               arg2    arg2    ...
|                            ...                ...     ...     retP
|                            argN               argN    argN
| ciX.topL-> ------  ------  ------  ciY.topC-> local1  local1
|                                               local2  local2
|                                               ...     ...
|                                                       ret1
|                                                       ret2
|                                                       ...
|                                                       retO
|                                    ciY.topL-> ------  ------
V
Note that there is only a single 'top' for each frame:
For Lua functions the top (tagged topL in the diagram) is set to the base
plus the maximum number of slots used. The compiler knows this and stores
it in the function prototype. The top pointer is used only temporarily
for handling variable length argument and return value lists.
For C functions the top (tagged topC in the diagram) is initially set to
the base plus the number of passed arguments. C functions can access their
part of the stack via Lua API calls which in turn change the stack top.
C functions return an integer that indicates the number of return values
relative to the stack top.
In reality things are a bit more complex due to overlapped locals, block
scopes, varargs, coroutines and a few other things. But this should get
you the basic idea.
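
The window mechanics described above can be sketched in a few lines of C. This is a toy model, not the Lua source: the stack holds plain ints, NIL is a stand-in sentinel, and the names Frame, toy_precall and toy_poscall are hypothetical (only loosely modelled on luaD_precall and luaD_poscall). It assumes the caller has already stored the function object and its arguments in adjacent slots.

```c
#include <assert.h>
#include <stddef.h>

#define NIL (-1)   /* stand-in for Lua's nil in this toy model */

/* Hypothetical frame record; slots are indices into one shared stack. */
typedef struct {
    size_t func;   /* slot holding the called function object */
    size_t base;   /* first argument, i.e. register 0 of the frame */
    size_t top;    /* one past the last slot the frame may use */
} Frame;

/* Open a frame directly over arguments the caller already stored at
   func+1, func+2, ...; nothing is copied. 'maxslots' plays the role
   of the slot count the compiler records in the function prototype. */
static Frame toy_precall(size_t func, size_t maxslots) {
    Frame f;
    assert(maxslots > 0);
    f.func = func;
    f.base = func + 1;
    f.top  = f.base + maxslots;
    return f;
}

/* Copy 'nresults' values starting at slot 'first' down to the callee's
   function slot, truncating or padding with NIL so that exactly
   'wanted' values are stored. Returns the slot just past the last one.
   The destination is always below the source, so a forward copy is safe. */
static size_t toy_poscall(int *stack, const Frame *callee,
                          size_t first, size_t nresults, size_t wanted) {
    size_t dst = callee->func;
    size_t i;
    assert(first > dst);
    for (i = 0; i < wanted; i++)
        stack[dst + i] = (i < nresults) ? stack[first + i] : NIL;
    return dst + wanted;
}
```

With f1 opened at slot 0 and f2 opened at slot 5, f2's base (slot 6) lies inside f1's range, which is the overlap shown in the diagram. Calling toy_poscall with three results and wanted = 2 leaves two values at slots 5 and 6; with wanted = 5 the last two slots are filled with NIL.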
> Do you deviate from the typical register based architecture in
> that case to save memory traffic?
I think the architecture is pretty unique as far as VMs go. Some CPUs
have sliding register windows, but this gets quite complicated since they
need to spill/fill registers to/from the stack. A VM can of course use an
unbounded (reallocated) stack on the heap.
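
As a rough illustration of the "unbounded (reallocated) stack on the heap" idea, here is a minimal sketch of a growable slot array. VMStack and vmstack_reserve are hypothetical names, the slots are plain ints, and the doubling policy is an assumption of this sketch rather than a description of what the Lua sources do.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable VM stack; slots are plain ints in this sketch. */
typedef struct {
    int   *slots;   /* heap-allocated storage, may move when it grows */
    size_t size;    /* number of allocated slots */
} VMStack;

/* Make sure slot indices 0 .. needed-1 exist, doubling the allocation
   until it is large enough. Returns 0 on success, or -1 if realloc
   fails, in which case the stack is left unchanged. Overflow of the
   size computation is not handled in this sketch. */
static int vmstack_reserve(VMStack *s, size_t needed) {
    size_t newsize;
    int *p;
    assert(s != NULL);
    if (needed <= s->size)
        return 0;
    newsize = s->size ? s->size : 8;
    while (newsize < needed)
        newsize *= 2;
    p = realloc(s->slots, newsize * sizeof *p);   /* realloc(NULL, n) acts like malloc */
    if (p == NULL)
        return -1;
    s->slots = p;
    s->size  = newsize;
    return 0;
}
```

Because realloc may move the block, frames in a design like this either refer to slots by index or have their pointers fixed up after every growth; the sketch takes no position on which is better.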
> 2. As far as I learned you do instruction encoding close to hardware
> architectures. Therefore you always have to decode the opcode in contrast to
> the JVM where opcode and arguments are stored in several independent bytes.
> Is opcode decoding cheap (one might forgive my poor knowledge of C operator
> performance;-)?
All instructions are 32 bit. The current layout as of Lua 5.1work4 is:
BBBBBBBB BCCCCCCC CCAAAAAA AAOOOOOO ABC format
BBBBBBBB BBBBBBBB BBAAAAAA AAOOOOOO ABx format
sBBBBBBB BBBBBBBB BBAAAAAA AAOOOOOO AsBx format
Fetching a 32 bit value once from memory and then extracting the bits to
other registers is cheaper than doing single-byte fetches for variable
length operands. Byte alignment does not matter at all (word alignment does).
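
To give a feel for how cheap the extraction is, here is a minimal sketch of decoding with one shift and one mask per field. The widths and positions are read off the layout above (opcode 6 bits, A 8 bits, C 9 bits, B 9 bits, Bx 18 bits). All function and constant names are hypothetical. One assumption goes beyond the post: the signed sBx field is treated as bias-encoded (stored value minus 131071) rather than as a separate sign bit, which matches my understanding of the released Lua 5.1 sources but is not stated here.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths and bit positions, with the opcode in the lowest bits. */
enum {
    SIZE_OP = 6, SIZE_A = 8, SIZE_C = 9, SIZE_B = 9,
    SIZE_BX = SIZE_B + SIZE_C,    /* 18 */
    POS_OP  = 0,
    POS_A   = POS_OP + SIZE_OP,   /* 6  */
    POS_C   = POS_A + SIZE_A,     /* 14 */
    POS_B   = POS_C + SIZE_C,     /* 23 */
    POS_BX  = POS_C               /* 14 */
};

/* Assumption: sBx is stored as Bx minus this bias (131071). */
#define SBX_BIAS ((int32_t)(((UINT32_C(1) << SIZE_BX) - 1) >> 1))

/* Extract 'size' bits starting at bit 'pos': one shift, one mask. */
static uint32_t field(uint32_t insn, int pos, int size) {
    return (insn >> pos) & ((UINT32_C(1) << size) - 1);
}

static uint32_t get_op(uint32_t i)  { return field(i, POS_OP, SIZE_OP); }
static uint32_t get_a(uint32_t i)   { return field(i, POS_A,  SIZE_A);  }
static uint32_t get_b(uint32_t i)   { return field(i, POS_B,  SIZE_B);  }
static uint32_t get_c(uint32_t i)   { return field(i, POS_C,  SIZE_C);  }
static uint32_t get_bx(uint32_t i)  { return field(i, POS_BX, SIZE_BX); }
static int32_t  get_sbx(uint32_t i) { return (int32_t)get_bx(i) - SBX_BIAS; }

/* Pack an ABC-format instruction; each field must fit its width. */
static uint32_t make_abc(uint32_t op, uint32_t a, uint32_t b, uint32_t c) {
    assert(op < (UINT32_C(1) << SIZE_OP) && a < (UINT32_C(1) << SIZE_A));
    assert(b < (UINT32_C(1) << SIZE_B) && c < (UINT32_C(1) << SIZE_C));
    return (op << POS_OP) | (a << POS_A) | (c << POS_C) | (b << POS_B);
}

/* Pack an AsBx-format instruction using the biased encoding. */
static uint32_t make_asbx(uint32_t op, uint32_t a, int32_t sbx) {
    assert(op < (UINT32_C(1) << SIZE_OP) && a < (UINT32_C(1) << SIZE_A));
    assert(sbx >= -SBX_BIAS && sbx <= SBX_BIAS);
    return (op << POS_OP) | (a << POS_A)
         | ((uint32_t)(sbx + SBX_BIAS) << POS_BX);
}
```

After the single 32 bit load, every accessor works purely on a register value, so no further memory access is needed to get at the operands.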
Memory bandwidth is usually not an issue for VM instructions since there
is so much else going on for each instruction. It's much more important
to keep the execution units busy by avoiding interlocks caused by memory
fetches. Tuning the code to make it easy for the compiler to generate
good code is another issue (the Lua authors have done quite a bit of
tuning in some important spots).
Bye,
Mike