[Date Prev][Date Next][Thread Prev][Thread Next]
- Subject: Re: LuaJIT IR Design
- From: Mike Pall <mikelu-1105@...>
- Date: Fri, 13 May 2011 17:44:43 +0200
Maxime Chevalier-Boisvert wrote:
> 1. The post mentions using bidirectionally extensible arrays to
> store instructions. Is one such array used per basic block? If not,
> how are basic blocks efficiently allocated/moved within the array?
It's a trace compiler -- it doesn't work at the basic block level.
One might consider a single trace to be similar to an extended
basic block (one entry, multiple exits). But a root trace may also
hold a loop, so that's where the analogy breaks.
> What's the advantage of being able to extend the array
Constants grow downwards and don't interfere with regular
instructions. This avoids some difficulties esp. for loop
optimizations, which may need to emit new constants, but want
their loads to be hoisted.
Having the lowest numbered IR references also implicitly gives
them a lower weight in the register allocation heuristics. Thus
rematerialization is preferred over spilling.
And the killer feature is to be able to test for a constant simply
by looking at the reference. IR references are biased relative to
the cut off point between the two parts of the IR. Anything below
the bias value (< 32768) must be a constant. This test is heavily
used and it's very cheap.
The IR array is lazily grown and reused for subsequent traces. In
practice very few moves or array extensions are needed. Once a
trace has been assembled, it's copied/compacted to a newly
allocated memory area.
> My optimizer does alot of IR transforms which
> merge/eliminate basic blocks, remove/replace/insert new
> instructions. How is that done efficiently in LuaJIT?
The former does not apply to a trace compiler. The latter can be
avoided by doing forward transforms on-the-fly during trace
recording and backward transforms on-the-fly during assembly.
> 2. In my current layout I also keep a list of uses for each IR
> instruction (all instructions reading the output value of a given
> instruction). This is very useful when I want to replace the value
> of an instruction by that of another instruction. What kind of trick
> does LuaJIT use to do this efficiently?
Folding happens before an instruction is actually emitted. But it
may cause new instructions to be emitted and old instructions to
fall unused, too. Unused instructions are simply ignored by the
backward pass of the assembler. This gives you DCE for free.
Backward transforms already start from a use and don't need a list
of other uses.
> 3. How are constants encoded in the LuaJIT IR? I saw it uses 16 bit
> references to refer (I assume) to indices of other IR instructions
> elsewhere in the instruction array. Are constants also stored in the
> array, as instructions?
Constants are stored like any other IR instructions. They just
grow downwards in the IR.
Note: the IR dump (-jdump=i) hides all of these details and merges
constants into the output of the regular IR instructions.
> 4. I saw the IR instructions have up to 2 operands. What about call
A chain of call arguments (CARG) is used for extension. The IR
dump hides this detail. It's much more important to have an
orthogonal and linearly indexable IR (always exactly 64 bits per
instruction) than to optimize for special cases.