Hi,

Joseph Stewart wrote:
> I would faint in ecstasy if you said that most of the JIT was written in Lua 
> itself...

Almost. :-)

After experimenting with several approaches for machine
code generation (e.g. giant nested macros), I settled
on a preprocessing assembler (PASM).

This translates mixed C/Assembler code into pure C code.
Hidden in there is another bytecode engine that generates
machine code from action lists, intermixed with variables
and constants passed from C. It resolves labels, tries
to generate the most compact machine code encoding, etc...

And PASM is of course written in Lua. No, we don't have
a chicken-and-egg problem here, because it's only needed
once, while you compile the core. I plan on distributing
a pre-assembled C file, too (so you don't need to run
PASM unless you want to modify the JIT engine).

Have a look at this fragment (lines starting with '|'
are for PASM):

  if (ptr != NULL) {
    |  mov eax, foo+17
    |  mov edx, [eax+esi*2+0x20]
    |  add ebx, [ecx+bar(ptr, 3)]
  }

After pre-processing you get:

  if (ptr != NULL) {
    pasm_put(J, 123, foo+17, bar(ptr, 3));
  }

[Yes, you usually get the assembler code as comments and
proper CPP directives to match them up with the source.]

Here 123 is an offset into the action list buffer that
holds the partially specified machine code:

 00 B8   F4     04 8B 54 70 20 03  00 99   F0    FF
(COPY 1  IMM_D  COPY 5             COPY 1  DISP  END)

[The extra COPY action is needed to mark the opcode,
because it may be modified by the DISP action.]
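
To make the action list idea concrete, here is a minimal sketch in C
of how such an engine could walk the bytes above. It is not the actual
PASM engine. The numeric codes are guessed from the example (a byte n
below 0xF0 copies the next n+1 bytes, 0xF0 is DISP, 0xF4 is IMM_D,
0xFF is END), and the names CodeBuf, emit32 and sketch_put are made up
for illustration. DISP is assumed to patch the ModRM byte emitted just
before it, choosing an 8-bit displacement when the value fits.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Action codes guessed from the example bytes; the real values may differ. */
enum {
  ACT_DISP  = 0xF0,  /* displacement argument, patches the preceding ModRM byte */
  ACT_IMM_D = 0xF4,  /* 32-bit immediate argument */
  ACT_END   = 0xFF   /* end of one action list */
  /* any byte n below 0xF0: copy the next n+1 bytes verbatim */
};

/* Hypothetical output buffer for the generated machine code. */
typedef struct {
  uint8_t buf[64];
  size_t len;
} CodeBuf;

/* The action list from the example above. */
static const uint8_t example_actions[] = {
  0x00, 0xB8, 0xF4, 0x04, 0x8B, 0x54, 0x70, 0x20, 0x03,
  0x00, 0x99, 0xF0, 0xFF
};

/* Append a 32-bit value in little-endian byte order. */
static void emit32(CodeBuf *cb, int32_t v)
{
  uint32_t u = (uint32_t)v;
  int i;
  assert(cb->len + 4 <= sizeof cb->buf);
  for (i = 0; i < 4; i++)
    cb->buf[cb->len++] = (uint8_t)(u >> (8 * i));
}

/* Walk one action list, pulling int arguments from the varargs as needed. */
static void sketch_put(CodeBuf *cb, const uint8_t *act, ...)
{
  va_list ap;
  va_start(ap, act);
  for (;;) {
    uint8_t a = *act++;
    if (a == ACT_END) {
      break;
    } else if (a < 0xF0) {            /* COPY n+1 bytes */
      size_t n = (size_t)a + 1;
      assert(cb->len + n <= sizeof cb->buf);
      memcpy(cb->buf + cb->len, act, n);
      cb->len += n;
      act += n;
    } else if (a == ACT_IMM_D) {
      emit32(cb, va_arg(ap, int));
    } else if (a == ACT_DISP) {
      int d = va_arg(ap, int);
      uint8_t *modrm;
      assert(cb->len > 0 && cb->len + 1 <= sizeof cb->buf);
      modrm = &cb->buf[cb->len - 1];
      if (d >= -128 && d <= 127) {    /* mod=01: 8-bit displacement */
        *modrm = (uint8_t)((*modrm & 0x3F) | 0x40);
        cb->buf[cb->len++] = (uint8_t)d;
      } else {                        /* mod=10: 32-bit displacement */
        *modrm = (uint8_t)((*modrm & 0x3F) | 0x80);
        emit32(cb, d);
      }
    } else {
      assert(!"unknown action code in this sketch");
    }
  }
  va_end(ap);
}
```

Fed the 13-byte list with the arguments 0x11223344 and 8, this sketch
emits B8 44 33 22 11 8B 54 70 20 03 59 08. The displacement fits in
one byte, so the ModRM byte 99 is rewritten to 59.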


So all the grunt work of encoding the very CISCy x86
machine code is done by PASM. It consists of 500 lines of
generic and 1300 lines of x86-specific code in Lua, plus
400 lines of C for the tiny action list bytecode engine.

The JIT engine itself then consists mostly of some logic
separating the various cases, lots of calls to pasm_put and
a single static array holding all action lists. It doesn't
have a clue about x86 assembler, nor does it need to.

Oh, and PASM itself is not specific to the JIT engine.
It can be reused for any other project that wants to
emit machine code sequences (MIT license, like LuaJIT).


So here's a short real-life example from the JIT compiler:

static void jit_enc_loadnil(jit_State *J, int ra, int rb)
{
  int idx, num = rb - ra + 1;
  |  xor eax, eax
  if (num <= 8) {
    for (idx = ra; idx <= rb; idx++) {
      |  mov BASE[idx].tt, eax
    }
  } else {
    |  lea ecx, BASE[ra].tt
    |  lea edx, BASE[rb].tt
    |1:
    |  mov [ecx], eax
    |  cmp ecx, edx
    |  lea ecx, [ecx+sizeof(TValue)]
    |  jbe <1
  }
}

And this is the resulting C code:

static void jit_enc_loadnil(jit_State *J, int ra, int rb)
{
  int idx, num = rb - ra + 1;
  pasm_put(J, 1032);
  if (num <= 8) {
    for (idx = ra; idx <= rb; idx++) {
      pasm_put(J, 1121, (int)&(((StkId)0)[idx].tt));
    }
  } else {
    pasm_put(J, 1490, (int)&(((StkId)0)[ra].tt),
             (int)&(((StkId)0)[rb].tt), sizeof(TValue));
  }
}

As you can guess, the compiler runs pretty fast and needs
only very little code space for itself.

A really life-saving feature is that you can freely mix
C constants, variables, arrays and structures with the
assembler code. Something like 'mov CI, L->ci' (which
translates to 'mov ecx, [esi+0x18]') comes in very handy.
It has macros and defines, too.
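
Operands such as BASE[idx].tt and L->ci reduce to plain integer
offsets that the C compiler computes and pasm_put receives as
arguments. A small sketch of that idea follows. DemoTValue and
DemoState are hypothetical stand-ins, not the real Lua structures,
so the actual offsets (such as the 0x18 above) will differ.

```c
#include <stddef.h>

/* Hypothetical stand-in for a tagged Lua value. */
typedef struct {
  union { void *p; double n; } value;
  int tt;                               /* type tag */
} DemoTValue;

/* Hypothetical stand-in for the Lua state; field order is made up. */
typedef struct {
  void *next;
  DemoTValue *top;
  DemoTValue *base;
  void *ci;
} DemoState;

/* Displacement of slot idx's type tag relative to the stack base.
 * Yields the same number as (int)&(((DemoTValue *)0)[idx].tt),
 * but without doing arithmetic on a null pointer. */
static int demo_tt_disp(int idx)
{
  return idx * (int)sizeof(DemoTValue) + (int)offsetof(DemoTValue, tt);
}

/* Displacement a memory operand like [esi+disp] would need for L->ci,
 * assuming esi holds the DemoState pointer. */
static int demo_ci_disp(void)
{
  return (int)offsetof(DemoState, ci);
}
```

Consecutive slots differ by exactly sizeof(DemoTValue), which is why
the loop in jit_enc_loadnil can step with a single lea.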


Umm and yes, I'm thinking of adding high-level bytecode
transform and hinting passes as optional Lua modules
(much later ...). There's no point in writing any of this
in C except for the core engine.

[Sorry for being so verbose, but I thought I might as
well write it for reuse in the docs ... :-) ]

Bye,
     Mike