It was thus said that the Great Roberto Ierusalimschy once stated:
> > I have been thinking of ways of making the Ravi interpreter faster
> > without having to resort to assembly code etc. So far one of the areas
> > I have not really looked at is how to improve the bytecode decoding.
> > The main issue is that the operands B and C require an extra bit
> > compared to operand A, so we cannot make them all 8 bits...
> 
> Given a 32- or 64-bit machine, why decoding 8-bit operands would be
> faster/better than 9-bit ones?

  I would think less shifting, but not being 100% sure, I decided to test my
assumptions.  I wrote:

void split8(unsigned int *dest,unsigned int op)
{
  dest[0] = (op >> 24) & 0xFF;
  dest[1] = (op >> 16) & 0xFF;
  dest[2] = (op >>  8) & 0xFF;
  dest[3] = (op      ) & 0xFF;
}

void split9(unsigned int *dest,unsigned int op)
{
  dest[0] = (op >> 27) & 0x01F;
  dest[1] = (op >> 18) & 0x1FF;
  dest[2] = (op >>  9) & 0x1FF;
  dest[3] = (op      ) & 0x1FF;
}

and the output of "gcc -O3 -fomit-frame-pointer" on 64-bit x86 (the flags are
there to eliminate any excess instructions and make the comparison easier)
wasn't all that much different between the two: split9() was four bytes
longer (42 bytes vs. 38 bytes for split8()).  I then did:

union op8
{
  struct
  {
    unsigned int op : 8;
    unsigned int a  : 8;
    unsigned int b  : 8;
    unsigned int c  : 8;
  } fields;
  unsigned int full;
};

void split8(unsigned int *dest,union op8 x)
{
  dest[0] = x.fields.op;
  dest[1] = x.fields.a;
  dest[2] = x.fields.b;
  dest[3] = x.fields.c;
}

union op9
{
  struct
  {
    unsigned int op : 5;
    unsigned int a  : 9;
    unsigned int b  : 9;
    unsigned int c  : 9;
  } fields;
  unsigned int full;
};

void split9(unsigned int *dest,union op9 x)
{
  dest[0] = x.fields.op;
  dest[1] = x.fields.a;
  dest[2] = x.fields.b;
  dest[3] = x.fields.c;
}

  There's not much of a difference between the four, so I would still run
tests to see which one would be "best."
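
  For what it's worth, here is a minimal sketch of what such a test could
look like for the two shift-and-mask versions.  The helper time_split(), the
split_fn typedef, the sink parameter and the LCG constants used to vary the
input are all my own assumptions for illustration, not something from the
measurements above.  It also assumes unsigned int is 32 bits, as the original
functions do.

```c
#include <time.h>

/* Shift-and-mask decoders, same logic as in the message above. */
static void split8(unsigned int *dest, unsigned int op)
{
  dest[0] = (op >> 24) & 0xFF;
  dest[1] = (op >> 16) & 0xFF;
  dest[2] = (op >>  8) & 0xFF;
  dest[3] = (op      ) & 0xFF;
}

static void split9(unsigned int *dest, unsigned int op)
{
  dest[0] = (op >> 27) & 0x01F;
  dest[1] = (op >> 18) & 0x1FF;
  dest[2] = (op >>  9) & 0x1FF;
  dest[3] = (op      ) & 0x1FF;
}

/* Hypothetical helper type: any decoder with the signature used above. */
typedef void (*split_fn)(unsigned int *, unsigned int);

/*
 * Hypothetical timing helper: calls fn() iters times and returns the
 * CPU time used, in seconds.  The decoded fields are summed into *sink
 * so the compiler cannot throw the work away, and the input changes on
 * every iteration (a simple LCG step) so the result is not a constant.
 */
static double time_split(split_fn fn, unsigned long iters, unsigned int *sink)
{
  unsigned int  dest[4];
  unsigned int  acc = 0;
  unsigned int  op  = 0x12345678u;
  unsigned long i;
  clock_t       start;
  clock_t       end;

  start = clock();
  for (i = 0; i < iters; i++)
  {
    fn(dest, op);
    acc += dest[0] + dest[1] + dest[2] + dest[3];
    op   = op * 1664525u + 1013904223u;
  }
  end = clock();

  *sink = acc;
  return (double)(end - start) / CLOCKS_PER_SEC;
}
```

  From main() one would call time_split(split8,N,&sink) and
time_split(split9,N,&sink) with a large N and print both times.  Take the
numbers with a grain of salt: the optimizer may still inline or specialize
the call through the function pointer, and clock() is fairly coarse.  Also
keep in mind that the order of bit-fields within a word is
implementation-defined, so the union versions need not put "op" in the same
bits as the shift versions do.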

  -spc