lua-users home
lua-l archive

Alex Davies wrote:
> Besides, any jiter should inline the call. (and luajit 2.x may even do a 
> good job of avoiding the :str lookup).

Yep. There is no effective performance difference between a
builtin operator and a call to a library function. The compiler
needs to "know" what the library function does, of course.

Case in point: I've reimplemented MD5 in pure Lua using the bit
library functions provided in LJ2. This is a twisted maze of
function applications:

  local function tr_g(a, b, c, d, x, s)
    return rol(bxor(c, band(d, bxor(b, c))) + a + x, s) + b
  end
  [...]
  c = tr_g(c, d, a, b, x[12] + 0x265e5a51, 14)
  b = tr_g(b, c, d, a, x[ 1] + 0xe9b6c7aa, 20)
  [...]

But the resulting code is almost on par with the code a C compiler
generates. Excerpt from round 19/20:

  [...]
  mov esi, [esp+0x34]
  add esi, 0x265e5a51
  add edx, eax
  add esi, edx
  rol esi, 0x0e
  lea edx, [ebx+esi]
  mov eax, edx
  xor eax, ebx
  mov esi, ecx
  and esi, eax
  mov eax, ebx
  xor eax, esi
  mov esi, [esp+0x60]
  add esi, 0xe9b6c7aa
  [...]

(There's still one missed opportunity for a lea.)

> (Unrelated though, but something I'd like to see in jit 2.x would be things 
> such as "if str:sub(1, 2) == "__" then" remove the string creation and just 
> compare the first two bytes - just because it's a fairly common operation 
> in my code at least).

Here's what the generated IR looks like when displayed in tree form
(the IR itself is linear). I've left out the str:sub dispatch and
the guard that ensures str has at least 2 characters:

  SPTR str 0  KINT 2
      \      /
        SNEW     KSTR "__"
           \    /
             EQ  -> exit

  SPTR str ofs  Returns a pointer to the string data + ofs of a string object.
  SNEW ptr len  Creates a new string object from the pointer and the length.

Assuming the SNEW result does not escape, the EQ could be replaced
by a 2-byte comparison against the original string contents. This
leaves the SNEW dead and avoids the creation of the temp. string.
But in case the temp. string must be created anyway, it's usually
cheaper to compare the string pointers.

Alas, general escape analysis is not trivial. There are a few
shortcuts though, like checking whether the SNEW is unused when
the EQ is encoded (machine code generation is backwards) and
whether there are no instructions in between that use it. But quite
often the temp. string _does_ escape through one of the guard
exits. This requires sinking of the SNEW, which opens up another
can of worms. Oh well ... later. :-)

BTW: Python has s.startswith/s.endswith to avoid the temp. string.
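
Until the compiler does this on its own, the temp. string can be
avoided by hand at the Lua source level. The following is only a
sketch: starts_with and ends_with are hypothetical helper names, not
part of the Lua standard library or of LuaJIT. They compare bytes in
place with string.byte instead of building a substring with str:sub.
Whether this is actually faster depends on the implementation, so
measure before relying on it.

```lua
local byte = string.byte

-- Hypothetical helper: true if s begins with prefix.
-- Compares byte by byte, so no substring is created.
local function starts_with(s, prefix)
  local n = #prefix
  if #s < n then return false end
  for i = 1, n do
    if byte(s, i) ~= byte(prefix, i) then return false end
  end
  return true
end

-- Hypothetical helper: true if s ends with suffix.
local function ends_with(s, suffix)
  local n = #suffix
  local ofs = #s - n          -- index just before the candidate suffix
  if ofs < 0 then return false end
  for i = 1, n do
    if byte(s, ofs + i) ~= byte(suffix, i) then return false end
  end
  return true
end

-- Replaces: if str:sub(1, 2) == "__" then ... end
local str = "__index"
if starts_with(str, "__") then
  print("metamethod-style name: " .. str)
end
```

For a fixed two-character prefix the loop can also be written out
directly, e.g. "local a, b = byte(str, 1, 2)" followed by a comparison
of both values against 95 (the byte value of "_").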

--Mike