lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


As LuaBitOp v.s. Lua-5.2.0-work4 bit libraries has been a point of
contention, I thought I'd post a little anecdotal feedback tonight.
LuaBitOp and Lua-5.2.0-work4 bitwise lshift behave differently here:

 bit.lshift(0xffffffff, 32) --> -1 (LuaJIT2) or 0 (lua-5.2.0-work4)

Indeed, the LuaBitOp docs say that only the lower 5 bits in the shift
count are used (i.e. 0-31), and that follows from C and x86 [4].

I came across that when converting some pure Lua code to these native
bit libraries.  The code involves ModuleCompressDeflateLua [1], where
it wraps a bytestream in a bitstream [2].  A simplified version is
given below.  Basically, you push individual bytes onto a queue (four
byte buffer) and pop 0-32 bits at a time.

-----
local buffer = 0
local buffer_nbits = 0

local function push_byte(byte)
 buffer  = buffer + bit.lshift(byte, buffer_nbits)
 buffer_nbits = buffer_nbits + 8
 return byte, buffer, buffer_nbits
end

local function pop_bits(nbits)
 local bits = bit.band(buffer, bit.lshift(1, nbits)-1)
 buffer = bit.rshift(buffer, nbits)
 ----[[pure Lua-5.1 alternative:
 local m = 2^nbits  -- or pow2[nbits]
 local bits = buffer % m
 buffer = (buffer - bits) / m
 --]]
 buffer_nbits = buffer_nbits - nbits
 return bits, buffer, buffer_nbits
end

print(push_byte(0xff))
print(push_byte(0xff))
print(push_byte(0xff))
print(push_byte(0xff))
print(pop_bits(32))

$ lua52-work4 t.lua
255     255     8
255     65535   16
255     16777215        24
255     4294967295      32
4294967295      0       0

$ luajit2-20101011 t.lua
255     255     8
255     65535   16
255     16777215        24
255     -1      32
0       -1      0
-----

The two bit libraries (and the pure Lua alternative) give the
identical final result for nbits < 32, but for nbits == 32 the
LuaBitOp breaks unless handled as a special case.  That's not to say
that it is bad.  You may want Lua to have the same bitwise semantics
as C to simplify conversion of C/native code to Lua (and vice
versa--e.g. LuaToCee).

~~~

Another area of difficulty involves a function that reverses the order
of bits [5].  There is not an easy way to optimize the pure Lua form
with existing native bitwise operations:

  local function msb(bits, nbits)
    local res = 0
    for i=1,nbits do
      local b = bits % 2
      bits = (bits - b) / 2
      res = res * 2 + b
    end
    return res
  end

LuaBitOp does implement a function "bswap" that reverses the order of
bytes, but that only would help partly (e.g. swap bytes with bswap and
then swap bits in each byte via an 8-bit lookup table).

~~~

I also took a brief look at performance.  The largest bottleneck
(~50%) in the pure Lua 5.1 implementation is in the CRC calculation
performing the XOR operation, which is implemented as a variant of
Roberto's XOR [3].  That bottleneck goes away with the native XOR
operation.  Interestingly, it also largely goes away in LuaJIT2 even
when using [3].  However, most of the other bit operations in
ModuleCompressDeflateLua are shifts and modulus, which, although
cleaner with their native operations, tend to be slightly slower when
implemented as function calls (except maybe LuaJIT2).

 local x = 0
 local rshift = bit.rshift
 for i=1,1000000000 do
   x = (x - x % 256) /  256
   --x = rshift(x, 8)
 end

[1] http://lua-users.org/wiki/ModuleCompressDeflateLua
[2] http://github.com/davidm/lua-compress-deflatelua/blob/ce66aea7b76df9fb74a9606dd04dbbb66c483f37/module/lmod/compress/deflatelua.lua#L192
[3] http://lua-users.org/lists/lua-l/2002-09/msg00134.html
[4] http://us.generation-nt.com/answer/left-shift-32-help-10039022.html
[5] http://github.com/davidm/lua-compress-deflatelua/blob/ce66aea7b76df9fb74a9606dd04dbbb66c483f37/module/lmod/compress/deflatelua.lua#L314