Roberto Ierusalimschy wrote:
About why we do not use BitOp: I see no reason why (a<< 33) should not
Having a defined behavior for out-of-range shift counts is
desirable. Making it wrap around is the most useful behavior, e.g.
see this line from nsieve, accessing an array of bits:
if band(rshift(p[rshift(i, 5)], i), 1) ~= 0 then ...
This saves an extra band(i, 31) for the user of the library. Similar
examples appear in code that needs to treat 32 like 0 etc. However
there is rarely a need to get zero for out-of-range shift counts.
A simple count&31 inside the implementation is much cheaper (or
free) compared to adding extra branches to return 0 in the
out-of-range case. Also, returning 0 just does not make sense with
arithmetic right shifts.
or why (a<< -1) should not be equal to (a>> 1).
I do not think this is a useful identity to have.
I have been
using shift operators for many years, and I too often miss this behavior
in C. I can undestand why C does not give it, but I cannot undestand why
Lua should not give it.
In all the years I've been writing C and assembler I've never had a
need to mix left and right shifts controlled by the same variable
holding a signed shift count.
Also, considering the readability aspect, what does this do:
bit.bshift(x, -8) ??
Shift right by 8? Shift left by 8? I'm sure I'm not the only one
who had to look it up in the manual.
There is no cost to have two explicitly named functions and it
makes the intent immediately obvious:
bit.lshift(x, 8) vs. bit.rshift(x, 8)
Readibility is important. And having two functions avoids an
unpredictable branch, too. Ditto for rotates.
Advising users to "use arithmetic operators" for an arithmetic shift
is completely missing the point. The point of adding a bit library
is exactly so that users can stop (ab)using math.floor(x / 2^8) and
use bit.arshift(x, 8) instead. Umm, and not wait for the slow
division to complete ...
About the system-dependent semantics, I may be missing something,
but I do not see the problem. This library operates on bits; as long
as the result type is able to represent the 32 bits, is should make
no difference whether they are signed, unsigned, or packed inside a
If it doesn't matter, then it should definitely be signed. Because
Lua requires a signed number type and thankfully does not have the
concept of an unsigned type. This means operations have to behave
in a signed or unsigned way (e.g. string.byte), which is much
easier to deal with.
Also, results of bit operations need to be used as inputs of other
operations and passed on to C APIs (esp. to luaL_checkint()).
Embedders will not be delighted to hear that their Lua code is not
cross-platform safe if they use Lua 5.2's bit operations.
Artificially inflicting the concept of a "sometimes unsigned"
number into Lua seems a very odd choice. The goal should be to
have platform-independent and reliable semantics.
On quite a few platforms only signed conversions between int/FP
numbers have hardware support, whereas unsigned conversions need to
use library routines or workarounds. Adding an unpredictable branch
to the output path doesn't exactly help, either. None of this is
needed when outputs are signed.
Of course, if you do something like "if bit.band(x,y)< 0
...", then you will have system-dependent semantics. But, for me, this
operation already has an arbitrary semantics. What does it mean to ask
whether a bunch of bits is smaller than zero?
I suggest to try to actually write some code using bit operations
in Lua first. The above example is unrealistic.
I think it's mandatory to write some standard usage examples
before designing and implementing an API. Then one would quickly
discover that it's not easily possible to reliably implement
something like MD5 or AES with Lua 5.2's bit operations.
Example expression from MD5:
rol(bxor(d, band(b, bxor(c, d))) + a + x, s) + b
The input to rol() will end up as>= 2^32. This will obviously fail
with Lua 5.2 on some platforms. Modular arithmetic is used a lot in
combination with bit operations, so one really needs to support it.
Ditto for inputs in the range of signed _and_ unsigned 32 bit ints.
That's why Lua BitOp does the input conversion in a specific and
consistent way on all platforms and explicitly defines the semantics
for out-of-range inputs.
Another useful side effect of BitOp's semantics is to allow
compiling bit operations to fast machine code with a Lua compiler
(any existing or future Lua compiler, not just LJ2).
- Inputs to bit operations need to be converted in a consistent and
platform-independent way, respecting modular arithmetic requirements.
- Outputs of bit operations need to be signed.
- Shift and rotate instructions need to be explicitly named.
- Shift counts should be masked by the bit width.
- Arithmetic right shift is not optional.
- Ignoring the naming conventions defined by the most popular
existing libraries is unwise, because it forces all users to
rewrite their code (FYI: BitOp inherited the names from lbitlib).