lua-users home
lua-l archive

Hi, Dylan.

I'm glad you like Lua, by the way.

The issue with vector arithmetic is indeed an interesting one; I have been 
thinking about it off and on for a while.

One possibility, if you are using a machine with vector data types, would 
be to make all Lua numbers into 128-bit quantities, which could be three 
32-bit floats or one 64-bit float; one of the 32-bit slots could be used 
as a tag. This wouldn't work if you required four-element vectors, of 
course, and maybe you do; in that case there would be no slot left for 
the tag, and the type would have to be indicated some other way, I guess.
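
As a rough sketch of the layout described above (not code from Lua 
itself), a 128-bit value could be a union of "three floats plus a tag" 
and "one double plus a tag", with the tag kept in the same 32-bit slot in 
both cases. The names Value128, TAG_DOUBLE, TAG_VEC3 and the helper 
functions are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag values; these names are not part of Lua. */
enum { TAG_DOUBLE = 1, TAG_VEC3 = 2 };

/* One 128-bit slot. In both views the tag occupies the last 32 bits. */
typedef union {
    struct { float x, y, z; uint32_t tag; } vec;          /* 3 x 32-bit + tag */
    struct { double d; uint32_t pad; uint32_t tag; } num; /* 64-bit + tag */
} Value128;

/* Fails to compile if the value does not fit in exactly 128 bits. */
static_assert(sizeof(Value128) == 16, "Value128 must be 128 bits");

Value128 make_number(double d) {
    Value128 v;
    v.num.d = d;
    v.num.pad = 0;
    v.num.tag = TAG_DOUBLE;
    return v;
}

Value128 make_vec3(float x, float y, float z) {
    Value128 v;
    v.vec.x = x;
    v.vec.y = y;
    v.vec.z = z;
    v.vec.tag = TAG_VEC3;
    return v;
}

/* The tag sits at the same offset in both views, so either can be read. */
int is_vec3(Value128 v) {
    return v.vec.tag == TAG_VEC3;
}

/* Element-wise addition of two three-element vectors. */
Value128 vec3_add(Value128 a, Value128 b) {
    assert(is_vec3(a) && is_vec3(b));
    return make_vec3(a.vec.x + b.vec.x, a.vec.y + b.vec.y, a.vec.z + b.vec.z);
}
```

A four-element vector would need all four 32-bit slots, which is why 
this layout leaves no room for the tag in that case.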

Basically, I don't think it is worth worrying about the integer/floating 
point distinction as long as you have some mechanism to store an integer 
of sufficient size. A 32-bit float is probably not adequate, though. Doing 
all arithmetic in SSE registers, for example, is probably not going to 
cost much in performance; interpretive overhead occupies a lot more 
cycles than the difference between integer and floating point 
arithmetic, and indeed on a Pentium it is often actually faster to do 
floating point than integer because of the parallelism possible.
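
To make the point about integer range concrete: a 32-bit float has a 
24-bit significand, so it stops representing every integer exactly at 
2^24, while a 64-bit double holds every 32-bit integer exactly. The 
helper names below are made up for this small check.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if i survives a round trip through a 32-bit float.
   The comparison is done in double, which holds every int32_t exactly. */
int fits_in_float(int32_t i) {
    volatile float f = (float)i;
    return (double)f == (double)i;
}

/* Returns 1 if i survives a round trip through a 64-bit double.
   This holds for every int32_t, since a double has a 53-bit significand. */
int fits_in_double(int32_t i) {
    volatile double d = (double)i;
    return (int32_t)d == i;
}
```

For example, 16777216 (2^24) survives the trip through a float, but 
16777217 does not, whereas both survive the trip through a double.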

The change to Lua would not be huge; the main issue is that there are a 
number of places in the Lua code where it assumes that you can cast Lua 
numbers back and forth to integers without invoking a library routine; in 
particular, the API assumes that C will "deal with" conversions between 
Lua numbers and whatever you put in the argument, so it is quite common to 
find statements like "lua_pushnumber(L, 1);" or "int i = lua_tonumber(L, 
3);" in C extensions, including the standard Lua libraries. Rooting out 
all these instances would be a bit of a pain, but if you only needed to do 
it in the standard library I think I know all the places that need 
changes.
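
The conversion problem can be illustrated without Lua itself. In the 
sketch below, vnum is a hypothetical struct standing in for a 
non-scalar Lua number, and MockState, mock_pushnumber and mock_tonumber 
are stand-ins for the real API, not the real thing. Once the number type 
is a struct, C no longer converts it to or from int implicitly, so every 
call site that relied on that needs an explicit helper.

```c
#include <assert.h>

/* Hypothetical non-scalar number type: a double plus a type tag. */
typedef struct { double d; unsigned pad; unsigned tag; } vnum;

/* Tiny stand-in for a Lua state: a fixed stack of numbers. */
typedef struct { vnum slot[8]; int top; } MockState;

void mock_pushnumber(MockState *L, vnum n) {
    assert(L->top < 8);
    L->slot[L->top++] = n;
}

/* Indices are 1-based, as in the Lua API. */
vnum mock_tonumber(const MockState *L, int idx) {
    assert(idx >= 1 && idx <= L->top);
    return L->slot[idx - 1];
}

/* Explicit conversions that replace the implicit int <-> number casts. */
vnum vnum_from_int(int i) {
    vnum n;
    n.d = (double)i;
    n.pad = 0;
    n.tag = 0;
    return n;
}

int vnum_to_int(vnum n) {
    return (int)n.d;
}

/* With a scalar number type, these two lines would compile:
 *     mock_pushnumber(L, 1);
 *     int i = mock_tonumber(L, 3);
 * With vnum they are type errors and must be written as:
 *     mock_pushnumber(L, vnum_from_int(1));
 *     int i = vnum_to_int(mock_tonumber(L, 3));
 */
```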

The other thing is the Pentium ABI itself, which does not use SSE 
registers for argument passing, and which moreover does not allow internal 
alignment of argument lists (so that a 128-bit quantity would be pushed 
onto a randomly-aligned stack location, and then have to be realigned for 
use inside the function). GCC (and other C compilers) generally have an 
option to optimise stack frame alignment but it can actually pessimise in 
the case of functions with mixed arguments; in particular, functions whose 
first argument is a state pointer. This has an unfortunate effect on 
lua_pushnumber(), for example, in a standard 32-bit pointer/64-bit number 
configuration.

However, I think that *could* be gotten around by some creative use of 
macros.
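
The post does not say which macros are meant, so the following is only 
one guess at the idea: keep the by-value call syntax at the call site, 
but have a macro copy the number into an aligned local and pass its 
address, so that no 128-bit quantity goes through the argument list. All 
names here (vnum, MockState, mock_pushnumber_p, mock_pushnumber) are 
hypothetical.

```c
#include <assert.h>

/* Hypothetical 128-bit number type, aligned for SSE loads and stores. */
typedef struct { _Alignas(16) float v[4]; } vnum;

/* Tiny stand-in for a Lua state: a fixed stack of numbers. */
typedef struct { vnum slot[8]; int top; } MockState;

/* The real function takes a pointer, so only two pointer-sized
   arguments are passed, whatever the alignment of the argument list. */
void mock_pushnumber_p(MockState *L, const vnum *n) {
    assert(L->top < 8);
    L->slot[L->top++] = *n;
}

/* The macro keeps the familiar call syntax. The temporary is an ordinary
   local variable, which the compiler is free to align to 16 bytes. */
#define mock_pushnumber(L, n)              \
    do {                                   \
        vnum tmp_ = (n);                   \
        mock_pushnumber_p((L), &tmp_);     \
    } while (0)
```

A call such as mock_pushnumber(&L, a) then reads the same as before, 
at the price of one extra copy into the temporary.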

The question is, is the effort worthwhile in order to produce 3-element 
vectors? It does not seem to me like a very general solution...

Rici