- Subject: Re: Light userdata
- From: RLake@...
- Date: Thu, 26 Jun 2003 08:57:47 -0500
I'm glad you like Lua, by the way.
The issue with vector arithmetic is indeed an interesting one; I have been
thinking about it off and on for a while.
One possibility, if you are using a machine with vector data types, would
be to make all Lua numbers into 128-bit quantities, which could be three
32-bit floats or one 64-bit float; one of the 32-bit slots could be used
as a tag. This wouldn't work if you required four-element vectors, of
course, and maybe you do; there would have to be some other indication, I
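
As a minimal sketch of the 128-bit layout described above: the type name vnum, the tag values and the helper functions are all hypothetical, none of this is in Lua, and it assumes an 8-byte double and a 4-byte float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tags; slot[3] says what the other slots hold. */
enum { VNUM_SCALAR = 0, VNUM_VEC3 = 1 };

/* Hypothetical 128-bit number: four 32-bit slots, the last one is the tag. */
typedef struct {
    uint32_t slot[4];
} vnum;

static_assert(sizeof(vnum) == 16, "vnum must be exactly 128 bits");

/* Store one 64-bit float in slots 0-1. */
vnum vnum_scalar(double d) {
    vnum v;
    memset(&v, 0, sizeof v);
    memcpy(&v.slot[0], &d, sizeof d);
    v.slot[3] = VNUM_SCALAR;
    return v;
}

/* Store three 32-bit floats in slots 0-2. */
vnum vnum_vec3(float x, float y, float z) {
    vnum v;
    float f[3] = { x, y, z };
    memcpy(&v.slot[0], f, sizeof f);
    v.slot[3] = VNUM_VEC3;
    return v;
}

/* Read back the 64-bit float; only meaningful when the tag is VNUM_SCALAR. */
double vnum_get_scalar(const vnum *v) {
    double d;
    memcpy(&d, &v->slot[0], sizeof d);
    return d;
}

/* Read back component i (0..2); only meaningful when the tag is VNUM_VEC3. */
float vnum_get_component(const vnum *v, int i) {
    float f;
    memcpy(&f, &v->slot[i], sizeof f);
    return f;
}
```

With the tag in the fourth slot there is no room left for a fourth vector component, which is the limitation noted above.
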
Basically, I don't think it is worth worrying about the integer/floating
point distinction as long as you have some mechanism to store an integer
of sufficient size. A 32-bit float is probably not adequate though. Doing
all arithmetic in SSE registers, for example, is probably not going to
be much of a performance hit; interpretive overhead occupies a lot
more cycles than the difference between integer and floating point
arithmetic, and indeed on a Pentium it is often actually faster to do
floating point than integer because of the parallelism possible.
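
To make the point about 32-bit floats concrete, here is a small check, assuming IEEE 754 single and double precision. The function names are made up for illustration. A 32-bit float has a 24-bit significand, so integers above 2^24 are not all representable; a 64-bit double holds every 32-bit integer exactly.

```c
#include <stdint.h>

/* Returns 1 if i survives a round trip through a 32-bit float.
   volatile forces the value to be stored at float precision.
   Only call this with |i| well below 2^31, so the cast back is in range. */
int roundtrips_float(int32_t i) {
    volatile float f = (float)i;
    return (int32_t)f == i;
}

/* Returns 1 if i survives a round trip through a 64-bit double. */
int roundtrips_double(int32_t i) {
    volatile double d = (double)i;
    return (int32_t)d == i;
}
```

For example, roundtrips_float(16777217) returns 0, because 2^24 + 1 rounds to 2^24, while roundtrips_double(16777217) returns 1.
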
The change to Lua would not be huge; the main issue is that there are a
number of places in the Lua code where it assumes that you can cast Lua
numbers back and forth to integers without invoking a library routine; in
particular, the API assumes that C will "deal with" conversions between
Lua numbers and whatever you put in the argument, so it is quite common to
find statements like "lua_pushnumber(L, 1);" or "int i = lua_tonumber(L,
3);" in C extensions, including the standard Lua libraries. Rooting out
all these instances would be a bit of a pain, but if you only needed to do
it in the standard library I think I know all the places that need
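
A sketch of why those call sites would need touching, using a stand-in type rather than the real Lua headers: my_number and both helpers are hypothetical, and an 8-byte double is assumed. Once the number type is a struct, C no longer converts to and from int implicitly, so each such statement needs an explicit conversion.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical replacement for the number type: a 128-bit struct. */
typedef struct {
    uint32_t slot[4];
} my_number;

/* With a struct type, "my_number n = 1;" and "int i = n;" do not compile,
   so every site that relied on implicit conversion needs a call like these. */
my_number my_number_from_int(int i) {
    my_number n;
    double d = (double)i;
    memset(&n, 0, sizeof n);
    memcpy(n.slot, &d, sizeof d);   /* scalar kept as a double in slots 0-1 */
    return n;
}

int my_number_to_int(my_number n) {
    double d;
    memcpy(&d, n.slot, sizeof d);
    return (int)d;
}
```
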
The other thing is the Pentium ABI itself, which does not use SSE
registers for argument passing, and which moreover does not allow internal
alignment of argument lists (so that a 128-bit quantity would be pushed
onto a randomly-aligned stack location, and then have to be realigned for
use inside the function.) GCC (and other C compilers) generally have an
option to optimise stack frame alignment but it can actually pessimise in
the case of functions with mixed arguments; in particular, functions whose
first argument is a state pointer. This has an unfortunate effect on
lua_pushnumber(), for example, in a standard 32-bit pointer/64-bit number
However, I think that *could* be gotten around by some creative use of
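
For illustration only, the realignment step mentioned above might look like the following in portable C11. This is not the workaround alluded to, just a picture of the extra copy involved; q128, is_aligned16 and sum_slots are hypothetical names.

```c
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit number type; not part of Lua. */
typedef struct {
    uint32_t slot[4];
} q128;

/* Returns 1 if p is on a 16-byte boundary. */
int is_aligned16(const void *p) {
    return ((uintptr_t)p & 15u) == 0;
}

/* The argument may sit at any address, so copy it into a 16-byte-aligned
   local before doing anything that needs alignment. */
uint32_t sum_slots(const void *arg) {
    alignas(16) q128 local;
    memcpy(&local, arg, sizeof local);
    return local.slot[0] + local.slot[1] + local.slot[2] + local.slot[3];
}
```

The memcpy is the cost in question: it is paid on every call that receives a 128-bit value at an unaligned address.
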
The question is, is the effort worthwhile in order to produce 3-element
vectors? It does not seem to me like a very general solution.....