- Subject: Re: LuaJIT and SSE2
- From: KHMan <keinhong@...>
- Date: Fri, 18 Jun 2010 16:34:57 +0800
Petri Häkkinen wrote:
> On 18.6.2010, at 4.06, Peter Harris wrote:
>> I disabled the check for SSE2 on my SSE machine (Pentium-3), and
>> LuaJIT seemed to run fine in my (extremely limited) test. Depending on
>> just how old Petri's "older AMD setup" is, it may have SSE.
> Yes, I've got SSE. That's very interesting. I'll give it a try!
>> Does LuaJIT use much SSE2 that isn't in SSE? I seem to recall SSE2
>> added mostly integer operations, and a few cache prefetch type
>> instructions, but not so much in the way of floating point.
> If SSE2 is used only minimally, maybe it would still be worthwhile to
> add support for non-SSE2 but SSE-capable platforms, even if this would
> increase the number of potential users just a tiny bit. I could try to
> do the patch myself, but only if the required changes are small. I
> don't know the source code at all, so any pointers on how to start
> would be appreciated.
I think the Intel 64 and IA-32 Architectures Software Developer's
Manual, Volume 1, summarizes the differences best. The chapters on
SSE, SSE2, etc. are pretty clearly written and well worth reading.
SSE's floating point instructions handle single precision only: 'at
best' a vector of four floats in a 128-bit XMM register. SSE2
extended that to double precision: 'at best' a vector of two doubles.
Both instruction sets generally have vector and scalar instruction
equivalents, so actually, I believe it is the double precision
capability of SSE2 that we want. Lua numbers are doubles by default,
and SSE can't do anything about the double precision needs of Lua.
Hard to keep all of it in one's head; there must be about a
thousand instructions in them x86 processors now...
Kein-Hong Man (esq.)
Kuala Lumpur, Malaysia