Thanks for the tip! I'll definitely have a look at BLAS. I'm using D3DX for most math operations now, which is supposed to be optimised for MMX, SSE, etc.

Our applications work on Windows 2000 with DirectX 9 (the latest DX9 SDK doesn't run on Windows 2000, but DX9 itself does). So theoretically someone with a PC from 2000 and a 900 MHz AMD CPU (not sure if I'm naming a CPU without SSE here) would be able to run our applications.

The thing I don't understand is why it would be a bad thing to let DX switch the (almost deprecated anyhow ;) FPU to single precision, which is exactly what it 'wants'. It only means compiling Lua to use floats; why would that be a bad thing? :)
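
One trade-off that usually comes up with the "compile Lua with floats" idea (in Lua 5.1 the number type is selected through LUA_NUMBER in luaconf.h) is that a single-precision float has a 24-bit significand, so integers stored in Lua numbers are only exact up to 2^24. Below is a minimal sketch in plain C that illustrates just that limit; the helper names are hypothetical and it does not touch Lua or DirectX at all.

```c
/* Hypothetical helpers, for illustration only. */

/* Returns 1 if n survives a round trip through a single-precision float. */
int survives_float(long n) {
    volatile float f = (float)n;   /* volatile forces a real 32-bit store */
    return (long)f == n;
}

/* Returns 1 if n survives a round trip through a double. */
int survives_double(long n) {
    volatile double d = (double)n; /* 53-bit significand */
    return (long)d == n;
}
```

With these, 16777216 (2^24) survives the float round trip but 16777217 does not, while both survive as doubles. Whether that matters depends on how large the counters, indices or timestamps kept in Lua numbers can get.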

   Thanks again!
                   Hugo

----- Original Message ----- From: "SevenThunders" <mattcbro@earthlink.net>
To: <lua@bazar2.conectiva.com.br>
Sent: Thursday, March 16, 2006 10:54 PM
Subject: Re: ANN: LuaJIT 1.1.0 / Re: Lua x DirectX



Well, we've had SSE2 instructions since the Pentium 4 was released in Nov. of
2000.  All of AMD's recent releases support it (though perhaps the FPU is
faster on AMD's offerings, I don't know). Will your graphics application even
run on these older processors?

I have to admit that I said the FPU was deprecated with tongue in cheek.
That is Intel's intention in their documentation. Compiler support for SSE
seems somewhat spotty, especially when optimization comes into play.  I've
noticed that enabling the SSE extensions using Microsoft's supposedly
optimizing compiler rarely improves performance for pure double-precision
floating point operations. Hand-coded libraries, such as those for numerical
linear algebra (e.g. BLAS), have made good use of SSE however.  Also, once you
start using single precision, SSE may actually be worth it. Again, in SSE you
can do twice as many FLOPS per clock cycle with floats as with doubles (four
packed floats per 128-bit register instead of two doubles), not just load
twice as many floating point words. So the bottom line is, if you are
concerned with performance and you are going to use single precision, try to
use SSE if possible.  This will avoid the need to switch the FPU to single
precision.
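
To make the "twice as many FLOPS" point concrete, here is a small sketch using the SSE/SSE2 compiler intrinsics from <emmintrin.h>. It assumes an x86 target with SSE2 enabled; the function names are made up for illustration. One packed single-precision add processes four floats, while the double-precision equivalent processes two doubles.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the SSE (float) ones */

/* Hypothetical helper: adds four floats with one packed SSE add. */
void add4_float(const float *a, const float *b, float *out) {
    __m128 va = _mm_loadu_ps(a);            /* unaligned load of 4 floats */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb)); /* 4 additions at once */
}

/* Hypothetical helper: adds two doubles with one packed SSE2 add. */
void add2_double(const double *a, const double *b, double *out) {
    __m128d va = _mm_loadu_pd(a);           /* unaligned load of 2 doubles */
    __m128d vb = _mm_loadu_pd(b);
    _mm_storeu_pd(out, _mm_add_pd(va, vb)); /* 2 additions at once */
}
```

Both functions work on a single 128-bit register, so the float version does twice the arithmetic per instruction. Neither touches the x87 FPU control word, which is why this route sidesteps the precision switch.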
--
View this message in context: http://www.nabble.com/ANN%3A-LuaJIT-1.1.0-t1273815.html#a3445696
Sent from the Lua - General forum at Nabble.com.