**Subject**: **Re: LuaJIT FFI math 2x faster than built-in Lua math**
**From**: Adam Strzelecki <ono@...>
**Date**: Wed, 21 Dec 2011 00:54:06 +0100

> math.sin() calls the x87 fsin instruction, which is known to be
> slow on some CPUs, especially when reducing ranges. ffi.C.sin(),
> as implemented in most x64 math libraries, uses SSE and a faster
> range-reduction algorithm. The call overhead itself is negligible,
> since sin() is an expensive operation.
Thank you for the precise explanation. It is nice to learn that a single FPU instruction can sometimes be slower than a "software" implementation using SSE, although the latter may have lower precision in some situations. I tried that "benchmark" on an SSE-less machine (well, actually via Parallels), and there math.sin was a bit faster than ffi.C.sin (similar to your benchmark).
> Try the same thing without a division and with sqrt() and you'll
> see that math.sqrt() is always faster.
It scored exactly the same here on my machine. It seems both end up calling the FPU's fsqrt.
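
For reference, a comparison along these lines can be sketched roughly as follows. This is not the exact script from earlier in the thread: the iteration count N and the helper time_it are made up for illustration, and the timings will depend on the CPU, the LuaJIT version and the C math library in use.

```lua
-- Rough micro-benchmark sketch: built-in math.* vs. libm called through the FFI.
-- Requires LuaJIT; the ffi module does not exist in plain Lua.
local ffi = require("ffi")

ffi.cdef[[
double sin(double x);
double sqrt(double x);
]]

local N = 1e7  -- hypothetical iteration count, adjust to taste

-- Hypothetical helper: runs one loop, prints elapsed CPU time and the sum.
-- Each loop lives in its own closure so every variant is compiled separately.
local function time_it(label, loop)
  local t0 = os.clock()
  local sum = loop()
  print(string.format("%-12s %6.3f s  sum=%.10g", label, os.clock() - t0, sum))
  return sum
end

local sum_math_sin = time_it("math.sin", function()
  local s = 0
  for i = 1, N do s = s + math.sin(i) end
  return s
end)

local sum_ffi_sin = time_it("ffi.C.sin", function()
  local s = 0
  for i = 1, N do s = s + ffi.C.sin(i) end
  return s
end)

local sum_math_sqrt = time_it("math.sqrt", function()
  local s = 0
  for i = 1, N do s = s + math.sqrt(i) end
  return s
end)

local sum_ffi_sqrt = time_it("ffi.C.sqrt", function()
  local s = 0
  for i = 1, N do s = s + ffi.C.sqrt(i) end
  return s
end)
```

The printed sums are only there to keep the loops from being optimized away and to give a quick sanity check that both variants compute roughly the same thing.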
> IMHO using sin() in benchmarks is the floating-point equivalent of
> Fibonacci benchmarks: totally worthless.
Yeah, I know these are worthless. But I wouldn't have dared to ask about this if the difference were not so noticeable. Now I know it is a tradeoff between the slower fsin and the lower-precision SSE implementation on x64. Anyway, this seems to apply only to the trigonometric functions, as the rest of the GLIBC math functions seem to call the FPU.
Thanks again for the valuable answer,
--
Adam