On 2/16/2011 5:38 PM, Miles Bader wrote:
> KHMan writes:
>> rk4-unroll.lua with LuaJIT2: 0m4.391
>> C code with GSL lib (opt -O2): 0m20.094
>>
>> Sorry to barge in, it is a worrying difference. One thing is bugging me:
>> Is the C code running SSE2? IIRC gcc -O2 does not normally enable SSE2.
>
> It will enable it by default when it thinks it's possible.

But this is not standard behaviour. I haven't heard of such a change in gcc's default behaviour in the past few release notes. I would be much obliged if you could correct me on this.

I checked the gcc 4.5.2 docs; AFAICT the default is still generic code generation, not "-march=native". You still need to enable SSE2 explicitly if the compiler was not built with --with-fpmath=sse or something like that.
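
One way to settle the question without digging through the docs is to ask the compiler itself. Below is a minimal sketch, assuming gcc (or a compatible compiler) on x86, which predefines __SSE2__ when SSE2 instructions are allowed and __SSE2_MATH__ when scalar floating-point math is actually compiled to SSE2 rather than x87. The function names are made up for illustration and are not part of the benchmark code:

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if the compiler was allowed to emit
   SSE2 instructions for this translation unit, 0 otherwise. */
int sse2_enabled(void)
{
#ifdef __SSE2__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: returns 1 if scalar floating-point math is being
   compiled to SSE2 instead of x87 (gcc defines __SSE2_MATH__ then). */
int sse2_math_enabled(void)
{
#ifdef __SSE2_MATH__
    return 1;
#else
    return 0;
#endif
}

/* Prints what the current set of compiler flags resulted in. */
void report_fp_codegen(void)
{
    printf("SSE2 instructions allowed: %s\n", sse2_enabled() ? "yes" : "no");
    printf("SSE2 used for FP math:     %s\n", sse2_math_enabled() ? "yes" : "no");
}
```

Building this with the same flags as the benchmark (plain -O2 first, then for comparison -O2 -msse2 -mfpmath=sse or -O2 -march=native) and calling report_fp_codegen() from main should show whether the two builds really differ on a given machine.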

> It's always a good idea to use "-march=native" if you're looking for the
> best performance...

And those "software automatic tuning" guys from the high-performance computing side will say that there is a lot more to do in order to wring the last bit of speed from a C compiler with a ton of options... :-)

--
Cheers,
Kein-Hong Man (esq.)
Kuala Lumpur, Malaysia