Re: LuaJIT2 performance for number crunching

Subject: Re: LuaJIT2 performance for number crunching
From: Mike Pall <mikelu-1102@...>
Date: Wed, 23 Feb 2011 00:31:37 +0100

Francesco Abbate wrote:
> I guess that this problem
> can be easily solved by loading the cblas library but I can give more
> help if needed.

Umm, is this the ancient NETLIB cblas code? You realize this is
not tuned at all for modern CPUs? And it's not vectorized, so if
you're using it and think you get a speedup, you're mistaken.
Also, the DLL you provide uses x87 code, not SSE.

Loops over vectors are certainly faster if written in plain Lua
and compiled with LuaJIT (provided the vectors are not too short).

[I can understand the desire to avoid rewriting all of cblas in
Lua, but a daxpy loop seems easy. And BTW: do NOT unroll it by
hand, this is counter-productive on modern CPUs.]
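
A minimal sketch of such a daxpy loop in plain Lua, assuming the
vectors are FFI double arrays with 0-based indexing. The function
name and argument order are illustrative only; this is not the
cblas_daxpy signature (no stride arguments):

```lua
local ffi = require("ffi")

-- y := alpha * x + y over the first n elements.
-- Deliberately a plain loop with no manual unrolling.
local function daxpy(n, alpha, x, y)
  for i = 0, n - 1 do
    y[i] = alpha * x[i] + y[i]
  end
end

local n = 4
local x = ffi.new("double[?]", n)
local y = ffi.new("double[?]", n)
for i = 0, n - 1 do
  x[i] = i + 1   -- x = 1, 2, 3, 4
  y[i] = 10      -- y = 10, 10, 10, 10
end

daxpy(n, 2.0, x, y)   -- y = 12, 14, 16, 18
```

Whether this beats a given BLAS build depends on the vector length
and the library, so it is worth measuring on the actual workload.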

> I've had a look at the trace and it seems that the root of the
> problem is the cblas function that LuaJIT2 doesn't like:
> 
> [TRACE --- rkf45vec-out.lua:78 -- NYI: unsupported C function type at
> rkf45vec-out.lua:83]
> 
> The offending function is cblas_daxpy, but I'm not sure.

My fault. Just released a fix for this to git HEAD. Much faster
now.

BTW: Consider checking all of your code for bad uses of global
variables. E.g.:
  cblas = ffi.load('libgslcblas-0')
should be:
  local cblas = ffi.load('libgslcblas-0')
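
The same pattern, sketched with the default ffi.C namespace so it
runs without any extra library installed. The use of abs() from the
C library and the helper name sum_abs are assumptions made for
illustration; they are not from the original code. A global is
looked up in the environment table on every access, while a local
holding the namespace is not:

```lua
local ffi = require("ffi")

ffi.cdef[[
int abs(int x);
]]

-- Keep the library namespace in a local, not in a global.
local C = ffi.C

-- Sum of |i| for i in [-n, n], calling the C function in the loop.
local function sum_abs(n)
  local s = 0
  for i = -n, n do
    s = s + C.abs(i)
  end
  return s
end

local total = sum_abs(3)   -- 3 + 2 + 1 + 0 + 1 + 2 + 3
```

Note that it is the namespace (C, or cblas above) that is kept in a
local; the functions are still called through it inside the loop.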

--Mike
