[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: LuaJIT2 performance for number crunching
- From: Mike Pall <mikelu-1102@...>
- Date: Wed, 16 Feb 2011 17:16:51 +0100
Daurnimator wrote:
> I got bored... See attached for a lua Dormand–Prince integrator...
> Pretty much a verbatim copy of http://adorio-research.org/wordpress/?p=6565
Looks ok, except for this:
[...]
local a21 = 1/5
local a31 = 3/40
local a32 = 9/40
[...]
local function ode_dormandprince ( fxy , x0 , x1 , y0 , tol , hmax , hmin , maxiter )
[...]
while true do
local K1 = fxy ( x , y )
local K2 = fxy ( x + c2*h, y + h*( a21*K1 ) )
local K3 = fxy ( x + c3*h, y + h*( a31*K1 + a32*K2 ) )
[...]
Rule #437: Never hoist out constants (if possible)!
Always write constants inline, otherwise they are treated as
variables, which generates worse code. And, yes, even plain Lua is
able to fold 1/5 into a constant. It's important to write these
unambiguously -- FP arithmetic is not associative!
So use either of these expressions:
y = (1/5)*x -- fast
y = x*(1/5) -- fast
But avoid these expressions:
y = 1*x/5 -- slow!
y = x*1/5 -- slow!
y = x/5 -- slow!
Actually these calculate something different for corner cases,
which may be important, depending on the algorithm. Given the
nature of this algorithm, the fast variant is preferred.
Another thing ... replace:
if delta <= 0.1 then h = h * 0.1
elseif delta >= 4 then h = h * 4.0
else h = delta * h end
if h > hmax then h = hmax end
with:
h = math.min(4.0, math.max(0.1, delta)) * h
hmax = math.max(hmax, h)
But keep the other conditionals, since these are strongly biased
and the uncommon paths are never compiled.
--Mike
- References:
- Re: LuaJIT2 performance for number crunching, Florian Weimer
- Re: LuaJIT2 performance for number crunching, Francesco Abbate
- Re: LuaJIT2 performance for number crunching, Leo Razoumov
- Re: LuaJIT2 performance for number crunching, Francesco Abbate
- Re: LuaJIT2 performance for number crunching, Leo Razoumov
- Re: LuaJIT2 performance for number crunching, Francesco Abbate
- Re: LuaJIT2 performance for number crunching, Leo Razoumov
- Re: LuaJIT2 performance for number crunching, Francesco Abbate
- Re: LuaJIT2 performance for number crunching, T T
- Re: LuaJIT2 performance for number crunching, Daurnimator