- Subject: Re: LuaJIT2 performance for number crunching
- From: Mike Pall <mikelu-1102@...>
- Date: Wed, 23 Feb 2011 11:25:27 +0100
Luis Carvalho wrote:
> function set1 (u, v)
>   for i = 0, u.size - 1 do
>     u.data[i] = v.data[i]
>   end
>   return u
> end
>
> function set (u, v)
>   for i = 0, u.size - 1 do
>     u.data[i * u.stride] = v.data[i * v.stride]
>   end
>   return u
> end
>
> Are set1 and set comparable? (I'm not sure "set" is the best way of
> implementing a strided version...)
Currently the multiply is neither strength-reduced nor narrowed,
so it's kind of costly. If this code is performance-sensitive,
you'll probably want to dispatch to a dynamically specialized
version of the loop, based on the two stride values. Usually these
are powers of two, which enables further optimizations.
--Mike
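
One possible reading of the "dynamically specialized" suggestion above, as a rough sketch rather than the method the reply has in mind: generate one copy of the loop per stride pair, with both strides baked into the source as literal constants, and cache it. The names specialized_set, cache and compile are hypothetical, and plain Lua tables stand in for the 0-based data arrays so the sketch runs without the FFI. Whether the constant multiply actually ends up cheaper on a given LuaJIT build is something to measure, not assume.

```lua
-- Sketch only: plain Lua tables stand in for the 0-based data arrays.
-- loadstring exists in Lua 5.1/LuaJIT, load takes strings in Lua 5.2+.
local compile = loadstring or load

-- Hypothetical cache of specialized loops, keyed by "ustride:vstride".
local cache = {}

local function specialized_set(su, sv)
  local key = su .. ":" .. sv
  local f = cache[key]
  if not f then
    -- Bake both strides into the generated source as literal constants,
    -- so each compiled loop multiplies by a known value.
    local src = string.format([[
      return function(u, v)
        local ud, vd = u.data, v.data
        for i = 0, u.size - 1 do
          ud[i * %d] = vd[i * %d]
        end
        return u
      end]], su, sv)
    f = compile(src)()
    cache[key] = f
  end
  return f
end

-- Same interface as the original set: dispatch on the two stride values.
function set(u, v)
  return specialized_set(u.stride, v.stride)(u, v)
end
```

Each distinct stride pair costs one compilation on first use and a table lookup afterwards, so this only makes sense when the same few stride pairs recur across many calls.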