**Subject**: **LuaJIT nested loops performance difference**
**From**: Krunal Rao <krunal.rao78@...>
**Date**: Mon, 2 May 2011 15:30:03 +0100

I was benchmarking the following code:
-- n: number of total samples, here 1e8
-- vn: number of samples drawn at the same time in vsample_rng
-- note that vn*batches == n
-- Draw one sample at a time
local function sample_rng(rng)
  z = 0.0
  for i=1,n do
    z = z + rng:sample()
  end
  print("mean:", z/n)
end
-- Draw multiple samples at a time
local function vsample_rng(rng)
  z = 0.0
  local batches = n/vn
  assert(batches*vn == n)
  local v = alg.Vec(vn) -- A vector of size vn
  for i=1,batches do
    rng:vsample(v)
    for j=1,vn do
      z = z + v[j]
    end
  end
  print("mean:", z/n)
end
The reason for having both rng:sample() and rng:vsample(x), where x is
a vector, is that some random variates allow for more efficient
algorithms when multiple samples are required. Moreover, for
multidimensional random variates (which may be correlated), drawing
multiple values at once is required.
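To illustrate the multidimensional case, here is a minimal sketch (not
the library's actual code) of a vsample that fills a 2-element vector
with a correlated pair of standard normals. The names corr_rng, rho and
std_normal are hypothetical, and a plain Lua table stands in for
alg.Vec:

```lua
-- Hypothetical example: correlated bivariate standard normal.
-- A plain Lua table stands in for alg.Vec.
local corr_rng = { rho = 0.5 }  -- rho: assumed correlation coefficient

-- Draw one standard normal via the Box-Muller transform.
local function std_normal()
  local u1 = 1 - math.random()  -- in (0, 1], avoids log(0)
  local u2 = math.random()
  return math.sqrt(-2 * math.log(u1)) * math.cos(2 * math.pi * u2)
end

-- Fill x[1], x[2] with a correlated pair. Both components are
-- generated together because x[2] reuses the draw behind x[1].
function corr_rng:vsample(x)
  local e1, e2 = std_normal(), std_normal()
  x[1] = e1
  x[2] = self.rho * e1 + math.sqrt(1 - self.rho ^ 2) * e2
end
```

A single-value sample() could not express this, since the second
component depends on the first.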
For this specific rng however, vsample is defined as:
function rng_idx:vsample(x)
  for i=1,#x do
    x[i] = sample(self)
  end
end
As the same total number of samples is drawn in the two benchmark
functions, one would expect them to have similar execution speed.
However, when vn is small this is not the case: for vn=10 vsample_rng
takes twice as long as sample_rng to draw the same number of samples,
for vn=3 it takes three times as long, and for vn=2 it takes 200 times
as long.
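For anyone wanting to try the comparison without the original library,
a self-contained sketch might look like the following. It makes several
assumptions: math.random stands in for the real rng, a plain Lua table
stands in for alg.Vec, n and vn are passed as arguments, z is local,
and timeit is a hypothetical helper built on os.clock. Whether it shows
the same slowdown will depend on the LuaJIT version and on the real rng.

```lua
-- Stand-in rng (assumption): the real rng and alg.Vec are replaced by
-- math.random and a plain Lua table so the sketch is self-contained.
local rng = {}
function rng:sample() return math.random() end
function rng:vsample(x)
  for i = 1, #x do x[i] = self:sample() end
end

-- One sample at a time; returns the mean.
local function sample_rng(rng, n)
  local z = 0.0
  for i = 1, n do z = z + rng:sample() end
  return z / n
end

-- vn samples at a time; returns the mean.
local function vsample_rng(rng, n, vn)
  local z = 0.0
  local batches = n / vn
  assert(batches * vn == n)
  local v = {}
  for j = 1, vn do v[j] = 0 end  -- preallocate so that #v == vn
  for i = 1, batches do
    rng:vsample(v)
    for j = 1, vn do z = z + v[j] end
  end
  return z / n
end

-- Time a call with os.clock (CPU time in seconds).
-- Returns the elapsed time followed by the function's result.
local function timeit(f, ...)
  local t0 = os.clock()
  local result = f(...)
  return os.clock() - t0, result
end

local n = 1e6  -- smaller than the 1e8 in the post, to keep the run short
print("sample_rng", timeit(sample_rng, rng, n))
for _, vn in ipairs{10, 2} do
  print("vsample_rng vn=" .. vn, timeit(vsample_rng, rng, n, vn))
end
```

Each line of output shows the elapsed CPU time followed by the mean.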
Is there any reason for this behavior or any way to fix it?
Thank you!
KR