Hi Krunal,

It seems that an __index metamethod returning just m.ptr + col * m.rows adds no measurable performance overhead, which is superb! In your benchmarks the only possible overhead (and a small one) comes from checking type(col) == "number" in order to dispatch to other methods.
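
For readers without the Lua source at hand, the address computation quoted above can be sketched in C. This is only an illustration under assumptions: the struct mat_t and the functions mat_col and mat_col_checked are hypothetical names, not taken from Krunal's code, and the layout is assumed to be column-major with 0-based columns, which is what m.ptr + col * m.rows suggests.

```c
#include <stddef.h>

/* Hypothetical column-major matrix view; all names are illustrative only. */
typedef struct {
	double *ptr;	/* first element of the matrix */
	size_t rows;	/* number of elements in one column */
	size_t cols;	/* number of columns */
} mat_t;

/* Unchecked access: mirrors m.ptr + col * m.rows (0-based col). */
static double *mat_col(const mat_t *m, size_t col)
{
	return m->ptr + col * m->rows;
}

/* Checked variant: returns NULL when col is out of range. */
static double *mat_col_checked(const mat_t *m, size_t col)
{
	return col < m->cols ? mat_col(m, col) : NULL;
}
```

With this, mat_col(&m, col)[row] plays the role of m[col][row]. The checked variant adds one comparison per column lookup, which is presumably the kind of cost the "bounds check" timings reflect.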

What is even more interesting, we get close to C (GCC) performance for these matrix assignments; see my rewrite of your benchmark in C.

Given that more time-consuming operations such as determinant, multiplication and inverse will be done with BLAS rather than hand-crafted Lua versions, this makes LuaJIT a perfect candidate for a scientific language.

C-style vec(n) run
secs: 	2.796316

Vec(n) run
secs: 	2.916069
secs: 	3.567791 (bounds check)

C-style mat(nrows, ncols) run
secs: 	2.792748

Mat(nrows, ncols) run
secs: 	2.78517
secs: 	2.89588 (bounds check)

Pure C implementation mat(nrows, ncols) run [see below]
secs:	1.859779 (gcc -O3)
secs:	2.084383 (gcc -O2)
secs:	6.461822 (gcc -O0)
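
To put the timings above in relative terms, a throwaway helper like the following can be used. The function names are made up for this note; the numbers are the ones listed above.

```c
#include <stdio.h>

/* Ratio of a timing to a baseline timing (both in seconds). */
static double slowdown(double secs, double base_secs)
{
	return secs / base_secs;
}

/* Extra cost of secs over base_secs, as a percentage. */
static double overhead_pct(double secs, double base_secs)
{
	return (secs / base_secs - 1.0) * 100.0;
}

/* Prints the ratios for the timings quoted in this mail. */
static void print_summary(void)
{
	printf("Mat vs C -O3:     %.2fx\n", slowdown(2.78517, 1.859779));
	printf("Vec bounds check: +%.1f%%\n", overhead_pct(3.567791, 2.916069));
	printf("Mat bounds check: +%.1f%%\n", overhead_pct(2.89588, 2.78517));
}
```

By these figures the LuaJIT Mat run takes about 1.5 times the gcc -O3 time, and the bounds check costs roughly 22% for Vec and 4% for Mat.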

--- rewrite of bench_matplain into C ---

#include <sys/time.h>
#include <stdio.h>
#include <stdlib.h>

static const double incr = 0.000156;
static const size_t rep = 500;

void bench_matplain(double *x, size_t nrows, size_t ncols)
{
	size_t irep, i, j;
	double z = 0.0;	// must be initialized before accumulating
	for(irep=1; irep<=rep; irep++) {
		// write (1-based loops mirror the Lua version; the row stride is ncols)
		for(i=1; i<=nrows; i++)
			for(j=1; j<=ncols; j++)
				x[(i-1)*ncols + j-1] = incr;
		// read
		for(i=1; i<=nrows; i++)
			for(j=1; j<=ncols; j++)
				z = z + x[(i-1)*ncols + j-1];
	}
}

int main (int argc, char const *argv[])
{
	size_t nrows = 1000, ncols = 1000;
	// heap allocation: 8 MB is too large to rely on as a stack array
	double *x = malloc(nrows * ncols * sizeof *x);
	struct timeval start, end;

	if (x == NULL) {
		fprintf(stderr, "out of memory\n");
		return 1;
	}

	gettimeofday(&start, NULL);
	bench_matplain(x, nrows, ncols);
	gettimeofday(&end, NULL);

	printf("\nC mat(nrows, ncols) run\n");
	// elapsed wall-clock time in seconds
	printf("secs:\t%.12g\n", (double)(end.tv_sec - start.tv_sec) + ((double)end.tv_usec - (double)start.tv_usec) / 1e6);

	free(x);
	return 0;
}
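
One caveat about the C listing: z is never read after the loops, so an optimizing compiler is free to drop the read pass, and the -O2/-O3 figures may or may not include it. A variant that returns the sum avoids that ambiguity. The sketch below is an assumption-laden illustration, not the code that produced the timings: the name bench_matplain_sum is hypothetical, rep and incr are passed as parameters for convenience, and the row stride is ncols, which matches the listing for the square 1000x1000 case.

```c
#include <stddef.h>

/* Same write and read passes as bench_matplain, but the accumulated sum is
   returned so that the read pass has an observable result. */
double bench_matplain_sum(double *x, size_t nrows, size_t ncols,
                          size_t rep, double incr)
{
	size_t irep, i, j;
	double z = 0.0;
	for (irep = 0; irep < rep; irep++) {
		/* write */
		for (i = 0; i < nrows; i++)
			for (j = 0; j < ncols; j++)
				x[i * ncols + j] = incr;
		/* read */
		for (i = 0; i < nrows; i++)
			for (j = 0; j < ncols; j++)
				z += x[i * ncols + j];
	}
	return z;
}
```

Printing the returned value in main, for example with printf("%g\n", z), makes the sum part of the program's output, so the compiler cannot discard the read loop.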

Cheers,
-- 
Adam Strzelecki