> My plan for implementing SIMD operations is this: (…)

Mike, that sounds really great. Thanks for sharing your ideas.

Can I ask whether there would be a limit on vector size?
To support 4x4 matrices, with GCC or Clang I can already define 16 x float vectors using:

typedef float vec4 __attribute__((vector_size(16)));
typedef float mat4 __attribute__((vector_size(64)));
vec4 vadd(vec4 a, vec4 b) { return a + b; }
mat4 madd(mat4 a, mat4 b) { return a + b; }

Here madd compiles to 4 x addps, whereas vadd compiles to a single addps.

So if it were possible to use 16 x float (512-bit) vectors too and define intrinsics for them, I can imagine defining a full set of 4x4 float matrix operations (multiplication, transposition, inverse), or even quaternions, that would satisfy D3D/OpenGL interoperability using pure LuaJIT.
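
To make the idea concrete, here is a rough sketch in C of what two of those matrix operations could look like on top of the same typedefs. It only illustrates the GCC/Clang vector extension, not any LuaJIT API. The helper names (mat4_row, mat4_mul, mat4_transpose) and the row-major layout are my own assumptions, and the code the compiler generates will depend on the target and flags.

```c
/* Sketch only: assumes GCC or Clang vector extensions and a row-major
 * 4x4 matrix stored in one 16-float vector. All helper names below are
 * hypothetical. */
typedef float vec4 __attribute__((vector_size(16)));
typedef float mat4 __attribute__((vector_size(64)));

/* Extract row r (0..3) of m as a 4-float vector. */
static vec4 mat4_row(const mat4 *m, int r) {
    vec4 v = { (*m)[4*r], (*m)[4*r + 1], (*m)[4*r + 2], (*m)[4*r + 3] };
    return v;
}

/* out = a * b. Row i of the result is the sum over k of a[i][k] * row k of b.
 * out must not point to the same matrix as b. */
static void mat4_mul(mat4 *out, const mat4 *a, const mat4 *b) {
    for (int i = 0; i < 4; i++) {
        vec4 acc = { 0.0f, 0.0f, 0.0f, 0.0f };
        for (int k = 0; k < 4; k++) {
            float s = (*a)[4*i + k];
            vec4 splat = { s, s, s, s };      /* broadcast a[i][k] */
            acc += splat * mat4_row(b, k);    /* element-wise multiply-add */
        }
        for (int j = 0; j < 4; j++)
            (*out)[4*i + j] = acc[j];
    }
}

/* out = transpose of m. out must not point to the same matrix as m. */
static void mat4_transpose(mat4 *out, const mat4 *m) {
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            (*out)[4*j + i] = (*m)[4*i + j];
}
```

The matrices are passed by pointer here so that the sketch does not depend on how a 64-byte vector is passed by value on targets without 512-bit registers.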

Cheers,
-- 
Adam Strzelecki