**Subject**: **Re: Avoiding FFI- allocations + using SSE-vectors**
**From**: Adam Strzelecki <ono@...>
**Date**: Mon, 6 Feb 2012 13:37:51 +0100

> My plan for implementing SIMD operations is this: (…)
Mike, that sounds really great. Thanks for sharing your ideas.
May I ask whether there would be a limit on vector size?
To support 4x4 matrices with GCC or Clang, I can already define 16 x float vectors using:
typedef float vec4 __attribute__((vector_size(16)));
typedef float mat4 __attribute__((vector_size(64)));
vec4 vadd(vec4 a, vec4 b) { return a + b; }
mat4 madd(mat4 a, mat4 b) { return a + b; }
For madd this produces 4 x addps here, whereas vadd compiles to a single addps.
So if it were possible to use 16 x float (512-bit) vectors too and define intrinsics for them, I can imagine defining a full set of 4x4 float matrix operations (multiplication, transposition, inverse), or even quaternions, that would satisfy D3D/OpenGL interoperability using pure LuaJIT.
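To illustrate the kind of matrix operation meant here, below is a minimal sketch in C using the same GCC/Clang vector extension. It is only an assumption of how such an operation could look, not anything LuaJIT provides: the names mat4x4, splat and mat4x4_mul are hypothetical, and the matrix is stored as four vec4 columns (column-major, as OpenGL expects) rather than as a single 64-byte vector.

```c
typedef float vec4 __attribute__((vector_size(16)));

/* Hypothetical type: a column-major 4x4 matrix held as four vec4 columns. */
typedef struct { vec4 col[4]; } mat4x4;

/* Broadcast one scalar into all four lanes. */
static vec4 splat(float s) { return (vec4){s, s, s, s}; }

/* r = a * b. Each result column is a linear combination of the columns
   of a, weighted by the entries of the matching column of b. */
static mat4x4 mat4x4_mul(mat4x4 a, mat4x4 b) {
    mat4x4 r;
    for (int j = 0; j < 4; j++) {
        r.col[j] = a.col[0] * splat(b.col[j][0])
                 + a.col[1] * splat(b.col[j][1])
                 + a.col[2] * splat(b.col[j][2])
                 + a.col[3] * splat(b.col[j][3]);
    }
    return r;
}
```

Written this way, every arithmetic step is a 4-wide multiply or add, so the compiler is free to map it onto SSE instructions. Whether it does so, and with which instruction sequence, depends on the compiler and flags.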
Cheers,
--
Adam Strzelecki