So I'm working on some vector and matrix library code (simple stuff, probably for games), and I was amazed at the code LuaJIT generates, as long as one avoids allocating stuff here and there.

Granted, hand-tuned C/C++ with inlining, blah blah, can maybe beat it, but with lots of hurdles both for the implementor and for the people who use it later. (I was once in hell with heavily templatized C++ code that required the custom gcc compiler our studio supported for PlayStation 2 and shipped games with... duh.)

To make it even more beautiful, I took care not to allocate vec3() objects all over the place, by making my v3 API take a result "register" argument that can be reused. (I did the same in Common Lisp a while back and it worked pretty well. It requires some discipline, but eventually good code gets written.)
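The result-register convention described above can be sketched roughly as follows. This is only an illustration using plain Lua tables: the `v3` table here is a hypothetical stand-in, not the actual v3math library, and `sub_alloc` is an invented name for the allocating variant shown for contrast.

```lua
-- Hypothetical stand-in for a v3-style API (NOT the real v3math library).
-- Plain tables with 0-based components, mirroring the indexing used in the post.
local v3 = {}

function v3.new(x, y, z)
   return { [0] = x or 0, [1] = y or 0, [2] = z or 0 }
end

-- Writes a - b into the caller-supplied register r and returns r.
-- No new vector is created, so a hot loop can reuse one r forever.
function v3.sub(r, a, b)
   r[0], r[1], r[2] = a[0] - b[0], a[1] - b[1], a[2] - b[2]
   return r
end

-- Allocating variant (assumed name), for contrast: one new table per call.
function v3.sub_alloc(a, b)
   return v3.new(a[0] - b[0], a[1] - b[1], a[2] - b[2])
end

local a, b = v3.new(5, 7, 9), v3.new(1, 2, 3)
local r = v3.new()
local out = v3.sub(r, a, b)   -- out is the same object as r
```

Because Lua evaluates every right-hand side of a multiple assignment before storing anything, the register may safely alias one of the inputs, e.g. `v3.sub(a, a, b)`.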

And somehow I prefer function calls over operators, dunno why. It makes it clearer for me when dealing with matrices/vectors.

Now I think I'm ready to port a lot of Quake code to straight Lua, just for fun and pleasure.

Anyway, here is the code

# distanceSquaredToSegment.lua

local v3 = require( "lib/v3math" )
local min, max = math.min, math.max

function v3.madd( r, v1, k, v2 )
   r[0], r[1], r[2] = v1[0]*k + v2[0], v1[1]*k + v2[1], v1[2]*k + v2[2]
   return r
end

local function distance_squared(r, a, b)
   return v3.mag(v3.sub(r, a, b))
end

local function distance_squared_to_segment( r, point, segment_start, segment_dir, segment_length )
   local dir = v3.sub( r, point, segment_start )
   -- clamp the projection onto the segment to the range [0, segment_length]
   local dot = max( 0, min( segment_length, v3.dot( dir, segment_dir ) ) )
   local prj = v3.madd( r, segment_dir, dot, segment_start )
   local dsq = v3.mag( v3.sub( r, point, prj ))
   return dsq
end

local function test()
   local segment_start = v3.new()
   local segment_dir = v3.new()
   local segment_length = 10
   local point = v3.new()
   local sum = 0
   local r = v3.new()
   for i=0,10000 do
      sum = sum + distance_squared_to_segment( r, point, segment_start, segment_dir, segment_length )
   end
end

test()
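For a sanity check of the clamped-projection idea with concrete numbers, here is a standalone sketch. It assumes the segment direction is a unit vector and that the projection is clamped to [0, length]. It uses plain 1-based Lua tables rather than v3math, and `dist_sq_to_segment` is a hypothetical helper name, not part of the library.

```lua
local min, max = math.min, math.max

-- Hypothetical helper (not part of v3math): squared distance from point p
-- to the segment that starts at s, runs along unit direction d, and has
-- length len. Vectors are plain 1-based tables {x, y, z}.
local function dist_sq_to_segment(p, s, d, len)
   -- vector from segment start to the point
   local dx, dy, dz = p[1] - s[1], p[2] - s[2], p[3] - s[3]
   -- projection onto the direction, clamped to the segment's extent
   local t = max(0, min(len, dx*d[1] + dy*d[2] + dz*d[3]))
   -- closest point on the segment
   local cx, cy, cz = s[1] + d[1]*t, s[2] + d[2]*t, s[3] + d[3]*t
   -- squared length of (p - closest)
   local ex, ey, ez = p[1] - cx, p[2] - cy, p[3] - cz
   return ex*ex + ey*ey + ez*ez
end

local s, d = {0, 0, 0}, {1, 0, 0}
local inside = dist_sq_to_segment({5, 3, 0},  s, d, 10) -- closest point (5,0,0)
local before = dist_sq_to_segment({-2, 0, 0}, s, d, 10) -- closest point (0,0,0)
local past   = dist_sq_to_segment({13, 4, 0}, s, d, 10) -- closest point (10,0,0)
```

The three calls cover a point beside the segment, one behind its start, and one beyond its end, which exercises both sides of the clamp.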

And the assembly:

# ./luajit -jdump samples/v3math/distanceSquaredToSegment.lua | tail -n 50

->LOOP:
b910fe6a  movsd xmm14, [rcx+0x8]
b910fe70  subsd xmm14, [rdx+0x8]
b910fe76  movsd xmm12, [rcx+0x10]
b910fe7c  subsd xmm12, [rdx+0x10]
b910fe82  movsd xmm13, [rcx+0x18]
b910fe88  subsd xmm13, [rdx+0x18]
b910fe8e  movsd [rax+0x18], xmm13
b910fe94  movsd [rax+0x10], xmm12
b910fe9a  movsd [rax+0x8], xmm14
b910fea0  movsd xmm15, [rbx+0x8]
b910fea6  mulsd xmm14, xmm15
b910feab  movsd xmm6, [rbx+0x10]
b910feb0  mulsd xmm12, xmm6
b910feb5  addsd xmm12, xmm14
b910feba  movsd xmm14, [rbx+0x18]
b910fec0  mulsd xmm13, xmm14
b910fec5  addsd xmm13, xmm12
b910feca  minsd xmm13, xmm1
b910fecf  maxsd xmm13, xmm0
b910fed4  mulsd xmm15, xmm13
b910fed9  addsd xmm15, [rdx+0x8]
b910fedf  mulsd xmm6, xmm13
b910fee4  addsd xmm6, [rdx+0x10]
b910fee9  mulsd xmm13, xmm14
b910feee  addsd xmm13, [rdx+0x18]
b910fef4  movsd [rax+0x18], xmm13
b910fefa  movsd [rax+0x10], xmm6
b910feff  movsd [rax+0x8], xmm15
b910ff05  movsd xmm14, [rcx+0x8]
b910ff0b  subsd xmm14, xmm15
b910ff10  movsd xmm15, [rcx+0x10]
b910ff16  subsd xmm15, xmm6
b910ff1b  movsd xmm6, [rcx+0x18]
b910ff20  subsd xmm6, xmm13
b910ff25  movsd [rax+0x18], xmm6
b910ff2a  movsd [rax+0x10], xmm15
b910ff30  movsd [rax+0x8], xmm14
b910ff36  mulsd xmm14, xmm14
b910ff3b  mulsd xmm15, xmm15
b910ff40  addsd xmm15, xmm14
b910ff45  mulsd xmm6, xmm6
b910ff49  addsd xmm6, xmm15
b910ff4e  addsd xmm7, xmm6
b910ff52  add edi, +0x01
b910ff55  cmp edi, 0x2710
b910ff5b  jle 0x1b910fe6a       ->LOOP
b910ff61  jmp 0x1b9100028       ->6
---- TRACE 2 stop -> loop

Simply beautiful!

Thank you Mike!

The v3math library is here:
https://raw.github.com/malkia/ufo/master/lib/v3math.lua