On 24-Sep-06, at 11:29 PM, Glenn Maynard wrote:

On Sun, Sep 24, 2006 at 11:02:47PM -0500, Rici Lake wrote:
Yeah, it's a pain. The best workaround is to use AMD chips instead of
Intel.
(Next time you buy a computer :) ) -- I don't work for AMD or anything,
but it was enough to convince me to change my purchasing preference.

That would be fine, if I only wrote programs to use myself.  :)

Yeah, I know. I was just making a point, I guess.
I'm also not sure if VC has an equivalent to attribute(aligned).


Me neither. I realized I'd forgotten a lot of the details of this problem.

Anyway, here's a test.

Each command runs the same loop at a different stack offset, by declaring a different number of locals first. The slowdown mostly occurs in the for loop itself, so adding to the body of the for was probably unnecessary.

testpent.lua:

/usr/bin/time src/lua -e "for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1; for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1,a2; for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1,a2,a3; for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1,a2,a3,a4; for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1,a2,a3,a4,a5; for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1,a2,a3,a4,a5,a6; for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1,a2,a3,a4,a5,a6,a7; for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1,a2,a3,a4,a5,a6,a7,a8; for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1,a2,a3,a4,a5,a6,a7,a8,a9; for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1,a2,a3,a4,a5,a6,a7,a8,a9,aa; for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1,a2,a3,a4,a5,a6,a7,a8,a9,aa,ab; for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1,a2,a3,a4,a5,a6,a7,a8,a9,aa,ab,ac; for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1,a2,a3,a4,a5,a6,a7,a8,a9,aa,ab,ac,ad; for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1,a2,a3,a4,a5,a6,a7,a8,a9,aa,ab,ac,ad,ae; for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1,a2,a3,a4,a5,a6,a7,a8,a9,aa,ab,ac,ad,ae,af; for i = 1,1e7 do local j = i + 1 end"
/usr/bin/time src/lua -e "local a1,a2,a3,a4,a5,a6,a7,a8,a9,aa,ab,ac,ad,ae,af,b0; for i = 1,1e7 do local j = i + 1 end"


Unmodified lua 5.1.1:

rlake@freeb:~/src/lua-5.1.1$ . testpent.lua
        0.53 real         0.52 user         0.00 sys
        0.26 real         0.25 user         0.00 sys
        0.25 real         0.25 user         0.00 sys
        0.25 real         0.25 user         0.00 sys
        0.25 real         0.25 user         0.00 sys
        0.26 real         0.25 user         0.00 sys
        0.25 real         0.25 user         0.00 sys
        0.25 real         0.24 user         0.01 sys
        0.25 real         0.25 user         0.00 sys
        0.26 real         0.25 user         0.00 sys
        0.25 real         0.25 user         0.00 sys
        0.25 real         0.25 user         0.00 sys
        0.40 real         0.40 user         0.00 sys
        0.53 real         0.53 user         0.00 sys
        0.32 real         0.31 user         0.00 sys
        0.35 real         0.34 user         0.00 sys
        0.52 real         0.51 user         0.00 sys

Edit: add the following member at line 64, inside the Value union (declared in src/lobject.h in Lua 5.1.1):
  char dummy[12];

Retest:
rlake@freeb:~/src/lua-5.1.1$ . testpent.lua
        0.21 real         0.21 user         0.00 sys
        0.22 real         0.21 user         0.00 sys
        0.21 real         0.21 user         0.00 sys
        0.21 real         0.21 user         0.00 sys
        0.21 real         0.21 user         0.00 sys
        0.21 real         0.21 user         0.00 sys
        0.22 real         0.21 user         0.00 sys
        0.21 real         0.19 user         0.02 sys
        0.21 real         0.21 user         0.00 sys
        0.21 real         0.21 user         0.00 sys
        0.22 real         0.21 user         0.00 sys
        0.21 real         0.21 user         0.00 sys
        0.21 real         0.21 user         0.00 sys
        0.21 real         0.21 user         0.00 sys
        0.21 real         0.21 user         0.00 sys
        0.22 real         0.21 user         0.00 sys
        0.21 real         0.21 user         0.00 sys

The alignment configuration governs the alignment of strings and (theoretically) userdata, except that on 32-bit architectures the payload of the Udata type always ends up misaligned. So if you're putting doubles in a userdata on a 32-bit architecture, watch out. Unfortunately, there doesn't seem to be any way of forcing TValues to be aligned aside from declaring lua_Number to be a doubleword-aligned type, and I don't know how to do that on VC.