lua-users home
lua-l archive



>________________________________
> From: Steven Johnson <steve@xibalbastudios.com>
>To: Leo Romanoff <romixlev@yahoo.com>; Lua mailing list <lua-l@lists.lua.org>
>Sent: 4:42 Sunday, 28 July 2013
>Subject: Re: Question about accessing Lua tables and arrays faster
> 
>
>
>
>
>More results in the meantime:
>>
>>I have created a version of iterations, where I use 
>>
>>  rawset(t, i, rawget(t, i) + 1) 
>>instead of 
>>
>>  t[i] = t[i] + 1 
>>to skip any __index() and __newindex() calls completely.
>>
>>
>>According to my tests, rawget/rawset slow down array operations significantly compared to plain t[i] access on a table without metatables. rawset and rawget are C functions exposed via the standard Lua C library, so each invocation introduces almost the same overhead as my own C-based vector implementation (and the timings I obtained confirm it). Using a simple t[i] = t[i] + 1 seems to be far more efficient, because it happens entirely on the Lua side and does not require any Lua->C function invocations.
>>
>>
>>-Leo
>>
>>
>
>Since you mention "well-known tricks" [1], did you do something like
>
>  local rawget, rawset = rawget, rawset
>
>before the loop? If not, what you perceived about table lookup before will show up in a different way (looking those functions up in the globals or environment, each and every time). 
> For little one-off stuff you might never notice, but over the course of a million loops...


Yes, of course I did that. This trick is very well known.
But it is still slow, for the reasons explained above. rawget and rawset are not implemented directly as bytecodes. They are C functions provided by the built-in library. Each call therefore results in a C library function invocation, with the corresponding overhead of preparing the arguments, pushing them onto the stack, invoking the function, fetching its results, etc. This overhead is quite significant compared to the simple operation those functions perform, so it completely overshadows everything else.
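
For reference, here is a minimal sketch of the kind of comparison I mean. It is not my original test code: the sizes N and REPS and the helper names (new_array, bump_indexed, bump_raw, time_it) are made up for illustration, and the timings will vary with the machine and the Lua version.

```lua
-- Illustrative micro-benchmark; N, REPS and all helper names are assumptions.
local rawget, rawset = rawget, rawset  -- cache the globals in locals
local clock = os.clock

local N, REPS = 1000, 1000

-- Build an array of n zeros.
local function new_array(n)
  local t = {}
  for i = 1, n do t[i] = 0 end
  return t
end

-- Plain indexing: handled by table get/set bytecodes, no function call.
local function bump_indexed(t, n)
  for i = 1, n do
    t[i] = t[i] + 1
  end
end

-- rawget/rawset: two calls into C library functions per element.
local function bump_raw(t, n)
  for i = 1, n do
    rawset(t, i, rawget(t, i) + 1)
  end
end

-- Run f(t, n) reps times and return the elapsed CPU time in seconds.
local function time_it(f, t, n, reps)
  local start = clock()
  for _ = 1, reps do f(t, n) end
  return clock() - start
end

local a, b = new_array(N), new_array(N)
local t_indexed = time_it(bump_indexed, a, N, REPS)
local t_raw = time_it(bump_raw, b, N, REPS)

print(string.format("t[i] = t[i] + 1:  %.3f s", t_indexed))
print(string.format("rawget/rawset:    %.3f s", t_raw))
```

Both loops leave every element equal to REPS, so the only difference being timed is how each element is read and written.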

BTW, I also measured LuaJIT in the meantime. I used the same code as a pure Lua version. No special tricks were applied to the code.
Results are:
1) LuaJIT with -O3 is only 1.8-2 times slower than a pure C version compiled with "gcc -O3". This optimized C version was the fastest.
2) LuaJIT with "-joff -O0" (i.e. no JIT, no optimizations, only interpreter) is only 19 times slower than the optimized pure C version. 

This means that the LuaJIT jitted and interpreted versions take second and third place overall, way ahead of the next competitor.

-Leo