[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: Question about accessing Lua tables and arrays faster
- From: Leo Romanoff <romixlev@...>
- Date: Fri, 26 Jul 2013 11:42:35 -0700 (PDT)
----- Ursprüngliche Message -----
> Von: Steve Litt <slitt@troubleshooters.com>
> An: "lua-l@lists.lua.org" <lua-l@lists.lua.org>
> CC:
> Gesendet: 18:58 Freitag, 26.Juli 2013
> Betreff: Re: Question about accessing Lua tables and arrays faster
>
> On Fri, 26 Jul 2013 09:29:42 -0700 (PDT)
> Leo Romanoff <romixlev@yahoo.com> wrote:
>
>> ----- Ursprüngliche Message -----
>>
>> > Von: Steve Litt <slitt@troubleshooters.com>
>> > An: lua-l@lists.lua.org
>> > CC: Leo Romanoff <romixlev@yahoo.com>
>> > Gesendet: 17:20 Donnerstag, 25.Juli 2013
>> > Betreff: Re: Question about accessing Lua tables and arrays faster
>> >
>> > On Thu, 25 Jul 2013 08:00:00 -0700 (PDT)
>> > Leo Romanoff <romixlev@yahoo.com> wrote:
>> >> - Many high-performance applications work intensively with
> arrays.
>> >> Currently, Lua uses the same datatype, namely tables, for both
>> >> arrays and maps. Accessing arrays from Lua is not extremely fast
>> >> currently. (Lua interpreter does not use even those
>> >> lua_rawseti/lua_rawgeti functions, IIRC. Only C libs for Lua can
>> >> do it). It would be nice to have a faster way of operating on
>> >> arrays in Lua.
>> >>
>> >> BTW, I did some experiments where I create an array of 1000000
>> >> elements and then perform a 1000 iterations over it, touching
> each
>> >> element and assigning a new value to it or incrementing its
>> >> current value. I implemented the same algorithm in pure Lua, as a
>> >> C library for Lua which uses lua_rawseti/lua_rawgeti, and as a C
>> >> program. The outcome is that pure Lua solution is the slowest, the
>> >> C library for Lua is roughly 1.5 times faster and C version is
>> >> about 14.5 times faster than pure Lua solution.
>> >
>> > Hi Leo,
>> >
>> > Comparing any interpreter to C is certain to yield disappointing
>> > results. A better test would have been to compare pure Lua to Perl
>> > arrays, Python lists, and whatever Ruby uses for arrays (I long
>> > since forgot). If *those* are faster than Lua in the algorithm you
>> > mention, then we have something to talk about.
>>
>> Just out of curiosity I created versions of the same algorithm for
>> Perl and Python as you suggested.
>>
>> The outcome is: They are slightly faster by may be 5%-10% than Lua
>> for this benchmark.
>
> Perl doesn't surprise me a bit. Perl has always been a very fast runtime
> interpreter. Python's a surprise I didn't expect.
> Just to make sure I understand, were your only keys 1 through number of
> elements? No 0, no text keys, just 1-whatever?
Yes. The keys are 1 through the number of elements, i.e. 1 .. 1000000
> Also, did you take steps to minimize your metatables? An apples with
> apples comparison would be minimal or no metatables.
Well I was simply creating a table in a usual way, without explicitly touching metatables.
I.e. local t = {}. Then I initialize it with 0 before my loops, and then I perform iterations.
Should I do something in addition to that to minimize the overhead of metatables?
>I don't know, is
> it possible to set a table's metatable to nil, and if you do so, does
> it affect the speed of your 1000x1000000 incrementer algorithm?
I don't know if there is a need to do it, IMHO I use simple tables and they don't have any metadata.
> Two more things, and the first one is this: I have no doubt that the
> others are faster for going over and incrementing each element. But I'm
> wondering if, for more ambitious usages of elements, you could put the
> element processing in a metatable, and if that would speed things up.
Can you elaborate a bit? What do you mean by putting element processing in a metatable?
I'm not sure I understand what you mean. Do you mean that __index method should do the actual processing, e.g. increment a value?
> Obviously there's no way to write that algorithm in Perl or Python, so
> you can't really A/B test them. But you could A/B test the metatable
> processing method against a brute force, process in a subroutine method.
>
> The second thing is this: Thank goodness the difference is only 5 to
> 10%. There are very few cases where one cares whether it takes 0.11
> seconds instead of 0.10, 1.1 seconds instead of 1.0 seconds, or an hour
> and six minutes rather than an hour. Add to this the fact that there
> are few computer tasks these days where the software is the bottleneck,
> at least for more than 5 seconds at a time, and the decision of
> interpreters is probably one of which is easiest, not which one had the
> runtime speed. After all, if runtime speed were the biggest priority,
> Lua, Python and Perl wouldn't even exist.
Absolutely. If we speak about purely interpreted code, we don't care about performance that much.
Performance becomes important when you start applying your language to CPU intensive tasks (image processing, number crunching, emulating another system, etc).
But in any case, having more efficient arrays would not hurt.
Currently, when I look at the results coming out of profiler, I can see that most of the time is spent by Lua VM accessing tables. My guess is that each statement like "t[i] = t[i] + 1" requires at least two table lookups. In general case it makes sense, but some sort of peephole optimization could help a lot in such cases. This is probably also true for "i=i+1" statements, which are very common in loops. Overall, some sort of a simple CSE (common subexpression elimination) could be a big win, I'd say. But then again, I'm not sure if Lua's VM with its current set of opcodes allows for it at all, because it would probably require internally the ability to reference memory allocated for variables and tables or something like that (e.g. VM could compute the address p of t[i] and then do something like *p = *p +1 or even *p += 1). Well, this was just an idea...
> If you try the "process the element using the metatable" approach,
> please let us know how it works.
I'd be happy to try out and provide feedback once you explain me what "process the element using the metatable" exactly means ;-)
-Leo