- Subject: Re: Does pulling out locals from loops improve performance?
- From: Patrick Donnelly <batrick@...>
- Date: Sun, 18 Sep 2011 19:07:50 -0400
On Sun, Sep 18, 2011 at 10:21 AM, Gavin Wraith <gavin@wra1th.plus.com> wrote:
> In message <4E75EDC0.10400@interfree.it> you wrote:
>
>> I still don't understand what's wrong in my reasoning, since the
>> implementation clearly proves me wrong as I showed in my first post.
>>
>> -- Lorenzo
>
> Consider these three snippets:
>
> A is
>
> do
> local e
> for i = 1,10 do
> e = i*i
> print(e)
> end -- for
> end -- do
>
> This compiles (5.2beta) to:
>
> 1 [3] LOADNIL 0 0
> 2 [4] LOADK 1 -1 ; 1
> 3 [4] LOADK 2 -2 ; 10
> 4 [4] LOADK 3 -1 ; 1
> 5 [4] FORPREP 1 4 ; to 10
> 6 [5] MUL 0 4 4
> 7 [6] GETTABUP 5 0 -3 ; _ENV "print"
> 8 [6] MOVE 6 0
> 9 [6] CALL 5 2 1
> 10 [4] FORLOOP 1 -5 ; to 6
> 11 [8] RETURN 0 1
>
> B is
>
> do
> for i = 1,10 do
> local e = i*i
> print(e)
> end -- for
> end -- do
>
> This compiles (5.2beta) to:
>
> 1 [3] LOADK 0 -1 ; 1
> 2 [3] LOADK 1 -2 ; 10
> 3 [3] LOADK 2 -1 ; 1
> 4 [3] FORPREP 0 4 ; to 9
> 5 [4] MUL 4 3 3
> 6 [5] GETTABUP 5 0 -3 ; _ENV "print"
> 7 [5] MOVE 6 4
> 8 [5] CALL 5 2 1
> 9 [3] FORLOOP 0 -5 ; to 5
> 10 [7] RETURN 0 1
>
> C is
>
> do
> for i = 1,10 do
> local e
> e = i*i
> print(e)
> end -- for
> end -- do
>
> This compiles (5.2beta) to:
>
> 1 [3] LOADK 0 -1 ; 1
> 2 [3] LOADK 1 -2 ; 10
> 3 [3] LOADK 2 -1 ; 1
> 4 [3] FORPREP 0 5 ; to 10
> 5 [4] LOADNIL 4 0
> 6 [5] MUL 4 3 3
> 7 [6] GETTABUP 5 0 -3 ; _ENV "print"
> 8 [6] MOVE 6 4
> 9 [6] CALL 5 2 1
> 10 [3] FORLOOP 0 -6 ; to 5
> 11 [8] RETURN 0 1
>
> Evidently B uses one fewer instruction. In A the extra
> instruction (LOADNIL) occurs outside the loop, in C within it.
> So the ranking ( < means "better") appears to be B < A < C.
> The key point, I think, is whether the local declaration is part
> of an assignment.
This is a good summary. Another point that has been neglected is that
Lua recycles a function's stack slots when a local goes out of scope.
So if you have:

  do
    local a = 1
    -- do something with a
  end
  do
    local a = 2
    -- do something with a
  end
  -- repeat the do <block> end above hundreds of times

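In the above, you'd be ok with only 1 stack slot used (try it!). If
you try that without the do <block> end, every repeated 'a' local takes
its own slot, and since stock Lua allows at most 200 active local
variables per function, a few hundred of them won't even compile.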
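
For anyone who wants to try it, here is a minimal sketch of the experiment. It generates the repeated declarations as source text and asks Lua to compile them, once wrapped in do ... end blocks and once flat. The helper name build_chunk and the count of 300 are made up for illustration, and it assumes a stock build with the usual limit on local variables per function.

```lua
-- loadstring exists in Lua 5.1 and LuaJIT; newer versions use load for strings.
local compile = loadstring or load

-- Hypothetical helper: build a chunk with n declarations of a local named 'a'.
-- With scoped = true, each declaration sits in its own do ... end block.
local function build_chunk(n, scoped)
  local parts = {}
  for i = 1, n do
    local decl = "local a = " .. i
    if scoped then
      decl = "do " .. decl .. " end"
    end
    parts[#parts + 1] = decl
  end
  return table.concat(parts, "\n")
end

-- Scoped: each 'a' goes out of scope at its 'end', so its slot is reused.
local scoped_fn, scoped_err = compile(build_chunk(300, true))

-- Flat: every 'a' stays in scope until the end of the chunk, so each
-- declaration needs a fresh slot and the compiler eventually gives up.
local flat_fn, flat_err = compile(build_chunk(300, false))

print("scoped compiles:", scoped_fn ~= nil)
print("flat compiles:  ", flat_fn ~= nil, flat_err)
```

The exact wording of the error for the flat version differs between Lua versions, so the sketch only prints whatever the compiler reports.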
So, always put local declarations in the innermost block. It saves
stack space and is often faster. It just so happens it also helps with
readability :).
--
- Patrick Donnelly