[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: LuaJIT performance query
- From: "Vyacheslav Egorov" <mister.aleph@...>
- Date: Sun, 15 Jun 2008 23:08:33 +0700
Mike, your response is insightfull as always. It seems that I need:
(a) to think before saying anything (b) to educate myself a little
before talking about compilers.
Yes, I see now that there is no problem with such caching. Corner
cases can be detected either by static analysis and/or by runtime
checks and deoptimization. One can even handle coroutine.yield by
flushing cached globals.
By the way, is LuaJIT 2.0 capable of outputing correct error messages
for errors occured in optimized code (e.g. "attempt to index global
'x' a nil value")? At first glance it seems that information required
for this is the same as information required for deoptimization or
"cached globals flushing".
Also, Mike, could you please point me to some info about: typed SSA
and store sinking/load hoisting?
I heard about Michael Franz's publications on typed SSA, but now
cannot find any of them online. The same for store sinking/load
hoisting, I at first thought that these optimizations are classical
ones, but a quick look through indexes of Muchnick's and Allen's books
revealed nothing.
--
e.v.e
On Sun, Jun 15, 2008 at 9:38 PM, Mike Pall <mikelu-0806@mike.de> wrote:
> It's possible if the complex bytecode instructions are broken up
> into smaller operations. The combination of store-to-load
> forwarding (in linear control flow and across loop iterations) and
> store sinking can remove almost all redundant aggregate access.
>
> This is the same effect that traditional static compilers can
> achieve with scalar replacement of aggregates. And since alias
> analysis is much easier in Lua (no pointers) even a simplistic
> disambiguation algorithm can be far more effective.
>
> Metatables and other dynamisms pose no problem either. A direct or
> indirect metatable access is just a special load or store to the
> underlying object. If the alias analysis doesn't find a conflicting
> store inside the loop (setmetatable to the same object), it can
> safely hoist all metatable loads out of the loop. Even if it finds
> a store, it can often just forward it to the load and sink the
> store. Ditto for function environments.
>
> Dynamic code loading can introduce new closures, but the identity
> of all closures is checked anyway. If a check later fails, the
> trace will be extended with a side branch to handle the new
> closures.
>
> E.g. here's the SSA IR generated by LuaJIT 2.x for the uncached
> math.sin loop:
>
> 0004 I Knum KNUM 1
> 0005 I Knum KNUM 1000
> 0006 >P num SLOAD #2 i
> 0007 I Kstr KSTR "math"
> 0008 I Kfun KFUNC 0x8072c88
> 0009 >I fun SLOAD #0 (current closure)
> 0010 >I fun EQ 0009 0008
> 0011 I tab FLOAD 0008 cl_env getfenv()
> 0012 >I Kstr HREF 0011 0007
> 0013 >I tab HLOAD 0012 math
> 0014 I Kstr KSTR "sin"
> 0015 >I Kstr HREF 0013 0014
> 0016 >I fun HLOAD 0015 math.sin
> 0017 I Kfun KFUNC 0x80714a0
> 0018 >I fun EQ 0016 0017 check for math.sin closure identity
> 0019 num SIN 0006
> 0020 >P num SLOAD #1 s
> 0021 num ADD 0019 0020
> 0022 num ADD 0006 0004
> 0023 > num LE 0022 0005
> 0024 num PHI 0006 0022 i
> 0025 num PHI 0020 0021 s
>
> 1st column: IR instruction number (implicit SSA ref)
> 2nd column: I: invariant, P: left PHI operand, >: guard.
> 3rd column: K: constant, num/str/fun/tab/...: typed IR
> 4th column: IR operation
> 5th/6th column: IR operands (SSA refs or literals)
>
> As you can see almost all operations are invariant and will be
> hoisted out of the loop. Scalar stores are implicit and not
> represented in the IR because all of them can be sunk. The only
> remaining operations inside the loop are SIN, two ADDs and a LE
> comparison. The only guard left inside the loop is the loop exit
> condition. This is the minimum number of operations -- a C compiler
> couldn't generate any better code.
>
> [ And for the specialists: no, there's no metatable lookup for the
> function environment or for the math table because the type checks
> implied by the loads suffice. If a table value is not nil, neither
> an __index nor a __newindex metamethod can ever be called. ]
>
> --Mike
>