----- Original Message -----
From: Roberto Ierusalimschy
Date: 8/4/2011 12:00 PM
>> First, on my Core i7 720 laptop, turning off #define MS_ASMTRICK
>> saves a bit of time.
>
> Does the code use the "regular" IEEE trick in that case
> (LUA_IEEE754TRICK)? That used to cause problems with DirectX. Does
> anyone know where this problem stands currently?

No LUA_IEEE754TRICK, although that is very useful.

>> Okay, with those out of the way, I hit lvm.c.  For my particular
>> benchmarks, the VM executes luaV_lessthan() and luaV_lessequal() a
>> lot.  Adopting the Lua 5.1 type equality check helps out.
>
> Do you have any explanation to that? (Do the benchmarks execute
> a lot of luaV_lessthan/luaV_lessequal for numbers or strings?)

I don't have any explanation. I have the assembly code in front of me for both. Lua 5.2's implementation looks more efficient, but in one of my benchmarks, I gained roughly 0.4 seconds with the Lua 5.1 implementation.
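
The post does not show either version of the comparison code. As a rough illustration of the kind of difference being discussed, here is a toy sketch in C of two ways to dispatch a "less than" on tagged values: one compares the two type tags for equality first and then branches on the left tag only, the other tests both operands against each candidate type in turn. That this is what "type equality check" refers to is an assumption; ToyValue, ToyTag, and both function names are hypothetical and are not Lua's actual types or functions.

```c
#include <assert.h>
#include <string.h>

/* Toy tagged value for illustration only; this is NOT Lua's TValue. */
typedef enum { TOY_NUMBER, TOY_STRING, TOY_OTHER } ToyTag;

typedef struct ToyValue {
    ToyTag tag;
    double n;        /* meaningful when tag == TOY_NUMBER */
    const char *s;   /* meaningful when tag == TOY_STRING */
} ToyValue;

/* Both functions return 1 (less), 0 (not less), or -1 (not comparable). */

/* Style A: one tag-equality test up front, then branch on the left tag. */
static int less_tag_equality_first(const ToyValue *l, const ToyValue *r) {
    assert(l != NULL && r != NULL);
    if (l->tag != r->tag) return -1;
    if (l->tag == TOY_NUMBER) return l->n < r->n;
    if (l->tag == TOY_STRING) return strcmp(l->s, r->s) < 0;
    return -1;
}

/* Style B: test both operands against each candidate type in turn. */
static int less_pairwise(const ToyValue *l, const ToyValue *r) {
    assert(l != NULL && r != NULL);
    if (l->tag == TOY_NUMBER && r->tag == TOY_NUMBER) return l->n < r->n;
    if (l->tag == TOY_STRING && r->tag == TOY_STRING)
        return strcmp(l->s, r->s) < 0;
    return -1;
}
```

Both styles give the same answers for these toy values. Which one compiles to the faster code depends on the compiler and on how the tags are encoded, which is consistent with the post reporting a measured difference without an explanation for it.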

>> I then used an Instruction *pc and tried to update ci->u.l.savedpc
>> at key intervals.  I did this quickly.  It might be wrong, but the
>> performance improvement was huge.
>
> How much is "huge"?

Ah, I forgot to post numbers.

Here are some timings I quickly threw together for one of the benchmarks (attached below). It does a prime number calculation. Unlike the original mail, I left the LUA_NANTRICKLE #define on, since I wasn't doing a head to head comparison against Lua 5.1.

Lua 5.2 beta:      28.14 seconds
LUA_IEEE754TRICK:  27.64 seconds (was curious... and I leave it on for the rest of these numbers)
PC #1:             27.23 seconds (this is the Instruction** pc version)
PC #2:             25.89 seconds (this is the Instruction* pc version which may not be correct for everything but is for this benchmark)

"Huge" is subjective, but the Instruction* version bought nearly 2 seconds: 1.75 seconds over the LUA_IEEE754TRICK build and 1.34 seconds over the Instruction** version.

Picking through the assembly shows the reason why. Not only is there one less dereference for access to pc, but the runtime keeps 'pc' loaded in a register. In Lua 5.2 beta, there are a number of additional instructions for storing something back into a memory space and then loading, say, 'L' into a register. While I can't say for certain, it appears as if 'L' remains loaded in a register in the Instruction* version.
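
To make the two variants concrete, here is a minimal sketch in C of a toy interpreter loop written both ways. It is not Lua's lvm.c: ToyFrame, the opcodes, and the function names are hypothetical, and the OP_SYNC opcode stands in for the "key intervals" at which the saved pc has to be current. The first function reaches the program counter through a pointer to the in-memory field on every fetch; the second keeps a local copy and writes it back only when needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Instruction;

/* Toy frame; savedpc plays the role of ci->u.l.savedpc in the post. */
typedef struct ToyFrame {
    const Instruction *savedpc;  /* program counter stored in memory */
    long acc;                    /* accumulator the toy opcodes act on */
} ToyFrame;

enum { OP_ADD = 0, OP_SUB = 1, OP_SYNC = 2, OP_HALT = 3 };

#define OPCODE(i)     ((i) & 0xFu)
#define OPERAND(i)    ((long)((i) >> 4))
#define MAKE(op, arg) (((Instruction)(arg) << 4) | (Instruction)(op))

/* Variant 1: every fetch goes through a pointer to the stored pc,
   so each step reads and writes f->savedpc in memory. */
static long run_through_pointer(ToyFrame *f) {
    assert(f != NULL && f->savedpc != NULL);
    const Instruction **pc = &f->savedpc;
    for (;;) {
        Instruction i = *(*pc)++;
        switch (OPCODE(i)) {
            case OP_ADD:  f->acc += OPERAND(i); break;
            case OP_SUB:  f->acc -= OPERAND(i); break;
            case OP_SYNC: break;            /* stored pc is always current */
            default:      return f->acc;    /* OP_HALT or unknown opcode */
        }
    }
}

/* Variant 2: a local pc that the compiler is free to keep in a register;
   the stored copy is refreshed only at sync points and on exit. */
static long run_local(ToyFrame *f) {
    assert(f != NULL && f->savedpc != NULL);
    const Instruction *pc = f->savedpc;
    for (;;) {
        Instruction i = *pc++;
        switch (OPCODE(i)) {
            case OP_ADD:  f->acc += OPERAND(i); break;
            case OP_SUB:  f->acc -= OPERAND(i); break;
            case OP_SYNC: f->savedpc = pc; break;   /* write back on demand */
            default:      f->savedpc = pc; return f->acc;
        }
    }
}
```

The risk in the second variant is the one the post already flags: any code path that inspects the stored pc between sync points sees a stale value, so the write-backs have to cover every place that can observe it.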


local primes = {}
primes[1] = 2
primes[2] = 3
local nprimes = 2

-- Test n against the primes found so far and record it if none divides it.
local function try( n )
    local i = 1
    while true do
        local prime_i = primes[i]
        if prime_i * prime_i >= n then break end
        if ( n % prime_i ) == 0 then
            -- n has a known prime factor, so it is not recorded.
            return
        end
        i = i + 1
    end
    primes[ nprimes ] = n
    nprimes = nprimes + 1
end

function main()
    for iter=1,100 do

        primes[1] = 2
        primes[2] = 3
        nprimes = 2

        -- Candidates are the numbers of the form 6i - 1 and 6i + 1.
        local i = 1
        while nprimes < 25000 do
            local i6 = i * 6
            try( i6 - 1 )
            try( i6 + 1 )
            i = i + 1
        end

        print('--------->', collectgarbage('count'))

        print('--------->', collectgarbage('count'))
    end
end

main()