On Thu, May 7, 2009 at 4:15 AM, David Manura <dm.lua@math2.org> wrote:
On Wed, May 6, 2009 at 5:14 PM, Alexander Gladysh wrote:
> Today I'm benchmarking iterating a table with ipairs vs. numeric for:...
<...>
>         7       [35]    GETGLOBAL       5 -2    ; v

What's v? <...>

Ugh. Sorry, it seems I had a bad day yesterday. I've fixed the benchmark (see below).

To answer a couple of questions from other replies:

1. Why am I compensating for the ipairs() function call? I wanted to measure the relative speed of the loop construct itself (numeric vs. generic vs. while). That is not quite correct, though, so I've removed the compensating call in the fixed test.

2. Am I comparing two solutions to the same problem? I think so. This benchmark came from a practical task: serializing the numeric part of a table before the hash part (in a loose sense). For example, for arguably aesthetic reasons, this

   { [1] = 1, [2] = 2, [3] = nil, [4] = 4 }

should be serialized as

   { 1, 2, [4] = 4 }

and not as

   { 1, 2, nil, 4 }
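
To make the intended output concrete, here is a minimal sketch of such a serializer. It is not taken from the attached benchmark: serialize_flat is a hypothetical name, it only handles plain values (numbers, strings, booleans) via tostring(), and the sorting of the remaining keys is an assumption added just to make the output deterministic.

```lua
-- Hypothetical helper (not from the attached file): writes the leading
-- sequence 1..n positionally, then every other key in [key] = value form.
local function serialize_flat(t)
  local parts = {}

  -- Array part: consecutive integer keys starting at 1, up to the first nil.
  local i = 1
  while t[i] ~= nil do
    parts[#parts + 1] = tostring(t[i])
    i = i + 1
  end
  local n = i - 1

  -- Collect every key that was not already written positionally.
  local rest = {}
  for k in pairs(t) do
    local in_array = type(k) == "number" and k >= 1 and k <= n and k % 1 == 0
    if not in_array then
      rest[#rest + 1] = k
    end
  end

  -- Assumption: sort by string form so the output order is stable.
  table.sort(rest, function(a, b) return tostring(a) < tostring(b) end)

  for _, k in ipairs(rest) do
    local key = type(k) == "string" and string.format("%q", k) or tostring(k)
    parts[#parts + 1] = "[" .. key .. "] = " .. tostring(t[k])
  end

  if #parts == 0 then
    return "{}"
  end
  return "{ " .. table.concat(parts, ", ") .. " }"
end

print(serialize_flat({ [1] = 1, [2] = 2, [3] = nil, [4] = 4 }))
```

The first loop is where the choice between ipairs, a numeric for with a nil check, and a while loop comes in; the sketch uses a while loop only because it stops at the first hole without needing the table length.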

The (hopefully) fixed benchmark is attached. Results:

lua
-------------------------------------------------------------------
                name |     rel | abs s / iter = us (1e-6 s) / iter
-------------------------------------------------------------------
       loop_while_50 |  1.0000 |  98.10 /   20000000 = 4.905000 us
       loop_while_25 |  1.0145 |  99.52 /   20000000 = 4.976000 us
      loop_numfor_50 |  1.0499 | 103.00 /   20000000 = 5.150000 us
      loop_numfor_25 |  1.1172 | 109.60 /   20000000 = 5.480000 us
        loop_while_5 |  1.1367 | 111.51 /   20000000 = 5.575500 us
       loop_numfor_5 |  1.3889 | 136.25 /   20000000 = 6.812500 us
      loop_ipairs_50 |  2.2424 | 219.98 /   20000000 = 10.999000 us
      loop_ipairs_25 |  2.4672 | 242.03 /   20000000 = 12.101500 us
       loop_ipairs_5 |  2.9218 | 286.63 /   20000000 = 14.331500 us

The actual difference in speed is not that large (x2.25, versus x7.5 on the broken benchmark).
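
For readers checking the table, the columns can be recomputed as in the sketch below. It assumes, based on the numbers rather than on anything stated in the post, that "rel" is each row's absolute time divided by the fastest row's time; the helper names relative and us_per_iter are made up for the illustration.

```lua
-- Timings copied from three rows of the results table.
local ITERATIONS = 20000000
local rows = {
  { name = "loop_while_50",  abs = 98.10 },
  { name = "loop_numfor_50", abs = 103.00 },
  { name = "loop_ipairs_50", abs = 219.98 },
}

-- Assumed meaning of "rel": time relative to the fastest row.
local function relative(abs, fastest)
  return abs / fastest
end

-- Seconds for the whole run, converted to microseconds per iteration.
local function us_per_iter(abs, iterations)
  return abs / iterations * 1e6
end

-- Find the fastest (smallest) absolute time.
local fastest = math.huge
for _, row in ipairs(rows) do
  if row.abs < fastest then
    fastest = row.abs
  end
end

for _, row in ipairs(rows) do
  print(string.format("%16s | %7.4f | %9.6f us",
    row.name, relative(row.abs, fastest), us_per_iter(row.abs, ITERATIONS)))
end
```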

Here are the bytecode dumps (the difference is also a bit less here, 7 vs. 10 instructions):

-- do_loop_ipairs()
function <nloopbench_simple.lua:27,30> (7 instructions, 28 bytes at 0x101190)
1 param, 7 slots, 1 upvalue, 6 locals, 0 constants, 0 functions
        1       [28]    GETUPVAL        1 0     ; ipairs
        2       [28]    MOVE            2 0
        3       [28]    CALL            1 2 4
        4       [28]    JMP             0       ; to 5
        5       [28]    TFORLOOP        1 2
        6       [28]    JMP             -2      ; to 5
        7       [30]    RETURN          0 1

-- do_loop_numfor()
function <nloopbench_simple.lua:32,38> (10 instructions, 40 bytes at 0x100ec0)
1 param, 6 slots, 0 upvalues, 5 locals, 2 constants, 0 functions
        1       [33]    LOADK           1 -1    ; 1
        2       [33]    LEN             2 0
        3       [33]    LOADK           3 -1    ; 1
        4       [33]    FORPREP         1 4     ; to 9
        5       [34]    GETTABLE        5 0 4
        6       [34]    EQ              0 5 -2  ; - nil
        7       [34]    JMP             1       ; to 9
        8       [35]    JMP             1       ; to 10
        9       [33]    FORLOOP         1 -5    ; to 5
        10      [38]    RETURN          0 1
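
For reference, source along the following lines would compile to the two listings above under Lua 5.1. This is a reconstruction from the bytecode, not a copy of the attached file: the parameter and loop variable names are guesses, and the name do_loop_numfor is inferred from the loop_numfor rows in the results table.

```lua
-- ipairs is cached in a local, which is why the first listing
-- fetches it with GETUPVAL rather than GETGLOBAL.
local ipairs = ipairs

-- Generic for: one CALL to ipairs up front, then TFORLOOP per element.
local function do_loop_ipairs(t)
  for i, v in ipairs(t) do
  end
end

-- Numeric for: LEN once, then GETTABLE plus a nil comparison per element,
-- leaving the loop at the first hole.
local function do_loop_numfor(t)
  for i = 1, #t do
    if t[i] == nil then
      break
    end
  end
end

do_loop_ipairs({ 1, 2, 3, 4, 5 })
do_loop_numfor({ 1, 2, 3, 4, 5 })
```

The explicit nil check is what keeps the numeric loop comparable to ipairs, which also stops at the first nil; without it the numeric loop would be three instructions shorter.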

Alexander.

Attachment: nloopbench_simple.lua
Description: Binary data