lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Hi, list!

Continuing my micro-benchmarking series.

Today I'm benchmarking iterating a table with ipairs vs. numeric for:

local do_loop_ipairs = function(t)
  for i, v in ipairs(t) do
  end
end

local do_loop_numfor = function(t)
  do_nothing() -- Compensating for ipairs() call
  for i = 1, #t do
    if v == nil then
      break
    end
  end
end

As ipairs stops on the first nil, numeric for loop body has to check if value is nil to be functionally equivalent.

I've measured iteration over a table with N = 5, 25 and 50 elements, nil at N + 1 place and five non-nil elements on the end. To get more comparable results, table of five elements is iterated five times, 25 -- twice, 50 -- once. Missing function calls are padded with calls to the empty do_nothing() function.

As usual, benchmarked code is attached. If anyone is interested in reproducing my experiment, I'll post the full suite here (it not much changed from the last time though).

Here are the results (with while loop benchmarked as an added bonus):

lua
-------------------------------------------------------------------
                name |     rel | abs s / iter = us (1e-6 s) / iter
-------------------------------------------------------------------
       loop_while_50 |  1.0000 |  28.16 /   20000000 = 1.408000 us
      loop_numfor_50 |  1.0543 |  29.69 /   20000000 = 1.484500 us
       loop_while_25 |  1.1942 |  33.63 /   20000000 = 1.681500 us
      loop_numfor_25 |  1.2972 |  36.53 /   20000000 = 1.826500 us
        loop_while_5 |  2.5490 |  71.78 /   20000000 = 3.589000 us
       loop_numfor_5 |  3.4599 |  97.43 /   20000000 = 4.871500 us
      loop_ipairs_50 |  7.5465 | 212.51 /   20000000 = 10.625500 us
      loop_ipairs_25 |  7.8991 | 222.44 /   20000000 = 11.122000 us
       loop_ipairs_5 |  9.9187 | 279.31 /   20000000 = 13.965500 us
luajit -O
-------------------------------------------------------------------
                name |     rel | abs s / iter = us (1e-6 s) / iter
-------------------------------------------------------------------
       loop_while_50 |  1.0000 |   4.47 /   20000000 = 0.223500 us
      loop_numfor_50 |  1.0313 |   4.61 /   20000000 = 0.230500 us
       loop_while_25 |  1.1611 |   5.19 /   20000000 = 0.259500 us
      loop_numfor_25 |  1.2528 |   5.60 /   20000000 = 0.280000 us
        loop_while_5 |  3.0112 |  13.46 /   20000000 = 0.673000 us
       loop_numfor_5 |  3.1857 |  14.24 /   20000000 = 0.712000 us
      loop_ipairs_25 |  3.3378 |  14.92 /   20000000 = 0.746000 us
      loop_ipairs_50 |  3.3535 |  14.99 /   20000000 = 0.749500 us
       loop_ipairs_5 |  3.8680 |  17.29 /   20000000 = 0.864500 us

We can see that though do_loop_ipairs() is almost twice shorter in bytecode than do_loop_numfor(), it is  7.5 times slower. I think this is due to the hidden iterator function call in the generic for loop.

While if one thinks about this issue, it seems quite logical, I've somehow always considered using generic for with ipairs() to be magically faster than numeric for loop.

Yet another case to review tight spots in my code.

Bytecode dumps (as per luac -l):

-- do_loop_ipairs():
function <nloopbench_simple.lua:27,30> (7 instructions, 28 bytes at 0x101190)
1 param, 7 slots, 1 upvalue, 6 locals, 0 constants, 0 functions
        1       [28]    GETUPVAL        1 0     ; ipairs
        2       [28]    MOVE            2 0
        3       [28]    CALL            1 2 4
        4       [28]    JMP             0       ; to 5
        5       [28]    TFORLOOP        1 2
        6       [28]    JMP             -2      ; to 5
        7       [30]    RETURN          0 1

-- do_loop_numfor():
function <nloopbench_simple.lua:32,39> (12 instructions, 48 bytes at 0x100ec0)
1 param, 6 slots, 1 upvalue, 5 locals, 3 constants, 0 functions
        1       [33]    GETUPVAL        1 0     ; do_nothing
        2       [33]    CALL            1 1 1
        3       [34]    LOADK           1 -1    ; 1
        4       [34]    LEN             2 0
        5       [34]    LOADK           3 -1    ; 1
        6       [34]    FORPREP         1 4     ; to 11
        7       [35]    GETGLOBAL       5 -2    ; v
        8       [35]    EQ              0 5 -3  ; - nil
        9       [35]    JMP             1       ; to 11
        10      [36]    JMP             1       ; to 12
        11      [34]    FORLOOP         1 -5    ; to 7
        12      [39]    RETURN          0 1

Alexander.

Attachment: nloopbench_simple.lua
Description: Binary data