Sorry for the HTML letter, there are tables inside. :-)

On Wed, Nov 25, 2009 at 04:02, Mike Pall <mikelu-0911@mike.de> wrote:
> Alexander Gladysh wrote:
>> I've got an impression that LJ2 b2 does not optimize functions with varargs.

> Yes, that's mentioned on the status page.

Sorry, I know I should read the docs more carefully.

>> I'm running this silly benchmark:
>> http://github.com/agladysh/luamarca/blob/master/bench/arguments.lua

>> luajit2
>> -------------------------------------------------------------------
>>                 name |     rel | abs s / iter = us (1e-6 s) / iter
>> -------------------------------------------------------------------
>>      assert_is_alloc |     nan |   0.00 /   10000000 = 0.000000 us
>>         plain_assert |     nan |   0.00 /   10000000 = 0.000000 us
>>            assert_is |     inf |   0.01 /   10000000 = 0.001000 us

> The assertions in these cases can be hoisted, so they take up no
> time in an inner loop. That's exactly what you want.

Indeed.

>>   args_select_simple |     inf |   3.23 /   10000000 = 0.323000 us
>>  args_recursive_simp |     inf |   3.26 /   10000000 = 0.326000 us
>>    args_recursive_ln |     inf |   3.63 /   10000000 = 0.363000 us

> None of these are compiled right now (see above). And even when
> this is added, I'm pretty sure performance would still be rather
> low. I'm not sure that all of these checks will be hoistable and
> I cannot recommend this approach in general.

Perhaps I can rewrite them somehow? See below.

>> It seems that in LT 1 recursion is the fastest way to walk through varargs.
>> In LJ2 it is the slowest way. Would it always be like this or is it just
>> some beta version artifact?

> All of them will be slower than direct checks and recursion would
> probably be the slowest.

I see. Would recursion without varargs be slow as well? (I'm asking this in general, not specifically about arguments().)
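
To make the two vararg-walking styles under discussion concrete, here is a minimal sketch. The function names are hypothetical and are not taken from the benchmark; both simply count how many of their arguments are numbers, one by indexing with select() in a loop and one by peeling off an argument per recursive call.

```lua
-- Hypothetical helpers for illustration; not the benchmark's code.

-- Variant 1: index into the varargs with select() inside a loop.
local function count_numbers_select(...)
  local count = 0
  for i = 1, select("#", ...) do
    -- Extra parentheses keep only the i-th value of select()'s results.
    if type((select(i, ...))) == "number" then
      count = count + 1
    end
  end
  return count
end

-- Variant 2: peel off one argument per recursive call.
-- n is the number of arguments still to inspect, so embedded nils are safe.
local function count_numbers_recursive(n, head, ...)
  if n == 0 then
    return 0
  end
  local rest = count_numbers_recursive(n - 1, ...)
  if type(head) == "number" then
    return rest + 1
  end
  return rest
end

print(count_numbers_select(1, "a", nil, 2))
print(count_numbers_recursive(4, 1, "a", nil, 2))
```

Both variants go through `...` on every step, which is the part that is reported above as not being compiled in this beta.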

>> I want to write fastest possible arguments() function for LJ2.

> Maybe you need to change your goal instead. Either use source-code
> transformation and inline the checks, e.g. turn function(a:number)
> into function(a) + assert_is_number(a, 1).

No, sorry. We're writing in plain Lua (until Metalua is released :-) ). It is hard enough to find an experienced Lua programmer. Finding an experienced programmer for a custom home-grown language is, of course, impossible.

> Or just drop the explicit
> type checks -- that's very un-Lua-ish, anyway. You'll get an error
> on the first use instead. A test suite will catch it either way.

I'm not sure why you call type checks *very* un-Lua-ish...

We're using these checks in "system-level" code, and they do save us some pain, even with test suites. It is much easier to see the problem when a type error is triggered immediately than when it arises from the depths of the implementation in a totally unrelated context. There are "Lua-ish" cases where argument type checks are not needed, but then we just don't call arguments(). :-)

Perhaps we are designing our architecture the wrong way, then?
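
A small sketch of the difference described here, using hypothetical names (set_timeout_unchecked, set_timeout_checked, and the config table are made up for illustration). Without the check, the bad value is stored silently and the error only appears later, where the value is finally used; with the check, the error is raised at the call that passed the wrong type.

```lua
-- Hypothetical functions for illustration only.
local function set_timeout_unchecked(config, seconds)
  config.timeout = seconds -- a bad value is stored without complaint
end

local function set_timeout_checked(config, seconds)
  assert(type(seconds) == "number", "seconds: number expected")
  config.timeout = seconds
end

local config = {}

-- No error at the call site...
set_timeout_unchecked(config, "soon")
-- ...it only surfaces later, in code that merely reads the field.
local ok_late, err_late = pcall(function()
  return config.timeout * 2
end)

-- With the check, the faulty call itself fails.
local ok_early, err_early = pcall(set_timeout_checked, config, "soon")

print(ok_late, err_late)
print(ok_early, err_early)
```

The late error message talks about arithmetic on a string, while the early one names the offending argument.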

>> But, perhaps, something is known at this point. I use vararg a lot when I
>> write generic code, so I want, from the beginning, to write it in the way
>> LJ2 would like it.

> You may want to reconsider this decision. Vararg processing *is*
> expensive and I can't do much about it. Now that everything else
> has gotten so fast with LJ2, this will be even more noticeable.

That is a pity. I have written some useful generic functions (aside from the arguments() family), and it would be a shame to get rid of them. Ah, well, when they show up in the profiler, I'll deal with the specific cases.

But I need a fast arguments(). I've updated my benchmark with two new use cases: a hardcoded check for six arguments (args_hard_simple) and the unroller (args_unroll_simple, see below).

http://github.com/agladysh/luamarca/blob/master/bench/arguments.lua#L168-254

Here are the results:

lua
-------------------------------------------------------------------
                name |     rel | abs s / iter = us (1e-6 s) / iter
-------------------------------------------------------------------
    args_hard_simple |  1.0000 |  11.38 /   10000000 = 1.138000 us
        plain_assert |  1.0255 |  11.67 /   10000000 = 1.167000 us
           assert_is |  1.1090 |  12.62 /   10000000 = 1.262000 us
  args_unroll_simple |  1.4042 |  15.98 /   10000000 = 1.598000 us
  args_select_simple |  1.6380 |  18.64 /   10000000 = 1.864000 us
 args_recursive_simp |  1.7469 |  19.88 /   10000000 = 1.988000 us
   args_recursive_ln |  1.8225 |  20.74 /   10000000 = 2.074000 us
     assert_is_alloc |  2.0905 |  23.79 /   10000000 = 2.379000 us

luajit2
-------------------------------------------------------------------
                name |     rel | abs s / iter = us (1e-6 s) / iter
-------------------------------------------------------------------
        plain_assert |     nan |   0.00 /   10000000 = 0.000000 us
     assert_is_alloc |     nan |   0.00 /   10000000 = 0.000000 us
           assert_is |     nan |   0.00 /   10000000 = 0.000000 us
    args_hard_simple |     nan |   0.00 /   10000000 = 0.000000 us
  args_unroll_simple |     inf |   2.22 /   10000000 = 0.222000 us
  args_select_simple |     inf |   3.25 /   10000000 = 0.325000 us
 args_recursive_simp |     inf |   3.48 /   10000000 = 0.348000 us
   args_recursive_ln |     inf |   3.56 /   10000000 = 0.356000 us

The question is: can LJ2 be made to optimize the function below away? (That's what args_unroll_simple is.)

  local arguments = function(...)
      local n = select("#", ...)
      -- Assuming cache is pre-populated for all possible use-cases
      return assert(arguments_cache[n])(...)
  end
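
For readers without the benchmark at hand, here is one way such a cache could be pre-populated. This is a sketch under assumptions, not the benchmark's actual code: it assumes arguments() receives (type name, value) pairs, and make_checker and check_template are hypothetical names. Each cache entry is a generated function with fixed parameters, so the checks themselves contain no varargs.

```lua
-- Lua 5.1 and LuaJIT provide loadstring; later versions only have load.
local load_chunk = loadstring or load

-- One generated line per (type name, value) pair.
local check_template =
  "  if type(v%d) ~= t%d then error('argument #%d: ' .. t%d .. ' expected') end"

-- Hypothetical helper: build a checker for exactly n arguments (n even),
-- laid out as t1, v1, t2, v2, ...
local function make_checker(n)
  local params, checks = {}, {}
  for i = 1, n / 2 do
    params[#params + 1] = "t" .. i
    params[#params + 1] = "v" .. i
    checks[#checks + 1] = check_template:format(i, i, i, i)
  end
  local source = "return function(" .. table.concat(params, ", ") .. ")\n"
    .. table.concat(checks, "\n") .. "\nend"
  return assert(load_chunk(source))()
end

-- Pre-populate for 0 to 3 pairs (assumed to cover every call site).
local arguments_cache = {}
for n = 0, 6, 2 do
  arguments_cache[n] = make_checker(n)
end

local arguments = function(...)
  local n = select("#", ...)
  return assert(arguments_cache[n])(...)
end

arguments("number", 42, "string", "ok") -- passes silently
print(pcall(arguments, "number", "oops")) -- fails on argument #1
```

The only vararg work left at call time is select("#", ...) and the forwarding call; an argument count missing from the cache trips the assert.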

Thanks,
Alexander.