lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]



On 05/01/2015 20:14, Lorenzo Donati wrote:

On 05/01/2015 17:31, Roberto Ierusalimschy wrote:
The numbering is related to different test suites, one for each
different module in my "patchwork" library.

So, each number is a completely independent run of Lua?

No. There is just one interpreter process running for the whole time. It
is just a master script calling other scripts (each one containing a
test suite for the corresponding module) using "dofile" and collecting
the info reported by the individual scripts.


Then, there is an something interesting: The "distribution of slowness"
is far from smooth.  Several tests are extremely slower in 5.3 than in
5.1/5.2 (by orders of magnitude), and most tests run in around the same
time, but there is not much in between (say, 3~7 times slower).  What do
these tests have in common, different from the others?
[snip]
Moreover, memory is not always related to time. Some of these tests use
much more memory in 5.3 than in 5.1/5.2, but for many, the difference in
time is much higher than the difference in memory. So, maybe the problem
is not directly related to memory.

I think I isolated the problem, after delving into that 20kB
single-module unit testing framework of mine. It's most probably my
fault, but an interesting interaction of causes.

The ultimate culprit seems to be an inefficient pattern used in a
"match" call deep inside an error handler for xpcall.

That pattern is used to parse the stack trace of errors and extracting
some info from it.

The problem is that my unit tests were designed for Lua 5.1/5.2. Since
some features of Lua 5.1 were removed in 5.3, running the tests in 5.3
caused hard errors inside the deepest routines of the modules and they
were caught in that "xpcall". Since the framework wasn't very well
tested against those errors, they simply were gobbled up and not always
reported to the user. They only added wasted time because the error
handler tried to parse their stack traces. Of course under 5.1. there
were not such errors (e.g. table.unpack being removed from the basic
library) and so no slowdown happened. And since 5.2 was compiled out of
the box with compatibility with 5.1 on, even in 5.2 that slowdown didn't
happen (!!!).

I'm not 100% sure my analysis is completely accurate (too messy code in
that framework - I just cut it down with a machete until I reproduced
the problem also in 5.1 and 5.2), but it seems very likely.

I hacked away that framework ~2 years ago (I badly needed a more
consistent way to unit-test my modules, which were about to grow wild)
and I almost never touched its code since last year (it's a 20kB
monolithic, badly documented "monster"). I planned to clean it a bit,
but never found enough time. I knew it would haunt me eventually :-(

I need some more time to test why that pattern was so slow (I still have
to remember exactly what I used it for! Reverse engineering one's code -
that will be fun. A reminder to myself on why one should never leave his
code undocumented for too long! :-)

Sorry for the noise!

Cheers!


I can confirm that the problem is on my side. It wasn't an inefficient pattern, but an infinite recursion in the error handler, as explained below.

A minimal working example is the following:

--------------------------------------------------------------------

function Test( test_suite )
   print "------------- start --------------"

   local ok, err = xpcall(
      function()
         -- this closure is needed so that in the stack trace the
         -- function will be named as in the call below, namely
         -- 'test_suite', which will be searched for by the pattern.
         test_suite()
      end,
      function( e )      -- error handler
         local trace = debug.traceback()
         print( e )
         print( trace )
         local a =  trace:match '.-\n([^\n]-)in function \'test_suite\'\n'
         local b = e:match '.-:%d+:'
         local r = a .. b   --- error if a is nil --> infinite recursion
      end
   )
   print "------------- end --------------"
end


Test( function( )
   assert( false )
end)

--------------------------------------------------------------------

This works ok in 5.1 and 5.2. The pattern used to parse `trace` assumes a specific error string format, which is that of 5.1 and 5.2. In 5.3 the error format changed (you'll see "in metamethod..." instead of "in function..."), so the match fails and `a` gets nil.

Unfortunately the subsequent concatenation to obtain `r` assumes `a` is a string, which is not! The consequent error in the error handler triggers an infinite recursion.

I assume the recursion is stopped by some mechanism inside Lua, since the code recovers after having produced a huge stack trace and does not abort the interpreter, but it wastes lots of time in the recursion and the consequent matching attempts.

Oh well! Interesting debugging session, to say the least! :-)

-- Lorenzo