On Fri, Apr 6, 2012 at 11:10, Alexander Gladysh <agladysh@gmail.com> wrote:
> On Fri, Apr 6, 2012 at 08:37, Alexander Gladysh <agladysh@gmail.com> wrote:
>> On Thu, Apr 5, 2012 at 01:54, Alexander Gladysh <agladysh@gmail.com> wrote:
>
>>> While trying to optimize my Lua serialization library, luatexts[1],
>>> I've stumbled upon this strange crash in LJ2:
>>>
>>> ./luajit: /usr/local/share/lua/5.1/lua-nucleo/tdeepequals.lua:0:
>>> attempt to index a boolean value
>>> stack traceback:
>>>        /usr/local/share/lua/5.1/lua-nucleo/tdeepequals.lua: in function 'tdeepequals'
>>>        /usr/local/share/lua/5.1/lua-nucleo/ensure.lua:318: in function
>>> 'ensure_returns'
>>>        test/test.lua:2238: in main chunk
>>>        [C]: ?
>>>
>>> Note strangely missing line info. This happens during generative test
>>> suite when I'm trying to load mutated data — so all kinds of bad
>>> things may happen.
>>
>> Here is another kind of crash:
>>
>> luajit2: /usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua:192:
>> attempt to get length of local 'keys1' (a function value)
>> stack traceback:
>>        /usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua:192: in function 'tmore'
>>        /usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua:186: in function 'tmore'
>>        /usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua:186: in function 'tmore'
>>        /usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua:186: in function 'tmore'
>>        /usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua:207: in function 'tmore'
>>        /usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua:203: in function 'tmore'
>>        /usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua:107: in function
>> </usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua:104>
>>        [C]: in function 'table_sort'
>>        /usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua:199: in function 'tmore'
>>        /usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua:203: in function 'tmore'
>>        /usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua:203: in function 'tmore'
>>        /usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua:227: in function
>> 'tdeepequals'
>>        /usr/local/share/lua/5.1//lua-nucleo/ensure.lua:318: in function
>> 'ensure_returns'
>>        test/test.lua:2238: in main chunk
>>        [C]: ?
>
> Ugh, that's a nasty one...
>
> I was able to reproduce this crash outside of my mutation data set, so
> it is less likely that it is some kind of memory corruption on my
> side. Since the crash is intermittent, it is also less likely that my
> code breaks Lua state somehow. (But, of course, all that is still
> possible.)
>
> Wrapping code in xpcall or adding some additional output seems to
> prevent this bug from appearing...
>
> Mike, any advice?

I prepared a dataset to reproduce the crash:

https://github.com/agladysh/luatexts/tree/ag/intermittent-crashes/test/crash

Example output:

$ ./reproduce.sh
Fri Apr  6 13:01:18 MSK 2012 REPRODUCE BEGIN
Fri Apr  6 13:02:12 MSK 2012 ERROR 1 BEGIN (iteration 6)
Fri Apr  6 13:02:12 MSK 2012 strace:
open("data/00004514.luatexts", O_RDONLY|O_LARGEFILE) = 3
Fri Apr  6 13:02:12 MSK 2012 stderr:
./luajit: /usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua:0:
attempt to index a boolean value
stack traceback:
	/usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua: in function 'tdeepequals'
	/usr/local/share/lua/5.1//lua-nucleo/ensure.lua:316: in function
'ensure_returns'
	../../etc/replay.lua:79: in main chunk
	[C]: ?
Fri Apr  6 13:02:12 MSK 2012 ERROR 1 END
Fri Apr  6 13:04:44 MSK 2012 ERROR 2 BEGIN (iteration 23)
Fri Apr  6 13:04:44 MSK 2012 strace:
open("data/00000463.luatexts", O_RDONLY|O_LARGEFILE) = 3
Fri Apr  6 13:04:44 MSK 2012 stderr:
./luajit: /usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua:0:
attempt to index a boolean value
stack traceback:
	/usr/local/share/lua/5.1//lua-nucleo/tdeepequals.lua: in function 'tdeepequals'
	/usr/local/share/lua/5.1//lua-nucleo/ensure.lua:316: in function
'ensure_returns'
	../../etc/replay.lua:79: in main chunk
	[C]: ?
Fri Apr  6 13:04:44 MSK 2012 ERROR 2 END

(and so on)

This:

open("data/00000463.luatexts", O_RDONLY|O_LARGEFILE) = 3

is the name of the file that was being processed when the crash occurred.
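
For illustration, the last data file opened before a crash can be pulled out of a saved trace with standard tools. This is only a sketch: `last_opened_file` is a hypothetical helper, not part of the repository, and it assumes the strace output was saved to a text file containing lines of the form shown above.

```shell
# Hypothetical helper (not from the luatexts repository).
# Prints the path of the last data file opened according to a saved strace
# log. Assumes lines like: open("data/NAME.luatexts", O_RDONLY|...) = 3
last_opened_file() {
  # Keep only the open("data/...") prefix of each matching line, take the
  # last match, then strip the leading open(" and the trailing quote.
  grep -o 'open("data/[^"]*"' "$1" | tail -n 1 | sed 's/^open("//; s/"$//'
}
```

Called as `last_opened_file strace.log` (the log file name is an assumption), it prints a path such as `data/00000463.luatexts`.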

Each iteration starts from the first file. Crashes happen randomly and
not on every invocation (more often than not a run does not crash), but
usually around files in the range from 300 to 400.
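
Since the crash is intermittent, the only way to catch it is to rerun the same pass many times and record the failures. A loop of that kind might look like the sketch below. It is not the actual reproduce.sh from the repository: the function name `repeat_and_log` and its arguments are assumptions made for illustration, and only the log line format is borrowed from the output above.

```shell
# Sketch only, not the actual reproduce.sh.
# Usage: repeat_and_log MAX_ITERATIONS COMMAND [ARGS...]
# Runs COMMAND up to MAX_ITERATIONS times and logs every failing iteration
# together with a timestamp and whatever the command wrote to stderr.
repeat_and_log() {
  max=$1; shift
  i=1
  failures=0
  while [ "$i" -le "$max" ]; do
    # Capture stderr only; the command's stdout is discarded.
    if ! err=$("$@" 2>&1 >/dev/null); then
      failures=$((failures + 1))
      echo "$(date) ERROR $failures BEGIN (iteration $i)"
      echo "$(date) stderr:"
      printf '%s\n' "$err"
      echo "$(date) ERROR $failures END"
    fi
    i=$((i + 1))
  done
}
```

For example, `repeat_and_log 50 ./one-pass.sh` would run a hypothetical `one-pass.sh` (one full pass over the data set, starting from the first file) fifty times and print a block for each pass that exits with a non-zero status.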

To reproduce (on i386):

# install strace
# install luarocks
sudo luarocks install luafilesystem
git clone git://github.com/agladysh/luatexts.git
cd luatexts
git checkout ag/intermittent-crashes
cd test/crash
./reproduce.sh

I do not know what to make of this... Next I will look for
uninitialized variables in my C code, but I doubt that they are the
cause.

Any help welcome.

Thanks,
Alexander.