- Subject: Apparent bug with lua io library -- may be related to the known stdin/out/err bugs
- From: Luiz Henrique de Figueiredo <lhf@...>
- Date: Thu, 23 Aug 2007 20:16:22 -0300

Zack Weinberg, of the Monotone team, has asked me to post this here.
Please reply with Cc to him at <email@example.com>.

From: "Zack Weinberg" <firstname.lastname@example.org>

Hi, I'm one of the developers of the Monotone version control system
(http://monotone.ca/). We use Lua both in the application itself and in
its test harness. Recently I changed the test harness to make it
parallelizable; this works great except for bizarre intermittent
problems with I/O on some, but not all, Unix-family operating systems.
The symptoms are suspiciously similar to the ones discussed in the
thread starting at
http://lua-users.org/lists/lua-l/2007-04/msg00386.html ("loadfile gets
stdin confused") but I don't think it's exactly the same bug.

The test driver program is written in a mixture of C++ and Lua. The
C++ main() creates a Lua interpreter structure and loads a bunch of
C++ extensions and Lua definitions -- the latter have been embedded
into the C++ executable, and are read in with luaL_loadbuffer(). It
then uses luaL_loadfile() to evaluate a "testsuite definition file",
specified on the command line. This file can define more Lua
functions for the test suite's use; it also tells the driver where to
find a directory containing test cases. Test cases are subdirectories
of that directory containing a Lua script with a particular name.

The driver creates a directory to run the test cases in, and creates
(with io.open()) a "master logfile" in that directory. For each test
case, it creates a subdirectory, and fork()s a child process. The
child process chdir()s into the subdirectory and opens a "per-test
logfile", again with io.open(). It then runs the testcase script,
with loadfile() and xpcall() at the Lua level. When the test case
script completes, the calling function calls f:close() on the per-test
logfile, then io.open()s a "status file" into which it writes one of
several short strings that describe the overall result of the test.
[We can't use the process exit code for this, unfortunately; it
doesn't give us enough bits.] The child process then terminates. The
parent process reads the overall result out of the status file and
writes it to the master log file and to the original stdout.
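
For readers who want the shape of that workflow in one place, here is a
minimal C++ sketch of a single fork-per-test round trip, written against
plain POSIX and stdio rather than the Lua io library. It is not
Monotone's code: run_one_test, the file names "test.log" and "STATUS",
and the callback signature are all hypothetical. The fflush(nullptr)
before fork() and the _exit() in the child are defensive choices of this
sketch only; the description above does not say whether the real driver
does either.

```cpp
// Sketch of one fork-per-test round trip (POSIX only).
// All names here are hypothetical, not taken from tester.cc.
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstdio>
#include <functional>
#include <string>

// Runs `body` in a forked child inside directory `dir` and returns the
// status string the child left in "<dir>/STATUS" ("" on any failure).
std::string run_one_test(const std::string& dir,
                         const std::function<std::string(std::FILE*)>& body) {
    mkdir(dir.c_str(), 0755);   // per-test subdirectory; may already exist

    std::fflush(nullptr);       // assumption: empty the parent's stdio
                                // buffers so the child does not inherit
                                // pending output
    pid_t pid = fork();
    if (pid < 0) return "";

    if (pid == 0) {
        // Child: work only inside the per-test directory.
        if (chdir(dir.c_str()) != 0) _exit(1);

        std::FILE* log = std::fopen("test.log", "w");   // per-test logfile
        if (!log) _exit(1);
        std::string status = body(log);                 // run the test case
        std::fclose(log);                               // explicit close

        std::FILE* st = std::fopen("STATUS", "w");      // status file
        if (!st) _exit(1);
        std::fputs(status.c_str(), st);
        std::fclose(st);

        _exit(0);   // assumption: skip atexit handlers and inherited buffers
    }

    // Parent: wait for the child, then read its result from the status file.
    int wstatus = 0;
    waitpid(pid, &wstatus, 0);

    std::FILE* st = std::fopen((dir + "/STATUS").c_str(), "r");
    if (!st) return "";
    char buf[64] = {0};
    std::string status;
    if (std::fgets(buf, sizeof buf, st)) status = buf;
    std::fclose(st);
    return status;
}
```

A caller would pass a directory name and a callback that writes to the
log and returns a short result string such as "ok"; the parent would then
append the returned string to the master logfile and echo it to stdout.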

The above is how it's *supposed* to work. The bug is that
intermittently (and not on all supported platforms, and of course
*never* under the debugger) chunks of text which were supposed to go
to the per-test logfile either fail to show up anywhere, or show up in
the status file instead.

The child processes never touch stdin/out/err; in fact, I deny any
access to stdin/out/err to all code written in Lua (by removing almost
everything from the io table). The child processes never write to the
master logfile, either. I do not replace stdin/out/err at any point
in the code, nor do I mess with file descriptors 0, 1, or 2.
(Previous incarnations of the code did mess with the file descriptors,
but taking that out did not make the bug go away.) Iostreams are not
used anywhere. The only remaining "dirty trick", and I confess I
don't see how it could be causing the problem here, is that the Lua
interpreter is created and initialized once, in the parent process. I
rely on fork() to clone its state into the children, and I do not
lua_close() the interpreter in the children. The files that the
children write are explicitly closed instead of relying on final GC to
close them.

[Monotone does work on Windows, but of necessity the test suite must
be parallelized rather differently there, and the problem has not been
observed there.]

Any help would be greatly appreciated. If anyone wants to look at the
code, the relevant files are tester.cc, testlib.lua, and
unix/tester-plaf.cc in the current monotone development repository
(alas, I cannot point you at a tarball). I regret not being able to
provide a small self-contained testcase.