I have recently embedded Lua in my MUD (Multi-User Dungeon) for writing new game commands and scripts for NPCs. Everything worked excellently until I moved from a single global state to coroutines, using lua_newthread(), lua_resume(), and lua_status() from the C API, and implemented my own sleep function. The sleep function is implemented in Lua, and hence yields from Lua, but it uses a Lua binding to the Linux gettimeofday() function to allow sleep times in milliseconds.
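To make the timing part concrete, here is a minimal sketch of the kind of C helper such a binding might wrap. The names tv_to_ms, now_ms, and sleep_expired are made up for illustration and are not taken from my actual code; the Lua glue is left out so the snippet compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper: convert a struct timeval to whole milliseconds. */
static long long tv_to_ms(const struct timeval *tv) {
    assert(tv != NULL);
    return (long long)tv->tv_sec * 1000LL + (long long)tv->tv_usec / 1000LL;
}

/* Hypothetical helper: current wall-clock time in milliseconds,
   read with gettimeofday(). */
static long long now_ms(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv_to_ms(&tv);
}

/* Hypothetical helper: returns 1 once the sleep deadline has been reached,
   0 while the script should keep yielding. */
static int sleep_expired(long long deadline_ms, long long current_ms) {
    return current_ms >= deadline_ms;
}
```

A Lua-side sleep built on this would compute a deadline as now_ms() plus the requested delay, then yield repeatedly until sleep_expired() reports true.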

I've made a very simple test script which I can execute as a command from within the MUD. The script appears to execute perfectly, the sleep function works as desired, and there are no apparent problems. I am able to execute the script a couple of times per second without incident. However, if I spam this command several times in a single second, some strange things happen. For one thing, the value returned by lua_status() while the script is yielded in my sleep function is sometimes very strange: it appears to be a random number between 1 and 128. Scripts that show this seem to become invalid and will not resume any further, even though they haven't finished executing their chunks. In the worst case, when I spam my test command, I get a crash as follows:

Here is the core file of the MUD in gdb:
#0  0x000000001d2bf8b0 in ?? ()
#1  0x000000000062b861 in resume (L=0x1d2c54f0, ud=0x1d2c9990) at ldo.c:510
#2  0x000000000062a5d8 in luaD_rawrunprotected (L=0x1d2c54f0, f=0x62b710 <resume>, ud=0x1d2c9990) at ldo.c:131
#3  0x000000000062b90e in lua_resume (L=0x1d2c54f0, from=0x0, nargs=0) at ldo.c:530
#4  0x0000000000614747 in lua_coroutine::resume (this=0x1d2c7930) at lua_coroutine.c:84
#5  0x000000000061f0ab in lua_updater::update (this=0x1d260d20) at lua_update.c:38
#6  0x000000000061f385 in lua_update () at lua_update.c:87
#7  0x000000000060f868 in update_handler () at update.c:2775
#8  0x00000000004d4e4b in game_loop () at comm.c:580
#9  0x00000000004d3ea2 in main (argc=2, argv=0x7fff3ac42538) at comm.c:252
(gdb) frame 1
#1  0x000000000062b861 in resume (L=0x1d2c54f0, ud=0x1d2c9990) at ldo.c:510
510             n = (*ci->u.c.k)(L);  /* call continuation */
(gdb) print *ci->u.c.k
$1 = {int (lua_State *)} 0x1d2bf8b0
(gdb) print (*ci->u.c.k)(L)
You can't do that without a process to debug.
(gdb) print (L)
$2 = (lua_State *) 0x1d2c54f0
(gdb) print n
$3 = 0
(gdb) list
505           if (ci->u.c.k != NULL) {  /* does it have a continuation? */
506             int n;
507             ci->u.c.status = LUA_YIELD;  /* 'default' status */
508             ci->callstatus |= CIST_YIELDED;
509             lua_unlock(L);
510             n = (*ci->u.c.k)(L);  /* call continuation */
511             lua_lock(L);
512             api_checknelems(L, n);
513             firstArg = L->top - n;  /* yield results come from continuation */
514           }
(gdb) print ci
$4 = (CallInfo *) 0x1d2c5570
(gdb) print ci.u
$5 = {l = {base = 0x4014000000000001, savedpc = 0x1d2bf8b0}, c = {ctx = 1, k = 0x1d2bf8b0, old_errfunc = 68, old_allowhook = 0 '\000', status = 1 '\001'}}
(gdb) print ci.u.c
$6 = {ctx = 1, k = 0x1d2bf8b0, old_errfunc = 68, old_allowhook = 0 '\000', status = 1 '\001'}
(gdb) print ci.u.c.k
$7 = (lua_CFunction) 0x1d2bf8b0

The lua_updater class ensures that the status of the coroutine is LUA_YIELD before calling resume. Also, lua_updater::update() is called approximately every 250ms, so it's possible for me to spawn several coroutines before the first call to the update function. Scripts are executed immediately upon spawn, and yielded coroutines are then resumed on each update.
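The update logic described above can be modelled roughly as follows. This is a simplified sketch with no Lua linked in: struct task, task_resume, and updater_tick are hypothetical stand-ins, the enum values are not the real Lua constants, and it assumes that the sleep loop simply yields again when resumed before its deadline.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for what lua_status() would report; not the real Lua constants. */
enum task_status { TASK_DONE = 0, TASK_YIELDED = 1 };

/* Hypothetical model of one spawned script. */
struct task {
    enum task_status status;  /* status checked before every resume */
    long long wake_ms;        /* deadline the sleep loop is waiting for */
};

/* Stand-in for resuming the coroutine: if the deadline has not passed,
   the sleep loop yields again; otherwise the script runs to completion. */
static void task_resume(struct task *t, long long now_ms) {
    assert(t->status == TASK_YIELDED);
    if (now_ms < t->wake_ms)
        return;               /* still sleeping, stays yielded */
    t->status = TASK_DONE;
}

/* One updater pass (roughly every 250ms): resume only tasks whose status
   is yielded. Returns how many tasks were resumed. */
static int updater_tick(struct task *tasks, size_t n, long long now_ms) {
    int resumed = 0;
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].status != TASK_YIELDED)
            continue;         /* finished scripts are never resumed */
        task_resume(&tasks[i], now_ms);
        resumed++;
    }
    return resumed;
}
```

In this model several tasks can be spawned and left yielded before the first updater pass, which matches the situation in which the problem shows up.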

This bug occurs for me on both 5.1.4 and 5.2.1. I am on a 64-bit system using the POSIX build option. Does anyone have any clue as to what is happening here?