
I've been using Lua for quite some time - and it's a hugely impressive language especially in embedded firmware :-)

I have a problem that appears to be some kind of garbage collection (?) issue. The situation is as follows:
  • There is one "Lua Universe" from which multiple running Lua states are produced by calling lua_newthread.
  • I have applied LuaUsers extensions, according to
    • The LuaLock and LuaUnlock functions make use of a global mutex so that only one OS thread can be running Lua at any one time
  • There are about 12 different Lua states, all running in their own OS thread.
  • Each thread running C++ invokes Lua through lua_pcall
  • There are quite a large number of C/C++ extensions added to Lua, and they are called from within each Lua context
    • This is probably just a side detail and not relevant.
  • Every now and then the asserts fail in lvm.c's luaV_execute
    • Typically, base is no longer equal to L->base (but L->ci->base is the same as L->base)
    • It does not appear to be down to buffer overruns (I wrapped base in a protected set of variables to look for changes). If anything, it is L->base that is changing (not base itself).
    • Sometimes, base points to stale memory (I have full MISRA C runtime checking enabled in my environment) - as if it was valid, but is not anymore.
    • Eventually Lua goes stale -
      • Sometimes an OS thread crashes (though I can't prove it's the same issue at the moment)
      • Sometimes a while..forever loop inside a Lua script will go AWOL, and variables will be pointing to the wrong place (as if the stack has changed unexpectedly)
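For reference, the global-mutex arrangement described in the list above could be sketched roughly as follows. This is a minimal sketch assuming POSIX threads; the names global_lua_lock, global_lua_unlock and g_lua_mutex are hypothetical and are not taken from the actual extensions, and the hookup to Lua's lua_lock/lua_unlock macros is shown only in a comment.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical names. In a real build these would be wired in with
 * something like:
 *   #define lua_lock(L)   global_lua_lock()
 *   #define lua_unlock(L) global_lua_unlock()
 * before the Lua core is compiled. */
static pthread_mutex_t g_lua_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Debug aid: only meaningful while the caller holds the mutex,
 * or when no other thread is running. Must only ever be 0 or 1. */
static int g_lock_depth = 0;

/* Block until this OS thread is the only one allowed to run Lua. */
void global_lua_lock(void) {
    pthread_mutex_lock(&g_lua_mutex);
    g_lock_depth++;
}

/* Let another OS thread run Lua. */
void global_lua_unlock(void) {
    g_lock_depth--;
    pthread_mutex_unlock(&g_lua_mutex);
}

/* Report the current depth for assertions in debug builds. */
int global_lua_lock_depth(void) {
    return g_lock_depth;
}
```

With this shape, any place where the Lua core unlocks and then relocks is a window in which another OS thread may run Lua code against the same universe.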
After a couple of days of furious debugging, I have found the following:
  • I have put numerous checks within lvm.c's luaV_execute to narrow down the point where base is modified
  • It appears that the modification occurs either side of dojump's luai_threadyield
    • I modified the dojump macro to check the validity of base and throw an exception if wrong
    • What I can see is that base is fine before luai_threadyield, and has changed after it.
  • As far as luaV_execute is concerned, base must not change (or when L->base is expected to change, then the call is wrapped in Protect)
    • I have occasions where L->base has changed in an OP_TEST opcode (where L->base was definitely not expected to change!)
    • The value of L->base has changed in other opcodes too (not just OP_TEST)
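The kind of check described in the list above (compare L->base before and after the yield) can be modelled outside Lua like this. It is only a sketch: MockState, YieldFn, YieldReport and checked_yield are hypothetical stand-ins, not the real lua_State or the real dojump macro, and the two yield functions exist purely to exercise the check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the one field of lua_State that matters here. */
typedef struct MockState {
    int *base;                          /* plays the role of L->base */
} MockState;

/* Plays the role of luai_threadyield: anything may happen inside. */
typedef void (*YieldFn)(MockState *L);

/* What the check observed across one yield. */
typedef struct YieldReport {
    int *before;                        /* L->base just before the yield */
    int *after;                         /* L->base just after the yield  */
    int  moved;                         /* 1 if the two differ           */
} YieldReport;

/* Snapshot base, yield, then compare. The caller decides whether to
 * raise an error or simply reload its cached copy from r.after. */
YieldReport checked_yield(MockState *L, YieldFn yield) {
    YieldReport r;
    r.before = L->base;
    yield(L);
    r.after = L->base;
    r.moved = (r.before != r.after);
    return r;
}

/* Two example yields: one harmless, one that relocates the frame. */
static int slots_a[4];
static int slots_b[4];

void yield_noop(MockState *L) { (void)L; }
void yield_moving(MockState *L) { L->base = slots_b; }
```

Recording both pointers, rather than just a flag, makes it possible to log how far the frame appears to have moved when the check fires.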
I'm trying to pin down where this "corruption" is occurring, and wondered if it might be down to garbage collection or some other 'shuffling' process. It would appear that other parts of Lua may be considering the lua_State to be 'unlocked' and therefore safe to modify, when in fact it is not (because it is running within the context of luaV_execute).
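If the 'shuffling' turns out to be the stack being moved to a new block of memory (an assumption at this point, not something confirmed above), then a raw pointer such as base would dangle afterwards, while an offset from the start of the stack would survive. The sketch below models that with a hypothetical MockStack; mock_save and mock_restore are similar in spirit to the savestack and restorestack macros in ldo.h, but none of this is real Lua code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a Lua stack: a heap array plus a frame base. */
typedef struct MockStack {
    int   *stack;   /* start of the slot array, like L->stack */
    int   *base;    /* current frame base, like L->base       */
    size_t size;    /* number of slots                        */
} MockStack;

/* Returns 0 on success, -1 on bad arguments or allocation failure. */
int mock_stack_init(MockStack *s, size_t size, size_t base_index) {
    if (base_index > size) return -1;
    s->stack = calloc(size, sizeof *s->stack);
    if (s->stack == NULL) return -1;
    s->size = size;
    s->base = s->stack + base_index;
    return 0;
}

/* Move the stack to a fresh block and fix up base, so every raw
 * pointer taken before this call becomes stale. */
int mock_stack_resize(MockStack *s, size_t new_size) {
    size_t base_index = (size_t)(s->base - s->stack);
    size_t keep = s->size < new_size ? s->size : new_size;
    int *fresh;
    if (base_index > new_size) return -1;
    fresh = calloc(new_size, sizeof *fresh);
    if (fresh == NULL) return -1;
    memcpy(fresh, s->stack, keep * sizeof *fresh);
    free(s->stack);
    s->stack = fresh;
    s->size = new_size;
    s->base = fresh + base_index;
    return 0;
}

/* Convert a slot pointer to an offset that survives a resize. */
ptrdiff_t mock_save(const MockStack *s, const int *p) {
    return p - s->stack;
}

/* Convert a saved offset back to a pointer into the current block. */
int *mock_restore(const MockStack *s, ptrdiff_t off) {
    return s->stack + off;
}
```

Debug code that stores offsets rather than pointers can still say which slot base referred to after a move, which a dangling pointer cannot.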

I am hoping to craft some debug code that can pin down where the change is occurring, but I'd be grateful for any pointers or suggestions otherwise!!!

Thanks :-)

Matt "Matic"