2009/12/2 Matt 'Matic' (Lua) <lua@photon.me.uk>:
> Jerome Vuarand wrote:
>
> 2009/12/1 Javier Guerra <javier@guerrag.com>:
>
>
> On Tue, Dec 1, 2009 at 10:48 AM, Matt 'Matic' (Lua) <lua@photon.me.uk>
> wrote:
>
>
> There are quite a large number of C/C++ extensions added into Lua that are
> called from within each Lua context.
>
> This is probably just a side detail and not relevant.
>
>
> Do any of these extensions touch a lua_State other than the one that
> called it?  Remember that the Lua core calls lua_unlock just before
> executing a C extension, so some other state may be running
> concurrently with your C code; if you modify it, it will be corrupted.
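>
> Roughly, the hazard looks something like this (my own untested sketch,
> names made up, not from any real extension):
>
>     #include <lua.h>
>     #include <lauxlib.h>
>
>     /* This C function is entered with the lock already released, so if it
>      * pokes at a *different* lua_State of the same universe it races with
>      * whatever OS thread is running that state. */
>     static lua_State *other;   /* hypothetical coroutine shared between threads */
>
>     static int bad_notify(lua_State *L) {
>       /* UNSAFE: 'other' may be executing concurrently; its stack can
>        * change under us between these two API calls. */
>       lua_pushstring(other, luaL_checkstring(L, 1));
>       lua_setglobal(other, "last_message");
>       return 0;
>     }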
>
>
> I had similar issues. If several native threads use a given coroutine, Lua
> C API calls from different C extensions may be interleaved, and while the
> interpreter state itself stays safe, the contents of the stack can end up
> inconsistent. Many luaL_ functions become unsafe in that regard since they
> may unlock the state in the middle of their processing.
>
> To solve the problem I decided to expose lua_lock and lua_unlock from
> the Lua API (I have a patch available somewhere), so that I can call
> them around any C code that accesses shared Lua states. This requires
> the lua_lock implementation to be recursive, but that's quite easy to
> do with most threading APIs.
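>
> The recursive-lock part can look something like this (a rough sketch with
> made-up names, not the actual patch; it assumes lua_lock/lua_unlock in
> luaconf.h are pointed at these functions when building the core):
>
>     #include <pthread.h>
>
>     static pthread_mutex_t universe_lock;
>
>     void universe_lock_init(void) {
>       pthread_mutexattr_t attr;
>       pthread_mutexattr_init(&attr);
>       pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
>       pthread_mutex_init(&universe_lock, &attr);
>       pthread_mutexattr_destroy(&attr);
>     }
>
>     void universe_do_lock(void)   { pthread_mutex_lock(&universe_lock); }
>     void universe_do_unlock(void) { pthread_mutex_unlock(&universe_lock); }
>
>     /* in luaconf.h (or a private header), something like:
>      *   #define lua_lock(L)    universe_do_lock()
>      *   #define lua_unlock(L)  universe_do_unlock()
>      */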
>
> Another solution, not involving patching Lua, is to make sure that no
> two native threads use the same Lua thread (coroutine). This is why
> most threading libraries for Lua create at least one Lua coroutine per
> native thread (e.g. LuaThread, LuaProc).
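>
> In sketch form, the per-native-thread coroutine pattern is something like
> this (untested, names made up):
>
>     #include <lua.h>
>     #include <lauxlib.h>
>
>     typedef struct {
>       lua_State *co;   /* this OS thread's private coroutine */
>       int ref;         /* registry reference that keeps 'co' alive */
>     } thread_ctx;
>
>     /* Called once per native thread; with a locked build of Lua the
>      * lua_newthread call itself is serialised by lua_lock. */
>     static thread_ctx thread_ctx_open(lua_State *main) {
>       thread_ctx ctx;
>       ctx.co = lua_newthread(main);
>       ctx.ref = luaL_ref(main, LUA_REGISTRYINDEX);  /* pops the thread */
>       return ctx;
>     }
>
>     static void thread_ctx_close(lua_State *main, thread_ctx *ctx) {
>       luaL_unref(main, LUA_REGISTRYINDEX, ctx->ref);
>       ctx->co = NULL;
>     }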
>
>
>
> As you both have said, it would on the surface appear as though I had two
> threads accessing one lua_State, or some other C/C++ corruption of Lua.
> However, I've added a huge amount of checking code to prove that isn't the
> case (just doubting myself!!). Everything appears correct and coherent.
>
> My view is that the Lua VM makes assumptions about the L->base value.
> Generally, L->base doesn't change, so the VM holds a local copy to avoid
> indirection and speed up the LVM. In some opcodes, the VM knows that L->base
> could change or definitely will, so the call is wrapped in the "Protect"
> macro, which reassigns the local copy of base after completion.
>
> However, it would appear that there are 6 (IIRC) opcodes that call dojump
> outside of the context of "Protect" and assume that L->base is not going to
> change. If you have one Lua "Universe" - one lua_State - per OS thread,
> that is OK.
> Once you add OS-level threading with a single shared Lua "Universe" - even
> if you use lua_newthread and strictly keep each lua_State on its own OS
> thread - that assumption is no longer valid.
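>
> For reference, the stock lvm.c macros are roughly as follows (quoting from
> memory, so check against your own copy of 5.1):
>
>     #define dojump(L,pc,i)  { (pc) += (i); luai_threadyield(); }
>     #define Protect(x)      { L->savedpc = pc; {x;}; base = L->base; }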
>
> Consequently, I have changed the "dojump" macro in lvm.c to now be:
>
>     #define dojump(L,pc,i)  { (pc) += (i); luai_threadyield(); base = L->base; }
>
> Now, I know that some "dojump" calls are also wrapped inside "Protect" and
> therefore the "base = L->base" is going to be duplicated in those cases. Of
> course, my optimising compiler removes the redundancy.
>
> Guess what - the problem has gone away and Lua is not failing its assertions
> anymore (and my Lua code isn't running off the rails)!!
>
>
> Javier - I reckon that adding the extra lock/unlock inside your C routines
> probably masks the issue by greatly reducing its probability, but you may
> well find it's the same one that I have and can be resolved completely with
> the dojump patch.
>
>
> Any thoughts or comments??

Do you have a patch of the modifications available?