lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Daurnimator: The segfault (read-acccess violation) happens with more regularity in Visual C++ builds because in debug mode the compiler forcefully sets freed and deallocated memory to a known "nothing is here" pattern like "\xCC" or "\xDD", causing it to crash more reliably. I'm sure if the code was run in a more complicated or long-running environment as I had been, it would likely exhibit undefined behavior or segfault sooner!

Francisco: Thank you for the suggestion. Unfortunately, I don't have that choice. For C code I can make that change, but my example was a stripping down of some C++ code. The thing that calls "co_test_destroy" is really a C++ destructor, and that destructor can't receive extra parameters. So while I know I can fix it if it was just plain C code, it seems like I am more or less stuck in the C++ world.

彭 书呆:

I will most likely use these suggestions to ensure the thread really doesn't die. It's probably the most surefire way to ensure the garbage collector doesn't snatch the thread out from under me. I could also go about forcefully making "L" of my "co_test" point to the main lua_State (using LUA_RIDX_MAINTHREAD and friends), in which case the registry reference will still be plenty valid. I'll have to try both and explore each route.

Thank you all for the suggestions and advice; I was really stuck on trying to figure this out for a good long while.

Sincerely,

ThePhD


On Wed, Sep 13, 2017 at 10:50 AM, 彭 书呆 <nerditation@outlook.com> wrote:
On 09/13/2017 10:02 PM, ThePhD wrote:
> Observe the following complete, compile-able (and mostly C) code, which stores a simple "luaL_ref" wrapper in a struct named "co_test" and links up a simple creation method ("new") and a simple deletion method ("__gc"):
>
> #include <lua.h>
> #include <lualib.h>
> #include <lauxlib.h>
>
> #include <assert.h>
>
> typedef struct co_test {
>     lua_State* L;
>     int ref;
> } co_test;
>
> void co_test_make(co_test* p, lua_State* L, int stack_target) {
>     lua_pushvalue(L, stack_target); // copy value   
>     p->L = L;
>     p->ref = luaL_ref(L, LUA_REGISTRYINDEX); // save in registry
> }
>
> void co_test_destroy(co_test* p) {
>     luaL_unref(p->L, LUA_REGISTRYINDEX, p->ref);
>     p->ref = 0;
>     p->L = NULL;
> }
>
> int co_test_new(lua_State* L) {
>     co_test* p = (co_test*)lua_newuserdata(L, sizeof(co_test));
>     luaL_setmetatable(L, "co_test"); // bless userdata with metatable
>
>     co_test_make(p, L, 1);
>
>     return 1;
> }
>
> int co_test___gc(lua_State* L) {
>     co_test* p = (co_test*)lua_touserdata(L, 1);
>
>     co_test_destroy(p);
>
>     return 0;
> }
>
> int main(int, char*[]) {
>
>     const char code[] = R"(
> local co = coroutine.wrap(
>     function()
>         local t = co_test.new({ "SOME_TABLE" })
>         -- STUFF
>     end
> )
>
> co() -- call
> co = nil -- destroy coroutine/underlying thread
> collectgarbage() -- collect garbage
> )";
>
>     lua_State* L = luaL_newstate();
>     luaL_Reg registration[3] = { 0 };
>     int pre_stack_check = 0;
>     int post_stack_check = 0;
>
>     luaL_requiref(L, "base", luaopen_base, 1);
>     lua_pop(L, 1);
>    
>     luaL_requiref(L, "coroutine", luaopen_coroutine, 1);
>     lua_pop(L, 1);
>
>     pre_stack_check = lua_gettop(L);
>     if (luaL_newmetatable(L, "co_test") == 1) {
>         registration[0] = luaL_Reg{ "new", &co_test_new };
>         registration[1] = luaL_Reg{ "__gc", &co_test___gc };
>         luaL_setfuncs(L, registration, 0);
>         lua_pushvalue(L, -1); // duplicate metatable
>         lua_setfield(L, -1, "__index"); // loop-back lookup
>     }
>     lua_setglobal(L, "co_test");
>     post_stack_check = lua_gettop(L);
>     // sanity check
>     assert(pre_stack_check == post_stack_check);
>
>     if (luaL_dostring(L, code)) {
>         assert(false);
>     }
>
>     return 0;
> }
>
>
> In "co_test_destroy", the code segfaults (g++, clang++) / "Read Access Violation" (VC++) when doing "luaL_unref" when this code executes. I found that the reason is that the "co" variable, which holds the coroutine and thread represented by the "L" that the "co_test" was created with, dies. If you do not explicitly "nil" the "co" variable in the Lua code, the thread/state that "co_test" is created with stays alive.
>
> This behavior is observable in Lua 5.3, 5.2 and 5.1 (but not with LuaJIT on VC++ and Windows, for some reason).
>
> I am unsure of how to get around this issue properly. I know in 5.2 and 5.3 I can forcefully obtain the main thread's state and thusly avoid this problem altogether by just adjusting the "L" pointer in the "co_test_new" function. But I seem to be on an island for Lua 5.1.
>
> Does anyone have any thoughts about this? Is this behavior desirable? Should I just ask my users to collect garbage before setting any coroutine/thread wrappers to "nil"?
>
> Sincerely,
> ThePhD

this is a tricky situation. basically you store a reference to a
gc managed object (the thread) outside Lua (well, by ouside Lua
I really mean beyond the gc's knowledge), and you need to deref
the pointer in the finalizer, but the collector knows nothing about
this reference relation, so it can't re-mark the referenced object
when calling the finalizer. so when your finializer got called, the
referenced thread object could already got swept.

one obvious solution is you store the referenced thread as the uservalue
or in a field of the metatable of your userdata, this way the gc will
notice the correct reference relation and make sure it not be collected
when calling the finalizer.

if for some reason you can't use the uservalue or metatable, another
possible solution I could think of is to also store the Lua thread
object in the registry (think another lua_ref), which effectively
maintains the reference relation, though in an weird indirect way.


--
the nerdy Peng / 书呆彭 /