I am still trying to reduce the repro, but the Lua VM fights me.
It really does help to have a large application, and the objects and stacks have to be large. Every time I try to reduce the problem, the repro chance drops to nearly zero.

However, I have found that using a coroutine significantly increases the chance:

```lua
for _ = 1, 10000 do
  local co = coroutine.create(function()
    local resource = new_proxy()
    resource:close()
  end)
  coroutine.resume(co)
end
```

But this alone almost never repros. Adding more metatables and increasing the stack size seem to greatly affect the repro chance. Sometimes too much on the stack lowers the chance (probably because the stack has been resized quite a lot).
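To make the stack-size experiment concrete, here is a rough sketch of the kind of harness I mean. It is not the real program: `userdata_new` and `userdata_dispose` are pure-Lua stand-ins for the C functions, `with_deep_stack` and `run` are hypothetical helpers, and the depths and iteration counts are arbitrary. It drops the proxy by setting the variable to nil rather than calling `close`, and counts any dispose that sees a nil or already-disposed handle.

```lua
-- Stand-ins for the host's C functions (assumption: the real ones are in C).
local disposed, bad, next_handle = {}, 0, 0
local function userdata_new()
  next_handle = next_handle + 1
  return next_handle
end
local function userdata_dispose(h)
  -- A nil or already-seen handle means __gc ran twice for one proxy.
  if h == nil or disposed[h] then bad = bad + 1 else disposed[h] = true end
end

local cache = setmetatable({}, { __mode = 'k' })
local function new_proxy()
  local userdata = userdata_new()
  local proxy = setmetatable({}, { __gc = function(t)
    local h = cache[t]
    cache[t] = nil
    userdata_dispose(h)
  end })
  cache[proxy] = userdata
  return proxy
end

-- Hypothetical helper: recurse `depth` frames before calling fn, so the
-- coroutine stack is larger when the proxy is created and dropped.
local function with_deep_stack(depth, fn)
  if depth == 0 then return fn() end
  local a, b, c = {}, {}, {}                -- padding locals in each frame
  local r = with_deep_stack(depth - 1, fn)  -- not a tail call, frame stays live
  return r
end

-- Returns the total number of suspicious dispose calls seen so far.
local function run(iterations, depth)
  for _ = 1, iterations do
    local co = coroutine.create(function()
      with_deep_stack(depth, function()
        local resource = new_proxy()
        resource = nil            -- drop the only strong reference
        string.rep("x", 1024)     -- allocate, giving the GC a chance to run
      end)
    end)
    assert(coroutine.resume(co))
  end
  collectgarbage()
  return bad
end

for _, depth in ipairs({ 0, 10, 100 }) do
  print(depth, run(2000, depth))
end
```

On a VM without the problem the second column should stay at 0. In my experience a harness this small almost never triggers the bug on its own, so a 0 here proves little.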




From: Payo Nel <payonel@hotmail.com>
Sent: Thursday, April 23, 2020 10:15 PM
To: lua-l@lists.lua.org <lua-l@lists.lua.org>
Subject: lua 5.4 calls __gc twice, sometimes
 
I am testing with the latest Lua source from GitHub, which I've confirmed matches the tarball source for Lua 5.4 rc1.

I believe commit 1d8920dd7f508af5f2fd743678be1327f30c079b introduced a problem: a metatable __gc may be called twice on the same table value.

So far I have not been able to create a small reproducible test to show the issue. I have tried, and will continue to try.

My program, linked against lua-5.4-rc1, crashes intermittently in my userdata dispose callback: the userdata is being disposed twice, from two separate gc calls.
In the roughly 4 years I have used this custom program, I have never had a __gc callback issue like this. We have seen success with Lua 5.2, 5.3, 5.3.4, and 5.3.5.

My project runs a very large number of tests within a single lua_State, and I use a modest number of userdata objects. The Lua code uses a __gc handler to call a "userdata_dispose" C function back in the main program.

There is a cache, a weak-keyed table, that maps a "proxy" key to a userdata value:

```lua
local cache = setmetatable( {}, { __mode = 'k' })

local function new_proxy()
  local userdata = userdata_new()
  local proxy = setmetatable( {}, { __gc = function(t)
    local h = cache[t]
    cache[t] = nil
    userdata_dispose(h)
  end } )
  cache[proxy] = userdata
  return proxy
end
```
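For anyone who wants to try this without my C host, the following is a minimal sketch with pure-Lua stand-ins for `userdata_new` and `userdata_dispose` (both hypothetical replacements, as are the counters `disposed_count` and `double_dispose`). It assumes Lua 5.2 or later, where `__gc` works on tables. A second `__gc` call on the same proxy would find `cache[t]` already nil, so the stand-in counts nil or stale handles instead of crashing.

```lua
-- Pure-Lua stand-ins for the C functions (assumption: handles are integers).
local live = {}              -- handle -> true until disposed
local next_id = 0
local disposed_count = 0     -- normal, first-time disposes
local double_dispose = 0     -- disposes of a nil or already-disposed handle

local function userdata_new()
  next_id = next_id + 1
  live[next_id] = true
  return next_id
end

local function userdata_dispose(h)
  if h == nil or not live[h] then
    double_dispose = double_dispose + 1
    return
  end
  live[h] = nil
  disposed_count = disposed_count + 1
end

local cache = setmetatable({}, { __mode = 'k' })

local function new_proxy()
  local userdata = userdata_new()
  local proxy = setmetatable({}, { __gc = function(t)
    local h = cache[t]
    cache[t] = nil
    userdata_dispose(h)
  end })
  cache[proxy] = userdata
  return proxy
end

-- Create and immediately drop many proxies, then force full collections.
for _ = 1, 10000 do
  new_proxy()
end
collectgarbage()
collectgarbage()

print("disposed:", disposed_count, "double:", double_dispose)
```

If `double_dispose` is ever non-zero, a finalizer ran twice for one proxy. This small loop is only a detector, not a reliable repro.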

Various places in my Lua code call new_proxy and may store the result for later use, or quickly set the variable to nil. Again, this crash is not consistent. Sometimes userdata_dispose runs more than 6000 times over the full program without a crash. Other times it crashes after ~600 dispose callbacks. The bug manifests during various GC-related checks: it can happen while growing the stack for a normal function call, or during buffer allocation for string methods.

Curiously, the first free always appears to happen when my Lua code sets the proxy variable to nil and soon after calls another C function. During the C function return, there is a stack push for the return value, which triggers a gc check, which frees the now unreferenced proxy (calling __gc, which calls dispose). The 2nd free happens at an arbitrary, seemingly random point.

After bisecting and MUCH testing, I have found that this bug does not exist in 911f1e3e7ff01dafe3e9f77bc9c8e3a25237d7de
and is introduced in the very next commit: 1d8920dd7f508af5f2fd743678be1327f30c079b.
