- Subject: Re: w5 GC bug regression? [Re: lua51w4-rvm segmentation fault]
- From: "Adam D. Moss" <adam@...>
- Date: Fri, 18 Mar 2005 16:00:21 +0000
Adam D. Moss wrote:
> I'm having a problem where my objects seem to be getting GC'd in an
> unexpected order at shutdown, in 5.1work5. I'd previously seen this
> problem in 5.1work4 and thought that it was fixed when I applied
> Roberto's (or Ed Ferguson's) patch, but only one instance of it was
> fixed.
I think I understand the problem a bit better now, and it's probably
a genuine problem but not precisely what I thought it was.
This trace and explanation might make the problem a little clearer:
[..]
DEBUG:zlua:[string "physpick.lua"]:514:Created physpick's button-listeners: userdata: 0x846293c userdata: 0x84629b4
[..]
[lua_close() called]
[..]
DEBUG:zscript:api_util.c:510: {{GCing ZLUA listener 0x84629b4...
DEBUG:zscript:api_util.c:524:...done}}
DEBUG:zscript:api_util.c:510: {{GCing ZLUA listener 0x846293c...
DEBUG:zscript:api_util.c:524:...done}}
DEBUG:zlua:[string "physpick.lua"]:521:Removing physpick's button-listener.
DEBUG:zscript:api_util.c:544:block: getting listener data from 0x846293c, zlla is (nil)
WARNING:zscript:api_util.c:545:!zlla... blocking probably-dead listener; premature GC? Ignoring.
DEBUG:zscript:api_util.c:544:block: getting listener data from 0x84629b4, zlla is (nil)
WARNING:zscript:api_util.c:545:!zlla... blocking probably-dead listener; premature GC? Ignoring.
[..]
What the program does (amongst many other things) is this:
1) create __gc-able userdata A, whose __gc method is a Lua function which refers to B,C
2) create __gc-able userdata B,C whose __gc methods are cfuncs which refer only to that userdata
3) make B,C depend on the lifespan of A (through weak refs)
(Note that although this isn't its purpose, step 3 should make B and C
outlive A even if that dependency weren't already obvious [which I think it is!]
from tracing liveness through A's __gc method.)
During the normal lifetime of the program this is all a useful and sensible
thing to do that works just great. Now, what happens upon lua_close() is that
(as documented!) the __gc methods are called in this order:
1) C (__gc method refers to C, okay)
2) B (__gc method refers to B, okay)
3) A (__gc method refers to B,C which are dead userdata with live
script-side references -- havoc ensues)
Although this is the documented behaviour in terms of GC order, it seems
to me that this is the wrong thing to do here, as B and C are still (I believe)
directly traceable as live through A at the time they're collected.
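
As a rough illustration of the ordering above, here is a toy model in Python. It is not Lua and does not use the Lua C API; the Userdata class, the alive flag and close_state() are all made up for this sketch, and the only thing taken from the report is the reverse-order-of-creation rule at lua_close().

```python
class Userdata:
    """Toy stand-in for a userdata with a __gc finalizer (not the Lua API)."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.finalizer = None  # assigned after creation


def self_only_gc(obj, log):
    # Like B and C: the finalizer touches only its own userdata.
    log.append(f"{obj.name}: finalized")


def make_referring_gc(deps):
    # Like A: the finalizer reaches other userdata.
    def gc(obj, log):
        for dep in deps:
            state = "live" if dep.alive else "DEAD"
            log.append(f"{obj.name}: touches {dep.name} ({state})")
    return gc


def close_state(created, log):
    # Reverse order of creation, the order reported for lua_close().
    for obj in reversed(created):
        obj.finalizer(obj, log)
        obj.alive = False


def run_demo():
    a, b, c = Userdata("A"), Userdata("B"), Userdata("C")
    a.finalizer = make_referring_gc([b, c])
    b.finalizer = self_only_gc
    c.finalizer = self_only_gc
    log = []
    close_state([a, b, c], log)
    return log


for line in run_demo():
    print(line)
```

In this model C and B are finalized first, so by the time A's finalizer runs both of the objects it reaches are already marked dead, which mirrors the "probably-dead listener" warnings in the trace.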
The current behaviour presents a subtle, interesting and unanticipated
problem, which is that reachable objects will be live as expected during
normal GC __gc runs, but __gc runs triggered by lua_close() will be able to
reach dead objects (and not necessarily know it) -- ouch.
I grant that any given ordering will be wrong in the case of cycles, but
I would expect that the correct thing to do with non-cycles such as this
would be to collect from leaf to root during a lua_close(), only
then falling back to the 'reverse order of creation' rule within cycles.
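
One possible reading of that ordering, sketched as a toy Python function rather than anything resembling the real collector: an object is finalized only once nothing still pending refers to it, and when a cycle leaves no such object the choice falls back to plain reverse order of creation. The function name, the name-based reference map and the tie-breaking rule are assumptions made for this sketch.

```python
def finalization_order(created, refs):
    """Return a finalization order for the given objects.

    created: object names in creation order.
    refs:    maps a name to the set of names its finalizer refers to.
    """
    remaining = list(created)
    order = []
    while remaining:
        # Ready: no *other* pending object still refers to this one.
        ready = [
            n for n in remaining
            if not any(n in refs.get(m, ()) for m in remaining if m != n)
        ]
        # Inside a cycle nothing is ready, so fall back to every pending
        # object and let reverse creation order decide.
        pool = ready if ready else remaining
        chosen = pool[-1]  # most recently created candidate
        order.append(chosen)
        remaining.remove(chosen)
    return order


print(finalization_order(["A", "B", "C"], {"A": {"B", "C"}}))
print(finalization_order(["X", "Y"], {"X": {"Y"}, "Y": {"X"}}))
```

For the A, B, C case this puts A first, so its finalizer still sees live B and C; the two-object cycle degrades to reverse creation order. The quadratic scan is only for readability.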
If this isn't seen as something worth fixing (or is too hard to fix),
then would it be possible to make heavy userdata appear to have the
'value' of nil once they have been __gc'd? That would at least let __gc
methods be bulletproofed against using dead heavy userdata during a
lua_close() finalization.
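
To show what such bulletproofing might look like, here is a small Python model of the fallback idea. It is purely hypothetical: Handle, value_of() and guarded_gc() are invented names, and the "finalized userdata reads as nil" rule is the proposal being made here, not existing Lua behaviour.

```python
class Handle:
    """Toy stand-in for a heavy userdata (hypothetical, not the Lua API)."""

    def __init__(self, name):
        self.name = name
        self.finalized = False


def value_of(handle):
    # Proposed rule: once finalized, the handle appears as nil (None here).
    return None if handle.finalized else handle


def guarded_gc(owner_name, deps, log):
    # A finalizer that checks each dependency before using it.
    for dep in deps:
        target = value_of(dep)
        if target is None:
            log.append(f"{owner_name}: dependency already finalized, skipping")
            continue
        log.append(f"{owner_name}: removing listener {target.name}")


b, c = Handle("B"), Handle("C")
c.finalized = True  # pretend C's __gc has already run
log = []
guarded_gc("A", [b, c], log)
print(log)
```

With a rule like this the script-side finalizer can tell a dead userdata apart from a live one, instead of relying on the C side to notice that its private data has gone.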
Thanks,
--Adam
--
Adam D. Moss - adam@gimp.org