I've got a slight change to my earlier code that causes an assertion
to fail in lfunc.c, line 69:

--------------------------------------
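function thread_fn(thread)
    local x = {}                  -- the table that ends up freed before the thread
    local function inner_fn ()    -- never called; it only captures x as an upvalue
        x = nil
    end
    coroutine.yield()             -- suspend while the upvalue is still open
end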

while true do
    local thread = coroutine.create(thread_fn)
    coroutine.resume(thread, thread)
end
--------------------------------------

When garbage is collected, the table that was assigned to "x" is freed
before the thread is freed.  When the thread is freed, it closes its
open upvalue, copying the old value, which is the previously freed
table object, into the upvalue object's "value" field.  The assert
fails on my machine before this copy occurs; the freed memory contains
garbage, causing a type tag mismatch.
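
The failure described above depends on when a collection happens to run. As a minimal sketch, the same script can be bounded and made to collect after every abandoned coroutine. The name run_repro and the iteration count are hypothetical and not part of the original report; the sketch assumes that collectgarbage() with no arguments performs a full collection.

```lua
-- Sketch only: run_repro and the iteration count are hypothetical names,
-- not part of the original report.
function run_repro(iterations)
    local function thread_fn()
        local x = {}
        local function inner_fn()   -- never called; it only captures x
            x = nil
        end
        coroutine.yield()           -- suspend with the upvalue still open
    end

    local suspended = 0
    for i = 1, iterations do
        local thread = coroutine.create(thread_fn)
        coroutine.resume(thread)
        if coroutine.status(thread) == "suspended" then
            suspended = suspended + 1
        end
        thread = nil        -- drop the reference so the coroutine can become garbage
        collectgarbage()    -- request a full collection right away
    end
    return suspended
end

print(run_repro(1000))
```

An unaffected interpreter should simply print the iteration count. A build that has the bug and has internal assertions enabled would be expected to stop at the assertion instead.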

Without the assert, the Lua VM will continue on and read the value
pointer and type tag information from the freed area of memory, and
store it in the upvalue object.  Although freed memory is read to find
this pointer value, the pointer value that is copied will never be
dereferenced: the dead status of the table object implies that there
are no more live references to the upvalue.  I have not seen this code
cause a crash on my machine, but of course reading from freed memory
is never safe.

It seems like the easiest fix would be to traverse the GC list twice
when performing a sweep; on the first traversal, dead threads could be
freed, and on the second traversal, everything else would be freed.  I
don't like that solution because it seems like far too much overhead
for such a narrow corner case.  Perhaps the best way to fix it,
without adding overhead, would be to maintain a
separate GC list for thread objects, and sweep that list before
sweeping the main GC list.
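
The ordering argument can be illustrated with a toy model. This is not Lua VM code: plain Lua tables stand in for collectable objects, and the names make_garbage and count_stale_reads are hypothetical. The model assumes that freeing a thread reads the object captured by its open upvalue, and that a single list can place the table ahead of the thread.

```lua
-- Toy model only: plain tables stand in for GC objects; nothing here is
-- taken from the Lua VM source.

-- Builds one dead "thread" and the "table" its open upvalue refers to.
function make_garbage()
    local tbl = { kind = "table", freed = false }
    local thread = { kind = "thread", freed = false, captured = tbl }
    return thread, tbl
end

-- Sweeps each list in the order given and returns how many times freeing
-- a thread read an object that had already been freed.
function count_stale_reads(lists)
    local stale = 0
    for _, list in ipairs(lists) do
        for _, obj in ipairs(list) do
            if obj.kind == "thread" and obj.captured.freed then
                stale = stale + 1   -- closing the upvalue touched freed memory
            end
            obj.freed = true
        end
    end
    return stale
end

-- One list with the table ahead of the thread: the failing order.
local thread, tbl = make_garbage()
print(count_stale_reads({ { tbl, thread } }))      --> 1

-- A separate thread list swept before the main list.
thread, tbl = make_garbage()
print(count_stale_reads({ { thread }, { tbl } }))  --> 0
```

In the model, sweeping the thread list first means that every upvalue is closed while the object it refers to is still intact, and it does so without a second pass over the main list.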

Finding these two separate bugs makes me wonder whether there are any
other bugs lurking in this area.  Roberto, you mentioned earlier that
you have a full suite of tests for the Lua VM; how much testing is
there for the case where yielded coroutines are garbage collected?  I
don't mean to be critical; it just seems like a new corner case that
could use further investigation.  Lua is a great language, and I've
enjoyed reading the Lua VM code while investigating these bugs.


On Tue, 4 Jan 2005 11:26:52 -0700, Spencer Schumann
<gauchopuro@gmail.com> wrote:
> I mentioned in my first post in this thread that the code I presented
> was a stripped down version of code that was causing crashes.  I ran
> that code with my patch applied, and unfortunately, I'm still getting
> failed assertions.  I'll continue to investigate this issue, and post
> my findings.