Hi,

I am having an "interesting" GC issue related to weak tables and userdata when
considering garbage collection *across* process boundaries...

First, some background...

I am developing a sandboxing mechanism for use in my Git service Gitano.  The
sandbox is called 'Supple' and I've been trying to make it as secure and
transparent as possible.  (Instructions for getting it are further down.)

All communication between the host and the sandbox is sent over a single FD
channel which carries marshalled records, each of which is one of three
things: a method call, a response, or an error propagation.

Whenever a non-integral type is passed across (either as a method argument or
as a result) it is instead remembered and assigned a tag.  The remote end
receives that tag and builds a proxy userdata object to represent the remote
object.  The relationship between tag and proxy is maintained in a pair of weak
tables so that when nothing is legitimately holding a reference to the proxy,
it can be GC'd.
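
As a rough illustration of the tag/proxy bookkeeping described above (this is
not Supple's actual code: the names proxy_by_tag, tag_by_proxy and proxy_for
are made up, a plain table stands in for the proxy userdata, and creating a
proxy on a lookup miss is an assumption made only for the sketch):

```lua
-- Hypothetical sketch of a pair of weak tables linking tags and proxies.
local proxy_by_tag = setmetatable({}, { __mode = "v" })  -- tag -> proxy, weak values
local tag_by_proxy = setmetatable({}, { __mode = "k" })  -- proxy -> tag, weak keys

-- Return the proxy already known for a tag, or build and remember a new one.
local function proxy_for(tag)
   local proxy = proxy_by_tag[tag]
   if proxy == nil then
      proxy = {}  -- real code would build a userdata with a __gc metamethod
      proxy_by_tag[tag] = proxy
      tag_by_proxy[proxy] = tag
   end
   return proxy
end

local a = proxy_for("tag-1")
local b = proxy_for("tag-1")
print(a == b)  -- true: 'a' is held strongly, so the second lookup finds it
```

Neither table keeps a proxy alive on its own, so once the last strong
reference goes away the collector is free to drop both entries.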

Now the problem...

I am encountering an interesting issue where the following sequence of events
occurs:

Side 1 makes a call which returns a *new* proxied table (t)
Side 1 then does not store that anywhere strongly.

Side 1 makes a call which returns the same table (t)
Side 1 attempts to look up the proxy for t in the weak tables and does not
       find it, resulting in an error

Where it gets interesting is that if the userdata proxy were being 'forgotten'
between the two calls, I'd expect to see its __gc run.  However, due to the
two-stage nature of the GC'ing of userdata, the userdata disappears from the
weak tables but has not yet had its __gc called.

Thus the error.
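
The window can be observed from inside the finalizer itself.  The following is
a minimal, self-contained sketch, not taken from Supple; every name in it is
hypothetical.  It uses the undocumented newproxy() where available (Lua 5.1,
LuaJIT) and falls back to a table with __gc on Lua 5.2 and later, on the
assumption that both are treated alike by weak tables:

```lua
-- Hypothetical stand-ins, not Supple's real tables.
local proxy_by_tag = setmetatable({}, { __mode = "v" })  -- tag -> proxy, weak values
local tag_by_proxy = setmetatable({}, { __mode = "k" })  -- proxy -> tag, weak keys

local seen = {}  -- what the finalizer observed when it ran

local function on_gc(self)
   seen.called = true
   seen.value_entry = proxy_by_tag["tag-1"]  -- weak-valued entry for this proxy
   seen.key_entry = tag_by_proxy[self]       -- weak-keyed entry for this proxy
end

local function new_proxy()
   if newproxy then
      -- Lua 5.1 / LuaJIT: userdata with a fresh metatable
      local u = newproxy(true)
      getmetatable(u).__gc = on_gc
      return u
   end
   -- Lua 5.2+: __gc must already be in the metatable when it is set
   return setmetatable({}, { __gc = on_gc })
end

-- Register inside a function so no local in this chunk keeps the proxy alive.
local function remember(tag)
   local p = new_proxy()
   proxy_by_tag[tag] = p
   tag_by_proxy[p] = tag
end

remember("tag-1")
collectgarbage()
collectgarbage()

print(seen.called, seen.value_entry, seen.key_entry)
```

On the stock interpreters I would expect this to print "true nil tag-1": by
the time __gc runs, the weak-valued (tag -> proxy) entry has already been
cleared, while the weak-keyed (proxy -> tag) entry is still present.  Any
lookup by tag that happens between those two stages therefore misses.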

The above is a vastly simplified explanation of the problem which I hope
carries the salient points.  Essentially I'm after suggestions on how to either
avoid or mitigate the situation where the proxy is no longer "present" in the
weak table but has yet to be __gc'd.

How to get and run the code:

If you feel you want to run the real code in order to play with it, you'll need
to fetch Luxio first and install it:

    bzr branch http://bzr.rjek.com/public/luxio/
    cd luxio
    make
    sudo make install LOCAL=1

Then you can fetch Supple and install it:

    git clone git://git.gitano.org.uk/supple.git
    cd supple
    git checkout diagnosis
    make
    sudo make install

Then you can run the example, which, for me, tends to trip the problem:

    lua example/simple-example.lua

Due to the nature of Supple (setuid helper, lots of hard Linuxy sandboxing,
etc.) I doubt it'll be easy to diagnose on any non-Linux platform currently.

In summary:

I'm seeing odd GC behaviour because userdata are no longer 'present' in the
weak table but have yet to have their __gc metamethods called.

Any ideas gratefully received, otherwise I'm just going to have to undo all the
clever "remembering" of proxied objects and hope no one minds that '==' will
work even less than it does now. :-(

Thanks,

D.

-- 
Daniel Silverstone                         http://www.digital-scurf.org/
PGP mail accepted and encouraged.            Key Id: 3CCE BABE 206C 3B69