lua-l archive

mark gossage wrote:
Hi folks,
We all seem to be looking at the same kind of thing. Basically extending a C++ object within Lua, and being able to call it within either C++ or Lua.  I think with a bit of effort, putting our heads together we should be able to crack this one.

Here is a quick summary of the current state of SWIG-LUA:
SWIG wraps a class as a full userdata. The userdata holds a pointer to the object, a pointer to the SWIG_TYPE (an internal type structure), and a flag indicating whether the object should be GC'ed.
It then adds a metatable to the object, which holds all the methods, as well as a bunch of functions to read/write the attributes (which is why you can access the attributes naturally). This metatable is held in the registry and is actually shared between all instances of the class.

Thanks for describing how SWIG-Lua works. It's a great piece of work, and I'm glad to be using it.

The various SWIG back-ends have different approaches and abilities, depending on the target language. Lua and Python have a lot in common. I've used SWIG-Python, and I'm just learning Lua and SWIG-Lua. The Python back-end is pretty complicated (python.cxx is 127,067 bytes) compared to the current Lua back-end (37,054 bytes). But with a language like Lua, small and simple is a good thing, and I hope we can keep it that way!

One of the abilities that the SWIG-Python back-end supports is to generate Python wrapper classes ("shadow classes") around the lower level procedural entry points, in such a way that you can subclass them from Python.

Python only has one standard object system (unless you count Zope/CMF/Plone, which have about six and a half between them ;-), but Lua lets you roll your own object system, so SWIG can't make as many assumptions about the kind of Lua object wrappers to generate, and it has to be more flexible.

One cool thing about SWIG is that a lot of it is written with typemap libraries. You can hook into typemaps and extend them, and it's flexible enough that the back-end can define new kinds of typemaps that let the user hook into the wrapper generation in language specific ways. It would be great if the SWIG-Lua back-end would generate Lua wrapper files, so you could write typemaps to tailor the wrappers for whatever kind of Lua object system you're using. Of course it should include a library of typemaps to support the best practices (whatever those are -- is there any consensus?).

I like Ariel's idea of having a function to set up the inheritance properly. I think this is a simple but usable solution.
Because of the way SWIG currently wraps classes, the object has to remain a userdata with a metatable.
But I reckon that the metatable should not be shared, and it could be used to hold the new attributes and new functions.
I am not so clear on what we do with the C++ (one problem at a time), but keeping a ref to the metatable seems like a good idea. Don, I seem to remember that you suggested having a ref from the C++ back to the userdata; that might work. But I think it might mess up the GC: as long as the C++ has a ref to the userdata, the userdata cannot be GC'ed.

I've taken a look at how tolua++ works, and it's quite nice and well integrated with Lua, and solves lots of these problems. It has a "setpeer" function that in Lua 5.1 uses the "setfenv" call to attach an environment table (i.e. the Lua class) to the userdata object, without stomping on its metatable. But that means the metatable has to be in on the conspiracy, and know about delegating to the environment (peer object) as well as cutting the C++ object in on its part of the action. We could modify the SWIG back-end to have a hook in the wrapper function SWIG_Lua_class_get, so you can write a delegation typemap that tries the environment table. Like tolua_event.c's class_index_event does:

       lua_getfenv(L,1);
       if (!lua_rawequal(L, -1, TOLUA_NOPEER)) {
           lua_pushvalue(L, 2); /* key */
           lua_gettable(L, -2); /* on lua 5.1, we trade the "tolua_peers" lookup for a gettable call */
           if (!lua_isnil(L, -1))
               return 1;
       };

I'm still trying to figure this out, and writing it down helps work it out, so please tell me if I've got anything wrong or missed anything.

tolua++ uses a weak value table to map from Lua wrapper (peer) objects to the corresponding userdata (which can be gc'ed). In Lua 5.1, the env slot on the userdata directly and efficiently links the other way.

I haven't been able to figure out by reading the code if or how tolua++ makes sure there are never two userdata referring to the same C++ object. I think it's worth the extra effort it takes to ensure that there's a 1:1 mapping between C++ objects and wrappers (interning userdata), because that enables you to hang extra properties off of the wrapper objects, and they won't go away or get confused with multiple wrappers around the same object.

I have committed to the SWIG CVS a new file which helps with writing callbacks:
http://swig.cvs.sourceforge.net/swig/SWIG/Lib/lua/lua_fnptr.i?revision=1.1&view=markup
(there is also an example of using it in CVS)

This might be of use to some of you.
Can some of you help me with some more ideas on how to address the rest of the issues?

Thanks,
Mark

Thanks for posting the typemaps -- they're so cool! Your SWIGLUA_REF typemap is similar to the one I wrote to handle references. But I just assumed there was one global interpreter named "L", and stored plain integer Lua object reference ids in my C++ members (making a "typedef int LuaRef" to let SWIG know what I meant). And I had to call the equivalent of your swiglua_ref_clear function in my C++ object's destructor to make sure the corresponding refs get cleaned up.

The primary LuaRef I was using for each C++ object was a reference to itself in Lua-land (the peer object), so it was easy for the C++ object to pass the peer object back to the handlers as the "self" argument, and to access Lua properties attached to the peer object (like instance variables and callback methods), and stuff like that.

An alternative, more gc-friendly approach would be to use a weak table like tolua++ is doing. My stupid objects are owned by the application, created by a factory and destroyed through helper functions, and I don't currently intend for the Lua programmer to create and destroy them directly with new and delete (at least at this stage) or for the Lua GC to collect them. Of course that would be nice once we figure out the best way to do that, but it would be best to support both approaches to object ownership (the application owns, creates and destroys the objects, versus the Lua programmer can call the class's new and object's delete methods, and the Lua GC controls their lifetimes).

I'm still learning Lua and trying to get my head around it, especially how the meta-object programming stuff works, and how metatables, userdata and native code extensions interact.

One question I have about Lua OOP in general is: why (in the examples on the Wiki) is there both a metatable, and also a separate method table (which the metatable points to with its __index attribute)? This seems kind of wastefully JavaScripty, and not as minimally Selfish as I'd expect. Is there any reason not to put methods and class variables directly into the metatable, and dispense with the extra method table in the metatable's __index attribute?
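
To make the question concrete, here is a minimal sketch of the two styles side by side. The names PointMethods, PointMeta and Vec are made up for illustration and do not come from the Wiki examples. One trade-off that is often mentioned: when the metatable doubles as the method table, metamethod names such as __index become visible as ordinary fields on every instance.

```lua
-- Style 1: a separate method table sits behind the metatable's __index.
local PointMethods = {}
function PointMethods:norm1() return math.abs(self.x) + math.abs(self.y) end
local PointMeta = { __index = PointMethods }

-- Style 2: the metatable is its own method table (no extra table).
local Vec = {}
Vec.__index = Vec
function Vec.new(x, y) return setmetatable({ x = x, y = y }, Vec) end
function Vec:norm1() return math.abs(self.x) + math.abs(self.y) end

local p = setmetatable({ x = 3, y = -4 }, PointMeta)
local v = Vec.new(3, -4)

-- Both styles dispatch methods the same way.
print(p:norm1(), v:norm1())

-- The difference: style 2 exposes metamethod slots through the instance.
print(p.__index)        -- nil: PointMethods has no __index field
print(v.__index == Vec) -- true: the lookup lands on Vec.__index
```

So the single-table form works fine for plain Lua classes. The separate table mainly matters when __index has to be a function (as it would for a userdata that dispatches into native code), or when you want the metamethod slots kept out of the instance's namespace.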

Lua is a lot like Self. But one difference is that Self lets you mark any slot as inheritable, so you can implement multiple inheritance by having multiple parent slots (with different names of course). Like the userdata has both a metatable and an env slot (but the env slot is general purpose and not automatically inherited from, so the metatable has to know how to delegate to the env for "dual inheritance" to work). What we're trying to do by binding together Lua "peer" object tables (the Lua object) with Lua userdata objects (the C++ object), C++ metatables (the C++ behavior) and Lua scripted classes (the Lua behavior), seems a lot like multiple inheritance, where we first check the "peer" table for user defined attributes/methods, then check the Lua class for scripted attributes/methods, then the C++ metatable for native code attributes/methods.

userdata:
   object => pointer to C++ object, represents object binary data and native code
   metatable => C++ metatable behavior, dispatches attributes and methods into binary data and native code
   env => Lua peer table, represents scripted side of object
      peer's properties: scripter defined instance variables, callbacks, etc.
      peer's metatable => Lua class behavior
         class's properties: scripter defined class variables and methods
         class's metatable => Lua superclass behavior, etc...

global weak value dictionary:
 lua peer table (strong) => userdata (weak)
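
The lookup order described above (peer first, then the Lua class, then the C++ metatable) can be sketched in pure Lua. This is only an illustration under stated assumptions: a plain proxy table stands in for the userdata, a closure captures the peer instead of the env slot, and the `native` table stands in for the methods SWIG would register. The names wrap_with_peer, native and Sprite are hypothetical.

```lua
-- Stand-in for the native (C++) methods held by the C++ metatable.
local native = {
  getName = function(self) return "native:" .. self.name end,
  area    = function(self) return 0 end,
}

-- A scripted Lua class; its area() overrides the native one.
local Sprite = {}
Sprite.__index = Sprite
function Sprite:area() return self.w * self.h end

-- Build a stand-in "userdata": a proxy that stores nothing itself.
local function wrap_with_peer(peer, class)
  setmetatable(peer, class)          -- peer's metatable => Lua class
  return setmetatable({}, {
    __index = function(_, key)
      local value = peer[key]        -- 1. peer, 2. Lua class (via peer's metatable)
      if value ~= nil then return value end
      return native[key]             -- 3. native methods
    end,
    __newindex = function(_, key, value)
      peer[key] = value              -- scripter-defined fields land on the peer
    end,
  })
end

local obj = wrap_with_peer({ name = "ship", w = 4, h = 5 }, Sprite)
obj.onClick = function(self) return "clicked " .. self.name end

print(obj:area())     -- resolved by the Lua class
print(obj:getName())  -- falls through to the native table
print(obj:onClick())  -- per-instance callback stored on the peer
```

In a real binding the __index function would live in C and fetch the peer from the userdata's environment, as the tolua++ excerpt does, but the order of the three lookups would be the same.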

Then the C++ object needs a way to get to the corresponding userdata. That could be:

global weak value dictionary:
 integer address of C++ object => userdata (weak)

Or we could use the (more efficient?) approach of putting an integer reference to the userdata in the C++ object, like I was doing with LuaRef's. But that may leak memory (and it imposes on the design of the C++ object, making it harder to wrap uncooperative code, i.e. libraries written by other people), so it might be better to use a weak dictionary to map from C++ object addresses to userdata's. (Or does Lua intern userdata's automatically? That'd be nice!)
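
A minimal sketch of the weak dictionary idea, assuming a plain table stands in for the userdata and an integer stands in for the C++ object's address. The names wrappers and intern are hypothetical. As far as I know, Lua does not intern full userdata by itself (each lua_newuserdata call creates a distinct object; only light userdata with equal addresses compare equal), so a table like this would be needed to get the 1:1 mapping.

```lua
-- Weak values: the table alone does not keep a wrapper alive, so an
-- entry can be collected once nothing else references the wrapper.
local wrappers = setmetatable({}, { __mode = "v" })

-- Return the one wrapper for `address`, creating it on first use.
local function intern(address)
  local wrapper = wrappers[address]
  if wrapper == nil then
    wrapper = { address = address, props = {} }  -- stand-in for a new userdata
    wrappers[address] = wrapper
  end
  return wrapper
end

local a = intern(0x1000)
local b = intern(0x1000)
a.props.label = "player"

print(a == b)          -- true: same wrapper for the same address
print(b.props.label)   -- player: extra properties survive across lookups
```

One caveat with keying on addresses: if the C++ object is destroyed and another is later allocated at the same address while the old wrapper is still alive, the lookup would return the stale wrapper, so the entry should be cleared when the C++ object goes away.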

   -Don