Josh Haberman wrote:
> I know that base Lua does not provide a C representation for "pointer
> to Lua value":
>   http://article.gmane.org/gmane.comp.lang.lua.general/56515
>
> I was wondering if the same is true for LuaJIT and for the FFI.  I suspect
> that it is since it simplifies GC, but I just wanted to ask in case anyone
> has interesting thoughts on the matter.

Cdata objects do not hold interior pointers to GC objects. This
means they don't need to be traversed by the GC. If you view the
entire state of all Lua objects as a graph, cdata objects are always
leaves.

[Currently this speeds up the GC only a bit. But with the new GC
in LuaJIT 2.1, this has a significant impact on performance: cdata
objects, strings and a couple of internal object types will be
allocated from a distinct area of memory. This area has to be
marked, but the marks live in a separate and compact part of the
area. Only this small part needs to be touched by the GC. The area
holding the payload of cdata is never touched, it doesn't need to
be traversed and it doesn't need to be brought into the cache.
This has big advantages for workloads with many small/medium-sized
cdata objects.]

> I'm trying to decide how to model a protocol buffer as a Lua object.
> Protocol Buffers contain both primitive values (integers, bools, etc.
> which I can pack into a userdata more efficiently than Lua can pack
> lua_Number) and references to submessages, which would be other
> Lua objects.

I was under the impression that protocol buffers are mostly
transient objects. Then I'd worry more about the speed of access
than their memory usage.

> One idea I had was to have a userdata but use its metatable to store
> the references to other Lua values.  That requires a metatable per
> instance which is a little extra overhead and it means that the
> luaL_checkudata() scheme won't work for type checking, but overall
> it's the best idea I have so far.

Avoid writing to metatables after their initial creation. Every
write invalidates the internal negative cache for metamethods
(this is true for Lua, too).

Also, one metatable per instance would defeat any kind of
metatable specialization (if I were to implement that). Use the
environment table of userdata to store per-instance data.

> I forgot to mention one related idea I had: the protobuf data itself
> could live in a separate data structure that the userdata only has a pointer
> to.  In that case the Lua object is just a wrapper, and I can cache these
> Lua objects in a weak table indexed by the address of the actual data
> object (i.e. keyed by lightuserdata).  When I want to go from one Lua
> object to another, I look up the sub-object in the weak table by address.

Umm, isn't this overdesigning a bit? Yeah, ok, so is the whole
protobuf thing. Sorry, I couldn't resist. :-)

But wouldn't it make sense to first study the access patterns and
then adapt the data structures? Assuming that every element is only
accessed once, it might be easier to decode each one on-the-fly
from the binary representation. Maybe remember the last accessed
element and use an incremental algorithm behind the scenes. But I'd
try to avoid costly preparation steps.
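
A rough sketch of what on-the-fly decoding with a remembered position
could look like, in plain C. This is only an illustration under some
assumptions, not code from any existing library: the names pb_cursor,
pb_varint, pb_skip, pb_scan and pb_get_varint are made up, only varint
fields are returned, groups are not handled, and each field is assumed
to appear at most once. The only "state" is the offset just past the
last field returned, so in-order access never rescans the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over an encoded message (not a real library type). */
typedef struct {
  const uint8_t *buf;  /* start of the encoded message */
  size_t len;          /* total length in bytes */
  size_t pos;          /* offset just past the last field returned */
} pb_cursor;

/* Decode one base-128 varint at *pos. Returns 0 on truncated/overlong input. */
static int pb_varint(const uint8_t *buf, size_t len, size_t *pos, uint64_t *out)
{
  uint64_t v = 0;
  int shift;
  for (shift = 0; shift < 64 && *pos < len; shift += 7) {
    uint8_t b = buf[(*pos)++];
    v |= (uint64_t)(b & 0x7f) << shift;
    if (!(b & 0x80)) { *out = v; return 1; }
  }
  return 0;
}

/* Skip the payload of a field with wire type wt (0, 1, 2 or 5). */
static int pb_skip(const uint8_t *buf, size_t len, size_t *pos, unsigned wt)
{
  uint64_t n;
  switch (wt) {
  case 0: return pb_varint(buf, len, pos, &n);            /* varint */
  case 1: n = 8; break;                                   /* fixed 64-bit */
  case 2: if (!pb_varint(buf, len, pos, &n)) return 0;    /* length-delimited */
          break;
  case 5: n = 4; break;                                   /* fixed 32-bit */
  default: return 0;  /* groups and unknown types: not handled here */
  }
  if (n > len - *pos) return 0;
  *pos += (size_t)n;
  return 1;
}

/* Scan [from, to) for a varint field; on success remember the new position. */
static int pb_scan(pb_cursor *c, size_t from, size_t to,
                   uint32_t field, uint64_t *out)
{
  size_t pos = from;
  while (pos < to) {
    uint64_t key;
    if (!pb_varint(c->buf, c->len, &pos, &key)) return 0;
    if ((key >> 3) == field && (key & 7) == 0) {
      if (!pb_varint(c->buf, c->len, &pos, out)) return 0;
      c->pos = pos;
      return 1;
    }
    if (!pb_skip(c->buf, c->len, &pos, (unsigned)(key & 7))) return 0;
  }
  return 0;
}

/* Look forward from the last position first, then wrap around once. */
static int pb_get_varint(pb_cursor *c, uint32_t field, uint64_t *out)
{
  size_t start = c->pos;
  return pb_scan(c, start, c->len, field, out) ||
         pb_scan(c, 0, start, field, out);
}
```

A caller would initialize the cursor with pos = 0 and then ask for
fields as needed. If fields are requested in the order they were
encoded, every byte is looked at once; out-of-order access still works
but pays for a wrap-around scan.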

--Mike