It was thus said that the Great Coroutines once stated:
> On Fri, Apr 4, 2014 at 5:19 PM, Sean Conner <sean@conman.org> wrote:
> 
> > It was thus said that the Great Coroutines once stated:
> > >
> > > I do agree that separate global_States in separate threads is the
> > > safer/saner way to go.  The issue I have with that is swallowing the cost
> > > of marshalling/serialization.  I wish there were a lua_xmove() for moving
> > > objects between global_States, so you could make all "Lua instances"
> > > visible within a shared memory space, and swap objects directly between
> > > them.
> >
> >   I don't know.  I want to say that if you want to move an arbitrary
> > object between threads you are doing it wrong, but I'm not sure what
> > you are trying to do, so I won't say that 8-P
> >
> >   In general, it's a difficult, if not outright impossible, task.  Tables
> > are difficult, and I suspect userdata is all but impossible to handle
> > correctly with a "lua_gsxmove()" function.
> >
> >   And as for the cost of marshalling/serialization, remember, the QNX X
> > server I'm talking about did all that, and *still* was faster than a shared
> > memory version of the server.
> >
> >   -spc
> >
> >
> You could make lua_States from separate processes visible to each other
> with shared memory, but what's on the stack is most likely a reference if
> the object isn't something like a number.  You could move these references
> between lua_States of different processes, but the data wouldn't be moved
> from one global_State to the other.  This is why marshalling is the
> safest/slowest way right now :(
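
  Right: lua_xmove() only moves values between lua_States that already
share a global_State (a state and a coroutine created from it, say),
precisely because all it moves is the reference:

	lua_State *co = lua_newthread(L); /* shares L's global_State */
	lua_pushliteral(L, "hello");
	lua_xmove(L, co, 1); /* moves the reference, not the bytes */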

  How do you know that marshalling is the slowest way?  That's an
assumption (remember, the QNX X server marshals everything, and it's
*still* faster than the shared memory version).  It may also be that you
are trying to share too much, thus causing cache contention [1].
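
  And the copy really is a copy: each global_State has its own garbage
collector tracking its own objects, so you can't hand a reference across;
the bytes themselves have to move.  A rough sketch of a hypothetical
copy_top() for the easy types (tables and userdata are exactly the hard
part, as I said above):

	#include <lua.h>
	#include <lauxlib.h>

	/* copy the value on top of 'from' onto the top of 'to'; only
	   types without identity can be copied this blindly */

	static void copy_top(lua_State *from, lua_State *to)
	{
	  switch (lua_type(from, -1))
	  {
	    case LUA_TNIL:
	      lua_pushnil(to);
	      break;
	    case LUA_TBOOLEAN:
	      lua_pushboolean(to, lua_toboolean(from, -1));
	      break;
	    case LUA_TNUMBER:
	      lua_pushnumber(to, lua_tonumber(from, -1));
	      break;
	    case LUA_TSTRING:
	    {
	      size_t      len;
	      const char *s = lua_tolstring(from, -1, &len);
	      lua_pushlstring(to, s, len); /* copies the bytes into 'to' */
	      break;
	    }
	    default:
	      luaL_error(from, "cannot copy a %s between states",
	                 luaL_typename(from, -1));
	  }
	}

  Past those types you're writing a serializer anyway, which is the
marshalling cost you were complaining about.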

  -spc (The only way to be sure is to measure ... )

[1]	A novel approach to spinlocks:

		http://lwn.net/Articles/590243/

	It uses far more memory than a traditional spinlock (something
	like 2*number-cpus), but in practical real-world tests [2] it was
	100% faster, *because* it reduced cache contention.
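
	The article covers MCS-style queued locks.  The trick is that
	each waiter spins on a flag in its own queue node rather than on
	the shared lock word, so the spinning stays in that CPU's own
	cache.  A rough sketch with C11 atomics (default seq_cst ordering
	for clarity; a tuned version would relax it):

		#include <stdatomic.h>
		#include <stdbool.h>
		#include <stddef.h>

		struct mcs_node
		{
		  _Atomic(struct mcs_node *) next;
		  atomic_bool                locked;
		};

		typedef _Atomic(struct mcs_node *) mcs_lock; /* queue tail */

		void mcs_acquire(mcs_lock *lock, struct mcs_node *self)
		{
		  struct mcs_node *prev;

		  atomic_store(&self->next, NULL);
		  atomic_store(&self->locked, true);
		  prev = atomic_exchange(lock, self);
		  if (prev != NULL)
		  {
		    atomic_store(&prev->next, self);
		    while (atomic_load(&self->locked))
		      ; /* spin on OUR cache line, not the lock word */
		  }
		}

		void mcs_release(mcs_lock *lock, struct mcs_node *self)
		{
		  struct mcs_node *succ = atomic_load(&self->next);

		  if (succ == NULL)
		  {
		    struct mcs_node *expected = self;
		    if (atomic_compare_exchange_strong(lock, &expected, NULL))
		      return; /* no waiters */
		    while ((succ = atomic_load(&self->next)) == NULL)
		      ; /* waiter swapped the tail but hasn't linked yet */
		  }
		  atomic_store(&succ->locked, false); /* pass ownership */
		}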

[2]	A particular type of benchmark *cough cough*.