On Sat, Jan 7, 2012 at 1:10 AM, Douglas Creager <douglas@creagertino.net> wrote:
>
> We're using PCL [1] in a current project, and it's worked great on both
> Linux and OS X.  Haven't tested on Windows, but their website claims that it
> works there, too.  PCL also doesn't support passing values directly in its
> yield function (co_call), but like others have mentioned, we just throw the
> value into a struct that both coroutines have access to.
>
> [1] http://www.xmailserver.org/libpcl.html
>
> –doug
>
>

Hi all, again!

Thank you, Douglas, for the PCL link. I think it's pretty much the same
implementation as libcoroutine, but it seemed harder to build on
Windows. :P I could be wrong, since I didn't put much effort into making
PCL work on Windows after hitting a few compile errors.

Anyway, here's the crude benchmark I made:

https://github.com/arch-jslin/mysandbox/tree/master/cpp/fficb_vs_coro

main.cpp consists of 4 similar kinds of callbacks and their
benchmarking loops. I simply dumped libcoroutine's source into this
repository, with two modifications: taskimpl.h doesn't work on Windows,
so I added preprocessor directives around those parts, and I changed
some files from *.c to *.cpp.

The benchmark may be inaccurate, since these 4 methods aren't doing
exactly the same thing. They do achieve more or less the same
functionality, however:

1. call a C++ callback (as a base case, since I wanted to move these
callbacks to Lua)
2. register a Lua callback with C through the LuaJIT FFI.
3. call a C++ callback which contains a call to a Lua function.
4. notify Lua by switching C coroutines.
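
For illustration, the base case (method 1) and its timing loop could be
sketched roughly as follows. This is not the code from the repository:
it assumes C++11 std::function instead of tr1, and time_callback,
run_baseline and BenchResult are hypothetical names.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>

// Hypothetical helper: invoke `cb` `iterations` times and return the
// elapsed wall-clock time in nanoseconds.
std::int64_t time_callback(const std::function<void(int)>& cb, int iterations) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        cb(i);  // the call being measured
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Hypothetical result type: the checksum keeps the callback's work
// observable so the loop cannot be optimized away entirely.
struct BenchResult {
    long long checksum;
    std::int64_t nanos;
};

// Hypothetical driver for the base case: a plain C++ callback that
// accumulates its argument.
BenchResult run_baseline(int iterations) {
    long long sum = 0;
    std::int64_t nanos = time_callback([&sum](int i) { sum += i; }, iterations);
    return BenchResult{sum, nanos};
}
```

The other three methods would plug a different callable into the same
loop, which is why the comparison is only approximate.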

The 4th method should be the approach Mike Pall mentioned before he
implemented LuaJIT FFI callbacks.

And the results:

No doubt the 1st method is faster than the other 3 by an order of
magnitude. The LuaJIT FFI callback is the slowest in this test, but the
3rd method is no better: the two are about the same (less than 10%
difference). Pretty much as expected so far.

Lastly, the 4th method is faster than the 2nd and 3rd, but not THAT
fast (about 40% faster). I've checked the code with "luajit -jv"
that the tight loop in Lua is compiled, so the bottleneck still lies in
the context switching mechanism implemented by libcoroutine, which on
Windows uses Fibers.
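
To illustrate what the 4th method pays for on every notification, here
is a rough sketch of switching between two C coroutines, with the value
handed over in a struct both sides can see. It does not use libcoroutine
or Fibers: it assumes a POSIX system with <ucontext.h> (e.g. Linux with
glibc), and Mailbox, producer and run_producer are hypothetical names.

```cpp
#include <ucontext.h>
#include <vector>

// Hypothetical shared struct: the switch itself carries no value, so
// both contexts read and write this instead.
struct Mailbox {
    int value = 0;
    bool done = false;
};

static ucontext_t main_ctx, coro_ctx;
static Mailbox mailbox;

// Runs on its own stack; "yields" by switching back to the main context.
static void producer() {
    for (int i = 1; i <= 3; ++i) {
        mailbox.value = i * 10;
        swapcontext(&coro_ctx, &main_ctx);  // yield to the consumer
    }
    mailbox.done = true;
    // Returning here resumes uc_link, i.e. main_ctx.
}

// Resumes the producer until it finishes and collects what it sent.
std::vector<int> run_producer() {
    static char stack[64 * 1024];  // stack for the coroutine
    std::vector<int> received;

    mailbox = Mailbox{};
    getcontext(&coro_ctx);
    coro_ctx.uc_stack.ss_sp = stack;
    coro_ctx.uc_stack.ss_size = sizeof stack;
    coro_ctx.uc_link = &main_ctx;
    makecontext(&coro_ctx, producer, 0);

    for (;;) {
        swapcontext(&main_ctx, &coro_ctx);  // resume the producer
        if (mailbox.done) break;
        received.push_back(mailbox.value);
    }
    return received;
}
```

Each value costs two context switches (one in, one out), which is the
overhead that remains even when the Lua side is compiled.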

When using the 4th method you have to be careful that the yielding
call in Lua (which would be triggered by the Coro_startCoro_ call in my
code) doesn't block for too long, or LuaJIT will no doubt blacklist that
loop, and that is an order of magnitude slower.

This code should compile easily, provided you have LuaJIT installed and
a C++ compiler which can handle tr1 functions.

best regards,
Johnson Lin