On Sat, 01 Aug 2009 07:52:57 +0300, David Manura wrote:

About 1.3x (Linux/gcc3.4) to 6x (Cygwin/gcc3.4) speedup over
lua_cpcall depending on the compiler.  See below.

[snip] good code [/snip]

(gcc 3.4/linux)
$ gcc -O2 lt.c -llua -lm
$ ./a.out
lua_cpcall: 2.150000
lua_pcall: 0.950000
lua_call: 0.600000
myluacpcall: 1.600000
myluacpcall2: 5.420000

Thank you, this is consistent with what I'm getting.
When I first saw the Cygwin timings, I thought I had broken the GC,
but it is probably Cygwin's memory allocator that is broken.
These are the timings I get. GCPAUSE is at its default of 200; decreasing it
slows lua_cpcall much more than the other variants.

gcc-4.2 -DN=10000000 -Os
lua_cpcall: 1.920000
lua_pcall: 0.810000
lua_call: 0.460000
myluacpcall: 1.390000
myluacpcall2: 5.430000
my_cpcall: 1.210000

mipsel-linux-uclibc-gcc-4.2 -DN=100000 -Os
lua_cpcall: 4.820000
lua_pcall: 2.800000
lua_call: 0.330000
myluacpcall: 4.040000
myluacpcall2: 11.720000
my_cpcall: 4.090000

Here my_cpcall was changed to match your calling convention, for an apples-to-apples comparison:

int my_cpcall(lua_State *L, lua_CFunction func, int nargs, int nresults, int errfunc) {
  /* push myauxccall; cpcallIdx is C ref returned by lua_refi */
  lua_pushvalue(L, cpcallIdx);
  /* move myauxccall below its arguments */
  if (nargs) lua_insert(L, -nargs-1);
  /* pass the real C function as the last argument */
  lua_pushlightuserdata(L, (void*)func);
  return lua_pcall(L, nargs+1, nresults, errfunc);
}

int myauxccall(lua_State *L) {
  /* the target function is the light userdata on top of the stack */
  lua_CFunction f = (lua_CFunction) lua_topointer(L, -1);
  lua_pop(L, 1);
  return f(L);
}

Still, the difference between the plain call and the protected calls is
appalling, especially on MIPS. But I think that adding some kind of
automatic code generator for wrapping Lua API code in an outline function,
plus using this protected calling, should be a good enough solution. Too bad
it seems that C++0x lambda functions are not appearing in GCC any time soon;
they would be just perfect for this task.

Thank you all for the help.