It was thus said that the Great Xin Zhao once stated:
>  Hello,
> I'm current write a lua perftool like gperftools, it use signals call back
> to set lua hook by lua_sethook, and in the hook function get the lua call
> stack by lua_getinfo and save it, because some lua C API running in signals
> call back can cause crash.
> and in pure lua program it runs well.

  I would avoid the use of signal() (or sigaction()) entirely.  It is too
hard to write sane signal handlers, as almost no standard C functions [1]
can be safely called from one, and only a limited number of POSIX functions
can [2].  Just avoid any use of signals.
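
  To give a sense of how little a handler can portably do, here is a
minimal sketch (not taken from any existing code; the names on_signal,
g_sample_requested and take_sample_request are made up for illustration).
The handler only stores to a volatile sig_atomic_t flag, and everything
else happens later in ordinary code that polls the flag:

```c
#include <signal.h>

/* Hypothetical flag name.  volatile sig_atomic_t is the type the C
   standard guarantees can be written from inside a signal handler. */
static volatile sig_atomic_t g_sample_requested = 0;

/* The handler does nothing but record that the signal arrived:
   no printf(), no malloc(), no Lua API calls. */
static void on_signal(int sig)
{
  (void)sig;
  g_sample_requested = 1;
}

/* Called from ordinary (non-handler) code.  Returns 1 if a signal has
   arrived since the last call and clears the flag, otherwise returns 0. */
static int take_sample_request(void)
{
  if (g_sample_requested)
  {
    g_sample_requested = 0;
    return 1;
  }
  return 0;
}
```

  A program would install the handler with signal(SIGINT,on_signal) (or
sigaction()) and call take_sample_request() from its main loop.  Anything
beyond setting the flag is where the trouble described above starts.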

  I recently had to profile an application that is mostly written in Lua
(with some C code).  I first profiled the code at the C level (this under
Linux---easy enough to do by using the "-pg" linking flag when making the
final executable, running it, and checking the output afterwards with
'gprof'):

	http://boston.conman.org/2019/08/20.1

  This revealed that LPEG was the hot spot.  To profile the Lua code, I
wrote some code *in Lua* to collect the profile data [3]:

	http://boston.conman.org/2019/08/21.1

  That wasn't that surprising---the code does a ton of LPEG processing (SIP
messaging).  What was surprising was the top hotspot.  It took a few days of
thinking, but I did clean up that hot spot:

	http://boston.conman.org/2019/08/29.1

as well as only parsing the SIP headers we actually care about (instead of
the nearly 100, all in alphabetical order---that code was written when I
wasn't sure what we needed) and we did get a decent increase in the
performance.

> but sometimes the program may like this:
> 
> while (true)
> {
>      do_something_1();
>      do_something_2();
>      do_something_3();
>      ......
>      call_lua_func("lua_main");
> }
> 
> when the signals trigged, it may running in C native code
> "do_something_2()",  and in the hook function I get the lua info will be
> the "lua_main", it is not correct.
> so I used an ugly code like:
> 
> static void SignalHandler(int sig, siginfo_t *sinfo, void *ucontext)
> {
>     // L-nny == 0 && L-nCcalls == 0
>     unsigned short nny = *(unsigned short *)((char*)L+196);
>     unsigned short nCcalls = *(unsigned short *)((char*)L+198);
>     if (nny == 0 && nCcalls == 0)
>     {
>         return;
>     }
>     lua_sethook(gL, SignalHandlerHook, LUA_MASKCOUNT, 1);
> }
> 
> so is there a C API can check VM is running?
> ps, my code is in https://github.com/esrrhs/pLua

  I don't think there's any meaningful way to profile both C and Lua code in
the same application, and that signal handler is ... well, I would reject
that signal handler outright in any code review (even as a hack).  How did
you determine that the nny field is at offset 196?  Or the nCcalls field is
at 198?  I would at the very least include lstate.h in that source file
and rewrite the function as:

	static void SignalHandler(int sig,siginfo_t *sinfo,void *ucontext)
	{
	  (void)sig;
	  (void)sinfo;
	  (void)ucontext;

	  /* shouldn't this be gL?  Where's L defined? */
	  if ((L->nny == 0) && (L->nCcalls == 0))
	  {
	    return;
	  }
	  lua_sethook(gL,SignalHandlerHook,LUA_MASKCOUNT,1);
	}
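
  As a rough illustration of why named fields beat hard-coded offsets,
here is a small sketch using a made-up struct (fake_state is hypothetical
and is NOT the real lua_State layout, which varies between Lua versions
and builds).  With the struct definition visible, the compiler works out
the offsets, and offsetof() can report them if you ever need the numbers:

```c
#include <stddef.h>

/* Hypothetical stand-in for an interpreter state; only the two counter
   names are borrowed from the handler above. */
struct fake_state
{
  void           *stack;
  int             status;
  unsigned short  nny;
  unsigned short  nCcalls;
};

/* Returns nonzero when both counters are zero.  Member access by name
   stays correct even if fields are added or reordered. */
static int counters_idle(const struct fake_state *s)
{
  return (s->nny == 0) && (s->nCcalls == 0);
}

/* Offsets as computed by the compiler for this particular build. */
static size_t nny_offset(void)
{
  return offsetof(struct fake_state,nny);
}

static size_t nCcalls_offset(void)
{
  return offsetof(struct fake_state,nCcalls);
}
```

  The values returned by nny_offset() and nCcalls_offset() depend on
pointer size and padding, which is exactly why numbers like 196 and 198
should not be written into the source by hand.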

  But overall, I think your approach with signals is not the way to go.

  -spc

[1]	memset() (and memmove()/memcpy() functions from string.h) were only
	added to the "safe to call from signal handler" list in 2016!  See
	the discussion here for just how crazy this stuff can get:

	https://news.ycombinator.com/item?id=13313563

[2]	There's a list of safe functions, about half way down the page, just
	prior to section 2.5.

	https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html

[3]	The code runs lots of coroutines.  When I created a coroutine, I
	added a call to the sample2() function below to record its execution
	(it saves the filename and either the function name, if one is
	available, or the line number, if it isn't).

	At the end of the program run, I called sample2_dump() to dump the
	information.  Then I ran "sort -rn sample2.txt" to see the resulting
	profile.

	local PROFINFO = {}
	
	function sample2(fun,freq)
	  local function hook()
	    local info = debug.getinfo(2,"nSl")
	    local key
	    local val
	
	    if not info.name or info.name == "" or info.name == "?" then
	      key = string.format("%s:%d",info.source,info.currentline)
	    else
	      key = string.format("%s:%s",info.source,info.name)
	    end
	
	    -- start at 0 so the first sample is counted as 1, not 2
	    val = PROFINFO[key] or 0
	    PROFINFO[key] = val + 1
	  end
	
	  debug.sethook(fun,hook,"",freq or 97)
	end
	
	function sample2_dump()
	  local f = io.open("sample2.txt","w") or io.stderr
	  for key,val in pairs(PROFINFO) do
	    f:write(string.format("%8d\t%s\n",val,key))
	  end
	  if f ~= io.stderr then f:close() end
	end