lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


When using lua_pcallk (from C) or luaB_xpcall (from lua), we might
specify a message handler to decorate the error thrown within the
function for protected call. And according to the lua manual, when
there is error thrown in error handler, we will end up receiving
LUA_ERRERR as execution status.

Take the following lua code as example:

  print(xpcall(function() error("1") end, function() error("2") end))

Printing out "false   error in error handling" as expected, we are
very likely to omit what is happening beneath its superficial
quiescent looking. We'll need a more transparent perspective into the
execution state of the lua program.

Attach a debugger like gdb to a running lua program, add breakpoint
on "luaD_throw" function, and execute the snippet of lua code above.
And then we will find out the LUA_ERRERR is thrown from a unexpectedly
deep stack, something like:

  #0  luaD_throw (L=0x55675137a2a8, errcode=5) at ldo.c:116
  #1  0x000055674f8195c5 in luaE_checkcstack (L=0x55675137a2a8) at lstate.c:169
  #2  0x000055674f80c971 in ccall (L=0x55675137a2a8, func=0x5567513c45e0, nResults=1, inc=65537) at ldo.c:606
  #3  0x000055674f80ca2e in luaD_callnoyield (L=0x55675137a2a8, func=0x5567513c45e0, nResults=1) at ldo.c:627
  #4  0x000055674f80ad9a in luaG_errormsg (L=0x55675137a2a8) at ldebug.c:813
  #5  0x000055674f8088a2 in lua_error (L=0x55675137a2a8) at lapi.c:1250
  #6  0x000055674f832223 in luaB_error (L=0x55675137a2a8) at lbaselib.c:122
  #7  0x000055674f80c488 in precallC (L=0x55675137a2a8, func=0x5567513c45c0, nresults=0, f=0x55674f83218e <luaB_error>) at ldo.c:506
  #8  0x000055674f80c7ab in luaD_precall (L=0x55675137a2a8, func=0x5567513c45c0, nresults=0) at ldo.c:572
  #9  0x000055674f827d4a in luaV_execute (L=0x55675137a2a8, ci=0x5567513bce20) at lvm.c:1636
  #10 0x000055674f80c9af in ccall (L=0x55675137a2a8, func=0x5567513c45b0, nResults=1, inc=65537) at ldo.c:609
  #11 0x000055674f80ca2e in luaD_callnoyield (L=0x55675137a2a8, func=0x5567513c45b0, nResults=1) at ldo.c:627
  #12 0x000055674f80ad9a in luaG_errormsg (L=0x55675137a2a8) at ldebug.c:813
  ...
  #1751 0x000055674f80409f in docall (L=0x55675137a2a8, narg=0, nres=-1) at lua.c:160
  #1752 0x000055674f804ef4 in doREPL (L=0x55675137a2a8) at lua.c:591
  #1753 0x000055674f805195 in pmain (L=0x55675137a2a8) at lua.c:641
  #1754 0x000055674f80c488 in precallC (L=0x55675137a2a8, func=0x55675137a910, nresults=1, f=0x55674f804f82 <pmain>) at ldo.c:506
  #1755 0x000055674f80c7ab in luaD_precall (L=0x55675137a2a8, func=0x55675137a910, nresults=1) at ldo.c:572
  #1756 0x000055674f80c987 in ccall (L=0x55675137a2a8, func=0x55675137a910, nResults=1, inc=65537) at ldo.c:607
  #1757 0x000055674f80ca2e in luaD_callnoyield (L=0x55675137a2a8, func=0x55675137a910, nResults=1) at ldo.c:627
  #1758 0x000055674f807dce in f_call (L=0x55675137a2a8, ud=0x7fff021773c0) at lapi.c:1041
  #1759 0x000055674f80b4f6 in luaD_rawrunprotected (L=0x55675137a2a8, f=0x55674f807d95 <f_call>, ud=0x7fff021773c0) at ldo.c:144
  #1760 0x000055674f80d325 in luaD_pcall (L=0x55675137a2a8, func=0x55674f807d95 <f_call>, u=0x7fff021773c0, old_top=16, ef=0) at ldo.c:926
  #1761 0x000055674f807ea1 in lua_pcallk (L=0x55675137a2a8, nargs=2, nresults=1, errfunc=0, ctx=0, k=0x0) at lapi.c:1067
  #1762 0x000055674f80526f in main (argc=1, argv=0x7fff02177548) at lua.c:660

It won't be hard to find that the LUA_ERRERR is thrown due to the
number of C-calls accumulated exceeds (LUAI_MAXCCALLS / 10 * 11),
which is checked in luaE_checkcstack. In the midst of the stack
trace, we're observing the repeating and recursive cycle of:

  luaG_errormsg
   -> luaD_callnoyield       (calling our message handler)
    -> ccall
     -> luaV_execute         (function() error("2") end is executed)
      -> luaD_precall        (invoking error("2"))
       -> precallC
        -> luaB_error
         -> lua_error
          (-> luaG_errormsg) (but wait? back to the origin!)

The essence of the problem is, message handler is asked to decorate
the error thrown inside itself, repeatedly and recursively. And we
are lucky enough because we are able to accumulate L->nCcalls per
cycle by 65537, and bailed out by c stack checker, for accumulating
L->nCcalls to a number as big as (LUAI_MAXCCALLS / 10 * 11).

Since the problem is rooted in the message handler asked to decorate
the error thrown inside itself recursively, I attempted to fix it by
asking nobody to decorate the error from a message handler, and
constantly transform error in message handlers into LUA_ERRERR, as is
specified in lua manual.

Execute the snippet of lua cde above again yields the same output,
but this time the stacktrace rooted from luaD_throw is much shallower:

  #0  luaD_throw (L=0x55ce945432a8, errcode=2) at ldo.c:116
  #1  0x000055ce94236e4e in luaG_errormsg (L=0x55ce945432a8) at ldebug.c:826
  #2  0x000055ce942348a2 in lua_error (L=0x55ce945432a8) at lapi.c:1250
  #3  0x000055ce9425e2c6 in luaB_error (L=0x55ce945432a8) at lbaselib.c:122
  #4  0x000055ce9423852b in precallC (L=0x55ce945432a8, func=0x55ce945439d0, nresults=0, f=0x55ce9425e231 <luaB_error>) at ldo.c:506
  #5  0x000055ce9423884e in luaD_precall (L=0x55ce945432a8, func=0x55ce945439d0, nresults=0) at ldo.c:572
  #6  0x000055ce94253ded in luaV_execute (L=0x55ce945432a8, ci=0x55ce94580e20) at lvm.c:1636
  #7  0x000055ce94238a52 in ccall (L=0x55ce945432a8, func=0x55ce945439c0, nResults=1, inc=65537) at ldo.c:609
  #8  0x000055ce94238ad1 in luaD_callnoyield (L=0x55ce945432a8, func=0x55ce945439c0, nResults=1) at ldo.c:627
  #9  0x000055ce94236cee in doerrfunc (L=0x55ce945432a8, ud=0x0) at ldebug.c:807
  #10 0x000055ce94237599 in luaD_rawrunprotected (L=0x55ce945432a8, f=0x55ce94236cba <doerrfunc>, ud=0x0) at ldo.c:144
  #11 0x000055ce942393c8 in luaD_pcall (L=0x55ce945432a8, func=0x55ce94236cba <doerrfunc>, u=0x0, old_top=192, ef=0) at ldo.c:926
  #12 0x000055ce94236de9 in luaG_errormsg (L=0x55ce945432a8) at ldebug.c:819
  ...
  #45 0x000055ce9423126f in main (argc=1, argv=0x7ffe18d6ea58) at lua.c:660

I've also attached the patch to the end of the mail, which can be
applied by "git-am" command conveniently. I believe it will bring
us to somewhere closer to the truth.

Signed-off-by: Haoran Luo <www@aegistudio.net>
---
 ldebug.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/ldebug.c b/ldebug.c
index a716d95e..31a4da3d 100644
--- a/ldebug.c
+++ b/ldebug.c
@@ -803,14 +803,25 @@ const char *luaG_addinfo (lua_State *L, const char *msg, TString *src,
 }
 
 
+static void doerrfunc(lua_State* L, void* ud) {
+  luaD_callnoyield(L, L->top - 2, 1);
+}
+
+
 l_noret luaG_errormsg (lua_State *L) {
+  int status;
   if (L->errfunc != 0) {  /* is there an error handling function? */
     StkId errfunc = restorestack(L, L->errfunc);
     lua_assert(ttisfunction(s2v(errfunc)));
     setobjs2s(L, L->top, L->top - 1);  /* move argument */
     setobjs2s(L, L->top - 1, errfunc);  /* push function */
     L->top++;  /* assume EXTRA_STACK */
-    luaD_callnoyield(L, L->top - 2, 1);  /* call it */
+    status = luaD_pcall(L, doerrfunc, NULL, savestack(L, L->top - 2), 0);
+    if (l_unlikely(status != LUA_OK)) { /* error in error handler */
+      L->top--; /* remove one vacant slot */
+      luaD_seterrorobj(L, LUA_ERRERR, L->top - 1); /* fix error object */
+      luaD_throw(L, LUA_ERRERR);
+    }
   }
   luaD_throw(L, LUA_ERRRUN);
 }
-- 
2.32.0