Quoth Roberto Ierusalimschy <roberto@inf.puc-rio.br>, on 2010-05-19 09:46:18 -0300:
> > [...]
> > 
> > In the absence of misunderstandings in the above, I would tend to
> > suggest that lua_callk, lua_pcallk, and lua_yieldk all return int,
> > require being called at the end of a lua_CFunction, and always invoke
> > the continuation function afterwards.  This would make them truly
> > consistent CPS flow operations.
[...]
> Another alternative would be for Lua itself to call the continuation
> every time, so that these functions never return. But then they could
> only be used in places where it is safe to yield, resulting in quite
> complex restrictions.

That was what I meant by "all require being called at the end of a
lua_CFunction" above.

I agree that doing CPS in C in this manner requires destroying the C
frames and that this can make things quite inconvenient.  However, I
do not think that the current behavior of the callk functions is
simpler.  The reason is that you must _already_ be prepared for your
stack frame to be destroyed, which is what incurs most of the C-level
complexity.  The difference is that with the current behavior you must
be prepared to receive control back at either of two places, depending
on whether there was a yield in the interim.  That does not seem like
a useful control distinguisher in 99% of the cases, so either you have
both paths flow to the same place anyway (risking bugs in the process
unless you carefully abstract it away) or you duplicate code (also
risking bugs).
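
To make the two shapes concrete, here is a toy model in plain C.  It
does not use the Lua API at all: toy_state, toy_callk, toy_resume and
the rest are names made up for illustration, and a "yield" is modelled
as an ordinary return value, whereas a real yield would unwind the C
frame.  The only point is to count how many landing sites the
post-call logic has under each scheme.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names belong to the Lua API. */
typedef struct toy_state toy_state;
typedef int (*toy_cont)(toy_state *S);

struct toy_state {
  int slot;            /* stands in for a value kept on the Lua stack */
  bool callee_yields;  /* does the called function yield this time? */
  toy_cont pending;    /* continuation recorded for a later resume */
  int direct_hits;     /* post-call code reached by a plain C return */
  int resumed_hits;    /* post-call code reached through a continuation */
};

/* Post-call work; every path has to end up here. */
static int finish (toy_state *S) { return S->slot + 1; }

static int after_call (toy_state *S) {
  S->resumed_hits++;
  return finish(S);
}

/* Dual-flow call: true means "returned normally", false means "yielded".
   (Assumption: a real yield would unwind the C frame, not return.) */
static bool toy_callk (toy_state *S, toy_cont k) {
  S->slot *= 2;                      /* the callee's result */
  if (S->callee_yields) { S->pending = k; return false; }
  return true;
}

static int dual_entry (toy_state *S) {
  if (!toy_callk(S, after_call))
    return -1;                       /* frame abandoned; after_call finishes */
  S->direct_hits++;                  /* second landing site, same logic */
  return finish(S);
}

/* Always-CPS call: only the continuation ever runs afterwards. */
static void cps_entry (toy_state *S) {
  S->slot *= 2;                      /* the callee's result */
  S->pending = after_call;
}

/* Stand-in for the core resuming a suspended C function. */
static int toy_resume (toy_state *S) {
  toy_cont k = S->pending;
  assert(k != NULL);
  S->pending = NULL;
  return k(S);
}

int run_dual (toy_state *S) {
  int r = dual_entry(S);
  return (S->pending != NULL) ? toy_resume(S) : r;
}

int run_cps (toy_state *S) {
  cps_entry(S);
  return toy_resume(S);
}
```

Starting from slot = 5, both run_dual and run_cps produce 11 whether or
not the callee yields.  In the dual version, which of direct_hits and
resumed_hits gets bumped depends on the yield.  In the CPS version only
resumed_hits ever does, so there is a single place to get right.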

As an example, see function foreachi in ltablib.c (some lines omitted
for concision):

    static int foreachi (lua_State *L) {
      int n = aux_getn(L, 1);
      int i;
      if (lua_getctx(L, &i) == LUA_YIELD) goto poscall; /* ... */
      for (i = 1; i <= n; i++) { /* loop body ... */
        lua_callk(L, 2, 1, i, foreachi);
        poscall:
        if (!lua_isnil(L, -1))
          return 1;
        lua_pop(L, 1);  /* remove nil result */
      }
      return 0;
    }

Observe the interesting lua_getctx-dependent control flow near the
beginning.  It is there because the lua_callk might return normally, or
(after a yield) control might come back in at the top of foreachi
instead.  Exhibit B, function pcallcont in lbaselib.c:

    static int pcallcont (lua_State *L) {
      int errfunc;  /* call has an error function in bottom of the stack */
      int status = lua_getctx(L, &errfunc);
      lua_assert(status != LUA_OK);
      lua_pushboolean(L, (status == LUA_YIELD));  /* first result (status) */
      if (errfunc)  /* came from xpcall? */
        lua_replace(L, 1);  /* put first result in place of error function */
      else  /* came from pcall */
        lua_insert(L, 1);  /* open space for first result */
      return lua_gettop(L);
    }

This very nearly duplicates two pieces of code from the functions
luaB_pcall and luaB_xpcall below, mainly the lua_replace and
lua_insert.  (The "came from xpcall" stuff is a red herring; the
function could be split into one for pcall and one for xpcall, but
there would still likely either be duplication or an emulation of
callk-is-always-CPS semantics.)

The main reason I can think of to have the dual control flow is
performance, in that in the non-yielding case you don't have to tear
down and rebuild the C stackframe, but this seems kind of like a
microöptimization at the expense of semantic clarity.  A C function
that is keeping significant state in auto variables will already have
to stash it away in the Lua stack before the callk; this cannot be
done conditionally because there is no way to determine whether a
yield will happen before it happens, and then your stackframe is
already gone.  I would expect that to heavily constrain the amount of
performance improvement that could be gotten from the dual control
flow.  Maybe it makes more sense in C++?

The foreachi example would see an improvement from the dual control
flow if aux_getn were very expensive, since the non-yielding path
avoids calling it again.  I would tend to put the number on the Lua
stack and then write foreachi_cont in pure CPS, but I'm strange.  :-)
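
A rough sketch of what I mean by the pure-CPS shape, again as a toy in
plain C rather than against the real API.  Everything here (fi_vm,
fi_run, fi_foreachi, fi_foreachi_cont, fi_first_over_ten) is a
hypothetical name.  The fi_vm struct stands in for the Lua stack
frame, holding n and i so that no C auto variable has to survive a
call, and fi_run stands in for the core, which always re-enters the C
side through the continuation.

```c
#include <assert.h>
#include <stddef.h>

/* What a C-side step asks the driver to do next. */
enum { FI_DONE, FI_CALL };

/* Toy stand-in for the Lua stack frame of foreachi. */
typedef struct fi_vm {
  const int *items;                       /* the "table", 1-based below */
  int n;                                  /* length, obtained once, kept here */
  int i;                                  /* loop index, kept here too */
  int (*callback)(int index, int value);  /* nonzero result stops the loop */
  int last;                               /* result of the latest callback */
  int found;                              /* final result, 0 if none */
  int c_entries;                          /* times the C side was entered */
} fi_vm;

/* Continuation: the only code that ever runs after a callback. */
static int fi_foreachi_cont (fi_vm *vm) {
  vm->c_entries++;
  if (vm->last != 0) { vm->found = vm->last; return FI_DONE; }
  if (++vm->i > vm->n) return FI_DONE;
  return FI_CALL;                         /* ask for the next callback */
}

/* Entry point: sets up state, then hands off like the continuation. */
static int fi_foreachi (fi_vm *vm) {
  vm->c_entries++;
  vm->i = 1;
  vm->found = 0;
  return (vm->i <= vm->n) ? FI_CALL : FI_DONE;
}

/* Driver standing in for the core: call, then always continue. */
int fi_run (fi_vm *vm) {
  int next = fi_foreachi(vm);
  while (next == FI_CALL) {
    vm->last = vm->callback(vm->i, vm->items[vm->i - 1]);
    next = fi_foreachi_cont(vm);
  }
  return vm->found;
}

/* Sample callback: returns the first value above ten, else 0. */
int fi_first_over_ten (int index, int value) {
  (void)index;
  return (value > 10) ? value : 0;
}
```

With items {3, 7, 12, 20} and fi_first_over_ten as the callback,
fi_run returns 12 after entering the C side four times: once for the
entry point and once per callback result.  Whether a callback "yielded"
never shows up in the control flow, because there is no second place
for control to come back to.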

As for the foreachcont example, we have:

    static int foreachcont (lua_State *L) {
      for (;;) { /* loop body ... */
        lua_callk(L, 2, 1, 0, foreachcont);
      }
    }

and that could be trivially replaced with a yield-style pure-CPS
return, suffering no degradation except for the Lua core having to
restart the C function.  Note that the Lua stackframe for the C
function already has to continue to exist for callk to work, so that
doesn't have to be totally rebuilt (which would be more expensive);
only the C function call proper has to be restarted.

The tradeoff would seem to depend heavily on how callk is going to be
used.  That is not something I can judge very well in a broader scope
by myself, so I'm asking the list folks to weigh in if they think
differently.  :-)

> -- Roberto

   ---> Drake Wilson