Improved Coroutines Patch
Here is the patch: Files:wiki_insecure/power_patches/5.0/ejcoro.patch
What follows here is a poorly Wikified rendition of the readme file...
-Eric Jacobs
This patch uses a slightly different strategy to allow yielding than Lua does in the standard case (Lua functions calling Lua functions). In the standard case, Lua maintains a single set of C stack frames for however many Lua frames (CallInfo entries) are invoked using normal function calls. The bottommost of these C stack frames is the main loop of the interpreter, luaV_execute(). Calls from Lua to Lua are accomplished without reentering luaV_execute(); instead, the new Lua frame is added to or removed from the stack and a "goto" is executed to restart the luaV_execute() C frame at an appropriate point.
                      /<+------------+
                     /  |  luaFunc3  |
                    /   |            |
                   /    +------------+
                  /     |  luaFunc2  |
                 /      | CI_CALLING |
                /       +------------+
+--------------+  1:n   |  luaFunc1  |
| luaV_execute |        | CI_CALLING |
+--------------+<======>+------------+
    C stack             Lua CallInfo's

This figure illustrates how the C stack and the Lua CallInfo frame stack line up for the case of Lua functions calling Lua functions. A single C frame of luaV_execute() maps to any number of Lua frames. Note that the CI_CALLING flag is set in the functions which have made calls this way.
When a metamethod or C function calls back into standard Lua, the luaV_execute() function is reentered, and yields are prohibited.
                                  +------------+
                                  |  luaFunc2  |
          +--------------+        |            |
          | luaV_execute |<======>|            |
..........+--------------+........+------------+
          |  callTMxxx   |        |  luaFunc1  |
          +--------------+        |            |
          | luaV_execute |<======>|            |
          +--------------+        +------------+
              C stack             Lua CallInfo's

                                  +------------+
                                  |  luaFunc2  |
          +--------------+        |            |
          | luaV_execute |<======>|            |
..........+--------------+........+------------+
          |    cFunc     |        |  cFunc    C|
          +--------------+        +------------+
          | luaV_execute |<======>|  luaFunc1  |
          +--------------+        +------------+
              C stack             Lua CallInfo's

These figures show how the C stack and Lua CallInfo's line up for the case of a metamethod (first figure) or C function (second figure) that calls Lua code. The CI_CALLING flag is not set in this case. The dotted line represents how the C stack is divided into two sets of C frames, one for each of the Lua CallInfo's. This line is exactly the boundary that standard Lua won't let you yield across.
One strategy for removing this limitation for metamethods would be to make metamethod calls behave similarly to normal function calls for Lua; i.e., they would not reenter luaV_execute() but simply "goto" to the start of the luaV_execute() C frame that is already running. While possibly feasible for pure Lua metamethods, this approach is not satisfactory for use with C functions for the following two reasons:
1. C functions often store local variables on the C stack. In order to maintain a single instance of luaV_execute(), those variables would have to be moved off the C stack to somewhere else (e.g., the Lua stack) in order to let the C stack unwind. This is not an issue for Lua functions because Lua functions store their local variables on the Lua stack anyway.
2. For C functions, it would be mandatory to unwind the stack in order to call back into Lua. This is an undesirable requirement for C functions. Although a C function must be able to do so in order to take full advantage of coroutines, it is not necessary that it actually do it every time it wants to call back into Lua. If the called code never yields, it's simply a waste of CPU time to transfer the variables back and forth.
This patch solves these problems by using an optimistic strategy. When a call is made from a C function into Lua code, it is first assumed that the Lua code will not yield, and the C stack is built up recursively just as in standard Lua. If the called function completes without yielding, the situation is exactly as it is in standard Lua. But suppose that the Lua code instead yields:
          +-----------------+        +------------+
      /---| coroutine.yield |        |  luaFunc2  |
      |   +-----------------+        |            |
-1    |   |  luaV_execute   |<======>|            | <-
Yield |   +-----------------+        +------------+   \
return|   |      cFunc      |        |  cFunc    C| <-- CI_CALLING flag becomes set
value |   +-----------------+        +------------+   /
      |   |  luaV_execute   |<======>|  luaFunc1  | <-
      v   +-----------------+        +------------+
                C stack              Lua CallInfo's

The yield function generates a return value of -1. At this point, the interpreter knows that the optimistic strategy won't work, and begins to unwind the C stack. As the yield return value is propagated downwards, each C frame is required to either save its state in the corresponding Lua frame(s), set the CI_CALLING flag, and return -1, or else throw an error saying that the yield can't be completed. It is at this point that the error regarding an illegal yield will occur.
If the yield is successful, then the C stack is back to its unwound state, and the Lua stack contains all of the state information. By waiting until the -1 yield return value appears before asking C functions to save state, we optimize for the case that a yield does not occur, while giving C functions the opportunity to handle it if it does occur.
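The optimistic strategy can be modelled with a small toy program. This is only a sketch and does not use the patch's real data structures: ToyFrame, ToyThread, toy_call() and YIELDED are hypothetical names invented for illustration. Each recursive call stands for one C frame; state is copied into the matching frame record only when -1 comes back from the call above.

```c
#include <assert.h>

#define YIELDED   (-1)  /* models the -1 yield return value */
#define MAX_DEPTH 8

/* Hypothetical stand-in for the part of a CallInfo that matters here. */
typedef struct ToyFrame {
    int calling;      /* models the CI_CALLING flag */
    int saved_local;  /* models state saved into the Lua frame */
} ToyFrame;

typedef struct ToyThread {
    ToyFrame frames[MAX_DEPTH];
    int leaf_yields;  /* non-zero: the innermost call yields */
} ToyThread;

/* Models one C frame at `level` calling into nested code down to `depth`.
 * Returns a result count, or YIELDED if the nested code yielded. */
static int toy_call(ToyThread *t, int level, int depth) {
    int local = 100 + level;  /* a local that lives only on the C stack */
    int r;

    assert(depth < MAX_DEPTH);
    if (level == depth)       /* innermost code: either yields or returns */
        return t->leaf_yields ? YIELDED : 1;

    /* Optimistic path: plain recursion, nothing is saved up front. */
    r = toy_call(t, level + 1, depth);

    if (r == YIELDED) {
        /* Only now pay the cost: save state, flag the frame, unwind. */
        t->frames[level].saved_local = local;
        t->frames[level].calling = 1;
        return YIELDED;
    }
    return r;                 /* no yield: same as standard behaviour */
}
```

When the innermost call does not yield, no frame record is touched, which is the case the patch optimizes for.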
When the coroutine is resumed, the top CallInfo is examined just as in current Lua. Because the stack has successfully unwound, the CI_CALLING flag is set. This indicates that the code to complete the function call (the call to luaD_poscall) isn't waiting on the C continuation, and we need to figure out how to complete the function call and resume the function based on the CallInfo state. For Lua functions, this involves looking at the opcode of the instruction at (ci->savedpc - 1). There is a switch in a new function called luaV_return() that does this. For a C function, it involves calling the C function again to ask it to resume itself. There is a new user-defined integer/pointer union field in the CallInfo structure that can be used by C functions in order to do this. This is also handled by luaV_return().
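The shape of that decision can be sketched as follows. This is not the code of luaV_return(): ToyOp, ToyAction, ToyCallInfo and toy_return_action() are hypothetical names, only three opcodes are modelled, and what each action would actually do is an assumption. The sketch only shows the dispatch on the instruction just before savedpc.

```c
#include <assert.h>

/* Hypothetical, reduced opcode set. */
typedef enum { TOY_OP_CALL, TOY_OP_GETTABLE, TOY_OP_ADD } ToyOp;

/* Hypothetical names for the ways a pending call can be completed. */
typedef enum {
    FINISH_CALL,        /* complete an ordinary function call */
    FINISH_METAMETHOD,  /* complete an opcode that ran a metamethod */
    REINVOKE_C          /* call the C function again so it resumes itself */
} ToyAction;

typedef struct ToyCallInfo {
    int is_c;              /* non-zero for a C function frame */
    const ToyOp *savedpc;  /* Lua frames: next instruction to execute */
} ToyCallInfo;

/* Decide how to complete the call that was pending when the yield occurred. */
static ToyAction toy_return_action(const ToyCallInfo *ci) {
    if (ci->is_c)
        return REINVOKE_C;
    switch (*(ci->savedpc - 1)) {  /* the instruction that made the call */
        case TOY_OP_CALL:
            return FINISH_CALL;
        case TOY_OP_GETTABLE:
        case TOY_OP_ADD:
            return FINISH_METAMETHOD;
    }
    assert(0 && "unexpected opcode");
    return FINISH_CALL;
}
```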
Here's a summary of what the patch changes:
The error message indicates which API call was used that could not handle a yield.
int lua_call_yp (lua_State *L, int nargs, int nresults, int tailcall);
This function has the same meaning as lua_call(), except that it will not prevent the called code from yielding. If a yield occurs, the return value is -1. Otherwise, the return value is the actual number of results.
The tailcall parameter should be set to 1 if this call is the last operation that the C function is to do. In this case, if the code yields, when it resumes, the results returned by the called function will become the results of the C function. In this case, the API can be called simply like this:
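A minimal sketch of such a tail call follows. Only the signature of lua_call_yp() is taken from above. The lua_State structure, lua_gettop() and the body of lua_call_yp() are toy stand-ins so that the sketch compiles on its own, apply_tail() is a hypothetical function name, and passing LUA_MULTRET as nresults is an assumption.

```c
#include <assert.h>

#define LUA_MULTRET (-1)  /* same value as in lua.h */

/* ---- Toy stand-ins, NOT the real Lua implementation ---- */
typedef struct lua_State {
    int top;         /* number of values on the toy stack */
    int will_yield;  /* non-zero: the called code yields */
    int nresults;    /* results the called code would return */
} lua_State;

static int lua_gettop(lua_State *L) { return L->top; }

/* Signature as given in the readme; the body is a toy. */
static int lua_call_yp(lua_State *L, int nargs, int nresults, int tailcall) {
    assert(nargs + 1 <= L->top);  /* function and arguments must be present */
    (void)nresults;
    (void)tailcall;
    if (L->will_yield)
        return -1;                /* the called code yielded */
    L->top = L->top - (nargs + 1) + L->nresults;
    return L->nresults;           /* actual number of results */
}

/* ---- The C function itself (hypothetical example) ---- */
/* Calls the function at stack index 1 with all remaining values as
 * arguments. Because this is the last thing the function does, the value
 * from lua_call_yp() can be returned directly: either the number of
 * results, or -1 to pass the yield further down the C stack. */
static int apply_tail(lua_State *L) {
    int nargs = lua_gettop(L) - 1;
    return lua_call_yp(L, nargs, LUA_MULTRET, 1);
}
```

No state needs to be saved in this form, since the C function has nothing left to do after the call.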
If the tailcall parameter is set to 0, and the code yields, when it resumes, the C function will be reinvoked. lua_call_yp() returns -1 in this case, and the C function will need to save its local variables and other state so that it can reconstruct them when it is reinvoked. It may use the Lua stack for this purpose; however, it _may not push_ values onto the Lua stack, as the called function(s) have taken the space directly above the stack.
Probably the easiest way to do this for a lot of C functions is to reserve a couple of slots in the stack which can be filled in with numbers or perhaps userdata (with a __gc metamethod set, if necessary for cleanup).
When the C function is reinvoked, the Lua stack is how it would be had the original lua_call_yp() succeeded, and it is legal to push values on the stack again.
void *lua_get_frame_state (lua_State *L);
This function returns a pointer to a variable in the Lua stack frame which may be used by C functions to retain state across reinvocations of the C function. The return value may be cast to an int * or void **. When the C function is first invoked, the value of the frame state is always 0. Prior to returning -1 in case of an API call that yielded, the C function should set the frame state to a non-zero value, so that it will recognize that it is being resumed the next time it is called.
Possible uses for the frame state value are: a numeric counter for simple loops, the stack index of stored variables, or a pointer to userdata (the userdata must of course be stored in the stack somewhere to prevent it from being collected).
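As a sketch of the "numeric counter for simple loops" use, the following C function calls a callback once for each integer from 1 to n and can be reinvoked after a yield. Only the signatures of lua_call_yp() and lua_get_frame_state() come from the text above. The lua_State structure and the bodies of every lua_* function are toy stand-ins so that the sketch runs on its own, and each_upto() is a hypothetical name. The toy lua_call_yp() applies the callback's effect immediately and merely reports a yield, which simplifies what the patch really does.

```c
#include <assert.h>

/* ---- Toy stand-ins, NOT the real Lua implementation ---- */
typedef struct lua_State {
    double stack[16];   /* toy value stack, numbers only */
    int top;            /* number of values on the toy stack */
    int frame_state;    /* models the per-frame state, initially 0 */
    int yield_on_call;  /* 1-based call that should yield; 0 = never */
    int calls;          /* callback calls started so far */
    double sum;         /* the toy callback adds up its argument */
} lua_State;

static double lua_tonumber(lua_State *L, int idx) { return L->stack[idx - 1]; }
static void lua_pushnumber(lua_State *L, double n) { L->stack[L->top++] = n; }
static void lua_pushvalue(lua_State *L, int idx) {
    L->stack[L->top] = L->stack[idx - 1];
    L->top++;
}

/* Signatures as given in the readme; the bodies are toys. */
static void *lua_get_frame_state(lua_State *L) { return &L->frame_state; }

static int lua_call_yp(lua_State *L, int nargs, int nresults, int tailcall) {
    (void)tailcall;
    assert(nargs == 1 && nresults == 0);
    L->sum += L->stack[L->top - 1];  /* "run" the callback */
    L->top -= nargs + 1;             /* pop the function and its argument */
    L->calls++;
    return (L->calls == L->yield_on_call) ? -1 : nresults;
}

/* ---- The C function itself (hypothetical example) ---- */
/* Stack on entry: callback at index 1, n at index 2. */
static int each_upto(lua_State *L) {
    int *state = (int *)lua_get_frame_state(L);
    int n = (int)lua_tonumber(L, 2);
    /* *state is 0 on the first invocation; after a yield it holds the
     * index whose call yielded, and that call has completed by now. */
    int i = *state + 1;

    for (; i <= n; i++) {
        lua_pushvalue(L, 1);
        lua_pushnumber(L, (double)i);
        if (lua_call_yp(L, 1, 0, 0) == -1) {
            *state = i;  /* non-zero, so the reinvocation is recognized */
            return -1;   /* pass the yield down; push nothing more */
        }
    }
    return 0;            /* no results */
}
```

Storing the loop index itself keeps the frame state non-zero whenever a yield has happened, so no separate "resumed" flag is needed.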
table.foreachi() table.foreach() tostring() print() dofile() string.gsub()
Consequently, the callback functions invoked from these functions are able to yield.
These opcodes are:
OP_CALL OP_TAILCALL OP_GETTABLE OP_GETGLOBAL OP_SELF OP_ADD OP_SUB OP_MUL OP_DIV OP_POW OP_CONCAT OP_UNM OP_LT OP_LE OP_EQ OP_SETTABLE OP_SETGLOBAL OP_TFORLOOP
luaV_return() is called from luaV_execute() during the execution of OP_RETURN, and from resume() when a coroutine is resumed.
0 = There exists an active C stack frame for this CallInfo,
The CI_CALLING flag is set -
in lvm.c, during execution of OP_CALL and OP_TAILCALL, when
in lvm.c, during execution of opcodes which call a metamethod
TODO:
lua_gettable_yp() lua_settable_yp() ... etc
Tests:
Currently in testmeta.lua. More needed.