Re: Lua and preemptive scheduling for coroutines

lua-l archive
[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]
Subject: Re: Lua and preemptive scheduling for coroutines
From: Leo Romanoff <romixlev@...>
Date: Sun, 31 Mar 2013 01:50:36 -0700 (PDT)
----- Ursprüngliche Message -----

> Von: Leo Romanoff <romixlev@yahoo.com>
> An: "lua-l@lists.lua.org" <lua-l@lists.lua.org>
> CC: 
> Gesendet: 4:20 Freitag, 29.März 2013
> Betreff: Lua and preemptive scheduling for coroutines
> 
> Hi,
> 
> I'm a newbie to Lua and find the language and its implementation very 
> interesting. In particular, I'm very interested in coroutines, because I 
> think they can be nicely used to implement something like Erlang's actors or 
> Akka's actors.
> 
> I did a few tests to see what is possible with coroutines and what kind of 
> overhead they introduce. So far I was able to create about 4 millions of 
> coroutines on my development machine, which is not bad. The memory overhead per 
> coroutine seems to be about 400 bytes for LuaJIT and a bit more for Lua.
> 
> But now I have a usual question, which people ask when they start playing with 
> coroutines - how can one implement the preemptive scheduling???
> 
> Here is a list of my requirements:
> 
> 1) I'd like to be able to introduce preemptive multitasking, i.e. 
> transparently taking control from one coroutine and giving it to the other 
> coroutine when a certain event happens. The event can be an external event or a 
> timeout or may be a certain number of instructions executed. I don't like 
> coroutines to explicitly call yield(). I'd like to "inject" it 
> upon receiving such an event.
> 
> 2) I'd like all this to be done in Lua, i.e. my coroutines are defined and 
> implemented in Lua code. The scheduler or some of the hooks can be implemented 
> in C, as a library that I use from Lua. But I'd like to avoid creating all 
> the coroutines in C or doing any new development in C every time I write an 
> application using preemptive coroutines. 
> 
> Of course, the most obvious way is to try achieving the goal by means of a 
> "count" debug hook. And of course, when I try it out, I immediately 
> run into the well-known "attempt to yield across metamethod/C-call 
> boundary" error...
> 
> I searched through the mailing list archives and I've seen that lua_yield() 
> and lua_resume() in the C code could solve the preemption problem. But as far as 
> I understand, it would require the inversion of control, so to say. I.e. my main 
> code would be in C and this C code would create coroutines and perform 
> scheduling. But this contradicts my requirement number 2. I'd really like to 
> hide/eliminate the C part completely, if possible.
> 
> So, I'd like to ask the question again:
> - Is it somehow possible to achieve both (1) and (2) at the same time? 
> 
> - As far as I understand, debug hooks are called not as usual Lua functions, but 
> as C functions and this is the reason why coroutine.yield cannot be invoked from 
> them. Is it correct understanding? But what if Lua could do something like this: 
> debug_hook can set a certain flag, which is known to the Lua VM. After the hook 
> returns (i.e. no C stack frames are active any more), the VM checks this flag 
> before continuing execution of the Lua code as it usually does after hook 
> invocations. If this flag is set, then Lua VM would simply execute 
> coroutine.yield() in the context of a current coroutine, as if it is a next 
> instruction in this coroutine. Would it work at all? Or may be there are some 
> very subtle details, which I overlook? 
> 
> - Another idea could be to generalize the previous paragraph in the following 
> way: Imagine, that when you define a debug hook you could optionally provide 
> pre-processing and post-processing actions for a hook. Such actions are usual 
> Lua functions. They are supposed to be executed before/after the hook is invoked 
> in the context of the currently executed coroutine. If this would be possible, 
> then my proposal from the previous paragraph would be as easy to implement as:
> 
> -- Executed on the C-stack
> local function debug_hook() 
>    do_yield_flag = 1
> end
> 
> -- Executed in the scope of current coroutine
> local function post_debug_hook()
>    if (do_yield_flag) then coroutine.yield() end
> end
> 
> local function pre_debug_hook()
> end
> 
> 
> -- Extended sethook API: pre and post hooks can be specified. 
> -- They will be executed in the scope of the currently executed coroutine, when 
> the debug hook is to be invoked  
> debug.sethook(mycoroutine, debug_hook, pre_debug_hook, post_debug_hook, 
>  "count", 1000)
> 
> 
> BTW, with this solution the Lua VM does not even need to know about this special 
> flag. It simply needs to execute a post-processing hook, if it was specified.

I'd like to report about a progress on this topic:

- I implemented the solution with pre and post hooks that I proposed above. To do this, I tweaked a bit the Lua VM. Specifically, I tweaked the place where hooks are invoked currently and added a code to invoke a pre-hook and a post-hook, if they are provided. Pre and post hooks are just normal Lua functions and they are invoked by Lua VM as Lua functions, as if they would be directly called from the user code. (BTW, this approach could be generalized to dynamically invoke Lua functions at any place in the user-code without changing the generated bytecodes and injecting any CALL instructions into it. This can be pretty interesting for implementing something like debuggers, profilers, aspect-oriented programming, etc)

- The implementation is very small and introduces just minor changes into the Lua VM, IMHO. 

- I compared the speed of this solution with a speed of a solution, where coroutines would explicitly call yield every N operations of a tight loop.  So far, the pre- and post-hook based solution seems to outperform the explicitly yielding solution by a factor of 2 on my tests, which is not bad ;-) I was actually surprised to see that it makes things faster and not slower. I think the reason for the explicitly yielding solution to be slow is that it needs to check on each loop iteration if a certain flag is set. And this check is a Lua code, i.e. it needs to be interpreted/executed by the Lua VM. The hooks based solution does perform similar checks in C, which is very efficient.

- I don't post the implementation yet, as I want to clean up it a bit first. I also would like to see if I can do a similar thing for LuaJIT.

- TODOs: 
- I need to test that I don't break any existing code with my Lua VM modifications. Are there any test suits/unit tests for Lua that I could use for it?
- I need to check what would happen in a solution where Lua and C calls are mixed together
- My solution seems to swallow and not propagate errors that happen in the prehooks or posthooks. This is not such a big problem, but may be I could fix it
- My solution may have some problems with the way how LuaVM handles RETURN bytecode instructions. If assertions are enabled, it may run into problems, because Lua VM checks that instruction which invoked current frame was a CALL instruction. But since my code just mimics the CALL of a Lua function without having a real CALL instruction in the user bytecode,  such a check would fail.

- And finally, here is some code to see how the code using this solution looks like:

local posthookCalls = 0
local prehookCalls = 0
local explicitYieldCalls = 0

local function hook() 

end



local function posthook()
    posthookCalls = posthookCalls + 1

    coroutine.yield()
end


local function prehook()
    prehookCalls = prehookCalls + 1

    coroutine.yield()
end



local MAX_ITER = 100000000

local function task1()
local sum = 0
for i=1, MAX_ITER do
    sum = sum + i

end
print("Sum1=", sum)
end

local function task2()
local sum = 0
for i=1,MAX_ITER do
    sum = sum + i

end
print("Sum2=", sum)
end


-- Test implicit cooperative multitasking
local coro1 = coroutine.create(task1)
local coro2 = coroutine.create(task2)


-- New special functions for setting prehooks and posthooks
--print ("Prehook coro1:", debug.getprehook(coro1))
--print ("Prehook coro2:", debug.getprehook(coro2))
--print ("Posthook coro1:", debug.getposthook(coro1))
--print ("Postook coro2:", debug.getposthook(coro2))
--debug.setprehook(coro1, prehook)
--debug.setprehook(coro2, prehook)
-- Register post-hooks, which should perform implicit yielding
debug.setposthook(coro1, posthook)
debug.setposthook(coro2, posthook)
--print ("Prehook coro1:", debug.getprehook(coro1))
--print ("Prehook coro2:", debug.getprehook(coro2))
--print ("Posthook coro1:", debug.getposthook(coro1))
--print ("Posthook coro2:", debug.getposthook(coro2))

-- Invoke the hook every 200 instructions for coro2
debug.sethook(coro2, hook, "", 200)

-- Invoke the hook every 100 instructions for coro1
debug.sethook(coro1, hook, "", 100)


while(coroutine.status(coro1) == "suspended" or coroutine.status(coro2) == "suspended") do
   coroutine.resume(coro1)
   coroutine.resume(coro2)
end

print ("Posthook calls using yield = ", posthookCalls)
print ("Prehook calls using yield = ", prehookCalls)


As you can see, the code is pretty simple and can be easily generalized to create a preemptive scheduler in pure Lua without any need for C code to achieve this goal.
  
-Leo
Follow-Ups:
- Re: Lua and preemptive scheduling for coroutines, Jose Luis Hidalgo
- Re: Lua and preemptive scheduling for coroutines, Thomas Jericke
References:
- Lua and preemptive scheduling for coroutines, Leo Romanoff
Prev by Date: Re: arithmetics bug?
Next by Date: Some questions on Lua's design
Previous by thread: Re: Lua and preemptive scheduling for coroutines
Next by thread: Re: Lua and preemptive scheduling for coroutines
Index(es):
- Date
- Thread