I'm trying out a concurrency approach.[1]

I was wondering about threads and garbage. That is, if I keep calling `create(fun)`, will it keep making new threads even after prior threads have died? I assumed 'yes' and then wrote a library to create a thread pool.

I had wondered whether there was an indirection in Lua's implementation that made this unnecessary. That is, maybe Lua holds onto a dead thread and hands it back for use with the new function, reducing the amount of potential garbage at the expense of some bookkeeping.


To test this, I made a benchmark that creates a thread, resumes it, asserts that the thread it resumed has died, and then repeats for `count` iterations:

```
local printf = function(str, ...) print(str:format(...)) end
local create, yield, resume =
    coroutine.create, coroutine.yield, coroutine.resume

local isyieldable, status =
    coroutine.isyieldable, coroutine.status

local f = function(i)
    return math.random(os.time() + (i or 1))
end

-- Calls f(i) `count` times with the collector stopped and reports
-- how much memory (in KB, per collectgarbage("count")) piled up.
local test_garbage = function(name, f)
    local count = 2 ^ 12 // 1

    collectgarbage("collect")
    local start_gc = collectgarbage("count")
    collectgarbage("stop")
    for i = 1, count do
        f(i)
    end
    local stop_gc = collectgarbage("count")
    collectgarbage("restart")
    collectgarbage("collect")

    local bytes_per_thread = (stop_gc - start_gc) * 1024 / count

    -- collectgarbage("count") is fractional, so floor it before passing it to %d
    printf("%s[%d]: start:%d stop:%d bpt:%f",
        name, count,
        math.floor(start_gc), math.floor(stop_gc), bytes_per_thread)
    return start_gc, stop_gc, bytes_per_thread
end

test_garbage("naked",
    function(i)
        local thrd = create(f)
        resume(thrd, i)
        assert(status(thrd) == "dead")
    end)
---> naked[4096]: start:210 stop:4178 bpt:992.000000
```
So this tells me that every call to create makes a new thread (roughly 992 bytes each in this run), even when prior threads are dead, as I suspected.
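The same point can be checked without the memory numbers. This is a minimal sketch using only the standard `coroutine` library; the names `work`, `a`, and `b` are illustrative and not part of the benchmark above.

```lua
-- A finished coroutine is never handed back by create, and it cannot be restarted.
local create, resume, status =
    coroutine.create, coroutine.resume, coroutine.status

local work = function(i) return i * 2 end  -- illustrative task body

local a = create(work)
assert(resume(a, 1))            -- runs to completion
assert(status(a) == "dead")

local b = create(work)          -- same function, created after `a` died
assert(a ~= b)                  -- a distinct, fresh thread object
assert(status(b) == "suspended")

-- Resuming the dead thread fails instead of running `work` again.
local ok, err = resume(a, 2)
assert(ok == false)
print(ok, err)
```

So the only way to reuse a thread is to keep it from dying in the first place, which is what the indirection below does.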

To see what the memory impact was, I wrote a library that creates that indirection. The relevant bits are something like:

```
local Threadex = {}; Threadex.__index = Threadex

local function rcv_task(task, ...)
    if type(task) == "function" then
        return task(...)
    elseif task == nil then
        return nil
    else
        error(("Expected a function or nil at #1. Received %s"):format(type(task)), 2)
    end
end

----> This next function is the indirection. It is what is encapsulated by the thread. Whatever it receives
----> is the coroutine (task), which runs through `rcv_task`, which will simply return nil if given nil. This could be extended
---> to include support for signaling, etc.
local function thread_main (task, ...)
    return thread_main(yield(rcv_task(task, ...)))
end

function Threadex:new_thread()
    local new_thread = setmetatable({}, Threadex)
    new_thread.__thread = create(thread_main)
    -- threadex_t is not defined in this excerpt; it is a helper from the rest of the library
    return threadex_t(new_thread)
end


local thread_pool = setmetatable({}, {__mode = "k"})

function Threadex:get_thread(t)
    --this stores ready-made threads weakly in the keys.
    local thread = next(thread_pool, t) --will be nil on first call
    --pool keys are Threadex objects; the coroutine itself lives in __thread
    local s = thread and status(thread.__thread) or nil
    if s == nil then
        thread = assert(self:new_thread())
        --type.assert is not standard Lua; it is a helper not shown in this excerpt
        type.assert(thread, "threadex")
        thread_pool[thread] = true
        return thread
    elseif s == "dead" then
        --purge dead thread
        thread_pool[thread] = nil
        return self:get_thread()
    elseif s == "suspended" then
        return thread
    else --the thread is in use, go to the next one...
        return self:get_thread(thread)
    end
end
---Run a function/coroutine in its own thread.
function Threadex:call(task, ...)
    --Get a recycled threadex thread, or a new one.
    --`get_thread` should not be called outside of `call`,
    --because threads in use would need to be iterated over.
    local thread = self:get_thread()

    ---> the resume function is omitted for brevity. It just resumes the thread (self.__thread)
    ---> and some other things that aren't relevant to this post.

    return thread:resume(task, ...)
end
```
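Since the excerpt above depends on pieces that are not shown, here is a self-contained sketch of the same recycling idea that runs on stock Lua 5.3. It is not the Threadex library: `worker_main`, `idle`, and `call` are hypothetical names, and it keeps idle workers in a plain (strong) list instead of a weak-keyed table.

```lua
-- Sketch of thread recycling: one long-lived coroutine runs many tasks.
local create, yield, resume, status =
    coroutine.create, coroutine.yield, coroutine.resume, coroutine.status

-- Runs a task, yields its results, then waits for the next task.
-- The tail call keeps the stack from growing across tasks.
local function worker_main(task, ...)
    return worker_main(yield(task(...)))
end

local idle = {}  -- hypothetical free list of suspended workers

-- Run `task` on an idle worker if there is one, else on a new one.
local function call(task, ...)
    local co = table.remove(idle) or create(worker_main)
    local results = table.pack(resume(co, task, ...))
    if status(co) == "suspended" then
        idle[#idle + 1] = co  -- task returned normally; worker is reusable
    end                       -- a worker killed by an error is simply dropped
    return table.unpack(results, 1, results.n)
end

local first = select(2, call(function(x) return x + 1 end, 41))
local worker = idle[1]
local second = select(2, call(function(x) return x * 2 end, 21))
assert(first == 42 and second == 42)
assert(idle[1] == worker and #idle == 1)  -- the same thread served both calls
```

One limitation of this sketch: a task that itself yields would look like a finished task to `call`, so supporting that would need extra bookkeeping (for example, a sentinel value yielded by `worker_main`).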

So, testing it revealed:

```
local Threadex = require'threadex'

test_garbage("cached",
    function(i)
        Threadex:call(f, i)
    end)
--->cached[4096]: start:257 stop:260 bpt:0.775391
```

Obviously this toy example does nothing useful, except to show that recycling a thread, by having its main function run each task as an ordinary function call, has a big impact on memory (about 992 bytes per call without the pool versus under 1 byte with it), specifically if you make a lot of short-lived threads. [2]


[3]

--Andrew


[1] I love the libraries that are already out there. I need to understand concurrency and its effects more deeply before I can use an abstraction that makes it "easy" for me, so I'm writing my own, for now.

[2] There are many examples of this, but the one in front of me is when I want to wrap the "return false until you have a result" style functions in a coroutine that yields to the parent thread until a result comes in and then exits with that result as the return value.
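As a rough illustration of the pattern in [2], here is a minimal sketch. The poller `poll` and the wrapper `await` are hypothetical stand-ins, not functions from any library.

```lua
-- Wrap a "returns false until ready" poller in a coroutine that yields
-- while waiting and returns the result once it arrives.
local create, yield, resume, status =
    coroutine.create, coroutine.yield, coroutine.resume, coroutine.status

-- Hypothetical poller: reports false twice, then a result.
local polls = 0
local function poll()
    polls = polls + 1
    if polls < 3 then return false end
    return "ready after " .. polls .. " polls"
end

-- Keeps yielding to the parent until poll_fn returns something truthy.
local function await(poll_fn)
    while true do
        local result = poll_fn()
        if result then return result end
        yield()
    end
end

local co = create(await)
local ok, value = resume(co, poll)  -- the first resume passes the poller in
while status(co) ~= "dead" do
    -- the parent is free to do other work here between polls
    ok, value = resume(co)
end
print(ok, value)
```

Each such wrapper lives only until its result arrives, which is exactly the short-lived-thread case where recycling pays off.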

[3] Not only do I proclaim that many other people already know this, but I'm also guilty of not checking to see if there was a prior thread because sometimes recycling is good.[4]

[4] Pun intended.