- Subject: Re: another try at multithreading
- From: Asko Kauppi <askok@...>
- Date: Wed, 25 Jun 2008 22:42:07 +0300
Commenting on the plua.txt portion (below).
The style of programming you seem to be after is similar to what the
TBB (Intel Threading Building Blocks) C++ library offers. But it
offers more. It is built on the concept of splitting problems into
smaller chunks, until enough parallelism has been reached to fully
load all the cores of the CPU. I like it.
But I don't see much of that benefit coming from the syntax you
described. I recommend reading about TBB anyway, just to broaden the
way we think about parallelism. Here:
http://www.intel.com/cd/software/products/asmo-na/eng/294797.htm#overview
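For flavor, here is a tiny plain-Lua sketch of TBB's idea of splitting a
range until the pieces are small enough. It runs sequentially here; the two
recursive calls are where TBB would actually fork onto separate cores.
'GRAIN' and 'split_work' are made-up names, not part of TBB or Lanes:

```lua
-- Recursive range splitting, TBB style (sequential simulation).
local GRAIN = 4  -- assumed threshold: below this size, stop splitting

local function split_work(lo, hi, body)
  if hi - lo + 1 <= GRAIN then
    body(lo, hi)                   -- small enough: just run it
  else
    local mid = math.floor((lo + hi) / 2)
    split_work(lo, mid, body)      -- in TBB, these two calls
    split_work(mid + 1, hi, body)  -- could run in parallel
  end
end

local sum = 0
split_work(1, 10, function(lo, hi)
  for i = lo, hi do sum = sum + i end
end)
print(sum)  -- 55
```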
-asko
Augusto Radtke wrote on 25.6.2008 at 20:07:
Parallelism in Lua:
===================
parallel do
-- block, every chunk is dispatched in parallel, synchronization at the end.
end
What exactly should this do? If all the chunks get the same values,
they'd just do the same thing.
Looks like SIMD processing to me, but then the data sets should be
different somehow, like each of the do-chunks having a separate index
(1..n) or something.
Where do you state the value of N?
To do this in Lanes:
f= lanes.prepare( function(i)
-- block here, 'i' is the chunk's own index
end )
local h={}
for i=1,10 do
h[#h+1]= f(i) -- each chunk gets their index here
end
-- does not wait for finishing, reading 'h[1..10][0]' will do it
parallel while <condition> do
-- every while test dispatches a parallel chunk, test the while again
-- and when the condition is false the chunks are synchronized.
end
Again, what's the point if the chunks have the same instructions and
the same data? Either the instructions or the data need to differ.
parallel for k in t do
-- t is a table
-- for each key in table, dispatch in parallel, synchronization at the end.
end
Lanes (with same 'f' as above):
local h={}
for k,v in pairs(t) do
h[#h+1]= f(k,v)
end
lanes.wait( h ) -- Note: current Lanes does not wait like this for N threads, but it could
-- I would just leave them running, until their results are needed
parallel for k,v in t do
-- t is a table
-- for each key,value in table, dispatch in parallel, synchronization at the end.
end
What's the intended difference between "for k" and "for k,v" ?
parallel for i=1,10,2 do
-- i is a counter
-- 10 is the limit
-- 2 is the step
-- dispatch in parallel, synchronization at the end.
end
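Just to make the dispatch/synchronize semantics concrete, the
"parallel for" above could be pictured, sequentially, like this (plain
Lua, no threads; the "lanes" are only closures collected into a list):

```lua
-- Sketch, not real Lanes: collect one chunk per loop index, "dispatch"
-- them, then "synchronize" by draining the list. A real implementation
-- would run each chunk on its own thread and join here.
local chunks = {}
for i = 1, 10, 2 do
  chunks[#chunks + 1] = function() return i * i end  -- each chunk captures its own i
end

local results = {}
for n, chunk in ipairs(chunks) do   -- the implicit synchronization point
  results[n] = chunk()
end
-- i took the values 1,3,5,7,9
print(table.concat(results, ","))   -- 1,9,25,49,81
```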
-- this is a silly example, with no logic worth parallelizing, but it shows the syntax
function func(arg1, arg2, arg3)
a <= arg1 -- block until receive from the channel
b <= arg2 -- block until receive from the channel
c <= arg3 -- block until receive from the channel
return a+b+c
end
In Lanes:
-- h1, h2, h3 are lane handles (cannot say 'thread' since that would mean coroutine)
-- could also be communication FIFOs (then 'h1:receive()')
--
function func( h1, h2, h3 )
a= h1[0] -- blocks until h1 finished
b= h2[0]
c= h3[0]
return a+b+c -- or just 'return h1[0] + h2[0] + h3[0]'
end
Futures - the ability to use lane handles both to fetch their results
and to wait for them - is a very nice feature. Essentially, that is
what makes parallelism just "merge in" to our programming without
making it a Big Deal. Me thinks.
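A single-threaded toy of such a future, just to show the
handle-indexing idea (this is not the Lanes implementation; 'future'
is a made-up helper, and a real __index would block on a worker
thread instead of computing inline):

```lua
-- Toy future: h[0] forces the wrapped function on first access and
-- caches the result, so later reads are free.
local function future(fn)
  local done, value = false, nil
  return setmetatable({}, {
    __index = function(_, k)
      if k == 0 then
        if not done then value = fn(); done = true end
        return value
      end
    end
  })
end

local h1 = future(function() return 2 end)
local h2 = future(function() return 3 end)
local h3 = future(function() return 4 end)
print(h1[0] + h2[0] + h3[0])  -- 9
```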
c = thread func(arg1, arg2, arg3)
c is a default communication channel when you use the "thread" keyword/dispatch model, something like stdin/stdout.
arg1 <= 1 -- send only blocks if the channel is full (reader didn't read the last data)
arg2 <= 5
arg3 <= 3
r <= c -- r equals 9, that will block until the function returns
???
synchronization:
================
lock
-- chunk; if another thread tries to execute it, it blocks until the
-- first one finishes
unlock
lock a -- a is a variable, same behavior as the chunk lock
unlock a -- a is a variable
channels:
=========
new object type, channel.
instead of regular assignment, data sending/reading: a <= value (send), value <= a (read)
any Lua type can be sent, including userdata.
Lanes (2008 version):
a= lanes.fifo()
f(a)                 -- launches sublane, gives access to the FIFO
...                     ...
a:send( value )         ...
...                     ...
...                     value= a:receive()
I think this is actually more readable. Also, you will want a timeout
parameter for 'receive', and the ability to wait on multiple objects.
value= lanes.wait( a, b, c )
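The send/receive shape can be sketched single-threaded like this
('fifo' here is a made-up stand-in, not the Lanes one; a real
receive() would block, with the suggested timeout, until another lane
has sent something):

```lua
-- Minimal single-threaded FIFO with the send/receive shape discussed above.
local function fifo()
  local q, first, last = {}, 1, 0
  return {
    send = function(self, v)
      last = last + 1; q[last] = v
    end,
    receive = function(self)
      if first > last then return nil end  -- real version: block/timeout here
      local v = q[first]; q[first] = nil; first = first + 1
      return v
    end
  }
end

local a = fifo()
a:send(1); a:send(5); a:send(3)
print(a:receive() + a:receive() + a:receive())  -- 9
```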
Despatch:
=========
a) threads (green local and SMP)
b) multiple network nodes (interconnected Lua VMs)
Memory:
=======
for a) shared memory
Lanes (2008 version):
K= lanes.keeper()
f(K)                 -- launches sublane, gives access to the keeper table
...                     ...
K.something= 42         ...
...                     print( K.something ) -- 42
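From a lane's point of view, a keeper can be pictured as a proxy whose
reads and writes are routed to one shared store. A single-process
sketch (made-up names; in real Lanes the store lives in a separate
keeper state behind a lock):

```lua
-- Every "lane" gets a proxy; all proxies route to the same store table.
local store = {}
local function keeper_proxy()
  return setmetatable({}, {
    __index    = function(_, k) return store[k] end,
    __newindex = function(_, k, v) store[k] = v end,
  })
end

local K1 = keeper_proxy()  -- as seen by lane 1
local K2 = keeper_proxy()  -- as seen by lane 2
K1.something = 42
print(K2.something)  -- 42: both proxies see the same data
```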
for b) nil, integer, string: copied between nodes
table, userdata and function: remote reference
for data copying use channels.
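The copy-versus-reference rule above could be sketched as a
marshalling decision like this ('marshal' and 'make_ref' are made-up
names; the wire format is hypothetical):

```lua
-- Plain values are copied between nodes; tables, functions and
-- userdata would travel as remote references instead.
local next_id = 0
local function make_ref(v)
  next_id = next_id + 1
  return { __remote_ref = next_id }   -- hypothetical wire format
end

local function marshal(v)
  local t = type(v)
  if t == "nil" or t == "number" or t == "string" or t == "boolean" then
    return v                          -- copied by value
  else
    return make_ref(v)                -- table/function/userdata: by reference
  end
end

print(marshal(42))                        -- 42
print(marshal({}).__remote_ref ~= nil)    -- true
```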
node connection:
================
the node connection should be outside the language, using modules.
the lua interpreter can support a default IPv4/IPv6 module:
# default to some port
server1# lua --server &
server2# lua --server &
server3# lua --server &
client1# lua --server file.lua