On 1/27/14, Tim Hill <drtimhill@gmail.com> wrote:
>
> On Jan 27, 2014, at 2:58 AM, Dirk Zoller <duz@sol-3.de> wrote:
>
>> Hi all,
>>
>> in my application, many many little problems are tackled by multiple
>> threads. Currently, I have a lua_State per problem domain and thread. This
>> is in the order of thousands of lua_States which work in like 16 threads
>> on those many little problems.
>>
>> I could do with only one lua_State per thread.
>> That would make my program less complex which is better.
>>
>> Before I change my program towards fewer lua_States, I'd like to hear your
>> opinion on this. Will reducing thousands of lua_States, each working
>> with one script on a small problem, down to one lua_State per thread,
>> each working with several scripts on many small problems, be
>> beneficial or harmful?
>>
>> Does Lua suffer from scalability problems which I didn't run into yet
>> because I chopped the work into so many small pieces?
>>
>> Thank you very much for any insights you can share on this.
>>
>> Dirk
>>
>
> This is a little tricky to answer definitively. For our stress testing of
> Lua we have run up to 10,000 states with no real problems, so nothing will
> actually break using your current fine grained model. However, if the
> computation done by each state is very simple then the amount of time the OS
> spends switching threads may become a significant percentage of total CPU
> time (perhaps even 50%). In this case, re-factoring the code as you suggest
> (fewer states) might actually increase overall throughput (or, manage the
> same throughput with lighter load on the CPU etc).
>
> One advantage of Lua when used in this way is that, since the core VM is so
> small, pretty much all the VM can fit in the L1/L2 caches of an x86 class
> CPU, meaning the performance of running VM instructions starts to get *very*
> fast. And if all your threads are sharing the same VM code (which they will
> be if they are all running in the same OS process), then you won’t get a
> cache warming hit when switching threads. The net-net is you should get good
> performance either way, with a slight boost if you go for “fewer, bigger”
> states.
>
> —Tim
>

Great info, Tim. Might I suggest, though, that choosing between "many,
small" and "fewer, bigger" states is a false dilemma when it comes to
the context-switching bottleneck? It seems to me that the real problem
is running more threads than there are cores/CPUs. If you have a
scheduler, preferably one that is aware of current hardware
availability (Apple's Grand Central Dispatch is one example), then you
avoid overwhelming your hardware resources, and you don't necessarily
have to redesign/refactor your code around "fewer, bigger" Lua states.
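
To make the scheduling idea concrete, here is a minimal,
language-neutral sketch in Python rather than Lua/C. It assumes a plain
fixed-size worker pool sized to the core count stands in for a
hardware-aware scheduler; it is not GCD, and solve(), run_all() and
used_threads are hypothetical names made up for illustration. Python
threads are also serialized by the interpreter lock, so this shows only
the structure (many small jobs, few threads), not real parallel speedup.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor

# Records which OS threads actually ran jobs (illustration only).
used_threads = set()

def solve(problem):
    # Hypothetical stand-in for "run one script on one small problem".
    used_threads.add(threading.get_ident())
    return problem * problem

def run_all(problems, workers=None):
    # Assumption: one worker per core; the pool queues the remaining jobs
    # instead of giving every small problem a thread of its own.
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order in the results.
        return list(pool.map(solve, problems))

results = run_all(range(10000))
```

The ten thousand small jobs stay as they are; only the number of
threads that execute them is capped, which is the part that addresses
the context-switching cost.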

Also, if anybody finds this data point useful, I measured a single
baseline Lua state to take about 4-5KB of RAM on a 64-bit Mac.

-Eric
-- 
Beginning iPhone Games Development
http://playcontrol.net/iphonegamebook/