Hi,
While writing a performance testing tool, I chose Lua to describe a
set of tests that run in parallel.
So far things are looking great as far as both CPU and memory usage
are concerned: after increasing LUAI_MAXCSTACK, I've been testing with
300,000 concurrent, mostly idle threads created with lua_newthread().
Memory usage for these is about 350M, i.e. a little over 1KB per
created thread.
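As a rough check of the arithmetic, here is a minimal sketch in plain C
(no Lua headers needed). The helper names bytes_per_thread and
projected_bytes are hypothetical, "350M" is assumed to mean
350 * 1024 * 1024 bytes, and the projection assumes the per-thread cost
stays linear as the thread count grows, which may not hold in practice.

```c
#include <assert.h>
#include <stdint.h>

/* Average memory cost of one thread, in bytes, rounded down. */
uint64_t bytes_per_thread(uint64_t total_bytes, uint64_t threads)
{
    assert(threads > 0);
    return total_bytes / threads;
}

/* Linear projection of the measured total to a different thread count.
 * Assumption: memory grows proportionally with the number of threads. */
uint64_t projected_bytes(uint64_t total_bytes, uint64_t threads,
                         uint64_t target_threads)
{
    assert(threads > 0);
    return total_bytes * target_threads / threads;
}
```

With those assumptions, bytes_per_thread(367001600, 300000) gives 1223
bytes, and projected_bytes(367001600, 300000, 1000000) gives 1223338666
bytes, i.e. about 1167M for 1,000,000 threads.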
After a glance at the lua_State definition, it looks like there are a
couple of members that could be left out, but nothing that would
noticeably reduce sizeof(lua_State).
Has anyone managed to create something on the order of 1,000,000
coroutines?
If so, any tips on what to leave out from lua_State (or other places)
would be appreciated.
Cheers!
Bogdan