I'm trying to think through what might go wrong if one were to simply create a collection of coroutines using lua_newthread(), and then run them in parallel on separate C threads.
Conversely, I'm wondering if there's a subset of Lua VM ops that are already (nearly) thread safe, and if there might be relatively easy ways of exploiting this.
For example, I'll often run through the entries of some table, looking for the key/value pair that gets the highest score from some kind of valuation function. I'm beginning to suspect that as long as the GC is off, and my valuation functions are sufficiently well behaved (no writes to shared tables, for a start), I could perhaps safely parallelize these searches.
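To make the shape of that search concrete, here is a minimal sketch in plain C with pthreads. It does not touch the Lua API at all: a read-only array stands in for the table, and score_entry(), SearchJob, parallel_best() and NUM_WORKERS are all hypothetical names invented for illustration. The point is only that each worker reads shared data, writes its own private result slot, and the merge happens after the joins.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 4  /* hypothetical worker count */

/* Hypothetical valuation function: pure, touches no shared state. */
static double score_entry(double value) {
    return -(value - 10.0) * (value - 10.0);  /* peaks at value == 10 */
}

typedef struct {
    const double *values;  /* shared, read-only during the search */
    size_t begin, end;     /* half-open slice [begin, end) for this worker */
    size_t best_index;     /* result: index of the best entry in the slice */
    double best_score;     /* result: score of that entry */
    int found;             /* 0 if the slice was empty */
} SearchJob;

/* Each worker writes only to its own SearchJob. */
static void *search_slice(void *arg) {
    SearchJob *job = arg;
    job->found = 0;
    for (size_t i = job->begin; i < job->end; i++) {
        double s = score_entry(job->values[i]);
        if (!job->found || s > job->best_score) {
            job->best_score = s;
            job->best_index = i;
            job->found = 1;
        }
    }
    return NULL;
}

/* Returns the index of the best entry, or (size_t)-1 if n == 0. */
size_t parallel_best(const double *values, size_t n) {
    pthread_t threads[NUM_WORKERS];
    SearchJob jobs[NUM_WORKERS];
    int started[NUM_WORKERS];
    size_t chunk = (n + NUM_WORKERS - 1) / NUM_WORKERS;
    size_t best = (size_t)-1;
    double best_score = 0.0;

    for (int w = 0; w < NUM_WORKERS; w++) {
        size_t b = (size_t)w * chunk;
        if (b > n) b = n;
        size_t e = b + chunk;
        if (e > n) e = n;
        jobs[w] = (SearchJob){ values, b, e, 0, 0.0, 0 };
        started[w] = pthread_create(&threads[w], NULL, search_slice, &jobs[w]) == 0;
        if (!started[w])
            search_slice(&jobs[w]);  /* fall back to running the slice inline */
    }
    /* Merge the per-worker results serially, after each join. */
    for (int w = 0; w < NUM_WORKERS; w++) {
        if (started[w])
            pthread_join(threads[w], NULL);
        if (jobs[w].found && (best == (size_t)-1 || jobs[w].best_score > best_score)) {
            best = jobs[w].best_index;
            best_score = jobs[w].best_score;
        }
    }
    return best;
}
```

In the real case each worker would be running a Lua valuation function on its own coroutine rather than a C function, which is exactly where the questions about shared VM state come in.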
One hole I see in this concept is luaC_newobj, which looks like it would need to be serialized to prevent temporary tables from dropping off the allgc list. That's a pretty trivial diff though. luaS_newlstr would potentially also need a lock. And luaD_throw would need some extra logic of some sort. (That's relative to the 5.2 sources.)
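As a rough illustration of what serializing the allgc insertion amounts to, here is a toy version in standalone C. ToyObject, ToyState, toy_newobj() and toy_stress() are made-up stand-ins, not Lua's GCObject, global_State or luaC_newobj; the only claim is that pushing onto a shared singly linked list needs the two pointer updates to happen under one lock, or concurrent pushes can lose nodes.

```c
#include <pthread.h>
#include <stdlib.h>

#define TOY_THREADS 4        /* hypothetical stress-test parameters */
#define TOY_PER_THREAD 1000

/* Toy stand-in for a collectable object header; NOT Lua's GCObject. */
typedef struct ToyObject {
    struct ToyObject *next;
} ToyObject;

/* Toy stand-in for the shared global state; NOT Lua's global_State. */
typedef struct {
    ToyObject *allgc;        /* head of the "all objects" list */
    pthread_mutex_t lock;    /* serializes pushes onto allgc */
} ToyState;

/* Allocate an object and link it at the head of the shared list. */
ToyObject *toy_newobj(ToyState *g) {
    ToyObject *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    pthread_mutex_lock(&g->lock);
    o->next = g->allgc;      /* without the lock, two threads could both */
    g->allgc = o;            /* read the same head and one node is lost  */
    pthread_mutex_unlock(&g->lock);
    return o;
}

static void *alloc_many(void *arg) {
    ToyState *g = arg;
    for (int i = 0; i < TOY_PER_THREAD; i++)
        toy_newobj(g);
    return NULL;
}

/* Returns 1 if every object allocated by the worker threads is on the list. */
int toy_stress(void) {
    ToyState g;
    pthread_t threads[TOY_THREADS];
    int started = 0;
    size_t count = 0;

    g.allgc = NULL;
    pthread_mutex_init(&g.lock, NULL);
    while (started < TOY_THREADS &&
           pthread_create(&threads[started], NULL, alloc_many, &g) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    /* Walk and free the list, counting the nodes that survived. */
    while (g.allgc != NULL) {
        ToyObject *o = g.allgc;
        g.allgc = o->next;
        free(o);
        count++;
    }
    pthread_mutex_destroy(&g.lock);
    return count == (size_t)started * TOY_PER_THREAD;
}
```

Whether the allocator underneath is itself thread safe is a separate question; this sketch leans on malloc, which is.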
The biggest problem seems to be table writes, as simultaneously writing new entries into the same table from different threads is obviously a recipe for disaster. (In fact, even reading while another thread is writing is probably a disaster, since inserting a new key can trigger a rehash.) There are various ways I might be able to make table sets thread-safe-ish, though that would probably mean adding a per-Table mutex, which would cost a bit of memory bloat plus a lot of locking/unlocking overhead on table reads and writes that are, most of the time, in no danger of doing anything troublesome.
But if I'm imagining a parallel evaluation mode as something that would only run some of the time, I could disable all of this extra locking logic when running inside a single thread. Or I could even just write some profiling code that detects the case where multiple threads have written to the same table during a parallel evaluation, use that code to test whether or not my evaluation functions are in fact "well behaved", and then run with the safeties off for maximum speed...
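One way such a detector might look, sketched in standalone C11 with <stdatomic.h>. WriteAudit and the audit_* functions are hypothetical; in a real patch the two fields would presumably live in (or alongside) each Table and audit_write() would be called from the table-set path. Thread ids are assumed to be nonzero, with 0 meaning "nobody has written yet".

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-table audit record; not part of Lua's Table. */
typedef struct {
    atomic_int first_writer;   /* 0 = untouched, else id of the first writer */
    atomic_bool conflicted;    /* set once a second, different writer shows up */
} WriteAudit;

/* Call while still single-threaded, before the parallel phase starts. */
void audit_init(WriteAudit *a) {
    atomic_init(&a->first_writer, 0);
    atomic_init(&a->conflicted, false);
}

/* Record a write by thread_id (must be nonzero). */
void audit_write(WriteAudit *a, int thread_id) {
    int expected = 0;
    /* The first writer claims the slot; on failure the CAS stores the
       current owner's id into expected. */
    if (!atomic_compare_exchange_strong(&a->first_writer, &expected, thread_id)
        && expected != thread_id) {
        atomic_store(&a->conflicted, true);
    }
}

/* True if two different threads wrote during the audited phase. */
bool audit_conflicted(WriteAudit *a) {
    return atomic_load(&a->conflicted);
}
```

Note that this only catches write/write sharing. Catching a read in one thread racing a write in another would need the read path instrumented as well, at which point the audit is no cheaper than the mutex it was meant to avoid, so it really only makes sense as a test-time check.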
luaT_gettm() looks like it might still cause some kinds of trouble (as it's a place where reading from a table can actually write data to that table). But after looking a little closer, I'm hopeful that the caching logic it implements may be harmless, provided that the "no parallel writes to shared tables" rule holds. (Worst case, we'll just end up writing the same value into the same flags variable multiple times.)
-Sven