

Well, it's just proof that JavaScript is extremely well designed and powerful: it can support any existing language, and even virtualization (it can also virtualize itself), independently of how the native machine was actually built. This means that we'll now see machines specially optimized to support JavaScript almost natively, instead of being built to give natural support to C. This could also change the paradigm of how processors are configured in their firmware: they could just as well be reconfigured instantly or dynamically to support multiple instruction sets.

Virtual machines are now the goal of many new developments: because JavaScript is still much easier to formalize, it can be used to detect bugs in software that would be hard to find manually.
Lua is a good language, but for now it still does not offer a very good implementation of its VM. And maybe, instead of porting Lua to native C, it would be highly valuable to port it to successful VMs: for JavaScript (V8, Mozilla), Java (Oracle JVM, or Google's VM for Android), Python... especially since we know that all these VMs can also integrate the other ones.

The Lua VM, however, still has some progress to make. Its model for pseudo-"threads" (coroutines) is limited to cooperation, and it still does not really support full reentrancy of Lua programs themselves. Several Lua engines can run in true threads, but only completely separately. The VM is not reentrant unless it is integrated into a true multithreading system, because it has no native support for the basic elements: notably mutexes, or critical sections with which to build mutexes and create atomic operations that cannot be interrupted and split by competing (non-cooperating) threads.

The cooperative-only model suffers from a well-known problem: a Lua thread can starve competing threads by taking all resources for itself (so it is highly exposed to DoS attacks). Likewise, Lua still does not allow limiting the resources allocated to a coroutine: the only limit is its initial stack size, whereas objects can be allocated in the heap without any real per-coroutine limit. This could be addressed by giving each coroutine a limit on its own "local" heap.

Likewise, there is no limit on the number of "threads"/coroutines that a single "thread"/coroutine can create, and this is not controlled by the parent thread that creates a new thread. Child threads should be created or configured in their "*State" with their own dedicated allocator before being yielded to, and no thread should be allowed to change these limits for itself. However, this should be possible by using "void lua_setallocf (lua_State *L, lua_Alloc f, void *ud);" just after using "lua_State *lua_newthread (lua_State *L);" so that the parent thread can enforce these limits. Only the child thread will fail if its allowed resources are exhausted, and in that case the configured allocator should be able to give control to other threads, so that they are no longer blocked and continue without being affected. The child thread will have to use "pcall()" to recover from memory exhaustion; otherwise, when the allocator returns nil, the child calls error() and then exits if there is no other error handler (i.e. a parent pcall() in its stack).

Finally, the GC will collect all these resources back, trying to finalize them. It may run in an infinite loop unless the parent thread has configured the allocator to delay repeated resume() calls with an exponentially growing interval between GC cycles for the same "dead" thread. The loop can then be controlled: the dead thread will rapidly become permanently blocked, the OS will page out the unused memory, and nothing worse will happen until Lua's host process is terminated.

How can we control the resources used by a child thread/coroutine? We can use a C function like this:

   typedef void * (*lua_Alloc) (void *ud, void *ptr, size_t osize, size_t nsize);
   "The type of the memory-allocation function used by Lua states. The allocator function must provide a functionality similar to realloc, but not exactly the same. Its arguments are ud, an opaque pointer passed to lua_newstate; ptr, a pointer to the block being allocated/reallocated/freed; osize, the original size of the block or some code about what is being allocated; and nsize, the new size of the block. When ptr is not NULL, osize is the size of the block pointed by ptr, that is, the size given when it was allocated or reallocated. When ptr is NULL, osize encodes the kind of object that Lua is allocating. osize is any of LUA_TSTRING, LUA_TTABLE, LUA_TFUNCTION, LUA_TUSERDATA, or LUA_TTHREAD when (and only when) Lua is creating a new object of that type. When osize is some other value, Lua is allocating memory for something else."
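As a minimal sketch of what such a function could look like, here is a byte-quota allocator with the same signature as lua_Alloc. The names Quota, quota_alloc and alloc_fn are hypothetical (they are not part of the Lua API), and the typedef is repeated locally only so that the sketch compiles without the Lua headers. Note that, as far as I can tell, stock Lua keeps a single allocator per global state, shared by all of its coroutines, so this sketch assumes one quota per state rather than one per coroutine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Same shape as lua_Alloc from lua.h, repeated here so that the sketch
   compiles without the Lua headers. */
typedef void *(*alloc_fn)(void *ud, void *ptr, size_t osize, size_t nsize);

/* Hypothetical accounting record, passed to the allocator as `ud`. */
typedef struct Quota {
    size_t used;   /* bytes currently accounted for */
    size_t limit;  /* maximum number of bytes allowed */
} Quota;

/* Allocator that refuses any request that would push `used` past `limit`. */
void *quota_alloc(void *ud, void *ptr, size_t osize, size_t nsize) {
    Quota *q = (Quota *)ud;
    assert(q != NULL);

    /* When ptr is NULL, osize is a type tag and not a size. */
    size_t old = (ptr != NULL) ? osize : 0;

    if (nsize == 0) {            /* free request */
        free(ptr);
        q->used -= old;
        return NULL;
    }
    if (nsize > old && nsize - old > q->limit - q->used)
        return NULL;             /* over quota: report failure to the caller */

    void *block = realloc(ptr, nsize);
    if (block != NULL)           /* only account for what really succeeded */
        q->used = q->used - old + nsize;
    return block;
}
```

A function of this shape would be handed to Lua together with a pointer to its Quota record as the `ud` argument; when it returns NULL for a growing request, Lua raises a memory error that the running code can catch with pcall().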

This is possible from C only: I cannot find any equivalent API in the standard Lua library that can be used from Lua instead of C (there is no equivalent in the "coroutine.*" package or in metamethods for threads, according to https://www.lua.org/work/doc/manual.html#2.4).

We should be able to configure, in Lua, the thread object returned by "thread coroutine.create(function f)" to set its allocator, or a metamethod that will be called each time the thread tries to allocate something. The only thing we can configure for now is its "__gc" metamethod for finalizers.

We need some "__alloc" metamethod that the Lua VM calls just before actually using the parent thread's allocator, or that is called by the default allocator C function configured in the thread's "State", before the object is really allocated and initialized.

That metamethod in Lua should receive at least the three parameters "ptr, osize, nsize": the reference to the object being reallocated or freed (or nil for new objects), and the old and new sizes. When the first parameter is nil, the second specifies the type of object being allocated: "string", "table", "function", "userdata" or "thread". Normally strings, functions, and threads cannot be reallocated but only freed, so the old size is not significant for them; only a new size of 0 indicates that they are being freed. For tables and userdata, the old size makes sense as well. For all of them, the new size will be 0 when freeing objects before finalizing them with the "__gc" metamethod.

This "__alloc" metamethod cannot perform the actual allocation or freeing. It just has to return a boolean status indicating whether the object (still not initialized when the old size is 0) will be allocated by the parent, or whether the parent allocator will not be used and a "nil" value or error() will be returned to the child thread. If the "__alloc" metamethod allowed the allocation, the parent's allocator may still fail to allocate the object, in which case it must call the "__alloc" metamethod once again to notify it that the desired size was in fact not allocated and must no longer count as used resources.
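The approve-then-allocate protocol described above can be sketched in C, with a function pointer standing in for the proposed "__alloc" metamethod. Everything here is hypothetical (alloc_hook, Budget, Guard, guarded_alloc and the helper allocators are illustration names, and no such hook exists in Lua today). As a simplification, the hook receives plain byte counts instead of "ptr" and a type tag, so that the second "it was not allocated after all" call is unambiguous.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Same shape as lua_Alloc, repeated so the sketch compiles without lua.h. */
typedef void *(*raw_alloc)(void *ud, void *ptr, size_t osize, size_t nsize);

/* Hypothetical stand-in for "__alloc": approves or refuses a size change,
   and never allocates anything itself. */
typedef bool (*alloc_hook)(void *state, size_t old_bytes, size_t new_bytes);

typedef struct Budget { size_t used, limit; } Budget;

/* Example hook: approve the change if it fits in the budget, and record it. */
bool budget_hook(void *state, size_t old_bytes, size_t new_bytes) {
    Budget *b = (Budget *)state;
    if (new_bytes > old_bytes && new_bytes - old_bytes > b->limit - b->used)
        return false;
    b->used = b->used - old_bytes + new_bytes;
    return true;
}

/* Wiring between the hook and the parent's real allocator. */
typedef struct Guard {
    alloc_hook hook;   void *hook_state;
    raw_alloc  parent; void *parent_ud;
} Guard;

/* Ask the hook first, then the parent allocator. If the parent fails after
   the hook said yes, call the hook again to release the approved bytes. */
void *guarded_alloc(void *ud, void *ptr, size_t osize, size_t nsize) {
    Guard *g = (Guard *)ud;
    assert(g != NULL && g->hook != NULL && g->parent != NULL);
    size_t old = (ptr != NULL) ? osize : 0;  /* osize is a type tag if ptr is NULL */
    if (!g->hook(g->hook_state, old, nsize))
        return NULL;                         /* refused: the child sees a failure */
    void *block = g->parent(g->parent_ud, ptr, osize, nsize);
    if (block == NULL && nsize != 0)
        g->hook(g->hook_state, nsize, old);  /* undo the earlier approval */
    return block;
}

/* Parent allocator built on realloc/free. */
void *std_alloc(void *ud, void *ptr, size_t osize, size_t nsize) {
    (void)ud; (void)osize;
    if (nsize == 0) { free(ptr); return NULL; }
    return realloc(ptr, nsize);
}

/* Parent allocator that always fails, only to exercise the rollback path. */
void *failing_alloc(void *ud, void *ptr, size_t osize, size_t nsize) {
    (void)ud; (void)ptr; (void)osize; (void)nsize;
    return NULL;
}
```

The point of the second hook call is that the budget never drifts: bytes that were approved but never obtained are handed back immediately instead of staying counted against the child.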

The "__alloc" metamethod should be useful mostly for "thread" objects but should be used as well for "string", "table", "function", and "userdata". A "string" has no metatable of its own (it is instead stored at an index position of a "table" storing all strings), but we still have a way to assign metamethods to strings and userdata objects by using "debug.setmetatable (string/userdata/thread/function, metatable)" instead of "setmetatable (table, metatable)", and can then measure, control, and limit their allocated sizes in all cases.

Another thing missing from Lua thread states is a "paused" status, which would let a parent thread know that it must not resume() a thread because its delay before running again has not yet elapsed. But this delay can be configured and tested in the metatable of the thread (using the debug library to define it).

An alternative would be to set a custom "registry" (see https://www.lua.org/work/doc/manual.html#4.5) instead of the (debug) metatable associated with each thread, but this can only work at the global level, to provide defaults for all threads whose metatable does not define the "__alloc" and "__paused" metamethods; the registry cannot be used directly in Lua in a safe way to properly isolate child threads from each other. The registry is reserved for use in the C environment in which the Lua VM instance is configured and running, and it should not be exposed to Lua programs.





On Tue, 27 Nov 2018 at 18:27, Javier Guerra Giraldez <javier@guerrag.com> wrote:
On Tue, 27 Nov 2018 at 14:10, Hugo Musso Gualandi
<hgualandi@inf.puc-rio.br> wrote:
>
> These JavaScript JITs change all the time. Nowadays v8 uses an interpreter as its first stage.

heh, i remember when V8 was introduced, they bragged how it "compiles
everything straight to optimized machine code" on first load.  but
yes, it has to change continuously to keep up.

mozilla also has done a _lot_ of variations of JIT, classic function,
tracing, metatracing, procedure-based (again), pseudoclasses, phased
optimizations....  all while adopting asm.js and turning it from a bad
joke (hey, i can compile C to JS!), to a nice VM in WASM....

there's no limit on what people are willing to do to keep using JS
without actually writing JS!

--
Javier