lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]




On 03/02/16 08:28 PM, Nagaev Boris wrote:
Hi,

computed gotos are a feature of modern compilers [1]. They can be used
as a faster replacement for switch-based VM [2]. Many programming
languages have VMs implemented in computed gotos (at least, Ruby and
Python).

The computed goto version is faster because of two reasons [2]:

   * The switch does a bit more per iteration because of bounds checking.
   * The effects of hardware branch prediction.

I have applied this optimization to VM of Lua 5.3.2, file src/lvm.c
[3]. It was very easy, because VM uses macros vmdispatch, vmcase and
vmbreak. I have redefined these macros and created a dispatch table.

It passes Lua basic tests (path/lua -e"_U=true" all.lua) [4].

My benchmark [5] shows speedup of 1.12:

$ time ./src/lua.orig ~/lang-bench/f3/test.lua
499500

real    0m28.208s
user    0m28.166s
sys     0m0.004s

$ time ./src/lua ~/lang-bench/f3/test.lua
499500

real    0m25.066s
user    0m25.030s
sys     0m0.000s


This commit is a draft, not a final contribution. If a compiler
doesn't support computed gotos, the switch based implementation should
be used, as before.


[1] https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
[2] http://eli.thegreenplace.net/2012/07/12/computed-goto-for-efficient-dispatch-tables
[3] https://github.com/starius/lua/commit/b10eb122ad1e6f94e3ebf100179515ed66921756
[4] http://www.lua.org/tests/#basic
[5] https://github.com/starius/lang-bench/blob/master/f3/test.lua

So the C equivalent of NoVM[1]? (NoVM, in Lua and functional languages, basically builds a linked list of functions, which use tail-calling to go to the next instruction)

[1] It's a name I came up with. Basically means a VM without what we consider "the VM", that is a VM without an interpreter loop. (Physical machines use what's equivalent to an interpreter loop, so it makes sense for VMs to use interpreter loops. A loop-less VM doesn't sound much like a real machine.) This is actually much better on JITs, yielding HUGE performance improvements compared to interpreter loops. Tested with LuaJIT. It also lets you add features you couldn't otherwise.

--
Disclaimer: these emails may be made public at any given time, with or without reason. If you don't agree with this, DO NOT REPLY.