- Subject: Threaded vm core loop (Was: Re: how does lua arrange vmcase?)
- From: gz@...
- Date: Wed, 05 Apr 2017 14:28:31 +0200
On 2017-04-04 18:10, Luiz Henrique de Figueiredo wrote:
It's mostly the order the opcodes are listed in the enum defined in
A switch whose cases are in increasing order should be easily handled by
the C compiler, which can generate a simple jump table. More
if the cases form an interval [1,n] with n small, then a simple jump
table works, even if the cases are not ordered.
And the compiler does so quite efficiently. After stumbling upon this
I had a go at turning the core VM loop of Lua 5.3.4 into a direct
threaded interpreter. The change itself was pretty easy: basically it
involved turning the big switch statement in luaV_execute() into a
jump table and a goto. This did result in a speedup, but the results are
less than spectacular. On average, the speedup was about 3%
across a selection of benchmarks from here:
http://benchmarksgame.alioth.debian.org/u64q/lua.html (with output and
GC turned off). In some cases, though, the modified interpreter was even a
little bit slower than the vanilla Lua interpreter. Interestingly, the
distribution of speedups across the benchmarks was quite different on the
two processors I could test this on, although the average was the same.
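
For readers who have not seen the technique, here is a minimal sketch of the
two dispatch styles compared above. It is not the actual patch to
luaV_execute(): the opcodes, the accumulator and the function names
run_switch() and run_threaded() are all made up for illustration. The
threaded version relies on the "labels as values" extension of GCC and
Clang, so it is not standard C. Strictly speaking, indexing a table of
label addresses by opcode is often called token threading; it is shown here
because it matches the "jump table and a goto" change described above.

```c
/* Hypothetical toy instruction set, NOT Lua's opcodes. */
enum { OP_INC, OP_DEC, OP_DOUBLE, OP_HALT };

/* Conventional dispatch: a single switch inside a loop. The compiler
 * typically turns the dense case range into one jump table, so every
 * instruction goes through the same indirect branch. */
static long run_switch(const unsigned char *code) {
    long acc = 0;
    for (;;) {
        switch (*code++) {
        case OP_INC:    acc += 1; break;
        case OP_DEC:    acc -= 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        default:        return -1;  /* unknown opcode (illustrative only) */
        }
    }
}

/* Threaded dispatch: an explicit table of label addresses plus a goto at
 * the end of every handler, so each handler has its own indirect branch.
 * Assumes GCC/Clang (labels as values) and well-formed bytecode: there is
 * no bounds check on the opcode. */
static long run_threaded(const unsigned char *code) {
    static void *const table[] = { &&do_inc, &&do_dec, &&do_double, &&do_halt };
    long acc = 0;
#define DISPATCH() goto *table[*code++]
    DISPATCH();
do_inc:    acc += 1; DISPATCH();
do_dec:    acc -= 1; DISPATCH();
do_double: acc *= 2; DISPATCH();
do_halt:   return acc;
#undef DISPATCH
}
```

Both functions should return the same value for the same program; for
example, the sequence OP_INC, OP_INC, OP_DOUBLE, OP_DEC, OP_HALT leaves 3 in
the accumulator either way. Whether the threaded form is actually faster
depends on the compiler and the processor, which is consistent with the
mixed results reported here.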
So, in conclusion, this was an interesting exercise, but compilers and
processors have advanced quite a bit since the paper was written, and so
the gains are not worth the cost of maintaining a custom code base.