On Tue, 22 Mar 2011 18:25:46 +0200, Luiz Henrique de Figueiredo <lhf@tecgraf.puc-rio.br> wrote:

1. The Makefile assumes gcc. So we might as well use all gcc-specific flags to get a better Lua core and interpreter, for some definition of "better".

How about -mtune=native? It seems safe enough.
Is it implied by any of the -O options?


If you're assuming gcc, you could also get "better" code by using function attributes. As an example, this is what happens when you add "flatten" (which asks gcc to inline, where possible, every call made inside the function) to the main interpreter function on x86:

add to Makefile:

# for gcc
CFLAGS+=-D'FLATTEN(x) x __attribute__((flatten))'
# for others
#CFLAGS+=-D'FLATTEN(x) x'

add to lvm.c:

FLATTEN(void luaV_execute (lua_State *L, int nexeccalls));
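For anyone who would rather not touch the Makefile, the same macro could probably be selected in the source by checking for gcc. This is only a rough sketch, not what was benchmarked in this post: the `__GNUC__` check is an assumption about how one might detect gcc, and `add_one`, `twice` and `run_steps` are made-up stand-ins for luaV_execute and the functions it calls.

```c
/* Sketch only: FLATTEN is defined in the source instead of with -D in the
   Makefile. The __GNUC__ check and all function names are assumptions. */
#if defined(__GNUC__)
#define FLATTEN(x) x __attribute__((flatten))
#else
#define FLATTEN(x) x   /* other compilers: plain declaration, no attribute */
#endif

/* Hypothetical small helpers, standing in for the callees of luaV_execute. */
static int add_one(int v) { return v + 1; }
static int twice(int v) { return v * 2; }

/* The attribute is attached to the declaration; the definition is unchanged. */
FLATTEN(int run_steps(int start, int n));

int run_steps(int start, int n) {
    int i, acc = start;
    for (i = 0; i < n; i++)
        acc = twice(add_one(acc));  /* calls gcc may inline under flatten */
    return acc;
}
```

The commas in the parameter list are fine inside the macro argument because they sit within the function's own parentheses. Whether flattening helps at all depends on the function and the workload, so it is worth timing both builds.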

some crude benchmarking:

$ gcc --version
gcc (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5

$ ls -l lua lua-flat
141012 lua
149204 lua-flat

$ time ./lua ~/P/fannkuchredux.lua 9
8629
Pfannkuchen(9) = 30

real    0m1.318s
user    0m1.312s
sys     0m0.004s

$ time ./lua-flat ~/P/fannkuchredux.lua 9
8629
Pfannkuchen(9) = 30

real    0m1.087s
user    0m1.084s
sys     0m0.004s

$ time ./lua ~/P/binary-trees.lua 14
stretch tree of depth 15         check: -1
32768    trees of depth 4        check: -32768
8192     trees of depth 6        check: -8192
2048     trees of depth 8        check: -2048
512      trees of depth 10       check: -512
128      trees of depth 12       check: -128
32       trees of depth 14       check: -32
long lived tree of depth 14      check: -1

real    0m5.219s
user    0m5.200s
sys     0m0.020s

$ time ./lua-flat ~/P/binary-trees.lua 14
stretch tree of depth 15         check: -1
32768    trees of depth 4        check: -32768
8192     trees of depth 6        check: -8192
2048     trees of depth 8        check: -2048
512      trees of depth 10       check: -512
128      trees of depth 12       check: -128
32       trees of depth 14       check: -32
long lived tree of depth 14      check: -1

real    0m5.043s
user    0m5.036s
sys     0m0.000s