On Fri, Nov 7, 2008 at 11:41 AM, Mike Pall wrote:
> Javier Guerra wrote:
>> time luajit -e '...'
> Please use: luajit -O -e '...'


all times go down:

time luajit -O -e 'local a=1.5;local f=math.floor;for i=1,1e8 do local t=(f(a)==a) end'

real    0m1.777s
user    0m1.664s
sys     0m0.004s

time luajit -O -e 'local a=1.5;for i=1,1e8 do local t=(a%1==0) end'

real    0m1.280s
user    0m1.212s
sys     0m0.000s

time luajit -O -e 'local a=1.5;for i=1,1e8 do local t=((a+2^52)-2^52==a) end'

real    0m0.894s
user    0m0.868s
sys     0m0.000s

- math.floor() regains a lot of ground; it's comparable to the other
contenders again (1.777s vs 1.280s for a%1==0)
- Mike's insightful bit-twiddling is again the leader, taking about 30%
less time than a%1==0 (0.894s vs 1.280s)
- a%1==0 is still more readable than (a+2^52)-2^52==a
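
for reference, here are the three tests from the timings above written as plain functions. this is only a sketch: the helper names are made up, and it only checks a handful of values. one caveat i'm fairly sure of is that the +-2^52 form is only trustworthy for 0 <= a < 2^52 (with the default round-to-nearest mode); for a = -0.5 it reports true while a%1==0 reports false.

```lua
-- three ways to ask "is a an integer?" (helper names are hypothetical)
local function is_int_floor(a) return math.floor(a) == a end
local function is_int_mod(a)   return a % 1 == 0 end
local function is_int_round(a) return (a + 2^52) - 2^52 == a end

-- for non-negative values below 2^52 all three agree
for _, a in ipairs({0, 1, 1.5, 3, 0.25, 123456.5, 1e6}) do
  local r = is_int_floor(a)
  assert(is_int_mod(a) == r and is_int_round(a) == r)
end

-- outside that range the round-trip test can disagree with the other two
print(is_int_mod(-0.5), is_int_round(-0.5))
```

so besides readability, a%1==0 also has the advantage of not needing a range check.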

>> [...] t=((a+2^52)-2^52==a)
> This is not representative because conditionals are turned into
> booleans using a two-way branch. You really want to test something
> like:
>  if (a+2^52)-2^52 == a then break end

so you're telling me that not storing a boolean lets the compiler skip
the boolean type?  nice to know.  (not that i write code so tight that
it would matter)
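
to make the difference concrete, a small sketch of both loop shapes: one stores the comparison result in a local, the other only branches on it as Mike suggests. the function names, the os.clock() timing helper and the iteration count are all my own assumptions for illustration, and the numbers it prints will depend heavily on the VM and flags.

```lua
-- sketch only: names and iteration count are made up for illustration
local function store_boolean(a, n)
  local t
  for i = 1, n do
    t = ((a + 2^52) - 2^52 == a)    -- comparison result is materialised as true/false
  end
  return t
end

local function branch_only(a, n)
  local hit = false
  for i = 1, n do
    if (a + 2^52) - 2^52 == a then  -- comparison only steers a branch
      hit = true
      break
    end
  end
  return hit
end

-- returns elapsed CPU seconds followed by the function's result
local function time_it(f, a, n)
  local t0 = os.clock()
  local result = f(a, n)
  return os.clock() - t0, result
end

local n = 1e6
print("store boolean:", time_it(store_boolean, 1.5, n))
print("branch only:  ", time_it(branch_only, 1.5, n))
```

with a=1.5 the condition is never true, so the branch version runs the full loop just like the boolean version.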

>> - both a%1==0 and Mike's hack are much faster than math.floor()
> Note that a%1 is internally expanded to a-floor(a/1)*1. The
> difference is in the overhead for calling a C closure in the Lua
> interpreter. LuaJIT 1.x inlines both (with -O).

that explains why -O helps math.floor() so much, while without -O it gained so little...
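
the expansion Mike quotes can be written out by hand to see that it matches the % operator. mod_expanded is a hypothetical helper name, and this only checks a few sample values, so take it as a sketch rather than a statement about the VM internals.

```lua
-- hypothetical helper spelling out the expansion a - floor(a/b)*b
local function mod_expanded(a, b)
  return a - math.floor(a / b) * b
end

-- compare against the built-in operator for b = 1
for _, a in ipairs({1.5, 2, -1.5, 0.25, 7}) do
  assert(mod_expanded(a, 1) == a % 1)
end
```

note that for negative values the result is still non-negative (-1.5 % 1 gives 0.5), which is why a%1==0 works as an integer test on both sides of zero.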

> Oh, and I wouldn't call the +-2^52 trick a hack. In fact all
> modern C compilers use a variation of it to implement floor(),
> ceil(), trunc() and round() if the FP is done in hardware, but
> doesn't have a special instruction for rounding (e.g. x87 does,
> SSE2 and most other FPUs don't).

as Roberto said, hackness is in the eye of the beholder; and i suspect
very few Lua coders are comfortable meddling in FP bit layout
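
for anyone curious how the +-2^52 trick turns into rounding functions, here is a rough sketch in plain Lua. it assumes IEEE doubles, the default round-to-nearest-even mode, and inputs in the range 0 <= x < 2^52; the function names are made up, and it is not meant to show what any particular compiler actually emits.

```lua
local TWO52 = 2^52

-- adding 2^52 pushes x into a range where doubles have a spacing of 1,
-- so the sum is rounded to an integer; subtracting gives that integer back
local function round_nonneg(x)
  return (x + TWO52) - TWO52
end

-- floor: round to nearest, then step back by one if we rounded upwards
local function floor_nonneg(x)
  local r = round_nonneg(x)
  if r > x then r = r - 1 end
  return r
end

print(round_nonneg(1.25), floor_nonneg(1.75))
```

ties go to the even neighbour here, so 2.5 rounds to 2 and 3.5 rounds to 4.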

> BTW: All loops are the same speed in LuaJIT 2.x. The condition is
> loop-invariant and hoisted, so you end up with an empty loop.
> You'll need to get a bit more clever with those microbenchmarks ...

yep, i can't wait to see all my tests suddenly go 0-time!
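
one way to get "a bit more clever" might be to make the tested value depend on the loop counter and to use the result afterwards, so the condition is no longer loop-invariant and the loop body is not dead code. this is just a sketch with a made-up function name; i can't promise a given JIT won't still simplify it.

```lua
-- counts how many of 0.5, 1.0, 1.5, ... (n values) are integers
local function count_integers(n)
  local count = 0
  for i = 1, n do
    local a = i * 0.5              -- changes every iteration, so it can't be hoisted
    if a % 1 == 0 then
      count = count + 1            -- result is used, so the test can't be dropped
    end
  end
  return count
end

print(count_integers(1e6))
```

the extra multiply and add are of course part of what gets timed, so the loop overhead has to be measured separately and subtracted.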