David Given wrote:
> Mike Pall wrote:
> > r=a%b; d=(a-r)/b    |  3.76  |  0.84  |
> 
> Is LuaJIT intelligent enough to spot a % and / near each other and
> optimise them into a single divide-with-remainder operation?

LuaJIT 2.x does that, but LuaJIT 1.x doesn't.
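
For reference, the idiom being timed can be wrapped as below. This is only a sketch: the helper name divmod is hypothetical and not part of Lua or LuaJIT. It relies on Lua's % being a floored modulo (a % b == a - math.floor(a/b)*b), so the quotient comes out floored as well, at least for integer-valued doubles in the exactly representable range.

```lua
-- Hypothetical helper (name not from the original post): floored
-- quotient and remainder using the r=a%b; d=(a-r)/b idiom.
local function divmod(a, b)
  local r = a % b        -- floored modulo: takes the sign of b
  local d = (a - r) / b  -- a - r is a multiple of b, so d is integral
  return d, r
end

local d, r = divmod(7, 2)      -- quotient 3, remainder 1
local d2, r2 = divmod(-7, 2)   -- quotient -4, remainder 1 (floored, not truncated)
print(d, r, d2, r2)
```

Whether the % and the / end up sharing a single machine divide is up to the implementation; the source code looks the same either way.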

Since I had no explanation for this low timing myself, I've
experimented with different dividends and divisors and found some
cases where this particular timing goes up to 1.41 (for two
divides per iteration). This is more in line with the documented
worst-case scalar division throughput for a 45nm Core2:
  1e8 divides in 0.7s (1.41s / 2) = 7ns/divide = 21 cycles at 3 GHz
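
Spelled out as a quick calculation (a sketch only; the variable names are made up here and the inputs are the rounded figures quoted above):

```lua
-- Back-of-the-envelope cycle estimate from the measured timing.
local total_seconds = 1.41        -- slowest timing observed for the loop
local divides_per_iteration = 2   -- one for %, one for /
local iterations = 1e8
local clock_hz = 3e9              -- 3 GHz, as stated above

local seconds_per_divide = total_seconds / divides_per_iteration / iterations
local cycles_per_divide = seconds_per_divide * clock_hz

print(string.format("%.2f ns per divide, about %.0f cycles",
                    seconds_per_divide * 1e9, cycles_per_divide))
```

This charges the whole loop time to the two divides, so it is an upper bound on the per-divide cost rather than a precise figure.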

Looks like this new-fangled CPU is pretty clever at optimizing
divides with a low number of one-bits:

  http://www.hardwaresecrets.com/article/434/4

And the SSE4.1 roundsd is around ten times (!) faster than the x87
control-word juggling and frndint required to implement floor(). :-)
Will definitely use this in LJ2 (based on CPU feature-detection of
course).

> > Well ... maybe Lua needs a floor-division operator '//'?
> 
> It is the *one* thing that's missing from an all-numbers-are-doubles
> system; all the other arithmetic operations are fundamentally the same
> for both types, but division isn't.

Ask Roberto often enough and it may end up in 5.2. :-)

--Mike