On Fri, May 15, 2009 at 11:25 AM, Philippe Lhoste wrote:
> On 15/05/2009 14:14, Olivier Hamel wrote:
>> code, there's some un-optimized stuff IMO (unless I'm wrong):
> Beside, depending on platform, using an integer power operator might result
> in slower operation than several multiplications. Unless I am mistaken? Is
> it false on boards with floating-point operations?

Currently it just unrolls functions and eliminates dead ones.  There
is room for performance improvement, but note that (x*x)*(x*x) does
not necessarily equal x^4 if you redefine the __mul and __pow
metamethods.  In some cases we can deduce that x is a plain number, in
which case these metamethods are not applied, but even in the case of
plain numbers, the possibility of overflow can still break some
mathematical properties.
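
A minimal sketch of both caveats, for illustration only (it is not part
of the optimizer, and the Tagged type and all variable names are made
up here).  It assumes standard double-precision Lua numbers:

```lua
-- Hypothetical type whose __mul and __pow deliberately disagree.
local Tagged = {}
Tagged.__mul = function(a, b) return setmetatable({op = "mul"}, Tagged) end
Tagged.__pow = function(a, n) return setmetatable({op = "pow"}, Tagged) end

local t = setmetatable({op = "init"}, Tagged)
local by_mul = (t*t)*(t*t)   -- goes through __mul three times
local by_pow = t^4           -- goes through __pow once
print(by_mul.op, by_pow.op)  -- prints: mul  pow

-- Plain numbers: overflow breaks the associativity of "*".
local a, b, c = 1e200, 1e200, 1e-200
local left  = (a*b)*c        -- a*b overflows to inf and stays inf
local right = a*(b*c)        -- b*c is about 1, so the result stays finite
print(left == math.huge, right < math.huge)  -- prints: true  true
```

So a rewrite that regroups multiplications, or swaps "^" for "*", is
only safe when the operands are known to be plain numbers and the
intermediate results stay in range.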

We could just avoid optimizations that are conceivably--even
remotely--unsafe, and this is a good option as a default.  A way around
the difficulty is to let the programmer give the optimizer, such as
via pragmas or switches, additional information that would allow it to
determine when certain optimizations are safe or unsafe (e.g. that x
is a plain number, that all variables named according to a given
convention shall be assumed to be plain numbers, or that all
arithmetic operations in the current file scope shall be assumed to
have normal properties).  That information would also be useful for a
lint tool (luaanalyze) or optimizations in lua2c.

For standard Lua on x86, I have found x*x to be faster than x^2 [1]
and have at times manually "strength reduced" [2] x^2, x^3 and x^4 in
code to make it run faster at the expense of being uglier.  However,
one can now avoid some of the ugliness by writing

  local function pow2(x) return x*x end
  local function pow4(x) return pow2(pow2(x)) end

  return pow4(x+y) + 1

and it will automatically reduce to

  local __v4x = x + y
  local __v5x = __v4x   -- note: this line could be eliminated
  local __v3x = __v5x * __v5x
  return __v3x * __v3x + 1
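
For plain numbers, the helper form and the reduced form can be checked
against each other, and against the "^" operator, with a rough sketch
like the following.  The function names, the sample arguments and the
iteration count are arbitrary choices made for this example, and the
timings depend entirely on platform and Lua version:

```lua
-- Helper form, as one would write it by hand.
local function pow2(x) return x*x end
local function pow4(x) return pow2(pow2(x)) end
local function with_helpers(x, y) return pow4(x+y) + 1 end

-- Reduced form, mirroring the generated code (temporaries renamed).
local function reduced(x, y)
  local s = x + y
  local q = s * s
  return q * q + 1
end

-- Operator form, for comparison with the strength-reduced versions.
local function with_pow(x, y) return (x+y)^4 + 1 end

-- Rough timing helper: calls f(1.5, 2.5) n times and returns the
-- elapsed CPU time plus an accumulator (so the calls are not dead).
local function time_it(f, n)
  local t0 = os.clock()
  local acc = 0
  for i = 1, n do acc = acc + f(1.5, 2.5) end
  return os.clock() - t0, acc
end

local n = 1e6  -- arbitrary iteration count
for _, case in ipairs{ {"helpers", with_helpers},
                       {"reduced", reduced},
                       {"pow", with_pow} } do
  local elapsed = time_it(case[2], n)
  print(("%-8s %.3fs"):format(case[1], elapsed))
end
```

With x = 1.5 and y = 2.5, the helper and reduced forms both compute
4*4*4*4 + 1 = 257 exactly.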

Unfortunately, the optimization pass messes up the debug info (line
numbers and variable names).  One way to mitigate the line number
problem is for the translator to add empty lines and put multiple
generated statements on the same line, in such a way as to preserve
most line numbers.  It may also name the temporaries in a more
meaningful way:

  local x = setmetatable({}, {__add=function() return nil end})
  local y = x
  local __x_plus_y = x + y; local __pow2_x_plus_y = __x_plus_y * __x_plus_y; return __pow2_x_plus_y * __pow2_x_plus_y + 1

which would raise the somewhat meaningful error

  lua: 1.lua:3: attempt to perform arithmetic on local '__x_plus_y' (a nil value)
  stack traceback:
          1.lua:3: in main chunk
          [C]: ?
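
The same failure can be reproduced inside a single file with pcall, as
in the sketch below.  The wrapper function chunk is introduced only
for this example, and the statements are spread over several lines
here, so the reported line number will differ from the one above.  The
exact wording of the message also varies between Lua versions, but it
names the temporary as long as debug information has not been
stripped:

```lua
-- Assumption: an __add that returns nil stands in for any operation
-- that yields an unexpected value.
local function chunk()
  local x = setmetatable({}, {__add = function() return nil end})
  local y = x
  local __x_plus_y = x + y                         -- nil, via __add
  local __pow2_x_plus_y = __x_plus_y * __x_plus_y  -- error raised here
  return __pow2_x_plus_y * __pow2_x_plus_y + 1
end

local ok, msg = pcall(chunk)
print(ok, msg)  -- ok is false; msg mentions local '__x_plus_y'
```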

A patch to Lua would allow debugging information to be injected into
the source code, somewhat like the C preprocessor [3]:

  --! __LINE__=3
  local $"x+y" = x + y
  local $"pow2(?)" = $"x+y" * $"x+y"
  return $"pow2(?)" * $"pow2(?)" + 1
  --! __LINE__=4

Metalua resorts instead to writing its own bytecode generator.