Johnson Lin wrote:
> This line:
> new[y][x] = old[y][x] > 0 and rule1[count] or rule2[count]
> 
> slows the code down a lot. While testing these scripts I made a
> typo, and the line became
> 
> new[y][x] = old[y][x] and rule1[count] or rule2[count]
> 
> Although the result is obviously wrong, the speed is OK. I
> don't know what's going on with that "greater than", since with an
> FFI VLA it is sometimes slower than a plain Lua table (the worst
> case, as I reported in my last post).

The reason for the slowdown is quite subtle:

  new[y][x] = expr

is really implemented as:

  local tmp1 = new[y]
  local tmp2 = expr
  tmp1[x] = tmp2

Now, if the expression contains a comparison, a new snapshot is
added to the trace in the middle, after tmp1 has been loaded. This
means tmp1 escapes through the snapshot and can't be eliminated.
Thus the allocation of the intermediate reference is always
performed and you get the GC overhead.

[In the future the compiler will sink those allocations into the
side exit, which solves that case. But this is quite tricky.]

No snapshot is added for the 'old[y][x] and ...' conditional,
because this is a simple type check. Also, it's always true here,
because the result of the load from the FFI data type is known to
be a number. There's no snapshot the allocation could escape to,
so it can be eliminated and performance is much better.
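
Apart from the snapshot, the typo also changes the result: in Lua
every number is truthy, including 0, so the 'and' variant always takes
the first branch. A small illustration, where the strings are made-up
stand-ins for rule1[count] and rule2[count]:

```lua
-- In Lua only nil and false are falsy; the number 0 counts as true.
local cell = 0                                   -- a dead cell, as loaded from the grid

local typo  = cell and "rule1" or "rule2"        -- type check only, always first branch
local fixed = cell > 0 and "rule1" or "rule2"    -- real comparison on the value

print(typo, fixed)
```

For a dead cell the two variants pick different rules, which is why
the faster version produces a wrong result.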

You can work around the problem with:

  local res = expr
  new[y][x] = res

But the variant without a conditional expression is much better,
anyway.
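
A sketch of the workaround inside a complete update loop. The 8x8
grid, the rule tables and count_neighbours() are assumptions made up
for this illustration and are not taken from the original scripts. It
uses a fixed-size FFI array rather than a VLA to keep it short, and
falls back to plain tables when the FFI library is not available, so
it runs under stock Lua too (where it only shows the shape of the
code, not the performance effect):

```lua
-- Hypothetical example: names and sizes below are not from the original scripts.
local has_ffi, ffi = pcall(require, "ffi")   -- the FFI library is LuaJIT-specific

local N = 8

local function new_grid()
  if has_ffi then
    return ffi.new("uint8_t[8][8]")          -- zero-filled; grid[y] is a row reference
  end
  local g = {}                               -- plain-table fallback, same 0-based layout
  for y = 0, N - 1 do
    g[y] = {}
    for x = 0, N - 1 do g[y][x] = 0 end
  end
  return g
end

-- Conway's rules, indexed by neighbour count (0..8).
local rule1 = { [0] = 0, 0, 1, 1, 0, 0, 0, 0, 0 }   -- live cell survives with 2 or 3
local rule2 = { [0] = 0, 0, 0, 1, 0, 0, 0, 0, 0 }   -- dead cell is born with 3

-- Count the eight neighbours of (y, x) on a wrap-around grid.
local function count_neighbours(g, y, x)
  local c = 0
  for dy = -1, 1 do
    for dx = -1, 1 do
      if dy ~= 0 or dx ~= 0 then
        c = c + g[(y + dy) % N][(x + dx) % N]
      end
    end
  end
  return c
end

local function step(old, new)
  for y = 0, N - 1 do
    for x = 0, N - 1 do
      local count = count_neighbours(old, y, x)
      -- Evaluate the conditional into a local first, then do the nested
      -- store, so the comparison no longer sits between the load of
      -- new[y] and the store into it.
      local res = old[y][x] > 0 and rule1[count] or rule2[count]
      new[y][x] = res
    end
  end
end

-- Usage: a horizontal blinker turns into a vertical one after one step.
local old, new = new_grid(), new_grid()
old[3][2] = 1
old[3][3] = 1
old[3][4] = 1
step(old, new)
print(new[2][3], new[3][3], new[4][3])
```

Whether this actually removes the allocation depends on the trace the
JIT compiler records, so it is worth measuring both forms.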

--Mike