Mike Pall <mikelu-1104 <at> mike.de> writes:

Thank you for your prompt reply! My replies below:

> The problem in your case are the Lua semantics for __newindex:
> they require checking that the table _doesn't_ contain the key
> before calling the __newindex metamethod.
> 
> So it's really doing this:
> 
>   for i=0,n-1 do
>     if rawget(x, i) == nil then
>       Alg.setVal(x, i, incr)
>     else
>       rawset(x, i, incr)
>     end
>   end
> 
> The JIT compiler realizes the array part of the Lua table 'x' is
> empty and it doesn't need to be checked. But the hash part is not
> empty. Any key could be present or not -- it must be checked with
> a (costly) lookup in the hash part. I.e. the rawget() cannot be
> optimized away.
> 
> E.g. try adding 'ret[12345] = 1' at the end of Alg.newvec. This
> index will later be overwritten with a new value and Alg.setVal
> won't be called for this key. There's no way the JIT compiler can
> know a priori that no such key is present in the Lua table.
> 
> Same issue with __index. These are standard Lua semantics and not
> something I can change.

That's very clear now, thank you for the insight.
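
To convince myself of the semantics described above, here is a minimal sketch 
in plain Lua (the counter and the keys are made up for illustration only). It 
shows that __newindex fires only while the key is absent from the table, which 
is the check that cannot be skipped:

```lua
-- Hypothetical demo: count how often __newindex is invoked.
local calls = 0
local x = setmetatable({}, {
  __newindex = function(t, k, v)
    calls = calls + 1
    -- Deliberately do NOT store into t, so the key stays absent.
  end,
})

x[1] = 10           -- key 1 is absent: metamethod is called
x[1] = 20           -- still absent: metamethod is called again
rawset(x, 2, 0)     -- put key 2 into the table directly
x[2] = 30           -- key 2 is present: plain store, metamethod is skipped

print(calls)        -- 2
print(rawget(x, 1)) -- nil
print(rawget(x, 2)) -- 30
```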

However, I then tried the following alternatives, in which the proxy table 
itself stays empty (so LuaJIT should be able to avoid checking the hash part 
of the table as well, right?), but I still get worse performance than with 
direct access to the cdata:

local ffi = require("ffi")

-- I think this one is preferable
local Vec = {}
function Vec.new(n) 
  local storage = ffi.new("double[?]", n) 
  local ret = {}
  -- {__index=storage, __newindex=storage} seems to work equally well...
  setmetatable(ret, 
  { __index = function(x, i) return storage[i] end
  , __newindex = function(x, i, v) storage[i]=v end })
  return ret
end

local Vec2 = {}
function Vec2.new(n) 
  local mt = {}
  mt.storage = ffi.new("double[?]", n)
  mt.__index = function(x, i) return mt.storage[i] end
  mt.__newindex = function(x, i, v) mt.storage[i]=v end
  local ret = {}
  setmetatable(ret, mt)
  return ret
end

Do you have any insight on these as well?

> 
> You could use a userdata to avoid this, since indexing userdata
> calls __index or __newindex without any steps inbetween. But that
> has it's own issues. As has been pointed out, the best solution is
> to directly access the cdata.

To reply to Christoph as well: the point is that if I keep the cdata separate, 
I need to pass extra variables around everywhere (in the example above I need 
to pass both the vector and its size, but for a matrix view I would have to 
pass much more, see below).
It's a bit like the difference between a C-style dynamic array and its C++ 
equivalents, where the vector object also knows about its size and other 
relevant stuff...
I should also mention that I operate on views of the data (sub-parts of the 
dense vectors and matrices).

To be more precise, here is what I would like to implement with a speed not far 
from the C++ implementation (this is what I currently have in the C++ version of 
the library), skipping "constructors":

-- For x vector or vector view
y = x[i]
x[i] = y 
n = x:size() -- size(x) or x.size are fine alternatives
n = #x -- same as x:size()
n = x:stride() -- always 1 for a vector but not for a vector view
               -- (for instance if it is the view of a column of a matrix)
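
A rough sketch of how the vector-view interface above could look with a proxy 
table, purely to illustrate the interface and not the performance. The name 
new_view and its parameters are my own invention; the storage here is a plain 
0-based Lua table so the snippet runs anywhere, but an FFI double array should 
be usable in the same way. There is no bounds checking.

```lua
-- Hypothetical helper: a view of n elements of 'storage', starting at
-- 'offset' and stepping by 'stride'. Indices are 0-based.
local function new_view(storage, offset, n, stride)
  local methods = {
    size   = function() return n end,
    stride = function() return stride end,
  }
  return setmetatable({}, {
    __index = function(_, k)
      if type(k) == "number" then
        return storage[offset + k * stride]  -- element access: x[i]
      end
      return methods[k]                      -- method lookup: x:size()
    end,
    __newindex = function(_, k, v)
      storage[offset + k * stride] = v
    end,
    -- Note: __len on tables is only honoured by Lua 5.2+ (or a LuaJIT
    -- build with 5.2 compatibility enabled), so x:size() is the safe form.
    __len = function() return n end,
  })
end

-- A 3x4 row-major matrix stored flat; element k holds the value k.
local data = {}
for k = 0, 11 do data[k] = k end

-- View of column 1: starts at offset 1, 3 elements, stride 4.
local col = new_view(data, 1, 3, 4)
print(col[0], col[1], col[2])    -- 1  5  9
col[2] = 90
print(data[9])                   -- 90
print(col:size(), col:stride())  -- 3  4
```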

-- For x matrix or matrix view
y = x[row][col] -- worried about this one
x[row][col] = y -- worried again :)
n = x:rows() 
n = x:cols()

For a matrix view, the address of x[row][col] is computed as 
"m_ptr + row*m_stride + col"
where m_ptr is the pointer to the top-left element of the view, and m_stride is 
the number of columns of the matrix to which the view refers.

> 
> [That said, I'll likely add metamethods to cdata soon. This will
> give you more flexibility with indexing, too. But abstractions
> _do_ have a certain cost. Not everything can be optimized away.]

That would be very helpful indeed, and for the vector and vector view I think I 
should be able to implement everything mentioned above via __index, __newindex, 
__len, and x.stride.
I am a bit more worried about the x[row][col] syntax, but maybe it's doable as 
well if x[row] returns the "pointer" to the element shifted row positions 
downward with respect to the top-left element of the view which then gets 
shifted rightward col elements (sorry for the imprecise language, I hope you got 
the idea :P)
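
To make the idea concrete, here is a sketch using a proxy table for now, 
assuming LuaJIT's FFI pointer arithmetic (array + number yields a pointer to 
that element). The name new_matview and its parameters are hypothetical. 
x[row] returns a double* shifted by row*stride from the top-left element, and 
the FFI then handles [col] on that pointer for both reads and writes. No 
bounds checking is done, and the row pointer is only valid while the 
underlying buffer is alive.

```lua
local ffi = require("ffi")

-- Hypothetical helper: 'buf' is the owning array, 'top_left' points at the
-- first element of the view, 'stride' is the column count of the full matrix.
local function new_matview(buf, top_left, rows, cols, stride)
  local methods = {
    rows = function() return rows end,
    cols = function() return cols end,
  }
  return setmetatable({}, {
    anchor = buf,  -- keep the buffer reachable for as long as the view is
    __index = function(_, k)
      if type(k) == "number" then
        return top_left + k * stride  -- pointer to first element of row k
      end
      return methods[k]
    end,
  })
end

-- A 3x4 row-major matrix; element k holds the value k.
local buf = ffi.new("double[?]", 12)
for k = 0, 11 do buf[k] = k end

-- 2x2 view whose top-left element is at row 1, column 1 of the matrix.
local v = new_matview(buf, buf + (1 * 4 + 1), 2, 2, 4)
print(v[0][0], v[1][1])    -- 5  10
v[1][0] = -1               -- writes through the row pointer
print(buf[9])              -- -1
print(v:rows(), v:cols())  -- 2  2
```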

> 
> > Running luajit -jv gives "NYI: return to lower frame" warnings but I cannot
> > interpret these.
> 
> This is normal. It's just the region selection heuristics at work.
> It figures out the correct trace to compile right after that.
> 
> --Mike


Thank you for all the explanations!

Best regards,
KR