> The offset calculation and the multiplication with the stride
> cannot be hoisted. And they are performed as FP calculations and
> need to be converted back to integers. That's rather expensive.

Is this because I have defined the offset as a Lua variable instead
of, say, a C int (const static?) variable?
However, I am then puzzled as to why the C-style array read/write test
(no stride, no offset) runs exactly as fast as the custom matrix class
read/write test (offset plus "stride" multiplication), given that the
number of elements in the vector and the matrix is the same.
I am starting to doubt my benchmark code :P
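
To make the two access patterns being compared concrete, here is a minimal
sketch in plain Lua. It is not the FFI code from the benchmark: the field
names (data, offset, stride) and the helper names are hypothetical, and
indices are assumed to be 0-based as in C.

```lua
-- Hypothetical plain-Lua sketch; the real classes are assumed to wrap FFI data.

-- Dense (C-style) access: element i lives directly at data[i].
local function dense_get(v, i)
  return v.data[i]
end

-- Strided view access: element i lives at data[offset + i * stride].
-- This is the extra addition and multiplication discussed above.
local function strided_get(v, i)
  return v.data[v.offset + i * v.stride]
end

-- A 2x3 row-major matrix stored flat with 0-based indices:
--   row 0: 1 2 3
--   row 1: 4 5 6
local data = { [0] = 1, 2, 3, 4, 5, 6 }

local dense = { data = data, size = 6 }
-- View of column 1: start at flat index 1, step by the row length (3).
local col1 = { data = data, offset = 1, stride = 3, size = 2 }

print(dense_get(dense, 4))   -- 5
print(strided_get(col1, 0))  -- 2
print(strided_get(col1, 1))  -- 5
```

With stride 1 and offset 0 the strided accessor reads the same elements as
the dense one, which is why a separate dense class can drop both fields.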

> I question the wisdom in adding this overhead to a standard vector
> class. I suggest to create a separate class for these use cases.
> Similarly, you can remove the dispatch overhead related to the
> bounds-checks by creating a subclass. Yes, the dispatch check can
> be hoisted in this case, but separation of concerns is valuable
> nonetheless.

Yes, only vector views of matrix columns need the stride (and I will
remove the offset), so I can introduce simpler "dense" vectors and
vector views with just a pointer to the data and a size.
I am not sure what you mean by subclassing to avoid the dispatch check.
The way I factored the code was meant to avoid code duplication.
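
One possible reading of the subclassing suggestion, sketched in plain Lua
under my own assumptions: the shared methods live in a base table, and a
bounds-checking variant overrides only the accessors, so the unchecked
class carries no per-access "is checking enabled?" test and no code is
duplicated. The names Vec, CheckedVec and check are hypothetical and are
not taken from the actual library.

```lua
-- Hypothetical sketch: unchecked base class plus a checked subclass.
local Vec = {}
Vec.__index = Vec

function Vec.new(n)
  local self = setmetatable({ n = n, data = {} }, Vec)
  for i = 0, n - 1 do self.data[i] = 0 end
  return self
end

-- Unchecked accessors: no bounds test, no flag to dispatch on.
function Vec:get(i) return self.data[i] end
function Vec:set(i, x) self.data[i] = x end

-- Shared method written once; it goes through self:get, so it picks up
-- whichever accessor the object's class provides.
function Vec:sum()
  local s = 0
  for i = 0, self.n - 1 do s = s + self:get(i) end
  return s
end

-- Checked subclass: inherits everything from Vec, overrides get/set only.
local CheckedVec = setmetatable({}, { __index = Vec })
CheckedVec.__index = CheckedVec

function CheckedVec.new(n)
  return setmetatable(Vec.new(n), CheckedVec)
end

local function check(v, i)
  if i < 0 or i >= v.n then
    error("index out of bounds: " .. i, 3)
  end
end

function CheckedVec:get(i) check(self, i); return Vec.get(self, i) end
function CheckedVec:set(i, x) check(self, i); Vec.set(self, i, x) end
```

Code that wants checks constructs a CheckedVec, everything else constructs
a Vec; the choice is made once at construction rather than on every access.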

>> At the moment, the only issue is that I cannot enable bounds checks on the column
>> index in
>> m[row][col]
>> as this would require having
>> m[i]
>> be defined (for numeric types) as
>> m:row(i)
>> At the moment the (temporary) vector view that is created is not optimized away.
>> It is my understanding that there is a plan to implement this kind of
>> optimization in LuaJIT in the future. Or is the struct object vecview_t too
>> complex to be optimized away in any case?
>
> Maybe it will be eliminated, maybe not. Depends on the exact
> usage. You could selectively enable the views only for the
> bounds-checking subclass.
>
> --Mike

Yes, I probably will. I am not doing it at the moment because the
computation time increases more than tenfold :P
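
For reference, a minimal plain-Lua sketch of what "m[i] defined as
m:row(i)" amounts to. It is an illustration under assumptions, not the FFI
implementation: Matrix and RowView are hypothetical names, storage is
row-major and 0-based, and the view is an ordinary table instead of a
vecview_t struct. Every m[row] creates a temporary view object, which is
the allocation in question.

```lua
-- Hypothetical sketch: numeric indexing of a matrix returns a row view,
-- and the view is what bounds-checks the column index.
local RowView = {}

RowView.__index = function(view, col)
  local m = view.m
  if col < 0 or col >= m.ncols then
    error("column index out of bounds: " .. col, 2)
  end
  return m.data[view.row * m.ncols + col]
end

RowView.__newindex = function(view, col, x)
  local m = view.m
  if col < 0 or col >= m.ncols then
    error("column index out of bounds: " .. col, 2)
  end
  m.data[view.row * m.ncols + col] = x
end

local Matrix = {}

Matrix.__index = function(m, key)
  if type(key) == "number" then
    if key < 0 or key >= m.nrows then
      error("row index out of bounds: " .. key, 2)
    end
    -- Temporary view created on every m[row] access.
    return setmetatable({ m = m, row = key }, RowView)
  end
  return Matrix[key] -- non-numeric keys fall through to methods
end

function Matrix.new(nrows, ncols)
  local data = {}
  for i = 0, nrows * ncols - 1 do data[i] = 0 end
  return setmetatable({ nrows = nrows, ncols = ncols, data = data }, Matrix)
end

local m = Matrix.new(2, 3)
m[1][2] = 7
print(m[1][2])  -- 7
```

Without the view, m[row] would have to return something that no longer
knows the column count, which is why the column check cannot be enabled in
that case.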

Thank you for your input, much appreciated! :)

Regards,
KR