On 24-Sep-06, at 10:19 PM, Glenn Maynard wrote:

Forwarded here because I can't mail Rici directly and I think the mail
I was replying to was sent privately unintentionally anyway.

Your mail is broken; it's using a blacklist that denies legitimate mail
(merely because my mail server is not set up to forward pointlessly
through an unreliable third-party server).  Please use SpamAssassin or a
similar heuristic tool.

Sorry. My email provider does use spamassassin, but with quite a
paranoid setting, I think. I wonder how much other mail I'm missing :(
I think iMaxReference should actually never be decremented, since freed
reference indexes are never actually forgotten completely; they're always
either used or in the free list.  (I guess you turned it into an
optimization that does do that, which I suppose works too.)

(Better to just keep a native array of available indexes, anyway.)
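
For illustration, a native array of available indexes could look roughly
like the sketch below. The names (IndexPool, pool_acquire, pool_release)
are hypothetical and are not taken from the code under discussion; the
only point is that the high-water mark ("next") never goes down, and
released indexes are handed out again from a plain C stack.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical pool of reference indexes, kept entirely on the C side. */
typedef struct {
    int *free_idx;   /* stack of released indexes */
    size_t n_free;   /* number of entries currently on the stack */
    size_t cap;      /* allocated capacity of free_idx */
    int next;        /* next never-used index; only ever grows */
} IndexPool;

void pool_init(IndexPool *p)
{
    p->free_idx = NULL;
    p->n_free = 0;
    p->cap = 0;
    p->next = 1;
}

/* Reuse a released index if one exists, otherwise mint a new one. */
int pool_acquire(IndexPool *p)
{
    if (p->n_free > 0)
        return p->free_idx[--p->n_free];
    return p->next++;
}

/* Push idx onto the free stack. Returns 0 on success, -1 if out of memory. */
int pool_release(IndexPool *p, int idx)
{
    assert(idx > 0 && idx < p->next);
    if (p->n_free == p->cap) {
        size_t new_cap = p->cap ? p->cap * 2 : 16;
        int *grown = realloc(p->free_idx, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        p->free_idx = grown;
        p->cap = new_cap;
    }
    p->free_idx[p->n_free++] = idx;
    return 0;
}

void pool_destroy(IndexPool *p)
{
    free(p->free_idx);
    pool_init(p);
}
```

Acquiring two indexes, releasing the first and acquiring again hands the
first one back before any new index is created.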

I didn't look at the code all that carefully, it just looked wrong to me. :)

The only other thing I can think of is the pentium cache alignment
issue; I don't think that could be happening here because you're not doing
any arithmetic, but in case it is, you might want to check by doing
the test reffing k things before you start the loop, for k ranging
from 0 to 5, and see if there are particular values of k which
cause slowdowns. (There was a change to storage format of tables
between 5.0 and 5.1, which causes the alignment problem to show
up for different indices, although it always shows up every sixth
element in a table or stack.)

Ick, that was it:

0.65user 0.00system 0:00.66elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0.22user 0.00system 0:00.23elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k

Yeah, it's a pain. The best workaround is to use AMD chips instead of Intel.
(Next time you buy a computer :) ) -- I don't work for AMD or anything,
but it was enough to convince me to change my purchasing preference.

If you don't mind using up a bunch of ram, you can force Lua objects to
doubleword align. That will increase memory usage by up to one-third.
There's a configuration option in luaconf.h which uses "double" as an
alignment, but on x86, doubles are only 4-byte aligned so that doesn't
help. You can add a double-word aligned object to the union (depends on
your compiler how you do that.) You could also try making a lua number
an extended double, if your compiler supports that; I haven't tried
that but it should work on x86, where extended doubles are supposed
to occupy 12 bytes, even though they only use 10. Alternatively, you
can use floats, which reduces memory consumption, but it also loses
a lot of precision, particularly for integers.
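
As a rough sketch of the "add a double-word aligned object to the union"
idea, here is one way to do it with C11's _Alignas. The type names below
are hypothetical stand-ins, not Lua's real definitions, and older
compilers would need their own extension (for example an "aligned"
attribute) instead.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* Hypothetical stand-in for a Lua-like value union. The last member
   exists only to force the whole union to 8-byte alignment. */
typedef union {
    void *p;
    double n;
    int b;
    _Alignas(8) char force_align[8];
} AlignedValue;

/* Value plus type tag. With an 8-byte aligned value, the compiler pads
   the struct out to a multiple of 8 bytes. */
typedef struct {
    AlignedValue value;
    int tt;
} AlignedTValue;

/* Compile-time checks of the properties this sketch relies on. */
static_assert(_Alignof(AlignedValue) % 8 == 0,
              "value union is double-word aligned");
static_assert(sizeof(AlignedTValue) % 8 == 0,
              "array elements stay double-word aligned");
```

Because every element's size is then a multiple of 8, a double at the
start of an element in a suitably aligned array can never cross a
cache-line boundary; the padding is where the extra memory goes.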

On freebsd, where malloc provides predictable alignment, you can simply
not use every sixth element of a table. However, you have to figure out
which element not to use, and it won't work on mallocs which are less
power-of-2 oriented anyway. Also, that doesn't help you with numbers
stored on the stack.

The problem, in summary, is this: there is no penalty on a pentium for
doubles which are not aligned, unless the double spans
two cache-lines. In that case, there is an enormous penalty (which
depends on the chip.) There are some timing tests (quite old) here:
http://www.cygnus-software.com/papers/infinitytestresults.html
but I don't think things have changed that much.
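
To make the "spans two cache-lines" condition concrete, here is a small
sketch that checks which elements of an array would hold a straddling
double. The function names are made up for illustration, and the 12-byte
element size and 64-byte line size used in the note below are
assumptions (12 bytes matches the one-third growth figure above); real
values depend on the build and the chip.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if an object of `size` bytes starting at byte offset `offset`
   crosses a cache-line boundary, 0 otherwise. */
int spans_cache_line(size_t offset, size_t size, size_t line_size)
{
    assert(size > 0 && line_size > 0);
    return offset / line_size != (offset + size - 1) / line_size;
}

/* Counts how many of `n` array elements hold a straddling 8-byte double,
   assuming the double sits at the start of each `elem_size`-byte element
   and the array begins `base` bytes past the start of a cache line. */
size_t count_straddling(size_t base, size_t elem_size, size_t n,
                        size_t line_size)
{
    size_t i, count = 0;
    for (i = 0; i < n; i++)
        count += (size_t)spans_cache_line(base + i * elem_size, 8, line_size);
    return count;
}
```

With 12-byte elements and 64-byte lines, the double at offset 60 is the
one that crosses a boundary; padding elements to 16 bytes makes the
count drop to zero, and shifting the base moves the slow element to a
different index.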