Following up to myself:
A combined inline assembler replacement may speed up the code quite
a bit. Does anyone know the fastest way on x86 to determine whether
a double fits into an int?
Well, it's simpler than I thought: the FPU already stores overflow and
precision loss bits in the status word. You just have to check them
after the fistpl instruction.
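
For builds without GCC inline assembler, the same question (is the value
in range, and was anything lost in the conversion?) can be answered in
plain C. This is only a minimal sketch and not the code in the patch: it
does not read the FPU status word, it uses a range test plus a round-trip
comparison instead. The name double_to_int_exact is made up for
illustration, and the sketch assumes that every int is exactly
representable as a double (true for 32-bit int).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, not taken from getnum.patch.
 * Returns 1 and stores the converted value in *out if d is an integer
 * that fits into an int; returns 0 and leaves *out untouched otherwise. */
int double_to_int_exact(double d, int *out)
{
    int i;

    assert(out != NULL);

    /* Range test: corresponds to the "overflow" case. It also rejects
     * NaN, because every comparison involving NaN is false. */
    if (!(d >= (double)INT_MIN && d <= (double)INT_MAX))
        return 0;

    i = (int)d;  /* d is in range, so this cast is well defined */

    /* Round-trip test: corresponds to the "precision loss" case.
     * A fractional part would have been truncated by the cast. */
    if ((double)i != d)
        return 0;

    *out = i;
    return 1;
}
```

A caller would try this first and fall back to the generic double path
whenever it returns 0. Note that -0.0 is accepted and converted to 0.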
                  -O     -O2    -O3 -fomit-frame-pointer
lua-5.0.2         3.30   3.40   3.60
lua-5.1-work0     3.72   4.30   3.44
lua-5.1-work2     4.03   4.18   3.50

Trying the same tests with the appended patch:

lua-5.1-work2a    3.82   4.07   3.28
Looks better now.
There are plenty more optimization opportunities in ltable.c and lvm.c,
but I'll leave that to someone with an immediate need. Ditto for the
conversion of the GCC assembler macros to MSVC syntax.
Bye,
Mike
<getnum.patch>