Hi all,

(I will avoid opinions in the following. Y'all are free to decide whatever y'all wish to do. :-p)

Tests were done on generic MinGW builds of lua-5.4.0-work1. The 32-bit binary has the double-rounding issue, due to extended precision, that was discussed in the past few days. The 64-bit binary uses the 64-bit FPU datapath and is free of double rounding.

Tests were done on the output of math.random(), which lies between 0 and 1; any zeros are skipped. The first tests are A+B, A-B, A*B and A/B, with the same A and B used for all four operations. The count is the number of results that differ between the 32-bit binary and the 64-bit binary.

Operation / Op_Count / Diff_Count
=================================
A+B  1,000,000,000  0
A-B  1,000,000,000  0
A*B  1,000,000,000  243,584
A/B  1,000,000,000  244,107

For A*B and A/B, this is roughly 1 in 4,100.
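
The rates can be worked out directly from the table. The sketch below uses Python purely as a calculator; the counts are copied from the table, and the helper name one_in is made up for illustration. The comparison with 2**12 is a heuristic and an assumption, not something measured here: extended precision carries 64 - 53 = 11 extra significand bits, and roughly half of the near-ties would be expected to fall on the side where the second rounding goes the other way.

```python
# Counts copied from the table above.
OPS = 1_000_000_000
DIFFS = {"A*B": 243_584, "A/B": 244_107}

def one_in(count: int, total: int = OPS) -> float:
    """Return N such that the observed rate is '1 in N'."""
    return total / count

for name, count in DIFFS.items():
    print(f"{name}: 1 in {one_in(count):.0f}")

# Heuristic reference point (an assumption, see the text above).
print("2**12 =", 2**12)
```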

Every differing result differs by exactly 1 ULP. Nothing more than 1 ULP was seen. Here is a look at one result:

Operation : *
n1  = 0.18987016622488206	0x1.84daa65360288p-3
n2  = 0.98339548253950804	0x1.f77f9cd91528bp-1
32b = 0.18671746373457448	0x1.7e65b9c2a810ep-3
64b = 0.18671746373457451	0x1.7e65b9c2a810fp-3
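
One way to check the "1 ULP" claim for a pair like this is to reinterpret the doubles as integers, since adjacent doubles of the same sign have adjacent bit patterns. This is a minimal sketch in Python, not the method used for the tests above; the function names are made up for illustration, and the two values are the hex results printed above.

```python
import struct

def ordinal(x: float) -> int:
    """Map a finite double to an integer so that adjacent doubles differ by 1."""
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    # Negative doubles sort in reverse bit order; mirror them around zero.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)

def ulp_distance(a: float, b: float) -> int:
    """Number of representable doubles between a and b (0 if equal)."""
    return abs(ordinal(a) - ordinal(b))

r32 = float.fromhex("0x1.7e65b9c2a810ep-3")  # 32-bit binary result
r64 = float.fromhex("0x1.7e65b9c2a810fp-3")  # 64-bit binary result
print(ulp_distance(r32, r64))  # 1
```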

Plugging the values into an online calculator that has binary128 [1], we can do a check with a 'better' result, assuming n1 and n2 are precise to many more digits. The comparison is as follows:

fpu8087   = 0.18671746373457448
binary128 = 0.1867174637345744950213132394217624
fpu64bit  = 0.18671746373457451

So the results that differ come from intermediate results that lie almost exactly halfway between two adjacent doubles, which is why double rounding changes the outcome. A casual check with a few other numbers shows the same trend.
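
The double-rounding effect on this example can be reproduced with exact rational arithmetic. The sketch below is a model, not the FPU itself: it assumes the 32-bit binary rounds the exact product first to a 64-bit significand and then to 53 bits, while the 64-bit binary rounds once to 53 bits. It takes the hex forms of n1 and n2 above as the exact operands and ignores exponent-range limits. round_to_bits is a helper written for this illustration.

```python
from fractions import Fraction

def round_to_bits(x: Fraction, bits: int) -> Fraction:
    """Round positive x to `bits` significant bits, ties to even."""
    # Pick e so that x / 2**e lands in [2**(bits-1), 2**bits).
    e = x.numerator.bit_length() - x.denominator.bit_length() - bits
    scaled = x / Fraction(2) ** e
    if scaled >= 2 ** bits:
        e += 1
        scaled /= 2
    # round() on a Fraction rounds halfway cases to the even integer.
    return round(scaled) * Fraction(2) ** e

n1 = float.fromhex("0x1.84daa65360288p-3")
n2 = float.fromhex("0x1.f77f9cd91528bp-1")
exact = Fraction(n1) * Fraction(n2)  # exact product of the two doubles

direct = round_to_bits(exact, 53)                    # one rounding
twice = round_to_bits(round_to_bits(exact, 64), 53)  # 64 bits, then 53

print("direct:", float(direct).hex())  # 0x1.7e65b9c2a810fp-3
print("twice: ", float(twice).hex())   # 0x1.7e65b9c2a810ep-3
```

In this model the single rounding reproduces the 64b value and the two-step rounding reproduces the 32b value: the first rounding lands exactly on the halfway point, and the tie then goes to the even significand.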

However, note that these decimal representations of the 64-bit floats do not equal the exact values of the binary representations; they only have enough digits for round-tripping. Here are the hex representations of the significands:

fpu8087   = 7e65b9c2a810e
binary128 = 7E65B9C2A810E5EAB7A33195161D
fpu64bit  = 7e65b9c2a810f

Using this comparison, it appears that the binary128 result is nearer to fpu8087 than to fpu64bit. We can also verify this by getting a more accurate decimal version of the 64-bit binary representation (done by manual fiddling, so not definitive):

fpu8087   = 0.1867174637345744847571893387794262
binary128 = 0.1867174637345744950213132394217624
fpu64bit  = 0.1867174637345745125127649544083397

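Thus the fpu8087 result is closer to the binary128 result, at least for this example. Note that this holds under the assumption made earlier, namely that the binary128 calculation treats the 17-digit decimal forms of n1 and n2 as exact inputs; those decimals are not identical to the values the two doubles actually hold.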
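
The manual fiddling can be avoided, because converting a double to Python's Decimal is exact. The sketch below prints the exact decimal values of the two results. It also prints the exact product of the two doubles, taking the hex forms of n1 and n2 as the operands. The binary128 string is copied from the comparison above. With these inputs the exact product of the doubles lands just above the midpoint of the two results, whereas the calculator's figure lands below it, presumably because the calculator started from the rounded decimal inputs.

```python
from decimal import Decimal, getcontext

getcontext().prec = 120  # ample: the exact product needs fewer than 110 digits

n1 = float.fromhex("0x1.84daa65360288p-3")
n2 = float.fromhex("0x1.f77f9cd91528bp-1")
r32 = float.fromhex("0x1.7e65b9c2a810ep-3")
r64 = float.fromhex("0x1.7e65b9c2a810fp-3")

d32 = Decimal(r32)  # Decimal(float) is an exact conversion
d64 = Decimal(r64)
calc = Decimal("0.1867174637345744950213132394217624")  # from the post
exact = Decimal(n1) * Decimal(n2)  # exact product of the two doubles
midpoint = (d32 + d64) / 2

for label, value in [("fpu8087", d32), ("binary128", calc),
                     ("midpoint", midpoint), ("exact", exact),
                     ("fpu64bit", d64)]:
    print(f"{label:<9} = {value:.40f}")
```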

This ends the first part.

In order to generate hits for A+B and A-B, the magnitude of B is changed. B is divided by 1e3, 1e6, 1e9 or 1e12, and then the addition is performed and the comparison made.

Operation / Op_Count / Diff_Count
=================================
A+(B/1e3)   1,000,000,000  0
A+(B/1e6)   1,000,000,000  0
A+(B/1e9)   1,000,000,000  244,158
A+(B/1e12)  1,000,000,000  243,489

Now we can see the double-rounding differences. For the last two operations, the rate is also roughly 1 in 4,100. Again, every differing result differs by exactly 1 ULP. Nothing more than 1 ULP was seen.

This ends the second part.

[1] http://weitz.de/ieee/

--
Cheers,
Kein-Hong Man (esq.)
Selangor, Malaysia