[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Some statistical data on double rounding
- From: KHMan <keinhong@...>
- Date: Fri, 8 Jun 2018 17:47:40 +0800
Hi all,
(I will avoid opinions in the following. Y'all are free to decide
whatever y'all wish to do. :-p)
Tests were done on generic MinGW binaries of lua-5.4.0-work1. The
32-bit binary has the double-rounding issue due to extended
precision that was discussed in the past few days. The 64-bit
binary uses the 64-bit FPU datapath and is free of double-rounding.
Tests were done on the output of math.random(), which is between 0
and 1. Any zeros are skipped. The first tests are A+B, A-B, A*B
and A/B. A and B do not change for the 4 ops. The number of
different results between the 32-bit binary and the 64-bit binary
is counted.
Operation / Op_Count / Diff_Count
=================================
A+B 1,000,000,000 0
A-B 1,000,000,000 0
A*B 1,000,000,000 243,584
A/B 1,000,000,000 244,107
For A*B and A/B, this is roughly 1 in 4000.
All different results differ by 1 ULP. Nothing more than 1 ULP was
seen. Here is a look at one result:
Operation : *
n1 = 0.18987016622488206 0x1.84daa65360288p-3
n2 = 0.98339548253950804 0x1.f77f9cd91528bp-1
32b = 0.18671746373457448 0x1.7e65b9c2a810ep-3
64b = 0.18671746373457451 0x1.7e65b9c2a810fp-3
Plugging the values into an online calculator that has binary128
[1], we can do a check with a 'better' result, assuming n1 and n2
is precise to many more digits. The comparison is as follows:
fpu8087 = 0.18671746373457448
binary128 = 0.1867174637345744950213132394217624
fpu64bit = 0.18671746373457451
So results that differ have intermediate results that are almost
exactly in-between, this is why there is a double-rounding issue.
A casual check with a few other numbers show the same trend.
However, note that those decimal representations of the 64-bit
floats are not exactly equal to the exact value of the binary
representation -- they just have enough digits for round-tripping.
Here are the hex representations of the significands:
fpu8087 = 7e65b9c2a810e
binary128 = 7E65B9C2A810E5EAB7A33195161D
fpu64bit = 7e65b9c2a810f
Using this comparison, it appears that the binary128 result is
nearer to fpu8087 than to fpu64bit. We can also verify this by
getting a more accurate decimal version of the 64-bit binary
representation (done by manual fiddling, so not definitive):
fpu8087 = 0.1867174637345744847571893387794262
binary128 = 0.1867174637345744950213132394217624
fpu64bit = 0.1867174637345745125127649544083397
Thus it is confirmed that the fpu8087 result is closer to the
binary128 result, at least for this example.
This ends the first part.
In order to generate hits for A+B and A-B, the magnitude of B is
changed. B is divided by 1e3, 1e6, 1e9 or 1e12, and then the
addition is performed and the comparison made.
Operation / Op_Count / Diff_Count
=================================
A+(B/1e3) 1,000,000,000 0
A+(B/1e6) 1,000,000,000 0
A+(B/1e9) 1,000,000,000 244,158
A+(B/1e12) 1,000,000,000 243,489
Now we can see the double-rounding differences. For the last two
operations, the rate is also roughly 1 in 4000. Again, all
different results differ by 1 ULP. Nothing more than 1 ULP was seen.
This ends the second part.
[1] http://weitz.de/ieee/
--
Cheers,
Kein-Hong Man (esq.)
Selangor, Malaysia