lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]

On 26-Dec-06, at 11:51 PM, Joe Smith wrote:

"Luiz Henrique de Figueiredo" <> wrote in message">
Directly compairing floating point numbers for equality is *always* a bad

Only if the numbers are supposed to be the same number but are the result
of two different computations.

For example, you can find that a floating point value which is not a
NaN is not equal to itself.

I find this very surprising. And possibily wrong, ie against IEEE 754.
The explanation that followed made no sense to me. It would make sense for floating point values held in *different* variables, but not for the *same* variable. It seems to me that the gcc docs is spreading unnecessary FUD.
But I must be missing something...

I've heard this same explanation elsewere too. (Somebody saw this problem in production code). Apparently, the registers used in the FPU are oversized, and thus calculations performed there-in have morie bits than the datatype should. Thus when compairing a value in the register to the truncated value in the memory, they can be non-equal. I lack access to the IEEE 754 standard, and cannot make heads or tails of much of the ISO C standard when it talks about floating point numbers, so I'm not certain if this behavior violates either standard, but I've seen this documented more than once.

On the other hand, it is highly unlikely that the code listed would ever be affected by this, but it is still troubling.

I can't see how (x == x) could yield the incorrect answer. The warning that a number may not be equal to "itself" should probably be interpreted as meaning that two different computations of the same value, even if they are textually identical, may result in different answers. Consequently, the code:

double q;
q = 3.0/7.0
if (q == 3.0/7.0) printf("Equal!\n");
else printf("Not equal!\n");

might print either "Equal!" or "Not equal!". However, note that q is not compared to itself, but rather compared to a similar computation. (As the reference below mentions, the result may be different with different compiler settings; unless optimization is disabled, most C compilers would compute the value 3.0/7.0 at compile time.)

The example was taken from <>, and that link in turn comes from <>. Those are both worth reading, and can be found in various places with various URLs.

To cut to the chase, though, the particular behaviour (use of 80-bit internal precision on x86) is part of the floating point status; it can be set by (non-standard) library functions. Different OSs provide different default floating point modes. While it would be nice if standard numerical libraries worked regardless of the floating point mode, and deviation from this could well be considered a bug, the fact remains that many library implementations will fail in various ways if the floating point mode is not as they expect it to be.

Regardless, the use of floating point to store integers is not at all problematic, except possibly for the lack of an integer divide primitive. As long as a computation (and all intermediate computations) can be represented exactly in 53 bits, the result will be entirely predictable regardless of floating point mode.

It is also worth mentioning that integer arithmetic is subject to similar inconsistencies. In particular, it's not at all uncommon to see code like this:

Foo *make_foo_vector(size_t n) {
  if (n * sizeof(Foo) > MAX_ALLOC)
    return NULL; /* Or issue an error */
  return malloc(n * sizeof(Foo));

This is, of course, incorrect and may result in the successful return of a vector of less than n Foo's, which may in turn lead to a buffer-overrun. On machines where size_t is <= 32 bits, the code could be corrected by doing the computation at line 2 in double-precision floating point, although I suppose the correct solution is to divide instead of multiplying.

As another illustration of the perils of integer arithmetic, consider the following incorrect code for verifying that f(n) == g(n) for all values of n:

int check_all_values(int (*f)(int), int (*g)(int)) {
  int n;
  for (n = INT_MIN; n <= INT_MAX; n++) {
    if (f(n) != g(n)) return 0;
  return 1;