Floating point equality #185

twelsby · 2016-01-21T15:22:10Z

I made a brief comment about some gcc's paranoid float equality warnings in issue #167 but I feel I need to draw attention to this as a separate matter, specifically the function static bool approx(const T a, const T b), which may not be doing what it is intended to do. If this function is intended as nothing more than a way to suppress gcc's compiler warning then it is achieving its objective and you can close this issue. But if it is intended to avoid perceived pitfalls of if(float == float) type statements then it is not doing that.

Try running this code:

#include <random>
#include <iostream>
#include <float>

int main()
{
    std::mt19937 eng;  // a core engine class 
    float halfrange = 100 * std::numeric_limits<float>::epsilon();
    std::uniform_real_distribution<float> dist(1-halfrange, 1+halfrange);
    int agree=0, disagree=0;
    for (int i = 0; i < 1000000000; ++i) {
        float a = dist(eng);
        float b = dist(eng);
        bool method_1 = not (a > b or a < b);
        bool method_2 = (a == b);
        if (method_1 or method_2) {
            if (method_1 == method_2) agree++;
            else disagree++;
            std::cout << "\rIteration: " << i << " Agree: " << agree << " Disagree: " << disagree;
        }
    }
}

Basically this generates pairs of floating point numbers in a very small range about 1 and then compares them for equality using both the == method and the one in approx. The small range is to ensure collisions occur frequently. If either method says they are equal then it increments one of two counters depending on whether they both agree or disagree with the equality. Let it run as long as you like - you will never see disagreement.

The only way I know of to achieve what it is trying to achieve is to determine the difference between the two floats and see if it is less than some arbitrary acceptable error (usually based on epsilon).

But I don't actually think we should be doing that.

The concern about if(float == float) seems to be based around the idea that floating point numbers are somehow indeterminate or uncertain but this is not true - they are fundamentally as determinate as integers - provided the storage and rounding rules are consistent, and that is what IEEE is all about. If I assign a particular floating point number to a variable then the resulting floating point representation should be the same as long as the storage and rounding rules are the same. If I move it to another memory location it will stay the same and I can compare it with the original for equality and it will be equal (assuming the type doesn't change during this process). The x87 instruction set even includes instructions for testing equality.

The alleged inaccuracies in floating point numbers arise from, in my opinion, a misunderstanding about how they are stored due to the assumption that terminating floating point decimal numbers will also be terminating floating point binary numbers. The most oft-quoted example is 0.1₁₀, which in binary is the non-terminating 0.0001100110011...₂.

If you were under the mistaken belief that 0.1₁₀ can be represented in a terminating binary floating point number then you would expect that adding it to 0.8₁₀ would produce 0.9₁₀ and that if you assigned 0.9₁₀ to another variable and compared them that they would be equal. This is not the case because all three numbers incur rounding.

Apply the same logic to non-terminating decimal numbers and the problem becomes obvious. One non-terminating decimal number is 1/3, which can be approximated as 0.33 (with as many digits as you like). Most people would be aware that 3 * 1/3 = 1 but would also not be surprised that 3 * 0.33 does not equal 1 (no matter how many digits you use), because of the rounding. Do the same in binary and the result seems indeterminate.

This is what is "unsafe" about floating point equality and if this scenario would cause problems then that is when you would apply the method I mentioned.

Now none of this should concern us because nowhere are we doing actual floating point calculations. We just take floating point numbers store them, retrieve them and compare them etc. Even the cast with approx comparison (for the loss of precision check) doesn't do anything that would trip up.

Really it should be up the the user of the library to take into account rounding because they know how precisely their data is defined, and how close they want their numbers to be before they are considered equal, and the rounding settings they have applied. Users should be able to expect that if operator== says two floating point numbers are equal that they are equal, not merely close. It is their responsibility, if storing a literal or the result of a calculation they have done, to take into account the rounding that may have occurred, the same as if they were using regular variables.

The warnings that would result can be suppressed with a #pragma.

The text was updated successfully, but these errors were encountered:

nlohmann · 2016-01-22T09:26:23Z

Thanks @twelsby, I originally wanted to suppress a compiler warning, and after some reading I found the approx function on StackOverflow. I never cared about the issue, and until your comment, nobody else seemed to care either.

Thanks for having this issue fixed. Unless anyone objects, I shall merge this into the code.

twelsby · 2016-01-22T14:41:51Z

Sorry I probably went a bit too far with that. To be honest the existing code was not harmful and any performance difference improvement would be well beyond splitting hairs and probably nobody does care. Maybe this is just my pet peeve.

nlohmann · 2016-01-22T18:18:16Z

As I wrote in #183, I think adding #pragmas and #ifdefs is quite an ugly solution...

twelsby · 2016-01-23T07:57:59Z

Agreed, although I still think this is the best option. See my comments in #183. The other thing is that although it does use pre-processor directives at least it doesn't contaminate the namespace with any #define's.

I have split this off from pull #183 to its own pull #190 so that they can be more easily considered independently.

nlohmann · 2016-01-24T11:55:59Z

Thanks!

twelsby mentioned this issue Jan 22, 2016

Extending support to full uint64_t/int64_t range - issue #178 (Integer conversion to unsigned) #183

Closed

nlohmann added state: please discuss please discuss the issue or vote for your favorite option improvement labels Jan 22, 2016

nlohmann added this to the Release 1.0.1 milestone Jan 22, 2016

twelsby pushed a commit to twelsby/json that referenced this issue Jan 23, 2016

Issue nlohmann#185 - remove approx() and use #pragma to kill warnings

3a14034

twelsby mentioned this issue Jan 23, 2016

Issue #185 - remove approx() and use #pragma to kill warnings #190

Merged

nlohmann closed this as completed in e00b00f Jan 24, 2016

twelsby mentioned this issue Jan 24, 2016

Small error in pull #185 #194

Closed

nlohmann removed the state: please discuss please discuss the issue or vote for your favorite option label Nov 24, 2016

cburlacu mentioned this issue Aug 22, 2017

Floating point equality - revisited #703

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Floating point equality #185

Floating point equality #185

twelsby commented Jan 21, 2016

nlohmann commented Jan 22, 2016

twelsby commented Jan 22, 2016

nlohmann commented Jan 22, 2016

twelsby commented Jan 23, 2016

nlohmann commented Jan 24, 2016

Floating point equality #185

Floating point equality #185

Comments

twelsby commented Jan 21, 2016

nlohmann commented Jan 22, 2016

twelsby commented Jan 22, 2016

nlohmann commented Jan 22, 2016

twelsby commented Jan 23, 2016

nlohmann commented Jan 24, 2016