Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Floating point equality #185

Closed
twelsby opened this issue Jan 21, 2016 · 5 comments
Closed

Floating point equality #185

twelsby opened this issue Jan 21, 2016 · 5 comments
Milestone

Comments

@twelsby
Copy link
Contributor

twelsby commented Jan 21, 2016

I made a brief comment about some gcc's paranoid float equality warnings in issue #167 but I feel I need to draw attention to this as a separate matter, specifically the function static bool approx(const T a, const T b), which may not be doing what it is intended to do. If this function is intended as nothing more than a way to suppress gcc's compiler warning then it is achieving its objective and you can close this issue. But if it is intended to avoid perceived pitfalls of if(float == float) type statements then it is not doing that.

Try running this code:

#include <random>
#include <iostream>
#include <float>

int main()
{
    std::mt19937 eng;  // a core engine class 
    float halfrange = 100 * std::numeric_limits<float>::epsilon();
    std::uniform_real_distribution<float> dist(1-halfrange, 1+halfrange);
    int agree=0, disagree=0;
    for (int i = 0; i < 1000000000; ++i) {
        float a = dist(eng);
        float b = dist(eng);
        bool method_1 = not (a > b or a < b);
        bool method_2 = (a == b);
        if (method_1 or method_2) {
            if (method_1 == method_2) agree++;
            else disagree++;
            std::cout << "\rIteration: " << i << " Agree: " << agree << " Disagree: " << disagree;
        }
    }
}

Basically this generates pairs of floating point numbers in a very small range about 1 and then compares them for equality using both the == method and the one in approx. The small range is to ensure collisions occur frequently. If either method says they are equal then it increments one of two counters depending on whether they both agree or disagree with the equality. Let it run as long as you like - you will never see disagreement.

The only way I know of to achieve what it is trying to achieve is to determine the difference between the two floats and see if it is less than some arbitrary acceptable error (usually based on epsilon).

But I don't actually think we should be doing that.

The concern about if(float == float) seems to be based around the idea that floating point numbers are somehow indeterminate or uncertain but this is not true - they are fundamentally as determinate as integers - provided the storage and rounding rules are consistent, and that is what IEEE is all about. If I assign a particular floating point number to a variable then the resulting floating point representation should be the same as long as the storage and rounding rules are the same. If I move it to another memory location it will stay the same and I can compare it with the original for equality and it will be equal (assuming the type doesn't change during this process). The x87 instruction set even includes instructions for testing equality.

The alleged inaccuracies in floating point numbers arise from, in my opinion, a misunderstanding about how they are stored due to the assumption that terminating floating point decimal numbers will also be terminating floating point binary numbers. The most oft-quoted example is 0.110, which in binary is the non-terminating 0.0001100110011...2.

If you were under the mistaken belief that 0.110 can be represented in a terminating binary floating point number then you would expect that adding it to 0.810 would produce 0.910 and that if you assigned 0.910 to another variable and compared them that they would be equal. This is not the case because all three numbers incur rounding.

Apply the same logic to non-terminating decimal numbers and the problem becomes obvious. One non-terminating decimal number is 1/3, which can be approximated as 0.33 (with as many digits as you like). Most people would be aware that 3 * 1/3 = 1 but would also not be surprised that 3 * 0.33 does not equal 1 (no matter how many digits you use), because of the rounding. Do the same in binary and the result seems indeterminate.

This is what is "unsafe" about floating point equality and if this scenario would cause problems then that is when you would apply the method I mentioned.

Now none of this should concern us because nowhere are we doing actual floating point calculations. We just take floating point numbers store them, retrieve them and compare them etc. Even the cast with approx comparison (for the loss of precision check) doesn't do anything that would trip up.

Really it should be up the the user of the library to take into account rounding because they know how precisely their data is defined, and how close they want their numbers to be before they are considered equal, and the rounding settings they have applied. Users should be able to expect that if operator== says two floating point numbers are equal that they are equal, not merely close. It is their responsibility, if storing a literal or the result of a calculation they have done, to take into account the rounding that may have occurred, the same as if they were using regular variables.

The warnings that would result can be suppressed with a #pragma.

@nlohmann
Copy link
Owner

Thanks @twelsby, I originally wanted to suppress a compiler warning, and after some reading I found the approx function on StackOverflow. I never cared about the issue, and until your comment, nobody else seemed to care either.

Thanks for having this issue fixed. Unless anyone objects, I shall merge this into the code.

@nlohmann nlohmann added state: please discuss please discuss the issue or vote for your favorite option improvement labels Jan 22, 2016
@nlohmann nlohmann added this to the Release 1.0.1 milestone Jan 22, 2016
@twelsby
Copy link
Contributor Author

twelsby commented Jan 22, 2016

Sorry I probably went a bit too far with that. To be honest the existing code was not harmful and any performance difference improvement would be well beyond splitting hairs and probably nobody does care. Maybe this is just my pet peeve.

@nlohmann
Copy link
Owner

As I wrote in #183, I think adding #pragmas and #ifdefs is quite an ugly solution...

@twelsby
Copy link
Contributor Author

twelsby commented Jan 23, 2016

Agreed, although I still think this is the best option. See my comments in #183. The other thing is that although it does use pre-processor directives at least it doesn't contaminate the namespace with any #define's.

I have split this off from pull #183 to its own pull #190 so that they can be more easily considered independently.

@nlohmann
Copy link
Owner

Thanks!

@nlohmann nlohmann removed the state: please discuss please discuss the issue or vote for your favorite option label Nov 24, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants