[BUG] Min and Max aggregations involving `infinity` produce incorrect results #11352

ttnghia · 2022-07-26T16:03:58Z

Currently, min and max aggregations for numeric types first initialize the result with an identity value and then apply a device operator. The identity is:

cuda::std::numeric_limits<T>::max(), if the aggregation is min, and
cuda::std::numeric_limits<T>::lowest(), if the aggregation is max.

When the input is floating-point data that contains infinity, min and max are computed incorrectly, because min(max, infinity) == max and max(lowest, -infinity) == lowest. In other words, for floating-point types lowest() is still greater than -infinity and max() is still less than infinity, so neither value is a true identity for its operation. For example, the min of a column that contains only infinity comes out as max() instead of infinity.
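To make the failure concrete, here is a minimal host-side sketch. It is not libcudf code: `fold_min` and `fold_max` are hypothetical helpers that stand in for the reduction, and `std::numeric_limits` is used in place of `cuda::std::numeric_limits` so that it compiles without CUDA.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Hypothetical helper (not part of libcudf): fold `values` with std::min,
// starting from the caller-supplied initial value `init`.
template <typename T>
T fold_min(std::vector<T> const& values, T init)
{
  T result = init;
  for (T const v : values) { result = std::min(result, v); }
  return result;
}

// Hypothetical helper (not part of libcudf): same as above, with std::max.
template <typename T>
T fold_max(std::vector<T> const& values, T init)
{
  T result = init;
  for (T const v : values) { result = std::max(result, v); }
  return result;
}
```

With a `std::vector<double>` holding only `+infinity`, `fold_min` seeded with `max()` returns `max()`, while seeding it with `infinity()` returns `+infinity` as expected. The mirrored case holds for `fold_max` with `lowest()` versus `-infinity()`.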

We should specialize the identity value for floating-point numbers.


bdice · 2022-07-26T16:43:59Z

cudf/cpp/include/cudf/detail/utilities/device_operators.cuh

Lines 168 to 177 in 8faf2b0

    
  template <typename T,
            std::enable_if_t<!std::is_same_v<T, cudf::string_view> && !cudf::is_dictionary<T>() &&
                             !cudf::is_fixed_point<T>()>* = nullptr>
  static constexpr T identity()
  {
    // chrono types do not have std::numeric_limits specializations and should use T::min()
    // https://eel.is/c++draft/numeric.limits.general#6
    if constexpr (cudf::is_chrono<T>()) return T::min();
    return cuda::std::numeric_limits<T>::lowest();
  }

This should probably read something like:

  template <typename T,
            std::enable_if_t<!std::is_same_v<T, cudf::string_view> && !cudf::is_dictionary<T>() &&
                             !cudf::is_fixed_point<T>()>* = nullptr>
  static constexpr T identity()
  {
    // chrono types do not have std::numeric_limits specializations and should use T::min()
    // https://eel.is/c++draft/numeric.limits.general#6
    if constexpr (cudf::is_chrono<T>()) {
      return T::min();
    } else if constexpr (cuda::std::numeric_limits<T>::has_infinity) {
      return -cuda::std::numeric_limits<T>::infinity();
    }
    return cuda::std::numeric_limits<T>::lowest();
  }

and similarly for the min identity with +inf.
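For reference, both identities could look roughly like this as a standalone host-side sketch. This is an illustration only: `min_identity` and `max_identity` are hypothetical free functions rather than the actual members in `device_operators.cuh`, `std::numeric_limits` stands in for `cuda::std::numeric_limits`, and the chrono, string, dictionary, and fixed-point handling is left out. It assumes C++17.

```cpp
#include <limits>

// Hypothetical stand-in for the min operator's identity:
// +infinity when T has one, otherwise the largest finite value.
template <typename T>
constexpr T min_identity()
{
  if constexpr (std::numeric_limits<T>::has_infinity) {
    return std::numeric_limits<T>::infinity();
  } else {
    return std::numeric_limits<T>::max();
  }
}

// Hypothetical stand-in for the max operator's identity:
// -infinity when T has one, otherwise the lowest finite value.
template <typename T>
constexpr T max_identity()
{
  if constexpr (std::numeric_limits<T>::has_infinity) {
    return -std::numeric_limits<T>::infinity();
  } else {
    return std::numeric_limits<T>::lowest();
  }
}

// Integer types keep their previous identities.
static_assert(min_identity<int>() == std::numeric_limits<int>::max());
static_assert(max_identity<int>() == std::numeric_limits<int>::lowest());

// Floating-point types now use +/-infinity.
static_assert(min_identity<double>() == std::numeric_limits<double>::infinity());
static_assert(max_identity<float>() == -std::numeric_limits<float>::infinity());
```

Branching on `has_infinity` rather than `std::is_floating_point` follows the suggestion above and leaves integer behavior unchanged.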

… in device operators `min` and `max` (#11357)

This fixes a bug of device operators `min` and `max` in generating the `identity` value for floating-point numbers. In particular:

* `min::identity()` should return `cuda::std::numeric_limits<T>::infinity()` instead of `cuda::std::numeric_limits<T>::max()`, and
* `max::identity()` should return `-cuda::std::numeric_limits<T>::infinity()` instead of `cuda::std::numeric_limits<T>::lowest()`.

Closes #11352.

Authors:
- Nghia Truong (https://github.com/ttnghia)

Approvers:
- Bradley Dice (https://github.com/bdice)
- Yunsong Wang (https://github.com/PointKernel)
- Mike Wilson (https://github.com/hyperbolic2346)

URL: #11357

ttnghia added the bug and Needs Triage labels Jul 26, 2022

ttnghia self-assigned this Jul 26, 2022

bdice added the libcudf label and removed the Needs Triage label Jul 26, 2022

abellina mentioned this issue Jul 26, 2022

[BUG] improve min/max test coverage for infinity and -infinity NVIDIA/spark-rapids#6095

Open

ttnghia mentioned this issue Jul 26, 2022

Set +/-infinity as the identity values for floating-point numbers in device operators min and max #11357

Merged

rapids-bot bot closed this as completed in #11357 Jul 28, 2022
