Describe the bug
The row range counts can overflow and produce incorrect results in window operations.
Steps/Code to reproduce bug
When I run a Spark query like
on the GPU I get a result like
but on the CPU it is all 10s (no nulls).
It gets translated into a `cudf::grouped_rolling_window` with no group-by keys and a single column containing ten 1s. Preceding and following are set to `Scalar{type=INT32 value=2147483641}` and `Scalar{type=INT32 value=2147483642}`, respectively. I tried it with INT64 and it didn't change the results. I will try to turn this into a C++ test; just waiting for things to build.

It looks very much like the row-number checks are all done as ints rather than longs, so the arithmetic overflows and we get incorrect results. If I adjust the following value up or down, more or fewer rows get a correct answer, which strongly hints at an overflow.
Expected behavior
We get the right answer and I don't have to think about overflow. Otherwise I am going to have to fall back to the CPU for any window query with rows that are larger than 1/2 of Int.MaxValue.