-
Notifications
You must be signed in to change notification settings - Fork 230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decimal32 support #1717
Decimal32 support #1717
Conversation
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/decimalExpressions.scala
Outdated
Show resolved
Hide resolved
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
3a0a098
to
8702bef
Compare
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't have time to finish all of the review, but I thought I would get some of my comments in for today.
sql-plugin/src/main/java/org/apache/spark/sql/catalyst/CudfUnsafeRow.java
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/DecimalUtil.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
build |
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall it is looking okay. I am a bit concerned about all of the extra code in the binary operators for decimal. It just feels like there should be a simpler way to do this. Especially with type erasure in cudf. It mostly feels like we just need to check if the output needs to be 64-bit and if it does then we don't need to check any of the children, because a no-op cast is cheap.
...311/src/main/scala/com/nvidia/spark/rapids/shims/spark311/ParquetCachedBatchSerializer.scala
Show resolved
Hide resolved
...311/src/main/scala/org/apache/spark/sql/rapids/shims/spark311/GpuInMemoryTableScanExec.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuParquetScan.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuParquetScan.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuRowToColumnarExec.scala
Outdated
Show resolved
Hide resolved
withResource(Scalar.fromDecimal(-scale, 0L)) { scalar => | ||
scalar.sub(input.getBase) | ||
if (DecimalType.is32BitDecimalType(dt)) { | ||
withResource(Scalar.fromDecimal(-scale, 0)) { scalar => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: It would be nice, perhaps as a follow on issue, to Make creating a Scalar decimal value simpler. That way we could combine the code together a bit better.
I am not sure I understand, if the output is 64-bit but the operands are 32-bit, it will result in an overflow so we have to cast the inputs to 64-bit decimal so there is no overflow. |
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
build |
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a nit left, and I am OK with this going in without it
|
||
object DecimalUtil { | ||
|
||
def createCudfDecimal(precision: Int, scale: Int): DType = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we have some comments here? I would like it clean that the input precision and scale should be what Spark expects and this will convert it into whatever CUDF expects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will fix this in one of my PRs
* Add support for Decimal32 Signed-off-by: Raza Jafri <rjafri@nvidia.com> * fixed unary_minus Signed-off-by: Raza Jafri <rjafri@nvidia.com> * unscaledLong fix Signed-off-by: Raza Jafri <rjafri@nvidia.com> * More support for Decimal32 Signed-off-by: Raza Jafri <rjafri@nvidia.com> * implicit for casting dec32todec64 Signed-off-by: Raza Jafri <rjafri@nvidia.com> * cleanup Signed-off-by: Raza Jafri <rjafri@nvidia.com> * refactored castDecimalToDecimal Signed-off-by: Raza Jafri <rjafri@nvidia.com> * fixed legacy decimal read for non-nested Signed-off-by: Raza Jafri <rjafri@nvidia.com> * fixed casting and added more tests Signed-off-by: Raza Jafri <rjafri@nvidia.com> * added nested tests for reading legacy decimals Signed-off-by: Raza Jafri <rjafri@nvidia.com> * removed implicit Signed-off-by: Raza Jafri <rjafri@nvidia.com> * struct working Signed-off-by: Raza Jafri <rjafri@nvidia.com> * Lists working Signed-off-by: Raza Jafri <rjafri@nvidia.com> * divide not working Signed-off-by: Raza Jafri <rjafri@nvidia.com> * cleanup Signed-off-by: Raza Jafri <rjafri@nvidia.com> * division working but problem with casting Signed-off-by: Raza Jafri <rjafri@nvidia.com> * moved div code to GpuModLike Signed-off-by: Raza Jafri <rjafri@nvidia.com> * code cleanup Signed-off-by: Raza Jafri <rjafri@nvidia.com> * some more fixes Signed-off-by: Raza Jafri <rjafri@nvidia.com> * some more fixes Signed-off-by: Raza Jafri <rjafri@nvidia.com> * addressed review comments Signed-off-by: Raza Jafri <rjafri@nvidia.com> * added more comments Signed-off-by: Raza Jafri <rjafri@nvidia.com> * park Signed-off-by: Raza Jafri <rjafri@nvidia.com> * properly cast scalar Signed-off-by: Raza Jafri <rjafri@nvidia.com> * fixed gpu metric Signed-off-by: Raza Jafri <rjafri@nvidia.com> * upmerged Signed-off-by: Raza Jafri <rjafri@nvidia.com> * fixed castFloatsToDecimals to pick the right precision Signed-off-by: Raza Jafri <rjafri@nvidia.com> * addressed review comments Signed-off-by: Raza Jafri <rjafri@nvidia.com> * Fixed memory leak Signed-off-by: Raza Jafri <rjafri@nvidia.com> * removed length restriction Signed-off-by: Raza Jafri <rjafri@nvidia.com> * fixed test failure due to upmerge Signed-off-by: Raza Jafri <rjafri@nvidia.com> Co-authored-by: Raza Jafri <rjafri@nvidia.com>
* Add support for Decimal32 Signed-off-by: Raza Jafri <rjafri@nvidia.com> * fixed unary_minus Signed-off-by: Raza Jafri <rjafri@nvidia.com> * unscaledLong fix Signed-off-by: Raza Jafri <rjafri@nvidia.com> * More support for Decimal32 Signed-off-by: Raza Jafri <rjafri@nvidia.com> * implicit for casting dec32todec64 Signed-off-by: Raza Jafri <rjafri@nvidia.com> * cleanup Signed-off-by: Raza Jafri <rjafri@nvidia.com> * refactored castDecimalToDecimal Signed-off-by: Raza Jafri <rjafri@nvidia.com> * fixed legacy decimal read for non-nested Signed-off-by: Raza Jafri <rjafri@nvidia.com> * fixed casting and added more tests Signed-off-by: Raza Jafri <rjafri@nvidia.com> * added nested tests for reading legacy decimals Signed-off-by: Raza Jafri <rjafri@nvidia.com> * removed implicit Signed-off-by: Raza Jafri <rjafri@nvidia.com> * struct working Signed-off-by: Raza Jafri <rjafri@nvidia.com> * Lists working Signed-off-by: Raza Jafri <rjafri@nvidia.com> * divide not working Signed-off-by: Raza Jafri <rjafri@nvidia.com> * cleanup Signed-off-by: Raza Jafri <rjafri@nvidia.com> * division working but problem with casting Signed-off-by: Raza Jafri <rjafri@nvidia.com> * moved div code to GpuModLike Signed-off-by: Raza Jafri <rjafri@nvidia.com> * code cleanup Signed-off-by: Raza Jafri <rjafri@nvidia.com> * some more fixes Signed-off-by: Raza Jafri <rjafri@nvidia.com> * some more fixes Signed-off-by: Raza Jafri <rjafri@nvidia.com> * addressed review comments Signed-off-by: Raza Jafri <rjafri@nvidia.com> * added more comments Signed-off-by: Raza Jafri <rjafri@nvidia.com> * park Signed-off-by: Raza Jafri <rjafri@nvidia.com> * properly cast scalar Signed-off-by: Raza Jafri <rjafri@nvidia.com> * fixed gpu metric Signed-off-by: Raza Jafri <rjafri@nvidia.com> * upmerged Signed-off-by: Raza Jafri <rjafri@nvidia.com> * fixed castFloatsToDecimals to pick the right precision Signed-off-by: Raza Jafri <rjafri@nvidia.com> * addressed review comments Signed-off-by: Raza Jafri <rjafri@nvidia.com> * Fixed memory leak Signed-off-by: Raza Jafri <rjafri@nvidia.com> * removed length restriction Signed-off-by: Raza Jafri <rjafri@nvidia.com> * fixed test failure due to upmerge Signed-off-by: Raza Jafri <rjafri@nvidia.com> Co-authored-by: Raza Jafri <rjafri@nvidia.com>
This PR handles Decimal32 support.
failingall passinglackingdoneWhile copying vector back on the Host inThis is now handled based on the precisionHostColumnarToGpu
I am getting theBigDecimal
value, it could be optimized by getting the unscaledLong value like we were before this PRThere might be issues with casting from D32 to D64 that I was seeing before but I don't see them now. It could be attributed to the scale value flipped in cudf, so I might have made a mistake while evaluating themthis was happening because I was confused between the scale values being opposite in cudf