Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decimal32 support #1717

Merged
merged 37 commits into from
Apr 13, 2021
Merged

Decimal32 support #1717

merged 37 commits into from
Apr 13, 2021

Conversation

razajafri
Copy link
Collaborator

@razajafri razajafri commented Feb 12, 2021

This PR handles Decimal32 support.

  • Tests for division are failing all passing
  • Support for nested types for reading legacy Decimals is lacking done
  • While copying vector back on the Host in HostColumnarToGpu I am getting the BigDecimal value, it could be optimized by getting the unscaledLong value like we were before this PR This is now handled based on the precision
  • There might be issues with casting from D32 to D64 that I was seeing before but I don't see them now. It could be attributed to the scale value flipped in cudf, so I might have made a mistake while evaluating them this was happening because I was confused between the scale values being opposite in cudf

@razajafri razajafri changed the title Decimal32 support [WIP] Decimal32 support Feb 12, 2021
@razajafri razajafri marked this pull request as draft February 12, 2021 17:46
@razajafri razajafri changed the title [WIP] Decimal32 support Decimal32 support Feb 12, 2021
@sameerz sameerz added the feature request New feature or request label Feb 13, 2021
@razajafri razajafri changed the base branch from branch-0.4 to branch-0.5 February 22, 2021 20:27
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
@razajafri razajafri marked this pull request as ready for review March 1, 2021 20:39
Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't have time to finish all of the review, but I thought I would get some of my comments in for today.

Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
@razajafri
Copy link
Collaborator Author

build

Signed-off-by: Raza Jafri <rjafri@nvidia.com>
@razajafri
Copy link
Collaborator Author

build

Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it is looking okay. I am a bit concerned about all of the extra code in the binary operators for decimal. It just feels like there should be a simpler way to do this. Especially with type erasure in cudf. It mostly feels like we just need to check if the output needs to be 64-bit and if it does then we don't need to check any of the children, because a no-op cast is cheap.

integration_tests/src/main/python/arithmetic_ops_test.py Outdated Show resolved Hide resolved
integration_tests/src/main/python/arithmetic_ops_test.py Outdated Show resolved Hide resolved
withResource(Scalar.fromDecimal(-scale, 0L)) { scalar =>
scalar.sub(input.getBase)
if (DecimalType.is32BitDecimalType(dt)) {
withResource(Scalar.fromDecimal(-scale, 0)) { scalar =>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It would be nice, perhaps as a follow on issue, to Make creating a Scalar decimal value simpler. That way we could combine the code together a bit better.

@razajafri
Copy link
Collaborator Author

Overall it is looking okay. I am a bit concerned about all of the extra code in the binary operators for decimal. It just feels like there should be a simpler way to do this. Especially with type erasure in cudf. It mostly feels like we just need to check if the output needs to be 64-bit and if it does then we don't need to check any of the children, because a no-op cast is cheap.

I am not sure I understand, if the output is 64-bit but the operands are 32-bit, it will result in an overflow so we have to cast the inputs to 64-bit decimal so there is no overflow.

Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
@razajafri
Copy link
Collaborator Author

build

@razajafri
Copy link
Collaborator Author

@revans2 I have filed a couple of follow-on issues. One for refactoring the binaryoperations here and the other one for making the creation of scalar simpler here. PTAL and let me know if you have any more concerns

Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
@razajafri
Copy link
Collaborator Author

build

Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nit left, and I am OK with this going in without it


object DecimalUtil {

def createCudfDecimal(precision: Int, scale: Int): DType = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we have some comments here? I would like it clean that the input precision and scale should be what Spark expects and this will convert it into whatever CUDF expects.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will fix this in one of my PRs

@razajafri razajafri merged commit cc47a25 into NVIDIA:branch-0.5 Apr 13, 2021
@razajafri razajafri deleted the decimal32_support branch April 13, 2021 18:15
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* Add support for Decimal32

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* fixed unary_minus

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* unscaledLong fix

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* More support for Decimal32

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* implicit for casting dec32todec64

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* cleanup

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* refactored castDecimalToDecimal

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* fixed legacy decimal read for non-nested

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* fixed casting and added more tests

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* added nested tests for reading legacy decimals

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* removed implicit

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* struct working

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* Lists working

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* divide not working

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* cleanup

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* division working but problem with casting

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* moved div code to GpuModLike

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* code cleanup

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* some more fixes

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* some more fixes

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* addressed review comments

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* added more comments

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* park

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* properly cast scalar

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* fixed gpu metric

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* upmerged

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* fixed castFloatsToDecimals to pick the right precision

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* addressed review comments

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* Fixed memory leak

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* removed length restriction

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* fixed test failure due to upmerge

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

Co-authored-by: Raza Jafri <rjafri@nvidia.com>
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* Add support for Decimal32

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* fixed unary_minus

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* unscaledLong fix

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* More support for Decimal32

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* implicit for casting dec32todec64

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* cleanup

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* refactored castDecimalToDecimal

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* fixed legacy decimal read for non-nested

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* fixed casting and added more tests

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* added nested tests for reading legacy decimals

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* removed implicit

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* struct working

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* Lists working

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* divide not working

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* cleanup

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* division working but problem with casting

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* moved div code to GpuModLike

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* code cleanup

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* some more fixes

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* some more fixes

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* addressed review comments

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* added more comments

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* park

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* properly cast scalar

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* fixed gpu metric

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* upmerged

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* fixed castFloatsToDecimals to pick the right precision

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* addressed review comments

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* Fixed memory leak

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* removed length restriction

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

* fixed test failure due to upmerge

Signed-off-by: Raza Jafri <rjafri@nvidia.com>

Co-authored-by: Raza Jafri <rjafri@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants