Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix call to thrust::reduce_by_key in argmin/argmax libcudf groupby #9244

Conversation

davidwendt
Copy link
Contributor

Closes #9156

This PR simplifies the parameters when calling thrust::reduce_by_key for the argmin/argmax aggregations in cudf::groupby. The illegalMemoryAccess found in #9156 was due to invalid data being passed from thrust::reduce_by_key through to the BinaryPredicate function as documented in NVIDIA/cccl#789

The invalid data being passed is only a real issue for strings columns where the device pointer was neither nullptr nor a valid address. The new logic provides only size_type values to thrust::reduce_by_key so invalid values can only be out-of-bounds for the input column which is easily checked before retrieving the string_view objects within the ArgMin and ArgMax operators.

@davidwendt davidwendt added bug Something isn't working 2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change labels Sep 17, 2021
@davidwendt davidwendt self-assigned this Sep 17, 2021
@davidwendt
Copy link
Contributor Author

@revans2 Could you try this out to see if this fixes the issue for you as reported in #9156?

@revans2
Copy link
Contributor

revans2 commented Sep 20, 2021

@revans2 Could you try this out to see if this fixes the issue for you as reported in #9156?

Yes this fixed the issues I was seeing.

@davidwendt
Copy link
Contributor Author

@revans2 Could you try this out to see if this fixes the issue for you as reported in #9156?

Yes this fixed the issues I was seeing.

After discussing in our libcudf standup, we will move this into 21.10.

@davidwendt davidwendt changed the base branch from branch-21.12 to branch-21.10 September 20, 2021 21:21
@github-actions github-actions bot added CMake CMake build issue conda Java Affects Java cuDF API. labels Sep 20, 2021
@davidwendt
Copy link
Contributor Author

rerun tests

@codecov
Copy link

codecov bot commented Sep 21, 2021

Codecov Report

Merging #9244 (0b11c5d) into branch-21.10 (3ee3ecf) will decrease coverage by 0.00%.
The diff coverage is 0.00%.

❗ Current head 0b11c5d differs from pull request most recent head a0cb094. Consider uploading reports for the commit a0cb094 to get more accurate results
Impacted file tree graph

@@               Coverage Diff                @@
##           branch-21.10    #9244      +/-   ##
================================================
- Coverage         10.85%   10.84%   -0.01%     
================================================
  Files               115      116       +1     
  Lines             19158    19171      +13     
================================================
  Hits               2080     2080              
- Misses            17078    17091      +13     
Impacted Files Coverage Δ
python/cudf/cudf/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/_lib/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/io/__init__.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/text.py 0.00% <0.00%> (ø)
python/cudf/cudf/utils/ioutils.py 0.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ba2cbd9...a0cb094. Read the comment docs.

@davidwendt davidwendt removed the 2 - In Progress Currently a work in progress label Sep 21, 2021
@davidwendt davidwendt added the 3 - Ready for Review Ready for review by team label Sep 21, 2021
@davidwendt davidwendt marked this pull request as ready for review September 21, 2021 11:45
@davidwendt davidwendt requested review from a team as code owners September 21, 2021 11:45
Copy link
Contributor

@robertmaynard robertmaynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks to be branched from 21.12 but targeting 21.10, you will need to rebase on 21.10

@davidwendt
Copy link
Contributor Author

I seemed to have hosed my branch now. I will close the PR and reopen based on 21.10

@davidwendt davidwendt closed this Sep 21, 2021
rapids-bot bot pushed a commit that referenced this pull request Sep 22, 2021
…9263)

Closes #9156

This PR simplifies the parameters when calling thrust::reduce_by_key for the argmin/argmax aggregations in cudf::groupby. The illegalMemoryAccess found in #9156 was due to invalid data being passed from thrust::reduce_by_key through to the BinaryPredicate function as documented in NVIDIA/thrust#1525

The invalid data being passed is only a real issue for strings columns where the device pointer was neither nullptr nor a valid address. The new logic provides only size_type values to thrust::reduce_by_key so invalid values can only be out-of-bounds for the input column which is easily checked before retrieving the string_view objects within the ArgMin and ArgMax operators.

This the same as #9244 but based on 21.10

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Devavret Makkar (https://github.com/devavret)
  - Nghia Truong (https://github.com/ttnghia)
  - Robert Maynard (https://github.com/robertmaynard)

URL: #9263
@davidwendt davidwendt deleted the bug-sorted-groupby-min-strings branch September 24, 2021 18:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team bug Something isn't working CMake CMake build issue Java Affects Java cuDF API. libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] IllegalMemoryAccess sometimes on sorted group by min of strings
4 participants