-
Notifications
You must be signed in to change notification settings - Fork 888
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support scatter() for list columns #6768
Conversation
Supports lists of: 1. fixed width data types 2. strings 3. lists (of all of the above, and lists(of ...))
1. Additional type checking for child columns 2. Moved list_device_view functions inline
Please update the changelog in order to start CI tests. View the gpuCI docs here. |
Codecov Report
@@ Coverage Diff @@
## branch-0.17 #6768 +/- ##
============================================
Coverage 81.87% 81.87%
============================================
Files 96 96
Lines 16079 16079
============================================
Hits 13165 13165
Misses 2914 2914 Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a challenging PR. Nice work, it's overall very clean and understandable.
My suggestions are in two main categories:
- Keeping up with in-progress work to replace
cudaStream_t
withrmm::cuda_stream_view
- Better parallelism for the scatter offset computation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very proper. No further comment apart from the ones already posted my Mark.
@devavret, @harrism, would you be averse to our checking this version of The main thing pending here is @harrism's suggestion better parallelize child-column construction. For now, it's only as efficient as |
Just need @devavret to click approve unless he has other changes to request. |
Re-upped, after resolving conflicts. |
Thanks for the reviews, @harrism, @devavret! I'm investigating the build failure indicated in the failing-check:
|
Maybe because my cuda_stream_view PR merged? |
Yep, that was it. I'm rebuilding and rerunning my tests at my end. I'll post the corrected version shortly. |
I've pushed a fix for the build break. I have to remind myself that the CI doesn't have |
I've committed this now. Thank you so much for the reviews, all. |
Addendum to #6768, to support scatter on columns of type list<struct>. The hard part was the tests.
This change adds support for
cudf::scatter()
to be called on list columns. This implementation supports the following child types:Struct support will be added as a follow-on.
The implementation follows a similar approach to
cudf::scatter()
forstring_view
. Alist_device_view
was added to represent the device view of a list row.(Note: Collapsing this into
list_view
is proving difficult becauselist_view
is included intypes.hpp
, and the dependencies form a Gordian Knot.)