Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support segmented reductions and null mask reductions #9621

Merged
merged 67 commits into from
Mar 10, 2022

Conversation

isVoid
Copy link
Contributor

@isVoid isVoid commented Nov 5, 2021

closes #9135
closes #9552

This PR adds support for numeric types to simple_op, sum, prod, min, max, any, all. Also, this PR adds segmented_null_mask_reduction to compute null mask reductions on segments.

@isVoid isVoid requested a review from a team as a code owner November 5, 2021 21:54
@github-actions github-actions bot added CMake CMake build issue libcudf Affects libcudf (C++/CUDA) code. labels Nov 5, 2021
@isVoid isVoid marked this pull request as draft November 5, 2021 21:54
cpp/src/reductions/simple.cuh Outdated Show resolved Hide resolved
@jrhemstad jrhemstad changed the title [WIP] Support segment reduce [WIP] Support segmented reductions Nov 10, 2021
@ttnghia
Copy link
Contributor

ttnghia commented Nov 15, 2021

I didn't read the implementation yet. But I assume that this is extremely similar to what sort-based groupby aggregate is doing. If this new implementation is more efficient than the current groupby aggregate implementation, we also need to switch to use segmented reduction for groupby aggregate too. In order to support this idea, please add a benchmark for comparison between their performance.

@github-actions github-actions bot added conda Java Affects Java cuDF API. Python Affects Python cuDF API. labels Dec 14, 2021
@isVoid isVoid changed the base branch from branch-21.12 to branch-22.02 December 14, 2021 22:00
@github-actions github-actions bot removed gpuCI Python Affects Python cuDF API. labels Dec 15, 2021
@isVoid
Copy link
Contributor Author

isVoid commented Feb 17, 2022

I updated the docstring for segmented_reduction, to describe the specification of offsets. Also added a note on the current up-cast of precision when reducing integral and floating point values, and remove mentioning of computation precision at output_dtype.

Copy link
Contributor

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This combined with #10302 looks to give me what I need

rapids-bot bot pushed a commit that referenced this pull request Feb 17, 2022
This PR adds an implicit conversion operator from `column_view` to `device_span<T const>`. The immediate purpose of this PR is to make it possible to use the API `segmented_reduce(column_view data, device_span<size_type> offsets, ...)` in PR #9621.

This PR also resolves #9656 by adding a `column_view` constructor from `device_span<T const>`.

More broadly, this PR should make it easier to refactor instances where `column.data()` is used with counting iterators to build transform iterators, or other patterns that require a length (e.g. vector factories to copy to host).

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Jake Hemstad (https://github.com/jrhemstad)
  - David Wendt (https://github.com/davidwendt)

URL: #10302
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few minor comments but I think this is sufficiently ready for merge. We have a few TODO items that we should probably track in an issue.

cpp/include/cudf/reduction.hpp Outdated Show resolved Hide resolved
cpp/include/cudf/reduction.hpp Outdated Show resolved Hide resolved
cpp/src/reductions/segmented_reductions.cpp Outdated Show resolved Hide resolved
cpp/src/reductions/segmented_reductions.cpp Outdated Show resolved Hide resolved
cpp/tests/reductions/segmented_reduction_tests.cpp Outdated Show resolved Hide resolved
@bdice
Copy link
Contributor

bdice commented Mar 10, 2022

@gpucibot merge Oops. This is blocked until @jrhemstad approves / retracts request for changes.

@bdice
Copy link
Contributor

bdice commented Mar 10, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 7ff1956 into rapidsai:branch-22.04 Mar 10, 2022
rapids-bot bot pushed a commit that referenced this pull request Mar 14, 2022
This adds in JNI support for #9621. It also adds in a helper API to allow us to do the processing on a list easily.

Authors:
  - Robert (Bobby) Evans (https://github.com/revans2)

Approvers:
  - Liangcai Li (https://github.com/firestarman)
  - Jason Lowe (https://github.com/jlowe)

URL: #10413
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake CMake build issue feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
7 participants