Only manifest the current batch in cached block shuffle read iterator #892

Merged
merged 1 commit into from
Sep 30, 2020

Conversation

jlowe
Member

@jlowe jlowe commented Sep 30, 2020

The RAPIDS shuffle read iterator manifests every locally cached columnar batch up front, before the read iterator for those batches is created. This causes significant memory pressure and fragmentation, because none of those manifested batches are spillable.

This change updates the RAPIDS shuffle read iterator to manifest only the current batch, and only when the iterator is actually consumed.
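To illustrate the difference, here is a minimal sketch in Python rather than the plugin's own code. `BatchTracker`, `eager_read`, and `lazy_read` are hypothetical names invented for this example. "Manifesting" is modeled only as a counter of batches that are live (and therefore not spillable) at the same time.

```python
from typing import Iterator, List


class BatchTracker:
    """Hypothetical helper: counts how many batches are manifested at once."""

    def __init__(self) -> None:
        self.live = 0  # batches currently manifested (not spillable)
        self.peak = 0  # highest value `live` has reached

    def manifest(self, batch_id: int) -> str:
        self.live += 1
        self.peak = max(self.peak, self.live)
        return f"batch-{batch_id}"

    def release(self) -> None:
        self.live -= 1


def eager_read(batch_ids: List[int], tracker: BatchTracker) -> Iterator[str]:
    # Models the old behavior: every cached batch is manifested
    # before the iterator is even created.
    batches = [tracker.manifest(b) for b in batch_ids]

    def drain() -> Iterator[str]:
        for batch in batches:
            yield batch
            tracker.release()  # consumer is done with this batch

    return drain()


def lazy_read(batch_ids: List[int], tracker: BatchTracker) -> Iterator[str]:
    # Models the new behavior: a batch is manifested only when the
    # consumer asks for it, so at most one batch is live at a time.
    for b in batch_ids:
        batch = tracker.manifest(b)
        yield batch
        tracker.release()


eager_tracker = BatchTracker()
eager_result = list(eager_read([0, 1, 2, 3], eager_tracker))

lazy_tracker = BatchTracker()
lazy_result = list(lazy_read([0, 1, 2, 3], lazy_tracker))

print(eager_tracker.peak, lazy_tracker.peak)
```

With four cached batches, the eager version peaks at four live batches while the lazy version peaks at one. Both yield the same batches in the same order.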

@jlowe jlowe added the bug (Something isn't working) and shuffle (things that impact the shuffle plugin) labels Sep 30, 2020
@jlowe jlowe added this to the Sep 28 - Oct 9 milestone Sep 30, 2020
@jlowe jlowe requested a review from abellina September 30, 2020 16:58
@jlowe jlowe self-assigned this Sep 30, 2020
@abellina abellina added the performance (A performance related task/issue) label Sep 30, 2020
@jlowe
Member Author

jlowe commented Sep 30, 2020

build

@jlowe jlowe merged commit 666f89b into NVIDIA:branch-0.3 Sep 30, 2020
@jlowe jlowe deleted the fix-local-shuffle-iter branch October 28, 2020 16:41
abellina pushed a commit to abellina/spark-rapids that referenced this pull request Nov 10, 2020
sperlingxx pushed a commit to sperlingxx/spark-rapids that referenced this pull request Nov 20, 2020
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this pull request Nov 30, 2023
…IDIA#892)

Signed-off-by: spark-rapids automation <70000568+nvauto@users.noreply.github.com>

Labels
bug (Something isn't working), performance (A performance related task/issue), shuffle (things that impact the shuffle plugin)
2 participants