Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement getShuffleRDD and fixup mismatched output types on shuffle reuse [databricks] #4257

Merged
merged 2 commits into from
Dec 2, 2021

Conversation

jlowe
Copy link
Member

@jlowe jlowe commented Dec 1, 2021

Fixes #4216.

This implements the getShuffleRDD interface for GpuShuffleExchangeExec and fixes an issue with ReusedExchangeExec caching CPU aggregation buffer types before they're updated with GPU aggregation buffer types. The plan is searched for instances where the ReusedShuffleExec output types do not match the underlying GPU shuffle output types, fixing the output types of ReusedShuffleExec to match the GPU shuffle.

…reuse

Signed-off-by: Jason Lowe <jlowe@nvidia.com>
@jlowe jlowe self-assigned this Dec 1, 2021
@jlowe
Copy link
Member Author

jlowe commented Dec 1, 2021

build

@jlowe jlowe linked an issue Dec 1, 2021 that may be closed by this pull request
revans2
revans2 previously approved these changes Dec 1, 2021
abellina
abellina previously approved these changes Dec 1, 2021
@jlowe jlowe dismissed stale reviews from abellina and revans2 via 1f8bcf6 December 2, 2021 02:27
@jlowe
Copy link
Member Author

jlowe commented Dec 2, 2021

Databricks 9.1 build failed due to access restrictions on newReuseInstance. Added a shim for this and my best guess at a workaround for it.

@jlowe
Copy link
Member Author

jlowe commented Dec 2, 2021

build

Copy link
Collaborator

@gerashegalov gerashegalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

import org.apache.spark.sql.execution.adaptive.{QueryStageExec, ShuffleQueryStageExec}

/** Utility methods for manipulating Catalyst classes involved in Adaptive Query Execution */
object AQEUtils {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as the db version? hopefully this will be combined after the commonizing PR #4235

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think #4235 is going to address this. This change is only needed by spark312db, and there's no source directory for "everything except spark312db." I didn't bother to create one for this since the code is so small.

@jlowe jlowe merged commit bad9fee into NVIDIA:branch-21.12 Dec 2, 2021
@jlowe jlowe deleted the fix-shufflerdd-partitionspec branch December 2, 2021 14:55
andygrove pushed a commit to andygrove/spark-rapids that referenced this pull request Dec 2, 2021
…reuse [databricks] (NVIDIA#4257)

* Implement getShuffleRDD and fixup mismatched output types on shuffle reuse

Signed-off-by: Jason Lowe <jlowe@nvidia.com>

* Fix Databricks 9.1 build
@sameerz sameerz added the bug Something isn't working label Dec 3, 2021
res-life pushed a commit to res-life/spark-rapids that referenced this pull request Dec 6, 2021
…reuse [databricks] (NVIDIA#4257)

* Implement getShuffleRDD and fixup mismatched output types on shuffle reuse

Signed-off-by: Jason Lowe <jlowe@nvidia.com>

* Fix Databricks 9.1 build
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] AQE Crashing Spark RAPIDS when using filter() and union()
5 participants