Investigate executor broadcast shuffle handling performance on Databricks #10229

Open
tgravescs opened this issue Jan 19, 2024 · 0 comments
Labels
performance A performance related task/issue

Comments

@tgravescs
Collaborator

Is your feature request related to a problem? Please describe.
Related to the fix for #10165. We have some special handling for executor-side broadcasts where we add an extra shuffle to go from GpuExchange -> GpuColumnarToRow -> Exchange (EXECUTOR_BROADCAST). We do this because the BroadcastHashJoin expects an Exchange node there, not a ColumnarToRow exec. This works, but it's likely adding extra overhead.

Perhaps we can make a special GPUExchange that has a doExecute and handles the columnar-to-row conversion itself, but we should analyze the performance impact and whether something like that will work.

After the fix for #10165, the plan looks like:

                  +- Exchange (16)
                     +- GpuColumnarToRow (15)
                        +- GpuShuffleCoalesce (14)
                           +- ShuffleQueryStage (13), 
                              +- GpuColumnarExchange (12)
                                 +- GpuCoalesceBatches (11)
                                    +- GpuFilter (10)

You can see that we have an extra shuffle in there.
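
To make the extra shuffle easier to spot when comparing plans, here is a minimal sketch that lists the exchange operators in a plan fragment like the one above. It only parses the plan text and does not touch Spark. The helper name `exchange_nodes` is hypothetical, and treating every operator whose name ends in "Exchange" as a shuffle boundary is an assumption that holds for this fragment but would also match broadcast exchanges in a larger plan.

```python
import re

# Plan fragment copied from the issue text (operator ids in parentheses).
PLAN = """\
+- Exchange (16)
   +- GpuColumnarToRow (15)
      +- GpuShuffleCoalesce (14)
         +- ShuffleQueryStage (13),
            +- GpuColumnarExchange (12)
               +- GpuCoalesceBatches (11)
                  +- GpuFilter (10)
"""


def exchange_nodes(plan_text):
    """Return operator names ending in 'Exchange', in top-down plan order.

    Hypothetical helper: the name-suffix check is an assumption, not a
    Spark or spark-rapids API.
    """
    names = re.findall(r"\+- (\w+) \(\d+\)", plan_text)
    return [name for name in names if name.endswith("Exchange")]


if __name__ == "__main__":
    # Two exchanges in this fragment means the data is shuffled twice.
    print(exchange_nodes(PLAN))
```

Running the same check on a plan captured before and after any change to the exchange handling would show whether the second exchange is still present.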

@tgravescs tgravescs added ? - Needs Triage Need team to review and classify performance A performance related task/issue labels Jan 19, 2024
@mattahrens mattahrens removed the ? - Needs Triage Need team to review and classify label Feb 1, 2024