[BUG] q82 regression after #3288 #3640

Closed
abellina opened this issue Sep 23, 2021 · 1 comment · Fixed by #3657
Assignees: jlowe
Labels: bug (Something isn't working), P0 (Must have for release)

Comments

@abellina (Collaborator)

It looks like the changes in PR #3288 are causing a performance regression for q82 (query time grows to several minutes).

I noticed a lot of time spent in the code below, and @jlowe suggested reverting the change. After doing so, we are able to get back to ~20 seconds.

ai.rapids.cudf.HashJoin.create(Native Method)
ai.rapids.cudf.HashJoin.<init>(HashJoin.java:87)
org.apache.spark.sql.rapids.execution.HashJoinIterator.$anonfun$maybeBuildHashTable$3(GpuHashJoin.scala:358)
org.apache.spark.sql.rapids.execution.HashJoinIterator$$Lambda$2420/1713476990.apply(Unknown Source)
@abellina added the bug (Something isn't working), ? - Needs Triage (Need team to review and classify), and P0 (Must have for release) labels on Sep 23, 2021
@jlowe self-assigned this on Sep 24, 2021
@jlowe removed the ? - Needs Triage (Need team to review and classify) label on Sep 24, 2021
@jlowe (Member) commented Sep 24, 2021

I think I know what triggered this. In the new code, we always use Spark's build-side table to build the hash table used in the join. However, in the old code the hash table was built implicitly as part of the libcudf join call, and libcudf's inner join logic would choose whichever table had the fewest rows as the build-side table. For query 82, there is an inner join where the right side table is enormous compared to the left side table. That means the old code would build a tiny hash table from the left, whereas the new code builds an enormous one from the right. Building the hash table is the most expensive part of the join, so we end up losing badly to the old code, which chose the smaller table as the build table for inner joins.
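To make the build-side choice concrete, here is a minimal standalone Scala sketch of the selection rule described above. The BuildSideSelection object, TableStats type, chooseBuildSide helper, and the row counts in main are all hypothetical illustrations, not the plugin's actual API, libcudf's interface, or the fix that landed in #3657.

object BuildSideSelection {
  sealed trait Side
  case object BuildLeft extends Side
  case object BuildRight extends Side

  // Hypothetical per-table statistics; the real plugin works with GPU batches.
  final case class TableStats(name: String, rowCount: Long)

  def chooseBuildSide(
      left: TableStats,
      right: TableStats,
      isInnerJoin: Boolean,
      sparkBuildSide: Side): Side = {
    if (isInnerJoin) {
      // An inner join produces the same result regardless of which side is
      // hashed, so hash the smaller input and stream the larger one. This is
      // effectively what the old implicit libcudf join did.
      if (left.rowCount <= right.rowCount) BuildLeft else BuildRight
    } else {
      // Non-inner joins are not symmetric, so keep Spark's designated side.
      sparkBuildSide
    }
  }

  def main(args: Array[String]): Unit = {
    // Made-up row counts mimicking the q82 situation: tiny left side,
    // enormous right side, with Spark designating the right as build side.
    val left = TableStats("left", 200000L)
    val right = TableStats("right", 1000000000L)
    println(chooseBuildSide(left, right, isInnerJoin = true, sparkBuildSide = BuildRight))
  }
}

The point of the sketch is only that, for inner joins, the expensive hash-table build can be done on whichever input has fewer rows, which is the behavior the old code got for free from libcudf.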
