Fix canonicalization of GpuFileSourceScanExec, GpuShuffleCoalesceExec #1310

jlowe · 2020-12-08T00:30:52Z

At least part of the issue in #1308 comes from parts of the query being executed redundantly, multiple times, because Spark cannot determine that subsections of the plan are identical. The root cause is broken canonicalization: a GPU exec node does not compare equal to a semantically equivalent copy of itself.

In this case the culprit is the use of a RapidsConf instance as a parameter of the case class. RapidsConf doesn't have an equals method, so two instances are always considered different. However, I don't think adding an equals method is the real fix here: two configs can differ in ways unrelated to the semantics of the node's execution, and nodes holding them should still be considered equal with respect to canonicalization.
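To illustrate the equality problem, here is a minimal sketch in plain Scala with no Spark dependency. `FakeConf` and `CoalesceNode` are hypothetical stand-ins, not the plugin's actual classes; the only assumption is that the conf class, like RapidsConf as described above, does not override equals.

```scala
object ConfEqualitySketch {
  // Hypothetical stand-in for RapidsConf: a plain class with no equals
  // override, so == falls back to reference identity.
  class FakeConf(val settings: Map[String, String])

  // Hypothetical exec node that takes the conf as an ordinary
  // case class parameter, so the conf takes part in equals/hashCode.
  case class CoalesceNode(child: String, conf: FakeConf)

  val a = CoalesceNode("scan", new FakeConf(Map("batchSize" -> "1024")))
  val b = CoalesceNode("scan", new FakeConf(Map("batchSize" -> "1024")))

  // Same child and same settings, yet the nodes compare as different
  // because the two FakeConf instances are distinct objects.
  val nodesEqual: Boolean = a == b
}
```

With these definitions `ConfEqualitySketch.nodesEqual` is false, which is the kind of mismatch that would stop two otherwise identical plan fragments from being recognized as the same.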

For GpuShuffleCoalesceExec the fix was easy: only one config value was being used, so I replaced the conf parameter with that value from the conf. GpuFileSourceScanExec is trickier because it uses many config parameters and already has quite a few arguments. There I opted to place the conf instance in a separate parameter list, which is excluded from the automatically generated equals and hashCode methods that Spark plan canonicalization relies on. No RAPIDS config settings change the semantics of what data is read or how it is presented, so we can safely ignore any RAPIDS configs for purposes of canonicalization.
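Both approaches can be sketched in plain Scala. This is an illustration under stated assumptions, not the code in this PR: `FakeConf`, `CoalesceNode`, `ScanNode`, and the `targetBatchSize` setting are all hypothetical names. It relies on the Scala rule that a case class's generated equals and hashCode only consider the first parameter list.

```scala
object SecondParamListSketch {
  // Hypothetical stand-in for RapidsConf (no equals override).
  class FakeConf(val settings: Map[String, String]) {
    def targetBatchSize: Long =
      settings.getOrElse("targetBatchSize", "1024").toLong
  }

  // Approach 1: pass the single value the node needs instead of the conf.
  case class CoalesceNode(child: String, targetBatchSize: Long)

  // Approach 2: keep the conf, but move it to a second parameter list.
  // Only `relation` and `output` take part in equals and hashCode.
  case class ScanNode(relation: String, output: Seq[String])(val conf: FakeConf)

  val conf1 = new FakeConf(Map("targetBatchSize" -> "2048"))
  val conf2 = new FakeConf(Map("targetBatchSize" -> "2048"))

  val coalesceEqual: Boolean =
    CoalesceNode("scan", conf1.targetBatchSize) ==
      CoalesceNode("scan", conf2.targetBatchSize)

  val scan1 = ScanNode("table", Seq("a", "b"))(conf1)
  val scan2 = ScanNode("table", Seq("a", "b"))(conf2)

  // Distinct conf instances no longer affect equality or hashing.
  val scanEqual: Boolean = scan1 == scan2
  val sameHash: Boolean = scan1.hashCode == scan2.hashCode
}
```

The trade-off of the second approach is that anything in the second parameter list is silently ignored by equality, so it is only appropriate for values that do not affect the node's semantics.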

Signed-off-by: Jason Lowe <jlowe@nvidia.com>

jlowe · 2020-12-08T00:31:04Z

build

jlowe · 2020-12-08T15:54:43Z

This has a bug: an extra argument was added to GpuFileSourceScanExec, but otherCopyArgs was not overridden. Queries can fail when the node needs to be copied after it has been inserted into the plan. I'll post a follow-up PR.
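A rough sketch of why the missing override matters, in plain Scala without Spark. Spark's TreeNode copies nodes reflectively, calling a constructor with the product fields plus whatever otherCopyArgs returns. The `reflectiveCopy` helper below is a simplified imitation of that idea, not Spark's implementation, and `FakeConf` and `ScanNode` are hypothetical.

```scala
object CopyArgsSketch {
  // Hypothetical stand-in for RapidsConf.
  class FakeConf

  // Hypothetical node with the conf in a second parameter list.
  case class ScanNode(relation: String)(val conf: FakeConf) {
    // Counterpart of overriding otherCopyArgs: the constructor
    // arguments that are not part of the case class product.
    def otherCopyArgs: Seq[AnyRef] = Seq(conf)
  }

  // Simplified imitation of a reflection-based copy: look for a
  // constructor whose arity matches the product fields plus extras.
  def reflectiveCopy(node: ScanNode, extraArgs: Seq[AnyRef]): Option[ScanNode] = {
    val args = node.productIterator.map(_.asInstanceOf[AnyRef]).toList ++ extraArgs
    classOf[ScanNode].getConstructors
      .find(_.getParameterCount == args.length)
      .map(_.newInstance(args: _*).asInstanceOf[ScanNode])
  }

  val node = ScanNode("table")(new FakeConf)

  // Without the extra args there is no matching constructor to call.
  val withoutExtras: Option[ScanNode] = reflectiveCopy(node, Nil)

  // Supplying otherCopyArgs lets the copy find the full constructor.
  val withExtras: Option[ScanNode] = reflectiveCopy(node, node.otherCopyArgs)
}
```

In this sketch the copy without extra arguments finds no constructor, while the copy that appends `otherCopyArgs` succeeds and carries the same conf instance.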

Fix canonicalization of GpuFileSourceScanExec, GpuShuffleCoalesceExec

f7ffa93

Signed-off-by: Jason Lowe <jlowe@nvidia.com>

jlowe added the SQL part of the SQL/Dataframe plugin label Dec 8, 2020

jlowe added this to the Dec 7 - Dec 18 milestone Dec 8, 2020

jlowe self-assigned this Dec 8, 2020

tgravescs approved these changes Dec 8, 2020

revans2 approved these changes Dec 8, 2020

revans2 merged commit ccbb3e1 into NVIDIA:branch-0.3 Dec 8, 2020

jlowe deleted the fix-canonicalization branch December 8, 2020 15:56

This was referenced Dec 8, 2020

Fix copying GpuFileSourceScanExec node #1318

Merged

[BUG] TPC-DS query 77 at scale=1TB fails with maxResultSize exceeded error #1284

Closed

[BUG] TPC-DS query 14a runs much slower on 0.3 #1308

Closed

nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021

Fix canonicalization of GpuFileSourceScanExec, GpuShuffleCoalesceExec (…

f4ec9f9

…NVIDIA#1310) Signed-off-by: Jason Lowe <jlowe@nvidia.com>

nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021

Fix canonicalization of GpuFileSourceScanExec, GpuShuffleCoalesceExec (…

3e110f8

…NVIDIA#1310) Signed-off-by: Jason Lowe <jlowe@nvidia.com>

tgravescs pushed a commit to tgravescs/spark-rapids that referenced this pull request Nov 30, 2023

Merge pull request NVIDIA#1310 from NVIDIA/bot-auto-merge-branch-23.08

9ba9dd2

[auto-merge] bot-auto-merge-branch-23.08 to branch-23.10 [skip ci] [bot]
