Use fresh SparkSession when capturing to avoid late capture of previous query #537

Merged
merged 1 commit into from
Aug 11, 2020


@jlowe jlowe commented Aug 10, 2020

Signed-off-by: Jason Lowe <jlowe@nvidia.com>

This hopefully fixes #473.

I believe what's happening in that bug is that the test just before the failing one isn't trying to capture, yet the capture callback is still enabled. I suspect the callback for the previous test's query arrives late, while the next test is already running and after it has enabled capture. The next test then captures the previous test's GPU run as its CPU run, and its subsequent GPU run captures the CPU run instead, which explains why we see a CPU plan when it fails.
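The suspected ordering can be illustrated with a small toy model. This is only a sketch in Python, not the project's test code: `ListenerBus`, `run_query`, and `deliver_one` are hypothetical names, and the queue stands in for whatever asynchronous delivery the real listener uses.

```python
from collections import deque


class ListenerBus:
    """Toy stand-in for an asynchronous listener bus (hypothetical, not Spark's API)."""

    def __init__(self):
        self.pending = deque()  # callbacks queued but not yet delivered
        self.capturing = False  # whether the capture callback records plans
        self.captured = []

    def run_query(self, plan):
        # The query finishes now, but its callback is only queued.
        self.pending.append(plan)

    def deliver_one(self):
        # Deliver the oldest queued callback; record it only if capture is on.
        plan = self.pending.popleft()
        if self.capturing:
            self.captured.append(plan)


bus = ListenerBus()

# Previous test: runs on the GPU without capturing; its callback is still queued.
bus.run_query("previous-test GPU plan")

# Next test: enables capture, then performs its CPU run and its GPU run.
bus.capturing = True
bus.run_query("CPU plan")
bus.deliver_one()  # the late callback from the previous test arrives first
cpu_capture = bus.captured[-1]

bus.run_query("GPU plan")
bus.deliver_one()  # the CPU run's callback arrives one slot late
gpu_capture = bus.captured[-1]

print(cpu_capture)  # previous-test GPU plan
print(gpu_capture)  # CPU plan
```

In this model the capture that is supposed to hold the GPU plan ends up holding the CPU plan, which matches the symptom described above.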

This updates runOnCpuAndGpuWithCapture to force a new Spark session. Stopping the old session should drain the listener callbacks, which should create a hard boundary between the previous test and the next test that is trying to capture. The downside is that tests using capture will be slower because a new session has to be created.
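The intended effect of the session boundary can be sketched in the same toy style. This is an illustrative Python model under stated assumptions, not the actual change: `ToySession`, `stop`, `drain`, and `run_cpu_and_gpu_with_capture` are hypothetical names, and the model simply assumes that stopping a session delivers all of its queued callbacks.

```python
from collections import deque


class ToySession:
    """Toy session with its own callback queue (hypothetical, not Spark's API)."""

    def __init__(self):
        self.pending = deque()  # callbacks queued but not yet delivered
        self.capturing = False
        self.captured = []

    def run_query(self, plan):
        self.pending.append(plan)

    def drain(self):
        # Deliver every queued callback; record only while capture is on.
        while self.pending:
            plan = self.pending.popleft()
            if self.capturing:
                self.captured.append(plan)

    def stop(self):
        # Assumption of this sketch: stopping a session drains its callbacks.
        self.drain()


def run_cpu_and_gpu_with_capture(old_session):
    # Hard boundary: flush the previous test's callbacks while capture is off.
    old_session.stop()
    fresh = ToySession()
    fresh.capturing = True

    fresh.run_query("CPU plan")
    fresh.drain()
    cpu_capture = fresh.captured[-1]

    fresh.run_query("GPU plan")
    fresh.drain()
    gpu_capture = fresh.captured[-1]
    return cpu_capture, gpu_capture


old = ToySession()
old.run_query("previous-test GPU plan")  # late callback left over from the previous test
cpu_capture, gpu_capture = run_cpu_and_gpu_with_capture(old)
print(cpu_capture, "|", gpu_capture)  # CPU plan | GPU plan
```

Because the leftover callback is delivered before capture is enabled, it is discarded rather than recorded, at the cost of constructing a new session for every capturing test.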

…us query

Signed-off-by: Jason Lowe <jlowe@nvidia.com>
@jlowe jlowe added bug Something isn't working test Only impacts tests labels Aug 10, 2020
@jlowe jlowe added this to the Aug 3 - Aug 14 milestone Aug 10, 2020
@jlowe jlowe self-assigned this Aug 10, 2020

jlowe commented Aug 10, 2020

build

@revans2 revans2 merged commit f7e8536 into NVIDIA:branch-0.2 Aug 11, 2020
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
…us query (NVIDIA#537)

Signed-off-by: Jason Lowe <jlowe@nvidia.com>
@jlowe jlowe deleted the capture-bug branch September 10, 2021 15:31
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this pull request Nov 30, 2023
Signed-off-by: spark-rapids automation <70000568+nvauto@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

[BUG] PartMerge:countDistinct:sum fails sporadically