Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the failing test test_window on Databricks #2484

Merged
merged 1 commit into from
May 24, 2021

Conversation

firestarman
Copy link
Collaborator

@firestarman firestarman commented May 24, 2021

This small PR is to fix the failing test test_window on Databricks by making the outReference be lazy.

Because on Databricks, it will get some invalid expressions (e.g. none#0L, none@1L.) in the projectList if binding the references on the driver side.

closes #2372

Signed-off-by: Firestarman firestarmanllc@gmail.com

By making the 'outReference' be lazy. Because on Databricks, it
will get some invalid expressions in the 'projectList' if binding
the reference on driver side. E.g. 'none#0L', 'none@1L'.

Signed-off-by: Firestarman <firestarmanllc@gmail.com>
@firestarman
Copy link
Collaborator Author

Verification on DB is running ... https://blossom.nvidia.com/sw-gpu-spark-jenkins/view/Testing/job/lc-db/ .

@firestarman
Copy link
Collaborator Author

build

@pxLi pxLi added the bug Something isn't working label May 24, 2021
Copy link
Collaborator

@pxLi pxLi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

still not sure about the root cause of abnormal expressions, we can get back to this if we see something-related again~

@pxLi pxLi merged commit 31b168e into NVIDIA:branch-21.06 May 24, 2021
@firestarman firestarman deleted the fix-cudf-window-test branch May 24, 2021 05:12
Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy that this was fixed, but it would have been nice to add a comment to the outReferences explaining why it needs to be lazy. Often when there a subtle bugs like this it is easy for someone to accidentally undo the fix without a warning that there is a problem to be aware of.

@tgravescs
Copy link
Collaborator

Agree with Bobby, I tried to make sure in the new shim that we document changes made specific to Databricks, so if we can followup and add comment that would be great

@firestarman
Copy link
Collaborator Author

firestarman commented May 25, 2021

Made the PR #2496 to add the comments.

nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
By making the 'outReference' be lazy. Because on Databricks, it
will get some invalid expressions in the 'projectList' if binding
the reference on driver side. E.g. 'none#0L', 'none@1L'.

Signed-off-by: Firestarman <firestarmanllc@gmail.com>
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
By making the 'outReference' be lazy. Because on Databricks, it
will get some invalid expressions in the 'projectList' if binding
the reference on driver side. E.g. 'none#0L', 'none@1L'.

Signed-off-by: Firestarman <firestarmanllc@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] FAILED ../../src/main/python/udf_cudf_test.py::test_window
4 participants