
Add ability to ignore tests depending on spark shim version #504

Merged: 4 commits into NVIDIA:branch-0.2 on Aug 4, 2020

Conversation

@andygrove (Contributor):

Signed-off-by: Andy Grove <andygrove@nvidia.com>

Adds the ability to ignore Scala tests conditionally, depending on the version of Spark we are testing with.

Also closes #382
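As a minimal sketch of the mechanism (not the exact code in this PR), ScalaTest's `assume` can cancel, rather than fail, a test when a version predicate does not hold; the suite name, the stub accessor, and the hard-coded version below are illustrative stand-ins:

```scala
import org.scalatest.FunSuite

// Simplified stand-in for the shim version case class introduced in this PR.
case class SparkShimVersion(major: Int, minor: Int, patch: Int)

class VersionGatedSuite extends FunSuite {
  // Stub; in the plugin the running version would come from the shim layer.
  def currentShimVersion: SparkShimVersion = SparkShimVersion(3, 0, 0)

  test("behavior that only applies to Spark 3.0.x") {
    val SparkShimVersion(major, minor, _) = currentShimVersion
    val isValidTestForSparkVersion = major == 3 && minor == 0
    // assume() cancels the test when the predicate is false, reporting
    // "Test Canceled: isValidTestForSparkVersion was false".
    assume(isValidTestForSparkVersion)
    // ... test body runs only against Spark 3.0.x ...
  }
}
```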

Signed-off-by: Andy Grove <andygrove@nvidia.com>
@tgravescs (Collaborator):

build

@tgravescs tgravescs added this to the Aug 3 - Aug 14 milestone Aug 3, 2020
@tgravescs tgravescs added the test Only impacts tests label Aug 3, 2020
```scala
case SparkShimVersion(major, minor, _) =>
  // this test is not valid in Spark 3.1 and later because the expression is
  // NullIntolerant and gets replaced with a null literal instead
  val isValidTestForSparkVersion = major <= 3 && minor == 0
```
@andygrove (Contributor, Author) commented on the diff above:

Although it seems unnecessary to have this separate variable, it makes the output message more obvious:

`Test Canceled: isValidTestForSparkVersion was false`

@tgravescs (Collaborator) left a comment:

The SparkShimServiceProvider in each of the directories already has a VERSIONNAME string in it; it would be nice if we didn't have to duplicate it. Also, you are missing the 300 Databricks version here. I assume you added the DatabricksShimVersion class for it? We may need more information for Databricks (the runtime), but we can always extend it later.

Signed-off-by: Andy Grove <andygrove@nvidia.com>
@andygrove (Contributor, Author):

I wasn't sure how this would work with Databricks or what the version numbers would be. The case classes might give us more flexibility than strings; perhaps we could use these instead of the string for the version name?

I pushed another commit to demonstrate how I imagined this working for databricks.

@andygrove andygrove self-assigned this Aug 3, 2020
@tgravescs (Collaborator):

We can use the case classes as the real version and give them a toString function that is used in SparkShimServiceProvider for now. That way we don't have the version in two places.
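A rough sketch of that suggestion, with the case class as the single source of truth and the provider's string derived from it (the object layout is an assumption for illustration; only the case class names come from this PR):

```scala
sealed trait ShimVersion

case class SparkShimVersion(major: Int, minor: Int, patch: Int) extends ShimVersion {
  override def toString: String = s"$major.$minor.$patch"
}

case class DatabricksShimVersion(major: Int, minor: Int, patch: Int) extends ShimVersion {
  override def toString: String = s"$major.$minor.$patch-databricks"
}

// A shim service provider can then derive its version string from the case
// class instead of keeping a second hard-coded VERSIONNAME literal.
object Spark300ShimServiceProvider {
  val VERSION = SparkShimVersion(3, 0, 0)
  val VERSIONNAME: String = VERSION.toString // "3.0.0"
}
```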

I would like to add it to the Databricks version as well; I don't want to have to fix it afterwards. Have it run the tests as it does now, and the test is passing. They only changed this in Spark 3.1, and so far the Databricks 7.0 runtime doesn't have that change. If the test ends up failing we can handle that separately, but I would at least like to make an attempt here; otherwise compilation will fail. The Databricks version is just 3.0.0-databricks; it is not 7.0, which is the runtime version.

The downside to having different case classes is demonstrated here, where you would have to match on both. If we had a "vendor" field in SparkShimVersion you could conditionalize on that; in that case your check (note the extra `_`), `SparkShimVersion(major, minor, _, _) => major == 3 && minor == 0`, would still work and cover Databricks.
I can see reasons to do it both ways, though. For instance, this way allows the Databricks version to have a separate runtime field as well. I'm fine with leaving it as different case classes for now, but please add the version to the 300db shim layer as DatabricksShimVersion(3,0,0) and have the test run for it.

I think either way we will have to enhance the Databricks version (and probably this check) to handle cases where the version is something like 3.0.0.0, but there is a separate jira for that.
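For reference, the two-arm match described above could look roughly like this, reusing the sketch classes from the earlier example (the helper shape is hypothetical):

```scala
// Gate a test on either upstream Spark 3.0.x or the Databricks 3.0.0 shim.
def isValidTestForSparkVersion(version: ShimVersion): Boolean = version match {
  case SparkShimVersion(major, minor, _)      => major == 3 && minor == 0
  case DatabricksShimVersion(major, minor, _) => major == 3 && minor == 0
}
```

With a single vendor field on `SparkShimVersion` instead, one pattern with the extra `_` would cover both cases, at the cost of losing room for Databricks-specific fields such as the runtime version.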

Signed-off-by: Andy Grove <andygrove@nvidia.com>
@andygrove (Contributor, Author):

Thanks @tgravescs, I hadn't realized that the Databricks shim was already merged in. I have made those updates, although this change has turned out larger than I originally thought. Let me know what you think; I'm happy to take a different approach if this is too disruptive.

@tgravescs (Collaborator):

Overall I think this is fine, just a few minor nits. We can enhance it as we need to.

@tgravescs (Collaborator):

build

Signed-off-by: Andy Grove <andygrove@nvidia.com>
@andygrove (Contributor, Author):

build

@andygrove (Contributor, Author):

@tgravescs I addressed those nits.

@tgravescs tgravescs merged commit 0d585ab into NVIDIA:branch-0.2 Aug 4, 2020
@andygrove andygrove deleted the spark-shim-version branch August 14, 2020 18:16
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* Add ability to ignore tests depending on spark shim version

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* better version check

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* remove duplicate VERSION strings and use case classes instead

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* fix formatting

Signed-off-by: Andy Grove <andygrove@nvidia.com>
pxLi pushed a commit to pxLi/spark-rapids that referenced this pull request May 12, 2022
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this pull request Nov 30, 2023
* adding clang-format for cpp files

Signed-off-by: Mike Wilson <knobby@burntsheep.com>
Labels: test (Only impacts tests)
Linked issue (closed by merging this pull request): [BUG] Spark3.1 StringFallbackSuite regexp_replace null cpu fall back test fails.