
Add ability to ignore tests depending on spark shim version #504

Merged: 4 commits into NVIDIA:branch-0.2 on Aug 4, 2020

Conversation

@andygrove (Contributor):

Signed-off-by: Andy Grove <andygrove@nvidia.com>

Adds the ability to ignore Scala tests conditionally, depending on the version of Spark we are testing with.

Also closes #382
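As a minimal sketch of the mechanism (not the exact code in this PR), ScalaTest's `assume` can cancel, rather than fail, a test when a version predicate does not hold; the suite name, the stub accessor, and the hard-coded version below are illustrative stand-ins:

```scala
import org.scalatest.FunSuite

// Simplified stand-in for the shim version case class introduced in this PR.
case class SparkShimVersion(major: Int, minor: Int, patch: Int)

class VersionGatedSuite extends FunSuite {
  // Stub; in the plugin the running version would come from the shim layer.
  def currentShimVersion: SparkShimVersion = SparkShimVersion(3, 0, 0)

  test("behavior that only applies to Spark 3.0.x") {
    val SparkShimVersion(major, minor, _) = currentShimVersion
    val isValidTestForSparkVersion = major == 3 && minor == 0
    // assume() cancels the test when the predicate is false, reporting
    // "Test Canceled: isValidTestForSparkVersion was false".
    assume(isValidTestForSparkVersion)
    // ... test body runs only against Spark 3.0.x ...
  }
}
```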

Signed-off-by: Andy Grove <andygrove@nvidia.com>
@tgravescs (Collaborator):

build

@tgravescs tgravescs added this to the Aug 3 - Aug 14 milestone Aug 3, 2020
@tgravescs tgravescs added the test Only impacts tests label Aug 3, 2020
```scala
case SparkShimVersion(major, minor, _) =>
  // this test is not valid in Spark 3.1 and later because the expression is
  // NullIntolerant and gets replaced with a null literal instead
  val isValidTestForSparkVersion = major <= 3 && minor == 0
```
@andygrove (Contributor, Author) commented on the diff above:

Although it seems unnecessary to have this separate variable, it makes the output message more obvious:

`Test Canceled: isValidTestForSparkVersion was false`

@tgravescs (Collaborator) left a comment:

The SparkShimServiceProvider in each of the directories already has a VERSIONNAME string in it; it would be nice if we didn't have to duplicate it. Also, you are missing the 300 Databricks version here. I assume you added the DatabricksShimVersion class for it? We may need more information for Databricks (the runtime), but we can always extend it later.

Signed-off-by: Andy Grove <andygrove@nvidia.com>
@andygrove (Contributor, Author):

I wasn't sure how this would work with Databricks or what the version numbers would be. The case classes might give us more flexibility than strings; perhaps we could use these instead of the string for the version name?

I pushed another commit to demonstrate how I imagined this working for databricks.

@andygrove andygrove self-assigned this Aug 3, 2020
@tgravescs (Collaborator):

We can use the case classes as the real version and give them a toString function that is used in SparkShimServiceProvider for now. That way we don't have the version in two places.
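A rough sketch of that suggestion, with the case class as the single source of truth and the provider's string derived from it (the object layout is an assumption for illustration; only the case class names come from this PR):

```scala
sealed trait ShimVersion

case class SparkShimVersion(major: Int, minor: Int, patch: Int) extends ShimVersion {
  override def toString: String = s"$major.$minor.$patch"
}

case class DatabricksShimVersion(major: Int, minor: Int, patch: Int) extends ShimVersion {
  override def toString: String = s"$major.$minor.$patch-databricks"
}

// A shim service provider can then derive its version string from the case
// class instead of keeping a second hard-coded VERSIONNAME literal.
object Spark300ShimServiceProvider {
  val VERSION = SparkShimVersion(3, 0, 0)
  val VERSIONNAME: String = VERSION.toString // "3.0.0"
}
```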

I would like to add it to the Databricks version as well; I don't want to have to fix it afterwards. Have it run the tests as it does now, and the test is passing. They only changed this in Spark 3.1, and so far the Databricks 7.0 runtime doesn't have that change. If the test ends up failing we can handle that separately, but I would at least like to make an attempt here; otherwise compilation will fail. The Databricks version is just 3.0.0-databricks; it is not 7.0, which is the runtime version.

The downside to having different case classes is demonstrated here, where you would have to match on both. If we had a "vendor" field in SparkShimVersion you could conditionalize on that; in that case your check (note the extra `_`), `SparkShimVersion(major, minor, _, _) => major == 3 && minor == 0`, would still work and cover Databricks.
I can see reasons to do it both ways, though. For instance, this way allows the Databricks version to have a separate runtime field as well. I'm fine with leaving it as different case classes for now, but please add the version to the 300db shim layer as DatabricksShimVersion(3,0,0) and have the test run for it.

I think either way we will have to enhance the Databricks version (and probably this check) to handle cases where the version is something like 3.0.0.0, but there is a separate jira for that.
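For reference, the two-arm match described above could look roughly like this, reusing the sketch classes from the earlier example (the helper shape is hypothetical):

```scala
// Gate a test on either upstream Spark 3.0.x or the Databricks 3.0.0 shim.
def isValidTestForSparkVersion(version: ShimVersion): Boolean = version match {
  case SparkShimVersion(major, minor, _)      => major == 3 && minor == 0
  case DatabricksShimVersion(major, minor, _) => major == 3 && minor == 0
}
```

With a single vendor field on `SparkShimVersion` instead, one pattern with the extra `_` would cover both cases, at the cost of losing room for Databricks-specific fields such as the runtime version.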

Signed-off-by: Andy Grove <andygrove@nvidia.com>
@andygrove (Contributor, Author):

Thanks @tgravescs, I hadn't realized that the Databricks shim was already merged in. I have made those updates, although this change has turned out larger than I originally thought. Let me know what you think; I'm happy to take a different approach if this is too disruptive.

@tgravescs (Collaborator):

Overall I think this is fine, just a few minor nits. We can enhance it as we need to.

@tgravescs (Collaborator):

build

Signed-off-by: Andy Grove <andygrove@nvidia.com>
@andygrove (Contributor, Author):

build

@andygrove (Contributor, Author):

@tgravescs I addressed those nits.

@tgravescs tgravescs merged commit 0d585ab into NVIDIA:branch-0.2 Aug 4, 2020
@andygrove andygrove deleted the spark-shim-version branch August 14, 2020 18:16
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* Add ability to ignore tests depending on spark shim version

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* better version check

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* remove duplicate VERSION strings and use case classes instead

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* fix formatting

Signed-off-by: Andy Grove <andygrove@nvidia.com>
pxLi pushed a commit to pxLi/spark-rapids that referenced this pull request May 12, 2022
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this pull request Nov 30, 2023
* adding clang-format for cpp files

Signed-off-by: Mike Wilson <knobby@burntsheep.com>
Labels: test (Only impacts tests)
Linked issue (closed by merging this pull request): [BUG] Spark3.1 StringFallbackSuite regexp_replace null cpu fall back test fails.