[BUILD] databricks IT tests should run in parallel #1499
Is your feature request related to a problem? Please describe.
We should enable the Databricks IT tests to run in parallel. We added support for it, but it's not used in the Databricks test scripts.

Comments
I'll make it run in parallel with pytest-xdist, similar to how the Spark pre-merge build does it.
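For context, a minimal sketch of what running the suite under pytest-xdist looks like (the `TEST_PARALLEL` name and worker count here are illustrative, not the exact script contents):

```shell
# Sketch only: run the integration tests across N workers with pytest-xdist.
# TEST_PARALLEL is a hypothetical variable name used here for illustration.
pip install pytest-xdist
TEST_PARALLEL=4
python -m pytest -n "$TEST_PARALLEL" integration_tests/src/main/python
```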
Still working on the issue. I can set up the environment to run the tests in parallel the way the pre-merge build does. There are some failures; I'm checking whether some Python modules are missing.
Test pipeline: https://blossom.nvidia.com/sw-gpu-spark-jenkins/view/Testing/job/tim-db-build-0/
Build & test (45 minutes) can be done within 1 hour.
Most of the tests pass, but there are still some failures I'm tracking:
21:33:45 = 213 failed, 4164 passed, 130 skipped, 163 xfailed, 6 xpassed, 66 warnings in 2165.62s (0:36:05) =
@tgravescs @revans2 @jlowe
Blossom Jenkins: https://blossom.nvidia.com/sw-gpu-spark-jenkins/view/Testing/job/tim-db-build-0/7
PR #1549: https://github.com/NVIDIA/spark-rapids/pull/1549/files
But the 3 modules below still fail; could you please help check? Thanks!
21:38:38 195 failed, 4182 passed, 130 skipped, 163 xfailed, xpassed, 68 warnings in 2214.54s (0:36:54)
Failing modules: window_function_test.py, tpch_test.py, udf_test.py
I saw exceptions in the parallel log: integration_tests/target/run_dir/target/surefire-reports/scala-test-detailed-output.log
Oh, some of them are hitting the stack overflow issue in optimizer.SparkPlanStats.computeStats now for some reason. The script might have other options we weren't using before; need to look more.
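As a side note, a common workaround for StackOverflowError in deep Catalyst optimizer recursion is raising the JVM thread stack size; this is a sketch only, since the thread doesn't confirm which option was missing, and the `PYSP_TEST_` env-var names are assumptions about how the runner forwards Spark confs:

```shell
# Sketch only: raise the driver/executor thread stack size to 4 MB,
# a common workaround for StackOverflowError in deep query plans.
export PYSP_TEST_spark_driver_extraJavaOptions="-Xss4m"
export PYSP_TEST_spark_executor_extraJavaOptions="-Xss4m"
```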
The cache tests are different; the error above is TPC-H. The window tests failed because the Spark context was already stopped, presumably from one of the other errors.
The TPC-H tests weren't running before because we didn't set --std_input_path.
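For reference, a sketch of how the flag is passed to pytest (the data path here is illustrative, not the real location of the standard input files):

```shell
# Sketch: enable the TPC-H tests by pointing --std_input_path at the
# directory holding the standard input data (path is illustrative).
python -m pytest --std_input_path="$PWD/integration_tests/src/test/resources" \
    integration_tests/src/main/python/tpch_test.py
```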
Test node: g4dn.xlarge (4 CPUs, 16 GB memory)
Since the nightly Databricks pipeline runs the integration tests without any parameters, I also removed the configs below from the parallel scripts: https://github.com/NVIDIA/spark-rapids/blob/branch-0.4/integration_tests/run_pyspark_from_build.sh#L85-L88
Are you sure that they are all passing and not just being skipped? Some of the lines you removed are the ones that allow TPC-H to run. You also removed the lines for setting the time zone to UTC, which might cause all of the timestamp tests to be skipped if the time zone is not UTC by default.
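For illustration, the kind of settings under discussion look roughly like this (a hedged reconstruction, not the verbatim removed lines):

```shell
# Sketch: force the JVM and Spark session time zone to UTC so the
# timestamp tests are exercised rather than skipped. The PYSP_TEST_
# naming convention is assumed, not the exact script lines.
export PYSP_TEST_spark_driver_extraJavaOptions="-Duser.timezone=UTC"
export PYSP_TEST_spark_executor_extraJavaOptions="-Duser.timezone=UTC"
export PYSP_TEST_spark_sql_session_timeZone="UTC"
```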
@revans2 From https://blossom.nvidia.com/sw-gpu-spark-jenkins/job/rapids_databricks301_nightly-dev-github/56/consoleFull:
SKIPPED [32] ../../src/main/python/conftest.py:169: std_input_path is not configured
• https://github.com/NVIDIA/spark-rapids/blob/branch-0.4/integration_tests/run_pyspark_from_build.sh#L85-L88 removed (udf_test.py passes)
What happens if it is just lines 85, 86, and 97 that are removed?
I guess it would also pass. Let me check; I'll update the results here.
Lines 85 and 86 make me think that we cannot set the Java command line options with findspark on Databricks. I am not sure why this would cause some of the tests to fail, though. I think we need someone to actually debug and root-cause these issues at this point. Removing line 97 just disables a lot of tests and sidesteps the problem.
That also passes.
@revans2 @tgravescs Do we need to create an issue for the Databricks IT skipping the tests below?
Pipeline: https://blossom.nvidia.com/sw-gpu-spark-jenkins/view/Testing/job/tim-db-build-0/9/console
Yes, we need a specific issue to investigate those failures.
Closing the issue since #1645 was merged.