Split expensive pytest files in cases level [skip ci] #4336

pxLi · 2021-12-09T09:12:19Z

Signed-off-by: Peixin Li <pxli@nyu.edu>

- enable worker cleanup
- dynamically set parallelism for normal tests
- separate GPU memory-consuming and time-consuming cases
- parallelize 311+ cache_test

Verified in multiple scenarios; this saves around 15 mins (30x) and 30 mins (311+) compared to the current setup.
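To illustrate what "dynamically set parallelism" could look like, here is a minimal sketch. It is not the logic from this PR: the function name `compute_parallelism`, its parameters, and the rule (take the smaller of a CPU-based limit and a GPU-memory-based limit, with a floor of one worker) are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from jenkins/spark-tests.sh.
# Args: CPU cores, total GPU memory (MiB), assumed memory per test worker (MiB).
compute_parallelism() {
  local cpu_cores=$1 gpu_mem_mib=$2 mem_per_worker_mib=$3
  # How many workers fit into GPU memory (integer division).
  local by_mem=$(( gpu_mem_mib / mem_per_worker_mib ))
  # Use the smaller of the memory-based and CPU-based limits.
  local result=$(( by_mem < cpu_cores ? by_mem : cpu_cores ))
  # Always run at least one worker.
  (( result < 1 )) && result=1
  echo "${result}"
}

# Example values only: 8 cores, 16 GiB GPU, 4 GiB per worker.
compute_parallelism 8 16384 4096
```

The resulting number could then be passed to whatever option controls the worker count for the normal tests.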

revans2 · 2021-12-09T14:04:35Z

jenkins/spark-tests.sh

    # --halt "now,fail=1": exit when the first job fail, and kill running jobs.
    #                      we can set it to "never" and print failed ones after finish running all tests if needed
    # --group: print stderr after test finished for better readability
+    parallel --group --halt "now,fail=1" -j2 run_test ::: ${mem_consuming_cases}
+
+    time_consuming_tests_str=$(echo ${time_consuming_tests} | xargs | sed 's/ / or /g')


Would it be better to have a tag for these tests?

It would also be good if we could document these things better somewhere.

> Would it be better to have a tag for these tests?

Yes, a tag would be better here. Let me add a TODO, and I will add tags here if we see more cases in this category.
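For context on what the tag would replace: the `time_consuming_tests_str` line in the diff joins a whitespace-separated list of test names into a single `or` expression of the kind accepted by pytest's `-k` option. The sketch below reproduces that pipeline in isolation; the helper name `build_k_expr` and the test names are placeholders, not the real list from the script.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the same echo | xargs | sed pipeline used in the diff.
build_k_expr() {
  # xargs with no command collapses newlines and repeated spaces into single spaces;
  # sed then replaces every space with " or ".
  echo "$1" | xargs | sed 's/ / or /g'
}

# Placeholder names for illustration only.
time_consuming_tests="first_test
second_test   third_test"

build_k_expr "${time_consuming_tests}"
# prints: first_test or second_test or third_test
```

With a pytest marker on these tests, the name list and the string joining would no longer be needed, since the selection could use `-m` with the marker name instead.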

tgravescs · 2021-12-09T14:46:56Z

jenkins/spark-tests.sh

+    if [[ $PARALLEL_TEST == "true" ]] && [ -x "$(command -v parallel)" ]; then
+      cache_test_cases=$(./run_pyspark_from_build.sh -k "cache_test" \
+                            --collect-only -qq 2>/dev/null | grep -oP '(?<=::).*?(?=\[)' | uniq | shuf | xargs)
+      # hardcode parallelism as 4


The comment looks wrong; the code below uses -j5.

nice catch, fixed
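As a reading aid for the `grep -oP` step in the diff, the sketch below runs the same regular expression over a few sample test IDs. The sample lines are an assumption about the `file::test[param]` form the collect step emits, and `extract_case_names` is a hypothetical name. `shuf` is left out so the output is deterministic.

```shell
#!/usr/bin/env bash
# Sample test IDs, assumed for illustration; not real collect-only output.
sample_output='src/main/python/cache_test.py::test_cache_join[Integer]
src/main/python/cache_test.py::test_cache_join[Long]
src/main/python/cache_test.py::test_cache_count[Integer]'

extract_case_names() {
  # (?<=::) look-behind for "::", .*? lazy match, (?=\[) look-ahead for "[".
  # Requires a grep built with PCRE support (-P), such as GNU grep.
  # uniq drops adjacent duplicates left by parametrized variants.
  grep -oP '(?<=::).*?(?=\[)' | uniq
}

cases=$(echo "${sample_output}" | extract_case_names | xargs)
echo "${cases}"
# prints: test_cache_join test_cache_count
```

Note that a test ID without a `[param]` suffix would not match this pattern, because the look-ahead requires an opening bracket.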

tgravescs

Looks fine; please file an issue for the follow-up tagging.

pxLi · 2021-12-13T00:52:27Z

build

pxLi · 2021-12-13T00:54:45Z

Created #4348 to track the follow-up tagging updates.

Split expensive pytest files in cases level

231e4d7

Signed-off-by: Peixin Li <pxli@nyu.edu>

pxLi added the "test" (Only impacts tests) and "improve" labels Dec 9, 2021

pxLi requested review from GaryShen2008, jlowe, NvTimLiu, revans2 and tgravescs as code owners December 9, 2021 09:12

revans2 previously approved these changes Dec 9, 2021

tgravescs reviewed Dec 9, 2021

add more doc and rename func

ce16d17

pxLi dismissed revans2’s stale review via ce16d17 December 10, 2021 01:09

tgravescs approved these changes Dec 10, 2021

pxLi mentioned this pull request Dec 13, 2021

[FEA] tagging for nightly test categories #4348

Closed

pxLi merged commit bc0cccb into NVIDIA:branch-22.02 Dec 13, 2021

revans2 mentioned this pull request Jan 20, 2022

[BUG] test_hash_reduction_decimal_overflow_sum[30] failed OOM in integration tests #4315

Closed
