TestNG execution hangs with version 7.9 or higher, with large amount of test cases #3152

bapszy · 2024-07-23T07:08:25Z

TestNG Version

7.9 and all newer versions

Note: only the latest version is supported

Expected behavior

Tests run, reports generated, test run finished successfully.

Actual behavior

When running a large number of test cases (200+), execution doesn't stop even though the last test case has finished; the build is only terminated by the timeout.
We use Appium with TestNG for mobile testing in GitLab, on Linux runners. This happens only with the large all-tests suites. We run tests in parallel, and don't use DataProvider or parameterization.
We use Gradle to execute tests with useTestNG().
Everything works with version 7.8, but with none of the newer versions. I tried to debug on our side but could not find the cause. AfterMethod runs, but AfterSuite doesn't start; both have alwaysRun=true set. I couldn't figure out what happens in between.
Please help. Thanks.

Is the issue reproducible on runner?

Test case sample

Please, share the test case (as small as possible) which shows the issue
The issue happens with a large number of test cases (200+).
Our alltest configuration:

(Sorry I needed to hide some parts)

Contribution guidelines

In case you plan to raise a pull request to fix this issue, please make sure you refer to our Contributing section for the detailed set of steps.

krmahadevan · 2024-07-23T07:20:05Z

@bapszy - Can you please share a sample that can be used to reproduce the problem? I ask because, given the nature of your issue, it's not going to be easy to identify the root cause without a sample.

Please feel free to create a simple project that doesn't use any library apart from TestNG, which we can use to reproduce the issue.

github-actions · 2024-07-23T07:20:27Z

Hi, @bapszy.
We need more information to reproduce the issue.

Please help share a Minimal, Reproducible Example that can be used to recreate the issue.

In addition to sharing a sample, please also add the following details:

TestNG version being used.
JDK version being used.
How was the test run (IDE/Build Tools)?
What does your build file (pom.xml | build.gradle | build.gradle.kts) look like?

It would be better if you could share a sample project that can be directly used to reproduce the problem.
Reply to this issue when all information is provided, thank you.

bapszy · 2024-07-23T07:41:27Z

@krmahadevan As I mentioned, it only happens with a large number of test cases. I don't know how to create an example that doesn't take a long time to build and can still reproduce the issue. Writing 200+ fake tests seems like a lot of work. Our all-tests suite runs for more than 1 hour, on a complex system of Appium, GitLab runners, and Mac test machines that run the Android emulators and iOS simulators. If you have any suggestions, I would appreciate them.

I'm happy to answer any questions you have.

We are using Gradle 8.8
We are on Java 17

It happens when we run our big test suites in GitLab. (Locally it would take a whole day to run all tests on 1 thread.)
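Writing 200+ fake tests by hand is not strictly necessary for a reproducer; the source can be generated. The following is only a sketch of that idea, not something taken from the reporter's project: the class name `FakeSuiteGenerator`, the output file `GeneratedTests.java`, and the 10 ms sleep are all hypothetical choices. It emits one TestNG class with N trivial test methods plus `@AfterMethod` and `@AfterSuite` hooks marked `alwaysRun = true`, mirroring the setup described above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: generates the source of a large, dependency-free TestNG class.
final class FakeSuiteGenerator {

    // Builds Java source for a class with testCount trivial @Test methods
    // and alwaysRun=true @AfterMethod / @AfterSuite hooks.
    static String generate(String className, int testCount) {
        StringBuilder src = new StringBuilder();
        src.append("import org.testng.annotations.AfterMethod;\n");
        src.append("import org.testng.annotations.AfterSuite;\n");
        src.append("import org.testng.annotations.Test;\n\n");
        src.append("public class ").append(className).append(" {\n\n");
        for (int i = 1; i <= testCount; i++) {
            src.append("    @Test\n");
            src.append("    public void test").append(i)
               .append("() throws InterruptedException {\n");
            src.append("        Thread.sleep(10); // stand-in for real work\n");
            src.append("    }\n\n");
        }
        src.append("    @AfterMethod(alwaysRun = true)\n");
        src.append("    public void afterMethod() {\n");
        src.append("        System.out.println(\"AfterMethod ran\");\n");
        src.append("    }\n\n");
        src.append("    @AfterSuite(alwaysRun = true)\n");
        src.append("    public void afterSuite() {\n");
        src.append("        System.out.println(\"AfterSuite ran\");\n");
        src.append("    }\n");
        src.append("}\n");
        return src.toString();
    }

    public static void main(String[] args) throws IOException {
        // Writes the generated class into the current directory (assumed location).
        Files.writeString(Path.of("GeneratedTests.java"), generate("GeneratedTests", 250));
    }
}
```

The generated class could then be placed in a plain Gradle project and run in parallel with the affected TestNG version. Whether such a synthetic suite actually triggers the hang is an open question.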

krmahadevan · 2024-07-23T08:02:07Z

@bapszy I hear you.

But please help me understand how I should go about debugging this problem to find out what is going wrong. You have not shared any information so far that I can work with.

I would at least need to know what your code does, which is why I am asking for a sample to reproduce.

itkhanz · 2024-07-25T14:49:21Z

@bapszy We have a similar setup, and I at least have not encountered any such issue. Here is our environment:

Gradle 8.9
Java 17
TestNG 7.10.2
Appium Java client 9.2.3
Azure Pipelines (Linux machines)
BrowserStack remote execution

Also, I use a TestNG suite XML file and have around 100 tests, which altogether take about 1.5-2 hours when running in non-parallel mode. However, I do not have any methods annotated with @AfterSuite.

This issue may be related to how your pipeline is configured to run tests, or to the Appium client-server communication.

Please share your build.gradle so we can see how you have configured TestNG, as well as the command you use to run tests. Without an example to reproduce, it is hard to debug.

bapszy · 2024-07-25T15:01:39Z

In our build.gradle we configured this:

test {
    // run specific suite from command line with gradlew test -Psuite1
    systemProperty "configFailurePolicy", "continue"
    systemProperty "surefire.printSummary", "false"
    useTestNG() {
        useDefaultListeners = true

        // Android
        if (project.hasProperty('AndroidAllTests')) {
            suites 'src/main/java/com/****/*****/suites/android/AllTests.xml'
        }
    }
}

And we start the test run in the GitLab yml:
gradle -i test -PAlltest

Currently I'm going through the snapshots with a binary search to see if I can find the commit where the problem first occurs.
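The binary search over snapshots mentioned above can be sketched as follows. This is an illustration only: the build labels and the `isBad` check are hypothetical placeholders, and in practice `isBad` would mean "run the full suite against this snapshot jar and report whether it timed out". The sketch assumes the builds are ordered oldest to newest and that every build after the first bad one is also bad.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper for bisecting an ordered list of builds.
final class SnapshotBisect {

    // Returns the index of the first build for which isBad is true, or -1 if none is.
    // Assumes builds are ordered oldest to newest and that "bad" is monotonic.
    static <T> int firstBad(List<T> builds, Predicate<T> isBad) {
        int lo = 0;
        int hi = builds.size(); // exclusive upper bound of the search range
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (isBad.test(builds.get(mid))) {
                hi = mid;       // first bad build is at mid or earlier
            } else {
                lo = mid + 1;   // first bad build is after mid
            }
        }
        return lo < builds.size() ? lo : -1;
    }

    public static void main(String[] args) {
        // Placeholder labels, not real snapshot names.
        List<String> builds = List.of(
                "build-01", "build-02", "build-03", "build-04",
                "build-05", "build-06", "build-07", "build-08");
        // Stub check: pretend everything from build-05 onward hangs.
        int index = firstBad(builds, b -> b.compareTo("build-05") >= 0);
        System.out.println("First failing build: " + builds.get(index));
    }
}
```

With N snapshots this needs about log2(N) suite runs, which matters when each run takes more than an hour.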

itkhanz · 2024-07-25T16:01:31Z

There is apparently nothing wrong here. Run your tests with Gradle's debug mode, and enable debug logging in GitLab if it has one. This will help you see what is going on after the very last test has finished executing. You can also check the Appium server logs to see what's going on in case some test execution is stuck, or add logging statements to see at which test or configuration method execution hangs.
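Besides logging statements, a thread dump taken while the build is stuck shows which threads are still alive and what they are waiting on. A JVM does not exit while non-daemon threads are running, so the daemon flag is worth printing. The sketch below uses only the JDK's `Thread.getAllStackTraces()`; the class name `ThreadDump` is hypothetical, and where to call it from (for example a watchdog timer or a listener) is left open. Running `jstack <pid>` against the hung process gives similar information without code changes.

```java
import java.util.Map;

// Hypothetical diagnostic helper: describes every live thread in the current JVM.
final class ThreadDump {

    // Returns one header line per live thread (name, daemon flag, state)
    // followed by its current stack frames.
    static String describeLiveThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            out.append(String.format("\"%s\" daemon=%b state=%s%n",
                    t.getName(), t.isDaemon(), t.getState()));
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append(System.lineSeparator());
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(describeLiveThreads());
    }
}
```

Threads reported with `daemon=false` that are parked in an executor or blocked on I/O after the last test has finished are the first candidates to look at.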

bapszy · 2024-07-29T13:34:56Z

This snapshot version is where it starts to fail:
testng-7.9.0-20231218.041814-36.jar | Mon Dec 18 04:20:08 UTC 2023

If I'm right, this is the corresponding pull request:
#3014

krmahadevan · 2024-07-29T14:55:31Z

@bapszy - That is a good find. But AFAIK the segregation of the executor logic alone shouldn't cause any stalling.

The only stalling issue I am aware of was #3028, and I think I fixed that as well in 7.11.0.

bapszy · 2024-07-29T15:08:33Z

@krmahadevan I don't see version 7.11.0, only 7.10.2.

What is the correct way of enabling TestNG logging? We tried log4testng.properties, but it doesn't work.

krmahadevan · 2024-07-29T15:11:42Z

7.11.0 is the upcoming version; it is not yet released. But that fix won't affect you if you are NOT using the shared thread pool executors (are you doing that somewhere?).

As for the correct way of enabling TestNG logging: TestNG now uses the standard slf4j for its logging. So you can use your favorite implementation (log4j or the JUL logger) to configure the slf4j logger and dump logs.

bapszy · 2024-08-14T13:09:11Z

Hi again,
I was on holiday and am now continuing the investigation. But the 7.9 snapshots are missing from https://oss.sonatype.org/content/repositories/snapshots/org/testng/testng/

bapszy · 2024-08-14T18:05:03Z

@krmahadevan
I got the logging to work, and with version 7.9.0 I could reach the point where it hangs.
At the end of the log I see the following line exactly every 60 seconds:
[idle-connection-reaper] DEBUG PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS

Is this something TestNG calls?

krmahadevan · 2024-08-15T01:20:45Z

No. That looks like some connection manager at work, cleaning up stale HTTP connections.
TestNG does not have any connection managers.

krmahadevan · 2024-08-15T01:21:53Z

Regarding the 7.9 snapshots missing from https://oss.sonatype.org/content/repositories/snapshots/org/testng/testng/: I am not sure what may have happened there. I personally have not paid much attention to the snapshot jars because hardly anyone asks for them.

ilya-corp · 2024-09-17T06:01:14Z

Hi, it looks like I have the same problem. There are 250 tests, of which approximately 15 failed and 15 were skipped.
I have tried JDK 21 and 22. The TestNG version is the latest.
I start the tests from an XML runner file in IntelliJ IDEA. Selenide 7.4.2 or 7.5.0 is also in use.

<!DOCTYPE suite SYSTEM "https://testng.org/testng-1.0.dtd" >

<suite name="lk-multithreading" parallel="none" configfailurepolicy="continue">
    <listeners>
        <listener class-name="org.testng.reporters.TestHTMLReporter" />
        <listener class-name="org.testng.reporters.EmailableReporter2" />
        <listener class-name="org.testng.reporters.FailedReporter" />
        <listener class-name="org.testng.reporters.XMLReporter" />
        <listener class-name="ru.company.core.TestOrderRandomizer" />
    </listeners>
    <test name="proxy" parallel="methods" thread-count="8">
        <groups>
            <run>
                <exclude name="not-for-production" />
                <exclude name="draft" />
                <exclude name="no-proxy" />
            </run>
        </groups>
        <packages>
            <package name="ru.company.lk.tests.functional.*">
                <exclude name="ru.company.lk.tests.functional.debug.*" />
                <exclude name="ru.company.lk.tests.functional.monitoring.StagingLiveCheckerTests.*" />
                <exclude name="ru.company.lk.tests.functional.regular.RegularTests" />
            </package>
        </packages>
    </test>

    <test name="no-proxy" parallel="methods" thread-count="2">
        <groups>
            <run>
                <include name="no-proxy" />
                <exclude name="not-for-production" />
                <exclude name="draft" />
            </run>
        </groups>
        <packages>
            <package name="ru.company.lk.tests.functional.*">
                <exclude name="ru.company.lk.tests.functional.registration.*" />
                <exclude name="ru.company.lk.tests.functional.debug.*" />
                <exclude name="ru.company.lk.tests.functional.monitoring.StagingLiveCheckerTests.*" />
                <exclude name="ru.company.lk.tests.functional.regular.RegularTests" />
            </package>
        </packages>
    </test>
</suite>

It looks like the @AfterSuite hooks do not start after all tests have completed.
Meanwhile, the TestNG plugin in IntelliJ IDEA keeps showing the progress animation for skipped tests.
To skip a test manually I use throw new SkipException("skipped");

krmahadevan added the needs-sample label Jul 23, 2024
