Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: investigate async-hooks/test-signalwrap #14070

Closed
refack opened this issue Jul 4, 2017 · 17 comments
Closed

test: investigate async-hooks/test-signalwrap #14070

refack opened this issue Jul 4, 2017 · 17 comments
Assignees
Labels
async_hooks Issues and PRs related to the async hooks subsystem. flaky-test Issues and PRs related to the tests with unstable failures on the CI. test Issues and PRs related to the tests.

Comments

@refack
Copy link
Contributor

refack commented Jul 4, 2017

  • Version: master
  • Platform: aix61-ppc64
  • Subsystem: test,async_hooks
not ok 39 async-hooks/test-signalwrap
  ---
  duration_ms: 0.240
  severity: fail
  stack: |-
    Mismatched onsigusr2 function calls. Expected exactly 2, actual 0.
        at Object.exports.mustCall (/home/iojs/build/workspace/node-test-commit-aix/nodes/aix61-ppc64/test/common/index.js:484:10)
        at Object.<anonymous> (/home/iojs/build/workspace/node-test-commit-aix/nodes/aix61-ppc64/test/async-hooks/test-signalwrap.js:14:30)
        at Module._compile (module.js:569:30)
        at Object.Module._extensions..js (module.js:580:10)
        at Module.load (module.js:503:32)
        at tryModuleLoad (module.js:466:12)
        at Function.Module._load (module.js:458:3)
        at Function.Module.runMain (module.js:605:10)
        at startup (bootstrap_node.js:158:16)
  ...

https://ci.nodejs.org/job/node-test-commit-aix/6994/nodes=aix61-ppc64/console

/cc @nodejs/async_hooks @nodejs/platform-aix

@refack refack added aix Issues and PRs related to the AIX platform. async_hooks Issues and PRs related to the async hooks subsystem. flaky-test Issues and PRs related to the tests with unstable failures on the CI. test Issues and PRs related to the tests. labels Jul 4, 2017
@gireeshpunathil
Copy link
Member

Running a thousand times locally to see if it fails.

@refack
Copy link
Contributor Author

refack commented Jul 4, 2017

Stress CI (the whole test/async-hooks/ suite [which runs in parallel]): https://ci.nodejs.org/job/node-stress-single-test/1314/nodes=aix61-ppc64/

@refack
Copy link
Contributor Author

refack commented Jul 5, 2017

Stress CI (the whole test/async-hooks/ suite [which runs in parallel]): https://ci.nodejs.org/job/node-stress-single-test/1314/nodes=aix61-ppc64/

Stress returned no results 🤷‍♂️

@Trott
Copy link
Member

Trott commented Jul 5, 2017

Stress returned no results

You had previously landed something that has the async-hooks tests running parallel, right? If so, maybe try the stress test again under load by passing something like -j 96 --repeat 960? Just an idea, not a particularly informed guess or anything as I haven't looked at the test. It's kinda my standard first thing to try when I can't make a test that's failing on CI fail otherwise.

@refack
Copy link
Contributor Author

refack commented Jul 5, 2017

You had previously landed something that has the async-hooks tests running parallel, right? If so, maybe try the stress test again under load by passing something like -j 96 --repeat 960? Just an idea, not a particularly informed guess or anything as I haven't looked at the test. It's kinda my standard first thing to try when I can't make a test that's failing on CI fail otherwise.

Worth a shot - https://ci.nodejs.org/job/node-stress-single-test/1315/nodes=aix61-ppc64/
Why 96?

@Trott
Copy link
Member

Trott commented Jul 5, 2017

Why 96?

I think there's a default shell setting on macOS that will stop things at 100, so I usually go for just under that as a first pass. And 96 instead of 97, 98, or 99 because it's kinda sorta almost a power of 2 (64 + 32), and we all prefer powers of 2, don't we?

@refack
Copy link
Contributor Author

refack commented Jul 5, 2017

@Trott you are an evil genius 😉 this gave us a plethora of weirdness... No fails in async-hooks/test-signalwrap though.

A harness burp:

ok 869 async-hooks/test-callback-error
  ---
  duration_ms: 4.485
  ...
Exception in thread Thread-23:
Traceback (most recent call last):
  File "/opt/freeware/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/opt/freeware/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "tools/test.py", line 140, in RunSingle
    output = case.Run()
  File "/home/iojs/build/workspace/node-stress-single-test/nodes/aix61-ppc64/tools/test.py", line 537, in Run
    "TEST_THREAD_ID": "%d" % self.thread_id
  File "/home/iojs/build/workspace/node-stress-single-test/nodes/aix61-ppc64/tools/test.py", line 519, in RunCommand
    env)
  File "/home/iojs/build/workspace/node-stress-single-test/nodes/aix61-ppc64/tools/test.py", line 752, in Execute
    pty_out = pty_out
  File "/home/iojs/build/workspace/node-stress-single-test/nodes/aix61-ppc64/tools/test.py", line 649, in RunProcess
    **rest
  File "/opt/freeware/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/opt/freeware/lib/python2.7/subprocess.py", line 1231, in _execute_child
    self.pid = os.fork()
OSError: [Errno 11] Resource temporarily unavailable

ok 870 async-hooks/test-callback-error

A fail in callback-error (here we go again 🤦‍♂️ )

not ok 609 async-hooks/test-callback-error
  ---
  duration_ms: 4.97
  severity: fail
  stack: |-
    start case 1
    assert.js:586
    assert.ifError = function ifError(err) { if (err) throw err; };
                                                      ^
    
    Error: spawnSync /home/iojs/build/workspace/node-stress-single-test/nodes/aix61-ppc64/out/Release/node EAGAIN
        at exports._errnoException (util.js:1020:11)
        at Object.spawnSync (internal/child_process.js:918:20)
        at spawnSync (child_process.js:548:24)
        at Object.<anonymous> (/home/iojs/build/workspace/node-stress-single-test/nodes/aix61-ppc64/test/async-hooks/test-callback-error.js:51:17)
        at Module._compile (module.js:569:30)
        at Object.Module._extensions..js (module.js:580:10)
        at Module.load (module.js:503:32)
        at tryModuleLoad (module.js:466:12)
        at Function.Module._load (module.js:458:3)
        at Function.Module.runMain (module.js:605:10)
  ...

A different callback-error error:

not ok 41 async-hooks/test-callback-error
  ---
  duration_ms: 7.592
  severity: fail
  stack: |-
    start case 1
    end case 1: 726.624ms
    start case 2
    end case 2: 428.448ms
    start case 3
    /home/iojs/build/workspace/node-stress-single-test/nodes/aix61-ppc64/test/async-hooks/test-callback-error.js:84
      child.stdout.on('data', (data) => {
                  ^
    
    TypeError: Cannot read property 'on' of undefined
        at Object.<anonymous> (/home/iojs/build/workspace/node-stress-single-test/nodes/aix61-ppc64/test/async-hooks/test-callback-error.js:84:15)
        at Module._compile (module.js:569:30)
        at Object.Module._extensions..js (module.js:580:10)
        at Module.load (module.js:503:32)
        at tryModuleLoad (module.js:466:12)
        at Function.Module._load (module.js:458:3)
        at Function.Module.runMain (module.js:605:10)
        at startup (bootstrap_node.js:158:16)
        at bootstrap_node.js:575:3
  ...

@BridgeAR
Copy link
Member

@refack if I understand your comment correct it seems like the original error just had a hiccup and other errors came up as being flaky instead?

@refack
Copy link
Contributor Author

refack commented Sep 28, 2017

@BridgeAR It seems like stressing AIX gives many varied failures. But AFAICT the issue in the OP is still unresolved, but also unreproducible.

Personally I haven't seen any of those since. I'll run another stress test, and if nothing comes up consistently I think this can be closed.

https://ci.nodejs.org/job/node-stress-single-test/1424/nodes=aix61-ppc64/

@refack
Copy link
Contributor Author

refack commented Sep 29, 2017

Well we got one of the original:

not ok 24 async-hooks/test-signalwrap
  ---
  duration_ms: 7.424
  severity: fail
  stack: |-
    Mismatched onsigusr2Again function calls. Expected exactly 1, actual 0.
        at Object.exports.mustCall (/home/iojs/build/workspace/node-stress-single-test/nodes/aix61-ppc64/test/common/index.js:485:10)
        at process.onsigusr2 (/home/iojs/build/workspace/node-stress-single-test/nodes/aix61-ppc64/test/async-hooks/test-signalwrap.js:48:34)
        at process.<anonymous> (/home/iojs/build/workspace/node-stress-single-test/nodes/aix61-ppc64/test/common/index.js:517:15)
        at emitNone (events.js:105:13)
        at process.emit (events.js:207:7)
        at Signal.wrap.onsignal (internal/process.js:198:44)
  ...

(1 out of 100 while testing with -j 96 async-hooks)

@joyeecheung
Copy link
Member

joyeecheung commented Oct 31, 2017

Showed up in the 8.9.0 LTS proposal #16630 , this time on CentOS

https://ci.nodejs.org/job/node-test-commit-linux/13690/nodes=centos7-64/consoleFull

not ok 45 async-hooks/test-signalwrap
  ---
  duration_ms: 0.156
  severity: fail
  stack: |-
    Mismatched onsigusr2Again function calls. Expected exactly 1, actual 0.
        at Object.exports.mustCall (/home/iojs/build/workspace/node-test-commit-linux/nodes/centos7-64/test/common/index.js:501:10)
        at process.onsigusr2 (/home/iojs/build/workspace/node-test-commit-linux/nodes/centos7-64/test/async-hooks/test-signalwrap.js:48:34)
        at process.<anonymous> (/home/iojs/build/workspace/node-test-commit-linux/nodes/centos7-64/test/common/index.js:533:15)
        at emitNone (events.js:106:13)
        at process.emit (events.js:208:7)
        at Signal.wrap.onsignal (internal/process.js:208:44)
  ...

@joyeecheung joyeecheung removed the aix Issues and PRs related to the AIX platform. label Oct 31, 2017
@Trott
Copy link
Member

Trott commented Nov 17, 2017

Today on alpine36-container-x64:

https://ci.nodejs.org/job/node-test-commit-linux/14184/nodes=alpine36-container-x64/consoleFull

not ok 100 async-hooks/test-signalwrap
  ---
  duration_ms: 0.767
  severity: fail
  stack: |-
    Mismatched onsigusr2 function calls. Expected exactly 2, actual 1.
        at Object.exports.mustCall (/home/iojs/build/workspace/node-test-commit-linux/nodes/alpine36-container-x64/test/common/index.js:490:10)
        at Object.<anonymous> (/home/iojs/build/workspace/node-test-commit-linux/nodes/alpine36-container-x64/test/async-hooks/test-signalwrap.js:14:30)
        at Module._compile (module.js:644:30)
        at Object.Module._extensions..js (module.js:655:10)
        at Module.load (module.js:563:32)
        at tryModuleLoad (module.js:506:12)
        at Function.Module._load (module.js:498:3)
        at Function.Module.runMain (module.js:685:10)
        at startup (bootstrap_node.js:192:16)
  ...

@Trott
Copy link
Member

Trott commented Dec 22, 2017

alpine35-container-x64 too.

https://ci.nodejs.org/job/node-test-commit-linux/15151/nodes=alpine35-container-x64/consoleText

not ok 44 async-hooks/test-signalwrap
  ---
  duration_ms: 0.375
  severity: fail
  stack: |-
    Mismatched onsigusr2Again function calls. Expected exactly 1, actual 0.
        at Object.exports.mustCall (/home/iojs/build/workspace/node-test-commit-linux/nodes/alpine35-container-x64/test/common/index.js:496:10)
        at process.onsigusr2 (/home/iojs/build/workspace/node-test-commit-linux/nodes/alpine35-container-x64/test/async-hooks/test-signalwrap.js:48:34)
        at process.<anonymous> (/home/iojs/build/workspace/node-test-commit-linux/nodes/alpine35-container-x64/test/common/index.js:530:15)
        at process.emit (events.js:127:13)
        at Signal.wrap.onsignal (internal/process.js:198:44)
  ...

@Trott Trott self-assigned this Dec 22, 2017
@Trott
Copy link
Member

Trott commented Dec 22, 2017

I think I know what's probably causing this....but first let me see if I can reproduce it...

Stress test with -j 96 --repeat 192 test/async-hooks/test-signalwrap.js on alpine34: https://ci.nodejs.org/job/node-stress-single-test/1571/nodes=alpine34-container-x64/

@Trott
Copy link
Member

Trott commented Dec 22, 2017

Also, with -j 1 --repeat 192 test/async-hooks/test-signalwrap.js: https://ci.nodejs.org/job/node-stress-single-test/1572/nodes=alpine34-container-x64/

@Trott
Copy link
Member

Trott commented Dec 22, 2017

Oh, thank goodness, the first one (-j 96) managed to reproduce the issue:

not ok 25 async-hooks/test-signalwrap
  ---
  duration_ms: 11.527
  severity: fail
  stack: |-
    Mismatched onsigusr2 function calls. Expected exactly 2, actual 1.
        at Object.exports.mustCall (/home/iojs/build/workspace/node-stress-single-test/nodes/alpine34-container-x64/test/common/index.js:496:10)
        at Object.<anonymous> (/home/iojs/build/workspace/node-stress-single-test/nodes/alpine34-container-x64/test/async-hooks/test-signalwrap.js:14:30)
        at Module._compile (module.js:660:30)
        at Object.Module._extensions..js (module.js:671:10)
        at Module.load (module.js:577:32)
        at tryModuleLoad (module.js:517:12)
        at Function.Module._load (module.js:509:3)
        at Function.Module.runMain (module.js:701:10)
        at startup (bootstrap_node.js:194:16)
  ...

Trott added a commit to Trott/io.js that referenced this issue Dec 22, 2017
Use an interval to keep the event loop open so the test does not exit
before receiving all signals fom asynchronous `exec()` calls.

Fixes: nodejs#14070
@Trott
Copy link
Member

Trott commented Dec 22, 2017

Proposed fix: #17827

@Trott Trott closed this as completed in 457ceeb Dec 25, 2017
MylesBorins pushed a commit that referenced this issue Jan 8, 2018
Use an interval to keep the event loop open so the test does not exit
before receiving all signals fom asynchronous `exec()` calls.

PR-URL: #17827
Fixes: #14070
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
MylesBorins pushed a commit that referenced this issue Jan 9, 2018
Use an interval to keep the event loop open so the test does not exit
before receiving all signals fom asynchronous `exec()` calls.

PR-URL: #17827
Fixes: #14070
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
MylesBorins pushed a commit that referenced this issue Jan 9, 2018
Use an interval to keep the event loop open so the test does not exit
before receiving all signals fom asynchronous `exec()` calls.

PR-URL: #17827
Fixes: #14070
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
MylesBorins pushed a commit that referenced this issue May 22, 2018
Use an interval to keep the event loop open so the test does not exit
before receiving all signals fom asynchronous `exec()` calls.

PR-URL: #17827
Fixes: #14070
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
MylesBorins pushed a commit that referenced this issue Jun 14, 2018
Use an interval to keep the event loop open so the test does not exit
before receiving all signals fom asynchronous `exec()` calls.

PR-URL: #17827
Fixes: #14070
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
rvagg pushed a commit that referenced this issue Aug 16, 2018
Use an interval to keep the event loop open so the test does not exit
before receiving all signals fom asynchronous `exec()` calls.

PR-URL: #17827
Fixes: #14070
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
async_hooks Issues and PRs related to the async hooks subsystem. flaky-test Issues and PRs related to the tests with unstable failures on the CI. test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants