Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate flaky test-vm-timeout-rethrow on Windows #11261

Closed
mscdex opened this issue Feb 9, 2017 · 5 comments
Closed

Investigate flaky test-vm-timeout-rethrow on Windows #11261

mscdex opened this issue Feb 9, 2017 · 5 comments
Labels
test Issues and PRs related to the tests. vm Issues and PRs related to the vm subsystem. windows Issues and PRs related to the Windows platform.

Comments

@mscdex
Copy link
Contributor

mscdex commented Feb 9, 2017

  • Version: master
  • Platform: Windows
  • Subsystem: test, vm

Example: https://ci.nodejs.org/job/node-test-binary-windows/6447/RUN_SUBSET=3,VS_VERSION=vs2015,label=win2012r2/console

not ok 342 sequential/test-vm-timeout-rethrow
  ---
  duration_ms: 0.641
  severity: fail
  stack: |-
    
    assert.js:85
      throw new assert.AssertionError({
      ^
    AssertionError: false == true
        at process.<anonymous> (c:\workspace\node-test-binary-windows\RUN_SUBSET\3\VS_VERSION\vs2015\label\win2012r2\test\sequential\test-vm-timeout-rethrow.js:27:12)
        at emitOne (events.js:101:20)
        at process.emit (events.js:189:7)
  ...
@mscdex mscdex added test Issues and PRs related to the tests. vm Issues and PRs related to the vm subsystem. windows Issues and PRs related to the Windows platform. labels Feb 9, 2017
@kunalspathak
Copy link
Member

I did some investigation for this test case and it looks like some timing issue is causing it. In repro case, uv_process_timers() miss calling timer_cb because of this condition where timer->due is greater than loop->timeby 1. Because of this it gives time for script to complete execution before next timeprocess_timers` is called. In non-repro case, these 2 values are equal and timers are processed.

For repro case, here loop->time is not updated ( or incremented by 1) while timer's due is already incremented at this point. This causes the above condition to fail. For non-repro, loop->time gets incremented.

Repro:

Updating due_time. loop_time : 1724132451, timeout : 1, due : 1724132452.                          
Updated loop->time to 1724132451 at 499.                                                            
starting timer : due : 1724132452, loop_time : 1724132451, diff = 18446744073709 <-- This is condition fails and timer is skipped              
calling uv_process_reqs.                                                                            
Updated loop->time to 1724132452 at 441.                                                            
Updated loop->time to 1724132452 at 499.                                                            
starting timer : due : 1724132452, loop_time : 1724132452, diff = 0.            
running timer : due : 1724132452, loop_time : 1724132452, diff = 0.             
******TIMER called******                                                                            
calling uv_process_reqs.                                                                            
Updated loop->time to 1724132452 at 499.                                                            
calling uv_process_reqs.                                                                            
Updated loop->time to 1724132454 at 494.                                                            
Updated loop->time to 1724132454 at 494.                                                            
Updated loop->time to 1724132459 at 499.                                                            
calling uv_process_reqs.                                                                            
Updated loop->time to 1724132459 at 499.                                                            
calling uv_process_reqs.                                                                            
                                                                                                    
got Exception : false == true                                                                       
Updated loop->time to 1724132611 at 499.                                                            
calling uv_process_reqs.                                                                            
Updated loop->time to 1724132611 at 499.                                                            
calling uv_process_reqs.                 

No-Repro

Updating due_time. loop_time : 1724138355, timeout : 1, due : 1724138356.            
Updated loop->time to 1724138356 at 499.                                                            
starting timer : due : 1724138356, loop_time : 1724138356, diff = 0.            
running timer : due : 1724138356, loop_time : 1724138356, diff = 0.             
******TIMER called******                                                                            
calling uv_process_reqs.                                                                            
Updated loop->time to 1724138356 at 499.                                                            
calling uv_process_reqs.                                                                            
Updated loop->time to 1724138356 at 441.                                                            
Updated loop->time to 1724138356 at 499.                                                            
calling uv_process_reqs.                                                                            
                                                                                                    
Updated loop->time to 1724138491 at 499.                                                            
calling uv_process_reqs.                                                                            
Updated loop->time to 1724138492 at 499.                                                            
calling uv_process_reqs.                                                                            

I am wondering why would this be windows specific issue. Did we see similar issue in the past? Since when this test started failing?

@mscdex
Copy link
Contributor Author

mscdex commented Feb 17, 2017

I'm not sure how long it's been happening, I just happened to notice it that time and didn't find a pre-existing issue filed about it.

I don't know if Jenkins has an easy way to see the history of successes/failures of individual tests. Perhaps @nodejs/build knows the answer to this?

@gibfahn
Copy link
Member

gibfahn commented Feb 17, 2017

@mscdex you can look at the individual tap result and then click Next build and Previous build to skip through the tests. It's not a proper overview, but it is at least reasonably quick.

Also in the case where the tests mostly pass, you can just check the Red blobs

@kunalspathak
Copy link
Member

The earliest i can go is till run# 6435 which was triggered on Feb 8th. @joaocgreis , do you know if there is an easier way to check the history of a unit test in Jenkins?

@joaocgreis
Copy link
Member

There is no easy way to check the history of a test in CI.

cc @nodejs/testing

kunalspathak added a commit to kunalspathak/node that referenced this issue Feb 24, 2017
The intention of test case is to make sure that `timeout` property is honored
and the code in context terminates and throws correct exception. However in
test case, the code inside context would complete before `timeout` for windows
and would sometimes fail. Updated the code so it guarantee to not complete
execution until timeout is triggered.

Fixes: nodejs#11261
italoacasas pushed a commit to italoacasas/node that referenced this issue Feb 25, 2017
The intention of test case is to make sure that `timeout` property is honored
and the code in context terminates and throws correct exception. However in
test case, the code inside context would complete before `timeout` for windows
and would sometimes fail. Updated the code so it guarantee to not complete
execution until timeout is triggered.

Fixes: nodejs#11261
PR-URL: nodejs#11530
Reviewed-By: James M Snell <jasnell.gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Josh Gavant <josh.gavant@outlook.com>
jasnell pushed a commit that referenced this issue Mar 7, 2017
The intention of test case is to make sure that `timeout` property is honored
and the code in context terminates and throws correct exception. However in
test case, the code inside context would complete before `timeout` for windows
and would sometimes fail. Updated the code so it guarantee to not complete
execution until timeout is triggered.

Fixes: #11261
PR-URL: #11530
Reviewed-By: James M Snell <jasnell.gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Josh Gavant <josh.gavant@outlook.com>
jasnell pushed a commit that referenced this issue Mar 7, 2017
The intention of test case is to make sure that `timeout` property is honored
and the code in context terminates and throws correct exception. However in
test case, the code inside context would complete before `timeout` for windows
and would sometimes fail. Updated the code so it guarantee to not complete
execution until timeout is triggered.

Fixes: #11261
PR-URL: #11530
Reviewed-By: James M Snell <jasnell.gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Josh Gavant <josh.gavant@outlook.com>
MylesBorins pushed a commit that referenced this issue Mar 9, 2017
The intention of test case is to make sure that `timeout` property is honored
and the code in context terminates and throws correct exception. However in
test case, the code inside context would complete before `timeout` for windows
and would sometimes fail. Updated the code so it guarantee to not complete
execution until timeout is triggered.

Fixes: #11261
PR-URL: #11530
Reviewed-By: James M Snell <jasnell.gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Josh Gavant <josh.gavant@outlook.com>
MylesBorins pushed a commit that referenced this issue Mar 9, 2017
The intention of test case is to make sure that `timeout` property is honored
and the code in context terminates and throws correct exception. However in
test case, the code inside context would complete before `timeout` for windows
and would sometimes fail. Updated the code so it guarantee to not complete
execution until timeout is triggered.

Fixes: #11261
PR-URL: #11530
Reviewed-By: James M Snell <jasnell.gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Josh Gavant <josh.gavant@outlook.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test Issues and PRs related to the tests. vm Issues and PRs related to the vm subsystem. windows Issues and PRs related to the Windows platform.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants