NameResolutionPalTest failing intermittently on Linux Arm64 #27622

Open
weshaggard opened this issue Oct 12, 2018 · 16 comments
Labels
arch-arm64 area-System.Net disabled-test The test is disabled in source code against the issue os-linux Linux OS (any supported distro) test-bug Problem in test source code (most likely) test-run-core Test failures in .NET Core test runs

@weshaggard
Member

https://mc.dot.net/#/user/weshaggard/pr~2Fjenkins~2Fdotnet~2Fcorefx~2Fmaster~2F/test~2Ffunctional~2Fcli~2F/fdfc526a83091617be7978de875969f48d2666bf/workItem/System.Net.NameResolution.Pal.Tests/analysis/xunit/System.Net.NameResolution.PalTests.NameResolutionPalTests~2FTryGetAddrInfo_HostName

Message :
Assert.Equal() Failure
Expected: Success
Actual:   TryAgain
Stack Trace :
   at System.Net.NameResolution.PalTests.NameResolutionPalTests.TryGetAddrInfo_HostName() in /mnt/j/workspace/dotnet_corefx/master/linux-TGroup_netcoreapp+CGroup_Release+AGroup_arm64+TestOuter_false_prtest/src/System.Net.NameResolution/tests/PalTests/NameResolutionPalTests.cs:line 54

https://mc.dot.net/#/user/weshaggard/pr~2Fjenkins~2Fdotnet~2Fcorefx~2Fmaster~2F/test~2Ffunctional~2Fcli~2F/fdfc526a83091617be7978de875969f48d2666bf/workItem/System.Net.NameResolution.Pal.Tests/analysis/xunit/System.Net.NameResolution.PalTests.NameResolutionPalTests~2FTryGetAddrInfo_HostName_TryGetNameInfo

Message :
Assert.Equal() Failure
Expected: Success
Actual:   TryAgain
Stack Trace :
   at System.Net.NameResolution.PalTests.NameResolutionPalTests.TryGetAddrInfo_HostName_TryGetNameInfo() in /mnt/j/workspace/dotnet_corefx/master/linux-TGroup_netcoreapp+CGroup_Release+AGroup_arm64+TestOuter_false_prtest/src/System.Net.NameResolution/tests/PalTests/NameResolutionPalTests.cs:line 112
@danmoseley
Member

@karelz broke CI again
https://mc.dot.net/#/user/danmosemsft/pr~2Fjenkins~2Fdotnet~2Fcorefx~2Fmaster~2F/test~2Ffunctional~2Fcli~2F/5651b5c2b7134c8ab662c5594b10f86db7f3cb9c

Can we please make it outer loop or disable for ARM until someone has time to robustify it?

@karelz
Member

karelz commented Oct 29, 2018

@wfurt @rmkerr how often does it fail?
@wfurt is in the process of enabling us to debug test failures ...

@rmkerr
Contributor

rmkerr commented Oct 29, 2018

It doesn't appear to be failing at all in the daily runs, but it is failing regularly on PRs. It's failing ten to fifteen times a day, so we should fix it or move it to outerloop ASAP.

@wfurt
Member

wfurt commented Oct 30, 2018

Note that OSX has similar issues. For one, when the OS returns EAGAIN, we do not retry.
But I have seen cases where the OS simply returns "host not found".
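
To make the two failure modes above concrete, here is a minimal sketch of retrying only on a temporary resolver failure. It is written in Python for illustration and is not the corefx code: `resolve_with_retry`, `attempts`, `delay`, and `resolver` are hypothetical names, and it assumes the `TryAgain` result seen in the test corresponds to the resolver's `EAI_AGAIN` error.

```python
import socket
import time


def resolve_with_retry(host, attempts=3, delay=0.5, resolver=socket.getaddrinfo):
    """Resolve `host`, retrying only when the resolver reports EAI_AGAIN.

    `attempts`, `delay`, and `resolver` are illustrative parameters and
    are not taken from the corefx change.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return resolver(host, None)
        except socket.gaierror as exc:
            # EAI_AGAIN signals a temporary failure, so it is worth retrying.
            # Any other error (for example EAI_NONAME, "host not found") is
            # re-raised immediately, as is EAI_AGAIN on the final attempt.
            if exc.errno != socket.EAI_AGAIN or attempt == attempts:
                raise
            time.sleep(delay)
```

A retry like this would only mask the temporary failure; it would not help in the "host not found" case, which is raised on the first attempt.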

@danmoseley
Member

danmoseley commented Nov 2, 2018

@rmkerr
Contributor

rmkerr commented Nov 2, 2018

Yep, I'll do that now.

@wfurt wfurt self-assigned this Dec 19, 2018
wfurt referenced this issue in wfurt/corefx Jan 29, 2019
wfurt referenced this issue in dotnet/corefx Jan 30, 2019
* add instrumentation for #32797

* actually retry the lookup

* use PlatformID.Unix
@danmoseley
Member

@wfurt we re-enabled these tests on ARM in dotnet/corefx#34962; unfortunately, this one and another failed again there.

https://mc.dot.net/#/user/dotnet-bot/pr~2Fdotnet~2Fcorefx~2Frefs~2Fpull~2F35056~2Fmerge/test~2Ffunctional~2Fcli~2F/20190204.11/workItem/System.Net.NameResolution.Pal.Tests

Ubuntu.1604.Arm64.Open-arm64-Release
Unhandled Exception of Type Xunit.Sdk.EqualException
Message :
Assert.Equal() Failure
Expected: Success
Actual:   TryAgain
Stack Trace :
   at System.Net.NameResolution.PalTests.NameResolutionPalTests.TryGetAddrInfo_HostName_TryGetNameInfo() in /__w/1/s/src/System.Net.NameResolution/tests/PalTests/NameResolutionPalTests.cs:line 132

Same for TryGetAddrInfo_HostName.

@wfurt
Member

wfurt commented Feb 5, 2019

Yes, I know. I've been working with @ulisesh on some infrastructure changes. I was hoping we could get them done quickly to avoid another disable/enable cycle.
(See notes in dotnet/corefx#34934.)
If it's too noisy, I can disable the tests again and try re-enabling them later.

@danmoseley
Member

Is it possible to make them outer loop so they don't break CI jobs? Unless you have a reason for them to be inner loop. You can still request outer loop in CI for testing, of course...

@wfurt
Member

wfurt commented Feb 9, 2019

Changes to the container configuration were made. The last failure was at 2019-02-06 04:18:16 (~2 days ago).
Keeping fingers crossed.

@wfurt
Member

wfurt commented Feb 12, 2019

After 5 days of passes, it failed again today for pr/dotnet/corefx/refs/pull/34931/merge.

Are we sure we updated all machines, @ulisesh?
This is Ubuntu.1604.Arm64.Open, as opposed to Ubuntu.1604.Arm64, which is used for official builds.

@ulisesh
Contributor

ulisesh commented Feb 14, 2019

Right now, Ubuntu.1604.Arm64.Open doesn't run in containers; it runs on Centriq servers. We will move Ubuntu.1604.Arm64.Open to containers early next week.

@wfurt
Member

wfurt commented Mar 28, 2019

I did not see a failure for almost a month after the infrastructure changes. Closing for now.

@wfurt wfurt closed this as completed Mar 28, 2019
@stephentoub
Member

> I did not see a failure for almost a month after the infrastructure changes. Closing for now.

That's because the test is disabled.

@stephentoub stephentoub reopened this Jun 18, 2019
@karelz
Member

karelz commented Jun 28, 2019

Seems like a test issue; no need to have it in 3.0. cc @wfurt

@antonfirsov antonfirsov removed their assignment Jul 18, 2022
10 participants