-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Linux musl x64 build break apparently due to some libc version mismatch (high priority as it is blocking clean CI) #84949
Comments
The error sounds as if we attempted to load MUSL version of the libjitinterface_x64.so on a glibc distro. |
@sbomer It looks like the cross build is attempting to use crossgen2 with MUSL version of JIT native libraries while it should use the glibc ones. |
@janvorli - I see, so am I right to understand that in this particular case |
/cc @dotnet/jit-contrib |
I see this in the logs:
The musl x64 build changed to become a cross-build (build on x64 glibc, targeting x64 musl), so I think this needs to be changed to use the x64 glibc hosted crossgen2 that gets built as part of the runtime cross components build. I think we need to address this TODO: runtime/src/tasks/Crossgen2Tasks/Microsoft.NET.CrossGen.targets Lines 76 to 78 in add51b8
|
For example here's how we handle this when we crossgen corelib: runtime/src/coreclr/crossgen-corelib.proj Line 16 in add51b8
runtime/src/coreclr/crossgen-corelib.proj Line 78 in add51b8
|
I will try to reproduce this locally, but if someone is already set up to run the R2R outerloop tests locally, it might help us to fix and validate this faster. |
Actually, I think the place that needs to be fixed up is Lines 529 to 532 in add51b8
|
Overall I think this is the right place to fix this as this constructs the path to Crossgen2 for compiling the framework but I'm still somewhat struggling with understanding how this pre-existing cross-architecture scheme translates into cross-OS-variation build. If I'm right to understand JanV's comments, at some point we're likely producing separate Linux-targeting vs. Linux-musl-targeting jitinterface and we're loading / using the wrong one. I'll need to remind myself how we're producing Crossgen2 versions targeting these OS variants before I can start thinking about a fix. |
I have asked @davidwrighton offline for help as I believe he wrote most of these scripts, I'll continue working on this tomorrow if we don't nail it down before EOD today. One thing I'm still missing is why has this blown up right now considering we've been running Linux-musl tests for years as far as I recall? |
How we produce crossgen2 targeting those OS variants changed in #84148. Before that change, the x64 musl build was done on a musl build machine, but now it is done on an x64 glibc build machine, with an x64 musl rootfs. For musl x64, the TargetArchitecture and BuildArchitecture are both x64. I believe the fix will look something like #84952, but I haven't validated it yet.
It was caused recently by #84148. I think we just didn't catch it in PR because these jobs don't run there. |
Because we've never cross built x64 musl on x64 glibc before |
I see, thanks for explaining. |
Ok, I was able to see the failure locally, and #84952 fixes it. After the fix, I see output like:
|
By the way, do we know why the PR run on #84148 didn't catch this while obviously subsequent PR / CI runs have been hitting this? Three years ago, Santi wrote this filtering logic so that we decide what tests to run based on the extent of a PR, is there a mapping we're missing in the vein of, if we're touching this bit of code, we need to rerun these tests? |
It looks like it's because these jobs only run as part of the outerloop pipeline which doesn't get triggered by default in PRs. Maybe the PR jobs should do a pri0 R2R test build as well? Although it would still not have caught the bug unless we included musl x64 in the matrix. Hard to say what exactly the right balance is. |
I see, thanks for explaining. At some point I was convinced I was seeing these failures in the CI runs but you're right it doesn't make sense, we're only running the R2R legs in outerloop. It would be nice to add at least minimum coverage to PR but the dnceng team used to be strongly opposed to that as PR costs are already too high, perhaps if we manage to save some time elsewhere we might be able to do that. |
Platform: Linux musl x64
Example run: https://dev.azure.com/dnceng-public/public/_build/results?buildId=241777&view=logs&j=92e17f34-a5d5-5d2c-e1e5-826eb9786835&t=8b99f496-81fe-5fa6-0e3e-125347621b1c
Diagnostics:
/cc @janvorli @sbomer @dotnet/runtime-infrastructure @dotnet/crossgen-contrib
Known Issue Error Message
Fill the error message using known issues guidance.
Report
Summary
The text was updated successfully, but these errors were encountered: