Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating JLLs with Dependency("libcxxwrap_julia_jll") #2160

Closed
25 of 48 tasks
fingolfin opened this issue Nov 21, 2020 · 21 comments
Closed
25 of 48 tasks

Updating JLLs with Dependency("libcxxwrap_julia_jll") #2160

fingolfin opened this issue Nov 21, 2020 · 21 comments

Comments

@fingolfin
Copy link
Member

fingolfin commented Nov 21, 2020

Once we have versions of libcxxwrap_julia_jll for all Julia versions, we'll need to update the various JLLs using it to also provide multiple versions. Here I try to describe what's going on and why a change is needed; explain how to make that change; and track the progress of each affected package.

What's going on

Julia does not currently provide a stable ABI, meaning that things that were compiled against, say, Julia 1.3, may or may not work when linked at runtime against Julia 1.4/1.5/... -- and this holds of course for any combination of Julia versions. For libcxxwrap and anything using its headers (including all (?) JLLs using Dependency("libcxxwrap_julia_jll")), it is indeed a fact that a binary compiled against Julia 1.3 can lead to segfaults during GC when used with any later Julia version, and vice versa. Right now it is not definitely known whether other Julia combinations can lead to issues, but it is not unlikely that this will eventually happen.

Thus the only safe way forward is to provide binaries for libcxxwrap_julia_jll which are compiled against each of Julia 1.3, 1.4, 1.5, ....

In an ideal world, the julia version would simply be part of the Platform, and we'd be done. Indeed, @staticfloat added support for that -- alas, it will only work in Julia >= 1.6; older versions will balk when they see such platform identifiers. So as long as one wants to support any older Julia versions (which I'd expect to be everybody right now, as 1.6 has not yet been released, not even branched), extra work is needed.

On the upside to all this, cxxwrap now supports many more platforms, which you can potentially benefit from for your own JLLs, too!

How to deal with it

The following is an initial suggestion, and will be revised based on feedback; I suggest that people wait a bit with trying to implement this, until we've ironed out everything

In a nutshell, here's the plan: for each affected JLL, we build it several times, once for each supported Julia minor version (so e.g. one each for Julia 1.3, 1.4 and 1.5+), but built from the same source. For technical reasons, each such build will have to carry a separate version. The convention I propose is to set the patch level of your JLL to the minor version of Julia (so libcxxwrap_julia_jll 0.8.4 was built against Julia 1.4). Yes, that means you can't have a one-to-one mapping with the actual versions of the code your JLL wraps; but the longterm plan for JLLs is to completely decouple JLL versions from the versions of the wrapped software anyway.

There are many ways to deal with this; I'll describe one of them here, which I am using in PR #1798. The core idea is to rename your current build_tarballs.jl to common.jl and then add one subdir for each supported Julia version, each with a build_tarballs.jl including common.jl.

For example, PR #1798 renames L/libsingular_julia/build_tarballs.jl to L/libsingular_julia/common.jl and adds L/libsingular_julia/libsingular_julia@1.3/build_tarballs.jl which looks like this:

const julia_version = v"1.3.1"
include("../common.jl")

For technical reasons, the build_tarballs.jl for Julia 1.4 and 1.5 will be added in later PRs (AFAIK Yggdrasil tooling does not allow adding multiple versions of a JLL in a single PR -- so far that would have been a weird thing to do anyway).

In common.jl, some changes are made:

  • the version is adjusted to set the patchlevel to match the Julia minor version: version = VersionNumber(0, 2, julia_version.minor)
  • the list of platforms is expanded to match the platforms supported by libcxxwrap_julia_jll (this could be neater, see also Add a way to "inherit" default platform list from another JLL BinaryBuilder.jl#957)
  • the dependencies are adjusted:
    • instead of Julia_jll, we now need libjulia_jll in the right version:
      BuildDependency(PackageSpec(name="libjulia_jll", version=julia_version)),
  • we inform build_tarballs about the runtime Julia requirements, by adding this keyword argument:
    julia_compat = "$(julia_version.major).$(julia_version.minor)"
    Note that this is not using ~ as one might expect, simply so that the build for Julia 1.5 can be used with Julia 1.6 until a libjulia_jll for 1.6 exists (yes, that might not work, due to ABI incompatibilities; but at least it has a chance of working, which is better than what we'd have if we used ~ in the version specification. Technically this means one could also use the JLL version made for Julia 1.3 in Julia 1.4; but since the JLL version for Julia 1.4 is strictly newer than that for Julia 1.3, this shouldn't happen unless a user forces that, and well, then they deserve it...

UPDATE: we also recommend that all packages using libcxxwrap_julia_jl should use preferred_gcc_version=v"8" or higher; see #2236 and JuliaInterop/libcxxwrap-julia#75

Affected JLLs and their status

Note: I plan to ping the respective authors once we have a stable plan, but for now, it seems premature to bother them. In particular until PR #2158 is ready...

@jstrube
Copy link
Contributor

jstrube commented Nov 27, 2020

FastJet_Julia_Wrapper + Julia 1.5 #2180
LCIO_Julia_Wrapper + Julia 1.5 #2181

I don't know of any users that need support for Julia 1.4 or earlier, so I'd prefer not to make releases for those, unless a compelling case arises.

@barche
Copy link
Contributor

barche commented Nov 28, 2020

I tried the latest libcxxwrap_julia_jll v0.8.5 and when running test CxxWrap I get this error:

dyld: lazy symbol binding failed: Symbol not found: __Unwind_Resume
  Referenced from: /Users/user/tmp/clean_julia_depot_cxx/artifacts/1417b691b9d1117225089c656ede8940fbf2eba8/lib/libtypes.dylib
  Expected in: /Applications/Julia-1.5.app/Contents/Resources/julia/bin/../lib/libjulia.dylib

dyld: Symbol not found: __Unwind_Resume
  Referenced from: /Users/user/tmp/clean_julia_depot_cxx/artifacts/1417b691b9d1117225089c656ede8940fbf2eba8/lib/libtypes.dylib
  Expected in: /Applications/Julia-1.5.app/Contents/Resources/julia/bin/../lib/libjulia.dylib

@fingolfin
Copy link
Member Author

fingolfin commented Nov 29, 2020

@barche Damn! OK, first the good news is that in Julia 1.3 and 1.4, test CxxWrap works fine on both Mac and Linux machine.

The bad news is that I can reproduce the issue you describe on macOS; and also a crash on Linux with Julia 1.5 (log see below). I'll try to look into both.

UPDATE: submitted this also as JuliaInterop/libcxxwrap-julia#75

...
Running tests from containers.jl...

signal (11): Segmentation fault
in expression starting at /home/mhorn/.julia/packages/CxxWrap/ZOkSN/test/containers.jl:21
_ZN5jlcxx6detail11CallFunctorINS_10ConstArrayIdLl1EEEJEE5applyEPKv at /home/mhorn/.julia/artifacts/860a8b2216bd059600ed7c44cdaa3bb81b23ff1c/lib/libjlcxx_containers.so (unknown line)
const_vector at /home/mhorn/.julia/packages/CxxWrap/ZOkSN/src/CxxWrap.jl:590
macro expansion at /home/mhorn/.julia/packages/CxxWrap/ZOkSN/test/containers.jl:29 [inlined]
macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115 [inlined]
top-level scope at /home/mhorn/.julia/packages/CxxWrap/ZOkSN/test/containers.jl:23
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:834
jl_parse_eval_all at /buildworker/worker/package_linux64/build/src/ast.c:913
jl_load_rewrite at /buildworker/worker/package_linux64/build/src/toplevel.c:914
include at ./client.jl:457
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2214 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2398
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1690 [inlined]
do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:117
eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:206
eval_stmt_value at /buildworker/worker/package_linux64/build/src/interpreter.c:157 [inlined]
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:566
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:492
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:492
jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:660
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:840
jl_parse_eval_all at /buildworker/worker/package_linux64/build/src/ast.c:913
jl_load_rewrite at /buildworker/worker/package_linux64/build/src/toplevel.c:914
include at ./client.jl:457
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2231 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2398
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1690 [inlined]
do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:117
eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:206
eval_stmt_value at /buildworker/worker/package_linux64/build/src/interpreter.c:157 [inlined]
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:566
jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:660
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:840
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:790
jl_toplevel_eval_in at /buildworker/worker/package_linux64/build/src/toplevel.c:883
eval at ./boot.jl:331
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2214 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2398
exec_options at ./client.jl:272
_start at ./client.jl:506
jfptr__start_53898.clone_1 at /home/mhorn/julia-1.5.3/lib/julia/sys.so (unknown line)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2214 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2398
jl_apply at /buildworker/worker/package_linux64/build/ui/../src/julia.h:1690 [inlined]
true_main at /buildworker/worker/package_linux64/build/ui/repl.c:106
main at /buildworker/worker/package_linux64/build/ui/repl.c:227
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_start at /home/mhorn/julia-1.5.3/bin/julia (unknown line)
Allocations: 16112006 (Pool: 16107521; Big: 4485); GC: 15

@fingolfin
Copy link
Member Author

@jstrube so, are the new JLLs at least working correctly for you?

@jstrube
Copy link
Contributor

jstrube commented Nov 29, 2020

LCIO.jl tests fine. I also get a CxxWrap segfault on the test...

Running tests from basic_types.jl...
char               -> CxxChar
signed char        -> Int8
unsigned char      -> UInt8
short              -> Int16
unsigned short     -> UInt16
int                -> Int32
unsigned int       -> UInt32
long               -> Int64
unsigned long      -> UInt64
long long          -> CxxLongLong
unsigned long long -> CxxULongLong
int8_t             -> Int8
uint8_t            -> UInt8
int16_t            -> Int16
uint16_t           -> UInt16
int32_t            -> Int32
uint32_t           -> UInt32
int64_t            -> Int64
uint64_t           -> UInt64
Running tests from containers.jl...

signal (11): Segmentation fault
in expression starting at /home/jstrube/.julia/packages/CxxWrap/ZOkSN/test/containers.jl:21
_ZN5jlcxx6detail11CallFunctorINS_10ConstArrayIdLl1EEEJEE5applyEPKv at /home/jstrube/.julia/artifacts/860a8b2216bd059600ed7c44cdaa3bb81b23ff1c/lib/libjlcxx_containers.so (unknown line)
const_vector at /home/jstrube/.julia/packages/CxxWrap/ZOkSN/src/CxxWrap.jl:590
macro expansion at /home/jstrube/.julia/packages/CxxWrap/ZOkSN/test/containers.jl:29 [inlined]
macro expansion at /home/jstrube/Applications/julia-stable/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115 [inlined]
top-level scope at /home/jstrube/.julia/packages/CxxWrap/ZOkSN/test/containers.jl:23
jl_toplevel_eval_flex at /home/jstrube/Applications/julia-stable/src/toplevel.c:834
jl_parse_eval_all at /home/jstrube/Applications/julia-stable/src/ast.c:913
jl_load_rewrite at /home/jstrube/Applications/julia-stable/src/toplevel.c:914
include at ./client.jl:457
jl_apply at /home/jstrube/Applications/julia-stable/src/julia.h:1690 [inlined]
do_call at /home/jstrube/Applications/julia-stable/src/interpreter.c:117
eval_value at /home/jstrube/Applications/julia-stable/src/interpreter.c:206
eval_stmt_value at /home/jstrube/Applications/julia-stable/src/interpreter.c:157 [inlined]
eval_body at /home/jstrube/Applications/julia-stable/src/interpreter.c:552
eval_body at /home/jstrube/Applications/julia-stable/src/interpreter.c:492
eval_body at /home/jstrube/Applications/julia-stable/src/interpreter.c:492
jl_interpret_toplevel_thunk at /home/jstrube/Applications/julia-stable/src/interpreter.c:660
jl_toplevel_eval_flex at /home/jstrube/Applications/julia-stable/src/toplevel.c:840
jl_parse_eval_all at /home/jstrube/Applications/julia-stable/src/ast.c:913
jl_load_rewrite at /home/jstrube/Applications/julia-stable/src/toplevel.c:914
include at ./client.jl:457
jl_apply at /home/jstrube/Applications/julia-stable/src/julia.h:1690 [inlined]
do_call at /home/jstrube/Applications/julia-stable/src/interpreter.c:117
eval_value at /home/jstrube/Applications/julia-stable/src/interpreter.c:206
eval_stmt_value at /home/jstrube/Applications/julia-stable/src/interpreter.c:157 [inlined]
eval_body at /home/jstrube/Applications/julia-stable/src/interpreter.c:552
jl_interpret_toplevel_thunk at /home/jstrube/Applications/julia-stable/src/interpreter.c:660
jl_toplevel_eval_flex at /home/jstrube/Applications/julia-stable/src/toplevel.c:840
jl_toplevel_eval_flex at /home/jstrube/Applications/julia-stable/src/toplevel.c:790
jl_toplevel_eval_in at /home/jstrube/Applications/julia-stable/src/toplevel.c:883
eval at ./boot.jl:331
exec_options at ./client.jl:272
_start at ./client.jl:506
jfptr__start_48296 at /home/jstrube/Applications/julia-stable/usr/lib/julia/sys.so (unknown line)
jl_apply at /home/jstrube/Applications/julia-stable/ui/../src/julia.h:1690 [inlined]
true_main at /home/jstrube/Applications/julia-stable/ui/repl.c:106
main at /home/jstrube/Applications/julia-stable/ui/repl.c:227
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_start at /home/jstrube/Applications/julia-stable/usr/bin/julia (unknown line)
Allocations: 15238304 (Pool: 15234197; Big: 4107); GC: 15
ERROR: Package CxxWrap errored during testing (received signal: 11)

@fingolfin
Copy link
Member Author

fingolfin commented Nov 29, 2020

I checked the dylib's in usr/lib of my julia master build, and found this:

usr/lib/libgcc_s.1.dylib
000000000000e700 T __Unwind_Resume
0000000000011023 s __Unwind_Resume.cold
000000000000e7d0 T __Unwind_Resume_or_Rethrow
0000000000011028 s __Unwind_Resume_or_Rethrow.cold

usr/lib/libgmpxx.4.dylib
                 U __Unwind_Resume

usr/lib/libgmpxx.dylib
                 U __Unwind_Resume

usr/lib/libosxunwind.dylib
00000000000071a9 S $ld$hide$os10.4$__Unwind_Resume
000000000000717c S $ld$hide$os10.4$__Unwind_Resume_or_Rethrow
00000000000071aa S $ld$hide$os10.5$__Unwind_Resume
000000000000717d S $ld$hide$os10.5$__Unwind_Resume_or_Rethrow
0000000000001080 T __Unwind_Resume
0000000000000b70 T __Unwind_Resume_or_Rethrow

This makes me (a) suspect that libgmpxx is also affected, and (b) wonder if this is somehow a C++ issue.

@fingolfin
Copy link
Member Author

Anyway, this may mean that static linking libosxunwind isn't an option after all...

@barche
Copy link
Contributor

barche commented Nov 29, 2020

I believe the Julia 1.5 linux segfault is orthogonal to the unwind issue, there have been sporadic crashes in this spot in the containers.jl test for a long time, I had hoped this was due to binary incompatibility too, but since that cause has been eliminated, it seems I need to dig deeper. At least it seems somewhat reproducible now :)

@barche
Copy link
Contributor

barche commented Nov 29, 2020

and (b) wonder if this is somehow a C++ issue.

Looking at where it fails in the C++ test, I believe this is related to C++ exceptions. The libunwind code also mentions exceptions in the context of this symbol. I expect this will happen whenever C++ code throws an exception. Libcxxwrap-julia catches all these exceptions and passes the message text to jl_error to show the message in Julia.

@fingolfin
Copy link
Member Author

OK, it's "good" to hear that this segfault was there before (I mean, a segfault is never good, but you know what I mean...). It crashes in jlcxx::detail::CallFunctor<jlcxx::ConstArray<double, 1l>>::apply(void const*).... Might be interesting to build a variant locally with debug symbols active and see if we can track the crash down in gdb -- perhaps the cool rr support in Julia will help, too. Thought it seems a bit annoying overall: need to build a modified libcxxwrap_julia_jll from a modified version of the libcxxwrap-julia repository... ah well :-).

I wonder if this might be a GC issue again. I just submitted PR JuliaInterop/libcxxwrap-julia#74 but that's pure guess work and not based on any actual debugging yet. Actually, I just see that your Travis tests against Julia nightly reproduce the crash, and my PR did nothing about it. So...

@fingolfin
Copy link
Member Author

We resolve the libcxxwrap crashes on Julia 1.5 by rebuilding with GCC 8. Since much of the cxxwrap code is in the headers, JLL builders using libcxxwrap should also consider using GCC 8 or newer. @jstrube that might affect your JLLs, too -- but of course if they work fine right now, there is no immediate need to rebuild them. However, I also noticed that your JLLs are still limited in the platforms they support -- just two, in fact. Is that intentional?

@jstrube
Copy link
Contributor

jstrube commented Dec 16, 2020

I can rebuild, but both LCIO.jl and FastJet.jl test fine. That could just be a result of lacking coverage, though, so I'll get on that.
The platforms are limited by the upstream libraries. I doubt this will change soon, but I'll update when the upstream libs support more architectures.

@fingolfin
Copy link
Member Author

I am confused: both LCIO_jll and FastJet_jll support for more platforms than their _julia counterparts? are you saying there are additional limitations beyond availability of those JLLs?

@jstrube
Copy link
Contributor

jstrube commented Dec 16, 2020

Oh, you're right! Thanks. Let me try to expand the platforms for the wrappers, then.

@rgcv
Copy link
Contributor

rgcv commented Dec 24, 2020

Been a hot minute! I'll soon jump on the bandwagon and take care of doing my best at updating libcgal_julia. Love the proposal @fingolfin, awesome work!

@findmyway
Copy link
Contributor

Hi @fingolfin , really appreciate your work!

The ConnectFourSolver related PR can be merged directly. For OpenSpiel I still need to spend some time on testing the v0.2 this weekend.

@rgcv
Copy link
Contributor

rgcv commented Dec 30, 2020

It might be a bit too late to discuss the way versioning of jlls when coupled with libjulia should be since several jlls already adopted the MAJOR.MINOR.JULIA_MINOR format. I was thinking that instead of it replacing the jll's minor version, it could maybe extend the typical version number by adding a dash to it (don't know what the formal semver component name is).

For example, extend the version number following the format MAJOR.MINOR.PATCH-JULIA_MINOR. This way, one could still benefit from the patch component for small featureless non-breaking updates. Of course, this would replace components such as release candidates (rcs, alphas, betas, what have you), but since that isn't seen much in jlls, that could work.

Just food for thought at this point, probably, but I'd like to stir up some discussion, search for some opinions. Or simple backhanded "no" slap-in-the-face answers, I accept those too, no biggy ^^

@fingolfin
Copy link
Member Author

@rgcv that's what I used initially but it is not possible because such versions are by definition prereleases, and the package registry does not accept prereleases.

@rgcv
Copy link
Contributor

rgcv commented Dec 30, 2020

@fingolfin Should've guessed you tried that, silly of me to wonder. Totally fine. In fact, major and minor could replace minor and patch bumps. Instead of having 0.x.y, one could have x.y.j where j is Julia's minor version.

This however seems problematic once 2.0 eventually hits. Granted, by then things might have changed enough for this not to be a problem. Concerns me nonetheless.

Be that as it may, right now, your proposal is more than sufficient and I'll gladly adopt it.

@fingolfin
Copy link
Member Author

@rgcv not at all silly, it was/is a very reasonable question and idea (I would have greatly preferred the approach you suggested for various reasons). Anyway, I've written some more on potential versioning schemes here: #2327 (comment)

@fingolfin
Copy link
Member Author

This is basically done (only z3 remains to be adapted, but there was no feedback on my PR from the original author of the Z3 builder, and they haven't been active on GitHub for ~half a year, so it doesn't seem to be a priority.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants