Segmentation fault on arm64 #628

Open
tillea opened this issue Jan 14, 2024 · 12 comments · Fixed by #634

@tillea

tillea commented Jan 14, 2024

Describe the bug
When running the test suite on the arm64 architecture, Python 3.11 segfaults.

To Reproduce
The Debian continuous integration test runs on all Debian release architectures. While it passes on amd64, it fails on arm64 and other architectures. Feel free to check the full build log.

Expected behavior
The test suite should pass on all architectures.

System

  • OS and version: Debian unstable
  • sparse version: 0.15.1
  • NumPy version: 1.24.2
  • Numba version: 0.57.1

Kind regards, Andreas.

@tillea added the bug label Jan 14, 2024
@hameerabbasi
Collaborator

This is likely a problem with Numba code generation -- do the tests pass with Py3.10 and below on other architectures?

@tillea
Author

tillea commented Jan 17, 2024

The test used to pass with Py3.10. You can check the list of architectures, including test logs, on our CI page.

@hameerabbasi
Collaborator

hameerabbasi commented Jan 17, 2024

Let me rephrase: do the tests pass with Python 3.11 and sparse 0.14, but Numba 0.57.1? How about Python 3.10, sparse 0.15.1 and Numba 0.57.1?

I unfortunately don't have access to an ARM64 machine, so I cannot debug this personally, and would rely on reporters to isolate the issue.
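
To make each tested combination easy to report, a small script along these lines could be run in every environment and its output pasted into the thread. This is only a sketch: the function name `environment_report` and the default package list are illustrative assumptions, not part of sparse or Numba.

```python
import platform
from importlib import metadata


def environment_report(packages=("sparse", "numba", "llvmlite", "numpy")):
    """Collect the interpreter version, architecture, and package versions."""
    report = {
        "python": platform.python_version(),
        "machine": platform.machine(),  # e.g. "aarch64" or "x86_64" on Linux
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the absence instead of failing, so partial setups still report.
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running it once per combination (for example Python 3.11 with sparse 0.14, then Python 3.10 with sparse 0.15.1) gives one comparable record for each test run.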

@tillea
Author

tillea commented Jan 17, 2024 via email

@hameerabbasi
Collaborator

@mtsokol IIRC you had a Mac, is that Apple Silicon by any chance? Could you reproduce this bug with the software versions mentioned?

@mtsokol
Collaborator

mtsokol commented Jan 18, 2024

> @mtsokol IIRC you had a Mac, is that Apple Silicon by any chance? Could you reproduce this bug with the software versions mentioned?

Unfortunately my Mac is an ancient MacBook Pro 2015 with Intel i7.

@hameerabbasi
Collaborator

I've attempted to fix this in #634, please re-open if the issue isn't resolved.

@tillea
Author

tillea commented Apr 24, 2024

Hi,
(sorry, I cannot find a re-open button)
I tried tag 0.16.0a4 (not sure whether this is considered alpha??) and the problem persists. In addition, I tried the amd64 test, which fails as well.
Kind regards, Andreas.

@hameerabbasi
Collaborator

hameerabbasi commented Apr 25, 2024

@tillea I just tested locally, it doesn't fail for me in a Docker container -- You might want to look at numba/numba#9109 (comment) and backporting llvm/llvm-project@2e1b838 to Debian's LLVM 14.

Relevant LLVM issue: llvm/llvm-project#61402

@hameerabbasi reopened this Apr 25, 2024
@hameerabbasi added the upstream label and removed the bug label Apr 25, 2024
@detrout

detrout commented Apr 25, 2024

Andreas asked me to help out with this bug as he has new Debian project leader responsibilities. I was slowly trying to help deal with the numba side of the problems, but fell behind on understanding the llvm fix.
Currently I'm trying to get the llvmlite maintainer to update llvmlite so I can release numba 0.59.1.

@hameerabbasi
Collaborator

@detrout Thank you for helping out -- Some background info from reading the Numba issue: it isn't an issue with Numba itself, but is present in Debian's LLVM 14 (and release LLVM 14, IIUC). The reason it doesn't show up with Numba from PyPI or conda-forge is that the LLVM patch is already applied in llvmlite on PyPI and in LLVM 14 from conda-forge, which is why I think backporting the patch might help.

@hameerabbasi
Collaborator

I recently ran the test suite on both an Apple Silicon Mac as well as multiple arm64-based containers trying to reproduce this, but the test suite ran fine. Can anyone, maybe @detrout, check what happens if llvmlite and numba are installed via PyPI instead of via apt? That would confirm a packaging issue, and would point to #628 (comment) being a possible cause.
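
Before comparing the two setups, it may help to confirm which copy of each package the interpreter actually picks up. The sketch below is a heuristic, not a definitive check: it assumes Debian's apt places Python modules under `/usr/lib/python3/dist-packages`, and the helper names `locate` and `install_origin` are made up for illustration.

```python
import importlib.util
from pathlib import PurePosixPath

# Assumption: Debian's apt installs Python modules here, while pip uses
# /usr/local/lib/... or a virtual environment's site-packages.
APT_PREFIX = PurePosixPath("/usr/lib/python3/dist-packages")


def install_origin(module_path):
    """Guess whether a module file came from apt or elsewhere (heuristic)."""
    path = PurePosixPath(module_path)
    return "apt" if APT_PREFIX in path.parents else "other (pip, venv, conda, ...)"


def locate(module_name):
    """Return the file path of a top-level module without importing it, or None."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    for name in ("llvmlite", "numba"):
        origin = locate(name)
        if origin is None:
            print(f"{name}: not found")
        else:
            print(f"{name}: {origin} -> {install_origin(origin)}")
```

If both packages report a non-apt origin and the test suite then passes, that would be consistent with the packaging explanation; if the segfault persists, the cause likely lies elsewhere.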


4 participants