Segmentation fault on import juliacall in a CI job #472

Closed
mtsokol opened this issue Mar 12, 2024 · 4 comments
Labels
bug Something isn't working

Comments

mtsokol commented Mar 12, 2024

Affects: JuliaCall

Describe the bug

Hi! I recently started using JuliaCall to call Julia from Python. My CI has several jobs: Ubuntu, macOS, and Windows. One step installs Julia, and another builds the Python package and runs pytest.

A segmentation fault happens on import juliacall, inside the init() function in juliacall/__init__.py, but only in the Ubuntu jobs. The macOS and Windows jobs complete successfully. Julia 1.10.2 gets installed.

Do you know what might be the reason?


Failing ubuntu output: https://github.com/pydata/sparse/actions/runs/8252820808/job/22573248981?pr=647#step:7:116
JuliaCall setup: https://github.com/willow-ahrens/finch-tensor/blob/main/src/finch/julia.py
JuliaCall version: https://github.com/willow-ahrens/finch-tensor/blob/1bf21a28d28a19ba1cea59c6f5a719cb8914e395/pyproject.toml#L11
CI definition that installs Julia: https://github.com/pydata/sparse/blob/4bfea8fa5b66393a1ff7c2db45218ba41f46baec/.github/workflows/ci.yml#L44
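
Because a segmentation fault kills the interpreter before pytest can report anything, one way to narrow it down is to run the import in a child process with Python's built-in `faulthandler` enabled. The following is only a sketch: `describe_exit` and `try_import` are hypothetical helper names, not part of JuliaCall, and the 600-second timeout is an arbitrary assumption.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Translate a subprocess return code into a readable status."""
    if returncode < 0:
        # On POSIX, a negative return code means the child died from that signal.
        try:
            return f"killed by {signal.Signals(-returncode).name}"
        except ValueError:
            return f"killed by signal {-returncode}"
    return "ok" if returncode == 0 else f"exited with status {returncode}"


def try_import(module: str, timeout: float = 600.0) -> tuple:
    """Import `module` in a child interpreter with faulthandler enabled.

    Returns (status, stderr). If the child crashes, stderr should contain
    the Python-level traceback that faulthandler dumps on a fatal signal.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return describe_exit(proc.returncode), proc.stderr
```

Calling `try_import("juliacall")` in the failing environment and printing both return values should show whether the child was killed by a signal and, if so, which Python frame was active at that moment.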

mtsokol added the bug label Mar 12, 2024

hameerabbasi commented Mar 13, 2024

Just to note, I can reproduce this in Docker outside of GitHub Actions, using the content in pydata/sparse#649 and this Dockerfile:

FROM --platform=x86_64 condaforge/miniforge3:latest

ADD ci/environment.yml .
RUN conda env create -f environment.yml

RUN mkdir -p /root/workdir
WORKDIR /root/workdir
ADD . .
RUN conda run --live-stream -n sparse-dev pip install -e .[tests]
RUN conda run --live-stream -n sparse-dev pytest --pyargs sparse/tests/test_backends.py

Also, switching to Poetry resolves the issue somehow.

hameerabbasi commented

Valgrind output
77.76 ERROR conda.cli.main_run:execute(124): `conda run pytest --pyargs sparse/tests/test_backends.py` failed. (See above for error)
78.52 ==7== 
78.52 ==7== HEAP SUMMARY:
78.52 ==7==     in use at exit: 529,620 bytes in 1,132 blocks
78.52 ==7==   total heap usage: 34,393 allocs, 33,261 frees, 38,464,700 bytes allocated
78.52 ==7== 
78.56 ==7== 528 bytes in 1 blocks are possibly lost in loss record 564 of 761
78.56 ==7==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
78.56 ==7==    by 0x2244CD: _PyObject_GC_NewVar (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D0CB: _PyEval_MakeFrameVector (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC3E: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24D447: object_vacall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x25BBCA: _PyObject_CallMethodIdObjArgs (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x17D3C4: PyImport_ImportModuleLevelObject.cold (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x241668: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x2E281F: _PyEval_Vector (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x2E2766: PyEval_EvalCode (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x2E9969: builtin_exec (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DE32: cfunction_vectorcall_FASTCALL (in /opt/conda/bin/python3.10)
78.56 ==7== 
78.56 ==7== 544 bytes in 1 blocks are possibly lost in loss record 565 of 761
78.56 ==7==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
78.56 ==7==    by 0x2244CD: _PyObject_GC_NewVar (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D0CB: _PyEval_MakeFrameVector (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC3E: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D62F: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D62F: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D62F: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x241F29: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7== 
78.56 ==7== 568 bytes in 1 blocks are possibly lost in loss record 570 of 761
78.56 ==7==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
78.56 ==7==    by 0x2244CD: _PyObject_GC_NewVar (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D0CB: _PyEval_MakeFrameVector (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC3E: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23DA3B: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23DA3B: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D62F: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D62F: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7== 
78.56 ==7== 664 bytes in 1 blocks are possibly lost in loss record 625 of 761
78.56 ==7==    at 0x483DFAF: realloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
78.56 ==7==    by 0x2EB828: _PyObject_GC_Resize (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D1C9: _PyEval_MakeFrameVector (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC3E: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D62F: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D62F: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D62F: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D62F: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7== 
78.56 ==7== 1,664 bytes in 2 blocks are possibly lost in loss record 719 of 761
78.56 ==7==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
78.56 ==7==    by 0x2244CD: _PyObject_GC_NewVar (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D0CB: _PyEval_MakeFrameVector (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC3E: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D62F: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D62F: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D62F: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x23D62F: _PyEval_EvalFrameDefault (in /opt/conda/bin/python3.10)
78.56 ==7==    by 0x24DC6B: _PyFunction_Vectorcall (in /opt/conda/bin/python3.10)
78.56 ==7== 
78.56 ==7== LEAK SUMMARY:
78.56 ==7==    definitely lost: 0 bytes in 0 blocks
78.56 ==7==    indirectly lost: 0 bytes in 0 blocks
78.56 ==7==      possibly lost: 3,968 bytes in 6 blocks
78.56 ==7==    still reachable: 525,652 bytes in 1,126 blocks
78.56 ==7==         suppressed: 0 bytes in 0 blocks
78.56 ==7== Reachable blocks (those to which a pointer was found) are not shown.
78.56 ==7== To see them, rerun with: --leak-check=full --show-leak-kinds=all
78.56 ==7== 
78.56 ==7== Use --track-origins=yes to see where uninitialised values come from
78.56 ==7== For lists of detected and suppressed errors, rerun with: -s
78.56 ==7== ERROR SUMMARY: 15 errors from 6 contexts (suppressed: 16 from 2)
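
When comparing several logs like the one above (for example, a failing conda run against a passing one), it can help to pull the leak summary into a small structure. This is a minimal sketch; `parse_leak_summary` is a hypothetical helper, and the regular expression assumes exactly the line format shown in the excerpt above.

```python
import re

# Matches lines such as "==7==    possibly lost: 3,968 bytes in 6 blocks".
LEAK_LINE = re.compile(
    r"==\d+==\s+(?P<kind>[a-z ]+?):\s+"
    r"(?P<bytes>[\d,]+) bytes in (?P<blocks>[\d,]+) blocks"
)


def parse_leak_summary(log: str) -> dict:
    """Return {category: (bytes, blocks)} from a Valgrind LEAK SUMMARY section."""
    summary = {}
    in_summary = False
    for line in log.splitlines():
        if "LEAK SUMMARY:" in line:
            in_summary = True
            continue
        if not in_summary:
            continue
        match = LEAK_LINE.search(line)
        if match is None:
            # The first line that is not a category line ends the summary.
            break
        # Valgrind prints thousands separators, so strip the commas.
        n_bytes = int(match["bytes"].replace(",", ""))
        n_blocks = int(match["blocks"].replace(",", ""))
        summary[match["kind"].strip()] = (n_bytes, n_blocks)
    return summary
```

The function reads only the summary counts; it does not interpret the individual loss records or the error contexts.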

hameerabbasi commented

Okay, I've managed to reduce it down to this (purely conda, juliapkg, and juliacall):

environment.yml:

name: sparse-dev
channels:
  - conda-forge
  - nodefaults
dependencies:
  - python
  - numpy
  - julia
  - pyjuliacall
  - pyjuliapkg

Dockerfile:

FROM --platform=x86_64 condaforge/miniforge3:latest

ADD environment.yml .
RUN conda env create -f environment.yml
RUN mkdir -p /root/workdir
WORKDIR /root/workdir
ADD test.py /root/workdir/test.py
RUN conda run --live-stream -n sparse-dev python test.py

test.py:

import juliacall
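
Since the reduced reproduction depends only on what conda resolves, recording the resolved versions alongside the crash may make reports easier to compare. Below is a sketch of such a report that deliberately avoids importing juliacall itself. `package_version` and `collect_report` are hypothetical helpers, and the distribution names `juliacall` and `juliapkg` are an assumption about how the conda packages `pyjuliacall` and `pyjuliapkg` register their metadata.

```python
import platform
import shutil
import sys
from importlib import metadata


def package_version(name: str) -> str:
    """Return the installed version of a distribution without importing it."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def collect_report(packages=("juliacall", "juliapkg", "numpy")) -> dict:
    """Gather interpreter, platform, and package details for a bug report."""
    report = {
        "python": platform.python_version(),
        "executable": sys.executable,
        "platform": platform.platform(),
        # Path of the first `julia` binary on PATH, if any.
        "julia_on_path": shutil.which("julia") or "not found",
    }
    for name in packages:
        report[name] = package_version(name)
    return report
```

Running `print(collect_report())` at the top of test.py, before the `import juliacall` line, would print the report even when the import itself crashes afterwards.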

cjdoris (Collaborator) commented Mar 14, 2024

I think this is a duplicate of #464 (please comment if not and I'll reopen).

cjdoris closed this as completed Mar 14, 2024

3 participants