Do not compile code with nvcc if no CUDA kernel exists #171

Merged
merged 2 commits into from
Dec 2, 2021

Conversation

@gigony gigony commented Dec 1, 2021

Currently, cuCIM links the CUDA runtime by compiling some of its C++ sources with nvcc (their LANGUAGE property is set to CUDA), which generates PTX code even though cuCIM's C++ code doesn't contain any CUDA kernels:

# At least one file needs to be compiled with nvcc.
# Otherwise, it will cause `/usr/bin/ld: cannot find -lcudart` error message.
set_source_files_properties(src/cucim.cpp src/filesystem/cufile_driver.cpp PROPERTIES LANGUAGE CUDA)

This breaks CuPy when it is used together with cuCIM, producing the following error (#170):

>>> import cupy as cp
>>> import cucim.clara
>>> a = cp.zeros((3,3))
...
cupy/cuda/memory.pyx in cupy.cuda.memory.Memory.__init__()

cupy_backends/cuda/api/runtime.pyx in cupy_backends.cuda.api.runtime.malloc()

cupy_backends/cuda/api/runtime.pyx in cupy_backends.cuda.api.runtime.check_status()

CUDARuntimeError: cudaErrorUnsupportedPtxVersion: the provided PTX was compiled with an unsupported toolchain.

The error is related to the nvcc version (from CUDA 11.5) used in RAPIDS GPUCI.

https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1gg3f51e3575c21[…]e003844fd2df23a72a9ee7364d30150ae7e9e

cudaErrorUnsupportedPtxVersion = 222
This indicates that the provided PTX was compiled with an unsupported toolchain. The most common reason for this, is the PTX was generated by a compiler newer than what is supported by the CUDA driver and PTX JIT compiler.

With CUDA Toolkit (nvcc) v11.4, the error does not occur.
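
The version mismatch described in the NVIDIA documentation can be sketched as a simple comparison. This is a minimal illustration, not part of this PR: the helper names are hypothetical, and the driver version used in the example is an assumption, since the report does not state it. CUDA encodes versions as integers of the form 1000 * major + 10 * minor (for example, 11050 for 11.5). In practice such integers could be obtained from calls like `cupy.cuda.runtime.driverGetVersion()`.

```python
from typing import Tuple


def decode_cuda_version(encoded: int) -> Tuple[int, int]:
    """Split CUDA's integer version encoding (1000 * major + 10 * minor)."""
    major, remainder = divmod(encoded, 1000)
    return major, remainder // 10


def ptx_likely_supported(toolkit_version: int, driver_version: int) -> bool:
    """Rough check: PTX from a toolkit newer than the driver's supported
    CUDA version is the common cause of cudaErrorUnsupportedPtxVersion."""
    return decode_cuda_version(driver_version) >= decode_cuda_version(toolkit_version)


# Hypothetical values: PTX produced by nvcc 11.5, driver supporting up to CUDA 11.4.
print(ptx_likely_supported(toolkit_version=11050, driver_version=11040))
# Same driver with PTX produced by nvcc 11.4.
print(ptx_likely_supported(toolkit_version=11040, driver_version=11040))
```

Under these assumed values the first check fails and the second passes, which is consistent with the error appearing only when the code is built with nvcc from CUDA 11.5.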

This PR fixes the error by linking libcudart.so explicitly instead of compiling sources with nvcc.
It also fixes warnings.

Close #170

@gigony gigony added the bug Something isn't working label Dec 1, 2021
@gigony gigony added this to the v21.12.00 milestone Dec 1, 2021
@gigony gigony requested review from a team as code owners December 1, 2021 11:31
@gigony gigony self-assigned this Dec 1, 2021
@gigony gigony changed the title Do not compile code with nvcc if no CUDA kernel code exists Do not compile code with nvcc if no CUDA kernel exists Dec 1, 2021
@jakirkham

rerun tests

@jakirkham

jakirkham commented Dec 1, 2021

There seem to be some test failures in this job, but they come from CuPy calls, which don't seem related to this change.

@jakirkham

After looking into this more, I think this failure may be related, though it is unclear how. I am not seeing this failure in Greg's PR (#172), but it reproduces here. The CuPy error suggests the problem arose before CuPy was invoked, so the cause may be elsewhere. We may need a more detailed traceback.

@jakirkham jakirkham changed the base branch from branch-22.02 to branch-21.12 December 1, 2021 22:54
@jakirkham jakirkham added the non-breaking Introduces a non-breaking change label Dec 2, 2021
@ajschmidt8 ajschmidt8 merged commit 28ac81f into rapidsai:branch-21.12 Dec 2, 2021
Development

Successfully merging this pull request may close these issues.

[BUG] cudaErrorUnsupportedPtxVersion with cuCIM+CuPy on CUDA 11.5