Do not compile code with nvcc if no CUDA kernel exists #171

gigony · 2021-12-01T11:31:16Z

Currently, cuCIM links the CUDA runtime by compiling some C++ sources with nvcc (their LANGUAGE property is set to CUDA). This generates PTX code even though cuCIM's C++ code doesn't contain any CUDA kernels.

# At least one file needs to be compiled with nvcc.
# Otherwise, it will cause `/usr/bin/ld: cannot find -lcudart` error message.
set_source_files_properties(src/cucim.cpp src/filesystem/cufile_driver.cpp PROPERTIES LANGUAGE CUDA)

This breaks CuPy, which fails with the following error (#170):

>>> import cupy as cp
>>> import cucim.clara
>>> a = cp.zeros((3,3))
...
cupy/cuda/memory.pyx in cupy.cuda.memory.Memory.__init__()

cupy_backends/cuda/api/runtime.pyx in cupy_backends.cuda.api.runtime.malloc()

cupy_backends/cuda/api/runtime.pyx in cupy_backends.cuda.api.runtime.check_status()

CUDARuntimeError: cudaErrorUnsupportedPtxVersion: the provided PTX was compiled with an unsupported toolchain.

The error is related to the nvcc version (from CUDA 11.5) used in RAPIDS GPUCI.

https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1gg3f51e3575c21[…]e003844fd2df23a72a9ee7364d30150ae7e9e

cudaErrorUnsupportedPtxVersion = 222
This indicates that the provided PTX was compiled with an unsupported toolchain. The most common reason for this, is the PTX was generated by a compiler newer than what is supported by the CUDA driver and PTX JIT compiler.

The error does not occur with CUDA toolkit (nvcc) v11.4.
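
As a rough illustration of the version mismatch described above, here is a minimal sketch that compares the toolkit version that produced the PTX against the CUDA version the installed driver supports. The function names are hypothetical, and the driver version used in the example (11.4) is an assumption for illustration, not a value reported in this PR. CUDA encodes versions as integers of the form 1000 * major + 10 * minor (for example, 11050 for 11.5).

```python
from typing import Tuple


def decode_cuda_version(encoded: int) -> Tuple[int, int]:
    """Split a CUDA version integer (1000 * major + 10 * minor) into (major, minor)."""
    major, remainder = divmod(encoded, 1000)
    return major, remainder // 10


def ptx_may_be_rejected(toolkit_version: int, driver_version: int) -> bool:
    """Return True when the PTX comes from a toolkit newer than the driver supports.

    Hypothetical helper: in that situation the driver's PTX JIT compiler may
    fail with cudaErrorUnsupportedPtxVersion.
    """
    return decode_cuda_version(toolkit_version) > decode_cuda_version(driver_version)


# PTX built by nvcc from CUDA 11.5, driver assumed to support up to CUDA 11.4.
print(ptx_may_be_rejected(11050, 11040))
# PTX built by nvcc from CUDA 11.4 on the same assumed driver.
print(ptx_may_be_rejected(11040, 11040))
```

In a real session, the driver-side value could be read with `cupy.cuda.runtime.driverGetVersion()`; the toolkit version that built the binary has to come from the build environment.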

This PR fixes the error by linking libcudart.so explicitly, so no source file needs to be compiled with nvcc.
It also fixes some warnings.

Close #170

jakirkham · 2021-12-01T17:17:43Z

rerun tests

jakirkham · 2021-12-01T18:53:35Z

There seem to be some test failures in this job, but they come from CuPy calls, which don't seem related to this change.

jakirkham · 2021-12-01T20:05:34Z

After looking into this more, I think this failure may be related, though it is unclear how. It does not appear in Greg's PR (#172) but reproduces here. The CuPy error suggests the problem arose before CuPy was invoked, so the cause may be elsewhere. We may need a more detailed traceback.

gigony added 2 commits December 1, 2021 02:55

Do not generate unnecessary PTX

f7501fc

Handle errors caused by using C++ compiler

d36cb0d

gigony added the bug Something isn't working label Dec 1, 2021

gigony added this to the v21.12.00 milestone Dec 1, 2021

gigony requested review from quasiben and jakirkham December 1, 2021 11:31

gigony requested review from a team as code owners December 1, 2021 11:31

gigony self-assigned this Dec 1, 2021

gigony changed the title ~~Do not compile code with nvcc if no CUDA kernel code exists~~ Do not compile code with nvcc if no CUDA kernel exists Dec 1, 2021

jakirkham approved these changes Dec 1, 2021

jakirkham changed the base branch from branch-22.02 to branch-21.12 December 1, 2021 22:54

jakirkham added the non-breaking Introduces a non-breaking change label Dec 2, 2021

ajschmidt8 merged commit 28ac81f into rapidsai:branch-21.12 Dec 2, 2021
