
cross compile cuda support (cnt'd) #210

Merged (21 commits) on Feb 2, 2023

Conversation

h-vetinari (Member) commented Nov 4, 2022

New PR since I cannot commit into #209

Closes #209

conda-forge-linter commented

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

h-vetinari changed the title from "cross compile cuda support" to "cross compile cuda support (cnt'd)" on Nov 4, 2022
h-vetinari mentioned this pull request on Nov 4, 2022
h-vetinari (Member, Author) commented

Thanks for the quick review @kkraus14! Do you know whether we need the other devel packages?

kkraus14 (Contributor) commented Nov 4, 2022

Thanks for the quick review @kkraus14! Do you know whether we need the other devel packages?

Yes, we'll need those. They contain the headers, the generic unversioned symlinked shared libraries, the static libraries (which we may want specifically for CUDA, since IIRC there are some implications there), and the pkgconfig files.

h-vetinari (Member, Author) commented

Argh, the test here cannot work because at test time the "BUILD_PLATFORM" is already emulated.

Yes we'll need those.

Added in 91610af; I'm not sure my scripting is worth a damn since I'm not experienced with jq, but I at least tested it on https://jqplay.org/, and of course bash errors are a real possibility too. 😅
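
For reference, the kind of jq lookup involved looks roughly like this (a minimal sketch, not the actual recipe code; it assumes a local manifest.json with the structure posted further down in this thread):

    # illustrative only: read one component's version from the manifest
    version=$(jq -r '.cuda_cudart.version' manifest.json)
    echo "cuda-cudart-devel-${version}"   # -> cuda-cudart-devel-11.2.152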

Review comment on this hunk (Contributor):

# (names need "_" not "-" to match spelling in manifest)
declare -a DEVELS=(
"cuda_cudart"
"cuda_driver"

I believe you need to use the cuda_cudart version for cuda_driver, since it doesn't have its own key.

Review comment on lines 56 to 69 (Contributor):
declare -a DEVELS=(
"cuda_cudart"
"cuda_driver"
"cuda_nvml"
"cuda_nvrtc"
"libcublas"
"libcufft"
"libcurand"
"libcusolver"
"libcusparse"
"libnpp"
"libnvjpeg"
)

Need to add: nvidia_driver, nvidia_fabric_manager, nvidia_libXNVCtrl. All three should use the version from the nvidia_driver key.

h-vetinari (Member, Author) replied Nov 4, 2022

OK, this and the cuda_driver thing make it sound like we need a full mapping here. So far it was just a simple scheme: xyz -> xyz_devel.

You're saying I need to change this to:

    # new key...                ...uses version from
    cuda_cudart_devel       ->  cuda_cudart
    cuda_driver_devel       ->  cuda_cudart       # special
    cuda_nvrtc_devel        ->  cuda_nvrtc
    cuda_nvml_devel         ->  cuda_nvml_dev     # special
    libcublas_devel         ->  libcublas
    libcufft_devel          ->  libcufft
    libcurand_devel         ->  libcurand
    libcusolver_devel       ->  libcusolver
    libcusparse_devel       ->  libcusparse
    libnpp_devel            ->  libnpp
    libnvjpeg_devel         ->  libnvjpeg
    nvidia_driver_devel     ->  nvidia_driver   
    nvidia_fabric_manager   ->  nvidia_driver     # special
    nvidia_libXNVCtrl       ->  nvidia_driver     # special

Reply (Member):

Those are not needed.

h-vetinari (Member, Author) replied:

Can you specify what you mean by "those"? The original list was based on the devel libs you had in your PR but which weren't listed in the manifest.

kkraus14 (Contributor) replied:

nvidia_libXNVCtrl is GPL-licensed (https://github.com/NVIDIA/nvidia-settings/blob/main/COPYING) and not governed by the EULA, so I think (I am not a lawyer!) we could actually ship it as a conda package if desired; in hindsight, it can definitely be removed from the list. My apologies.

I can't find a license or EULA for the nvidia_fabric_manager (should actually have been nvidia_fabric_manager_devel, my bad again), so I assumed we weren't allowed to ship it as a conda package. I haven't seen anyone actually use this library before so I'd be perfectly happy to leave it out.

There's other things that would potentially be needed that there aren't stub libraries available for:

  • nvidia-driver-libs notably contains libnvoptix and a bunch of other libraries
  • nvidia-driver-cuda-libs notably contains libnvidia-nvvm and libnvidia-ptxjitcompiler and more generally contains the libraries that the symlinks in nvidia_driver_devel point to

I have seen a fair amount of software that uses these, so I think it would make sense to add these libraries: nvidia_driver_devel, nvidia_driver_libs, and nvidia_driver_cuda_libs.

Notably, with the above proposal we'll be reliant on stubs for the CUDA driver library (libcuda) and the NVIDIA Management Library (libnvidia-ml). I'm not sure how build systems like CMake will react to using a mix of actual and stub libraries. Hopefully everything will be okay?
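
For illustration, relying on the stubs would mean a link step roughly like this (a hedged sketch, not from this PR; ${CXX} stands for the cross compiler, and the sbsa-linux paths follow the conventional CUDA layout):

    # sketch: resolve libcuda / libnvidia-ml from the stub directory at link time
    STUBS="${CUDA_HOME}/targets/sbsa-linux/lib/stubs"
    ${CXX} test.o -o test \
        -L"${STUBS}" -lcuda -lnvidia-ml \
        -L"${CUDA_HOME}/targets/sbsa-linux/lib" -lcudart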

isuruf (Member) left a review comment:

This is way more complicated and downloads more things than needed. The packages I had in my PR were the only ones in $CUDA_HOME/targets/sbsa-linux which is what the cross compiler looks at. All the other libraries are irrelevant.
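
For reference, what the cross compiler actually sees can be inspected along these lines (a sketch; the exact contents depend on which RPMs were unpacked):

    ls "${CUDA_HOME}/targets/sbsa-linux/include"     # headers, e.g. cuda_runtime.h
    ls "${CUDA_HOME}/targets/sbsa-linux/lib"         # aarch64 libraries
    ls "${CUDA_HOME}/targets/sbsa-linux/lib/stubs"   # stub libcuda.so / libnvidia-ml.so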

h-vetinari (Member, Author) commented

This is way more complicated and downloads more things than needed.

Fair enough, I was just following the instructions to download packages based on the manifest. I started with downloading everything (plus what Keith told me to add), but if we can filter/shorten the list, all the better.

kkraus14 (Contributor) commented Nov 5, 2022

This is way more complicated and downloads more things than needed. The packages I had in my PR were the only ones in $CUDA_HOME/targets/sbsa-linux which is what the cross compiler looks at. All the other libraries are irrelevant.

I think there's a handful of other libraries that end up in the sysroot (I think?) for which there aren't stubs and which we'd want as well, but which weren't included in your PR. I did my best to break them down in my comment in the thread.

h-vetinari (Member, Author) commented Nov 5, 2022

@isuruf: This is way more complicated and downloads more things than needed.

@kkraus14: There's other things that would potentially be needed [...]

I'll let you two decide what should be included.

My goal here was just to make it possible to use the versions from the manifest. As feared, this turned into a fair amount of code, due to various small and not-so-small divergences between the manifest and the actual RPMs. It's conceivable that someone might prefer to just hardcode the list of deps, but now that it works (with the ability to map in additional RPMs, and the obvious possibility to filter out unwanted packages), I tend to think it's probably a good thing going forward.

@isuruf, I managed to get this to run through to the point where you had added

mv ./usr/local/cuda-${CUDA_COMPILER_VERSION}/targets/${CUDA_HOST_PLATFORM_ARCH}-linux ${CUDA_HOME}/targets/${CUDA_HOST_PLATFORM_ARCH}-linux

at the end of the loop, but this fails with

mv: cannot move ‘./usr/local/cuda-11.2/targets/sbsa-linux’ to ‘/usr/local/cuda/targets/sbsa-linux’: Permission denied

h-vetinari (Member, Author) commented

@isuruf @kkraus14
Can we iterate on the list of things to install?

At the moment the list is as follows:

{
  "cuda": {
    "name": "CUDA SDK",
    "version": "11.2.2"
  },
  "cuda_cudart": {
    "name": "CUDA Runtime (cudart)",
    "version": "11.2.152"
  },
  "cuda_cuobjdump": {
    "name": "cuobjdump",
    "version": "11.2.152"
  },
  "cuda_cupti": {
    "name": "CUPTI",
    "version": "11.2.152"
  },
  "cuda_cuxxfilt": {
    "name": "CUDA cu++ filt",
    "version": "11.2.152"
  },
  "cuda_gdb": {
    "name": "CUDA GDB",
    "version": "11.2.152"
  },
  "cuda_nvcc": {
    "name": "CUDA NVCC",
    "version": "11.2.152"
  },
  "cuda_nvdisasm": {
    "name": "CUDA nvdisasm",
    "version": "11.2.152"
  },
  "cuda_nvml_dev": {
    "name": "CUDA NVML Headers",
    "version": "11.2.152"
  },
  "cuda_nvprof": {
    "name": "CUDA nvprof",
    "version": "11.2.152"
  },
  "cuda_nvprune": {
    "name": "CUDA nvprune",
    "version": "11.2.152"
  },
  "cuda_nvrtc": {
    "name": "CUDA NVRTC",
    "version": "11.2.152"
  },
  "cuda_nvtx": {
    "name": "CUDA NVTX",
    "version": "11.2.152"
  },
  "cuda_samples": {
    "name": "CUDA Samples",
    "version": "11.2.152"
  },
  "cuda_sanitizer_api": {
    "name": "CUDA Compute Sanitizer API",
    "version": "11.2.152"
  },
  "libcublas": {
    "name": "CUDA cuBLAS",
    "version": "11.4.1.1043"
  },
  "libcufft": {
    "name": "CUDA cuFFT",
    "version": "10.4.1.152"
  },
  "libcurand": {
    "name": "CUDA cuRAND",
    "version": "10.2.3.152"
  },
  "libcusolver": {
    "name": "CUDA cuSOLVER",
    "version": "11.1.0.152"
  },
  "libcusparse": {
    "name": "CUDA cuSPARSE",
    "version": "11.4.1.1152"
  },
  "libnpp": {
    "name": "CUDA NPP",
    "version": "11.3.2.152"
  },
  "libnvjpeg": {
    "name": "CUDA nvJPEG",
    "version": "11.4.0.152"
  },
  "nsight_compute": {
    "name": "Nsight Compute",
    "version": "2020.3.1.4"
  },
  "nsight_systems": {
    "name": "Nsight Systems",
    "version": "2020.4.3.7"
  },
  "nvidia_driver": {
    "name": "NVIDIA Linux Driver",
    "version": "460.32.03"
  },
  # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  # from manifest up until this point;
  # entries below added by mapping
  # vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
  "cuda_cudart_devel": {
    "version": "11.2.152"
  },
  "cuda_driver_devel": {
    "version": "11.2.152"
  },
  "cuda_nvrtc_devel": {
    "version": "11.2.152"
  },
  "libcublas_devel": {
    "version": "11.4.1.1043"
  },
  "libcufft_devel": {
    "version": "10.4.1.152"
  },
  "libcurand_devel": {
    "version": "10.2.3.152"
  },
  "libcusolver_devel": {
    "version": "11.1.0.152"
  },
  "libcusparse_devel": {
    "version": "11.4.1.1152"
  },
  "libnpp_devel": {
    "version": "11.3.2.152"
  },
  "libnvjpeg_devel": {
    "version": "11.4.0.152"
  },
  "nvidia_driver_devel": {
    "version": "460.32.03"
  }
}

h-vetinari (Member, Author) commented

More compact list:

cuda:11.2.2
cuda-cudart:11.2.152
cuda-cudart-devel:11.2.152
cuda-cuobjdump:11.2.152
cuda-cupti:11.2.152
cuda-cuxxfilt:11.2.152
cuda-driver-devel:11.2.152
cuda-gdb:11.2.152
cuda-nvcc:11.2.152
cuda-nvdisasm:11.2.152
cuda-nvml-devel:11.2.152
cuda-nvprof:11.2.152
cuda-nvprune:11.2.152
cuda-nvrtc:11.2.152
cuda-nvrtc-devel:11.2.152
cuda-nvtx:11.2.152
cuda-samples:11.2.152
cuda-sanitizer:11.2.152
libcublas:11.4.1.1043
libcublas-devel:11.4.1.1043
libcufft:10.4.1.152
libcufft-devel:10.4.1.152
libcurand:10.2.3.152
libcurand-devel:10.2.3.152
libcusolver:11.1.0.152
libcusolver-devel:11.1.0.152
libcusparse:11.4.1.1152
libcusparse-devel:11.4.1.1152
libnpp:11.3.2.152
libnpp-devel:11.3.2.152
libnvjpeg:11.4.0.152
libnvjpeg-devel:11.4.0.152
nsight-compute:2020.3.1.4
nsight-systems:2020.4.3.7
nvidia-driver:460.32.03
nvidia-driver-devel:460.32.03

h-vetinari (Member, Author) commented

@isuruf, I managed to get this to run through to the point where you had added

mv ./usr/local/cuda-${CUDA_COMPILER_VERSION}/targets/${CUDA_HOST_PLATFORM_ARCH}-linux ${CUDA_HOME}/targets/${CUDA_HOST_PLATFORM_ARCH}-linux

at the end of the loop, but this fails with

mv: cannot move ‘./usr/local/cuda-11.2/targets/sbsa-linux’ to ‘/usr/local/cuda/targets/sbsa-linux’: Permission denied

I'm pretty sure now that this happens because I was running this in the recipe section for debugging purposes, where we don't actually have root rights. I think that if this script is run where it's supposed to, it should actually run through. In any case, the fact that it works up until this point shows that the downloads are fine, so I'm marking this as ready (happy to clean up the commit history if desired).

h-vetinari marked this pull request as ready for review November 7, 2022 03:23
h-vetinari requested a review from a team as a code owner November 7, 2022 03:23
Review thread on lines 40 to 46:
docker_image: # [os.environ.get("BUILD_PLATFORM", "").startswith("linux") and (ppc64le or aarch64)]
# case: native compilation (build == target)
- quay.io/condaforge/linux-anvil-ppc64le-cuda:11.2 # [ppc64le and os.environ.get("BUILD_PLATFORM") == "linux-ppc64le"]
- quay.io/condaforge/linux-anvil-aarch64-cuda:11.2 # [aarch64 and os.environ.get("BUILD_PLATFORM") == "linux-aarch64"]
# case: cross-compilation (build != target)
- quay.io/condaforge/linux-anvil-cuda:11.2 # [ppc64le and os.environ.get("BUILD_PLATFORM") == "linux-64"]
- quay.io/condaforge/linux-anvil-cuda:11.2 # [aarch64 and os.environ.get("BUILD_PLATFORM") == "linux-64"]
h-vetinari (Member, Author) commented:

Not sure if adding this migration is strictly necessary, but since the linux-64 builds have both CUDA and non-CUDA variants, I think we should do the same for aarch64/ppc64le?

This reflects the changes in conda-forge/conda-forge-pinning-feedstock#3624, which we should ideally merge beforehand (then we can skip the use_local: true here).

Reply (Member):

Please remove this until nvcc-feedstock is ready

h-vetinari (Member, Author) replied:

No problem, I'll remove it; though I wonder if we'll still be able to cross-compile cuda without this (which is the point of this PR, no?)

isuruf (Member) replied Feb 2, 2023:

How do you propose we solve the issue in #210 (comment)?

h-vetinari (Member, Author) replied:

How do you propose we solve the issue in #210 (comment)?

I don't know yet, I'm trying to help work through the issues as they appear. But the overarching goal remains to cross-compile CUDA, so if we're falling short of that, we should keep this open IMO.

h-vetinari (Member, Author) commented

I'm pretty sure now that this happens because I was running this in the recipe-section for debugging purposes, where we don't actually have root rights. I think that if this script is run where it should, that actually should run through.

This turned out to be wrong... So now I'm not sure what we need to tweak (chown?) to get the packages into $CUDA_HOME. It's a "simple" permission issue, but I don't know well enough which layer here is allowed to do what.

Also worth noting: not everything gets installed into ./usr/local/cuda-11.2; some goes into ./opt/nvidia or ./usr/share.
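
One possible direction (an untested sketch; whether sudo is available, or needed at all, depends on which layer runs the script):

    # make the target writable before moving the unpacked aarch64 tree into place
    sudo mkdir -p "${CUDA_HOME}/targets"
    sudo chown -R "$(id -u):$(id -g)" "${CUDA_HOME}/targets"
    mv "./usr/local/cuda-${CUDA_COMPILER_VERSION}/targets/${CUDA_HOST_PLATFORM_ARCH}-linux" \
       "${CUDA_HOME}/targets/${CUDA_HOST_PLATFORM_ARCH}-linux"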

h-vetinari (Member, Author) commented Feb 2, 2023

The committer being you is fine. I just don't want my commit squashed.

OK, I'll keep this in mind. For context, I often need to trawl through feedstock history, and I find sequences of commits like:

[image: a sequence of small commits addressing a single issue]

really annoying (aside from the content-free commit messages that the GH UI encourages), because they tackle one specific issue and semantically should be grouped together. This is not a dig at your commits here, I do the same during development as well; I just have the habit of cleaning up my commit history as I go, or at the latest before merging, so that it's easier to navigate down the line.

But now that I know that you don't like that, I'll make an exception for your commits.

isuruf (Member) commented Feb 2, 2023

If you want, you can clean up the commit message when there's no message, but keep my commits as they are in the future.

h-vetinari (Member, Author) commented

@isuruf: How do you propose we solve the issue in #210 (comment)?

@h-vetinari: I don't know yet, I'm trying to help work through the issues as they appear. But the overarching goal remains to cross-compile CUDA, so if we're falling short of that, we should keep this open IMO.

So what are the next steps we need to tackle for cross-compiling CUDA, and where do we track them (if not here)?

isuruf (Member) commented Feb 2, 2023

We can track them in nvcc-feedstock

isuruf requested a review from kkraus14 February 2, 2023 02:21