Cannot find cuda_profiler_api.h when building cpu_adam #2622
Comments
@hagope, thanks for reporting this issue. Do you mind sharing a PR?
Hi guys, I have encountered the same issue. Any update on this? Thanks!
@DakeZhang1998 - I'm reproducing this now, but are you able to symlink to work around this in the meantime?
I am quite new to this. I am trying to use DeepSpeed on an Ubuntu server where I don't have sudo access, so I am managing CUDA in Anaconda. I installed PyTorch using the command from their official website:
This is probably a simple case of your Python environment not containing the right PATH and LD_LIBRARY_PATH environment variables.
I think DeepSpeed could also do better at finding your CUDA install even when these variables are not specified, given that they could follow the
Thanks to the technical manager at my school, my issue was fixed by forcing Conda to install cuda-nvprof 11.7 instead of the default v12.
Thanks @abhay-agarwal, and glad that works for you, @DakeZhang1998. @hagope, can you check your PATH and LD_LIBRARY_PATH? I'm taking a look at why DeepSpeed doesn't discover the CUDA install any better.
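One way to run the check requested above is a small script that lists every entry in PATH and LD_LIBRARY_PATH and flags the ones that do not exist on disk. This is only a sketch: the helper names `split_search_path` and `report_search_paths` are made up for illustration and are not part of DeepSpeed or PyTorch.

```python
import os


def split_search_path(value):
    """Split a PATH-style string into its non-empty entries."""
    return [entry for entry in value.split(os.pathsep) if entry]


def report_search_paths(env, names=("PATH", "LD_LIBRARY_PATH")):
    """Map each variable name to a list of (entry, exists_on_disk) pairs."""
    report = {}
    for name in names:
        entries = split_search_path(env.get(name, ""))
        report[name] = [(entry, os.path.isdir(entry)) for entry in entries]
    return report


if __name__ == "__main__":
    for name, entries in report_search_paths(os.environ).items():
        print(f"{name}:")
        if not entries:
            print("  (unset or empty)")
        for entry, exists in entries:
            marker = "ok" if exists else "missing"
            print(f"  [{marker}] {entry}")
```

An entry marked missing, or no CUDA directory in either variable, would be consistent with the environment problem suggested earlier in the thread, though it does not prove it.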
This actually appears to be an issue on the PyTorch side. Credit to @HeyangQin for this; links here to the PyTorch discussion and another DeepSpeed issue with info on this. Seems like we'll have to wait for the next PyTorch release, so closing this issue for now. https://discuss.pytorch.org/t/not-able-to-include-cusolverdn-h/169122
This worked. In my case, when building FasterTransformer, I have to use
For people installing CUDA with apt, doing
I don't know what to put in that environment variable. In this situation, what should I put below? Also, when I run which nvcc, the output is /opt/conda/bin/nvcc.
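When nvcc resolves to a path such as /opt/conda/bin/nvcc, one way to see where the headers might live is to derive candidate include directories from the nvcc location and check each for cuda_profiler_api.h. The sketch below rests on two assumptions: that nvcc sits in `<prefix>/bin`, and that headers are in either `<prefix>/include` or `<prefix>/targets/<arch>/include` (the second layout matches the path in the original report). The function names are hypothetical. Which environment variable to put the result in depends on the build being run, and the script does not decide that.

```python
import glob
import os
import shutil

HEADER = "cuda_profiler_api.h"


def candidate_include_dirs(nvcc_path):
    """Guess include directories from the location of nvcc.

    Assumption: nvcc lives in <prefix>/bin and headers live in
    <prefix>/include or <prefix>/targets/<arch>/include.
    """
    prefix = os.path.dirname(os.path.dirname(nvcc_path))
    candidates = [os.path.join(prefix, "include")]
    pattern = os.path.join(prefix, "targets", "*", "include")
    candidates += sorted(glob.glob(pattern))
    return candidates


def find_header(nvcc_path, header=HEADER):
    """Return the first candidate path that contains the header, else None."""
    for include_dir in candidate_include_dirs(nvcc_path):
        candidate = os.path.join(include_dir, header)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        print("nvcc is not on PATH")
    else:
        print("nvcc:", nvcc)
        print("header:", find_header(nvcc) or "not found near nvcc")
```

If the header is not found, the matching CUDA profiler package may not be installed in that environment, which is what the cuda-nvprof fix earlier in the thread addressed.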
Describe the bug
When trying to pip install and build Adam, cuda_profiler_api.h is missing in sources.
To Reproduce
Steps to reproduce the behavior:
DS_BUILD_CPU_ADAM=1 pip install .
cuda_profiler_api.h cannot be found.
Expected behavior
I believe the cuda_profiler_api.h should be added to the csrc/includes path? I was able to work around the problem with a symlink to my local cuda install:
ln -s /usr/local/cuda-11.7/targets/x86_64-linux/include/cuda_profiler_api.h csrc/includes/cuda_profiler_api.h
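The symlink workaround can also be scripted so that it is safe to run more than once. The sketch below mirrors the `ln -s` command using `os.symlink`. `link_header` is a hypothetical helper, not something DeepSpeed provides, and it assumes the destination directory already exists.

```python
import os


def link_header(header_path, dest_dir):
    """Symlink header_path into dest_dir, like `ln -s`, but idempotent."""
    if not os.path.isfile(header_path):
        raise FileNotFoundError(header_path)
    link_path = os.path.join(dest_dir, os.path.basename(header_path))
    # Nothing to do if the link already points at the same header.
    if os.path.islink(link_path) and os.readlink(link_path) == header_path:
        return link_path
    # Refuse to overwrite an unrelated file or a link to another target.
    if os.path.lexists(link_path):
        raise FileExistsError(link_path)
    os.symlink(header_path, link_path)
    return link_path
```

With the paths from the report, the call would be `link_header("/usr/local/cuda-11.7/targets/x86_64-linux/include/cuda_profiler_api.h", "csrc/includes")`, run from the repository root.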
ds_report output
System info (please complete the following information):