Spark RAPIDS takes a long time to initialise GPU memory #11423

Closed Answered by jlowe
melvin-koh asked this question in Q&A

Version 22.02 does not officially support H100 GPUs; official H100 support starts with version 23.06. The long delay is caused by the driver JIT-compiling all of the cudf GPU kernels from their PTX code so that they can run on the H100. There are a ton of kernels, so this takes a very long time. I strongly suggest updating to a more recent version, e.g. 24.06.1, which should fix the long startup delay. Download details for 24.06.1 can be found here: https://github.com/NVIDIA/spark-rapids/blob/branch-24.06/docs/download.md
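
If it helps, here is a minimal sketch of a pre-flight check that flags a plugin version older than 23.06 before a job is launched on H100. It is not part of spark-rapids: the names `H100_MIN_VERSION`, `parse_version` and `supports_h100` are hypothetical, and it assumes the version is a plain dotted string such as `22.02` or `24.06.1` with no suffix.

```python
from typing import Tuple

# Assumption taken from the answer above: H100 is officially supported from 23.06.
H100_MIN_VERSION = (23, 6)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '24.06.1' into (24, 6, 1).

    Assumes purely numeric components; suffixes like '-SNAPSHOT' are not handled.
    """
    return tuple(int(part) for part in version.split("."))


def supports_h100(plugin_version: str) -> bool:
    """Return True if the plugin version is at or above the 23.06 cutoff."""
    # Compare only year and month; the patch component does not affect the cutoff.
    return parse_version(plugin_version)[:2] >= H100_MIN_VERSION


if __name__ == "__main__":
    for v in ("22.02", "23.06", "24.06.1"):
        status = "ok" if supports_h100(v) else "expect slow PTX JIT startup on H100"
        print(f"{v}: {status}")
```

A version that fails this check would be expected to hit the slow startup described above, since the driver has to JIT-compile the kernels from PTX.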

Answer selected by melvin-koh