[Bug]: deploy on V100, mma -> mma layout conversion is only supported on Ampere #8024
Labels: bug
Your current environment
Related issues: #2729, #6723
The output of `python collect_env.py`
🐛 Describe the bug
We use vLLM to serve deepseek-ai/deepseek-coder-33b-instruct on a V100 and hit the error above (`mma -> mma layout conversion is only supported on Ampere`). The current workaround is to set `--enable-chunked-prefill=False`, but most users don't know about it. Does vLLM plan to reimplement the fwd kernel so that `enable-chunked-prefill` is supported on V100?
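For reference, here is a minimal sketch of launching the server with the workaround applied. The model name and the `--enable-chunked-prefill=False` flag come from this report; the other flags are illustrative assumptions and should be adjusted for your deployment:

```shell
# Launch vLLM's OpenAI-compatible server with chunked prefill disabled,
# working around the "mma -> mma layout conversion" error on V100.
# --port and --tensor-parallel-size are illustrative assumptions.
python -m vllm.entrypoints.openai.api_server \
    --model deepseek-ai/deepseek-coder-33b-instruct \
    --tensor-parallel-size 4 \
    --port 8000 \
    --enable-chunked-prefill=False
```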