[ci][test] exclude model download time in server start time #7834

youkaichao · 2024-08-24T08:00:48Z

fixes errors observed in https://buildkite.com/vllm/fastcheck/builds/3075#019182d5-5e41-408c-89e8-1d6bb86cc729

github-actions · 2024-08-24T08:00:59Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run the fastcheck CI, which consists of a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of the default ones by unblocking the steps in your fast-check build in the Buildkite UI.

Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge).

To run full CI, you can do one of these:

Comment /ready on the PR
Add ready label to the PR
Enable auto-merge.

🚀

DarkLight1337 · 2024-08-24T13:32:20Z

We should have run entrypoints-tests before merging this. Since the model name has changed, test cases that check the model name now fail.

mgoin · 2024-08-24T16:59:23Z

tests/utils.py

+        if not model.startswith("/"):
+            # download the model if it's not a local path
+            # to exclude the model download time from the server start time
+            model = snapshot_download(model)
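
For illustration, the idea in this diff can be sketched as follows: resolve the model to a local path first, and only then start the clock on server startup. This is a minimal sketch, not the code in tests/utils.py; `resolve_model_path`, `timed_start`, and the injected `download` and `start` callables are hypothetical names chosen for the example. In the actual helper, the download step is `snapshot_download`.

```python
import time
from typing import Callable, Tuple


def resolve_model_path(model: str, download: Callable[[str], str]) -> str:
    """Return a local path for `model`, downloading it first if needed.

    Mirrors the check in the diff: anything starting with "/" is treated
    as a local path and returned unchanged.
    """
    if model.startswith("/"):
        return model
    return download(model)


def timed_start(
    model: str,
    download: Callable[[str], str],
    start: Callable[[str], None],
) -> Tuple[str, float]:
    """Start a server and measure startup time without the download.

    `download` and `start` are hypothetical stand-ins for the real
    download function and the real server launch.
    """
    # The download happens before the timer begins, so it is not counted.
    local_path = resolve_model_path(model, download)
    begin = time.monotonic()
    start(local_path)
    elapsed = time.monotonic() - begin
    return local_path, elapsed
```

Note that `start` receives the resolved local path rather than the original model name, so anything that later compares against the original name sees a different value.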


snapshot_download can download many more files than we need for inference, such as duplicate .pt files when we prefer to use safetensors. We should try to share a common function with the way vLLM usually pulls down model files.
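
One possible way to limit the download, sketched under assumptions: pick the first weight format that the repository actually contains and pass only that pattern to `snapshot_download` through its `allow_patterns` parameter. The helper names `pick_weight_patterns` and `download_filtered`, the preference order, and the extra `*.json` pattern are assumptions made for this example; this is not how vLLM's own loader is implemented.

```python
from fnmatch import fnmatch
from typing import Iterable, List, Sequence


def pick_weight_patterns(
    filenames: Iterable[str],
    preferred: Sequence[str] = ("*.safetensors", "*.bin", "*.pt"),
) -> List[str]:
    """Return the first pattern in `preferred` that matches any file.

    The preference order is an assumption for this sketch: safetensors
    first, then other weight formats as fallbacks.
    """
    names = list(filenames)
    for pattern in preferred:
        if any(fnmatch(name, pattern) for name in names):
            return [pattern]
    return []


def download_filtered(repo_id: str, extra: Sequence[str] = ("*.json",)) -> str:
    """Download only the chosen weight format plus some extra files.

    Hypothetical helper; requires the huggingface_hub package and
    network access. The `extra` patterns are an assumption and may not
    cover every file a given model needs.
    """
    from huggingface_hub import list_repo_files, snapshot_download

    patterns = pick_weight_patterns(list_repo_files(repo_id)) + list(extra)
    return snapshot_download(repo_id, allow_patterns=patterns)
```

With a repository that ships both `model.safetensors` and `pytorch_model.bin`, only the safetensors pattern is selected, so the duplicate weights are skipped.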

Can you come up with a fix for it?

I don't know if safetensors would be enough, but we can give it a try.

Basically I am proposing using DefaultModelLoader._prepare_weights

vllm/vllm/model_executor/model_loader/loader.py

Line 232 in ea9fa16

def _prepare_weights(self, model_name_or_path: str,

I can make a PR for this later today

Please go ahead. I do see that more files are downloaded in https://buildkite.com/vllm/fastcheck/builds/3094#01918525-98da-4596-8d31-f4e2c1172455

fix

88bcfe5

youkaichao requested a review from DarkLight1337 August 24, 2024 08:00

DarkLight1337 approved these changes Aug 24, 2024

youkaichao merged commit ea9fa16 into vllm-project:main Aug 24, 2024
18 of 21 checks passed

youkaichao deleted the fix_download_in_server_start branch August 24, 2024 08:03

DarkLight1337 mentioned this pull request Aug 24, 2024

[CI/Build] Avoid downloading all HF files in RemoteOpenAIServer #7836

Merged

youkaichao mentioned this pull request Aug 24, 2024

[ci][test] fix RemoteOpenAIServer #7838

Merged

mgoin reviewed Aug 24, 2024

mgoin mentioned this pull request Aug 24, 2024

Update RemoteOpenAIServer to use common prepare_weights function #7839

Closed

omrishiv pushed a commit to omrishiv/vllm that referenced this pull request Aug 26, 2024

[ci][test] exclude model download time in server start time (vllm-pro…

1c106f6

…ject#7834)
