Return usage for openai requests #1663

Merged: 1 commit, Nov 16, 2023

@ichernev (Contributor) commented Nov 14, 2023

Include the number of input, output, and total tokens for regular and streaming OpenAI requests.

The OpenAI backend doesn't include this information for streaming requests, but it is a frequently requested feature.
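
For illustration only, here is a minimal sketch of the idea, not the code from this PR. It assumes the usage object is attached to the final streamed chunk (the one carrying a `finish_reason`); the thread does not say where it is placed. The helper names `make_usage` and `stream_with_usage` are hypothetical; the field names `prompt_tokens`, `completion_tokens`, and `total_tokens` follow the usage object of the OpenAI API.

```python
import json
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def make_usage(num_prompt_tokens: int, num_completion_tokens: int) -> Dict[str, int]:
    """Build an OpenAI-style usage object from input and output token counts."""
    return {
        "prompt_tokens": num_prompt_tokens,
        "completion_tokens": num_completion_tokens,
        "total_tokens": num_prompt_tokens + num_completion_tokens,
    }


def stream_with_usage(
    prompt_token_ids: List[int],
    steps: Iterable[Tuple[str, int, Optional[str]]],
) -> Iterator[str]:
    """Yield server-sent events for a streamed completion.

    Each step is (text_delta, num_new_tokens, finish_reason). Attaching usage
    to the chunk that carries a finish_reason is an assumption of this sketch.
    """
    completion_tokens = 0
    for text, num_new_tokens, finish_reason in steps:
        completion_tokens += num_new_tokens
        chunk = {
            "choices": [
                {"index": 0, "text": text, "finish_reason": finish_reason}
            ]
        }
        if finish_reason is not None:
            # Final chunk: report input, output, and total token counts.
            chunk["usage"] = make_usage(len(prompt_token_ids), completion_tokens)
        yield f"data: {json.dumps(chunk)}\n\n"
    yield "data: [DONE]\n\n"


# Hypothetical generation steps standing in for real engine output.
events = list(
    stream_with_usage([1, 2, 3], [("Hel", 1, None), ("lo", 1, "length")])
)
```

A client reading this stream would strip the `data: ` prefix from each event, parse the JSON, and read `usage` from the last chunk before `[DONE]`.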

@ichernev ichernev force-pushed the usage-v2 branch 2 times, most recently from 2f0242a to 7f0482c Compare November 14, 2023 15:07
@simon-mo (Collaborator)

Hi @ichernev,

Thank you for your PR. Currently, we are trying to keep the API server as minimal as possible and encourage folks to use the OpenAI version. If you can limit this PR to just adding streaming support for OpenAI, that would be great. We believe there's already support for usage in the non-streaming case.

@ichernev (Contributor, Author)

@simon-mo thanks for the feedback, I'll limit it to the streaming OpenAI entrypoint.

Include number of input, output and total tokens for regular and
streaming openai requests.
@ichernev ichernev changed the title Return usage for all requests Return usage for openai requests Nov 16, 2023
@ichernev (Contributor, Author)

@simon-mo the code now tweaks only the OpenAI endpoint. I edited the title and description.

@simon-mo simon-mo merged commit 686f5e3 into vllm-project:main Nov 16, 2023
2 checks passed
yxl pushed a commit to yxl/vllm that referenced this pull request Nov 29, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
sjchoi1 pushed a commit to casys-kaist-internal/vllm that referenced this pull request May 7, 2024