.Net: [Feature Request] Expose token usage and executed prompt as part of function result. #4691

Closed
gmantri opened this issue Jan 21, 2024 · 9 comments
Labels: .NET (Issue or Pull requests regarding .NET code)

gmantri commented Jan 21, 2024

Currently there's a convoluted way to get the token usage and executed prompt when executing a function. For example, here's how you can get the token usage:

#pragma warning disable SKEXP0004
kernel.FunctionInvoked += (sender, args) =>
{
    var metadata = args.Metadata;
    if (metadata.ContainsKey("Usage"))
    {
        var usage = (CompletionsUsage)metadata["Usage"];
        Console.WriteLine($"Token usage. Input tokens: {usage.PromptTokens}; Output tokens: {usage.CompletionTokens}");
    }
};

It would be really useful if this information is surfaced in FunctionResult. Even if it is part of Metadata property, I think it is fine.

Same thing for the prompt that was sent to the LLM as well.

shawncal added the .NET and triage labels on Jan 21, 2024
dmm-l-mediehus commented Jan 22, 2024

How do you get token usage from IChatCompletionService.GetStreamingChatMessageContentsAsync(..) ?

I got it to work with the default GetChatMessageContents, but not with streaming. Several people have asked for it, including me. "Usage" isn't in Metadata for me; I only see 3 items in metadata:
(screenshot: the three metadata entries, with no "Usage" key)

dmytrostruk (Member) commented:

It would be really useful if this information is surfaced in FunctionResult. Even if it is part of Metadata property, I think it is fine.

@gmantri Token usage is already part of the Metadata property of FunctionResult; here is an example of how to get it:

FunctionResult result = await kernel.InvokeAsync(myFunction, new() { ["input"] = "travel" });
// Display results
WriteLine(result.GetValue<string>());
WriteLine(result.Metadata?["Usage"]?.AsJson());

Same thing for the prompt that was sent to the LLM as well.

This is not supported at the moment, but if it would be useful, we will add it to FunctionResult.Metadata as well.

I would also recommend checking out the new approach to getting token usage and the prompt: filters. The PR with filters is here and will be released soon: #4437
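
For illustration, here is a minimal sketch of what a usage-logging filter might look like. The filter API changed across SK preview versions, so the interface name and shape below follow later releases and may differ from what #4437 originally introduced:

public sealed class UsageLoggingFilter : IFunctionInvocationFilter
{
    public async Task OnFunctionInvocationAsync(FunctionInvocationContext context, Func<FunctionInvocationContext, Task> next)
    {
        // Let the function (and the underlying LLM call) run first.
        await next(context);

        // When the connector provides it, token usage is exposed through the result metadata.
        if (context.Result.Metadata is { } metadata && metadata.TryGetValue("Usage", out var usage))
        {
            Console.WriteLine($"Usage: {usage}");
        }
    }
}

// Registration on a kernel instance (illustrative):
// kernel.FunctionInvocationFilters.Add(new UsageLoggingFilter());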

gmantri (Author) commented Jan 22, 2024

@dmytrostruk - Thanks. I am not sure how I missed that. I will go ahead and close this. Do you want me to open up a new issue for the prompt so that it can be tracked properly?

@dmm-l-mediehus - It may be worthwhile to open a new issue for your question.

gmantri closed this as completed on Jan 22, 2024
dmytrostruk (Member) commented:

How do you get token usage from IChatCompletionService.GetStreamingChatMessageContentsAsync(..) ?

@dmm-l-mediehus I just checked; it doesn't look like the Azure SDK or the OpenAI API provides usage information as part of the chat completion chunk object in the streaming scenario. That makes sense, because information like token usage can only be calculated once you have received the full response, not a single chunk. Most probably, in the streaming scenario you have to calculate tokens manually on your side.

dmm-l-mediehus commented:

I managed to get it working using a third-party nuget to calculate tokens for the ChatHistory and the final output of the stream. 👍

EVENFLOW212 commented:

@dmm-l-mediehus How did you manage to get the token count through the nuget package, and which package is it?

EVENFLOW212 commented:

When should we expect token usage for GetStreamingChatMessageContentsAsync? Which version of SK would it go into?

Thanks

dmm-l-mediehus commented:

How did you manage to get the token count through the nuget package and which package is it?

Tiktoken nuget (a rough sketch follows):

  1. Serialize ChatHistory to JSON, then use Tiktoken to count the prompt tokens.
  2. In a finally block (after the IAsyncEnumerable has finished), use Tiktoken to count the completion tokens.
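
A rough sketch of those two steps (chatService and chatHistory are assumed placeholders, not from this thread; the Tiktoken calls mirror the Encoding.TryForModel / CountTokens API used later in this thread):

// chatService (IChatCompletionService) and chatHistory are assumed to exist already.
var encoding = Tiktoken.Encoding.TryForModel("<model name>");

// 1. Prompt tokens: serialize the chat history and count it.
var promptJson = System.Text.Json.JsonSerializer.Serialize(chatHistory);
var promptTokens = encoding?.CountTokens(promptJson) ?? 0;

// 2. Completion tokens: accumulate the stream, then count the final text in the finally block.
var completion = new StringBuilder();
try
{
    await foreach (var chunk in chatService.GetStreamingChatMessageContentsAsync(chatHistory))
    {
        completion.Append(chunk.Content);
    }
}
finally
{
    var completionTokens = encoding?.CountTokens(completion.ToString()) ?? 0;
    Console.WriteLine($"Prompt tokens: {promptTokens}; Completion tokens: {completionTokens}");
}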

gmantri (Author) commented Feb 4, 2024


@dmm-l-mediehus - Thanks for the tip on using Tiktoken! It was indeed helpful.
@EVENFLOW212 - Here's what I ended up doing. In my application, I am using YAML files as prompt templates and then I am passing a number of arguments (including chat history) to that template.

private async Task<(string prompt, Exception exception)> GetPrompt(Kernel kernel, string promptFilePath, IDictionary<string, object> arguments)
{
    try
    {
        // Load the YAML prompt template and build a Handlebars prompt template from it.
        var promptFileContents = await File.ReadAllTextAsync(promptFilePath);
        var promptTemplateConfig = KernelFunctionYaml.ToPromptTemplateConfig(promptFileContents);
        var factory = new HandlebarsPromptTemplateFactory();
        if (!factory.TryCreate(promptTemplateConfig, out var promptTemplate))
        {
            return (string.Empty, new InvalidOperationException("Unable to create prompt template."));
        }
        var openAIPromptSettings = new OpenAIPromptExecutionSettings()
        {
            Temperature = 0
        };
        var kernelArguments = new KernelArguments(openAIPromptSettings);
        if (arguments != null)
        {
            foreach (var kvp in arguments)
            {
                kernelArguments.TryAdd(kvp.Key, kvp.Value);
            }
        }

        // Render the template with the kernel arguments to get the exact prompt text sent to the LLM.
        var prompt = await promptTemplate.RenderAsync(kernel, kernelArguments);
        return (prompt, null);
    }
    catch (Exception exception)
    {
        //log exception
        return (string.Empty, exception);
    }
}

And as I am getting the streaming response, I am saving it to a string. Once all responses have been received, I use Tiktoken to calculate the prompt and completion tokens with something like the following:

private (int PromptTokens, int CompletionTokens) CalculateTokensForQuestionAndAnswer(string prompt, string answer)
{
    try
    {
        var promptTokens = 0;
        var completionTokens = 0;
        // Pass the model name; supported model names are listed at
        // https://github.com/tryAGI/Tiktoken/blob/main/src/libs/Tiktoken/Services/Helpers.GetNameByModel.cs
        var encodingForModel = Tiktoken.Encoding.TryForModel("<model name>");
        if (encodingForModel == null) return (PromptTokens: promptTokens, CompletionTokens: completionTokens);
        promptTokens = encodingForModel.CountTokens(prompt);
        completionTokens = encodingForModel.CountTokens(answer);
        return (PromptTokens: promptTokens, CompletionTokens: completionTokens);
    }
    catch (Exception exception)
    {
        //log exception
        return (PromptTokens: 0, CompletionTokens: 0);
    }
}
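
Putting the two helpers together, a rough sketch of the flow (chatService, chatHistory and the prompt file path are illustrative placeholders, not part of the code above):

// Render the prompt from the YAML template (file path and arguments are illustrative).
var (prompt, error) = await GetPrompt(kernel, "prompts/answer.yaml", new Dictionary<string, object> { ["history"] = chatHistory });
if (error != null) return;

// Stream the response and accumulate it into a single string.
var answer = new StringBuilder();
await foreach (var chunk in chatService.GetStreamingChatMessageContentsAsync(chatHistory))
{
    answer.Append(chunk.Content);
}

// Count tokens for the rendered prompt and the accumulated answer.
var (promptTokens, completionTokens) = CalculateTokensForQuestionAndAnswer(prompt, answer.ToString());
Console.WriteLine($"Prompt tokens: {promptTokens}; Completion tokens: {completionTokens}");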
