.Net: [Feature Request] Expose token usage and executed prompt as part of function result. #4691

Closed
gmantri opened this issue Jan 21, 2024 · 9 comments
Labels: .NET (Issue or Pull requests regarding .NET code)

gmantri commented Jan 21, 2024

Currently there's a convoluted way to get the token usage and executed prompt when executing a function. For example, here's how you can get the token usage:

#pragma warning disable SKEXP0004
kernel.FunctionInvoked += (sender, args) =>
{
    var metadata = args.Metadata;
    if (metadata.ContainsKey("Usage"))
    {
        var usage = (CompletionsUsage)metadata["Usage"];
        Console.WriteLine($"Token usage. Input tokens: {usage.PromptTokens}; Output tokens: {usage.CompletionTokens}");
    }
};

It would be really useful if this information is surfaced in FunctionResult. Even if it is part of Metadata property, I think it is fine.

Same thing for the prompt that was sent to the LLM as well.

shawncal added the .NET and triage labels on Jan 21, 2024
dmm-l-mediehus commented Jan 22, 2024

How do you get token usage from IChatCompletionService.GetStreamingChatMessageContentsAsync(..) ?

I got it to work with the default GetChatMessageContents, but not with streaming. Several people have asked for it, including me. "Usage" isn't in Metadata for me; I only see 3 items in metadata:
(screenshot: the three metadata entries, with no "Usage" key)

dmytrostruk (Member) commented:

It would be really useful if this information is surfaced in FunctionResult. Even if it is part of Metadata property, I think it is fine.

@gmantri Token usage is already part of the Metadata property of FunctionResult; here is an example of how to get it:

FunctionResult result = await kernel.InvokeAsync(myFunction, new() { ["input"] = "travel" });
// Display results
WriteLine(result.GetValue<string>());
WriteLine(result.Metadata?["Usage"]?.AsJson());

Same thing for the prompt that was sent to the LLM as well.

This is not supported at the moment, but if it would be useful, we will add it to FunctionResult.Metadata as well.

I would also recommend checking out the new approach to getting token usage and the prompt: filters. The PR with filters is here and will be released soon: #4437
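
For illustration, here is a minimal sketch of what a usage-logging filter might look like. The filter API changed across SK preview versions, so the interface name and shape below follow later releases and may differ from what #4437 originally introduced:

public sealed class UsageLoggingFilter : IFunctionInvocationFilter
{
    public async Task OnFunctionInvocationAsync(FunctionInvocationContext context, Func<FunctionInvocationContext, Task> next)
    {
        // Let the function (and the underlying LLM call) run first.
        await next(context);

        // When the connector provides it, token usage is exposed through the result metadata.
        if (context.Result.Metadata is { } metadata && metadata.TryGetValue("Usage", out var usage))
        {
            Console.WriteLine($"Usage: {usage}");
        }
    }
}

// Registration on a kernel instance (illustrative):
// kernel.FunctionInvocationFilters.Add(new UsageLoggingFilter());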

gmantri (Author) commented Jan 22, 2024

@dmytrostruk - Thanks. I am not sure how I missed that. I will go ahead and close this. Do you want me to open up a new issue for the prompt so that it can be tracked properly?

@dmm-l-mediehus - It may be worthwhile to open a new issue for your question.

gmantri closed this as completed on Jan 22, 2024
dmytrostruk (Member) commented:

How do you get token usage from IChatCompletionService.GetStreamingChatMessageContentsAsync(..) ?

@dmm-l-mediehus I just checked; it doesn't look like the Azure SDK or the OpenAI API provides usage information as part of the chat completion chunk object in the streaming scenario. That makes sense, because information like token usage can only be calculated once you have received the full response, not a single chunk. Most probably, in the streaming scenario you have to calculate tokens manually on your side.

dmm-l-mediehus commented:

I managed to get it working using a third-party nuget to calculate tokens for the ChatHistory and the final output of the stream. 👍

EVENFLOW212 commented:

@dmm-l-mediehus How did you manage to get the token count through the nuget package, and which package is it?

EVENFLOW212 commented:

When should we expect token usage for GetStreamingChatMessageContentsAsync? Which version of SK would it go into?

Thanks

dmm-l-mediehus commented:

How did you manage to get the token count through the nuget package and which package is it?

Tiktoken nuget (a rough sketch follows):

  1. Serialize ChatHistory to JSON, then use Tiktoken to count the prompt tokens.
  2. In a finally block (after the IAsyncEnumerable has finished), use Tiktoken to count the completion tokens.
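
A rough sketch of those two steps (chatService and chatHistory are assumed placeholders, not from this thread; the Tiktoken calls mirror the Encoding.TryForModel / CountTokens API used later in this thread):

// chatService (IChatCompletionService) and chatHistory are assumed to exist already.
var encoding = Tiktoken.Encoding.TryForModel("<model name>");

// 1. Prompt tokens: serialize the chat history and count it.
var promptJson = System.Text.Json.JsonSerializer.Serialize(chatHistory);
var promptTokens = encoding?.CountTokens(promptJson) ?? 0;

// 2. Completion tokens: accumulate the stream, then count the final text in the finally block.
var completion = new StringBuilder();
try
{
    await foreach (var chunk in chatService.GetStreamingChatMessageContentsAsync(chatHistory))
    {
        completion.Append(chunk.Content);
    }
}
finally
{
    var completionTokens = encoding?.CountTokens(completion.ToString()) ?? 0;
    Console.WriteLine($"Prompt tokens: {promptTokens}; Completion tokens: {completionTokens}");
}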

gmantri (Author) commented Feb 4, 2024


@dmm-l-mediehus - Thanks for the tip on using Tiktoken! It was indeed helpful.
@EVENFLOW212 - Here's what I ended up doing. In my application, I am using YAML files as prompt templates and then I am passing a number of arguments (including chat history) to that template.

private async Task<(string prompt, Exception exception)> GetPrompt(Kernel kernel, string promptFilePath, IDictionary<string, object> arguments)
{
    try
    {
        // Load the YAML prompt template and build a Handlebars prompt template from it.
        var promptFileContents = await File.ReadAllTextAsync(promptFilePath);
        var promptTemplateConfig = KernelFunctionYaml.ToPromptTemplateConfig(promptFileContents);
        var factory = new HandlebarsPromptTemplateFactory();
        if (!factory.TryCreate(promptTemplateConfig, out var promptTemplate))
        {
            return (string.Empty, new InvalidOperationException("Unable to create prompt template."));
        }
        var openAIPromptSettings = new OpenAIPromptExecutionSettings()
        {
            Temperature = 0
        };
        var kernelArguments = new KernelArguments(openAIPromptSettings);
        if (arguments != null)
        {
            foreach (var kvp in arguments)
            {
                kernelArguments.TryAdd(kvp.Key, kvp.Value);
            }
        }

        // Render the template with the kernel arguments to get the exact prompt text sent to the LLM.
        var prompt = await promptTemplate.RenderAsync(kernel, kernelArguments);
        return (prompt, null);
    }
    catch (Exception exception)
    {
        //log exception
        return (string.Empty, exception);
    }
}

And as I am getting the streaming response, I am saving it to a string. Once all responses have been received, I use Tiktoken to calculate the prompt and completion tokens with something like the following:

private (int PromptTokens, int CompletionTokens) CalculateTokensForQuestionAndAnswer(string prompt, string answer)
{
    try
    {
        var promptTokens = 0;
        var completionTokens = 0;
        // Pass the model name; supported model names are listed at
        // https://github.com/tryAGI/Tiktoken/blob/main/src/libs/Tiktoken/Services/Helpers.GetNameByModel.cs
        var encodingForModel = Tiktoken.Encoding.TryForModel("<model name>");
        if (encodingForModel == null) return (PromptTokens: promptTokens, CompletionTokens: completionTokens);
        promptTokens = encodingForModel.CountTokens(prompt);
        completionTokens = encodingForModel.CountTokens(answer);
        return (PromptTokens: promptTokens, CompletionTokens: completionTokens);
    }
    catch (Exception exception)
    {
        //log exception
        return (PromptTokens: 0, CompletionTokens: 0);
    }
}
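
Putting the two helpers together, a rough sketch of the flow (chatService, chatHistory and the prompt file path are illustrative placeholders, not part of the code above):

// Render the prompt from the YAML template (file path and arguments are illustrative).
var (prompt, error) = await GetPrompt(kernel, "prompts/answer.yaml", new Dictionary<string, object> { ["history"] = chatHistory });
if (error != null) return;

// Stream the response and accumulate it into a single string.
var answer = new StringBuilder();
await foreach (var chunk in chatService.GetStreamingChatMessageContentsAsync(chatHistory))
{
    answer.Append(chunk.Content);
}

// Count tokens for the rendered prompt and the accumulated answer.
var (promptTokens, completionTokens) = CalculateTokensForQuestionAndAnswer(prompt, answer.ToString());
Console.WriteLine($"Prompt tokens: {promptTokens}; Completion tokens: {completionTokens}");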
