enhance: avoid context limit #832

Merged: 4 commits, Aug 30, 2024

g-linville
Member

for gptscript-ai/desktop#283

This PR introduces three measures to help users avoid hitting the context length error:

  1. Garbage collection now works. Previously, the token threshold was set too high, so we would run out of context before garbage collection ever triggered. Now, older messages are deleted from the chat (on the LLM's side only; users can still see them) whenever that is needed to stay under the context limit.
  2. If a single tool's output exceeds approximately 80% of the total context window, it is considered too long and an error message is given to the LLM instead. We will design tools with the context limits in mind, but this safeguard prevents a crash, lets the user continue chatting with the LLM, and lets the LLM adjust its tool arguments so that the output may stay under the limit.
  3. If, despite the previous measures, we still hit the context length limit error, we catch it, garbage collect more aggressively, and retry the call up to ten times, with more aggressive garbage collection each time.

With these three measures, I am fairly confident that users will no longer encounter the error.

Signed-off-by: Grant Linville <grant@acorn.io>
Contributor

@ibuildthecloud ibuildthecloud left a comment

The most important thing is that we never drop the system role message. I believe that is still the case.

@g-linville
Member Author

@ibuildthecloud correct. The way I implemented this will never change how the system role messages are handled. Those will always stick around.
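
To make the guarantee discussed above concrete, here is a minimal sketch of budget-driven garbage collection that never drops system messages. It is not the code from this PR: the `message` type, the `countTokens` estimate (roughly four characters per token), and the `dropOldest` helper are all hypothetical names introduced for illustration.

```go
package main

import "fmt"

// message is a hypothetical stand-in for the real chat message type.
type message struct {
	Role    string
	Content string
}

// countTokens is a crude placeholder estimate (about 4 characters per token).
// It is an assumption for this sketch, not the PR's countMessage.
func countTokens(m message) int {
	return len(m.Content)/4 + 1
}

// dropOldest removes the oldest non-system messages until the estimated
// total fits within budget. System messages are always kept.
func dropOldest(msgs []message, budget int) []message {
	total := 0
	for _, m := range msgs {
		total += countTokens(m)
	}
	kept := make([]message, 0, len(msgs))
	for _, m := range msgs {
		if total > budget && m.Role != "system" {
			total -= countTokens(m) // drop this message from the LLM's view
			continue
		}
		kept = append(kept, m)
	}
	return kept
}

func main() {
	msgs := []message{
		{Role: "system", Content: "be brief"},
		{Role: "user", Content: "aaaaaaaaaaaaaaaa"},
		{Role: "assistant", Content: "bbbbbbbbbbbbbbbb"},
		{Role: "user", Content: "cccccccc"},
	}
	for _, m := range dropOldest(msgs, 12) {
		fmt.Println(m.Role)
	}
	// Prints: system, assistant, user (one per line)
}
```

A real implementation would likely also protect the most recent message and keep tool calls paired with their results; this sketch only illustrates the "oldest first, never the system message" ordering.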

@@ -317,6 +320,14 @@ func (c *Client) Call(ctx context.Context, messageRequest types.CompletionReques
}

	if messageRequest.Chat {
		// Check the last message. If it is from a tool call, and if it takes up more than 80% of the budget on its own, reject it.
		lastMessage := msgs[len(msgs)-1]
		if lastMessage.Role == string(types.CompletionMessageRoleTypeTool) && countMessage(lastMessage) > int(math.Round(float64(getBudget(messageRequest.MaxTokens))*0.8)) {
Contributor

nit: probably don't need to math.Round here. Not much of a difference between 102,399 and 102,400.

Member Author

Fixed
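
One possible way to compute the 80% threshold without `math.Round` is plain integer arithmetic. The sketch below is an assumption about how such a check could look, not the fix that was actually committed; `toolOutputLimit` and `tooLong` are hypothetical helpers, and the 128,000-token budget is only an example value.

```go
package main

import "fmt"

// toolOutputLimit returns 80% of budget using integer arithmetic.
// The result is truncated toward zero, which is fine for a rough limit.
func toolOutputLimit(budget int) int {
	return budget * 8 / 10
}

// tooLong reports whether a message with the given role and token count is
// a tool output that exceeds the limit on its own.
func tooLong(role string, tokens, budget int) bool {
	return role == "tool" && tokens > toolOutputLimit(budget)
}

func main() {
	budget := 128000 // example value, assumed for illustration

	fmt.Println(toolOutputLimit(budget))         // 102400
	fmt.Println(tooLong("tool", 110000, budget)) // true
	fmt.Println(tooLong("tool", 50000, budget))  // false
}
```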


// If we got back a context length exceeded error, keep retrying and shrinking the message history until we pass.
var apiError *openai.APIError
if err != nil && errors.As(err, &apiError) && apiError.Code == "context_length_exceeded" && messageRequest.Chat {
Contributor

nit: errors.As takes care of the err != nil check

Suggested change
- if err != nil && errors.As(err, &apiError) && apiError.Code == "context_length_exceeded" && messageRequest.Chat {
+ if errors.As(err, &apiError) && apiError.Code == "context_length_exceeded" && messageRequest.Chat {

Member Author

Fixed
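
The nit above relies on the fact that `errors.As` returns false for a nil error, so a separate `err != nil` check is redundant. A small self-contained sketch of that behavior follows; the `apiError` type is a simplified, hypothetical stand-in for `openai.APIError`, and `isContextLengthExceeded` is a helper invented for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// apiError is a simplified, hypothetical stand-in for openai.APIError.
type apiError struct {
	Code string
}

func (e *apiError) Error() string { return "api error: " + e.Code }

// isContextLengthExceeded needs no explicit nil check: errors.As returns
// false when err is nil, and the Code comparison is then short-circuited.
func isContextLengthExceeded(err error) bool {
	var apiErr *apiError
	return errors.As(err, &apiErr) && apiErr.Code == "context_length_exceeded"
}

func main() {
	wrapped := fmt.Errorf("call failed: %w", &apiError{Code: "context_length_exceeded"})

	fmt.Println(isContextLengthExceeded(nil))                 // false
	fmt.Println(isContextLengthExceeded(wrapped))             // true
	fmt.Println(isContextLengthExceeded(errors.New("other"))) // false
}
```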

err error
)

for range 10 { // maximum 10 tries
Contributor

Is this the first use in our code base?

Member Author

I think so!


func decreaseTenPercent(maxTokens int) int {
	maxTokens = getBudget(maxTokens)
	return int(math.Round(float64(maxTokens) * 0.9))
Contributor

same nit about math.Round

Member Author

Fixed
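
As a rough illustration of how a 10% reduction and a bounded retry loop can fit together, here is a simplified sketch. It is not the PR's implementation: `errContextLength` is a hypothetical sentinel standing in for the API's `context_length_exceeded` error, `shrink` uses integer arithmetic in place of `math.Round`, and `callWithRetry` assumes the caller passes in a function that performs the request under a given budget.

```go
package main

import (
	"errors"
	"fmt"
)

// errContextLength is a hypothetical sentinel standing in for the API's
// context_length_exceeded error.
var errContextLength = errors.New("context_length_exceeded")

// shrink returns 90% of budget using integer arithmetic (truncating).
func shrink(budget int) int {
	return budget * 9 / 10
}

// callWithRetry invokes call with a progressively smaller budget, at most
// maxTries times, until call stops reporting errContextLength. It returns
// the budget after the last attempt together with the last error.
func callWithRetry(budget, maxTries int, call func(budget int) error) (int, error) {
	var err error
	for i := 0; i < maxTries; i++ {
		if err = call(budget); !errors.Is(err, errContextLength) {
			return budget, err // success, or an unrelated error
		}
		budget = shrink(budget)
	}
	return budget, err
}

func main() {
	// Pretend the request only fits once the budget is at most 80,000 tokens.
	call := func(budget int) error {
		if budget > 80000 {
			return errContextLength
		}
		return nil
	}
	budget, err := callWithRetry(100000, 10, call)
	fmt.Println(budget, err) // 72900 <nil>
}
```

Here the budget shrinks 100000 → 90000 → 81000 → 72900, and the fourth attempt succeeds.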

Signed-off-by: Grant Linville <grant@acorn.io>