enhance: avoid context limit #832

g-linville · 2024-08-29T23:35:21Z

for gptscript-ai/desktop#283

This PR introduces three new measures to try to help the user avoid hitting the context length error:

1. Garbage collection now works. Previously, the token threshold was set too high, so we would run out of context before garbage collection ever kicked in. Now, older messages are deleted from the chat (on the LLM's side only; users can still see them) whenever that is needed to stay under the context limit.
2. If the output of a single tool call exceeds approximately 80% of the total context window, it is considered too long and an error message is given to the LLM instead. We will design tools with the context limits in mind, but this safeguard prevents a crash, lets the user continue chatting with the LLM, and lets the LLM adjust its tool arguments so that the output possibly stays under the limit.
3. If, despite the previous measures, we still hit the context length limit error, we catch that error, garbage collect more aggressively, and retry the call up to ten times, with more aggressive garbage collection each time.
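
To make the first measure concrete, here is a minimal sketch of dropping the oldest messages until the history fits a budget. It is not the code from this PR: `Message`, `estimateTokens`, and `dropOldest` are hypothetical names, and the four-characters-per-token estimate is only a placeholder for a real tokenizer.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a hypothetical stand-in for a chat message, not the real gptscript type.
type Message struct {
	Role    string
	Content string
}

// estimateTokens is a crude placeholder for a tokenizer (about four characters per token).
func estimateTokens(m Message) int {
	return len(m.Content)/4 + 1
}

// dropOldest removes the oldest non-system messages until the estimated total
// fits within budget. System messages are always kept.
func dropOldest(msgs []Message, budget int) []Message {
	total := 0
	for _, m := range msgs {
		total += estimateTokens(m)
	}

	kept := make([]Message, 0, len(msgs))
	for _, m := range msgs {
		if total > budget && m.Role != "system" {
			total -= estimateTokens(m) // drop this message from the LLM-side history
			continue
		}
		kept = append(kept, m)
	}
	return kept
}

func main() {
	history := []Message{
		{Role: "system", Content: "You are a helpful assistant."},
		{Role: "user", Content: strings.Repeat("a", 400)},
		{Role: "assistant", Content: strings.Repeat("b", 400)},
		{Role: "user", Content: "What now?"},
	}
	for _, m := range dropOldest(history, 120) {
		fmt.Println(m.Role, estimateTokens(m))
	}
}
```

A real implementation would likely also need to keep tool calls and their results together when dropping messages; this sketch ignores that.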

With these three measures, I am fairly confident that users will no longer encounter the error.

Signed-off-by: Grant Linville <grant@acorn.io>

ibuildthecloud

The very important thing is that we never drop the system role message. I believe that is still the case.

g-linville · 2024-08-30T03:40:18Z

@ibuildthecloud correct. The way I implemented this will never change how the system role messages are handled. Those will always stick around.

thedadams · 2024-08-30T05:22:00Z

pkg/openai/client.go

@@ -317,6 +320,14 @@ func (c *Client) Call(ctx context.Context, messageRequest types.CompletionReques
 	}

 	if messageRequest.Chat {
+		// Check the last message. If it is from a tool call, and if it takes up more than 80% of the budget on its own, reject it.
+		lastMessage := msgs[len(msgs)-1]
+		if lastMessage.Role == string(types.CompletionMessageRoleTypeTool) && countMessage(lastMessage) > int(math.Round(float64(getBudget(messageRequest.MaxTokens))*0.8)) {


nit: probably don't need to math.Round here. Not much of a difference between 102,399 and 102,400.
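
As a rough illustration of the nit, the sketch below compares the 80% limit with and without `math.Round`. The function names are hypothetical, and the 128,000-token budget is an assumption inferred from the 102,400 figure in the comment, not something stated in the diff.

```go
package main

import (
	"fmt"
	"math"
)

// toolOutputLimit returns 80% of the budget, truncating toward zero.
func toolOutputLimit(budget int) int {
	return int(float64(budget) * 0.8)
}

// toolOutputLimitRounded is the math.Round variant, shown only for comparison.
func toolOutputLimitRounded(budget int) int {
	return int(math.Round(float64(budget) * 0.8))
}

func main() {
	// 128000 is an assumed budget; the other values are arbitrary.
	for _, budget := range []int{128000, 127999, 4097} {
		fmt.Println(budget, toolOutputLimit(budget), toolOutputLimitRounded(budget))
	}
}
```

For non-negative budgets the two variants can differ by at most one token, which is negligible next to a threshold of this size.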

thedadams · 2024-08-30T05:23:26Z

pkg/openai/client.go

+
+		// If we got back a context length exceeded error, keep retrying and shrinking the message history until we pass.
+		var apiError *openai.APIError
+		if err != nil && errors.As(err, &apiError) && apiError.Code == "context_length_exceeded" && messageRequest.Chat {


nit: errors.As takes care of the err != nil check

Suggested change

-		if err != nil && errors.As(err, &apiError) && apiError.Code == "context_length_exceeded" && messageRequest.Chat {
+		if errors.As(err, &apiError) && apiError.Code == "context_length_exceeded" && messageRequest.Chat {
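
The point behind this suggestion is that `errors.As` returns false when given a nil error, so the explicit nil check is redundant. A small sketch, using a hypothetical `apiError` type in place of `openai.APIError` (whose real `Code` field may have a different type):

```go
package main

import (
	"errors"
	"fmt"
)

// apiError is a hypothetical stand-in for openai.APIError.
type apiError struct {
	Code string
}

func (e *apiError) Error() string { return "api error: " + e.Code }

// isContextLengthExceeded reports whether err wraps an *apiError carrying
// the context_length_exceeded code. No separate nil check is needed.
func isContextLengthExceeded(err error) bool {
	var target *apiError
	return errors.As(err, &target) && target.Code == "context_length_exceeded"
}

func main() {
	wrapped := fmt.Errorf("call failed: %w", &apiError{Code: "context_length_exceeded"})

	fmt.Println(isContextLengthExceeded(nil))                        // nil error
	fmt.Println(isContextLengthExceeded(wrapped))                    // wrapped API error
	fmt.Println(isContextLengthExceeded(errors.New("network down"))) // unrelated error
}
```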

thedadams · 2024-08-30T05:24:55Z

pkg/openai/client.go

+		err      error
+	)
+
+	for range 10 { // maximum 10 tries


Is this the first use in our code base?

I think so!

thedadams · 2024-08-30T05:26:21Z

pkg/openai/count.go

+
+func decreaseTenPercent(maxTokens int) int {
+	maxTokens = getBudget(maxTokens)
+	return int(math.Round(float64(maxTokens) * 0.9))


same nit about math.Round
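
A sketch of `decreaseTenPercent` with the rounding removed, as the nit suggests. The body of `getBudget` is not shown in this diff, so the version here, including the 128,000 default, is an assumption. The loop in `main` also assumes the reduction compounds from one retry to the next, which the PR description implies but the hunk does not show.

```go
package main

import "fmt"

// defaultBudget is an assumed default; the real value is not shown in the diff.
const defaultBudget = 128000

// getBudget is a guess at the helper's behavior: fall back to a default
// when no explicit token limit is set.
func getBudget(maxTokens int) int {
	if maxTokens == 0 {
		return defaultBudget
	}
	return maxTokens
}

// decreaseTenPercent returns 90% of the budget, truncating instead of rounding.
func decreaseTenPercent(maxTokens int) int {
	return int(float64(getBudget(maxTokens)) * 0.9)
}

func main() {
	budget := 0 // 0 means "use the default" in this sketch
	for try := 1; try <= 10; try++ {
		budget = decreaseTenPercent(budget)
		fmt.Println(try, budget)
	}
}
```

If the reduction does compound, ten tries leave roughly a third of the starting budget, since 0.9 to the tenth power is about 0.35.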

Signed-off-by: Grant Linville <grant@acorn.io>

g-linville added 3 commits August 29, 2024 19:31

enhance: avoid context limit

9a0709f

Signed-off-by: Grant Linville <grant@acorn.io>

fix retry loop

92876e8

Signed-off-by: Grant Linville <grant@acorn.io>

fix

30f24a4

Signed-off-by: Grant Linville <grant@acorn.io>

g-linville marked this pull request as ready for review August 29, 2024 23:39

g-linville requested review from ibuildthecloud, cjellick, thedadams, njhale, StrongMonkey and iwilltry42 August 29, 2024 23:51

ibuildthecloud approved these changes Aug 30, 2024


thedadams approved these changes Aug 30, 2024


PR feedback

53b9671

Signed-off-by: Grant Linville <grant@acorn.io>

g-linville merged commit 1ad818b into gptscript-ai:main Aug 30, 2024
10 checks passed

g-linville deleted the avoid-context-limit branch August 30, 2024 13:31

g-linville mentioned this pull request Aug 30, 2024

Provide a way to continue chatting with a thread after hitting context length limit possibly by dropping/consolidating old message. gptscript-ai/desktop#283

Closed
