
feat: add backoff and jitter to retry #643

Open
wants to merge 6 commits into base: main

Conversation

Contributor

@proost proost commented Oct 3, 2024

Close: #633

Adds a retry policy.

The default is exponential backoff with jitter.

Writing clean logic for the retry handler is not easy. If you have an idea, please share it in review.
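For context, a minimal sketch of the exponential-backoff-with-full-jitter idea; the name retryDelay and the constants are illustrative only, not this PR's actual code:

import (
	"math/rand/v2"
	"time"
)

// retryDelay picks a random delay between 0 and an exponentially growing
// cap ("full jitter"), bounded at one second overall, so that many clients
// retrying at the same time do not all wake up at the same moment.
func retryDelay(attempts int) time.Duration {
	backoff := time.Millisecond * time.Duration(int64(1)<<min(attempts, 10))
	return min(time.Duration(rand.Int64N(int64(backoff))), time.Second)
}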

retry.go Outdated
	defer timer.Stop()

	select {
	case <-ctx.Done():
Collaborator

@rueian rueian Oct 3, 2024

We should not wait for the context deadline but check whether the deadline will be exceeded or not:

	dl, ok := ctx.Deadline()
	if !ok || time.Until(dl) > delay {
		time.Sleep(delay)
	}

Contributor Author

Is there no need to respect context cancellation?

Collaborator

@rueian rueian Oct 5, 2024

It is better to respect the cancellation, but the returned errAbortWaiting complicates the code outside.

How about we do this, removing the errAbortWaiting:

	ctx := context.Background()
	if dl, ok := ctx.Deadline(); !ok || time.Until(dl) > delay {
		if ch := ctx.Done(); ch != nil {
			tm := time.NewTimer(delay)
			defer tm.Stop()
			select {
			case <-ch:
			case <-tm.C:
			}
		} else {
			time.Sleep(delay)
		}
		return true
	}
	return false
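
For reference, the same logic as a self-contained helper; the name waitForRetry is illustrative only, not part of the PR:

import (
	"context"
	"time"
)

// waitForRetry sleeps for delay unless the context deadline would be
// exceeded first. It returns true when it waited (so a retry should be
// attempted) and false when the remaining deadline is shorter than delay.
// Note that it still returns true if the context was cancelled while
// waiting; the caller is expected to check ctx.Err() afterwards.
func waitForRetry(ctx context.Context, delay time.Duration) bool {
	if dl, ok := ctx.Deadline(); !ok || time.Until(dl) > delay {
		if ch := ctx.Done(); ch != nil {
			tm := time.NewTimer(delay)
			defer tm.Stop()
			select {
			case <-ch: // stop waiting early on cancellation
			case <-tm.C: // normal path: the delay elapsed
			}
		} else {
			// no cancellation channel, a plain sleep is enough
			time.Sleep(delay)
		}
		return true
	}
	return false
}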

Collaborator

We still let it retry even if the context has ended. The underlying pipe will check ctx.Err().

Contributor Author

@proost proost Oct 6, 2024

The code outside WaitUntilNextRetry can get complicated.

I feel this is more a matter of code style. When no error is returned, either the caller must always check whether the context is done, or, as you said, the pipe will always check it and every contributor has to know that. That is unclear to me, so the more complicated but explicit code makes more sense to me. I will change it, but can you give a more detailed reason?

Collaborator

"pipe always will check and all contributors should know that. It is unclear to me"

It is indeed implicit. But the reason for not returning an error is simple: adding a new error path just to replace a goto retry statement seems like doing too much.

@proost proost requested a review from rueian October 5, 2024 08:49
retry.go Outdated
base := util.FastRandFloat64()
backoff := uint64(1 << uint64(math.Min(float64(attempts), maxAttemptsShift)))
jitter := base * float64(backoff) * float64(time.Millisecond)
return time.Duration(math.Min(jitter, float64(defaultMaxRetryDelay)))
Collaborator

@rueian rueian Oct 5, 2024

We should apply randomness only to the jitter part. The delay produced by the current implementation is not monotonically increasing.

Can we avoid using float64? We can use the built-in min and max functions.

Collaborator

For example:

	backoff := time.Microsecond * 5 * (1 << min(20, attempts))
	jitter := time.Microsecond * time.Duration(rand.Int64N(1<<min(20, attempts)))
	return min(time.Second, backoff+jitter)
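
The snippet above is only a sketch; a compilable adaptation, assuming attempts is an int and using the hypothetical name backoffDelay, could look like:

import (
	"math/rand/v2"
	"time"
)

// backoffDelay grows monotonically with attempts and adds a bounded random
// jitter on top, capped at one second overall.
func backoffDelay(attempts int) time.Duration {
	shift := min(20, attempts)
	backoff := 5 * time.Microsecond * time.Duration(int64(1)<<shift)
	jitter := time.Microsecond * time.Duration(rand.Int64N(int64(1)<<shift))
	return min(time.Second, backoff+jitter)
}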

Contributor Author

Well, actually the above logic came from here.

"The delay produced by the current implementation is not monotonically increasing."

That part is intentional, to smooth out calls from multiple clients.

@proost proost requested a review from rueian October 6, 2024 14:22
Comment on lines +489 to +491
	if err != nil {
		return newErrResult(err)
	}
Collaborator

Could this be removed?

Comment on lines +765 to +767
	if err != nil {
		return fillErrs(len(multi), err)
	}
Collaborator

Could this be removed?

Comment on lines +800 to +802
	if err != nil {
		return newErrResult(err)
	}
Collaborator

Could this be removed?

Comment on lines +619 to +630

	if !isWaitingForRetry {
		shouldRetry := c.retryHandler.WaitUntilNextRetry(
			ctx, attempts, resp.Error(),
		)

		if !shouldRetry {
			continue
		}

		isWaitingForRetry = true
	}
Collaborator

Why do you put the waiting here instead of putting it at:

rueidis/cluster.go, lines 679 to 681 (at fc348f5):

	if len(retries.m) != 0 {
		goto retry
	}

Successfully merging this pull request may close these issues.

Feature: retry backoffs
2 participants