feat: add backoff and jitter to retry #643
base: main
Conversation
retry.go
Outdated
defer timer.Stop()

select {
case <-ctx.Done():
We should not wait for the context deadline but check whether the deadline will be exceeded or not:
dl, ok := ctx.Deadline()
if !ok || time.Until(dl) > delay {
	time.Sleep(delay)
}
Is there no need to respect context cancellation?
It is better to respect the cancellation, but the returned errAbortWaiting complicates the code outside. How about we do this, removing errAbortWaiting:
ctx := context.Background()
if dl, ok := ctx.Deadline(); !ok || time.Until(dl) > delay {
	if ch := ctx.Done(); ch != nil {
		tm := time.NewTimer(delay)
		defer tm.Stop()
		select {
		case <-ch:
		case <-tm.C:
		}
	} else {
		time.Sleep(delay)
	}
	return true
}
return false
We still let it retry if the context has ended; the underlying pipe will check ctx.Err().
The code outside WaitUntilNextRetry can become complicated. I feel this is more of a code-style question: when no error is returned, either the caller must always check whether the context is done, or, as you said, the pipe will always check it and all contributors need to know that. That is unclear to me, so the more complicated but explicit code makes more sense to me. I will change it, but can you give a more detailed reason?
> pipe always will check and all contributors should know that. It is unclear to me

It is indeed implicit. But the reason for not returning an error is simple: adding a new error path to replace a goto retry statement seems like doing too much.
retry.go
Outdated
base := util.FastRandFloat64()
backoff := uint64(1 << uint64(math.Min(float64(attempts), maxAttemptsShift)))
jitter := base * float64(backoff) * float64(time.Millisecond)
return time.Duration(math.Min(jitter, float64(defaultMaxRetryDelay)))
We should apply randomness only to the jitter part. The delay produced by the current implementation is not monotonically increasing.

Can we avoid using float64? We can use the built-in min and max functions.
For example:
backoff := time.Microsecond * 5 * (1 << min(20, attempts))
jitter := time.Microsecond * time.Duration(rand.Int64N(1<<min(20, attempts)))
return min(time.Second, backoff+jitter)
Well, actually the above logic came from here.

> The delay produced by the current implementation is not monotonically increasing.

That part is intended, to smooth out calls from multiple clients.
if err != nil {
	return newErrResult(err)
}
Could this be removed?
if err != nil {
	return fillErrs(len(multi), err)
}
Could this be removed?
if err != nil {
	return newErrResult(err)
}
Could this be removed?
if !isWaitingForRetry {
	shouldRetry := c.retryHandler.WaitUntilNextRetry(
		ctx, attempts, resp.Error(),
	)

	if !shouldRetry {
		continue
	}

	isWaitingForRetry = true
}
Why do you put the waiting here instead of at:

Lines 679 to 681 in fc348f5

if len(retries.m) != 0 {
	goto retry
}
close: #633

Adds a retry policy; the default is exponential backoff with jitter.

Keeping the retry-handler logic clean is not easy. If you have an idea, please leave a review.