fix(restore): Do not retry restore proposal #8058

Merged
merged 2 commits into from
Sep 30, 2021

@ahsanbarkati ahsanbarkati commented Sep 29, 2021

Retrying the restore proposal causes a problem. Consider the following scenario:

1. alpha-2 gets the restore request (the leader is alpha-0).
2. alpha-2 sends the request to alpha-0 (the leader).
3. alpha-0 calls `proposeAndWait`, which proposes the request (index 24) at time=15:56:10.
4. While alpha-0 is still waiting for the proposal to be applied, the `Restore` RPC call from alpha-2 fails with a "transport closing" error at time=15:59:08.
5. "transport closing" is a retriable error, so alpha-2 calls `proposeOrSend` again. This time the leader is alpha-1, so the request goes to alpha-1.
6. alpha-1 proposes the restore request (index 28) at time=15:59:09.

The same restore request ends up proposed twice (index 24 and index 28). This PR removes the retry logic. If the restore fails, it should be re-triggered manually.



@manishrjain manishrjain left a comment

:lgtm:

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @ahsanbarkati)

@ahsanbarkati ahsanbarkati merged commit 69b186a into master Sep 30, 2021
@ahsanbarkati ahsanbarkati deleted the ahsan/no-restore-retry branch September 30, 2021 20:12
mangalaman93 pushed a commit that referenced this pull request Oct 17, 2023
Do not retry the restore proposal. It can cause issues in
edge-case scenarios. Consider the following scenario:
  1. alpha-2 gets the restore request (leader is alpha-0)
  2. alpha-2 sends the request to alpha-0 (leader).
  3. alpha-0 calls proposeAndWait, which proposes the req
      (index 24) at time=15:56:10
  4. alpha-0 is still waiting for the proposal to be applied
      when the RPC call for `Restore` by alpha-2 gets a
      "transport closing" error at time=15:59:08
  5. transport closing is a retriable error, so alpha-2 calls
      proposeOrSend again; this time the leader is alpha-1,
      so the request is sent to alpha-1
  6. alpha-1 proposes the restore request (index 28) at time=15:59:09
mangalaman93 added a commit that referenced this pull request Oct 17, 2023
mangalaman93 added a commit that referenced this pull request Mar 13, 2024