[Zen2] Introduce FollowersChecker #33917

Merged
merged 12 commits into from
Sep 22, 2018


DaveCTurner
Contributor

It is important that the leader periodically checks that its followers are
still healthy and can remain part of its cluster. If these checks fail
repeatedly, the leader should remove the faulty node from the cluster. The
FollowersChecker, introduced in this commit, performs these periodic checks and
deals with retries.
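
As a rough illustration of the behaviour described above, here is a minimal sketch that counts consecutive failed checks per follower and reports a node as faulty once a retry limit is reached. It is not the code from this PR: the class name `FollowerCheckSketch`, the `retryCount` parameter, and the `onNodeFailure` callback are all assumptions made for the example, and the timer that would drive the periodic checks is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical sketch: tracks consecutive failed checks per follower. */
class FollowerCheckSketch {
    // Assumed setting: number of consecutive failures that triggers removal.
    private final int retryCount;
    // Assumed callback: invoked once with the id of a node deemed faulty.
    private final Consumer<String> onNodeFailure;
    private final Map<String, Integer> consecutiveFailures = new HashMap<>();

    FollowerCheckSketch(int retryCount, Consumer<String> onNodeFailure) {
        this.retryCount = retryCount;
        this.onNodeFailure = onNodeFailure;
    }

    /** Start tracking a follower with a clean failure count. */
    void addFollower(String nodeId) {
        consecutiveFailures.putIfAbsent(nodeId, 0);
    }

    /** Record the outcome of one check; a real checker would run these on a schedule. */
    void onCheckResult(String nodeId, boolean success) {
        if (!consecutiveFailures.containsKey(nodeId)) {
            return; // node was never tracked, or has already been removed
        }
        if (success) {
            consecutiveFailures.put(nodeId, 0); // a healthy response resets the counter
            return;
        }
        int failures = consecutiveFailures.merge(nodeId, 1, Integer::sum);
        if (failures >= retryCount) {
            consecutiveFailures.remove(nodeId); // stop checking the faulty node
            onNodeFailure.accept(nodeId);
        }
    }

    boolean isTracked(String nodeId) {
        return consecutiveFailures.containsKey(nodeId);
    }
}
```

In this sketch a single successful check resets the counter, so only an unbroken run of failures leads to removal; whether the real implementation resets in the same way is not stated in this conversation.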

@DaveCTurner DaveCTurner added >enhancement v7.0.0 :Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. labels Sep 20, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

Contributor

@ywelsch ywelsch left a comment

I've left a few smaller comments and one item I would like to discuss.

request, this).getFormattedMessage());
}

executeRunnable.accept(new AbstractRunnable() {
Contributor

Can you add assert request.term > this.term before this line?

Contributor Author

That doesn't hold: we could be in the correct term but in CANDIDATE mode (e.g. because the LeaderChecker thinks the leader failed), in which case we need to call into the coordinator to become a FOLLOWER again.
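
To make the point above concrete, here is a small sketch of how a node might handle an incoming follower check. It is an assumption-laden illustration rather than the code under review: the class `FollowerCheckHandlerSketch`, its fields, and the `slowPathCalls` counter are hypothetical, and the "call into the coordinator" step is reduced to updating the term and mode directly.

```java
/** Hypothetical sketch of follower-side handling; all names are illustrative. */
class FollowerCheckHandlerSketch {
    enum Mode { CANDIDATE, LEADER, FOLLOWER }

    long term;
    Mode mode;
    int slowPathCalls = 0; // counts how often the (assumed) coordinator path is taken

    FollowerCheckHandlerSketch(long term, Mode mode) {
        this.term = term;
        this.mode = mode;
    }

    void handleFollowerCheck(long requestTerm) {
        if (mode == Mode.FOLLOWER && requestTerm == term) {
            return; // fast path: already a follower in the leader's term
        }
        if (requestTerm < term) {
            throw new IllegalStateException(
                "rejecting check from stale term " + requestTerm + " < " + term);
        }
        // Slow path. It is reached when requestTerm > term, but ALSO when
        // requestTerm == term and this node is not currently a FOLLOWER,
        // which is why asserting requestTerm > term here would be wrong.
        slowPathCalls++;
        term = requestTerm;
        mode = Mode.FOLLOWER; // stands in for calling into the coordinator
    }
}
```

For example, a node constructed with term 5 in CANDIDATE mode that receives a check for term 5 takes the slow path with equal terms and ends up as a FOLLOWER, which is the case the suggested assertion would trip on.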

@DaveCTurner
Contributor Author

@elasticmachine test this please

Contributor

@ywelsch ywelsch left a comment

LGTM

@ywelsch ywelsch mentioned this pull request Sep 21, 2018
61 tasks
@DaveCTurner DaveCTurner merged commit 1761b6c into elastic:zen2 Sep 22, 2018
@DaveCTurner DaveCTurner deleted the 2018-09-19-followers-checker branch September 22, 2018 10:34
ywelsch added a commit that referenced this pull request Sep 28, 2018
Allows this class to be cleanly shared between Zen1 and Zen2. Follow-up to #33917
Labels
:Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. >enhancement v7.0.0-beta1