Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Zen2] Add lag detector #35685
[Zen2] Add lag detector #35685
Changes from 8 commits
c8ddb4f
fa3f3fc
e6f3291
b39750b
4fbd9a8
4cc00ac
dea2640
213436f
286eddd
409bf3f
97578a5
31218bc
3102f8c
cdcc190
7150336
f956db5
63b21ee
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can avoid to hook this in here, and whether there's a way to do this in Coordinator. We call
ackOnce(null)
here, which in turn callsAckListener.onNodeAck(DiscoveryNode, @Nullable Exception)
. We already hook into that acklistener in Coordinator, so we could also get those events there. And we also know the version of the cluster state we're publishing in Coordinator.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we know that? We start the lag detector after clearing
currentPublication
, and another publication could then start. It wouldn't be right to detect lag based on a newer publication.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion is something like the following (I had to revert the not-null assertion, because I think we don't have that guarantee, and CoordinatorTests were failing):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, ok, I did that in 63b21ee.