Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cluster health when closing #61709

Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@

import org.elasticsearch.action.ActionFuture;
import org.elasticsearch.action.admin.cluster.health.ClusterHealthResponse;
import org.elasticsearch.action.admin.indices.settings.put.UpdateSettingsRequest;
import org.elasticsearch.action.support.IndicesOptions;
import org.elasticsearch.action.support.PlainActionFuture;
import org.elasticsearch.cluster.health.ClusterHealthStatus;
Expand Down Expand Up @@ -309,14 +310,26 @@ public void clusterStateProcessed(String source, ClusterState oldState, ClusterS

public void testHealthOnMasterFailover() throws Exception {
final String node = internalCluster().startDataOnlyNode();
boolean withIndex = randomBoolean();
if (withIndex) {
// create index with many shards to provoke the health request to wait (for green) while master is being shut down.
createIndex("test", Settings.builder().put(IndexMetadata.SETTING_NUMBER_OF_REPLICAS, randomIntBetween(0, 10)).build());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. How can the cluster ever be green with that many replicas? Should this be number_of_shards?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It ensures that the cluster is yellow when the health request is made, making the health request wait on the observer, triggering the call to onClusterServiceClose when master is shutdown.

The number of replicas is cleared to 0 after having fired all the async restarts and done the master restarts. That ensures that all the requests responds with green status.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok, can you add a comment to that effect

}
final List<ActionFuture<ClusterHealthResponse>> responseFutures = new ArrayList<>();
// Run a few health requests concurrent to master fail-overs against a data-node to make sure master failover is handled
// without exceptions
for (int i = 0; i < 20; ++i) {
responseFutures.add(client(node).admin().cluster().prepareHealth().setWaitForEvents(Priority.LANGUID)
.setWaitForGreenStatus().execute());
.setWaitForGreenStatus().setMasterNodeTimeout(TimeValue.timeValueMinutes(1)).execute());
internalCluster().restartNode(internalCluster().getMasterName(), InternalTestCluster.EMPTY_CALLBACK);
}
if (withIndex) {
assertAcked(
client().admin().indices()
.updateSettings(new UpdateSettingsRequest("test")
.settings(Settings.builder().put(IndexMetadata.SETTING_NUMBER_OF_REPLICAS, 0))).get()
);
}
for (ActionFuture<ClusterHealthResponse> responseFuture : responseFutures) {
assertSame(responseFuture.get().getStatus(), ClusterHealthStatus.GREEN);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,7 @@
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.common.util.CollectionUtils;
import org.elasticsearch.index.IndexNotFoundException;
import org.elasticsearch.node.NodeClosedException;
import org.elasticsearch.tasks.Task;
import org.elasticsearch.threadpool.ThreadPool;
import org.elasticsearch.transport.TransportService;
Expand Down Expand Up @@ -204,7 +205,7 @@ public void onNewClusterState(ClusterState newState) {

@Override
public void onClusterServiceClose() {
listener.onFailure(new IllegalStateException("ClusterService was close during health call"));
listener.onFailure(new NodeClosedException(clusterService.localNode()));
}

@Override
Expand Down