
Handle Elasticsearch health status changes/restarts more gracefully during Kibana index migration #26049

Closed
ppf2 opened this issue Nov 21, 2018 · 3 comments
Labels
Team:Core (Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc), triage_needed

Comments

ppf2 (Member) commented Nov 21, 2018

Kibana version: 6.5.1

During a Kibana index migration, an ES node restarted.

Note the initial "No living connections" messages. Kibana was then able to reconnect and issued an index creation request for the next incremental upgrade index (.kibana_7):

{"type":"log","@timestamp":"2018-11-21T16:08:28Z","tags":["warning","elasticsearch","admin"],"pid":27581,"message":"Unable to revive connection: https://IP:9200/"}
{"type":"log","@timestamp":"2018-11-21T16:08:28Z","tags":["warning","elasticsearch","admin"],"pid":27581,"message":"No living connections"}
{"type":"log","@timestamp":"2018-11-21T16:08:31Z","tags":["status","plugin:elasticsearch@6.5.1","info"],"pid":27581,"state":"green","message":"Status changed from red to green - Ready","prevState":"red","prevMsg":"Unable to connect to Elasticsearch at https://IP:9200/."}
{"type":"log","@timestamp":"2018-11-21T16:08:31Z","tags":["info","migrations"],"pid":27581,"message":"Creating index .kibana_7."}
{"type":"log","@timestamp":"2018-11-21T16:08:31Z","tags":["license","info","xpack"],"pid":27581,"message":"Imported license information from Elasticsearch for the [monitoring] cluster: mode: gold | status: active | expiry date: 2019-04-29T16:59:59-07:00"}
{"type":"log","@timestamp":"2018-11-21T16:08:52Z","tags":["info","migrations"],"pid":27581,"message":"Migrating .kibana-6 saved objects to .kibana_7"}
{"type":"log","@timestamp":"2018-11-21T16:09:01Z","tags":["security","error"],"pid":27581,"message":"Error registering Kibana Privileges with Elasticsearch for kibana-.kibana: Request Timeout after 30000ms"}
{"type":"log","@timestamp":"2018-11-21T16:09:01Z","tags":["status","plugin:security@6.5.1","error"],"pid":27581,"state":"red","message":"Status changed from red to red - Request Timeout after 30000ms","prevState":"red","prevMsg":"No Living connections"}
{"type":"log","@timestamp":"2018-11-21T16:09:01Z","tags":["status","plugin:security@6.5.1","info"],"pid":27581,"state":"green","message":"Status changed from red to green - Ready","prevState":"red","prevMsg":"Request Timeout after 30000ms"}

However, it failed shortly afterwards at "Error registering Kibana Privileges". Here are the corresponding logs on the Elasticsearch side (timestamps below are in US Pacific).

You can see that server101 (the node that was restarted) rejoined the cluster, followed by the corresponding .kibana_7 index creation request, and that the cluster afterwards turned from yellow to green. Even while the cluster was yellow, at least one copy of the .security-6 index should have been available, so it looks like Kibana had trouble determining the actual status of the indices in the cluster.

[2018-11-21T08:08:29,980][INFO ][o.e.c.s.ClusterApplierService] [server103.infra] added {{server101.infra}{5GrIP0JGSJ6Q62fzVznZ5w}{OqlW_fHfSa2rFStexnqWGQ}{IP1}{IP1:9300}{ml.machine_memory=8203079680, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true},}, reason: apply cluster state (from master [master {server103.infra}{eYU43brtT_qb6CZ0kOSyrg}{UOuroEMmTyuv9ZZv94WbQg}{IP3}{IP3:9300}{ml.machine_memory=8202833920, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true} committed version [72562] source [zen-disco-node-join[{server101.infra}{5GrIP0JGSJ6Q62fzVznZ5w}{OqlW_fHfSa2rFStexnqWGQ}{IP1}{IP1:9300}{ml.machine_memory=8203079680, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}]]])
[2018-11-21T08:08:34,061][INFO ][o.e.c.m.MetaDataCreateIndexService] [server103.infra] [.kibana_7] creating index, cause [api], templates [], shards [1]/[1], mappings [doc]
[2018-11-21T08:08:53,047][WARN ][o.e.c.r.a.AllocationService] [server103.infra] [.security-6][0] marking unavailable shards as stale: [gPLSyrbKQxG9I-USwlexew]
[2018-11-21T08:09:04,562][INFO ][o.e.c.m.MetaDataMappingService] [server103.infra] [.kibana_7/tZA7GvolQ-Oarh6DUN3Y3A] update_mapping [doc]
[2018-11-21T08:09:07,761][INFO ][o.e.c.m.MetaDataMappingService] [server103.infra] [.kibana_7/tZA7GvolQ-Oarh6DUN3Y3A] update_mapping [doc]
[2018-11-21T08:09:28,615][INFO ][o.e.c.m.MetaDataMappingService] [server103.infra] [.kibana_7/tZA7GvolQ-Oarh6DUN3Y3A] update_mapping [doc]
[2018-11-21T08:12:05,665][INFO ][o.e.c.m.MetaDataIndexTemplateService] [server103.infra] adding template [.management-beats] for index patterns [.management-beats]
[2018-11-21T08:15:48,865][INFO ][o.e.c.r.a.AllocationService] [server103.infra] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.watcher-history-9-2018.09.01][0]] ...]).
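
For illustration, one way Kibana could confirm that an index is actually available before acting on it would be to wait on that index's cluster health. This is only a sketch under assumptions (waitForIndexYellow is a made-up helper name, and callCluster stands in for Kibana's admin-cluster caller, which proxies the legacy elasticsearch-js client with camelCased parameters); it is not existing migration code:

// Sketch only: block until every primary shard of `index` is allocated
// (index-level cluster health is at least yellow) before writing to it.
async function waitForIndexYellow(
  callCluster: (endpoint: string, params: object) => Promise<unknown>,
  index: string,
  timeout: string = '30s'
): Promise<unknown> {
  return callCluster('cluster.health', {
    index,
    waitForStatus: 'yellow', // primaries assigned; replicas may still be unassigned
    timeout,
  });
}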

It would be nice if Kibana could handle these situations more gracefully. For example, it could retry failed steps, or prevent the next incremental migration from starting if it detects a cluster health status change during the full migration.
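
As a purely illustrative sketch of the retry idea (retryWithBackoff is a made-up helper, not existing Kibana code), a migration step such as the index creation above could be wrapped in an exponential backoff loop so that transient "No living connections" or timeout errors do not derail the migration:

// Sketch only: retry an async migration step with exponential backoff.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  attempts: number = 5,
  initialDelayMs: number = 1000
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      // Wait 1s, 2s, 4s, ... before trying the step again.
      await new Promise((resolve) => setTimeout(resolve, initialDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}

// Hypothetical usage around the .kibana_7 creation step:
// await retryWithBackoff(() => callCluster('indices.create', { index: '.kibana_7', body: mappings }));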

Bargs added the Team:Operations and triage_needed labels on Nov 26, 2018
elasticmachine (Contributor) commented:

Pinging @elastic/kibana-operations

jbudz added the Team:Core label and removed the Team:Operations label on Apr 5, 2021
elasticmachine (Contributor) commented:

Pinging @elastic/kibana-core (Team:Core)

pgayvallet (Contributor) commented:

Addressed by the v2 migration algorithm (#66056)
