This repository has been archived by the owner on Mar 30, 2023. It is now read-only.

LoadBalancer kills Logstash connections #249

Closed
mario-paniccia opened this issue Nov 16, 2018 · 8 comments

Comments

@mario-paniccia

We are experiencing slow indexing and data loss in Logstash because the cluster load balancer has an idle timeout of 5 minutes, and so is killing idle Logstash connections to Elasticsearch.
Here are the Logstash logs produced when a new message is processed after more than 5 minutes of inactivity (10.100.24.254 is the load balancer IP):

```
[2018-11-16T09:36:50,728][ERROR][logstash.outputs.elasticsearch] Attempted to send a bulk request to elasticsearch' but Elasticsearch appears to be unreachable or down! {:error_message=>"Elasticsearch Unreachable: [http://10.100.24.254:9200/][Manticore::SocketTimeout] Read timed out", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError", :will_retry_in_seconds=>2}
[2018-11-16T09:36:51,101][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://10.100.24.254:9200/, :path=>"/"}
[2018-11-16T09:36:52,754][WARN ][logstash.outputs.elasticsearch] UNEXPECTED POOL ERROR {:e=>#<LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError: No Available connections>}
[2018-11-16T09:36:52,761][ERROR][logstash.outputs.elasticsearch] Attempted to send a bulk request to elasticsearch, but no there are no living connections in the connection pool. Perhaps Elasticsearch is unreachable or down? {:error_message=>"No Available connections", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError", :will_retry_in_seconds=>4}
[2018-11-16T09:36:54,118][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>"http://10.100.24.254:9200/", :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :error=>"Elasticsearch Unreachable: [http://10.100.24.254:9200/][Manticore::SocketTimeout] Read timed out"}
[2018-11-16T09:36:55,119][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://10.100.24.254:9200/, :path=>"/"}
[2018-11-16T09:36:55,123][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://10.100.24.254:9200/"}
```

I've tried various combinations of the elasticsearch output plugin's connection parameters, e.g. timeout, validate_after_inactivity, resurrect_delay, but nothing seems to work:

https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html

Here's my Logstash configuration for the output:

```
output {
    elasticsearch {
        hosts => ["10.100.24.254:9200"]
        sniffing => false
        index => "%{[index_id]}"
        document_type => "_doc"
        document_id => "%{id}"
        timeout => 3
        resurrect_delay => 3
        validate_after_inactivity => 3000
    }
}
```

Any idea how I can configure Logstash to prevent this connection issue?
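One thing that stands out in the configuration above (a sketch only, not a confirmed fix; the address is from this thread and the values are illustrative) is timeout => 3, which gives bulk requests only 3 seconds before Manticore raises the Read timed out seen in the logs. Raising it toward the plugin's default of 60, and keeping validate_after_inactivity well below the load balancer's 300-second idle timeout, may reduce the failures after idle periods:

```
output {
    elasticsearch {
        hosts => ["10.100.24.254:9200"]
        sniffing => false
        index => "%{[index_id]}"
        document_type => "_doc"
        document_id => "%{id}"
        # allow more than 3s for the first request after a connection is re-established
        timeout => 60
        resurrect_delay => 3
        # re-validate connections idle longer than 100s, well under the LB's 300s idle timeout
        validate_after_inactivity => 100000
    }
}
```

Note that if the load balancer drops idle connections silently (without sending a TCP RST), validation alone may not catch the dead connection, which would explain why tuning these parameters has limited effect.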

@russcam
Contributor

russcam commented Dec 10, 2018

Thanks for raising @mario-paniccia, we'll need to look into this.

@mario-paniccia
Author

@russcam I've spent days on this issue without luck. It has been causing many problems, including slow Logstash indexing and documents being lost during indexing.
In the end I gave up. Now I connect Logstash directly to the cluster data nodes and it's working perfectly.

@RomasZekonis

+1
We also have the same issue with the internal load balancer.
Can we pass an array of Elasticsearch nodes in the Logstash pipeline output?

```
output {
    elasticsearch {
        hosts => ["http://data-0:9200", "http://data-1:9200", "http://data-2:9200"]
        index => "%{[Application]}-%{+YYYY.MM.dd}"
    }
}
```

Thanks

@russcam
Contributor

russcam commented Jun 2, 2019

Can we pass an array of Elasticsearch nodes in the Logstash pipeline output?

Yes, you can do this if you're using Azure DNS, or a custom DNS that forwards to Azure DNS, so that VMs can be resolved by hostname.
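Assuming hostname resolution is in place, the load balancer can be bypassed entirely by listing the data nodes, as in the comment above (the node names are illustrative and assume the template's naming). Since the full set of nodes is given explicitly, sniffing can stay off:

```
output {
    elasticsearch {
        # list every data node explicitly; requires each hostname to resolve via Azure DNS
        hosts => ["http://data-0:9200", "http://data-1:9200", "http://data-2:9200"]
        sniffing => false
        index => "%{[Application]}-%{+YYYY.MM.dd}"
    }
}
```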

@russcam
Contributor

russcam commented Jun 25, 2019

I've opened #292 to allow load balancer SKU to be specified for both internal and external load balancers. A Standard SKU load balancer may alleviate some of the issues experienced.

@RomasZekonis

Maybe we don't need the internal load balancer at all? In my setup, passing the array of nodes in the Logstash pipeline's hosts option works much better. Or is there another purpose for this LB that I'm not aware of?

@russcam
Contributor

russcam commented Feb 12, 2020

Maybe we don't need the internal load balancer at all? In my setup, passing the array of nodes in the Logstash pipeline's hosts option works much better. Or is there another purpose for this LB that I'm not aware of?

@RomasZekonis an internal load balancer is needed for some versions of Kibana that the template deploys, which were limited to communicating with Elasticsearch through a single endpoint (see elastic/kibana#214 and elastic/kibana#21928). There's a fair amount of complexity involved in updating the template to deploy an internal load balancer only for the versions that require it. It's not a high priority to address, but we would accept a PR for it.

@mario-paniccia Would you like to see if a Standard SKU internal load balancer, which is now supported, addresses this? Or are we OK to close this issue?

@russcam
Contributor

russcam commented Dec 3, 2020

We're closing this issue due to inactivity. If you're still in need of a resolution, feel free to re-open with additional details and we'd be glad to help you out 👍

@russcam russcam closed this as completed Dec 3, 2020