TCP Socket errors - Transport endpoint is already connected #40

Closed
oini opened this issue Jun 16, 2020 · 3 comments · Fixed by #44
Labels: bug (Something isn't working), in progress (Someone is actively working on this issue)

Comments

oini commented Jun 16, 2020

Hi there! We're using this module to log metrics for a high-frequency ETL service running on ECS, where we have been noticing TCP socket errors (errno 106). A CloudWatch Agent container, which is responsible for publishing the metrics, is set up as a sidecar alongside the application server in the ECS Task Definition.

I'm trying to understand the root cause of these errors and how to avoid them. Any help is appreciated. Thanks!

Exact error:

[ERROR][tcp_client] Failed to connect to the socket. [Errno 106] Transport endpoint is already connected

jaredcnance (Member) commented Jun 16, 2020

This looks like a bug to me. TcpClient is not thread-safe, so if you have multiple threads concurrently sending metrics on startup, then this seems possible. We should catch and swallow this particular error on connect. Do you know if you have a single TcpClient or if the same client is shared across your application?

self._sock.connect((self._endpoint.hostname, self._endpoint.port))
self._should_connect = False
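
A minimal sketch of the mitigation suggested above, assuming a simplified stand-in for the client: `GuardedTcpClient`, its lock, and its constructor arguments are hypothetical and are not the library's code. It serializes `connect()` behind a lock and treats `EISCONN` (errno 106 on Linux) as "already connected" rather than as a failure.

```python
import errno
import socket
import threading


class GuardedTcpClient:
    """Hypothetical stand-in for TcpClient; not the library's implementation."""

    def __init__(self, host: str, port: int) -> None:
        self._address = (host, port)
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._lock = threading.Lock()
        self._should_connect = True

    def connect(self) -> None:
        # Only one thread at a time may inspect the flag and call connect().
        with self._lock:
            if not self._should_connect:
                return
            try:
                self._sock.connect(self._address)
            except OSError as e:
                # EISCONN means the socket is already connected, so the goal
                # of connect() is met; any other error is re-raised.
                if e.errno != errno.EISCONN:
                    raise
            self._should_connect = False
```

The lock alone would prevent the startup race; swallowing `EISCONN` is an extra guard for the case where the flag and the socket state disagree.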

jaredcnance added the bug label Jun 16, 2020

oini (Author) commented Jun 16, 2020

> Do you know if you have a single TcpClient or if the same client is shared across your application?

While MetricsLoggers are not shared across threads, it looks like each instance of MetricsLogger is pointing to the same Environment which has a single AgentSink which has a single TcpClient. So yes, I would say it's shared across the application with multiple loggers flushing to the CW agent at once.
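
To illustrate why a shared client matters, here is a small reproduction sketch using only the standard library (the function name `second_connect_errno` is made up for this example). It connects a single socket twice to a local listener, which is what can happen when several loggers trigger a connect on the same underlying socket; the second attempt fails with `EISCONN`, the errno 106 reported above on Linux.

```python
import errno
import socket


def second_connect_errno() -> int:
    """Connect one socket twice to a local listener; return the errno of the second attempt."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    shared = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        shared.connect(server.getsockname())
        try:
            # Second connect on the same, already-connected socket.
            shared.connect(server.getsockname())
        except OSError as e:
            return e.errno
        return 0
    finally:
        shared.close()
        server.close()


if __name__ == "__main__":
    code = second_connect_errno()
    print(code, errno.errorcode.get(code))
```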

ben51 commented Jun 23, 2020

I've had a similar issue running a multi-threaded application on ECS. It seems that a temporary network issue between the CW agent container and my application led to a socket error, and the metric logger was never able to recover from it.

An extract from the logs:

2020-06-22T01:54:53.496 [INFO] aws_embedded_metrics.sinks.tcp_client:51 - Submitted metrics to agent over TCP.
2020-06-22T01:54:58.449 [INFO] aws_embedded_metrics.sinks.tcp_client:51 - Submitted metrics to agent over TCP.
2020-06-22T01:58:36.412 [ERROR] aws_embedded_metrics.sinks.tcp_client:56 - Failed to write metrics to the socket due to socket.error. [Errno 113] No route to host
2020-06-22T01:58:36.414 [ERROR] aws_embedded_metrics.sinks.tcp_client:40 - Failed to connect to the socket. [Errno 106] Transport endpoint is already connected
2020-06-22T01:58:36.418 [ERROR] aws_embedded_metrics.sinks.tcp_client:56 - Failed to write metrics to the socket due to socket.error. [Errno 32] Broken pipe
2020-06-22T01:58:36.418 [ERROR] aws_embedded_metrics.sinks.tcp_client:40 - Failed to connect to the socket. [Errno 106] Transport endpoint is already connected
2020-06-22T01:58:36.421 [ERROR] aws_embedded_metrics.sinks.tcp_client:56 - Failed to write metrics to the socket due to socket.error. [Errno 32] Broken pipe
2020-06-22T01:58:36.421 [ERROR] aws_embedded_metrics.sinks.tcp_client:40 - Failed to connect to the socket. [Errno 106] Transport endpoint is already connected
2020-06-22T01:58:36.422 [ERROR] aws_embedded_metrics.sinks.tcp_client:40 - Failed to connect to the socket. [Errno 106] Transport endpoint is already connected
2020-06-22T01:58:36.422 [ERROR] aws_embedded_metrics.sinks.tcp_client:56 - Failed to write metrics to the socket due to socket.error. [Errno 32] Broken pipe
2020-06-22T01:58:36.422 [ERROR] aws_embedded_metrics.sinks.tcp_client:40 - Failed to connect to the socket. [Errno 106] Transport endpoint is already connected
2020-06-22T01:58:36.423 [ERROR] aws_embedded_metrics.sinks.tcp_client:40 - Failed to connect to the socket. [Errno 106] Transport endpoint is already connected
2020-06-22T01:58:36.423 [ERROR] aws_embedded_metrics.sinks.tcp_client:56 - Failed to write metrics to the socket due to socket.error. [Errno 32] Broken pipe
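
One possible reading of this log, stated as an assumption rather than a confirmed diagnosis, is that the same socket object is reused after the write failure, so every reconnect attempt is rejected with errno 106 and every write hits the broken connection. A hedged sketch of a recovery pattern follows: discard the failed socket and open a fresh one on the next send. `ReconnectingSender` is a hypothetical class, not the library's TcpClient.

```python
import socket
import threading
from typing import Optional


class ReconnectingSender:
    """Hypothetical sender that replaces its socket after any failure."""

    def __init__(self, host: str, port: int) -> None:
        self._address = (host, port)
        self._lock = threading.Lock()
        self._sock: Optional[socket.socket] = None

    def send(self, payload: bytes) -> bool:
        with self._lock:
            try:
                if self._sock is None:
                    # A brand-new socket avoids EISCONN from a stale one.
                    self._sock = socket.create_connection(self._address, timeout=5)
                self._sock.sendall(payload)
                return True
            except OSError:
                # Drop the failed socket; the next call reconnects from scratch.
                if self._sock is not None:
                    self._sock.close()
                    self._sock = None
                return False
```

In this sketch the payload from a failed call is dropped; whether to retry or buffer it is a separate design decision.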

jaredcnance added the in progress label Jun 26, 2020
3 participants