From a7040c58c7fc7e507f074c02c1a601fce944469c Mon Sep 17 00:00:00 2001 From: Stan Bondi Date: Wed, 24 Aug 2022 16:34:08 +0400 Subject: [PATCH] fix(comms): ignore dial cancel event when inbound connection exists (#4521) Description --- Ignores `DialCancelled` event when the dialer is notified about an established `Inbound` connection Motivation and Context --- In the case where A dials B and B dials A and the B->A(Inbound relative to A) connection occurs first. The connectivity manager notifies the dialer actor to cancel any pending dials. However, this emits a DialCancelled event which causes the connectivity manager to discard the previous inbound connection. This PR ignores canceled dial events when an active inbound connection exists. Difficult to outright confirm, but I suspect this fixes: - dial/disconnect churn - A and B disagreeing on being connected to each other for a time/indefinitely. How Has This Been Tested? --- Hard to test for, tested manually by attempting to make B -> A happen first and checking that this case occurs in the logs. --- comms/core/src/connectivity/manager.rs | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/comms/core/src/connectivity/manager.rs b/comms/core/src/connectivity/manager.rs index acd45a5521..c83556d285 100644 --- a/comms/core/src/connectivity/manager.rs +++ b/comms/core/src/connectivity/manager.rs @@ -546,6 +546,15 @@ impl ConnectivityManagerActor { PeerConnected(conn) => (conn.peer_node_id(), ConnectionStatus::Connected, Some(conn.clone())), PeerConnectFailed(node_id, ConnectionManagerError::DialCancelled) => { + if let Some(conn) = self.pool.get_connection(node_id) { + if conn.is_connected() && conn.direction().is_inbound() { + debug!( + target: LOG_TARGET, + "Ignoring DialCancelled({}) event because an inbound connection already exists", node_id + ); + return Ok(()); + } + } debug!( target: LOG_TARGET, "Dial was cancelled before connection completed to peer '{}'", node_id