Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

protocols/kad/behaviour: Remove false assert on connected_peers #2120

Merged
merged 4 commits into from
Jul 8, 2021

Conversation

mxinden
Copy link
Member

@mxinden mxinden commented Jul 5, 2021

Given the following scenario:

  1. Remote peer X connects and is added to connected_peers.
  2. Remote peer X opens a Kademlia substream and thus confirms that it supports
    the Kademlia protocol.
  3. Remote peer X is added to the routing table as Connected.
  4. Remote peer X disconnects and is thus marked as Disconnected in the routing
    table.
  5. Remote peer Y connects and is added to connected_peers.
  6. Remote peer X re-connects and is added to connected_peers.
  7. Remote peer Y opens a Kademlia substream and thus confirms that it supports
    the Kademlia protocol.
  8. Remote peer Y is added to the routing table. Given that the bucket is already
    full the call to entry.insert returns kbucket::InsertResult::Pending { disconnected } where disconnected is peer X.

While peer X is in connected_peers it has not yet (re-) confirmed that it
supports the Kademlia routing protocol and thus is still tracked as
Disconnected in the routing table. The debug_assert removed in this pull
request does not capture this scenario.

Fixes #2092

@FelipeRosa @izolyomi would you mind taking a look and confirm that this fixes your issues as well?

Given the following scenario:

1. Remote peer X connects and is added to `connected_peers`.
2. Remote peer X opens a Kademlia substream and thus confirms that it supports
   the Kademlia protocol.
3. Remote peer X is added to the routing table as `Connected`.
4. Remote peer X disconnects and is thus marked as `Disconnected` in the routing
   table.
5. Remote peer Y connects and is added to `connected_peers`.
6. Remote peer X re-connects and is added to `connected_peers`.
7. Remote peer Y opens a Kademlia substream and thus confirms that it supports
   the Kademlia protocol.
8. Remote peer Y is added to the routing table. Given that the bucket is already
   full the call to `entry.insert` returns `kbucket::InsertResult::Pending {
   disconnected }` where disconnected is peer X.

While peer X is in `connected_peers` it has not yet (re-) confirmed that it
supports the Kademlia routing protocol and thus is still tracked as
`Disconnected` in the routing table. The `debug_assert` removed in this pull
request does not capture this scenario.
Copy link
Contributor

@romanb romanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me. I guess this was an oversight when implementing the delayed insertion into the routing tables (i.e. only after protocol confirmation).

@izolyomi
Copy link
Contributor

izolyomi commented Jul 5, 2021

Using this new dependency version 0.39 seems to have a ton of breaking changes compared to 0.38. I'd have to understand all the changes and fix everything to test, I might stick with the old version for the moment.

@FelipeRosa
Copy link

Thanks for the work @mxinden and @romanb

Sure, I'll take a look at it later today.

However, I believe that should solve the problem because what I ended up doing to prevent triggering the debug_assert was setting Kademlia's behaviour kbuckets insert to KademliaBucketInserts::Manual and adding DHT peer addresses manually at a later stage (after the peer has sent Identify information and said it supports the kad protocol).

Since I've done that I have not seen the debug_assert being triggered and everything seems to be working fine when using the kad DHT.

@izolyomi
Copy link
Contributor

izolyomi commented Jul 7, 2021

I've finally upgraded to the unreleased 0.39, it wasn't too hard to fix all breaking changes after all. I can confirm that I haven't encountered debug_assert failing even once.

@mxinden mxinden merged commit 05a4b16 into libp2p:master Jul 8, 2021
@mxinden
Copy link
Member Author

mxinden commented Jul 8, 2021

Thanks everyone for the input and thanks @romanb for the review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

protocols/kad: debug_assert panic - expecting disconnected peer not to be connected
4 participants