Inconsequent Kademlia bootstrapping events #2121

Closed
izolyomi opened this issue Jul 6, 2021 · 2 comments

Comments

@izolyomi (Contributor) commented Jul 6, 2021

While following the events emitted by the Kademlia protocol, I always receive bootstrap events in the following pattern:

BootstrapOk { peer: PeerId("12D3KooWAYFzVvNfpu8j6HZ8cS6tXngRe8kLXCrwEeXBw86JG9h6"), num_remaining: 11 }
...
BootstrapOk { peer: PeerId("1AkytfpPb2r6xtH8UtKE3EADBUbGjRuKWRjE5EgbtBTJpd"), num_remaining: 0 }
BootstrapOk { peer: PeerId("1AauQwngUA6DWkBHWwfqHMDQB4eum5GViciWvPsnCtM6ke"), num_remaining: 0 }

I.e. there are always two different events with num_remaining: 0 at the end. I think there is either a bug in the discovery algorithm, or the counter of remaining peer IDs to be discovered has an off-by-one error.

I've also seen that the initial value of the counter varies from 11 to 13. This might be normal nondeterminism, though; I'm not completely familiar with all the internal details of this DHT implementation.
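
(Editor's note: for context, the kind of loop in which such events can be observed looks roughly like the sketch below. This is a hedged, minimal example and not the reporter's actual setup; it assumes a libp2p version from around the time of this issue (~0.39, where Swarm exposes behaviour_mut()) and async-std with the "attributes" feature. The peer/address names in the comment are placeholders.)

// Minimal sketch: start a Kademlia bootstrap and print the resulting BootstrapOk events.
use futures::StreamExt;
use libp2p::{
    identity,
    kad::{record::store::MemoryStore, BootstrapOk, Kademlia, KademliaEvent, QueryResult},
    swarm::SwarmEvent,
    PeerId, Swarm,
};

#[async_std::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let key = identity::Keypair::generate_ed25519();
    let local_peer_id = PeerId::from(key.public());

    let transport = libp2p::development_transport(key).await?;
    let store = MemoryStore::new(local_peer_id);
    let behaviour = Kademlia::new(local_peer_id, store);
    let mut swarm = Swarm::new(transport, behaviour, local_peer_id);

    // At least one known peer is needed before bootstrapping, e.g.
    // (placeholder names, not real values):
    // swarm.behaviour_mut().add_address(&remote_peer_id, remote_addr);
    if let Err(e) = swarm.behaviour_mut().bootstrap() {
        eprintln!("bootstrap could not start: {:?}", e);
    }

    loop {
        if let SwarmEvent::Behaviour(KademliaEvent::OutboundQueryCompleted {
            result: QueryResult::Bootstrap(Ok(BootstrapOk { peer, num_remaining })),
            ..
        }) = swarm.select_next_some().await
        {
            println!("BootstrapOk {{ peer: {}, num_remaining: {} }}", peer, num_remaining);
        }
    }
}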

@mxinden (Member) commented Jul 6, 2021

> I.e. there are always two different events with num_remaining: 0 at the end. I think there is either a bug in the discovery algorithm, or the counter of remaining peer IDs to be discovered has an off-by-one error.

Great catch @izolyomi! From a quick look, the issue seems to be here:

let num_remaining = remaining.len().saturating_sub(1) as u32;
if let Some(target) = remaining.next() {
    let info = QueryInfo::Bootstrap {
        peer: target.clone().into_preimage(),
        remaining: Some(remaining)
    };
    let peers = self.kbuckets.closest_keys(&target);
    let inner = QueryInner::new(info);
    self.queries.continue_iter_closest(query_id, target.clone(), peers, inner);
}
Some(KademliaEvent::OutboundQueryCompleted {
    id: query_id,
    stats: result.stats,
    result: QueryResult::Bootstrap(Ok(BootstrapOk { peer, num_remaining }))
})

The code does two things:

  1. Spawn the next Query via self.queries.continue_iter_closest.
  2. Report the result of the previous query via OutboundQueryCompleted.

num_remaining is calculated for the freshly spawned query instead of for the query that just finished. A change along the lines of the following should fix it:

- let num_remaining = remaining.len().saturating_sub(1) as u32;
+ let num_remaining = remaining.len() as u32;
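
(Editor's note: to make the off-by-one concrete, here is a toy, standalone simulation of the counter logic above — plain Rust, not the actual libp2p code. With three refresh targets left after the initial self-lookup, the buggy variant reports 2, 1, 0, 0 while the fixed variant reports 3, 2, 1, 0, matching the duplicated trailing zero in the report.)

fn main() {
    // Pretend three bucket-refresh lookups remain after the initial self-lookup.
    let targets = vec!["t1", "t2", "t3"];

    // Buggy: `len().saturating_sub(1)` is evaluated before `next()`, so it
    // describes the query about to be spawned; the last two completed queries
    // therefore both report 0.
    let mut remaining = targets.clone().into_iter();
    loop {
        let num_remaining = remaining.len().saturating_sub(1);
        let next = remaining.next();
        println!("buggy num_remaining: {}", num_remaining);
        if next.is_none() {
            break;
        }
    }
    // -> 2, 1, 0, 0

    // Fixed: `len()` before `next()` is exactly the number of lookups still
    // to run after the one that just completed.
    let mut remaining = targets.into_iter();
    loop {
        let num_remaining = remaining.len();
        let next = remaining.next();
        println!("fixed num_remaining: {}", num_remaining);
        if next.is_none() {
            break;
        }
    }
    // -> 3, 2, 1, 0
}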

@izolyomi do you want to send in a patch?


> I've also seen that the initial value of the counter varies from 11 to 13. This might be normal nondeterminism, though; I'm not completely familiar with all the internal details of this DHT implementation.

This is normal. The variation is the result of the code below:

let mut remaining = remaining.unwrap_or_else(|| {
    debug_assert_eq!(&peer, local_key.preimage());
    // The lookup for the local key finished. To complete the bootstrap process,
    // a bucket refresh should be performed for every bucket farther away than
    // the first non-empty bucket (which are most likely no more than the last
    // few, i.e. farthest, buckets).
    self.kbuckets.iter()
        .skip_while(|b| b.is_empty())
        .skip(1) // Skip the bucket with the closest neighbour.
        .map(|b| {
            // Try to find a key that falls into the bucket. While such keys can
            // be generated fully deterministically, the current libp2p kademlia
            // wire protocol requires transmission of the preimages of the actual
            // keys in the DHT keyspace, hence for now this is just a "best effort"
            // to find a key that hashes into a specific bucket. The probabilities
            // of finding a key in the bucket `b` with at most 16 trials are as
            // follows:
            //
            // Pr(bucket-255) = 1 - (1/2)^16 ~= 1
            // Pr(bucket-254) = 1 - (3/4)^16 ~= 1
            // Pr(bucket-253) = 1 - (7/8)^16 ~= 0.88
            // Pr(bucket-252) = 1 - (15/16)^16 ~= 0.64
            // ...
            let mut target = kbucket::Key::from(PeerId::random());
            for _ in 0..16 {
                let d = local_key.distance(&target);
                if b.contains(&d) {
                    break;
                }
                target = kbucket::Key::from(PeerId::random());
            }
            target
        })
        .collect::<Vec<_>>()
        .into_iter()
});
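
(Editor's note: the probabilities in the comment follow from the fact that a uniformly random key falls into the j-th farthest bucket with probability 2^-j, so at least one of 16 random keys hits it with probability 1 - (1 - 2^-j)^16; the varying initial count then comes from how many buckets lie beyond the first non-empty one, which depends on the node's random key and the peers it knows. A standalone snippet, not part of libp2p, reproducing the numbers:)

fn main() {
    for j in 1u32..=5 {
        let p_single = 0.5f64.powi(j as i32);        // per-trial probability of hitting the bucket
        let p_hit = 1.0 - (1.0 - p_single).powi(16); // at least one hit in 16 trials
        println!("Pr(bucket-{}) ~= {:.2}", 256 - j, p_hit);
    }
}
// Prints approximately: 1.00, 0.99, 0.88, 0.64, 0.40 for buckets 255..251.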

@izolyomi (Contributor, Author) commented Jul 7, 2021

@mxinden Thanks for the quick reply and confirmation. I also suspected that saturating_sub() was the culprit here, but the surrounding logic was beyond my current, limited understanding.

As the patch seems trivial, I can easily send a one-liner if you prefer the usual GitHub PR workflow, but feel free to resolve it in any faster way if that's more practical on your side. 🤷

rubdos added a commit to rubdos/rust-libp2p that referenced this issue Jul 9, 2021
Fixes libp2p#2121

Co-authored-by: Max Inden <mail@max-inden.de>
Co-authored-by: @izolyomi
mxinden closed this as completed in f600f7a on Jul 9, 2021