Proposal: rethinking the peerstore (towards v2) #164

Closed
raulk opened this issue Apr 16, 2019 · 6 comments

Comments

@raulk
Member

raulk commented Apr 16, 2019

Context

The peerstore v1 comprises three components: address book, key book and the peer metadata store.

The address book is the hottest component: it deals with highly volatile mutable data. It temporarily, or pseudo-permanently, stores the addresses of the peers we discover and interact with.

It's up to the caller to determine the TTL of each address, in isolation and without knowing the past track record of that address. In practice, clients find it hard to choose a TTL, and end up fixating on three well-known values: 10 minutes, infinity, and infinity-1 (as a marker for something).

Instead, a TTL should be chosen based on the stability of the address: the more times we've seen it and successfully connected to it, the higher our confidence should be and the longer we want to remember that address.
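To make the idea concrete, here is a minimal sketch of a TTL derived from an address's track record rather than picked by the caller. The helper `TTLFor`, the doubling rule, and the `baseTTL` and `maxTTL` bounds are all assumptions made for illustration; none of them come from the peerstore API.

```go
package main

import (
	"fmt"
	"time"
)

// Both bounds are illustrative assumptions, not values from any libp2p release.
const (
	baseTTL = 10 * time.Minute // TTL for an address with no successful dials yet
	maxTTL  = 24 * time.Hour   // cap so that no address is remembered forever
)

// TTLFor is a hypothetical helper: it doubles the TTL for every successful
// dial recorded for an address, up to maxTTL.
func TTLFor(successes int) time.Duration {
	ttl := baseTTL
	for i := 0; i < successes && ttl < maxTTL; i++ {
		ttl *= 2
	}
	if ttl > maxTTL {
		ttl = maxTTL
	}
	return ttl
}

func main() {
	for _, n := range []int{0, 1, 3, 10} {
		fmt.Printf("%d successful dials -> remember for %v\n", n, TTLFor(n))
	}
}
```

Even in this form the caller still has to feed the track record in, which is part of why the rest of the proposal moves that bookkeeping into the peerstore itself.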

Altogether, I posit that TTLs are a suboptimal and leaky abstraction. What lies beneath is the intent to model address confidence and quality over time.

Technical proposal

The peerstore should become an autonomous agent inside the libp2p stack, capable of taking decisions and reacting to environment changes.

Its main input should be a stream of events:

  1. Address observation events: an address for a peer was reported to us, but we have not used it yet.
  2. Dial events: we dialled an address, and we either succeeded or failed.
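One possible shape for that event stream, as a rough sketch: a single event type carried over a Go channel, with a consumer that tallies counters per address. `AddrEvent`, `AddrKey`, `AddrStats` and `Consume` are hypothetical names, and peer IDs and multiaddrs are modelled as plain strings to keep the example self-contained.

```go
package main

import (
	"fmt"
	"time"
)

// EventKind distinguishes the inputs named in the proposal.
type EventKind int

const (
	AddrObserved  EventKind = iota // address reported to us, not used yet
	DialSucceeded                  // we dialled the address and connected
	DialFailed                     // we dialled the address and failed
)

// AddrEvent is a hypothetical event record.
type AddrEvent struct {
	Kind EventKind
	Peer string
	Addr string
	At   time.Time
}

// AddrKey identifies one address of one peer.
type AddrKey struct {
	Peer string
	Addr string
}

// AddrStats holds the raw counters a scoring function could work from.
type AddrStats struct {
	Observations int
	Successes    int
	Failures     int
	LastEvent    time.Time
}

// Consume drains the channel until it is closed and tallies events per address.
func Consume(events <-chan AddrEvent) map[AddrKey]*AddrStats {
	stats := make(map[AddrKey]*AddrStats)
	for ev := range events {
		key := AddrKey{Peer: ev.Peer, Addr: ev.Addr}
		s, ok := stats[key]
		if !ok {
			s = &AddrStats{}
			stats[key] = s
		}
		switch ev.Kind {
		case AddrObserved:
			s.Observations++
		case DialSucceeded:
			s.Successes++
		case DialFailed:
			s.Failures++
		}
		s.LastEvent = ev.At
	}
	return stats
}

func main() {
	events := make(chan AddrEvent, 3)
	now := time.Now()
	events <- AddrEvent{AddrObserved, "peerP", "/ip4/192.0.2.1/tcp/4001", now}
	events <- AddrEvent{DialSucceeded, "peerP", "/ip4/192.0.2.1/tcp/4001", now}
	events <- AddrEvent{DialFailed, "peerP", "/ip4/192.0.2.2/tcp/4001", now}
	close(events)

	for key, s := range Consume(events) {
		fmt.Printf("%s %s: %+v\n", key.Peer, key.Addr, *s)
	}
}
```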

The main goal of the peerstore is to optimise for storage and address quality/confidence.

  • Quality/confidence is calculated based on incoming events and a time decay function.
  • The time decay function makes us lose confidence for stored addresses over time.
  • Every time we receive a successful dial event, we increase the quality/confidence of an address.
  • Every time we receive a failed dial event, we decrement the quality/confidence of an address.
  • When confidence for an address falls below a threshold, we attempt to revalidate the address by dialling to it. If successful, we bump up the score; else we prune the address entirely.
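The rules above can be sketched as a small scoring type. This is only one way to read them: the proposal does not name a decay function, so exponential decay with a half-life is an assumption here, as are the type `Confidence`, its methods, and every constant.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// All constants are assumptions chosen for illustration.
const (
	halfLife        = 6 * time.Hour // confidence halves every 6 hours without events
	successBonus    = 1.0           // added on a successful dial
	failurePenalty  = 0.5           // subtracted on a failed dial
	revalidateBelow = 0.25          // threshold that triggers revalidation
)

// Confidence is a hypothetical per-address score with lazy time decay.
type Confidence struct {
	Score     float64
	UpdatedAt time.Time
}

// At returns the score as seen at time now, after exponential decay.
func (c Confidence) At(now time.Time) float64 {
	elapsed := now.Sub(c.UpdatedAt)
	if elapsed <= 0 {
		return c.Score
	}
	return c.Score * math.Exp2(-float64(elapsed)/float64(halfLife))
}

// RecordDial applies decay up to now, then bumps or lowers the score.
func (c *Confidence) RecordDial(success bool, now time.Time) {
	s := c.At(now)
	if success {
		s += successBonus
	} else {
		s = math.Max(0, s-failurePenalty)
	}
	c.Score, c.UpdatedAt = s, now
}

// NeedsRevalidation reports whether the decayed score fell below the threshold.
func (c Confidence) NeedsRevalidation(now time.Time) bool {
	return c.At(now) < revalidateBelow
}

// Revalidate dials the address via the supplied function. On success it bumps
// the score and returns true; on failure it returns false, meaning "prune".
func (c *Confidence) Revalidate(dial func() bool, now time.Time) bool {
	if !dial() {
		return false
	}
	c.RecordDial(true, now)
	return true
}

func main() {
	start := time.Now()
	c := Confidence{UpdatedAt: start}
	c.RecordDial(true, start)

	later := start.Add(18 * time.Hour) // three half-lives with no events
	fmt.Printf("score now: %.3f, needs revalidation: %v\n", c.At(later), c.NeedsRevalidation(later))

	keep := c.Revalidate(func() bool { return true }, later)
	fmt.Printf("kept: %v, score after revalidation: %.3f\n", keep, c.At(later))
}
```

Decay is computed lazily from `UpdatedAt`, so no background timer has to touch every address; only the threshold check needs to run periodically.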

When a component queries the peerstore for addresses for peer P, we return results sorted by confidence (descending), so high-quality addresses are dialled first.
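A minimal sketch of that query path, assuming a hypothetical `ScoredAddr` record with the address as a plain string and the confidence as a float:

```go
package main

import (
	"fmt"
	"sort"
)

// ScoredAddr is a hypothetical pairing of an address with its confidence.
type ScoredAddr struct {
	Addr       string
	Confidence float64
}

// AddrsByConfidence returns a sorted copy, highest confidence first.
// The sort is stable, so addresses with equal scores keep their input order.
func AddrsByConfidence(addrs []ScoredAddr) []ScoredAddr {
	out := make([]ScoredAddr, len(addrs))
	copy(out, addrs)
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Confidence > out[j].Confidence
	})
	return out
}

func main() {
	addrs := []ScoredAddr{
		{"/ip4/192.0.2.1/tcp/4001", 0.2},
		{"/ip4/192.0.2.2/tcp/4001", 1.7},
		{"/ip4/192.0.2.3/tcp/4001", 0.9},
	}
	for _, a := range AddrsByConfidence(addrs) {
		fmt.Printf("%.1f %s\n", a.Confidence, a.Addr)
	}
}
```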


We should also think about evolving the keybook and metadata store, although they haven't proven problematic yet.

@vyzo
Contributor

vyzo commented Apr 16, 2019

There is also the TempAddrTTL, now at 2 minutes.

@vyzo
Contributor

vyzo commented Apr 16, 2019

> When confidence for an address falls below a threshold, we attempt to revalidate the address by dialling to it. If successful, we bump up the score; else we prune the address entirely.

I don't think that's a good idea.

@dirkmc

dirkmc commented Apr 17, 2019

Should we also observe when peers successfully dial us?

@JustMaier

I really like this proposal for peerstore v2. It turns it into a living entity that helps manage network quality, something that, as far as I can tell, is missing from libp2p at the moment (I could be very wrong here, I just haven't seen anything about it). Is v1 documented somewhere?

@burdiyan

burdiyan commented Dec 8, 2022

How large is all this peer-related information? If the amount of data per peer is small, and even for hundreds of peers it is still in the order of megabytes, maybe a simpler cleanup approach could suffice? Like cleaning up less recently used peer information once in a while, instead of trying to do it in a more real-time fashion?
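For comparison, the simpler cleanup described here could look roughly like the sketch below: keep the most recently used peers and drop the rest in one pass. `LastUsed` and `PruneLRU` are hypothetical names, and peer IDs are plain strings.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// LastUsed maps a peer ID to the last time its information was used.
type LastUsed map[string]time.Time

// PruneLRU keeps the `keep` most recently used peers, deletes the others from
// the map, and returns the IDs it evicted (nil if nothing had to go).
func PruneLRU(peers LastUsed, keep int) []string {
	if keep < 0 {
		keep = 0
	}
	if len(peers) <= keep {
		return nil
	}
	ids := make([]string, 0, len(peers))
	for id := range peers {
		ids = append(ids, id)
	}
	// Most recently used first; ties are broken by ID for determinism.
	sort.Slice(ids, func(i, j int) bool {
		ti, tj := peers[ids[i]], peers[ids[j]]
		if ti.Equal(tj) {
			return ids[i] < ids[j]
		}
		return ti.After(tj)
	})
	evicted := ids[keep:]
	for _, id := range evicted {
		delete(peers, id)
	}
	return evicted
}

func main() {
	now := time.Now()
	peers := LastUsed{
		"peerA": now.Add(-48 * time.Hour),
		"peerB": now.Add(-1 * time.Hour),
		"peerC": now.Add(-5 * time.Minute),
	}
	fmt.Println("evicted:", PruneLRU(peers, 2))
	fmt.Println("remaining:", len(peers))
}
```

A caller could run this from a `time.Ticker` loop once in a while, which trades the per-address confidence signal of the proposal for much less machinery.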

@mxinden
Member

mxinden commented Dec 14, 2022

I am closing here. My rationale is (1) this is implementation specific and thus doesn't belong in the specs, (2) it is outdated and there are no plans to act upon this. Please comment in case you think this should stay open.

> How large is all this peer-related information?

In rust-libp2p it is tiny and thus duplication is not a concern.

@mxinden mxinden closed this as completed Dec 14, 2022