
Implement Multinode connection pool #139

Closed

srleyva wants to merge 10 commits into master from StaticNodeList

Conversation

@srleyva (Contributor) commented Sep 13, 2020

  • This commit adds a connection pool that takes a static
    list of nodes and distributes the load.

  • Adds a trait for setting connection distribution;
    defaults to RoundRobin.

#57

@russcam (Contributor) left a comment:

The implementation looks good. Would a sniffing connection pool be implemented on top of this? I think sniffing is a trickier implementation, but it is probably more valuable; the current connection pool implementation may need to change to accommodate interior mutability of a collection of Connection types.

@srleyva (Contributor, Author) commented Sep 16, 2020

I believe your assessment is correct as far as changing the ConnectionPool interface to account for interior mutability. It seems the only way to do that across threads is a mutex type. I think there's overhead in requiring every ConnectionPool type to carry a Mutex even if it's not dynamic, but it's the only way I see to do this. Any thoughts?
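For illustration, a minimal sketch of the kind of interior mutability being discussed; StaticPool and its fields are hypothetical names, not types from this PR:

```rust
use std::sync::Mutex;

// Hypothetical sketch: interior mutability behind &self methods.
#[derive(Debug)]
struct StaticPool {
    urls: Vec<String>,
    // next_url() must advance this cursor through &self, so it lives
    // in a Mutex; every pool pays this locking cost even if its node
    // list is never mutated.
    cursor: Mutex<usize>,
}

impl StaticPool {
    fn next_url(&self) -> String {
        let mut cursor = self.cursor.lock().unwrap();
        let url = self.urls[*cursor % self.urls.len()].clone();
        *cursor += 1;
        url
    }
}
```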

@russcam (Contributor) commented Sep 21, 2020

Thinking about sniffing further and discussing with folks on the clients team: the connection pool is responsible for managing one or more connections (a connection being the details about a node in the cluster, such as its Uri) and data related to those connections, such as whether the pool supports sniffing. The Transport is responsible for acting on this data, i.e. the Transport should perform sniffing and (re)seed the pool if the pool supports it.

With this in mind, a sniffing connection pool would likely hold an Arc<RwLock<Vec<Connection>>> (or a similar construct) to allow multi-threaded reads through immutable references, with a single writer through a mutable reference. A question I have is how this kind of connection pool advertises itself, e.g. through the ConnectionPool trait with something like:

```rust
pub trait ConnectionPool: Debug + dyn_clone::DynClone + Sync + Send {
    /// Gets a reference to the next [Connection]
    fn next(&self) -> &Connection;

    /// Whether this pool supports being reseeded with fresh node info
    fn reseedable(&self) -> bool {
        false
    }

    /// Replaces the pool's connections; a no-op by default
    fn reseed(&self, _connections: Vec<Connection>) {}
}
```

We would need to play around to get the right type and function design for this. StaticNodeListConnectionPool might then be more aptly named MultiNodeConnectionPool, with functions to construct sniffable and non-sniffable (i.e. static) versions.
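As a rough sketch of that direction (hypothetical names and field layout, not the PR's actual code): one wrinkle is that the trait's next(&self) -> &Connection cannot hand out a reference that outlives the RwLock read guard, so a pool like this would likely return an owned Connection instead, matching the later commit note that the connection should be owned by its current user.

```rust
use std::sync::{
    atomic::{AtomicUsize, Ordering},
    Arc, RwLock,
};

// `Connection` stands in for the client's connection type.
#[derive(Debug, Clone)]
pub struct Connection {
    url: String,
}

#[derive(Debug, Clone)]
pub struct MultiNodeConnectionPool {
    // Many concurrent readers; a single writer during a reseed.
    connections: Arc<RwLock<Vec<Connection>>>,
    index: Arc<AtomicUsize>,
    reseedable: bool, // false for the static (non-sniffing) constructor
}

impl MultiNodeConnectionPool {
    /// Round-robins over the current node list. A reference can't
    /// escape the RwLock guard, so the caller gets an owned clone.
    pub fn next(&self) -> Connection {
        let connections = self.connections.read().unwrap();
        let i = self.index.fetch_add(1, Ordering::Relaxed) % connections.len();
        connections[i].clone()
    }

    pub fn reseedable(&self) -> bool {
        self.reseedable
    }

    /// Swaps in a freshly sniffed node list behind the write lock.
    pub fn reseed(&self, connections: Vec<Connection>) {
        *self.connections.write().unwrap() = connections;
    }
}
```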

* The connection should be owned by the current user of said connection
Comment on lines 385 to 388:

```rust
// NodesInfo::new(&self, NodesInfoParts::None)
// .send()
// .await
// .expect("Could not retrieve nodes for refresh");
```
srleyva (Contributor, Author) commented:

This is technically a recursive call, which requires returning a futures::future::BoxFuture; I'm still playing around with making that work without breaking the current send implementation.
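A minimal sketch of the pattern being described, with a hypothetical function name; boxing the returned future is what makes the recursion well-typed:

```rust
use futures::future::BoxFuture;

// An async call chain that re-enters itself can't be a plain
// `async fn`, because its future type would be recursively sized;
// boxing the returned future breaks the cycle.
fn send_with_reseed(retries_left: u32) -> BoxFuture<'static, ()> {
    Box::pin(async move {
        if retries_left > 0 {
            // The recursive call awaits the boxed future.
            send_with_reseed(retries_left - 1).await;
        }
    })
}
```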


A Member commented:

I understand reseeding the pool would be synchronous with the request? We may have a race condition if two threads see reseedable() == true and both start reseeding.

It would be better for reseeding to be a background periodic operation; send() can then use the current list of nodes.
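For illustration, a minimal sketch of that suggestion, assuming tokio and using hypothetical names:

```rust
use std::sync::{Arc, RwLock};
use std::time::Duration;

// A periodic background task refreshes the shared node list, so
// send() only ever takes cheap read locks on whatever list is current.
fn spawn_reseeder(connections: Arc<RwLock<Vec<String>>>) {
    tokio::spawn(async move {
        let mut interval = tokio::time::interval(Duration::from_secs(30));
        loop {
            interval.tick().await;
            // A real implementation would sniff the cluster here
            // (e.g. via a nodes-info request); this placeholder just
            // rewrites the list.
            let fresh = vec!["http://localhost:9200".to_string()];
            *connections.write().unwrap() = fresh;
        }
    });
}
```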

srleyva (Contributor, Author) replied:

My current understanding (concurrency is hard, so it may be wrong :D) is that if two threads see reseedable() == true, it'll kick off two serial reseeds, as the reseed routine acquires a write lock. I don't predict this to be a likely event, but it definitely needs to be accounted for; I thought this trade-off would be OK given the complexity of adding a subroutine for triggering reseeds. The problem I see is that reseeding would block current ConnectionPool requests until the write lock is dropped. If you think kicking off multiple serial requests may be problematic, I'm more than willing to change the implementation to a background operation. :)

russcam (Contributor) replied:

I envisage the implementation doing something like the following (a rough sketch follows the list):

  1. An API request is initiated.
  2. The connection pool indicates whether it is reseedable and needs reseeding.
  3. If reseeding is needed, a thread-safe signal is set to indicate that reseeding is in progress.
  4. A reseed request is started, either in the synchronous flow of the request or on a separate thread. A separate thread is probably preferable, as the request that initiates reseeding is then not impacted by the reseeding process; it does make the implementation more complex, though.
  5. Upon receiving the reseed response, a write lock is acquired to update the connection pool. Failure in the reseeding process should be logged (I think we can add a TODO for this and skip it for now, as the client does not currently have logging in place, but should in future).
  6. The thread-safe signal is cleared to indicate that reseeding is no longer in progress.
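A minimal sketch of steps 3 to 6 (the names are illustrative, not the PR's types), using an AtomicBool compare-and-swap as the thread-safe signal:

```rust
use std::sync::{
    atomic::{AtomicBool, Ordering},
    Arc, RwLock,
};

struct ReseedGuard {
    connections: Arc<RwLock<Vec<String>>>,
    reseeding: AtomicBool,
}

impl ReseedGuard {
    /// Step 3: returns true only for the single thread that flips the
    /// flag from false to true and therefore "owns" this reseed.
    fn try_start(&self) -> bool {
        self.reseeding
            .compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
            .is_ok()
    }

    /// Steps 5-6: apply the sniffed node list under the write lock,
    /// then clear the in-progress signal.
    fn finish(&self, new_nodes: Vec<String>) {
        *self.connections.write().unwrap() = new_nodes;
        self.reseeding.store(false, Ordering::SeqCst);
    }
}
```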

@srleyva (Contributor, Author) commented Sep 21, 2020

WIP, as I still need to implement the NodesInfo parts, but I thought I'd get feedback on the current reseed implementation and the changes to the ConnectionPool trait. :)

@russcam (Contributor) commented Sep 21, 2020

Really appreciate your effort, @srleyva! I've left some comments.


@elasticmachine (Collaborator) commented:
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@srleyva force-pushed the StaticNodeList branch 2 times, most recently from 0ccc20d to fedd490 on September 23, 2020 at 03:26
@srleyva (Contributor, Author) commented Sep 23, 2020

First pass at the node sniff request, following the Elastic docs. I thought that even if we aren't sure of the final design, having the request parts up would make things easier.

Manual tests for round-robin load balancing on a local docker-compose ES cluster look quite promising when seeded with http://localhost:9200 😄

Ping Response: 200 OK Url: http://172.23.0.3:9200/
Ping Response: 200 OK Url: http://172.23.0.6:9200/
Ping Response: 200 OK Url: http://172.23.0.5:9200/
Ping Response: 200 OK Url: http://172.23.0.2:9200/
Ping Response: 200 OK Url: http://172.23.0.3:9200/
Ping Response: 200 OK Url: http://172.23.0.6:9200/
Ping Response: 200 OK Url: http://172.23.0.5:9200/
Ping Response: 200 OK Url: http://172.23.0.2:9200/
Ping Response: 200 OK Url: http://172.23.0.3:9200/

@srleyva changed the title from "Implement static node list connection pool" to "Implement Multinode connection pool" on Sep 23, 2020
@srleyva (Contributor, Author) commented Sep 24, 2020

I'll play around with using a background thread to reseed over the next few days. Thanks for your patience and guidance on this, as I am still very new to Rust. 😄

* Style changes per code review
@srleyva (Contributor, Author) commented Jan 18, 2021

Hey @russcam, is there any update on this PR? Apologies for the lack of communication. It's feature complete according to the original requirements. If there's anything else that's needed, please let me know 😄

@russcam (Contributor) commented Jan 27, 2021

Thanks for the ping, @srleyva 👍 Hoping to get a chance to look at this again this week, unless @swallez beats me to it!

@mfelsche commented:
What is holding this PR back? We would like to use the official client, but without at least a way to target multiple Elasticsearch nodes, this is not feasible for us. Is there any way I could help out to get this over the finish line?

@tommilligan (Contributor) commented:

I'm also interested in getting this merged. The branch looks stale and based on old dependencies; I'll rebase it onto master, update to newer dependencies, and open it as a new PR.

@mfelsche replied:
Great! @tommilligan, ping me if you need another hand or pair of eyes. :)

@tommilligan (Contributor) replied:

@mfelsche I've opened the rebased PR at #189

@swallez (Member) commented Aug 23, 2024

Superseded by PR #189

@swallez closed this on Aug 23, 2024