Basic load balancing #78

Merged
Tratcher merged 3 commits into master from tratcher/loadrandom on Apr 23, 2020

Conversation

Tratcher
Member

I've added the config model to select a load balancing strategy per backend, and I've also implemented most of the strategies.

Fixes #75 - Power of two choices - This is the default; it has the best cost-benefit trade-off (a minimal sketch follows this list).
Fixes #74 - Least requests - Has the cost of examining all endpoints.
Fixes #73 - Random - Has degenerate cases where it may assign work to already-busy endpoints.

First - This was the only one available before. It doesn't seem like actual load balancing, but it may be useful in the scenario where you have two endpoints and you only want to switch to the second one if the first goes down. First handles that because load balancing only considers healthy endpoints.

#72 - Round-Robin - This is the only one I skipped. It's not clearly better than any of the above strategies, and it's difficult to implement over an unstable list of endpoints. Nor do we have a place to store the per-backend state needed to track which endpoint to select next.
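For illustration only, here is a minimal sketch of the power-of-two-choices selection; the type and member names are placeholders, not the PR's actual code:

using System;
using System.Collections.Generic;

static class PowerOfTwoChoicesSketch
{
    // Pick two healthy endpoints at random and keep the one with fewer in-flight
    // requests. The two samples may coincide; that's acceptable for this approximation.
    public static T Pick<T>(IReadOnlyList<T> healthyEndpoints, Func<T, int> inFlightRequests, Random random)
    {
        var first = healthyEndpoints[random.Next(healthyEndpoints.Count)];
        var second = healthyEndpoints[random.Next(healthyEndpoints.Count)];
        return inFlightRequests(first) <= inFlightRequests(second) ? first : second;
    }
}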

Note this will have merge conflicts with #50.

@Tratcher Tratcher added this to the 1.0.0-preview1 milestone Apr 21, 2020
@Tratcher Tratcher self-assigned this Apr 21, 2020
@Tratcher Tratcher requested a review from davidni April 21, 2020 18:25
/// <summary>
/// Load balancing strategies for endpoint selection.
/// </summary>
public enum LoadBalancingMode
Contributor

Custom?

Member Author

Custom load balancing would require a few different things.
A) You'd have to replace/wrap the ILoadBalancer service or middleware.
B) Would you only want that for some backends?
C) What if you have more than one type of custom load balancing for different backends?

I did a quick test and you can put "Mode": "-1" into config and it successfully binds to the enum. That means you don't need a custom value; people can just define their own entries.
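For reference, a minimal sketch of that quick test, using illustrative enum member names rather than the PR's actual definition:

using System;
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;

enum LoadBalancingMode { First, Random, PowerOfTwoChoices, LeastRequests }

class EnumBindingDemo
{
    static void Main()
    {
        // "-1" is not a defined member, but configuration binding still succeeds
        // and yields (LoadBalancingMode)(-1), so callers can define their own values.
        var config = new ConfigurationBuilder()
            .AddInMemoryCollection(new Dictionary<string, string> { ["Mode"] = "-1" })
            .Build();
        var mode = config.GetValue<LoadBalancingMode>("Mode");
        Console.WriteLine(mode); // prints -1
    }
}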

// Licensed under the MIT License.

using Microsoft.ReverseProxy.Core.RuntimeModel;

namespace Microsoft.ReverseProxy.Core.Abstractions
{
/// <summary>
/// Load balancing options.
/// </summary>
public sealed class LoadBalancingOptions
Member

Should we have mode-specific options? I could imagine wanting to add a PowerOfNChoices mode where N was a configurable option, for example, but I wouldn't want that cluttering up LoadBalancingOptions. Maybe the mode name should be a key in config, and the mode-specific options, if any, could be in the value.
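Purely as a hypothetical shape (not something proposed in this PR), that could look something like:

using System.Collections.Generic;

// Hypothetical options class: the mode name is the dictionary key and any
// mode-specific settings live in the value, so new modes don't add properties here.
public sealed class ModeKeyedLoadBalancingOptions
{
    // e.g. config: "LoadBalancing": { "PowerOfNChoices": { "N": "3" } }
    public IDictionary<string, IDictionary<string, string>> Modes { get; } =
        new Dictionary<string, IDictionary<string, string>>();
}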

Member Author

I'd rather not tackle that until we have a concrete need. I am curious what kind of config would be needed; these algorithms are pretty self-contained. The discussion I saw around PowerOfNChoices was that the benefits dropped off steeply when doing more than 2 choices. If 3 choices were still a good option but more than that wasn't, then I could see just adding a separate enum value for it.

case LoadBalancingMode.LeastRequests:
var leastRequestsEndpoint = healthyEndpoints[0];
var leastRequestsCount = leastRequestsEndpoint.ConcurrencyCounter.Value;
for (var i = 1; i < endpointCount; i++)
Contributor

It might be worth storing healthy endpoints in a min-heap data structure instead of a list when endpointCount is big enough (e.g. > 20). That would allow finding the endpoint with the least requests in O(1), in contrast to O(n) for searching a plain list. The heap could be rebuilt in BackendProber.ProbeEndpointsAsync after each probing iteration.

Member Author

The trade-off is how frequently you have to build and update that structure. Having to update it every time a request starts or completes seems like a lot of overhead.

Contributor

Yes, I understand that. But do we really need every ConcurrencyCounter update to be reflected on each request? The current linear search algorithm isn't atomic anyway, so it might return an endpoint that isn't actually the least loaded.

Member Author

Fair point, it's an approximation of least busy without locking.

Our goal at this point is getting basic implementations we can measure and compare. We'll use that data to decide what to refine later.

Contributor

@alnikola We should file a bug for this for sure; that's a good idea.

Member Author

Thinking about this more, that kind of structure would be disruptive to the other load balancing algorithms, not to mention unnecessary overhead when you're not using LeastRequests. It's also unclear how it would fit with the winnowing middleware approach.

If the heap were not updated frequently, wouldn't you end up overloading the first node?

Contributor

Keep in mind that Power of Two Choices is already O(1) and adds no overhead to ConcurrencyCounter updates. It is a good default choice when there is a large number of endpoints.

Contributor

> Thinking about this more, that kind of structure would be disruptive to the other load balancing algorithms, not to mention unnecessary overhead when you're not using LeastRequests.

We can introduce a new abstraction for the endpoint collection (e.g. IEndpointCollection : IReadOnlyList<EndpointInfo>) exposing a GetLeastLoadedEndpoint method, with at least two implementations: ListEndpointCollection and MinHeapEndpointCollection. Then, at the end of each probing iteration, if the current BackendConfig.Mode is LeastRequests and the endpoint count is big enough, BackendProber would tell IEndpointManager to swap the underlying endpoint collection to a new MinHeapEndpointCollection, or to a ListEndpointCollection otherwise.
The worst case is the mode changing to LeastRequests right after BackendProber updated the endpoint list; ListEndpointCollection would then have to fall back to a linear search to find the target item. However, that is the same as the current behavior.

> If the heap were not updated frequently, wouldn't you end up overloading the first node?

To avoid overloading, the heap update should be triggered by a heuristic based on the real request counts tracked by ProxyInvokerMiddleware rather than by a simple timer.

To sum up, I will create a separate issue for this improvement, as @davidfowl suggested, so it can be discussed separately from this PR.
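A rough sketch of the abstraction described above, with placeholder types (this EndpointInfo and its InFlightRequests member stand in for the repo's actual endpoint type and ConcurrencyCounter):

using System.Collections.Generic;
using System.Linq;

public interface IEndpointCollection : IReadOnlyList<EndpointInfo>
{
    EndpointInfo GetLeastLoadedEndpoint();
}

// List-backed implementation: linear O(n) scan, matching the current behavior and
// assuming a non-empty collection. A MinHeapEndpointCollection would instead keep a
// heap keyed by in-flight count and answer GetLeastLoadedEndpoint from the root in O(1).
public sealed class ListEndpointCollection : List<EndpointInfo>, IEndpointCollection
{
    public ListEndpointCollection(IEnumerable<EndpointInfo> endpoints) : base(endpoints) { }

    public EndpointInfo GetLeastLoadedEndpoint() =>
        this.OrderBy(e => e.InFlightRequests).First();
}

// Placeholder endpoint type for this sketch only.
public sealed class EndpointInfo
{
    public int InFlightRequests { get; set; }
}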

Contributor

Can the same endpoint appear in more than one backend cluster? If so, I'm assuming the load needs to be tracked per endpoint rather than as part of the backend structure. Since the list of backends can change dynamically, managing sorted lists for least-loaded is going to be expensive.
This is where picking the least loaded of two is beneficial. I wonder if we should add other algorithms that don't involve sorting all the endpoints, like one that assesses load relative to the average and picks the first endpoint below it? That way it can drop out on the first good candidate. Combined with a round-robin seed, so the next search starts from the last position found, it would spread the load across the servers.
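A minimal sketch of that idea, with illustrative names only: start from a rotating seed and return the first endpoint at or below the average load.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

static class BelowAverageSketch
{
    private static int _seed; // rotates so consecutive picks start at different offsets

    public static T Pick<T>(IReadOnlyList<T> healthyEndpoints, Func<T, int> inFlightRequests)
    {
        var average = healthyEndpoints.Average(inFlightRequests);
        var start = (Interlocked.Increment(ref _seed) & int.MaxValue) % healthyEndpoints.Count;
        for (var i = 0; i < healthyEndpoints.Count; i++)
        {
            var candidate = healthyEndpoints[(start + i) % healthyEndpoints.Count];
            if (inFlightRequests(candidate) <= average)
            {
                return candidate; // drop out on the first good candidate
            }
        }
        // Unreachable: the minimum is always at or below the average, but the compiler needs a return.
        return healthyEndpoints[start];
    }
}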

Contributor

Submitted issue #94

@Tratcher Tratcher mentioned this pull request Apr 23, 2020
@Tratcher Tratcher merged commit fee1ada into master Apr 23, 2020
@Tratcher Tratcher deleted the tratcher/loadrandom branch April 23, 2020 22:22