Make SocketManager worker count configurable #1115

Merged: 3 commits into StackExchange:master on Feb 3, 2020

Conversation

@naile (Contributor) commented Apr 7, 2019

With async-heavy workloads the default Shared instance's 10 workers are not always enough. With a dedicated SocketManager the worker count is only 5 and not configurable.

This PR

  • Adds an overload to SocketManager exposing workerCount so that it is configurable.
  • Updates the ConfigurationOptions XML comment to align with how the SocketManager.Shared implementation works.

Should solve #1035
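
For illustration, usage along these lines should become possible once this ships - a minimal sketch only; the exact overload shape is discussed further down in this thread, and the endpoint and worker count are placeholder values:

using StackExchange.Redis;

// Sketch: a dedicated SocketManager with a custom worker count, attached to the
// connection via ConfigurationOptions.SocketManager.
// "localhost:6379" and the count of 20 are illustrative values only.
var options = ConfigurationOptions.Parse("localhost:6379");
options.SocketManager = new SocketManager("app-io", workerCount: 20);

var muxer = ConnectionMultiplexer.Connect(options);
var db = muxer.GetDatabase();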

@mgravell (Collaborator) commented Apr 8, 2019

To echo what I said in the linked issue - can I ask for more info about the scenarios where this is manifesting? I just want to make sure we're solving the correct problem. Do you have some logs/traces/etc. that indicate we're exhausting the dedicated pool? Also: what is the configuration here? How many multiplexers? How many connections? Is this a cluster? etc...

I want to better understand the scenario in which this would be necessary.

@naile (Contributor, Author) commented Apr 8, 2019

Sure. Our current scenario is a message processing system. We're trying to achieve a good balance between throughput and CPU usage. We process many small messages concurrently, and each message incurs ~5-10 Redis operations, all async. These are our findings from trying to find the sweet spot for how many consumers to run concurrently with different settings. Processed messages per second is the metric we care about. This was measured on our production setup: Azure P3 Redis, not clustered, processing on a 4 CPU machine, .NET Core 2.2 console app using the latest version of StackExchange.Redis. When testing we were the sole consumer of this Redis instance.
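
For context, the per-message work is roughly of this shape - a purely hypothetical sketch, with illustrative operations and key names rather than our actual production code:

using System;
using System.Threading.Tasks;
using StackExchange.Redis;

// Hypothetical consumer body: ~5-10 small, awaited Redis operations per message.
static async Task ProcessMessageAsync(IDatabase db, string messageId)
{
    await db.StringIncrementAsync("stats:processed");                    // bump a counter
    await db.HashSetAsync($"msg:{messageId}", "state", "done");          // record status
    await db.KeyExpireAsync($"msg:{messageId}", TimeSpan.FromHours(1));  // expire the hash
    await db.SortedSetAddAsync("recent", messageId, DateTimeOffset.UtcNow.ToUnixTimeSeconds());
    // ... a few more small operations of the same shape
}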

Single Multiplexer (using the 10-worker Shared SocketManager):
1 consumer = 67/s
2 consumers = 150/s
3 consumers = 155/s
4 consumers = 215/s
6 consumers = 230/s
8 consumers = 270/s
12 consumers = 290/s
16 consumers = 320/s

2 Multiplexers (using the 10-worker Shared SocketManager):
4 consumers = 280/s
8 consumers = 450/s
16 consumers = 530/s

2 Multiplexers (using a dedicated 5-worker SocketManager):
8 consumers = frequent crashes with TimeoutExceptions
16 consumers = frequent crashes with TimeoutExceptions

4 Multiplexers (using a dedicated 5-worker SocketManager):
8 consumers = 480/s
10 consumers = 540/s <----- sweet spot? Had very high CPU usage though.
16 consumers = frequent crashes with TimeoutExceptions

Single Multiplexer with a dedicated 16-worker SocketManager (created via reflection):
4 consumers = 460/s
8 consumers = 630/s
16 consumers = 1020/s <---- same CPU usage as 10 consumers with 4 multiplexers

So for our use case it seems a single Multiplexer with a SocketManager that has a larger thread pool is a lot more performant. It's not TimeoutExceptions that are our main issue, even though we can get them when using a smaller SocketManager worker pool.
When we get them they usually look like this with 0 available mgr:

The timeout was reached before the message could be written to the output buffer, and it was not sent, timeout: 5000, outbound: 0KiB, inbound: 0KiB, inst: 1, qu: 1, qs: 1, aw: True, bw: RecordingTimeout, rs: ReadAsync, ws: Idle, in: 0, serverEndpoint: Unspecified/OMITTED-URL:6380, mgr: 0 of 5 available, clientName: Cint.Statistics.Service, IOCP: (Busy=0,Free=1000,Min=16,Max=1000), WORKER: (Busy=2,Free=681,Min=16,Max=682), v: 2.0.593.37019 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)

@mgravell (Collaborator) commented Apr 8, 2019

When we get them they usually look like this with 0 available mgr:

Perfect, in that case - seems legit. You've obviously done a lot of work here to understand context etc - but you'd be amazed how often we get requests to change things without that level of effort, just a random "this might fix it?". It surprises me that you've managed to saturate the pool without "cluster", though... a small part of me is wondering "what are they doing?", but... we may never know!


In terms of review feedback - since we already have a public constructor that takes useHighPrioritySocketThreads, I'd rather keep that available on any new constructor. Rather than adding public SocketManager(string name = null, int workerCount = DEFAULT_WORKERS) (which already has ambiguity issues with the existing public SocketManager(string name = null)), what I'd rather see is:

public SocketManager(string name) // <== remove optional param here
    : this(name, DEFAULT_WORKERS, false) { }
public SocketManager(string name, bool useHighPrioritySocketThreads)
    : this(name, DEFAULT_WORKERS, useHighPrioritySocketThreads) { }
public SocketManager(string name = null, int workerCount = 0, // this *was* the private .ctor
    bool useHighPrioritySocketThreads = false)
{
    if (workerCount <= 0) workerCount = DEFAULT_WORKERS;
    // ... rest of the "real" code here
}

reasons:

  • new callers will default to the "real" API
  • API is binary-compatible for existing callers
  • everything is available (we won't need another PR later to add another .ctor)
  • the DEFAULT_WORKERS value isn't baked into the API - I'd rather retain this as an implementation detail
  • the count seems a more likely parameter to want to use than the high-pri, so: push it earlier

I'm happy to merge and make those changes myself, or you're welcome to make them as part of the PR. Just let me know your preference.
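
For what it's worth, here is a quick sketch of how call sites would resolve under the overload set proposed above (the variable and argument names are illustrative; the point is that existing code keeps binding to a constructor while new code can opt into the worker count):

// Existing call sites keep compiling and binding as before:
var a = new SocketManager("legacy");            // -> (string) overload
var b = new SocketManager("legacy", true);      // -> (string, bool) overload

// New callers land on the "real" constructor and can set the count explicitly;
// passing 0 (or omitting workerCount) falls back to DEFAULT_WORKERS internally.
var c = new SocketManager("custom-io", workerCount: 25);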

@naile (Contributor, Author) commented Apr 8, 2019

That makes sense, I can update the PR. Would you prefer replacing the private ctor or calling it?
I.e.:

public SocketManager(string name = null, int workerCount = 0, bool useHighPrioritySocketThreads = false)
    : this(name, useHighPrioritySocketThreads, workerCount) { }

private SocketManager(string name, bool useHighPrioritySocketThreads, int workerCount)
{
    if (workerCount <= 0) workerCount = DEFAULT_WORKERS; // <-- this is new
    // ... rest of the "real" code here
}

@mgravell (Collaborator) commented Apr 8, 2019

might as well replace it, i.e. make it public and swap the args order

@humayun-ahmed commented

Do you have any idea when this PR may be merged?

@mgravell (Collaborator) commented Feb 3, 2020

merging, thanks

@mgravell merged commit 76c4462 into StackExchange:master on Feb 3, 2020
@etuzel commented Mar 10, 2020

@mgravell any plans for when 2.1.0 will be released with these changes?

@mgravell (Collaborator) commented

I have another PR I want to review and dogfood, as it could significantly help on cluster; then we should be good for a deploy.
