Load scheduling by actual request concurrency and response times #2333

Open

akevdmeer opened this issue Jul 7, 2023 · 3 comments

@akevdmeer

We have a business-critical service characterized by low request concurrency and high response-time variance, say 100 ms to 10 s (when not overloaded). Estimating the response time from the request itself is infeasible. We need to apply (adaptive) load scheduling to arbitrate between contending workloads, but we don't see how this can be done effectively with Aperture at the moment.

Is it possible to do load scheduling based on actual request concurrency, without needing to determine the request cost in tokens upfront? The flow.End() calls would seem to make it possible to track which flows are active.
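
For illustration, here is a minimal sketch of the bookkeeping I have in mind; the names are hypothetical, not the Aperture SDK's API:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// concurrencyLimiter admits a new flow only while the number of in-flight
// flows is below a fixed limit. Names here are hypothetical, not the
// Aperture SDK's API.
type concurrencyLimiter struct {
	inFlight atomic.Int64
	limit    int64
}

// Begin reserves a slot for a flow; every successful Begin must be paired
// with exactly one End, e.g. via defer.
func (c *concurrencyLimiter) Begin() bool {
	if c.inFlight.Add(1) > c.limit {
		c.inFlight.Add(-1) // over the limit: roll back and reject
		return false
	}
	return true
}

// End releases the slot taken by Begin, mirroring flow.End().
func (c *concurrencyLimiter) End() {
	c.inFlight.Add(-1)
}

func main() {
	l := &concurrencyLimiter{limit: 2}
	fmt.Println(l.Begin(), l.Begin(), l.Begin()) // true true false
	l.End()
	fmt.Println(l.Begin()) // true again after a flow ends
}
```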

@kwapik (Contributor) commented Jul 7, 2023

@tanveergill PTAL

@hdkshingala (Contributor)

@akevdmeer Please join our Slack community; we can discuss this or any other issues you have faced or need help with there.

CC: @jaidesai-fn

@harjotgill (Contributor)

@akevdmeer - Using flow.End() to track exact concurrency is certainly feasible. The bookkeeping is fairly simple, though we would have to build a token-audit mechanism to replenish tokens in case they are lost to intermittent bookkeeping failures.
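
A rough sketch of that audit idea (hypothetical types, not Aperture code): each admitted flow holds a lease, and a periodic sweep reclaims tokens for leases whose End() never arrived:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// leaseTable sketches the token-audit idea: each admitted flow holds a
// lease, and a periodic sweep replenishes tokens whose End() was lost.
// All names are hypothetical, not Aperture internals.
type leaseTable struct {
	mu     sync.Mutex
	leases map[string]time.Time // flow ID -> admission time
	ttl    time.Duration        // longest plausible flow duration
}

func newLeaseTable(ttl time.Duration) *leaseTable {
	return &leaseTable{leases: make(map[string]time.Time), ttl: ttl}
}

// admit records that a flow took a token.
func (t *leaseTable) admit(flowID string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.leases[flowID] = time.Now()
}

// end releases the lease when flow.End() arrives normally.
func (t *leaseTable) end(flowID string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.leases, flowID)
}

// sweep returns how many tokens to replenish: leases older than ttl are
// assumed lost because their End() never arrived.
func (t *leaseTable) sweep() int {
	t.mu.Lock()
	defer t.mu.Unlock()
	lost := 0
	for id, started := range t.leases {
		if time.Since(started) > t.ttl {
			delete(t.leases, id)
			lost++
		}
	}
	return lost
}

func main() {
	t := newLeaseTable(50 * time.Millisecond)
	t.admit("flow-1") // this flow will "forget" to call end()
	t.admit("flow-2")
	t.end("flow-2") // normal completion
	time.Sleep(100 * time.Millisecond)
	fmt.Println("tokens to replenish:", t.sweep()) // 1
}
```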

However, in most cases (including hard concurrency-limit scenarios), Aperture's current mechanism works, provided we can reliably detect overload from some health metric(s). At a high level, there are two parts to adaptive load scheduling:

  1. Overload detection: Health signals (latency, concurrent connections, queue-depth metrics) can be used to detect overloads, optionally along with confirmatory signals (such as a CPU metric) to reduce false positives. I am not aware of your exact setup, but we can very likely find metrics that reliably detect the overload buildup. Based on the severity of the overload, the Aperture Controller adjusts the token-bucket fill rate (a sketch of this feedback loop follows the list). We have found that, using latency as the overload-detection signal in a hard-concurrency-limit scenario (in our playground), Aperture can adaptively adjust the request rate to "discover" the concurrency limit without exact bookkeeping (see attached graph). In the same spirit, we can swap out the latency-based feedback for some other signal and adjust the request rate to match the inherent concurrency limit of your service.

  2. Scheduling workloads (prioritization): If your workloads have high response-time variance, you can switch off the latency-based token estimation algorithm and use priority levels and/or hard-coded workload tokens to schedule requests (see the scheduling sketch below). The automatic token estimation determines the "size" of each request relative to other workloads based on their response times. E.g., a low-priority but lightweight request will have a better chance of getting scheduled than another low-priority but heavier request.
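
On item 1, a minimal sketch of the feedback idea; the constants are made up for illustration, not the actual controller parameters:

```go
package main

import "fmt"

// adjustFillRate sketches the overload feedback loop from item 1: compare a
// health signal (here, smoothed latency) against a setpoint and scale the
// token-bucket fill rate by the ratio, clamped so the loop sheds load
// quickly but recovers gradually. Constants are illustrative, not
// Aperture's actual controller parameters.
func adjustFillRate(fillRate, latency, setpoint float64) float64 {
	gradient := setpoint / latency // < 1 under overload, > 1 when healthy
	if gradient < 0.5 {
		gradient = 0.5 // cap how hard one step can cut the rate
	}
	if gradient > 1.05 {
		gradient = 1.05 // ramp back up slowly to avoid oscillation
	}
	return fillRate * gradient
}

func main() {
	rate := 100.0
	for _, latencyMs := range []float64{200, 400, 800, 250, 180} {
		rate = adjustFillRate(rate, latencyMs, 250) // setpoint: 250 ms
		fmt.Printf("latency %.0fms -> fill rate %.1f\n", latencyMs, rate)
	}
}
```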
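
And on item 2, a sketch of how priorities and hard-coded tokens could interact in scheduling order; this mirrors the intent described above, not the scheduler's actual internals:

```go
package main

import (
	"fmt"
	"sort"
)

// request carries a hard-coded token cost and a priority, as in item 2.
// The queue is ordered by tokens/priority, so a lightweight low-priority
// request can still be scheduled ahead of a heavier one at the same
// priority. Illustrative only, not Aperture's scheduler internals.
type request struct {
	name     string
	tokens   float64 // hard-coded workload cost
	priority float64 // higher means more important
}

func scheduleOrder(queue []request) []request {
	sort.SliceStable(queue, func(i, j int) bool {
		return queue[i].tokens/queue[i].priority <
			queue[j].tokens/queue[j].priority
	})
	return queue
}

func main() {
	queue := []request{
		{"low-pri-heavy", 10, 1},
		{"low-pri-light", 1, 1},
		{"high-pri-heavy", 10, 5},
	}
	for _, r := range scheduleOrder(queue) {
		fmt.Println(r.name) // low-pri-light, high-pri-heavy, low-pri-heavy
	}
}
```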

We would be happy to learn more about your scenario and be more prescriptive with our advice. We can certainly build exact concurrency bookkeeping if that is what your scenario needs. But first, we are curious how the current request-rate-based system would behave in this low-concurrency, high-latency-variance scenario when using metrics other than latency to detect overloads.

PS: @tanveergill suggested that, if no other signal can alert us to overloads, we can perhaps use latency percentiles to get stable readings. The Aperture FluxMeters collect latency metrics as Prometheus summaries.
