Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add penalty for nodes unavailability in latency-based routing #587

Conversation

nikolay-komarevskiy
Copy link
Contributor

@nikolay-komarevskiy nikolay-komarevskiy commented Aug 26, 2024

Description

In the current implementation of the latency-based routing, the probability of choosing a node as a routing URL is proportional to the node's score. The score is calculated as an inverse value of the moving average of latency measurements over some sliding window. Hence, nodes with smaller latencies produce higher score values and are more likely to be selected. However, in the current approach when a node was unavailable (within the sliding window), it was completely removed from routing. This was a "harsh" punishment, which is not always optimal, as latency can experience sporadic spikes.

In this PR we introduce a more sophisticated and customizable way of calculating node's score, where the combination of node's latency and availability are used to produce the final score. Node's score is now penalized, if it was unavailable within the sliding window measurements and not set to zero as before. A node is only removed from routing if it was unavailable in the last measurement.

Approach:

  • We store two separate sliding arrays: availabilities and latencies for each node
  • Latency score is computed as an unnormalized value:
$$latency\_score=\frac{\sum\limits_{i=0}^{win\_size-1}w_i * latency_i^{-1}}{\sum\limits_{i=0}^{win\_size-1}w_i }$$

Note that inverse of latency is used to produce higher scores for smaller latencies.

  • Availability score is computed as a normalized $[0.0, 1.0]$ value:
$$availability\_score=\frac{\sum\limits_{i=0}^{win\_size-1}w_i * availability_i}{\sum\limits_{i=0}^{win\_size-1}w_i }$$

$availability_i=0$, if the node is unavailable and $1$ if it is available.

  • Weights $w_i$ are computed via an exponential decay formula:
$$w_i=\exp(−\lambda * i),~~i = 0, ..., win\_size-1$$

This allows to take recency factor into account, i.e. more recent observations contribute to the score with higher weights.
Note, users can also provide their custom weights. For example, to exclude recency factor, all $w_i=1$ could be provided as input.

  • The combined score is computed via a simple product:
$$combined\_score = availability\_score * latency\_score$$

This allows to emphasize the importance of both metrics in the final score.

Example:

let latencies_secs = [None,Some(0.190),Some(0.195),Some(0.180),Some(0.195),Some(0.180),Some(0.185),Some(0.2),Some(0.192),Some(0.215),Some(0.175),None,Some(0.190),None,Some(0.187),None,Some(0.165),Some(0.189),None,Some(0.192),Some(0.200),Some(0.187),Some(0.178),Some(0.185)]];

image

How Has This Been Tested?

Additional tests in latency_based_routing.rs have been introduced.

Checklist:

  • The title of this PR complies with Conventional Commits.
  • I have edited the CHANGELOG accordingly.
  • I have made corresponding changes to the documentation.

@nikolay-komarevskiy nikolay-komarevskiy changed the base branch from main to dynamic_route August 26, 2024 09:54
@nikolay-komarevskiy nikolay-komarevskiy marked this pull request as ready for review August 27, 2024 12:46
@nikolay-komarevskiy nikolay-komarevskiy requested a review from a team as a code owner August 27, 2024 12:46
@adamspofford-dfinity adamspofford-dfinity merged commit 572e555 into dfinity:dynamic_route Aug 27, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants