Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for listening to Azure Maintenance Events #1876

Merged
merged 37 commits into from
Oct 7, 2021

Conversation

amsoedal
Copy link
Collaborator

@amsoedal amsoedal commented Oct 5, 2021

Adding an automatic subscription to the AzureRedisEvents pubsub channel for Azure caches. This channel notifies clients of upcoming maintenance and failover events.

By exposing these events, users will be able to know about maintenance ahead of time, and can implement their own logic (e.g. diverting traffic from the cache to another database for the duration of the maintenance event) in response, with the goal of minimizing downtime and disrupted connections. We also automatically refresh our view of the topology of the cluster in response to certain events.

Here are some of the possible notifications:

// Indicates that a maintenance event is scheduled. May be several minutes from now
NodeMaintenanceScheduled,

// This event gets fired ~20s before maintenance begins
NodeMaintenanceStarting,

// This event gets fired when maintenance is imminent (<5s)
NodeMaintenanceStart,

// Indicates that the node maintenance operation is over
NodeMaintenanceEnded,

// Indicates that a replica has been promoted to primary
NodeMaintenanceFailover

@amsoedal amsoedal requested review from NickCraver, mgravell and philon-msft and removed request for NickCraver and mgravell October 5, 2021 17:37
@amsoedal
Copy link
Collaborator Author

amsoedal commented Oct 5, 2021

Hi @NickCraver, I closed #1865 and opened this one instead! Also working on addressing remaining comments from @mgravell. Big thanks to both of you!

@NickCraver
Copy link
Collaborator

I merged in an unrelated PR on the search side - updating to get those failing test runs off the radar.

NickCraver and others added 8 commits October 6, 2021 11:12
Doing a set of suggestions here following our call - cc @amsoedal! I've just merged up latest, will do a test run here but wanted to get changes to make it a bit more extensible for other clouds and easier to consume in front of your eyes. Philo added some color around the purposes for the public events (makes sense!), so was trying to make using them a bit cleaner on the client side, e.g.:
```cs
if (evt is AzureMaintenanceEvent ame && ame.NotificationType == AzureMaintenanceEvent.NotificationTypes.NodeMaintenanceStarting)
```

The extensibility model would be other cloud providers adding a PR here, which would consist in total of their handler (if different) and adding to the base `ServerMaintenanceEvent` (not meant to be externally extensible at the moment).

Can you please provide thoughts?

Co-authored-by: Michelle Soedal <ansoedal@microsoft.com>
@philon-msft
Copy link
Collaborator

@NickCraver and @mgravell , I believe all the planned changes are now complete. Feel free to merge if we're in agreement

Copy link
Collaborator

@NickCraver NickCraver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made one small tweak to avoid allocating the list every time on the azure endpoint check - looking good! Thanks for all the work and refinements here <3

src/StackExchange.Redis/ConfigurationOptions.cs Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants