When Nexus observes or is told that a VMM is unexpectedly no longer resident on a sled where it lived previously, Nexus takes control of the state machine (assuming the sled has abandoned it) and moves the VMM to Failed. Similar logic applies to migrations: if a migration-participant VMM has disappeared, then the sled has also abdicated the state machine for that side of the migration, and Nexus should also move that migration-half to Failed.
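The proposed behavior can be sketched roughly as follows. This is a minimal illustration and not the actual Nexus code: the `Migration` struct, the `MigrationState` enum, the integer VMM IDs, and the `fail_abandoned_half` function are all hypothetical names invented here, and the choice to leave an already-terminal half untouched is an assumption rather than something this issue specifies.

```rust
/// Hypothetical per-side migration state; the names are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum MigrationState {
    Pending,
    InProgress,
    Completed,
    Failed,
}

impl MigrationState {
    /// A terminal state is never overwritten (an assumption of this sketch).
    fn is_terminal(self) -> bool {
        matches!(self, Self::Completed | Self::Failed)
    }
}

/// Hypothetical migration record with one state per participating VMM.
#[derive(Debug)]
struct Migration {
    source_vmm: u32,
    target_vmm: u32,
    source_state: MigrationState,
    target_state: MigrationState,
}

/// Called after Nexus has decided that `gone_vmm` is no longer resident on
/// its sled and has moved that VMM to Failed. Marks the matching half of the
/// migration Failed as well. Returns true if the record was changed.
fn fail_abandoned_half(m: &mut Migration, gone_vmm: u32) -> bool {
    // Pick the half of the migration that the vanished VMM was driving.
    let half = if m.source_vmm == gone_vmm {
        &mut m.source_state
    } else if m.target_vmm == gone_vmm {
        &mut m.target_state
    } else {
        // The VMM is not a participant in this migration.
        return false;
    };

    // Do not clobber a half that already finished or already failed.
    if half.is_terminal() {
        return false;
    }

    *half = MigrationState::Failed;
    true
}
```

Only the half belonging to the vanished VMM is touched; the other side is still owned by its sled and keeps reporting its own state.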
Greg and I discussed this at length yesterday. My opinion is that this is not all that urgent to fix, as the migration table is currently only used to indicate to instance-update sagas when an in-progress migration has completed or failed. Its main purpose is to tell the update saga whether migration IDs need to be unset and/or the active VMM ID has changed. If either VMM has transitioned to Failed, the update saga already knows that it needs to clean up a VMM and potentially unset the migration IDs. So, it would be nice if the update saga also cleaned up the migration records, but it's the only consumer of the migration table anyway, and leaving those records in progress when a VMM moves to Failed isn't actually a problem as far as the update saga is concerned.
I do still think it's worth fixing, especially if the migration table is ever used for other purposes in the future. If we were to, for example, use it to generate a UI showing an instance's migration history, it would, of course, be wrong to leave behind permanently InProgress migrations after a VMM goes to Failed. But it's not terribly urgent IMO.