-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
want a way for background tasks to await saga completion #6569
Comments
I'd like to extend the |
hawkw
added a commit
that referenced
this issue
Sep 13, 2024
Nexus API handlers (and other sagas) that start sagas have the ability to wait for the completion of the saga by `await`ing the `RunningSaga` future returned by `RunnableSaga::start`. Background tasks, on the other hand, cannot currently do this, as their only interface to the saga executor is an `Arc<dyn StartSaga>`, which provides only the [`saga_start` method][1]. This method throws away the `RunningSaga` returned by `RunnableSaga::start`, so the completion of the saga cannot be awaited. In some cases, it's desirable for background tasks to be able to await a saga running to completion. I described some motivations for this in #6569. This commit adds a new `StartSaga::saga_run` method to the `StartSaga` trait, which starts a saga *and* waits for it to finish. Since many tests use a `NoopStartSaga` type which doesn't actually start sagas, this interface still throws away most of the saga *outputs* provided by `StoppedSaga`, to make it easier for the noop test implementation to implement this method. If that's an issue in the future, we can revisit the interface, and maybe make `NoopStartSaga` return fake saga results or something. I've updated the `instance_watcher` and `instance_updater` background tasks to use the `saga_run` method, because they were written with the intent to spawn tasks that run sagas to completion --- this is necessary for how the number of concurrent update sagas is *supposed* to be limited by `instance_watcher`. I left the region-replacement tasks using `StartSaga::saga_start`, because --- as far as I can tell --- the "fire and forget" behavior is desirable there. Perhaps @jmpesp can confirm this? Closes #6569 [1] https://github.com/oxidecomputer/omicron/blob/8be99b0c0dd18495d4a98187145961eafdb17d8f/nexus/src/app/saga.rs#L96-L108
hawkw
added a commit
that referenced
this issue
Sep 13, 2024
Nexus API handlers (and other sagas) that start sagas have the ability to wait for the completion of the saga by `await`ing the `RunningSaga` future returned by `RunnableSaga::start`. Background tasks, on the other hand, cannot currently do this, as their only interface to the saga executor is an `Arc<dyn StartSaga>`, which provides only the [`saga_start` method][1]. This method throws away the `RunningSaga` returned by `RunnableSaga::start`, so the completion of the saga cannot be awaited. In some cases, it's desirable for background tasks to be able to await a saga running to completion. I described some motivations for this in #6569. This commit adds a new `StartSaga::saga_run` method to the `StartSaga` trait, which starts a saga *and* waits for it to finish. Since many tests use a `NoopStartSaga` type which doesn't actually start sagas, this interface still throws away most of the saga *outputs* provided by `StoppedSaga`, to make it easier for the noop test implementation to implement this method. If that's an issue in the future, we can revisit the interface, and maybe make `NoopStartSaga` return fake saga results or something. I've updated the `instance_watcher` and `instance_updater` background tasks to use the `saga_run` method, because they were written with the intent to spawn tasks that run sagas to completion --- this is necessary for how the number of concurrent update sagas is *supposed* to be limited by `instance_watcher`. I left the region-replacement tasks using `StartSaga::saga_start`, because --- as far as I can tell --- the "fire and forget" behavior is desirable there. Perhaps @jmpesp can confirm this? Closes #6569 [1]: https://github.com/oxidecomputer/omicron/blob/8be99b0c0dd18495d4a98187145961eafdb17d8f/nexus/src/app/saga.rs#L96-L108
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Nexus API handlers (and other sagas) that start sagas have the ability to wait for the completion of the saga by
await
ing theRunningSaga
future returned byRunnableSaga::start
. Background tasks, on the other hand, cannot currently do this, as their only interface to the saga executor is anArc<dyn StartSaga>
, which provides only the followingsaga_start
method:omicron/nexus/src/app/saga.rs
Lines 96 to 108 in 8be99b0
Note that the
RunningSaga
future is thrown away, meaning that the saga has been started, but there's no way to wait for its completion.It would be useful for background tasks to be able to wait for the saga's completion. Not being able to do means that background tasks don't really have the ability to enforce meaningful concurrency limits on the sagas they start --- they can only limit the concurrency of starting sagas, not the number of sagas that are concurrently running. It also means that the approach for avoiding Nexus replicas all operating on instances/VMMs/disks in the same order described by @gjcolombo in #6565 (comment) is much harder to implement in cases where the operation that changes the state of a database record so that it is no longer returned by a query requires a saga completing, since the background task cannot easily wait for the sagas it starts to finish before querying the database again.
See also 0582e84
The text was updated successfully, but these errors were encountered: