should unwinding instance-start sagas put the instance in `Stopped` or `Failed`? #7727

hawkw · 2025-03-04T18:42:20Z

When an instance-start saga unwinds, the compensating actions transition the instance back to the `Stopped` state. This makes sense from the perspective of "an unwinding saga node should put things back Exactly The Way They Were Before".¹ However, it's a bit weird with regard to the user's intent: arguably, an unsuccessful attempt to start the instance is equivalent to successfully starting an instance and then having it fail, in terms of the desired state for that instance. The user asked us to start it, so perhaps we should continue trying to start it.²

This could maybe be achieved by transitioning the instance to `Failed`. It's also potentially solvable by the idea @gjcolombo and I have discussed, where we store a target instance state alongside the current instance state and attempt to reconcile them when Something Happens in the Real World. This also solves stuff like #6809, which is kind of the same weird behavior in the opposite direction (we were asked to stop an instance, upon doing so we discover it's gone away, and then we immediately try to restart it, which is Goofy).
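To make the target-state idea a bit more concrete, here is a minimal sketch of what the reconciliation step might look like. All of the names here (`Current`, `Target`, `Action`, `reconcile`) are hypothetical and are not the existing Nexus types; the sketch also ignores generation numbers and the auto-restart policy entirely.

```rust
/// Hypothetical, simplified view of where the instance actually is.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Current {
    Stopped,
    Running,
    Failed,
}

/// Hypothetical record of what the user last asked for.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Target {
    Running,
    Stopped,
}

/// What the reconciler would do next (again, an illustrative type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Action {
    None,
    Start,
    Stop,
}

/// Compare the current state against the target and pick an action.
fn reconcile(current: Current, target: Target) -> Action {
    match (current, target) {
        // The user wants it running and it isn't: (re)try the start,
        // whether we got here by an unwound start saga or a failure.
        (Current::Stopped | Current::Failed, Target::Running) => Action::Start,
        // The user wants it stopped and it is still running.
        (Current::Running, Target::Stopped) => Action::Stop,
        // Already where the user wants it. This includes the #6809 case:
        // we were asked to stop and discovered the instance had gone
        // away, so there is nothing to restart.
        (Current::Running, Target::Running)
        | (Current::Stopped | Current::Failed, Target::Stopped) => Action::None,
    }
}
```

With something like this, whether the unwinding saga leaves the instance in `Stopped` or `Failed` matters less for the restart decision, since the target state carries the user's intent.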

On the other hand, the question of what is the Right Thing in this situation is complicated by some of the reasons an instance-start saga may fail. In particular, it might fail due to insufficient resources being available for the instance, in which case we probably should not retry starting it immediately, but should maybe try again in a little while if sufficient resources become available? Or perhaps not --- maybe retrying starting an instance that we couldn't find resources for should be an explicit user action? I feel a bit conflicted about that.
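One way to frame the open question is as a function from the reason the start failed to a retry decision. The sketch below is only illustrative: `StartFailure`, `Retry`, `retry_policy`, the `retry_on_capacity` flag, and the 60-second delay are all made up for this example, and the flag exists precisely because it's not settled which behavior is right.

```rust
use std::time::Duration;

/// Hypothetical classification of why an instance-start saga unwound.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum StartFailure {
    /// No sled had enough resources for the instance.
    InsufficientResources,
    /// Anything else (assumed transient for this sketch).
    Other,
}

/// Hypothetical retry decision.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Retry {
    /// Try starting again right away.
    Immediately,
    /// Try again after a delay, in case resources free up.
    After(Duration),
    /// Don't retry; wait for the user to ask again.
    OnlyOnUserRequest,
}

/// Placeholder delay; the real value (if any) is an open question.
const CAPACITY_RETRY_DELAY: Duration = Duration::from_secs(60);

/// Pick a retry decision. `retry_on_capacity` selects between the two
/// options discussed above for the insufficient-resources case.
fn retry_policy(failure: StartFailure, retry_on_capacity: bool) -> Retry {
    match failure {
        StartFailure::InsufficientResources if retry_on_capacity => {
            Retry::After(CAPACITY_RETRY_DELAY)
        }
        StartFailure::InsufficientResources => Retry::OnlyOnUserRequest,
        StartFailure::Other => Retry::Immediately,
    }
}
```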

¹ (more or less, modulo generation numbers &c.)
² At least, if auto-restart is enabled.

hawkw assigned hawkw and gjcolombo Mar 4, 2025
