Workflow Run Lifecycle State Machine
The Workflow Run Lifecycle State Machine diagram illustrates the progression of a workflow run (represented as an 'Action' in the Flyte backend) through various execution phases.
The lifecycle begins in the QUEUED state when an action is first enqueued. It then moves through a resource-allocation phase and an environment-setup phase before entering the active running state.
For standard tasks, the run transitions from the running state to a terminal state such as succeeded, failed, aborted, or timed out. A key feature of the code is the retry logic: if a run fails or times out but has remaining retry attempts, the TaskActionReconciler resets it to the QUEUED state.
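The retry decision described above can be sketched in a few lines. This is a minimal Python illustration of the rule, not the reconciler's actual Go code: the `Phase` enum, the `resolve_phase` function, and the `attempts`/`max_attempts` parameters are hypothetical names, and only QUEUED, FAILED, and TIMED_OUT are phase names taken from the text.

```python
from enum import Enum


class Phase(Enum):
    # QUEUED, FAILED and TIMED_OUT are named in the text; the remaining
    # member names are illustrative labels, not confirmed enum values.
    QUEUED = "queued"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    TIMED_OUT = "timed_out"
    ABORTED = "aborted"


# Outcomes that may be retried, per the description above.
RETRYABLE = {Phase.FAILED, Phase.TIMED_OUT}


def resolve_phase(result: Phase, attempts: int, max_attempts: int) -> Phase:
    """Return the phase to record after an attempt ends with `result`.

    A retryable outcome is sent back to QUEUED while the attempt count
    is still below the maximum; every other outcome is kept as-is.
    """
    if result in RETRYABLE and attempts < max_attempts:
        return Phase.QUEUED
    return result
```

Under these assumptions, `resolve_phase(Phase.FAILED, 1, 3)` yields `Phase.QUEUED`, while the same failure on the third of three attempts stays `Phase.FAILED`. Aborted and succeeded runs are never requeued.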
Condition actions introduce a PAUSED state in which execution waits for an external signal. The run leaves PAUSED and proceeds once the signal arrives; if the signal is not received or the run is cancelled, it ends in a terminal failure or aborted state instead.
The ActionRepo ensures state integrity by enforcing that phase transitions generally move forward, except during retries where it allows moving from a failed state back to queued.
Key Architectural Findings:
- Workflow runs are primarily managed as 'Actions' using the ActionPhase enum defined in flyteidl2/common/phase.proto.
- The TaskActionReconciler in executor/pkg/controller/taskaction_controller.go manages the core transitions based on plugin execution results.
- Retry logic is implemented by transitioning from FAILED or TIMED_OUT back to QUEUED if the current attempt count is less than the maximum allowed.
- Condition actions utilize a PAUSED state to await external signals, as documented in the ActionPhase proto comments.
- The database layer (ActionRepo) implements a guard condition in SQL to prevent invalid backward phase transitions, except for specific retryable phases.
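The SQL guard mentioned in the last finding can be illustrated with a conditional UPDATE whose WHERE clause rejects backward moves. The sketch below uses Python's built-in sqlite3 module purely for demonstration. The `actions` table, its `phase_rank` column, the `RANK` ordering, and the `set_phase` helper are all assumptions made for this example and do not reflect the real ActionRepo schema or query.

```python
import sqlite3

# Hypothetical numeric ordering of phases; only the relative order matters.
RANK = {"QUEUED": 1, "RUNNING": 2, "SUCCEEDED": 3, "FAILED": 4, "TIMED_OUT": 5}

# The update applies only if the move is forward, or if it is a retry
# (FAILED or TIMED_OUT going back to QUEUED). Otherwise no row matches.
GUARDED_UPDATE = """
UPDATE actions
SET phase = :new, phase_rank = :new_rank
WHERE id = :id
  AND (
        phase_rank < :new_rank
     OR (:new = 'QUEUED' AND phase IN ('FAILED', 'TIMED_OUT'))
  )
"""


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding one queued action, 'a1'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE actions (id TEXT PRIMARY KEY, phase TEXT, phase_rank INTEGER)"
    )
    conn.execute("INSERT INTO actions VALUES ('a1', 'QUEUED', 1)")
    return conn


def set_phase(conn: sqlite3.Connection, action_id: str, new_phase: str) -> bool:
    """Attempt a phase change; return True only if the guard let it through."""
    cur = conn.execute(
        GUARDED_UPDATE,
        {"id": action_id, "new": new_phase, "new_rank": RANK[new_phase]},
    )
    return cur.rowcount == 1
```

Putting the check inside the UPDATE itself, rather than reading the row and then writing it, means a stale or out-of-order writer simply updates zero rows. The caller can detect that from the affected-row count.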