Flyte Domain Data Model
Flyte's data model revolves around three concepts: Runs, Actions, and Events. The corresponding entities are primarily managed within the runs.repository.models package, and their message types are defined via gRPC/Protobuf in flyteidl2.
Core Entities
- Action: The central entity representing a unit of work. In the database, Runs are simply root actions (where ParentActionName is null). Actions can be of different types: Tasks, Traces, or Conditions. They store their state, specification, and execution details as serialized protobufs (ActionSpec, ActionDetails, DetailedInfo).
- ActionEvent: Represents a phase transition or a significant event in the lifecycle of an action attempt. It provides an append-only history of what happened during execution, including error information and timestamps.
- Project: A top-level container for organizing workflows, tasks, and runs.
- Task: Represents the definition of a task, including its project, domain, name, and version. It serves as a template for Action executions.
- Trigger: Defines how a task should be automatically initiated (e.g., on a schedule). Triggers are associated with specific tasks and maintain a revision history.
- Secret: Manages sensitive information required by tasks. Secrets are scoped to projects and domains and are typically backed by Kubernetes Secrets in the underlying infrastructure.
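The Action and ActionEvent entities above can be sketched in a few lines of Python. This is only an illustrative model, not the actual Go structs in runs.repository.models: the class layout, the field names other than the parent action name, and the phase strings are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ActionType(Enum):
    TASK = "task"
    TRACE = "trace"
    CONDITION = "condition"


@dataclass(frozen=True)
class ActionEvent:
    # One phase transition for a given attempt; frozen so it cannot be edited later.
    attempt: int
    phase: str                    # hypothetical phase names, e.g. "RUNNING"
    error: Optional[str] = None


@dataclass
class Action:
    name: str
    action_type: ActionType
    parent_action_name: Optional[str] = None  # None marks a root action, i.e. a Run
    spec: bytes = b""                         # stands in for a serialized ActionSpec
    events: List[ActionEvent] = field(default_factory=list)

    @property
    def is_run(self) -> bool:
        # A Run is simply an action without a parent.
        return self.parent_action_name is None

    def record(self, event: ActionEvent) -> None:
        # Append-only: history is extended, never rewritten.
        self.events.append(event)

    def current_phase(self) -> Optional[str]:
        # The latest event determines the phase the action is in now.
        return self.events[-1].phase if self.events else None


run = Action(name="a0", action_type=ActionType.TASK)
child = Action(name="a1", action_type=ActionType.TRACE, parent_action_name="a0")
child.record(ActionEvent(attempt=1, phase="RUNNING"))
child.record(ActionEvent(attempt=1, phase="FAILED", error="boom"))
```

In this sketch the current state of an action is derived from the last event rather than stored separately, which is one common way to use an append-only history; the source does not say whether Flyte also keeps a denormalized phase on the Action record.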
Relationships
- Action to Action: A self-referencing relationship where an action can have a ParentActionName, forming a hierarchy (Run -> Action -> Sub-action).
- Action to ActionEvent: One action can have many events, each representing a state change or attempt.
- Task to Action: An action (of type TASK) refers to a specific task definition.
- Task to Trigger: A task can have multiple triggers defined for it.
- Project to Run/Action: All runs and actions are scoped within a project and domain.
- Secret to Project/Domain: Secrets are scoped to ensure they are only accessible to authorized workloads within the same project/domain.
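The self-referencing Action to Action relationship can be illustrated by rebuilding the Run -> Action -> Sub-action hierarchy from parent names alone. The following is a minimal sketch under assumed data: the action names and the helper functions build_children and walk are hypothetical and do not come from the Flyte codebase.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# (action name, parent action name); a parent of None marks the run's root action.
rows: List[Tuple[str, Optional[str]]] = [
    ("root", None),
    ("fetch", "root"),
    ("train", "root"),
    ("train-shard-0", "train"),
]


def build_children(rows: List[Tuple[str, Optional[str]]]) -> Dict[Optional[str], List[str]]:
    """Index actions by parent so each node's children can be looked up directly."""
    children: Dict[Optional[str], List[str]] = defaultdict(list)
    for name, parent in rows:
        children[parent].append(name)
    return children


def walk(children: Dict[Optional[str], List[str]], name: str, depth: int = 0) -> List[Tuple[int, str]]:
    """Return (depth, name) pairs in depth-first order, starting at the given action."""
    out = [(depth, name)]
    for child in children.get(name, []):
        out.extend(walk(children, child, depth + 1))
    return out


children = build_children(rows)
roots = children[None]           # actions without a parent are the runs
tree = walk(children, roots[0])
```

Because the hierarchy is encoded only through the parent reference, no separate run entity is needed to recover it; the root is whichever action has no parent.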
Metadata and Specifications
- TaskActionSpec: Contains execution-time configuration like labels, annotations, environment variables, and resource overrides.
- Run: Stores detailed runtime metadata such as storage URIs for inputs/outputs and the digest of the resolved task spec.
- TaskExecutionMetadata: Provides high-level metadata for an action, including who executed it and the trigger that initiated it.
Key Architectural Findings
- Runs and Actions are stored in the same database table, distinguished by the presence of a parent action name.
- ActionEvent serves as an append-only log of phase transitions for actions, enabling detailed execution history tracking.
- The system uses a 'Project -> Domain -> Name -> Version' hierarchy for identifying tasks and actions.
- Metadata like RunSpec and RunInfo are stored as serialized protobuf blobs within the Action record for efficiency and flexibility.
- Secrets are managed as a separate service but are tightly integrated with the project/domain scoping of the rest of the system.
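The single-table design and the append-only event log can be sketched with an in-memory SQLite database. The schema below is a hypothetical simplification: the table names, column names, and phase values are assumptions for illustration and are not taken from Flyte's actual migrations, and the blob column merely stands in for a serialized protobuf.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actions (
    project TEXT NOT NULL,
    domain TEXT NOT NULL,
    name TEXT NOT NULL,
    parent_action_name TEXT,   -- NULL for a root action, i.e. a run
    action_spec BLOB,          -- placeholder for a serialized protobuf
    PRIMARY KEY (project, domain, name)
);
CREATE TABLE action_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    action_name TEXT NOT NULL,
    attempt INTEGER NOT NULL,
    phase TEXT NOT NULL
);
""")

# Runs and child actions live in the same table.
conn.executemany(
    "INSERT INTO actions VALUES (?, ?, ?, ?, ?)",
    [
        ("demo", "development", "run-1", None, b"\x0a\x00"),
        ("demo", "development", "step-a", "run-1", b"\x0a\x00"),
    ],
)

# Events are only ever inserted, never updated, so the rows form a history.
conn.executemany(
    "INSERT INTO action_events (action_name, attempt, phase) VALUES (?, ?, ?)",
    [("step-a", 1, "QUEUED"), ("step-a", 1, "RUNNING"), ("step-a", 1, "SUCCEEDED")],
)

# Listing runs is a filter on the shared table rather than a separate table.
runs = [r[0] for r in conn.execute(
    "SELECT name FROM actions WHERE parent_action_name IS NULL ORDER BY name")]

# The execution history of an action is its events in insertion order.
history = [r[0] for r in conn.execute(
    "SELECT phase FROM action_events WHERE action_name = ? ORDER BY id", ("step-a",))]
```

Storing the spec as an opaque blob keeps the table schema stable when the protobuf message gains fields, at the cost of not being able to filter on those fields in SQL.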