Flyte Domain Data Model

Flyte's data model centers on three concepts: Runs, Actions, and Events. These are managed primarily in the runs.repository.models package and defined via gRPC/Protobuf in flyteidl2.

Core Entities

  • Action: The central entity representing a unit of work. In the database, Runs are simply root actions (where ParentActionName is null). Actions can be of different types: Tasks, Traces, or Conditions. They store their state, specification, and execution details as serialized protobufs (ActionSpec, ActionDetails, DetailedInfo).
  • ActionEvent: Represents a phase transition or a significant event in the lifecycle of an action attempt. It provides an append-only history of what happened during execution, including error information and timestamps.
  • Project: A top-level container for organizing workflows, tasks, and runs.
  • Task: Represents the definition of a task, including its project, domain, name, and version. It serves as a template for Action executions.
  • Trigger: Defines how a task should be automatically initiated (e.g., on a schedule). Triggers are associated with specific tasks and maintain a revision history.
  • Secret: Manages sensitive information required by tasks. Secrets are scoped to projects and domains and are typically backed by Kubernetes Secrets in the underlying infrastructure.
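The Action and ActionEvent entities above can be sketched in a few lines of Python. This is only an illustration, not the actual runs.repository.models code: the field names, the ActionType enum values, and the record_event helper are assumptions made for the example, and the action_spec bytes field merely stands in for a serialized ActionSpec protobuf.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class ActionType(Enum):
    TASK = "task"
    TRACE = "trace"
    CONDITION = "condition"


@dataclass(frozen=True)
class ActionEvent:
    """One phase transition recorded for an attempt of an action."""
    action_name: str
    attempt: int
    phase: str                      # hypothetical label such as "RUNNING"
    occurred_at: datetime
    error_message: Optional[str] = None


@dataclass
class Action:
    project: str
    domain: str
    name: str
    action_type: ActionType
    parent_action_name: Optional[str] = None  # None marks a root action, i.e. a Run
    action_spec: bytes = b""                  # stand-in for a serialized ActionSpec
    events: List[ActionEvent] = field(default_factory=list)

    @property
    def is_run(self) -> bool:
        # A Run is simply an action without a parent.
        return self.parent_action_name is None

    def record_event(self, attempt: int, phase: str,
                     error_message: Optional[str] = None) -> ActionEvent:
        """Append an event; earlier events are never edited or removed."""
        event = ActionEvent(self.name, attempt, phase,
                            datetime.now(timezone.utc), error_message)
        self.events.append(event)
        return event
```

Making ActionEvent frozen mirrors the append-only intent: once recorded, an event cannot be mutated, only followed by a later one.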

Relationships

  • Action to Action: A self-referencing relationship where an action can have a ParentActionName, forming a hierarchy (Run -> Action -> Sub-action).
  • Action to ActionEvent: One action can have many events, each recording a phase transition or other significant event for a given attempt.
  • Task to Action: An action (of type TASK) refers to a specific task definition.
  • Task to Trigger: A task can have multiple triggers defined for it.
  • Project to Run/Action: All runs and actions are scoped within a project and domain.
  • Secret to Project/Domain: Secrets are scoped to ensure they are only accessible to authorized workloads within the same project/domain.
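The self-referencing Action to Action relationship means the path from a Run down to any sub-action can be recovered by following parent names. A minimal sketch, assuming the parent links are available as a plain dictionary (the ancestry function and the sample action names are hypothetical):

```python
from typing import Dict, List, Optional


def ancestry(action_name: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain from the root action (the Run) down to action_name.

    `parents` maps each action name to its parent action name,
    or to None for the root action.
    """
    chain: List[str] = []
    seen = set()
    current: Optional[str] = action_name
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
        chain.append(current)
        current = parents[current]  # KeyError if an action is unknown
    chain.reverse()
    return chain


# Run -> Action -> Sub-action
parents = {"run-1": None, "a0": "run-1", "a0-sub": "a0"}
print(ancestry("a0-sub", parents))
```

The cycle check is defensive; a well-formed hierarchy always terminates at an action whose parent is None.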

Metadata and Specifications

  • TaskActionSpec: Contains execution-time configuration like labels, annotations, environment variables, and resource overrides.
  • Run: Stores detailed runtime metadata such as storage URIs for inputs/outputs and the digest of the resolved task spec.
  • TaskExecutionMetadata: Provides high-level metadata for an action, including who executed it and the trigger that initiated it.

Key Architectural Findings

  • Runs and Actions are stored in the same database table. A row with no parent action name is a Run (a root action); a row with a parent action name is a child action.
  • ActionEvent serves as an append-only log of phase transitions for actions, enabling detailed execution history tracking.
  • The system uses a 'Project -> Domain -> Name -> Version' hierarchy for identifying tasks and actions.
  • Metadata like RunSpec and RunInfo are stored as serialized protobuf blobs within the Action record for efficiency and flexibility.
  • Secrets are managed as a separate service but are tightly integrated with the project/domain scoping of the rest of the system.
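The single-table design for Runs and Actions can be illustrated with an in-memory SQLite database. The table layout, column names, and helper functions below are assumptions made for the example and are not Flyte's actual schema; the point is only that a Run is selected with a NULL check on the parent column.

```python
import sqlite3

# Hypothetical schema: one table holds both Runs and their child actions.
SCHEMA = """
CREATE TABLE actions (
    project            TEXT NOT NULL,
    domain             TEXT NOT NULL,
    run_name           TEXT NOT NULL,
    name               TEXT NOT NULL,
    parent_action_name TEXT,          -- NULL for a Run (root action)
    action_spec        BLOB,          -- stand-in for a serialized protobuf
    PRIMARY KEY (project, domain, run_name, name)
);
"""


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def insert_action(conn, project, domain, run_name, name, parent=None, spec=b""):
    conn.execute(
        "INSERT INTO actions VALUES (?, ?, ?, ?, ?, ?)",
        (project, domain, run_name, name, parent, spec),
    )


def list_runs(conn, project, domain):
    """Runs are the rows whose parent_action_name is NULL."""
    rows = conn.execute(
        "SELECT name FROM actions "
        "WHERE project = ? AND domain = ? AND parent_action_name IS NULL "
        "ORDER BY name",
        (project, domain),
    )
    return [r[0] for r in rows]


def list_children(conn, project, domain, run_name, parent):
    """Direct children of one action within a run."""
    rows = conn.execute(
        "SELECT name FROM actions "
        "WHERE project = ? AND domain = ? AND run_name = ? "
        "AND parent_action_name = ? ORDER BY name",
        (project, domain, run_name, parent),
    )
    return [r[0] for r in rows]
```

Because every row carries project and domain, the same filters scope both Runs and child actions, which matches the project/domain scoping described above.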