Flyte System Context Diagram

The Flyte System Context diagram illustrates the high-level interactions between the Flyte platform and its external environment.

At the center is the Flyte Service, a Go-based service that exposes a gRPC-compatible API through Connect RPC and manages the lifecycle of workflows, tasks, and runs. It is composed of several sub-services, including Admin, Runs, and Tasks.
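
To make the API layer concrete, here is a minimal sketch of how a client could build a unary Connect RPC call over plain HTTP with the JSON codec. The base URL, the service name `example.flyte.v1.RunService`, the method `GetRun`, and the `runId` field are hypothetical placeholders, not names taken from the Flyte API. The request is only constructed, never sent.

```python
import json
import urllib.request


def build_connect_request(base_url: str, service: str, method: str,
                          message: dict) -> urllib.request.Request:
    """Build (but do not send) a Connect unary request using the JSON codec."""
    # Connect maps each RPC to the HTTP path /<fully.qualified.Service>/<Method>.
    url = f"{base_url.rstrip('/')}/{service}/{method}"
    body = json.dumps(message).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Connect-Protocol-Version": "1",
        },
        method="POST",
    )


# Hypothetical endpoint, service, method, and field names, for illustration only.
request = build_connect_request(
    "http://localhost:8080",
    "example.flyte.v1.RunService",
    "GetRun",
    {"runId": "abc123"},
)
print(request.get_method(), request.full_url)
```

The same endpoint can typically also be reached by gRPC clients, which is what makes Connect a convenient fit for a service consumed by both a CLI and a browser-based UI.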

Users (such as data scientists and ML engineers) interact with the system primarily through CLI tools (like flytectl) or a Web UI. Flyte Copilot is a specialized CLI tool that runs as a sidecar or init container within Kubernetes pods to manage data movement.

The system relies on several external infrastructure components:

  • PostgreSQL Database: Stores persistent metadata about workflows, projects, and execution history.
  • Cloud Storage: Uses a multi-cloud abstraction (StowStore) to store large data artifacts in AWS S3, Google Cloud Storage, or Azure Blob Storage.
  • Kubernetes API: Flyte orchestrates workflow execution by interacting with Kubernetes to manage pods and custom resources. The Flyte Executor component specifically acts as a Kubernetes controller.
  • Secret Management Services: Integrates with AWS Secrets Manager, Google Secret Manager, Azure Key Vault, and HashiCorp Vault to securely inject credentials into tasks.
  • Observability Stack: Exports metrics to Prometheus and distributed traces through OpenTelemetry to backends such as Jaeger, for system monitoring and debugging.
  • External Frameworks: Supports distributed computing frameworks like Spark, Ray, and Dask via their respective Kubernetes operators.
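
The multi-cloud storage idea in the list above can be illustrated with a small sketch that picks a backend from the URI scheme of an artifact location. This is not StowStore's actual API: the `BACKENDS` table, the `StorageLocation` type, the `resolve_location` function, and the scheme names used here are assumptions made for illustration.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical scheme-to-backend table; the real configuration may differ.
BACKENDS = {
    "s3": "AWS S3",
    "gs": "Google Cloud Storage",
    "abfs": "Azure Blob Storage",
}


@dataclass(frozen=True)
class StorageLocation:
    backend: str    # human-readable backend name
    container: str  # bucket or container name
    key: str        # object path inside the container


def resolve_location(uri: str) -> StorageLocation:
    """Map an artifact URI to a backend, container, and object key."""
    parsed = urlparse(uri)
    if parsed.scheme not in BACKENDS:
        raise ValueError(f"unsupported storage scheme: {parsed.scheme!r}")
    return StorageLocation(
        backend=BACKENDS[parsed.scheme],
        container=parsed.netloc,
        key=parsed.path.lstrip("/"),
    )


location = resolve_location("s3://my-bucket/metadata/run-1/outputs.pb")
print(location)
```

Keeping the scheme lookup in one place means the rest of the system can pass around opaque URIs and stay unaware of which cloud provider holds the data.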

Key Architectural Findings:

  • Flyte is a gRPC-based service using Connect RPC for its API layer.
  • It uses a PostgreSQL database for metadata persistence and Cloud Storage (S3/GCS/Azure) for data artifacts.
  • Kubernetes is the primary orchestration engine, with Flyte acting as a controller for custom resources.
  • The system supports multiple secret managers including AWS, GCP, Azure, and HashiCorp Vault.
  • Observability is handled through Prometheus for metrics and OpenTelemetry for distributed tracing.
  • Flyte Copilot serves as a sidecar/init container for data handling within execution pods.
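
The relationships summarized above can also be written down as a small edge list, which is a convenient text form of a context diagram. The sketch below is an assumption-laden reading of the description, not an export of the actual diagram: the edge labels and the `dependencies_of` helper are hypothetical.

```python
# (source, target, interaction) triples read off the description above.
CONTEXT_EDGES = [
    ("User", "CLI Tools", "runs commands"),
    ("User", "Web UI", "browses"),
    ("CLI Tools", "Flyte Service", "calls API"),
    ("Web UI", "Flyte Service", "calls API"),
    ("Flyte Service", "PostgreSQL Database", "stores metadata"),
    ("Flyte Service", "Cloud Storage", "stores data artifacts"),
    ("Flyte Service", "Kubernetes API", "manages pods and custom resources"),
    ("Flyte Service", "Secret Management Services", "fetches credentials"),
    ("Flyte Service", "Observability Stack", "exports metrics and traces"),
    ("Flyte Service", "External Frameworks", "runs jobs via operators"),
]


def dependencies_of(node: str) -> list:
    """Return the sorted names of everything `node` points to."""
    return sorted(target for source, target, _ in CONTEXT_EDGES if source == node)


for name in dependencies_of("Flyte Service"):
    print(name)
```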