Flyte System Context Diagram
The Flyte System Context diagram illustrates the high-level interactions between the Flyte platform and its external environment.
At the center is the Flyte Service, a Go-based service that exposes its API over gRPC using Connect RPC and manages the lifecycle of workflows, tasks, and runs. It is composed of several sub-services, including Admin, Runs, and Tasks.
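To make the API layer more concrete, here is a minimal sketch of how a unary call is shaped under the Connect protocol: an HTTP POST to `/<package>.<Service>/<Method>` with a JSON (or protobuf) body. The service name `example.v1.RunService`, the method `GetRun`, the base URL, and the message fields are hypothetical placeholders, not Flyte's actual API. The function only builds the request and sends nothing.

```python
import json


def build_connect_unary_request(base_url, service, method, message):
    """Describe a Connect-protocol unary call as plain data (no network I/O)."""
    return {
        "method": "POST",  # Connect unary RPCs are sent as HTTP POST
        "url": f"{base_url.rstrip('/')}/{service}/{method}",
        "headers": {
            "Content-Type": "application/json",
            "Connect-Protocol-Version": "1",
        },
        "body": json.dumps(message).encode("utf-8"),
    }


# Hypothetical service, method, and payload, for illustration only.
request = build_connect_unary_request(
    "http://localhost:8080/",
    "example.v1.RunService",
    "GetRun",
    {"runId": "r-123"},
)
print(request["method"], request["url"])
```

Because the call is ordinary HTTP with a predictable path, CLI tools and a browser-based UI can both talk to the same endpoint.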
Users (such as data scientists and ML engineers) interact with the system primarily through CLI tools (like flytectl) or a Web UI. Flyte Copilot is a specialized CLI tool that runs as a sidecar or init container within Kubernetes pods to manage data movement.
The system relies on several external infrastructure components:
- PostgreSQL Database: Stores persistent metadata about workflows, projects, and execution history.
- Cloud Storage: Flyte stores large data artifacts in AWS S3, Google Cloud Storage, or Azure Blob Storage through a multi-cloud abstraction (StowStore).
- Kubernetes API: Flyte orchestrates workflow execution by interacting with Kubernetes to manage pods and custom resources. The Flyte Executor component specifically acts as a Kubernetes controller.
- Secret Management Services: Flyte integrates with AWS Secrets Manager, Google Secret Manager, Azure Key Vault, and HashiCorp Vault to securely inject credentials into tasks.
- Observability Stack: Flyte exports metrics to Prometheus and emits traces through OpenTelemetry (for example, to Jaeger) for system monitoring and debugging.
- External Frameworks: Supports distributed computing frameworks like Spark, Ray, and Dask via their respective Kubernetes operators.
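The multi-cloud storage abstraction in the list above can be pictured as a single read/write interface that routes each artifact URI to a provider-specific backend. The sketch below is a simplified illustration, not StowStore's actual API: the class names, the scheme-to-provider mapping, and the in-memory backend are all assumptions made for the example.

```python
from urllib.parse import urlparse

# Assumed scheme-to-provider mapping, for this sketch only.
SCHEME_TO_PROVIDER = {
    "s3": "AWS S3",
    "gs": "Google Cloud Storage",
    "abfs": "Azure Blob Storage",
}


class InMemoryBackend:
    """Stand-in for a real provider client; keeps blobs in a dict."""

    def __init__(self, provider):
        self.provider = provider
        self._blobs = {}

    def put(self, container, key, data):
        self._blobs[(container, key)] = data

    def get(self, container, key):
        return self._blobs[(container, key)]


class ArtifactStore:
    """Hypothetical facade: callers pass a URI and never pick a provider."""

    def __init__(self):
        self._backends = {
            scheme: InMemoryBackend(provider)
            for scheme, provider in SCHEME_TO_PROVIDER.items()
        }

    def _resolve(self, uri):
        parsed = urlparse(uri)
        backend = self._backends.get(parsed.scheme)
        if backend is None:
            raise ValueError(f"unsupported storage scheme: {parsed.scheme!r}")
        # netloc is the bucket/container; the path (minus "/") is the key.
        return backend, parsed.netloc, parsed.path.lstrip("/")

    def provider_for(self, uri):
        return self._resolve(uri)[0].provider

    def write(self, uri, data):
        backend, container, key = self._resolve(uri)
        backend.put(container, key, data)

    def read(self, uri):
        backend, container, key = self._resolve(uri)
        return backend.get(container, key)


store = ArtifactStore()
store.write("s3://my-bucket/runs/r1/output.bin", b"payload")
print(store.provider_for("gs://other-bucket/x"))
```

Keeping provider selection behind one facade means the rest of the system handles artifact URIs uniformly, whichever cloud holds the data.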
Key Architectural Findings:
- Flyte is a gRPC-based service using Connect RPC for its API layer.
- It uses a PostgreSQL database for metadata persistence and Cloud Storage (S3/GCS/Azure) for data artifacts.
- Kubernetes is the primary orchestration engine, with the Flyte Executor acting as a controller for custom resources.
- The system supports multiple secret managers, including AWS Secrets Manager, Google Secret Manager, Azure Key Vault, and HashiCorp Vault.
- Observability is handled through Prometheus for metrics and OpenTelemetry for distributed tracing.
- Flyte Copilot serves as a sidecar/init container for data handling within execution pods.
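The controller role noted in the findings above follows the general Kubernetes reconcile pattern: compare the desired state recorded in a custom resource with what actually exists in the cluster, then act to close the gap. The sketch below illustrates that pattern with an in-memory stand-in for the Kubernetes API. `TaskResource`, `FakeCluster`, and the phase names are hypothetical and do not reflect the Flyte Executor's real types.

```python
from dataclasses import dataclass, field


@dataclass
class TaskResource:
    """Hypothetical custom resource describing one task to run."""
    name: str
    phase: str = "Queued"


@dataclass
class FakeCluster:
    """In-memory stand-in for the Kubernetes API (pod name -> status)."""
    pods: dict = field(default_factory=dict)

    def create_pod(self, name):
        self.pods[name] = "Running"


def reconcile(resource, cluster):
    """Run one reconcile pass; safe to call repeatedly (idempotent)."""
    pod_status = cluster.pods.get(resource.name)
    if pod_status is None:
        # Desired pod is missing: create it and record the new phase.
        cluster.create_pod(resource.name)
        resource.phase = "Running"
    elif pod_status in ("Succeeded", "Failed"):
        # Pod reached a terminal state: mirror it onto the resource.
        resource.phase = pod_status
    # A pod that is still running needs no action on this pass.
    return resource.phase


cluster = FakeCluster()
task = TaskResource(name="train-model")
reconcile(task, cluster)             # creates the pod
cluster.pods["train-model"] = "Succeeded"
print(reconcile(task, cluster))
```

Because each pass derives its action from observed state, repeating a pass after a crash or restart does not create duplicate pods.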