Building Resilient Flyte Applications

Flyte provides a standardized application skeleton in the flytestdlib/app package that simplifies building resilient microservices. The skeleton manages the lifecycle of HTTP servers and background workers, integrates with the Flyte configuration system, and shuts the service down gracefully when it receives a termination signal.

In this tutorial, you will build a service that initializes a Kubernetes client, registers a background worker (a controller-runtime manager), and exposes readiness checks.

Prerequisites

To follow this tutorial, your project should depend on the following Flyte and Kubernetes libraries:

  • github.com/flyteorg/flyte/v2/flytestdlib/app
  • github.com/flyteorg/flyte/v2/flytestdlib/config
  • sigs.k8s.io/controller-runtime

Step 1: Define the Main Entry Point

The app.App struct is the primary entry point for Flyte services. It wraps a cobra.Command and handles configuration initialization automatically.

Create a main.go file:

package main

import (
	"context"
	"fmt"
	"os"

	"github.com/flyteorg/flyte/v2/flytestdlib/app"
)

func main() {
	a := &app.App{
		Name:  "my-service",
		Short: "A resilient Flyte service example",
		Setup: func(ctx context.Context, sc *app.SetupContext) error {
			// We will implement this in the next steps
			return setup(ctx, sc)
		},
	}

	if err := a.Run(); err != nil {
		fmt.Fprintf(os.Stderr, "Application failed: %v\n", err)
		os.Exit(1)
	}
}

The Run() method builds a CLI command that supports a --config flag (defaulting to ./config.yaml). It also searches for configuration in /etc/flyte/config.
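
To make the flag behavior concrete, here is a minimal sketch of a --config flag that defaults to ./config.yaml. It uses the standard library flag package only for illustration; the skeleton itself builds on cobra, and parseConfigPath is a hypothetical helper, not part of flytestdlib.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseConfigPath is a hypothetical helper that mimics a --config flag
// defaulting to ./config.yaml. It is an illustration, not Flyte's code.
func parseConfigPath(args []string) (string, error) {
	fs := flag.NewFlagSet("my-service", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // return parse errors instead of printing them
	path := fs.String("config", "./config.yaml", "path to the configuration file")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *path, nil
}

func main() {
	def, _ := parseConfigPath(nil)
	custom, _ := parseConfigPath([]string{"--config", "/tmp/dev.yaml"})
	fmt.Println("default:", def)
	fmt.Println("override:", custom)
}
```

With no arguments the helper returns the default path; passing --config replaces it.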

Step 2: Initialize Shared Resources

The Setup function receives a SetupContext, which acts as a container for shared resources like database connections, Kubernetes clients, and metrics scopes.

In your setup.go (or within main.go), initialize the Kubernetes client using the InitKubernetesClient helper:

func setup(ctx context.Context, sc *app.SetupContext) error {
	// Initialize K8s client with in-cluster config or kubeconfig fallback
	k8sClient, k8sConfig, err := app.InitKubernetesClient(ctx, app.K8sConfig{
		QPS:   100,
		Burst: 200,
	}, nil) // Pass a *runtime.Scheme if you have custom CRDs
	if err != nil {
		return fmt.Errorf("failed to init k8s client: %w", err)
	}

	// Store resources in the context for other components to use
	sc.K8sClient = k8sClient
	sc.K8sConfig = k8sConfig

	return nil
}

InitKubernetesClient (defined in flytestdlib/app/k8s.go) attempts to load the in-cluster configuration first and falls back to the default local kubeconfig if necessary.

Step 3: Register Background Workers

Many Flyte services run background tasks, such as Kubernetes controllers. Register them with sc.AddWorker. Each worker runs in its own goroutine and receives a context that is cancelled during shutdown.

import ctrl "sigs.k8s.io/controller-runtime"

func setup(ctx context.Context, sc *app.SetupContext) error {
	// ... (previous K8s initialization) ...

	mgr, err := ctrl.NewManager(sc.K8sConfig, ctrl.Options{
		Scheme: sc.K8sClient.Scheme(),
	})
	if err != nil {
		return fmt.Errorf("failed to create manager: %w", err)
	}

	// Register the manager as a background worker
	sc.AddWorker("controller-manager", func(ctx context.Context) error {
		return mgr.Start(ctx)
	})

	return nil
}

The app.App skeleton ensures that if any worker returns a fatal error, the entire application initiates a graceful shutdown.

Step 4: Configure Health and Readiness Checks

Flyte services automatically expose /healthz (always returns 200 OK) and /readyz endpoints. You can add custom logic to the readiness check using sc.AddReadyCheck.

import "net/http"

func setup(ctx context.Context, sc *app.SetupContext) error {
	// ... (previous initialization) ...

	sc.AddReadyCheck(func(r *http.Request) error {
		if sc.K8sClient == nil {
			return fmt.Errorf("kubernetes client not initialized")
		}
		return nil
	})

	// If this service only runs workers and doesn't need an HTTP API,
	// set Port to 0 to disable the server (except for health/ready checks).
	// sc.Port = 0

	return nil
}

Kubernetes readiness probes can call /readyz to determine when a Pod is ready to receive traffic.

Step 5: Graceful Shutdown

The Flyte application skeleton handles SIGINT and SIGTERM signals. When a signal is received:

  1. The application logs the signal and starts a 30-second graceful shutdown period.
  2. The HTTP server is shut down using server.Shutdown(ctx).
  3. The context passed to background workers is cancelled.
  4. The application waits for all workers to finish.

If a second signal is received during this period, the application forces an immediate exit via os.Exit(1).

Complete Example Result

By combining these steps, you have a service that:

  • Loads configuration from files or environment variables.
  • Connects to Kubernetes with sensible defaults.
  • Runs a controller manager in the background.
  • Provides standard health and readiness probes.
  • Shuts down cleanly, allowing in-flight requests and controller loops to finish.

For a real-world implementation of this pattern, see executor/cmd/main.go and executor/setup.go in the Flyte codebase.