GCP Secret Manager Integration

Flyte integrates with Google Cloud Platform (GCP) Secret Manager by injecting an init-container into task Pods. The init-container uses the Google Cloud SDK (gcloud) to fetch each secret and write it to a shared in-memory volume that the main task container can read.

Enabling GCP Secret Manager

To enable the GCP Secret Manager integration, you must configure the Flyte Pod Webhook to use the GCP secret manager type. This is typically done in the FlytePropeller or Webhook configuration.

webhook:
  secretManagerTypes:
    - GCP
  gcpSecretManager:
    sidecarImage: "gcr.io/google.com/cloudsdktool/cloud-sdk:alpine"
    resources:
      requests:
        cpu: "200m"
        memory: "500Mi"
      limits:
        cpu: "200m"
        memory: "500Mi"

The GCPSecretManagerConfig struct in flyteplugins/go/tasks/pluginmachinery/secret/config/config.go defines these parameters:

type GCPSecretManagerConfig struct {
	SidecarImage string                      `json:"sidecarImage" pflag:",Specifies the sidecar docker image to use"`
	Resources    corev1.ResourceRequirements `json:"resources" pflag:"-,Specifies resource requirements for the init container."`
}

Security and Permissions

The Pod's Google Service Account (GSA) must have the Secret Manager Secret Accessor role (roles/secretmanager.secretAccessor) for the secrets it needs to access.

Flyte assumes that:

  1. The secrets exist in the same GCP project as the Flyte cluster.
  2. The Pod is associated with a GSA, ideally via Workload Identity.

If the GSA lacks permissions, the init-container will fail, preventing the task Pod from starting.

How Injection Works

When a task requests a secret, the GCPSecretManagerInjector (found in flyteplugins/go/tasks/pluginmachinery/secret/gcp_secret_manager.go) modifies the Pod specification:

  1. Shared Volume: It adds an EmptyDir volume with Medium: Memory named gcp-secret-vol.
  2. Init-Container: For each secret requested, it appends an init-container named gcp-pull-secret-<index>.
  3. Command: The init-container runs a gcloud command to fetch the secret version and save it to the shared volume.
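The three steps above can be sketched as a small Pod mutation. This is an illustration only, not the injector's actual code: Volume, Container, and PodSpec are simplified stand-ins for the k8s.io/api/core/v1 types, injectSecret is a hypothetical helper, and two details are assumptions (the index is taken from the number of init-containers already present, and the volume is added only once). Volume mounts are omitted for brevity.

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes Pod types (assumption: the real
// injector operates on k8s.io/api/core/v1 objects instead).
type Volume struct {
	Name   string
	Medium string // "Memory" asks for a memory-backed EmptyDir
}

type Container struct {
	Name    string
	Image   string
	Command []string
}

type PodSpec struct {
	Volumes        []Volume
	InitContainers []Container
}

const secretVolumeName = "gcp-secret-vol"

// injectSecret is a hypothetical helper that mirrors the documented steps:
// make sure the shared volume exists, then append one init-container that
// runs the fetch command for a single secret.
func injectSecret(pod *PodSpec, image string, command []string) {
	hasVolume := false
	for _, v := range pod.Volumes {
		if v.Name == secretVolumeName {
			hasVolume = true
			break
		}
	}
	if !hasVolume {
		pod.Volumes = append(pod.Volumes, Volume{Name: secretVolumeName, Medium: "Memory"})
	}

	// Assumption: the index is the count of init-containers already on the Pod.
	name := fmt.Sprintf("gcp-pull-secret-%d", len(pod.InitContainers))
	pod.InitContainers = append(pod.InitContainers, Container{
		Name:    name,
		Image:   image,
		Command: command,
	})
}

func main() {
	pod := &PodSpec{}
	image := "gcr.io/google.com/cloudsdktool/cloud-sdk:alpine"
	injectSecret(pod, image, []string{"sh", "-ec", "echo fetch first secret"})
	injectSecret(pod, image, []string{"sh", "-ec", "echo fetch second secret"})

	for _, c := range pod.InitContainers {
		fmt.Println(c.Name)
	}
}
```

Requesting two secrets in this sketch yields one shared volume and two init-containers, which matches the one-container-per-secret behavior described above.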

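The formatGCPSecretAccessCommand function generates the command that each init-container runs. The command first tries a combined <group>/versions/<version> argument and, if that fails, falls back to passing the version with --secret=<group>: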

func formatGCPSecretAccessCommand(secret *core.Secret) []string {
	secretDir := strings.ToLower(filepath.Join(GCPSecretMountPath, secret.Group))
	secretPath := strings.ToLower(filepath.Join(secretDir, secret.GroupVersion))
	args := fmt.Sprintf(
		"gcloud secrets versions access %[1]s/versions/%[2]s --out-file=%[4]s || gcloud secrets versions access %[2]s --secret=%[1]s --out-file=%[4]s; chmod +rX %[3]s %[4]s",
		secret.Group,
		secret.GroupVersion,
		secretDir,
		secretPath,
	)
	return []string{"sh", "-ec", args}
}
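To see what this function produces for a concrete secret, the following standalone sketch reproduces it with two substitutions that are assumptions: a local Secret struct stands in for core.Secret, and GCPSecretMountPath is set to /etc/flyte/secrets, the mount directory documented on this page. The secret name My-Secret is made up for the example.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Assumed value, taken from the mount path documented on this page.
const GCPSecretMountPath = "/etc/flyte/secrets"

// Secret is a stand-in for core.Secret with only the two fields used here.
type Secret struct {
	Group        string
	GroupVersion string
}

// formatGCPSecretAccessCommand has the same body as the function above,
// apart from the stand-in Secret type.
func formatGCPSecretAccessCommand(secret *Secret) []string {
	secretDir := strings.ToLower(filepath.Join(GCPSecretMountPath, secret.Group))
	secretPath := strings.ToLower(filepath.Join(secretDir, secret.GroupVersion))
	args := fmt.Sprintf(
		"gcloud secrets versions access %[1]s/versions/%[2]s --out-file=%[4]s || gcloud secrets versions access %[2]s --secret=%[1]s --out-file=%[4]s; chmod +rX %[3]s %[4]s",
		secret.Group,
		secret.GroupVersion,
		secretDir,
		secretPath,
	)
	return []string{"sh", "-ec", args}
}

func main() {
	// "My-Secret" is a made-up secret name used only for illustration.
	cmd := formatGCPSecretAccessCommand(&Secret{Group: "My-Secret", GroupVersion: "latest"})
	fmt.Println(cmd[2])
}
```

Note that only the file paths are lowercased. The group and version passed to gcloud keep their original casing, so in this example the secret is requested as My-Secret but written under /etc/flyte/secrets/my-secret/latest.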

Permission Fix

The generated command ends with chmod +rX on both the secret's directory and the downloaded file. This is necessary because gcloud writes the file with 0600 permissions, which would otherwise prevent a non-root user in the main task container from reading the secret.
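The effect of chmod +rX on permission bits can be approximated with a few lines of Go. This is a rough model, not part of Flyte: addReadAndSearch is a hypothetical helper, and it assumes a typical umask of 022, under which +r grants read to everyone and X grants execute only to directories or to files that already have an execute bit.

```go
package main

import (
	"fmt"
	"io/fs"
)

// addReadAndSearch approximates `chmod +rX` under an assumed umask of 022:
// read is granted to user, group, and others, while execute (search) is
// granted only to directories or to files that already carry an execute bit.
func addReadAndSearch(mode fs.FileMode, isDir bool) fs.FileMode {
	perm := mode.Perm() | 0o444
	if isDir || perm&0o111 != 0 {
		perm |= 0o111
	}
	return perm
}

func main() {
	// A secret file written with 0600 becomes readable by all users.
	fmt.Printf("%04o\n", uint32(addReadAndSearch(0o600, false))) // 0644
	// A directory with 0700 also becomes traversable by all users.
	fmt.Printf("%04o\n", uint32(addReadAndSearch(0o700, true))) // 0755
}
```

The capital X matters here: the directory needs the search bit so that other users can reach the file, while the secret file itself stays non-executable.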

Secret Mount Paths

Secrets are mounted as files at /etc/flyte/secrets/<secret_group>/<secret_version>. The path is lowercased, so a group named My-Secret is written under /etc/flyte/secrets/my-secret.

Flyte also injects environment variables into the task container to help the SDK locate these secrets:

  • FLYTE_SECRETS_DEFAULT_DIR: Set to /etc/flyte/secrets.
  • FLYTE_SECRETS_FILE_PREFIX: Set to an empty string.
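As an illustration of how a process inside the task container could locate a secret from these two variables, here is a minimal sketch. The helpers secretFilePath and envOr are hypothetical and not part of any Flyte SDK, and applying the prefix to the final path component is an assumption. The group my-secret and version 1 are placeholder values.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// secretFilePath is a hypothetical helper that builds
// <defaultDir>/<group>/<prefix><version>, lowercasing the group and version
// to match the lowercased paths the init-container writes.
func secretFilePath(defaultDir, filePrefix, group, groupVersion string) string {
	return filepath.Join(
		defaultDir,
		strings.ToLower(group),
		filePrefix+strings.ToLower(groupVersion),
	)
}

// envOr returns the value of key, or fallback when the variable is unset.
// An empty but set variable (such as the file prefix) is returned as-is.
func envOr(key, fallback string) string {
	if v, ok := os.LookupEnv(key); ok {
		return v
	}
	return fallback
}

func main() {
	dir := envOr("FLYTE_SECRETS_DEFAULT_DIR", "/etc/flyte/secrets")
	prefix := envOr("FLYTE_SECRETS_FILE_PREFIX", "")

	path := secretFilePath(dir, prefix, "my-secret", "1")
	data, err := os.ReadFile(path)
	if err != nil {
		// Expected when run outside a task Pod, where no secret is mounted.
		fmt.Println("secret not available:", err)
		return
	}
	fmt.Printf("read %d bytes from %s\n", len(data), path)
}
```

In practice the Flyte SDK performs this lookup for you; the sketch only shows how the directory, prefix, group, and version combine into a file path.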

Limitations and Requirements

  • Mount Requirements: Only Secret_FILE and Secret_ANY mount requirements are supported. Secret_ENV_VAR is not supported for the GCP integration.
  • Secret Identification: Both Group (the secret name in GCP) and GroupVersion (the version ID or latest) must be provided in the secret request.
  • Project Locality: The injector currently expects secrets to be in the same GCP project as the cluster.
  • Init-Container Overhead: Each requested secret results in a separate init-container, and init-containers run one after another. For tasks requiring many secrets, this can increase Pod startup latency.
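The first two limitations can be expressed as a pre-flight check on a secret request. The sketch below is hypothetical: MountType, SecretRequest, and validateGCPSecret are illustrative names, with the MountType constants standing in for Secret_ANY, Secret_ENV_VAR, and Secret_FILE rather than reusing Flyte's generated types.

```go
package main

import (
	"errors"
	"fmt"
)

// MountType is a stand-in for Flyte's secret mount requirement enum.
type MountType int

const (
	MountAny    MountType = iota // stands in for Secret_ANY
	MountEnvVar                  // stands in for Secret_ENV_VAR
	MountFile                    // stands in for Secret_FILE
)

// SecretRequest is a simplified stand-in for a secret request on a task.
type SecretRequest struct {
	Group        string
	GroupVersion string
	Mount        MountType
}

// validateGCPSecret is a hypothetical check that applies the documented
// rules: Group and GroupVersion are both required, and only the ANY and
// FILE mount requirements are accepted.
func validateGCPSecret(s SecretRequest) error {
	if s.Group == "" || s.GroupVersion == "" {
		return errors.New("GCP secrets require both Group and GroupVersion")
	}
	switch s.Mount {
	case MountAny, MountFile:
		return nil
	default:
		return fmt.Errorf("unsupported mount requirement %d: only ANY and FILE are supported", s.Mount)
	}
}

func main() {
	requests := []SecretRequest{
		{Group: "my-secret", GroupVersion: "latest", Mount: MountFile},
		{Group: "my-secret", GroupVersion: "latest", Mount: MountEnvVar},
		{Group: "my-secret", Mount: MountAny},
	}
	for _, r := range requests {
		if err := validateGCPSecret(r); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", r.Group)
	}
}
```

A check like this surfaces a misconfigured request before the Pod is created, instead of leaving it to show up later as an init-container failure.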

Troubleshooting

Init-Container Failures

If a task Pod remains in Init:Error or Init:CrashLoopBackOff, check the logs of the gcp-pull-secret-N container:

kubectl logs <pod-name> -c gcp-pull-secret-0

Common causes include:

  • Permission Denied: The GSA does not have roles/secretmanager.secretAccessor.
  • Secret Not Found: The Group or GroupVersion specified in the Flyte task does not match the secret name or version in GCP Secret Manager.
  • Workload Identity Issues: The Kubernetes Service Account (KSA) is not correctly annotated or linked to the GSA.