Configuring Storage Backends
Flyte uses a unified storage interface to interact with different backends like AWS S3, Google Cloud Storage, Azure Blob Storage, Redis, and local filesystems. You configure these backends using the Config struct in flytestdlib/storage.
Configuring S3 or Minio
To set up S3 or a Minio-compatible backend, use the StowConfig within the main Config. While Flyte supports a legacy ConnectionConfig, using StowConfig is the recommended approach as it provides more flexibility.
The following example shows how to configure a Minio backend in YAML:
storage:
  type: minio
  container: "flyte-data"
  enable-multicontainer: false
  stow:
    kind: "s3"
    config:
      auth_type: "accesskey"
      access_key_id: "minio-access-key"
      secret_key: "minio-secret-key"
      region: "us-east-1"
      endpoint: "http://localhost:9000"
      disable_ssl: "true"
In Go, you can initialize the DataStore programmatically:
import (
	"github.com/flyteorg/flyte/v2/flytestdlib/promutils"
	"github.com/flyteorg/flyte/v2/flytestdlib/storage"
)

// Mirror of the YAML above: a Minio endpoint accessed through the stow "s3" driver.
cfg := &storage.Config{
	Type:          storage.TypeMinio,
	InitContainer: "flyte-data",
	Stow: storage.StowConfig{
		Kind: "s3",
		Config: map[string]string{
			"auth_type":     "accesskey",
			"access_key_id": "minio-access-key",
			"secret_key":    "minio-secret-key",
			"region":        "us-east-1",
			"endpoint":      "http://localhost:9000",
			"disable_ssl":   "true",
		},
	},
}

// NewTestScope is convenient for examples and tests.
dataStore, err := storage.NewDataStore(cfg, promutils.NewTestScope())
if err != nil {
	// Handle the configuration error before using dataStore.
}
Configuring Local Filesystem
For local development, you can use the local storage type. This maps a local directory to a Flyte container.
cfg := &storage.Config{
	Type:          storage.TypeLocal,
	InitContainer: "testdata",
	Stow: storage.StowConfig{
		Kind: "local",
		Config: map[string]string{
			"path": "./tmp/storage",
		},
	},
}
dataStore, err := storage.NewDataStore(cfg, promutils.NewTestScope())
When using local storage, a DataReference like file://testdata/config.yaml will resolve to ./tmp/storage/config.yaml.
Configuring Redis and Scheme Routing
Flyte supports Redis for high-performance metadata storage. If you configure RedisConfig alongside a blob store (like S3), Flyte automatically uses a schemeRoutingStore. This allows you to use redis:// URIs for metadata while keeping large artifacts in s3://.
storage:
  type: s3
  container: "flyte-data"
  stow:
    kind: "s3"
    config:
      region: "us-east-1"
  redis:
    addr: "localhost:6379"
    db: 0
When RefreshConfig detects this configuration, Flyte wraps the default store and the Redis store in a schemeRoutingStore:
// flytestdlib/storage/rawstores.go
if cfg.Type != TypeRedis && len(cfg.Redis.Addr) > 0 {
	redisStore, err := NewRedisRawStore(ctx, cfg, ds.metrics)
	if err != nil {
		return err
	}
	rawStore = newSchemeRoutingStore(rawStore, redisStore, ds.metrics.copyMetrics)
}
The schemeRoutingStore dispatches operations based on the URI scheme:
- redis://<addr>/<key> goes to the Redis backend.
- All other schemes (e.g., s3://, file://) go to the default backend.
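The dispatch rule above can be sketched with the Go standard library alone. This is an illustration of the idea, not Flyte's implementation: backendFor and the "redis" / "default" labels it returns are hypothetical names introduced for this example.

```go
package main

import (
	"fmt"
	"net/url"
)

// backendFor is a hypothetical helper that mirrors the routing rule described
// above: redis:// references go to Redis, every other scheme goes to the
// default store.
func backendFor(ref string) (string, error) {
	u, err := url.Parse(ref)
	if err != nil {
		return "", fmt.Errorf("parse %q: %w", ref, err)
	}
	if u.Scheme == "redis" {
		return "redis", nil
	}
	return "default", nil
}

func main() {
	refs := []string{
		"redis://localhost:6379/metadata/key",
		"s3://flyte-data/outputs/data.pb",
		"file://testdata/config.yaml",
	}
	for _, ref := range refs {
		backend, err := backendFor(ref)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", ref, backend)
	}
}
```

Routing on the scheme alone keeps the decision independent of bucket names and key layout, which is why the two stores can sit side by side behind one interface.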
Global Storage Settings
The Config struct provides several parameters that control how Flyte interacts with storage regardless of the backend.
Multi-Container Support
By default, Flyte only allows access to the bucket defined in InitContainer. To allow access to any bucket specified in a DataReference, set MultiContainerEnabled to true.
cfg := &storage.Config{
	Type:                  storage.TypeS3,
	InitContainer:         "default-bucket",
	MultiContainerEnabled: true, // Allows s3://other-bucket/data.pb
}
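To make the access rule concrete, the following standard-library sketch decides whether a reference's bucket would be reachable under the two settings. It assumes the container is the host part of the URI; containerAllowed is a hypothetical helper, not part of flytestdlib.

```go
package main

import (
	"fmt"
	"net/url"
)

// containerAllowed is a hypothetical check, not Flyte's code. It reports
// whether the container named in ref (taken from the URI host) is reachable:
// either multi-container access is on, or the container is the InitContainer.
func containerAllowed(ref, initContainer string, multiContainerEnabled bool) (bool, error) {
	u, err := url.Parse(ref)
	if err != nil {
		return false, err
	}
	return multiContainerEnabled || u.Host == initContainer, nil
}

func main() {
	ref := "s3://other-bucket/data.pb"
	for _, enabled := range []bool{false, true} {
		ok, err := containerAllowed(ref, "default-bucket", enabled)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("multi-container=%v allowed=%v\n", enabled, ok)
	}
}
```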
Download Limits
To prevent excessive memory usage, you can set limits on download sizes using LimitsConfig.
cfg := &storage.Config{
	Limits: storage.LimitsConfig{
		MaxDownloadMBs: 10, // Rejects downloads larger than 10MB
	},
}
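The arithmetic behind such a limit is simple enough to sketch. The example below assumes one megabyte means 2^20 bytes and that an object exactly at the limit is still accepted; both are assumptions of this sketch, and exceedsLimit is a hypothetical helper rather than a flytestdlib function.

```go
package main

import "fmt"

// bytesPerMB assumes 1 MB = 2^20 bytes for this sketch.
const bytesPerMB int64 = 1 << 20

// exceedsLimit is a hypothetical helper: it reports whether an object of
// sizeBytes is strictly larger than a limit expressed in megabytes.
func exceedsLimit(sizeBytes, maxDownloadMBs int64) bool {
	return sizeBytes > maxDownloadMBs*bytesPerMB
}

func main() {
	const limitMBs = 10
	for _, size := range []int64{5 * bytesPerMB, 10 * bytesPerMB, 10*bytesPerMB + 1} {
		fmt.Printf("size=%d bytes exceeds=%v\n", size, exceedsLimit(size, limitMBs))
	}
}
```

Checking the size reported by object metadata before reading the body is what keeps an oversized object from being pulled into memory in the first place.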
Troubleshooting
Missing InitContainer
The InitContainer field is mandatory even if MultiContainerEnabled is set to true. If it is missing, NewDataStore will return an error:
initContainer is required even with 'enable-multicontainer'
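If you build configurations dynamically, a check of this kind can catch the problem before the data store is constructed. The sketch below uses a hypothetical config struct that stands in for the two relevant storage.Config fields; validate is likewise a name invented for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// config is a hypothetical stand-in for the two storage.Config fields
// discussed in this section.
type config struct {
	InitContainer         string
	MultiContainerEnabled bool
}

// validate rejects a configuration without an InitContainer, regardless of
// whether multi-container access is enabled.
func validate(cfg config) error {
	if cfg.InitContainer == "" {
		return errors.New("initContainer is required even with 'enable-multicontainer'")
	}
	return nil
}

func main() {
	bad := config{MultiContainerEnabled: true}
	good := config{InitContainer: "flyte-data", MultiContainerEnabled: true}

	fmt.Println("bad config error:", validate(bad))
	fmt.Println("good config error:", validate(good))
}
```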
Redis Value Limits
Redis storage has a hard limit of 512MiB per value. Additionally, the Redis backend does not support signed URLs. If your application requires signed URLs for data access, ensure those references use a blob store scheme (like s3://) rather than redis://.
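One way to apply both constraints is to pick the scheme when a reference is created. The following sketch is only an illustration under that assumption: schemeFor and its parameters are hypothetical, and the 512 MiB constant is the per-value limit stated above.

```go
package main

import "fmt"

// redisMaxValueBytes is the 512 MiB per-value limit described above.
const redisMaxValueBytes int64 = 512 << 20

// schemeFor is a hypothetical helper. It falls back to the blob store scheme
// when the payload is too large for Redis or when a signed URL is required.
func schemeFor(sizeBytes int64, needsSignedURL bool, blobScheme string) string {
	if needsSignedURL || sizeBytes > redisMaxValueBytes {
		return blobScheme
	}
	return "redis"
}

func main() {
	fmt.Println(schemeFor(4096, false, "s3"))    // small metadata, no signed URL
	fmt.Println(schemeFor(4096, true, "s3"))     // signed URL required
	fmt.Println(schemeFor(600<<20, false, "s3")) // larger than 512 MiB
}
```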
Local Storage Paths
When using the local backend, ensure the path in StowConfig exists. The DataReference must include the InitContainer name as the first segment of the path after the scheme (e.g., file://my-container/path/to/file).
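To see which part of a reference is treated as the container, it can help to split one by hand. The sketch below uses net/url from the standard library and assumes the container is the URI host and the key is the remaining path; splitReference is a hypothetical helper written for this example.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitReference is a hypothetical helper that breaks a reference into its
// scheme, container (URI host), and key (path without the leading slash).
func splitReference(ref string) (scheme, container, key string, err error) {
	u, err := url.Parse(ref)
	if err != nil {
		return "", "", "", err
	}
	return u.Scheme, u.Host, strings.TrimPrefix(u.Path, "/"), nil
}

func main() {
	scheme, container, key, err := splitReference("file://my-container/path/to/file")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("scheme=%s container=%s key=%s\n", scheme, container, key)
}
```

If the container printed here does not match the configured InitContainer, and multi-container access is disabled, the reference is the likely cause of the failure.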