Performance Tuning and Caching

Flyte's storage layer in flytestdlib/storage provides mechanisms to optimize data access and protect system resources through in-memory caching and hard download limits. These features are primarily controlled via the CachingConfig and LimitsConfig structures.

In-Memory Caching with freecache

Flyte implements a transparent caching layer using the cachedRawStore struct (found in flytestdlib/storage/cached_rawstore.go). This struct acts as a decorator for any RawStore implementation (such as S3, GCS, or local storage). When caching is enabled, Flyte uses the freecache library to hold cached objects in memory. freecache is designed so that the cached entries add almost no garbage collection overhead, regardless of how many there are.

The cachedRawStore Decorator

The cachedRawStore intercepts ReadRaw, WriteRaw, and Head operations to serve data from memory whenever possible:

  • ReadRaw: Checks the cache first. On a miss, it fetches from the underlying store, populates the cache, and returns the data.
  • WriteRaw: Writes to the underlying store first, then updates the cache.
  • Head: Returns metadata (like object size) from the cache if the object is present; otherwise the call falls through to the underlying store.
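The read-through and write-through behavior described above can be sketched as a small decorator. This is a minimal illustration, not Flyte's implementation: the rawStore interface, mapStore backend, and cachedStore type are hypothetical simplifications, and a plain Go map stands in for freecache.

```go
package main

import (
	"errors"
	"fmt"
)

// rawStore is a hypothetical, simplified stand-in for Flyte's RawStore interface.
type rawStore interface {
	ReadRaw(key string) ([]byte, error)
	WriteRaw(key string, data []byte) error
}

// mapStore plays the role of the remote backend and counts how often it is read.
type mapStore struct {
	data  map[string][]byte
	reads int
}

func newMapStore() *mapStore { return &mapStore{data: map[string][]byte{}} }

func (m *mapStore) ReadRaw(key string) ([]byte, error) {
	m.reads++
	v, ok := m.data[key]
	if !ok {
		return nil, errors.New("not found: " + key)
	}
	return v, nil
}

func (m *mapStore) WriteRaw(key string, data []byte) error {
	m.data[key] = data
	return nil
}

// cachedStore decorates any rawStore. A plain map stands in for freecache here.
type cachedStore struct {
	rawStore
	cache map[string][]byte
}

func newCachedStore(inner rawStore) *cachedStore {
	return &cachedStore{rawStore: inner, cache: map[string][]byte{}}
}

// ReadRaw checks the cache first; on a miss it reads the backend and populates the cache.
func (c *cachedStore) ReadRaw(key string) ([]byte, error) {
	if v, ok := c.cache[key]; ok {
		return v, nil
	}
	v, err := c.rawStore.ReadRaw(key)
	if err != nil {
		return nil, err
	}
	c.cache[key] = v
	return v, nil
}

// WriteRaw writes to the backend first, then updates the cache.
func (c *cachedStore) WriteRaw(key string, data []byte) error {
	if err := c.rawStore.WriteRaw(key, data); err != nil {
		return err
	}
	c.cache[key] = data
	return nil
}

func main() {
	backend := newMapStore()
	backend.data["a"] = []byte("hello")
	store := newCachedStore(backend)

	// The second read is served from the cache, so the backend is read once.
	for i := 0; i < 2; i++ {
		if _, err := store.ReadRaw("a"); err != nil {
			fmt.Println("read failed:", err)
		}
	}
	fmt.Println("backend reads:", backend.reads)
}
```

Embedding the interface means any method the decorator does not override is forwarded unchanged to the wrapped store.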

Caching Constraints and Tradeoffs

The behavior of the cache is governed by CachingConfig in flytestdlib/storage/config.go:

    type CachingConfig struct {
        MaxSizeMegabytes int `json:"max_size_mbs" pflag:",Maximum size of the cache..."`
        TargetGCPercent  int `json:"target_gc_percent" pflag:",Sets the garbage collection target percentage."`
    }

There are two critical implementation details to consider when tuning the cache:

  1. The 1/1024 Rule: Due to the design of the underlying freecache library, any single object larger than 1/1024 of the configured cache size (MaxSizeMegabytes) will not be cached. For example, if the cache is set to 1024 MB, objects larger than 1 MB bypass the cache, so a cache sized for many small objects gives no benefit for a few large ones.
  2. Garbage Collection Tuning: The TargetGCPercent field allows Flyte to tune the Go runtime's GC behavior via debug.SetGCPercent. This is often used when a large portion of the heap is dedicated to the cache, helping to prevent frequent GC cycles. Note that this setting affects the entire process, not just the storage layer.
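To make the 1/1024 rule concrete, the sketch below computes the approximate largest cacheable object for a few cache sizes. The helper name maxCacheableBytes is hypothetical, and the result is only an upper bound: it ignores any per-entry overhead (such as the key) that the cache library may count against the limit.

```go
package main

import "fmt"

const mib = 1024 * 1024

// maxCacheableBytes returns the approximate upper bound, in bytes, on a single
// cached object for a cache of maxSizeMegabytes, following the 1/1024 rule.
// Hypothetical helper for illustration; per-entry overhead is ignored.
func maxCacheableBytes(maxSizeMegabytes int) int {
	return maxSizeMegabytes * mib / 1024
}

func main() {
	for _, mb := range []int{256, 512, 1024} {
		fmt.Printf("cache=%d MB -> objects up to ~%d KiB\n", mb, maxCacheableBytes(mb)/1024)
	}
}
```

In other words, the per-object ceiling in KiB is roughly equal to the cache size in MB, which is a quick way to check whether typical objects will fit.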

If a write to the cache fails (e.g., because the entry is too large), Flyte returns a wrapped ErrFailedToWriteCache. The underlying store operation is still considered successful; only the cache update failed. Callers can use IsFailedWriteToCache(err) to identify these non-fatal errors and continue.
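A caller might treat cache-only failures as non-fatal along the following lines. This is a sketch under assumptions: errFailedToWriteCache and isFailedWriteToCache are local stand-ins built on the standard errors package, not the storage package's ErrFailedToWriteCache and IsFailedWriteToCache, and writeThrough is a hypothetical function that simulates the write path.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the sentinel errors discussed in the text (assumptions).
var (
	errFailedToWriteCache = errors.New("failed to write to cache")
	errRemote             = errors.New("remote write failed")
)

// isFailedWriteToCache reports whether err is (or wraps) the cache-write sentinel.
func isFailedWriteToCache(err error) bool {
	return errors.Is(err, errFailedToWriteCache)
}

// writeThrough simulates a write: the remote write happens first, then the
// cache update, which fails when the entry exceeds maxEntry bytes.
func writeThrough(size, maxEntry int, remoteOK bool) error {
	if !remoteOK {
		return errRemote
	}
	if size > maxEntry {
		return fmt.Errorf("entry of %d bytes is too large: %w", size, errFailedToWriteCache)
	}
	return nil
}

// writeSucceeded treats a cache-only failure as success, since the data
// already reached the underlying store.
func writeSucceeded(err error) bool {
	return err == nil || isFailedWriteToCache(err)
}

func main() {
	err := writeThrough(2048, 1024, true)
	if isFailedWriteToCache(err) {
		fmt.Println("non-fatal, continuing:", err)
	}
	fmt.Println("write succeeded:", writeSucceeded(err))
}
```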

Storage Download Limits

To prevent a single request from exhausting the memory of a Flyte component (like FlytePropeller or FlyteAdmin), the storage layer enforces a maximum download size via LimitsConfig.

In flytestdlib/storage/stow_store.go, the ReadRaw method checks the size of the object before opening a stream:

    if GetConfig().Limits.GetLimitMegabytes != 0 {
        if sizeBytes > GetConfig().Limits.GetLimitMegabytes*MiB {
            return nil, errors.Errorf(ErrExceedsLimit, "limit exceeded. %.6fmb > %vmb...", ...)
        }
    }

By default, GetLimitMegabytes is 2 (that is, 2 MB). As the check above shows, a value of 0 disables the limit. If an object exceeds the limit, the storage layer returns an ErrExceedsLimit error instead of attempting to load the large payload into memory.
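The limit check can be exercised in isolation with a small standalone function. This is a sketch that mirrors the logic shown above; checkDownloadLimit is a hypothetical name, and it returns a plain error rather than Flyte's coded ErrExceedsLimit.

```go
package main

import "fmt"

// MiB is the number of bytes in one mebibyte.
const MiB int64 = 1024 * 1024

// checkDownloadLimit returns an error when sizeBytes exceeds limitMegabytes.
// A limit of 0 disables the check. Hypothetical helper for illustration.
func checkDownloadLimit(sizeBytes, limitMegabytes int64) error {
	if limitMegabytes != 0 && sizeBytes > limitMegabytes*MiB {
		return fmt.Errorf("limit exceeded. %.6fmb > %vmb",
			float64(sizeBytes)/float64(MiB), limitMegabytes)
	}
	return nil
}

func main() {
	fmt.Println(checkDownloadLimit(1*MiB, 2))   // within the limit
	fmt.Println(checkDownloadLimit(3*MiB, 2))   // over the limit
	fmt.Println(checkDownloadLimit(100*MiB, 0)) // limit disabled
}
```

Note that an object exactly at the limit passes, because the comparison is strictly greater-than.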

Monitoring Performance

The storage layer exports several Prometheus metrics through the cacheMetrics struct to help monitor the effectiveness of the cache:

  • cache_hit: Incremented when data is served directly from the in-memory cache.
  • cache_miss: Incremented when data must be fetched from the remote store (S3/GCS/etc.).
  • cache_write_err: Incremented when an object could not be written to the cache (often due to the 1/1024 size limit).
  • remote_fetch: A stopwatch metric measuring the latency of reading from the underlying remote store.

Configuration Example

These settings are configured under the storage section of the Flyte configuration. The example below enables a 512 MB cache and raises the download limit to 10 MB:

    storage:
      type: stow
      stow:
        kind: s3
        config:
          region: us-east-1
      cache:
        max_size_mbs: 512
        target_gc_percent: 100
      limits:
        maxDownloadMBs: 10

If max_size_mbs is omitted or set to 0, the cachedRawStore decorator is not applied, and caching is disabled entirely.