
GetLogs

Returns the logs for a Kubeflow job by initializing a log plugin and retrieving logs for master, worker, parameter server, chief, and evaluator replicas based on the task type and replica counts.

def GetLogs(
    pluginContext: k8s.PluginContext,
    taskType: string,
    objectMeta: meta_v1.ObjectMeta,
    taskTemplate: *core.TaskTemplate,
    hasMaster: bool,
    workersCount: int32,
    psReplicasCount: int32,
    chiefReplicasCount: int32,
    evaluatorReplicasCount: int32,
    primaryContainerName: string
) -> []*core.TaskLog, error

GetLogs aggregates logs from the components of a distributed Kubeflow job into a single result. Which replicas it queries (master, worker, parameter server, chief, evaluator) depends on the task type and on the replica counts passed in.
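To make the role of the replica counts concrete, here is a minimal, self-contained Go sketch of how a set of counts could expand into one log target per replica. It is not the plugin's implementation: `replicaTargets` is a hypothetical helper, the `<job>-<role>-<index>` pod naming is an assumption made for illustration, and the sketch ignores the task type, which the real function also uses to decide which replicas to query.

```go
package main

import "fmt"

// replicaTargets lists the pods whose logs would be requested for one job.
// Hypothetical helper: the "<job>-<role>-<index>" naming is an assumption
// for illustration, not the naming the plugin is documented to use.
func replicaTargets(jobName string, hasMaster bool, workers, ps, chief, evaluator int32) []string {
	var pods []string

	// add appends one target per replica of the given role.
	add := func(role string, count int32) {
		for i := int32(0); i < count; i++ {
			pods = append(pods, fmt.Sprintf("%s-%s-%d", jobName, role, i))
		}
	}

	if hasMaster {
		add("master", 1)
	}
	add("worker", workers)
	add("ps", ps)
	add("chief", chief)
	add("evaluator", evaluator)
	return pods
}

func main() {
	// One master, two workers, one parameter server, no chief or evaluator.
	for _, pod := range replicaTargets("train", true, 2, 1, 0, 0) {
		fmt.Println(pod)
	}
}
```

A role with a count of zero contributes no targets, which is why the chief and evaluator counts can simply be left at 0 for jobs that do not use them.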

Parameters

| Name | Type | Description |
| --- | --- | --- |
| pluginContext | k8s.PluginContext | The context for the Kubernetes plugin, providing access to task execution metadata. |
| taskType | string | The type of the task (e.g., PytorchTaskType, MPITaskType), which determines which specific logs to retrieve. |
| objectMeta | meta_v1.ObjectMeta | Kubernetes object metadata, containing the name and namespace of the Kubeflow job used to identify the pods. |
| taskTemplate | *core.TaskTemplate | The task template associated with the job, used for initializing the log plugin. |
| hasMaster | bool | Whether the job has a master replica; relevant for PyTorch tasks. |
| workersCount | int32 | The number of worker replicas in the job, used to iterate over the workers and retrieve logs for each one. |
| psReplicasCount | int32 | The number of parameter server replicas in the job, used to retrieve logs for each parameter server. |
| chiefReplicasCount | int32 | The number of chief replicas in the job (typically 0 or 1), used to retrieve logs for the chief replica. |
| evaluatorReplicasCount | int32 | The number of evaluator replicas in the job (typically 0 or 1), used to retrieve logs for the evaluator replica. |
| primaryContainerName | string | The name of the primary container within the pods from which to fetch logs. |

Returns

| Type | Description |
| --- | --- |
| []*core.TaskLog, error | A slice of TaskLog objects containing the aggregated logs, or an error if log retrieval fails for any component. |