What happened?
Two bugs in LocalJob.logs(follow=True) that make it pretty unusable as a streaming interface:
1. Duplicate output. The method calls print() internally, so if you iterate get_job_logs(follow=True) and print each line yourself, everything shows up twice — once from inside the SDK, once from your own code.
2. No real-time streaming. Despite follow=True implying otherwise, nothing is yielded until the job finishes. It's just a batched return wearing a streaming API's clothes.
Steps to Reproduce
from kubeflow.trainer import TrainerClient
from kubeflow.trainer.backends.localprocess.backend import LocalProcessBackendConfig
client = TrainerClient(backend_config=LocalProcessBackendConfig())
job_id = client.train(...)
for line in client.get_job_logs(name=job_id, follow=True):
    print(line)  # every line prints twice
Where it's happening
kubeflow/trainer/backends/localprocess/job.py — logs() method:
def logs(self, follow=False) -> list[str]:
    if not follow:
        return self._stdout.splitlines()
    try:
        for chunk in self.stream_logs():
            print(chunk, end="", flush=True)  # ← writes directly to stdout, caller has no say
    except StopIteration:
        pass
    return self._stdout.splitlines()  # ← blocks until job is done, then dumps everything at once
What should happen instead
- No print() inside the method. Yielding lines and leaving output decisions to the caller is the whole point of this API.
- Lines should reach the caller as they're produced, not in one batch after the job exits.
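
For illustration only, here is a rough sketch of the behavior described above, written outside the SDK: a generator that reads a child process's stdout line by line and yields each line without printing anything itself. `follow_logs` and its `cmd` parameter are hypothetical names, not part of the Kubeflow SDK, and the sketch assumes the job runs as a subprocess whose output is not block-buffered on the child's side (hence `-u` in the demo).

```python
import subprocess
import sys
from collections.abc import Iterator


def follow_logs(cmd: list[str]) -> Iterator[str]:
    """Yield each output line of `cmd` as soon as the child writes it.

    Hypothetical helper, not the SDK's implementation. It never calls
    print(); what to do with each line is left to the caller.
    """
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout
        text=True,
        bufsize=1,  # line-buffered reads on the parent side
    ) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            # Control returns to the caller here, while the job is still running.
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Stand-in "job" that emits one line every 0.2 seconds.
    child = "import time\nfor i in range(3):\n    print('step', i)\n    time.sleep(0.2)"
    for line in follow_logs([sys.executable, "-u", "-c", child]):
        print(line)  # the only print, so each line appears once
```

With this shape, a caller that prints each line sees it once, and sees it while the child is still running. Whether logs(follow=True) should return a generator or keep its list[str] return type for follow=False is a design decision for the maintainers.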
Environment
- Kubeflow SDK: 0.4.0
- Backend: LocalProcessBackend
- No Kubernetes cluster needed to reproduce
/kind bug
/area local