One of my dagster jobs failed and the logs in the ...
# announcements
m
One of my dagster jobs failed and the logs in the job pod give me a traceback error. I'd like to see the same traceback in the Dagit UI but I can't seem to find it (I have Debug filter enabled). I only see that my job failed and a message that logs are being retrieved. Is it possible to see the pod logs in the dagit ui?
c
hey Michiel 👋 when you say “my dagster job fails” — did you mean that inspecting the K8s Job shows the K8s Job has status failed or that dagster emits a step event saying the dagster step failed?
In the former case, we will fetch all the logs and surface the raw logs in the UI. (sometimes this is not possible, for example if the job is deleted)
in the latter, we just fetch the logs in order for the executor to keep traversing the execution graph. We do not surface the raw logs in the UI but do surface the parsed dagster events in the UI
m
Hi Cat, this is the traceback that I get in the logs of the K8s job pod:
Traceback (most recent call last):
  File "/usr/local/bin/dagster", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/dagster/cli/__init__.py", line 40, in main
    cli(obj={})  # pylint:disable=E1123
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/dagster/cli/api.py", line 427, in execute_step_with_structured_logs_command
    execution_plan, pipeline_run, instance, run_config=args.run_config, retries=retries,
  File "/usr/local/lib/python3.7/site-packages/dagster/core/execution/api.py", line 732, in __iter__
    execution_plan=self.execution_plan, pipeline_context=self.pipeline_context,
  File "/usr/local/lib/python3.7/site-packages/dagster/core/execution/plan/execute_plan.py", line 63, in inner_plan_execution_iterator
    _assert_missing_inputs_optional(uncovered_inputs, execution_plan, step.key)
  File "/usr/local/lib/python3.7/site-packages/dagster/core/execution/plan/execute_plan.py", line 154, in _assert_missing_inputs_optional
    output_name=nonoptionals[0].output_name,
dagster.core.errors.DagsterStepOutputNotFoundError: When executing test_kafka.compute discovered required outputs missing from previous step: [StepOutputHandle(step_key='initialising.compute', output_name='brokers')]
I think the cause of this error is that I haven't set up intermediate storage, so it makes sense that the step cannot find the output of the previous step: each step runs in its own pod with its own file system. But I expected to see this traceback in the UI, and that is not the case. Does this have to do with the fact that I'm using K8s Celery execution with filesystem storage enabled?
c
yes, would recommend using s3 or similar for re-execution cases
storage:
  s3:
    config:
      s3_bucket: "bucket"
      s3_prefix: "prefix"
when you click on the “view raw step output” on the step start event for that step, do you see the error there?
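To illustrate why filesystem storage produces the missing-output error above when each step runs in its own pod, here is a toy sketch. It is not Dagster's implementation: `write_output`, `read_output`, and `run_two_steps` are hypothetical helpers, and two temporary directories stand in for the separate pod file systems. Only the step key and output name are taken from the traceback.

```python
import tempfile
from pathlib import Path


def write_output(storage_root: Path, step_key: str, output_name: str, value: str) -> None:
    """Hypothetical helper: persist a step output under the given storage root."""
    path = storage_root / step_key / output_name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(value)


def read_output(storage_root: Path, step_key: str, output_name: str) -> str:
    """Hypothetical helper: load an upstream output, failing if it is not there."""
    path = storage_root / step_key / output_name
    if not path.exists():
        raise FileNotFoundError(
            f"required output missing from previous step: {step_key}.{output_name}"
        )
    return path.read_text()


def run_two_steps(upstream_root: Path, downstream_root: Path) -> str:
    """The upstream step writes 'brokers'; the downstream step then tries to read it."""
    write_output(upstream_root, "initialising.compute", "brokers", "broker-1:9092")
    return read_output(downstream_root, "initialising.compute", "brokers")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as pod_a, tempfile.TemporaryDirectory() as pod_b:
        # Shared storage (the role S3 plays): the downstream step finds the output.
        print(run_two_steps(Path(pod_a), Path(pod_a)))
        # Separate file systems (one per pod): the downstream step cannot find it.
        try:
            run_two_steps(Path(pod_a), Path(pod_b))
        except FileNotFoundError as err:
            print(f"downstream step failed: {err}")
```

Passing the same root to both steps plays the role of a shared store such as the S3 bucket in the config above; passing different roots mimics per-pod local disks.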