Craig Morris — 07/18/2023, 4:28 PM

Craig Morris — 07/18/2023, 4:36 PM

Craig Morris — 07/18/2023, 4:46 PM
Run timed out due to taking longer than 300 seconds to start.
Debug information for pod dagster-run-62fa0cb3-526d-4662-b69f-22953c2efb5e-d4ngr:
Pod status: Running
Container 'dagster' status: Ready
No logs in pod.
No warning events for pod.
For more information about the failure, try running `kubectl describe pod dagster-run-62fa0cb3-526d-4662-b69f-22953c2efb5e-d4ngr`, `kubectl logs dagster-run-62fa0cb3-526d-4662-b69f-22953c2efb5e-d4ngr`, or `kubectl describe job dagster-run-62fa0cb3-526d-4662-b69f-22953c2efb5e` in your cluster.
Craig Morris — 07/18/2023, 4:46 PM

Craig Morris — 07/18/2023, 4:47 PM
> kubectl logs dagster-run-62fa0cb3-526d-4662-b69f-22953c2efb5e-d4ngr
{
  "__class__": "DagsterEvent",
  "event_specific_data": {
    "__class__": "EngineEventData",
    "error": null,
    "marker_end": null,
    "marker_start": null,
    "metadata_entries": []
  },
  "event_type_value": "ENGINE_EVENT",
  "logging_tags": {},
  "message": "Ignoring a run worker that started after the run had already finished.",
  "pid": null,
  "pipeline_name": "sql_cache_updater",
  "solid_handle": null,
  "step_handle": null,
  "step_key": null,
  "step_kind_value": null
}
alex — 07/20/2023, 5:57 PM
That error means it took longer than 300 seconds (the default run monitoring start timeout, see https://github.com/dagster-io/dagster/blame/master/helm/dagster/values.yaml#L1100) for your kubernetes cluster to start the pod for the job. One possibility is that your cluster is overloaded, which is preventing the pod from being scheduled in time.
The message in the kubectl logs appears when the pod eventually does come up and sees that the run has already been marked as failed by the run monitoring daemon.
You can change the timeout via your helm values if there is an expected reason for pod startup to be so long.
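A minimal sketch of that helm values override, assuming the `runMonitoring` block under `dagsterDaemon` that the linked `values.yaml` defines (the `600` here is an arbitrary example value, not a recommendation):

```yaml
# Override for the Dagster helm chart (e.g. passed via `helm upgrade -f values.yaml`).
# startTimeoutSeconds is how long the run monitoring daemon waits for the
# run worker pod to start before marking the run as failed (default: 300).
dagsterDaemon:
  runMonitoring:
    enabled: true
    startTimeoutSeconds: 600  # example: double the default window
```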
Craig Morris — 07/21/2023, 2:52 PM