Hi team, could you give me some pointers for debugging job pipelines that get stuck after “[K8sRunLauncher] Kubernetes run worker job created”, please?
It has been stuck for more than 30 minutes, and I checked that no pod was created for dagster-run-dce20411-3155-4040-87c3-801bac287de7
Hi hebo - that second line is logged after the call to create_namespaced_job completes, so it would be surprising if no pod had been created. Note that the name of the pod would start with dagster-run-dce20411-3155-4040-87c3-801bac287de7 but would also have a suffix at the end. If you run
kubectl get pods -n dagster | grep dagster-run-dce20411-3155-4040-87c3-801bac287de7
do you still not see anything?
Thanks Daniel! Yeah, the pod started but failed with some vault issues on our side. Thanks a ton for pointing out that the pod name has a suffix 😅
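For anyone finding this thread later: since the pod name is the Job name plus a generated suffix, a label selector can be easier than grep. Kubernetes puts a `job-name` label on the pods a Job creates, so something like the sketch below should find them. It assumes the `dagster` namespace used above and that the Job is named `dagster-run-<run id>`, as in this thread. The function names `run_pods` and `run_job_events` are made up for illustration and are not part of Dagster or kubectl.

```shell
# Helpers for locating the pods behind a Dagster run's Kubernetes Job.
# Assumptions: namespace defaults to "dagster" (as in the thread above);
# run_pods / run_job_events are hypothetical helper names.

run_pods() {
  # $1 = Job name (e.g. dagster-run-<run id>), $2 = namespace (optional)
  # Selects pods by the job-name label that the Job controller adds.
  kubectl get pods -n "${2:-dagster}" -l "job-name=$1" -o name
}

run_job_events() {
  # $1 = Job name, $2 = namespace (optional)
  # Describes the Job; the Events section shows pod creation and failures.
  kubectl describe job "$1" -n "${2:-dagster}"
}
```

Usage would be along the lines of `run_pods dagster-run-dce20411-3155-4040-87c3-801bac287de7`, followed by `kubectl logs -n dagster <pod name>` on whatever it prints to see why the pod failed.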