Pablo Beltran
04/03/2023, 11:37 PM
Stack Trace:
  File "/usr/local/lib/python3.7/site-packages/dagster_k8s/launcher.py", line 379, in check_run_worker_health
    job_name=job_name,
  File "/usr/local/lib/python3.7/site-packages/dagster_k8s/client.py", line 366, in get_job_status
    return k8s_api_retry(_get_job_status, max_retries=3, timeout=wait_time_between_attempts)
  File "/usr/local/lib/python3.7/site-packages/dagster_k8s/client.py", line 124, in k8s_api_retry
    ) from e
The above exception was caused by the following exception:
kubernetes.client.exceptions.ApiException: (404)
Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'a65413c7-fd00-46ef-94e2-e056f1261a2e', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': '802301fe-53ea-4d4d-aeb0-efd3891f18ac', 'X-Kubernetes-Pf-Prioritylevel-Uid': '2952e8a6-fb75-43dc-b244-b39008d24b04', 'Date': 'Mon, 03 Apr 2023 23:23:27 GMT', 'Content-Length': '290'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch \"dagster-run-52fc0109-804d-441f-b1b9-c077f14a4005-1\" not found","reason":"NotFound","details":{"name":"dagster-run-52fc0109-804d-441f-b1b9-c077f14a4005-1","group":"batch","kind":"jobs"},"code":404}
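For anyone reading the 404 above: the response body is a standard Kubernetes `Status` object, and its `details` field names the resource that could not be found. Below is a minimal sketch, using only the standard library, of pulling that information out of the body. The helper name `missing_resource` is hypothetical and not part of dagster_k8s or the Kubernetes client; the body string is copied from the trace above.

```python
import json

# Response body copied verbatim from the ApiException in the trace above.
BODY = r'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch \"dagster-run-52fc0109-804d-441f-b1b9-c077f14a4005-1\" not found","reason":"NotFound","details":{"name":"dagster-run-52fc0109-804d-441f-b1b9-c077f14a4005-1","group":"batch","kind":"jobs"},"code":404}'


def missing_resource(status_body):
    """Hypothetical helper: return (kind, group, name) for a 404 Status body, else None."""
    status = json.loads(status_body)
    if status.get("kind") != "Status" or status.get("code") != 404:
        return None
    details = status.get("details", {})
    return details.get("kind"), details.get("group"), details.get("name")


if __name__ == "__main__":
    print(missing_resource(BODY))
```

Here the result identifies a `jobs` resource in the `batch` API group, meaning the Kubernetes Job for the run worker no longer existed when its status was checked.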
daniel
04/04/2023, 2:09 AM
Pablo Beltran
04/04/2023, 2:11 AM
daniel
04/04/2023, 2:12 AM
daniel
04/04/2023, 2:13 AM
Pablo Beltran
04/04/2023, 6:13 PM
daniel
04/05/2023, 1:00 AM
Pablo Beltran
04/05/2023, 6:24 AM
runMonitoring:
  enabled: true
  # Timeout for runs to start (avoids runs hanging in STARTED)
  startTimeoutSeconds: 180
  # How often to check on in progress runs
  pollIntervalSeconds: 120
  # Max number of times to attempt to resume a run with a new run worker. Defaults to 3 if the
  # run launcher supports resuming runs, otherwise defaults to 0.
  maxResumeRunAttempts: 0
So I would expect there to be no resumes.
daniel
04/05/2023, 11:51 AM
daniel
04/05/2023, 11:53 AM
daniel
04/05/2023, 11:55 AM
Pablo Beltran
04/05/2023, 3:53 PM
Arsenii Poriadin
05/09/2023, 4:44 PM
dagster_k8s.client.DagsterK8sUnrecoverableAPIError have you figured out the issue?
Pablo Beltran
05/09/2023, 4:44 PM
Arsenii Poriadin
05/09/2023, 4:59 PM
maxResumeRunAttempts?
Arsenii Poriadin
05/09/2023, 5:00 PM
dagster_k8s.client.DagsterK8sUnrecoverableAPIError itself, will it?
Pablo Beltran
05/09/2023, 5:02 PM