Alex Rudolph

03/03/2022, 6:12 PM
Hey all! Many of our pipelines occasionally fail with the error below. We're on Dagster 0.13.12, using the k8s_run_executor, and aren't sure why this started happening:
kubernetes.client.exceptions.ApiException: (404)
Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Audit-Id': '6829c683-9321-4b55-9592-c94794c50c81', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': 'd62db2cc-fd32-4618-af78-d12b3e721ec9', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'f91e6c2b-ed56-4c59-98f2-420aed375018', 'Date': 'Thu, 03 Mar 2022 17:51:34 GMT', 'Content-Length': '282'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch \"dagster-job-365e6e829dfcac3a5f67eb4970876776-2\" not found","reason":"NotFound","details":{"name":"dagster-job-365e6e829dfcac3a5f67eb4970876776-2","group":"batch","kind":"jobs"},"code":404}



  File "/usr/local/lib/python3.7/site-packages/dagster/core/execution/api.py", line 775, in pipeline_execution_iterator
    for event in pipeline_context.executor.execute(pipeline_context, execution_plan):
  File "/usr/local/lib/python3.7/site-packages/dagster/core/executor/step_delegating/step_delegating_executor.py", line 205, in execute
    plan_context, [step], active_execution
  File "/usr/local/lib/python3.7/site-packages/dagster_k8s/executor.py", line 217, in check_step_health
    job = self._batch_api.read_namespaced_job(namespace=self._job_namespace, name=job_name)
  File "/usr/local/lib/python3.7/site-packages/kubernetes/client/api/batch_v1_api.py", line 1257, in read_namespaced_job
    return self.read_namespaced_job_with_http_info(name, namespace, **kwargs)  # noqa: E501
  File "/usr/local/lib/python3.7/site-packages/kubernetes/client/api/batch_v1_api.py", line 1366, in read_namespaced_job_with_http_info
    collection_formats=collection_formats)
  File "/usr/local/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 353, in call_api
    _preload_content, _request_timeout, _host)
  File "/usr/local/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 184, in __call_api
    _request_timeout=_request_timeout)
  File "/usr/local/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 377, in request
    headers=headers)
  File "/usr/local/lib/python3.7/site-packages/kubernetes/client/rest.py", line 243, in GET
    query_params=query_params)
  File "/usr/local/lib/python3.7/site-packages/kubernetes/client/rest.py", line 233, in request
    raise ApiException(http_resp=r)

alex

03/03/2022, 7:00 PM
@johann

Roei Jacobovich

06/09/2022, 11:06 PM
Hi @Alex Rudolph, did you solve the issue? We encountered the same one. Thanks

Alex Rudolph

06/09/2022, 11:09 PM
@Roei Jacobovich It seems that in this version of Dagster, Python errors weren't being surfaced, so we had to inspect the pod logs to see what went wrong. Since then we've upgraded Dagster and get nicer error messages.
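
For anyone landing here on an older version, here is a rough sketch of how the pod logs for a step job could be pulled with the Kubernetes Python client. It is not what Dagster does internally. The helper name `tail_job_pod_logs` and the `dagster` namespace are made up for illustration, and it relies on Kubernetes labelling a Job's pods with `job-name=<job name>`. It only helps while the pods still exist; if the Job has already been deleted, its pods have most likely been cleaned up too.

```python
def tail_job_pod_logs(core_api, namespace, job_name, tail_lines=50):
    """Return {pod_name: last log lines} for the pods created by a Kubernetes Job.

    `core_api` is assumed to be a kubernetes.client.CoreV1Api instance.
    This helper is hypothetical and not part of Dagster or dagster_k8s.
    """
    # Kubernetes labels each pod created by a Job with job-name=<job name>.
    pods = core_api.list_namespaced_pod(
        namespace=namespace, label_selector=f"job-name={job_name}"
    )
    logs = {}
    for pod in pods.items:
        logs[pod.metadata.name] = core_api.read_namespaced_pod_log(
            name=pod.metadata.name, namespace=namespace, tail_lines=tail_lines
        )
    return logs


if __name__ == "__main__":
    # Requires the `kubernetes` package and access to the cluster.
    from kubernetes import client, config

    config.load_kube_config()  # or config.load_incluster_config() inside a pod
    # Job name copied from the error above; the namespace is a placeholder.
    job = "dagster-job-365e6e829dfcac3a5f67eb4970876776-2"
    for name, text in tail_job_pod_logs(client.CoreV1Api(), "dagster", job).items():
        print(f"--- {name} ---\n{text}")
```

An empty result means no pods with that label were found in the namespace, which fits the 404 on the Job itself.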