Ashish Sharma
08/01/2022, 7:05 AM
We have been getting an "Exception: timeout expired" error for the past month, but when we rerun the job it completes successfully. I have been trying to find the root cause, but there is no information about it online. Can you please check this log and let me know why this issue happens and how to fix it?
dagster.core.errors.DagsterExecutionStepExecutionError: Error occurred while executing op "complete_dq_request":

  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/execute_plan.py", line 230, in dagster_event_sequence_for_step
    for step_event in check.generator(step_events):
  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/execute_step.py", line 353, in core_dagster_event_sequence_for_step
    for user_event in check.generator(
  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/execute_step.py", line 69, in _step_output_error_checked_user_event_sequence
    for user_event in user_event_sequence:
  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/compute.py", line 174, in execute_core_compute
    for step_output in _yield_compute_results(step_context, inputs, compute_fn):
  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/compute.py", line 142, in _yield_compute_results
    for event in iterate_with_context(
  File "/usr/local/lib/python3.8/dist-packages/dagster/utils/__init__.py", line 407, in iterate_with_context
    return
  File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/utils.py", line 73, in solid_execution_error_boundary
    raise error_cls(

The above exception was caused by the following exception:
Exception: timeout expired

  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/utils.py", line 47, in solid_execution_error_boundary
    yield
  File "/usr/local/lib/python3.8/dist-packages/dagster/utils/__init__.py", line 405, in iterate_with_context
    next_output = next(iterator)
  File "/usr/local/lib/python3.8/dist-packages/dagster/core/execution/plan/compute_generator.py", line 65, in _coerce_solid_compute_fn_to_iterator
    result = fn(context, **kwargs) if context_arg_provided else fn(**kwargs)
  File "/opt/dagster/home/orchestration_manager/ops/data_quality_ops/op_complete_dq_request.py", line 741, in complete_dq_request
    raise Exception(l_error)
Ashish Sharma
08/01/2022, 8:18 AM
owen
08/01/2022, 6:22 PM
It looks like this error is coming from within your complete_dq_request op, rather than from Dagster machinery. You might be wrapping a call to some request in a try/except block, which might be eating some of the extra context for the error, but on the surface it looks like you're hitting an API that sometimes takes too long to respond. Depending on what library you're using to make that request, you can often increase the timeout on your requests to be more lenient. It also seems like the request usually succeeds on another try, so enabling a RetryPolicy on your op could be a quick fix: if the op fails, Dagster will automatically retry it.
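
The two request-side suggestions in the reply above (a more lenient timeout, and not losing error context in a try/except) could be sketched roughly as follows. This is only an illustration under assumptions: the thread never says which library or endpoint the op calls, so the standard-library urllib is used as a stand-in, and fetch_dq_status, DQRequestError, the opener parameter and the 60-second default are all hypothetical names and values, not anything from the original op.

```python
import socket
import urllib.error
import urllib.request


class DQRequestError(Exception):
    """Hypothetical error type for a failed data-quality request."""


def fetch_dq_status(url, timeout_seconds=60.0, opener=urllib.request.urlopen):
    """Call a (hypothetical) endpoint with an explicit, configurable timeout.

    `opener` defaults to urllib.request.urlopen and exists only so the
    network call can be swapped out, for example in a test.
    """
    try:
        with opener(url, timeout=timeout_seconds) as response:
            return response.read()
    except (socket.timeout, urllib.error.URLError) as exc:
        # A timeout while connecting usually arrives wrapped in URLError;
        # a timeout while reading arrives as socket.timeout.
        # "raise ... from exc" keeps the original exception as __cause__,
        # so the chained traceback still shows the underlying failure
        # instead of only a message string such as "timeout expired".
        raise DQRequestError(
            f"request to {url} failed (timeout={timeout_seconds}s)"
        ) from exc
```

If a helper like this were called from the op, the retry suggestion would be a separate change on the op itself, for example declaring it with @op(retry_policy=RetryPolicy(max_retries=3, delay=5)), where the retry count and delay are placeholder values to tune.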