Oliver
09/14/2022, 4:25 AM
dagster._core.definitions.events.RetryRequested
File "/usr/local/lib/python3.9/site-packages/dagster/_core/execution/plan/execute_plan.py", line 224, in dagster_event_sequence_for_step
for step_event in check.generator(step_events):
File "/usr/local/lib/python3.9/site-packages/dagster/_core/execution/plan/execute_step.py", line 357, in core_dagster_event_sequence_for_step
for user_event in check.generator(
File "/usr/local/lib/python3.9/site-packages/dagster/_core/execution/plan/execute_step.py", line 69, in _step_output_error_checked_user_event_sequence
for user_event in user_event_sequence:
File "/usr/local/lib/python3.9/site-packages/dagster/_core/execution/plan/compute.py", line 174, in execute_core_compute
for step_output in _yield_compute_results(step_context, inputs, compute_fn):
File "/usr/local/lib/python3.9/site-packages/dagster/_core/execution/plan/compute.py", line 142, in _yield_compute_results
for event in iterate_with_context(
File "/usr/local/lib/python3.9/site-packages/dagster/_utils/__init__.py", line 432, in iterate_with_context
return
File "/usr/local/lib/python3.9/contextlib.py", line 137, in __exit__
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.9/site-packages/dagster/_core/execution/plan/utils.py", line 87, in solid_execution_error_boundary
raise RetryRequested(
The above exception was caused by the following exception:
dagster._core.errors.DagsterExecutionInterruptedError
File "/usr/local/lib/python3.9/site-packages/dagster/_core/execution/plan/utils.py", line 47, in solid_execution_error_boundary
yield
File "/usr/local/lib/python3.9/site-packages/dagster/_utils/__init__.py", line 430, in iterate_with_context
next_output = next(iterator)
File "/usr/local/lib/python3.9/site-packages/dagster/_core/execution/plan/compute_generator.py", line 73, in _coerce_solid_compute_fn_to_iterator
result = fn(context, **kwargs) if context_arg_provided else fn(**kwargs)
File "/opt/datarwe_nlp/pipelines/freetext_ner_inference/data.py", line 164, in preprocessed_texts_op
p = multiprocessing.Pool(6)
File "/usr/local/lib/python3.9/multiprocessing/context.py", line 119, in Pool
return Pool(processes, initializer, initargs, maxtasksperchild,
File "/usr/local/lib/python3.9/multiprocessing/pool.py", line 212, in __init__
self._repopulate_pool()
File "/usr/local/lib/python3.9/multiprocessing/pool.py", line 303, in _repopulate_pool
return self._repopulate_pool_static(self._ctx, self.Process,
File "/usr/local/lib/python3.9/multiprocessing/pool.py", line 326, in _repopulate_pool_static
w.start()
File "/usr/local/lib/python3.9/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/usr/local/lib/python3.9/multiprocessing/context.py", line 277, in _Popen
return Popen(process_obj)
File "/usr/local/lib/python3.9/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/usr/local/lib/python3.9/multiprocessing/popen_fork.py", line 73, in _launch
os._exit(code)
File "/usr/local/lib/python3.9/site-packages/dagster/_utils/interrupts.py", line 74, in _new_signal_handler
raise error_cls()
and then a little later
dagster._check.CheckError: Invariant failed. Description: Attempted to mark step preprocessed_texts.preprocessed_texts_op[8] as complete that was not known to be in flight
File "/usr/local/lib/python3.9/site-packages/dagster/_core/execution/api.py", line 1035, in pipeline_execution_iterator
for event in pipeline_context.executor.execute(pipeline_context, execution_plan):
File "/usr/local/lib/python3.9/site-packages/dagster/_core/executor/step_delegating/step_delegating_executor.py", line 220, in execute
active_execution.handle_event(dagster_event)
File "/usr/local/lib/python3.9/site-packages/dagster/_core/execution/plan/active.py", line 402, in handle_event
self.mark_success(step_key)
File "/usr/local/lib/python3.9/site-packages/dagster/_core/execution/plan/active.py", line 346, in mark_success
self._mark_complete(step_key)
File "/usr/local/lib/python3.9/site-packages/dagster/_core/execution/plan/active.py", line 387, in _mark_complete
check.invariant(
File "/usr/local/lib/python3.9/site-packages/dagster/_check/__init__.py", line 1470, in invariant
raise CheckError(f"Invariant failed. Description: {desc}")
And in dagit the step is marked as complete
Oliver
09/15/2022, 5:20 AM
Pool() init is sometimes crashing. I can't reproduce it locally at all, and it doesn't even seem to get caught by exception handling
----
Is retry behaviour supported in dynamic graphs? If not, is the multiprocessing the cause of the strange errors?
alex
09/15/2022, 2:15 PM
DagsterExecutionInterruptedError indicates that the process was sent a signal to stop. Likely this was Kubernetes terminating the container, potentially for something like moving the pod for autoscaling.
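A rough sketch of the mechanism, in plain Python rather than Dagster's actual code: a signal handler that raises will surface the exception at whatever line happens to be running, which would explain a traceback that ends inside multiprocessing. `StepInterrupted`, `run_with_interrupt_handler` and `simulated_step` are hypothetical names for illustration. The sketch also assumes the interrupt error derives from BaseException (as KeyboardInterrupt does), in which case a plain `except Exception` would not catch it.

```python
import signal
import time


class StepInterrupted(BaseException):
    """Hypothetical stand-in for an interrupt error (assumed BaseException-based)."""


def _raise_on_signal(signum, frame):
    # Runs in the main thread; the exception appears wherever the code currently is.
    raise StepInterrupted()


def run_with_interrupt_handler(work):
    """Run work() with a raising SIGTERM handler installed, then restore the old one."""
    previous = signal.signal(signal.SIGTERM, _raise_on_signal)
    try:
        try:
            work()
        except Exception:
            # Not reached for StepInterrupted: it is not an Exception subclass.
            return "caught as Exception"
        return "finished"
    except StepInterrupted:
        return "interrupted"
    finally:
        signal.signal(signal.SIGTERM, previous)


def simulated_step():
    # Simulate the container being told to stop while the step is busy.
    signal.raise_signal(signal.SIGTERM)
    time.sleep(0.1)
    for _ in range(1000):
        pass


if __name__ == "__main__":
    print(run_with_interrupt_handler(simulated_step))
```

If that assumption holds, user-level `try/except Exception` around the Pool creation would never see the error, which would match it not being caught by exception handling.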
We try to set the properties on the job spec to tell k8s not to restart the pod, but some autoscaling systems cause it to do it anyway, which is my best guess for the cause of the second error with the invariant violation.
Oliver
09/20/2022, 1:43 AM