I have question. when the k8s pod is terminated or...
# ask-community
g
I have a question. When a k8s pod is terminated or killed for some reason, how can I make the flow move forward and run the final op? I know how to handle a failure in an op, but I am not sure how to handle an op being killed.
d
Hey Gatsby - which k8s pod is being referred to here? Are you using the executor that runs each step in its own pod and that's what is getting killed, or is it the pod for the whole run that's getting killed? What's the reason for the pod being terminated? Autoscaling?
g
I guess it is terminated by autoscaling, although the k8s config is set with this:
"annotations": {"cluster-autoscaler.kubernetes.io/safe-to-evict": "false"},
let me check which pod is killed.
( Thank you for your reply 😄 )
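For reference, a minimal sketch of where that annotation would usually sit if it is set through Dagster's `dagster-k8s/config` tag. It assumes the pods are launched by dagster-k8s and that the annotation goes under `pod_template_spec_metadata`; the thread does not show how the config was actually applied, and `build_k8s_tags` is a hypothetical helper name.

```python
import json

SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"


def build_k8s_tags(safe_to_evict: bool = False) -> dict:
    """Build job/op tags carrying the annotation (hypothetical helper)."""
    return {
        "dagster-k8s/config": {
            "pod_template_spec_metadata": {
                # Kubernetes annotation values must be strings.
                "annotations": {SAFE_TO_EVICT: str(safe_to_evict).lower()},
            }
        }
    }


k8s_tags = build_k8s_tags()
print(json.dumps(k8s_tags, indent=2))
# Assumed usage: @job(tags=k8s_tags) or @op(tags=k8s_tags)
```

Note that this annotation only asks the cluster autoscaler not to evict the pod during scale-down; it does not protect against other interruptions such as node preemption or node failure.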
d
Assuming the run failed, I think the easiest way is probably to set up run retries: https://docs.dagster.io/deployment/run-retries#run-retries - which will create a new run but will be able to pick up where it left off
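A rough sketch of what opting a job into run retries could look like, assuming run retries are also enabled on the instance as described in the linked docs (for example `run_retries: enabled: true` in `dagster.yaml`). The `dagster/max_retries` tag is the one named in those docs; `run_retry_tags` is a hypothetical helper and the retry count of 3 is only an example.

```python
def run_retry_tags(max_retries: int) -> dict:
    """Tags that request run-level retries (hypothetical helper)."""
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    # Dagster tag values are strings.
    return {"dagster/max_retries": str(max_retries)}


retry_tags = run_retry_tags(3)
print(retry_tags)
# Assumed usage: @job(tags=retry_tags)
```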
g
Screen Shot 2023-02-28 at 10.27.12 AM.png
d
Got it - it looks like it's the run pod that's getting interrupted then
g
Can I make it move forward even though the step is terminated?
d
I don't think we currently have a way to do that, unfortunately - the retry strategy can be either FROM_FAILURE (which would start from the op that failed) or ALL_STEPS (which would re-run every step)
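To make the difference between the two strategies concrete, here is a simplified model, not Dagster's actual implementation (which resolves the steps from the run's event log and dependency graph). The `dagster/retry_strategy` tag is the one from the run-retries docs; the op names and the `steps_to_rerun` helper are hypothetical.

```python
from typing import Dict, List


def steps_to_rerun(strategy: str, step_succeeded: Dict[str, bool]) -> List[str]:
    """Simplified model of which steps a retried run executes."""
    if strategy == "ALL_STEPS":
        # Start over: every step runs again.
        return list(step_succeeded)
    if strategy == "FROM_FAILURE":
        # Skip steps that already succeeded; run the rest.
        return [name for name, ok in step_succeeded.items() if not ok]
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical three-op run whose pod died during "transform".
previous_run = {"extract": True, "transform": False, "final_op": False}
strategy_tags = {"dagster/retry_strategy": "FROM_FAILURE"}

print(steps_to_rerun(strategy_tags["dagster/retry_strategy"], previous_run))
print(steps_to_rerun("ALL_STEPS", previous_run))
```

In both cases the interrupted step is executed again in the new run; neither strategy skips past it to the final op.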
g
ic.
thank you!