Hello, we recently have a pipeline that failed to ...
# ask-community
a
Hello, we recently have a pipeline that failed to start due to capacity issues on AWS:
Copy code
Caught an unrecoverable error while dequeuing the run. Marking the run as failed and dropping it from the queue
Exception: Task failed. Failure reason: Capacity is unavailable at this time. Please try again later or in a different availability zone
However, it failed silently without sending us a slack message and there don't seems to be any retry mechanism. Is there anyway I can add a retry for such cases? How about setting up slack alerts?
o
hi @Alvin Lee -- re: slack messages, I believe a run_failure_sensor is what you're looking for. Re automated retries, you can set that up as described here