Hi! I currently use a `run_failure_sensor` that mo...
# ask-community
Hi! I currently use a `run_failure_sensor` that monitors all my repositories every 30 seconds and notifies me of failures. I would like to add a retry system for certain errors only (matched via a regex on `context.failure_event.message`), with a delay between the failure and the next execution (for example, 10 minutes after the first failure, 1 hour after the second, and so on).

The difficulty is scheduling the executions. My idea was to store the IDs of the failed runs in the sensor cursor, and then check at each tick whether each retry should be launched now or should wait for a later tick. Since `run_failure_sensor` is a predefined sensor in Dagster, I don't have control over its cursor, so I guess I need to define my own sensor to do this. Or are there alternatives that are easier to set up?

I was going to use `EventRecordsFilter`, which makes it possible to list the most recent executions. I notice that it has `after_timestamp` and `after_cursor` parameters. Is one of the two preferable in my case? I would also like to make sure that the `EventRecordsFilter` query does not take too long to run, especially when the sensor is first initialized or when it has to catch up on many runs after being disabled. Maybe there is a way to limit the number of records processed to a few tens or hundreds per tick, so that the sensor does not hit its maximum time limit.

I had a quick look at the source code to understand how to define my own sensor, but I'm having trouble identifying the necessary parts, especially since I'd like to avoid Dagster-specific functions that might change in upcoming releases. Is there a strategy you would recommend for implementing this? Thanks in advance