# dagster-support

Manga Dhatrika

05/20/2022, 3:46 PM
Hi, I have a job that needs to extract 8 million rows from an API and load them into a database. The problem is that the job is failing every time, at a different offset each run. I would like a retry approach where, if the first run fails at offset 1000, the next retry starts extracting from record 1001, and it keeps retrying until all records are successfully extracted. I can get the offset number from the row count of the table, but I am new to Dagster and not sure how to achieve this. Should I use a sensor, or is there a better way?

jamie

05/20/2022, 5:46 PM
Hey @Manga Dhatrika we don't support changing op inputs/config on retry requests. If you know you need to pull exactly 8 million rows each time the job is run, you could try splitting the pull into separate ops of a fixed size and setting a retry policy on each op. That way, if an op fails, only the rows specific to that op are re-pulled. Example:
Copy code
@job
def my_job():
    pull_1000_rows.alias("pull_0_999")(first=0, last=999)
    pull_1000_rows.alias("pull_1000_1999")(first=1000, last=1999)
    ...
(you could also put this op generation in a loop)
Copy code
@job
def my_job():
    for i in range(0, 3000, 1000):
        first = i
        last = i + 999
        pull_1000_rows.alias(f"pull_{first}_{last}")(first=first, last=last)
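For reference, a rough sketch of what attaching the retry policy to the chunked op could look like. The `fetch_rows` / `write_rows` helpers and the specific RetryPolicy settings are placeholders, not anything from this thread.
Copy code
from dagster import RetryPolicy, op

def fetch_rows(offset: int, limit: int):
    ...  # placeholder for the real API call

def write_rows(rows):
    ...  # placeholder for the real database load

@op(retry_policy=RetryPolicy(max_retries=3, delay=30))
def pull_1000_rows(context, first: int, last: int):
    # if this chunk fails, only this op is retried (up to 3 times), not the whole job
    rows = fetch_rows(offset=first, limit=last - first + 1)
    write_rows(rows)
    context.log.info(f"loaded rows {first}-{last}")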

Manga Dhatrika

05/20/2022, 5:50 PM
the number will not be 8 million every time, it could be more than that, it's dynamic

jamie

05/20/2022, 5:53 PM
In that case you could also use Dynamic Ops https://docs.dagster.io/concepts/ops-jobs-graphs/dynamic-graphs. Retry policies should still work if you define your graph this way.
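A minimal sketch of that dynamic-graph shape, assuming a hypothetical fan-out op (`plan_chunks`) over 1000-row chunks; the op names and the fetch/load logic are placeholders, not anything from this thread.
Copy code
from dagster import DynamicOut, DynamicOutput, RetryPolicy, job, op

CHUNK_SIZE = 1000

@op
def get_total_row_count() -> int:
    # placeholder: ask the API (or the target table) how many rows exist right now
    return 8_000_000

@op(out=DynamicOut())
def plan_chunks(total: int):
    # emit one dynamic output per chunk so each chunk runs (and retries) independently
    for first in range(0, total, CHUNK_SIZE):
        yield DynamicOutput(first, mapping_key=str(first))

@op(retry_policy=RetryPolicy(max_retries=3))
def pull_chunk(first: int):
    ...  # placeholder: fetch rows [first, first + CHUNK_SIZE - 1] and load them

@job
def fetch_all_rows():
    plan_chunks(get_total_row_count()).map(pull_chunk)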

Manga Dhatrika

05/20/2022, 6:14 PM
what is the difference between a retry sensor and retries on ops? I was just looking here https://docs.dagster.io/concepts/partitions-schedules-sensors/sensors#job-failure-sensor
could we add a job-failure-sensor so that whenever the job fails we query the table, get the offset, and then run the job again?

jamie

05/20/2022, 6:18 PM
a retry sensor will run if the job fails, whereas retries on ops will just attempt to run an op again if it fails, and won't cause the whole job to fail unless the op fails on every retry. and yeah, I think having a job failure sensor that gets an offset and returns a RunRequest to run the job again seems totally reasonable! it's a good solution for your setup!

Manga Dhatrika

05/20/2022, 6:27 PM
how could we make a job run again on failure?
if I were trying out the job-failure-sensor
some pseudo code would be helpful

jamie

05/20/2022, 7:01 PM
hmm, thinking more about this, I don't think I've ever tried (or seen anyone try) yielding RunRequests from a run_failure_sensor. Definitely worth playing around with, but I can't say with 100% certainty that it will work. It might look something like this:
Copy code
from dagster import RunRequest, RunFailureSensorContext, run_failure_sensor

@run_failure_sensor(
    job_selection=[fetch_db_rows],  # so the sensor only runs when this job fails
)
def rerun_db_fetch_sensor(context: RunFailureSensorContext):
    # logic to get the rows to start on
    yield RunRequest(
        run_config={},  # <your run config here>
        job_name="fetch_db_rows",
    )
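A rough, more filled-in sketch of the same idea; the `fetch_rows_op` op, its `start_offset` config key, and the `count_loaded_rows()` helper are hypothetical names, and whether a run_failure_sensor will actually submit the RunRequest is exactly the open question here.
Copy code
from dagster import RunRequest, RunFailureSensorContext, job, op, run_failure_sensor

@op(config_schema={"start_offset": int})
def fetch_rows_op(context):
    ...  # placeholder: pull rows from the API starting at context.op_config["start_offset"]

@job
def fetch_db_rows():
    fetch_rows_op()

def count_loaded_rows() -> int:
    ...  # placeholder: SELECT COUNT(*) FROM the target table
    return 0

@run_failure_sensor(job_selection=[fetch_db_rows])
def rerun_db_fetch_sensor(context: RunFailureSensorContext):
    # resume from the last row that actually made it into the table
    offset = count_loaded_rows()
    yield RunRequest(
        job_name="fetch_db_rows",
        run_config={"ops": {"fetch_rows_op": {"config": {"start_offset": offset}}}},
    )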

Manga Dhatrika

05/20/2022, 9:03 PM
ok, I'll try that, thanks!
This sensor doesn't seem to be rerunning the job when there is a failure.